My current thorn in the side deals with resource leaks in a large Java code base. My initial thought was: with efficient garbage collection, wouldn’t any Java application have a low probability of leaking resources? The current GC implementation is efficient enough to keep up with typical loads on any application running within the JVM. I ran a small test the other day to determine how fast the GC in 1.4.2 can collect the threads that an Internet email server allocated and discarded; I am still collecting the metrics.
While I found that the GC is, no doubt, efficient at collecting objects that are no longer referenced, it still cannot determine whether a resource is no longer of any use to the application. The underlying assumption, that a resource may be discarded once zero references point to it, is often inadequate when dealing with human programmers.
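To make that concrete, here is a minimal sketch (hypothetical class and method names, raw collections in keeping with a 1.4-era JVM) of a "logical leak": the program will never use these buffers again, but because a collection still references them, the GC must treat every one of them as live.

```java
import java.util.ArrayList;
import java.util.List;

public class LogicalLeak {
    // Sessions are added but never removed, so every entry stays strongly
    // reachable and the GC has no choice but to consider it live.
    private static final List sessions = new ArrayList();

    static void openSession(byte[] buffer) {
        sessions.add(buffer);
        // ... work with the session ...
        // Forgetting sessions.remove(buffer) here leaks the buffer:
        // zero *useful* references remain, but one *actual* reference does.
    }

    static int retainedCount() {
        return sessions.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            openSession(new byte[1024]);  // ~1 MB retained, never collectible
        }
        System.out.println("retained buffers: " + retainedCount());
    }
}
```

The GC is doing exactly its job here; the assumption that reachable equals useful is what fails.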
Down the usual code path, resource leaks are typically kept to a bare minimum, as developers take care that every file descriptor is closed and every object is dereferenced. Down the error path, things start to get uglier. Thinking through what might happen on a failure probably comes with experience; there is a good treatment of this in “Practical Java” – Haggar.
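The usual-path versus error-path distinction can be sketched in a few lines. Closing the stream in a finally block guarantees the file descriptor is released even when read() throws, which is exactly where hand-written cleanup tends to get missed (the method name is mine, for illustration):

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SafeRead {
    static int firstByte(String path) throws IOException {
        InputStream in = new FileInputStream(path);
        try {
            return in.read();   // may throw -- this is the error path
        } finally {
            in.close();         // runs on both the usual and error paths
        }
    }
}
```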
The resource leak I found was significant: it only happened when a failure caused the Internet service to abruptly abandon an ongoing client session. This left an input buffer and an output buffer in the VM’s heap, referenced but never to be consumed.
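A hypothetical reconstruction of that failure mode (the class and field names are mine, not the actual service’s): a session table maps client ids to their buffers, and the removal only runs on the normal teardown path, so an abrupt failure leaves the entry reachable forever. Moving the removal into a finally block makes it unconditional.

```java
import java.util.HashMap;
import java.util.Map;

public class SessionTable {
    private final Map sessions = new HashMap();

    void open(String clientId) {
        sessions.put(clientId, new byte[2][8192]);  // input + output buffer
    }

    // Leaky teardown: only reached on the normal path, so an abrupt
    // failure leaves the entry (and both buffers) strongly reachable.
    void closeNormally(String clientId) {
        sessions.remove(clientId);
    }

    // Fix: remove the entry unconditionally, even on abrupt failure.
    void handle(String clientId) {
        open(clientId);
        try {
            // ... service the client; may throw on abrupt disconnect ...
        } finally {
            sessions.remove(clientId);  // buffers become collectible
        }
    }

    int liveSessions() {
        return sessions.size();
    }
}
```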
I may have found one leak, but a couple of open questions still remain. Can I be sure that this was the last resource leak by design? The VM will never attempt to allocate that resource to another instance because it will never be freed, so simply expecting a crash followed by a core dump is not going to happen. The true face of the problem shows when I have to restart my application simply because it ends up consuming the system’s memory over time.
The best metric for the application’s behavior is to expect constant memory behavior from the VM as a whole; in other words, to hope that the GC keeps up with resource allocation. For example, in the case of the Internet email service, the bytes referenced would ideally behave as O(n), where n is the number of client sessions currently open.
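One rough way to watch that metric from inside the VM is to sample the live heap before and after a batch of sessions, with an explicit collection in between. This is only a smoke test, not a proof: System.gc() is a hint the VM may ignore, and the class name here is my own.

```java
public class HeapProbe {
    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeap();
        for (int i = 0; i < 100; i++) {
            byte[] buffer = new byte[64 * 1024];  // simulated session buffer
            // ... session comes and goes; buffer becomes unreachable here ...
        }
        System.gc();  // only a hint, but usually honored for a rough reading
        long after = usedHeap();
        // A delta that grows steadily across repeated runs suggests a leak.
        System.out.println("heap delta (bytes): " + (after - before));
    }
}
```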
Having achieved some resolution on a theoretical bound on the memory allocated, I could also question resource over-allocation. Creating and destroying objects is an expensive operation, and ideally the number of bytes allocated over the lifetime of an application is almost always equal to the number of bytes alive plus some small constant. This is perhaps a longer-term goal for any high-performance service that promotes the reuse of objects, and it is probably too expensive to obtain in terms of programming time.
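The usual way services chase that bound is a pool: a minimal sketch, assuming fixed-size buffers (the class is hypothetical). Reuse keeps total bytes allocated close to bytes alive, at the cost of exactly the bookkeeping code below, which is the programming-time tradeoff.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferPool {
    private final List free = new ArrayList();
    private final int bufferSize;
    private int allocated = 0;

    BufferPool(int bufferSize) {
        this.bufferSize = bufferSize;
    }

    synchronized byte[] acquire() {
        if (free.isEmpty()) {
            allocated++;  // total allocation only grows when the pool is empty
            return new byte[bufferSize];
        }
        return (byte[]) free.remove(free.size() - 1);
    }

    synchronized void release(byte[] buffer) {
        free.add(buffer);  // recycle instead of letting the GC discard it
    }

    synchronized int totalAllocated() {
        return allocated;
    }
}
```

With steady acquire/release traffic, totalAllocated() plateaus at the peak number of concurrent sessions rather than growing with every session served.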