A Java memory leak isn’t about Java using too much memory; it’s about Java holding onto memory it no longer needs, preventing the garbage collector from reclaiming it.
Let’s look at a real-world scenario. Imagine a web application that caches user sessions. If sessions aren’t properly invalidated after a user logs out or after a timeout, the HttpSession objects, along with all the data they hold, will remain in memory indefinitely. The garbage collector sees these objects are still reachable (because the cache still holds a reference to them) and thus cannot free the memory. Over time, this accumulation leads to OutOfMemoryError: Java heap space.
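The scenario above can be reduced to a few lines. This is a minimal sketch, not code from any framework: `UserSession`, `SessionCache`, and `logoutLeaky` are hypothetical names standing in for a real session store. It only illustrates that an entry left in the map stays strongly reachable, while removing it lets the collector reclaim the session.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an HttpSession; it only carries some payload.
final class UserSession {
    final String userId;
    final byte[] payload = new byte[1024]; // data kept alive by the session

    UserSession(String userId) {
        this.userId = userId;
    }
}

// Hypothetical session cache used only for illustration.
final class SessionCache {
    private final Map<String, UserSession> sessions = new ConcurrentHashMap<>();

    void login(String userId) {
        sessions.put(userId, new UserSession(userId));
    }

    // Leaky variant: the user is logged out, but the map entry stays,
    // so the session remains reachable from the cache.
    void logoutLeaky(String userId) {
        // intentionally leaves the entry in the map
    }

    // Fixed variant: removing the entry drops the cache's reference.
    void logout(String userId) {
        sessions.remove(userId);
    }

    int size() {
        return sessions.size();
    }
}

final class SessionCacheDemo {
    public static void main(String[] args) {
        SessionCache cache = new SessionCache();
        for (int i = 0; i < 1_000; i++) {
            cache.login("user-" + i);
            cache.logoutLeaky("user-" + i);
        }
        System.out.println("after leaky logout: " + cache.size()); // 1000

        for (int i = 0; i < 1_000; i++) {
            cache.logout("user-" + i);
        }
        System.out.println("after real logout: " + cache.size()); // 0
    }
}
```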
Here’s how we diagnose and fix these leaks, starting with the most common culprits.
Common Cause 1: Unbounded Collections
A classic leak pattern is adding objects to a collection (like a HashMap, ArrayList, or ConcurrentHashMap) without ever removing them. This often happens in caches, logging buffers, or temporary data stores.
Diagnosis:
- Trigger the leak: Let the application run until it exhibits slow performance or throws an OutOfMemoryError.
- Take a heap dump: Use jmap or your application server’s built-in tools.
  jmap -dump:format=b,file=heapdump.hprof <pid>
- Analyze with Eclipse Memory Analyzer Tool (MAT): Load heapdump.hprof into MAT.
- Identify large collections: Look for java.util.HashMap, java.util.ArrayList, or java.util.concurrent.ConcurrentHashMap instances that are unusually large. Right-click on a suspect collection and select "List objects" -> "with outgoing references". Look for the number of elements.
- Find the dominator tree: In MAT, use the "Dominator Tree" view. Sort by size. If a large collection is at the top, it’s a strong candidate. Expand it to see what objects it’s holding.
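When attaching jmap to the process is inconvenient, a heap dump can also be requested from inside the JVM. The sketch below assumes a HotSpot-based JDK, because `com.sun.management.HotSpotDiagnosticMXBean` is a HotSpot-specific management interface; the `HeapDumper` class name and the temporary output location are illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; assumes a HotSpot-based JVM.
final class HeapDumper {
    // Writes a heap dump of the current JVM to the given file.
    // liveOnly = true dumps only reachable objects, which is what leak analysis usually needs.
    // The target file must not exist yet, and recent JDKs expect a ".hprof" suffix.
    static void dump(Path file, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        bean.dumpHeap(file.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dir.resolve("heapdump.hprof");
        dump(file, true);
        System.out.println("wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```

The resulting file opens in MAT like a jmap dump. On HotSpot, starting the JVM with -XX:+HeapDumpOnOutOfMemoryError is another way to capture a dump at the moment the error occurs.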
Fix:
Implement a mechanism to limit the size of the collection or to remove stale entries. For caches, this might involve using a LinkedHashMap with an accessOrder of true and overriding removeEldestEntry to remove items when a size limit is reached, or using a dedicated caching library like Guava Cache or Caffeine with proper eviction policies (e.g., expireAfterAccess, maximumSize).
// Example using LinkedHashMap for a simple size-limited cache
import java.util.LinkedHashMap;
import java.util.Map;

public class LRUCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxSize;

    public LRUCache(int maxSize) {
        super(maxSize, 0.75f, true); // accessOrder = true for LRU
        this.maxSize = maxSize;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxSize; // evict once the limit is exceeded
    }
}
Why it works: The removeEldestEntry method is called after an insert. If true is returned, the eldest entry (the least recently accessed in an LRU cache) is removed, preventing unbounded growth.
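To see the eviction order in action, here is a small self-contained sketch. `BoundedCaches.lru` is a hypothetical helper that builds the same kind of access-ordered LinkedHashMap and wraps it in `Collections.synchronizedMap`, since LinkedHashMap is not thread-safe and even get() reorders entries when accessOrder is true.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical factory for a small, synchronized LRU map.
final class BoundedCaches {
    static <K, V> Map<K, V> lru(int maxSize) {
        return Collections.synchronizedMap(new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > maxSize; // drop the least recently used entry
            }
        });
    }
}

final class BoundedCachesDemo {
    public static void main(String[] args) {
        Map<String, Integer> cache = BoundedCaches.lru(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // [a, c]
    }
}
```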
Common Cause 2: Static Field References
Static fields belong to the class, not to any particular object instance, and they live for the entire duration of the application’s lifecycle. If a static field holds a reference to an object that is no longer needed, that object (and anything it references) will never be garbage collected.
Diagnosis:
- Heap dump analysis (as above).
- Look for static fields: In MAT, examine the "Leak Suspects" report. It often points directly to objects held by static fields. Alternatively, you can manually inspect objects that are not being garbage collected. Find a large object that shouldn’t persist and use MAT’s "Path to GC Roots" feature. If the path includes a static field, that’s your leak.
Fix:
Ensure that static fields are cleared when the referenced object is no longer needed. This might involve setting the static field to null or replacing the referenced object with a new instance. If the static field is intended to hold a collection of temporary objects, ensure those objects are removed from the static collection when they are no longer required.
// Example of a problematic static collection
public class DataHolder {
    public static List<String> temporaryData = new ArrayList<>();

    public static void addData(String data) {
        temporaryData.add(data);
    }
    // Missing: A method to clear temporaryData when no longer needed.
}

// Corrected approach:
public class DataHolder {
    public static List<String> temporaryData = new ArrayList<>();

    public static void addData(String data) {
        temporaryData.add(data);
    }

    public static void clearData() {
        temporaryData.clear(); // Or temporaryData = new ArrayList<>();
    }
}
Why it works: By explicitly clearing the static collection or setting the static reference to null, you break the strong reference held by the class, allowing the garbage collector to reclaim the memory associated with the previously referenced objects.
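A clear method only helps if it is actually called, including when the work that filled the collection fails. One way to make that hard to forget is to tie the clearing to a scope with try/finally. The sketch below is an assumption-laden variant of the example above: `ScopedDataHolder` and `runBatch` are hypothetical names, and the list is made private so callers cannot bypass the cleanup.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical variant of DataHolder that clears its static data per unit of work.
final class ScopedDataHolder {
    private static final List<String> temporaryData = new ArrayList<>();

    static synchronized void addData(String data) {
        temporaryData.add(data);
    }

    static synchronized int size() {
        return temporaryData.size();
    }

    static synchronized void clearData() {
        temporaryData.clear();
    }

    // Runs the work and always clears the static list afterwards,
    // even when the work throws.
    static void runBatch(Runnable work) {
        try {
            work.run();
        } finally {
            clearData();
        }
    }
}

final class ScopedDataHolderDemo {
    public static void main(String[] args) {
        ScopedDataHolder.runBatch(() -> {
            for (int i = 0; i < 100; i++) {
                ScopedDataHolder.addData("item-" + i);
            }
            System.out.println("during batch: " + ScopedDataHolder.size()); // 100
        });
        System.out.println("after batch: " + ScopedDataHolder.size()); // 0
    }
}
```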
Common Cause 3: Listener and Callback Leaks
When you register listeners or callbacks with objects that have a longer lifecycle, but forget to unregister them, the listener object (and anything it holds) can be kept alive by the object it’s listening to.
Diagnosis:
- Heap dump analysis (as above).
- Identify listener objects: In MAT, look for instances of your listener classes.
- Trace references: Use "Path to GC Roots" on a listener object. If the path leads through an object with a longer lifecycle (e.g., a singleton service, a UI component that persists), you’ve found a leak. Look for fields like listeners, observers, callbackQueue.
Fix:
Always implement an unregistration mechanism. When a component that registered a listener is disposed or destroyed, it must call the corresponding removeListener or unregisterCallback method. If the listener itself is no longer needed, ensure it’s explicitly dereferenced.
// Example: A UI component registers a listener with a service
public class MyUIComponent {
    private MyService service;
    private MyListener listener;

    public MyUIComponent(MyService service) {
        this.service = service;
        this.listener = new MyListener();
        service.addListener(listener); // Registration
    }

    public void dispose() {
        service.removeListener(listener); // Unregistration - CRITICAL
        // Dereference other resources
    }

    private class MyListener { /* ... */ }
}
Why it works: Explicitly calling removeListener breaks the reference chain from MyService back to MyUIComponent (via the listener), allowing MyUIComponent and its associated listener to be garbage collected when they are no longer otherwise referenced.
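A design that makes unregistration harder to forget is to have the registration call return a handle that removes the listener when closed. The sketch below is one possible shape, not the API of MyService: `EventService`, `Subscription`, and `subscribe` are hypothetical names. Because the handle is AutoCloseable, it can be used with try-with-resources or stored and closed in a dispose() method.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical long-lived service that hands out unregistration handles.
final class EventService {
    // AutoCloseable handle whose close() does not throw checked exceptions.
    interface Subscription extends AutoCloseable {
        @Override
        void close();
    }

    private final List<Consumer<String>> listeners = new CopyOnWriteArrayList<>();

    // Registers the listener and returns a handle that unregisters it.
    Subscription subscribe(Consumer<String> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener);
    }

    void publish(String event) {
        for (Consumer<String> listener : listeners) {
            listener.accept(event);
        }
    }

    int listenerCount() {
        return listeners.size();
    }
}

final class ListenerDemo {
    public static void main(String[] args) {
        EventService service = new EventService();
        List<String> received = new ArrayList<>();

        try (EventService.Subscription subscription = service.subscribe(received::add)) {
            service.publish("first");
        } // close() removes the listener from the service

        service.publish("second"); // no listener is registered any more
        System.out.println(received + " listeners=" + service.listenerCount()); // [first] listeners=0
    }
}
```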
Common Cause 4: ThreadLocals
ThreadLocal variables are designed to provide thread-specific values. However, if a ThreadLocal holds a reference to an object and the thread itself is long-lived (like in a thread pool), the ThreadLocal map can hold onto objects indefinitely if not properly cleaned up.
Diagnosis:
- Heap dump analysis (as above).
- Look for ThreadLocalMap: In MAT, search for java.lang.ThreadLocal.ThreadLocalMap.
- Examine entries: Expand ThreadLocalMap instances. Look for Entry objects whose value is large or holds references to objects you expect to be short-lived. The key in the Entry is a weak reference to the ThreadLocal instance itself, but the value is held strongly, so it stays alive for as long as the entry remains in the thread’s map.
Fix:
Always call threadLocalVariable.remove() in a finally block when you are finished with the ThreadLocal value, especially in threads managed by a thread pool.
// Example within a thread pool task
ExecutorService executor = Executors.newFixedThreadPool(10);
try {
    executor.submit(() -> {
        try {
            // Use ThreadLocal
            MyObject data = myThreadLocal.get();
            // ... do work ...
        } finally {
            myThreadLocal.remove(); // Always clean up
        }
    });
} finally {
    executor.shutdown();
}
Why it works: The remove() method clears the entry from the ThreadLocalMap associated with the current thread. If the thread is reused, the stale reference is gone, and the garbage collector can reclaim the object.
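The effect of thread reuse is easy to observe with a single-thread pool, where two consecutive tasks are guaranteed to share a worker thread. In this sketch (the class, the `CONTEXT` variable, and the string value are all illustrative), the second task reads whatever the first one left behind unless remove() was called.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

final class ThreadLocalCleanupDemo {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks back to back on one pooled thread and returns
    // the value the second task finds in the ThreadLocal.
    static String valueSeenBySecondTask(boolean cleanUp)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Runnable firstTask = () -> {
                try {
                    CONTEXT.set("request-1 data");
                    // ... do work ...
                } finally {
                    if (cleanUp) {
                        CONTEXT.remove();
                    }
                }
            };
            executor.submit(firstTask).get(); // wait until the first task is done

            Callable<String> secondTask = CONTEXT::get;
            return executor.submit(secondTask).get();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without remove(): " + valueSeenBySecondTask(false)); // request-1 data
        System.out.println("with remove():    " + valueSeenBySecondTask(true));  // null
    }
}
```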
Common Cause 5: Unclosed Resources (Streams, Connections, etc.)
While not strictly a "memory leak" in the sense of holding onto Java objects, unclosed resources like InputStream, OutputStream, Connection, ResultSet, Statement, and Socket can hold onto native memory and system resources, leading to depletion and eventual OutOfMemoryError (often OutOfMemoryError: Direct buffer memory for NIO buffers, or failures in resource allocation).
Diagnosis:
- Heap dump analysis (as above). Look for objects that are supposed to be closed but are still referenced, or for native memory issues.
- Use lsof (Linux/macOS): Check for open file descriptors associated with your Java process.
  lsof -p <pid> | wc -l
  A continuously increasing number of open files/sockets indicates a resource leak.
- Monitor JMX metrics: Track metrics related to connection pools, file handles, etc.
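The descriptor count that lsof reports can also be sampled from inside the process. The following sketch assumes a Unix-like platform and a JDK that exposes `com.sun.management.UnixOperatingSystemMXBean`; on other platforms the hypothetical `OpenFileDescriptors.count()` helper returns -1 instead of a count.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.io.InputStream;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for sampling the process's open file descriptors.
final class OpenFileDescriptors {
    // Returns the number of open descriptors, or -1 if the JVM does not expose it.
    static long count() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("fd-demo", ".txt");
        System.out.println("before:      " + count());

        List<InputStream> open = new ArrayList<>();
        for (int i = 0; i < 50; i++) {
            open.add(Files.newInputStream(file)); // opened and deliberately held
        }
        System.out.println("while open:  " + count());

        for (InputStream in : open) {
            in.close();
        }
        System.out.println("after close: " + count());
        Files.delete(file);
    }
}
```

Logging this value periodically gives the same trend as repeated lsof runs: a number that keeps climbing under steady load points to resources that are opened but never closed.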
Fix:
Always ensure resources are closed. The try-with-resources statement in Java 7+ is the idiomatic and safest way to do this.
// Using try-with-resources for automatic closing
try (InputStream is = new FileInputStream("myfile.txt");
     OutputStream os = new FileOutputStream("output.txt")) {
    // ... read from is, write to os ...
} catch (IOException e) {
    // Handle exception
}
Why it works: The try-with-resources statement automatically calls the close() method on all resources declared in the try header (if they implement AutoCloseable) when the block is exited, whether normally or due to an exception.
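The closing behavior can be made visible with a resource that records what happens to it. `TrackedResource` below is a hypothetical class written for this illustration. The sketch shows two details worth knowing: resources are closed in the reverse order of their declaration, and they are closed before a catch block of the same statement runs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical resource that logs when it is opened and closed.
final class TrackedResource implements AutoCloseable {
    private final String name;
    private final List<String> log;

    TrackedResource(String name, List<String> log) {
        this.name = name;
        this.log = log;
        log.add("open " + name);
    }

    @Override
    public void close() {
        log.add("close " + name);
    }
}

final class TryWithResourcesDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (TrackedResource in = new TrackedResource("in", log);
             TrackedResource out = new TrackedResource("out", log)) {
            throw new IllegalStateException("simulated failure");
        } catch (IllegalStateException e) {
            log.add("caught " + e.getMessage());
        }
        return log;
    }

    public static void main(String[] args) {
        // [open in, open out, close out, close in, caught simulated failure]
        System.out.println(run());
    }
}
```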
Common Cause 6: Finalizers
Objects with finalize() methods are a performance and memory management nightmare, and finalize() itself has been deprecated since Java 9. Such an object cannot be reclaimed as soon as it becomes unreachable: it is first placed on a queue, and its memory is freed only by a later collection, after a background finalizer thread has run its finalize() method. If the queue backs up, or if a finalizer takes too long, these objects remain in memory, consuming heap space.
Diagnosis:
- Heap dump analysis (as above).
- Look for java.lang.ref.Finalizer: In MAT, search for java.lang.ref.Finalizer. These objects represent objects that are pending finalization. Their "referent" is the object that needs finalizing.
- Check for long-running finalizers: If you see many Finalizer objects, it suggests a backlog.
Fix:
Avoid finalize() methods. If you must clean up resources, use try-with-resources or explicit close() methods. If you absolutely must use a finalizer-like pattern for resource cleanup, consider using java.lang.ref.PhantomReference with a ReferenceQueue for more predictable cleanup, but this is advanced and often overkill.
Why it works: By avoiding finalize(), you ensure objects are eligible for garbage collection as soon as they are no longer reachable, rather than waiting for a separate, non-deterministic finalization process.
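For the rare case where a safety net is wanted in addition to explicit close(), Java 9 and later ship `java.lang.ref.Cleaner`, which packages the PhantomReference and ReferenceQueue mechanics behind a small API. The sketch below uses Cleaner rather than a hand-written reference queue; `NativeBufferHandle`, `ReleaseAction`, and the counter standing in for a native release call are hypothetical.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical resource wrapper: explicit close() is the primary path,
// the Cleaner only acts as a fallback if close() is never called.
final class NativeBufferHandle implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must not reference the NativeBufferHandle itself,
    // otherwise the handle could never become unreachable.
    private static final class ReleaseAction implements Runnable {
        private final AtomicInteger releaseCount;

        ReleaseAction(AtomicInteger releaseCount) {
            this.releaseCount = releaseCount;
        }

        @Override
        public void run() {
            releaseCount.incrementAndGet(); // stand-in for freeing a native resource
        }
    }

    private final Cleaner.Cleanable cleanable;

    NativeBufferHandle(AtomicInteger releaseCount) {
        this.cleanable = CLEANER.register(this, new ReleaseAction(releaseCount));
    }

    @Override
    public void close() {
        cleanable.clean(); // runs the action at most once, however often close() is called
    }
}

final class CleanerDemo {
    public static void main(String[] args) {
        AtomicInteger released = new AtomicInteger();
        NativeBufferHandle handle = new NativeBufferHandle(released);
        try (handle) {
            // ... use the resource ...
        }
        handle.close(); // second close is a no-op
        System.out.println("release count: " + released.get()); // 1
    }
}
```

Unlike finalize(), the cleanup action here does not delay reclamation of the handle itself, and the deterministic path through close() remains the one the application should rely on.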
After fixing these, the next error you might encounter is OutOfMemoryError: Metaspace if you have a leak in class loading/unloading.