Tune JVM Heap Size and GC Flags for Production Performance (2026)

Tuning JVM heap size and GC flags for production performance isn’t about making your application faster; it’s about making it predictable.

Let’s look at how a typical web application might handle requests, and how heap and GC play into that. Imagine a simple REST API service.

import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;
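import java.util.List;
import java.util.UUID;
import java.util.concurrent.CopyOnWriteArrayList;

@RestController
public class ItemController {

    // A controller is a singleton shared by all request threads, so the list must be thread-safe.
    // Nothing ever removes entries, so this list grows with every /create call.
    private final List<Item> items = new CopyOnWriteArrayList<>();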

    @GetMapping("/items")
    public List<Item> getItems() {
        // Simulate fetching and processing some data
        if (items.isEmpty()) {
            for (int i = 0; i < 1000; i++) {
                items.add(new Item(UUID.randomUUID().toString(), "Sample Data " + i));
            }
        }
        return items;
    }

    @GetMapping("/create")
    public String createItem() {
        // Simulate creating a new item, adding it to memory
        items.add(new Item(UUID.randomUUID().toString(), "New Item"));
        return "Item created. Total items: " + items.size();
    }

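    // Simple Item class; the fields are public so a JSON serializer such as Jackson can discover them
    private static class Item {
        public final String id;
        public final String data;

        Item(String id, String data) {
            this.id = id;
            this.data = data;
        }
    }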
}

When this application starts, the JVM allocates a portion of memory called the "heap." Objects created by your application (like the Item objects in our example) live on this heap. As requests come in and new Item objects are created, the heap fills up.

The Garbage Collector (GC) is the JVM’s mechanism for reclaiming memory occupied by objects that are no longer referenced by the application. When the heap gets full, the GC has to run. This process can be computationally expensive and can pause your application threads, leading to latency.

The core trade-off is between heap size and GC frequency/duration. A larger heap means fewer GCs, but each GC might take longer. A smaller heap means more frequent GCs, but they might be shorter. The goal is to find the sweet spot that minimizes overall latency and maximizes throughput for your specific workload.
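
To make the heap concrete, here is a minimal sketch that reads the current heap figures from inside a running JVM using the standard java.lang.management API. The class name HeapSnapshot and the helper usedFraction are illustrative names, not part of any library.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

final class HeapSnapshot {

    /** Fraction of the maximum heap currently in use, or -1 if no maximum is defined. */
    static double usedFraction(long usedBytes, long maxBytes) {
        return maxBytes <= 0 ? -1.0 : (double) usedBytes / maxBytes;
    }

    public static void main(String[] args) {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        // "committed" is what the JVM has claimed from the OS so far; "max" is the ceiling.
        // getMax() returns -1 when the maximum is undefined.
        System.out.printf("used=%d MiB committed=%d MiB max=%d MiB%n",
                heap.getUsed() >> 20, heap.getCommitted() >> 20, heap.getMax() >> 20);
        System.out.printf("fraction of max in use: %.2f%n",
                usedFraction(heap.getUsed(), heap.getMax()));
    }
}
```

Sampling these numbers over time shows the sawtooth pattern of a healthy heap: usage climbs as objects are allocated and drops after each collection.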

Heap Size Tuning

The most fundamental parameters are -Xms (initial heap size) and -Xmx (maximum heap size).

Diagnosis: Monitor your application’s memory usage and GC activity. Use tools like jstat -gc <pid> <interval> or JMX metrics (e.g., via Prometheus JMX Exporter). Look for high heap utilization leading to frequent or long GC pauses.

# Example: Monitor GC for PID 12345 every 5 seconds
jstat -gc 12345 5000
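
If you prefer to collect the same kind of data from inside the process (which is roughly what a JMX exporter scrapes), the sketch below reads collection counts and accumulated times from the standard GarbageCollectorMXBean interface. GcStats and averageMillis are hypothetical names chosen for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcStats {

    /** Average time per collection in milliseconds; 0 when nothing has been collected yet. */
    static double averageMillis(long count, long totalMillis) {
        return count <= 0 ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        // Each bean covers one collector phase, e.g. young or old collections.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the JVM does not report it
            long millis = gc.getCollectionTime();  // approximate accumulated time in ms
            System.out.printf("%-25s count=%d time=%d ms avg=%.2f ms%n",
                    gc.getName(), count, millis, averageMillis(count, millis));
        }
    }
}
```

The counters are cumulative, so a monitoring loop would typically report the difference between two samples rather than the raw values.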

Common Causes & Fixes:

Heap too small, causing frequent GCs: If jstat -gcutil <pid> shows the old generation (O column) staying near 100% even after Full GCs, while the Full GC count (FGC) and time (FGCT) keep climbing, your heap is likely too small for the workload’s peak demand. Eden (E) reaching 100% on its own is normal; that is what triggers each young collection.
- Fix: Increase -Xmx. For instance, if you have -Xmx2g, try -Xmx4g.
- Why it works: A larger heap provides more breathing room, allowing more objects to be created before a GC cycle is triggered, thus reducing GC frequency.
Heap too large, causing long GC pauses: If your heap is massive (e.g., -Xmx32g) and you observe very long real time pauses in your GC logs (often from -Xlog:gc*=debug), the GC has a lot of memory to scan.
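- Fix: Decrease -Xmx. This is less common for performance issues unless you’re over-allocating drastically or using a collector whose pauses scale with heap size (such as Serial or Parallel, whose full collections are entirely stop-the-world). Also note that heaps of roughly 32 GB and above can no longer use compressed object pointers, so every reference takes 8 bytes instead of 4.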
- Why it works: A smaller heap means the GC has less memory to traverse, potentially leading to shorter pause times, even if GCs occur more often.
Initial heap size too small (-Xms): If your application experiences a sudden spike in load shortly after startup, and the heap needs to grow significantly from a small -Xms to meet the demand, this growth process itself can cause minor pauses and resource contention.
- Fix: Set -Xms equal to -Xmx for production deployments. For example, -Xms4g -Xmx4g.
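- Why it works: With the initial and maximum sizes equal, the JVM commits the whole heap at startup and never has to resize it under load. Physical pages are still touched lazily on first use; adding -XX:+AlwaysPreTouch makes the JVM touch them during startup, at the cost of a slower start.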
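
A quick way to confirm that a deployment really runs with a fixed-size heap is to read the effective flag values at runtime. The following is a sketch that assumes a HotSpot-based JVM, since it relies on the HotSpot-specific com.sun.management.HotSpotDiagnosticMXBean; HeapSizingCheck and isFixedSize are illustrative names.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class HeapSizingCheck {

    /** True when the initial heap equals the maximum, so the heap never has to grow. */
    static boolean isFixedSize(long initialBytes, long maxBytes) {
        return initialBytes > 0 && initialBytes == maxBytes;
    }

    public static void main(String[] args) {
        // HotSpot-specific bean; other JVM implementations may not provide it.
        HotSpotDiagnosticMXBean hotspot =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // InitialHeapSize and MaxHeapSize are the flags that -Xms and -Xmx set.
        long initial = Long.parseLong(hotspot.getVMOption("InitialHeapSize").getValue());
        long max = Long.parseLong(hotspot.getVMOption("MaxHeapSize").getValue());
        System.out.printf("InitialHeapSize=%d MiB MaxHeapSize=%d MiB fixed=%b%n",
                initial >> 20, max >> 20, isFixedSize(initial, max));
    }
}
```

Running it with -Xms4g -Xmx4g should report fixed=true; running it with only -Xmx4g will usually report a smaller initial size chosen by the JVM.
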
Object allocation patterns: If your application creates very large numbers of short-lived objects, the Young Generation (Eden space) might fill up rapidly, triggering frequent Minor GCs.
- Diagnosis: Monitor the YGCT (Young Generation GC Time) in jstat. If it’s high relative to GCT (Total GC Time), this is a clue.
- Fix: This usually points to code-level optimization (creating fewer temporary objects), but you can also give the Young Generation more room. With the Serial or Parallel collector, -XX:NewRatio sets the ratio of old to young generation; the default of 2 makes the Young Generation about 1/3 of the heap. G1 and ZGC size their generations adaptively, and for G1 the pause goal -XX:MaxGCPauseMillis is the main setting that influences Young Generation sizing.
- Why it works: Optimizing object creation or tuning Young Gen sizing can reduce the frequency of Minor GCs, which are typically less impactful than Full GCs.
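
As one small illustration of the code-level side, the sketch below contrasts a boxed accumulator with a primitive one. Both methods return the same sum, but the boxed version can allocate a new Long on each iteration once the value leaves the small cached range (-128 to 127). This is an invented example, not taken from the ItemController above, and the JIT compiler may remove some of these allocations on its own.

```java
final class AllocationDemo {

    /** Boxed accumulator: each += unboxes, adds, and boxes the result again. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i; // may allocate a short-lived Long per iteration
        }
        return total;
    }

    /** Primitive accumulator: no per-iteration object allocation. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1000));
        System.out.println(sumPrimitive(1000));
    }
}
```

In a hot request path, patterns like the boxed loop are a typical source of Eden pressure that no amount of flag tuning fully compensates for.
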
Metaspace/PermGen issues: Class metadata does not live on the heap. It is stored in Metaspace (Java 8+) or PermGen (Java 7 and below), and excessive class loading or a classloader leak there can lead to OutOfMemoryError or GC thrashing.
- Diagnosis: Monitor Metaspace usage (via JMX or the M column of jstat -gcutil <pid>).
- Fix: For Java 8+, Metaspace is unbounded by default, so set a ceiling with -XX:MaxMetaspaceSize=<size>, for example -XX:MaxMetaspaceSize=256m, and raise it if the application legitimately needs more. For older JVMs, tune -XX:MaxPermSize.
- Why it works: A ceiling on Metaspace turns a classloader leak into a quick, visible OutOfMemoryError: Metaspace instead of unbounded native memory growth. On PermGen, a larger limit gives class metadata the room it needs.
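
For the JMX route, non-heap pools such as Metaspace are exposed through the standard MemoryPoolMXBean interface. The sketch below prints every non-heap pool; the pool names themselves (for example "Metaspace") are whatever the running JVM reports, and MetaspaceReport and describe are names made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

final class MetaspaceReport {

    /** Formats one pool; a negative max means the pool has no configured limit. */
    static String describe(String name, long usedBytes, long maxBytes) {
        String max = maxBytes < 0 ? "unlimited" : (maxBytes >> 20) + " MiB";
        return name + ": used=" + (usedBytes >> 20) + " MiB, max=" + max;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.NON_HEAP) {
                continue; // skip Eden, Survivor, Old Gen and other heap pools
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null) { // null if the pool is no longer valid
                System.out.println(describe(pool.getName(), usage.getUsed(), usage.getMax()));
            }
        }
    }
}
```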

Garbage Collector Flags

The choice of GC algorithm and its specific flags are critical. Modern JVMs (Java 9+) default to G1 (Garbage-First) GC, which is generally a good balance for server applications. For very low-latency requirements, ZGC or Shenandoah might be considered.

Diagnosis: Use -Xlog:gc*=info (or debug for more detail; available on Java 9+) to get GC logs, and add an output such as -Xlog:gc*:file=gc.log to write them to a file. Analyze these logs for pause times, frequency, and memory reclaimed. The jcmd <pid> GC.heap_info command can also provide a snapshot.
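
To give a feel for what analyzing these logs involves, here is a rough sketch of a parser for pause lines. It assumes the common unified-logging shape "GC(n) Pause ... <before>M-><after>M(<total>M) <time>ms"; the sample line in main is illustrative, and the exact layout can differ between JDK versions and collectors. GcPauseParser and Pause are hypothetical names.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GcPauseParser {

    // Captures: GC id, pause description, heap before/after/total in MB, duration in ms.
    private static final Pattern PAUSE = Pattern.compile(
            "GC\\((\\d+)\\) (Pause .+?) (\\d+)M->(\\d+)M\\((\\d+)M\\) (\\d+\\.\\d+)ms");

    static final class Pause {
        final String kind;
        final long beforeMb;
        final long afterMb;
        final double millis;

        Pause(String kind, long beforeMb, long afterMb, double millis) {
            this.kind = kind;
            this.beforeMb = beforeMb;
            this.afterMb = afterMb;
            this.millis = millis;
        }
    }

    /** Returns the parsed pause, or empty if the line is not a pause line. */
    static Optional<Pause> parse(String line) {
        Matcher m = PAUSE.matcher(line);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new Pause(m.group(2), Long.parseLong(m.group(3)),
                Long.parseLong(m.group(4)), Double.parseDouble(m.group(6))));
    }

    public static void main(String[] args) {
        String sample = "[2.345s][info][gc] GC(12) Pause Young (Normal) "
                + "(G1 Evacuation Pause) 102M->24M(256M) 7.123ms";
        parse(sample).ifPresent(p -> System.out.printf("%s: reclaimed %d MB in %.3f ms%n",
                p.kind, p.beforeMb - p.afterMb, p.millis));
    }
}
```

For anything beyond quick checks, a dedicated GC log analyzer will handle far more line types than this sketch does.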

Common Causes & Fixes (Focusing on G1 GC):

G1’s target pause time too aggressive/too lenient: G1 aims to meet the pause time goal set by -XX:MaxGCPauseMillis=<ms>. If this is set too low (e.g., 10ms), G1 might trigger more frequent, smaller GCs to meet the goal, potentially impacting throughput. If set too high (e.g., 1s), pauses might become unacceptable.
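- Fix: Adjust -XX:MaxGCPauseMillis. The default is 200 ms, which is a reasonable starting point for many applications; throughput-oriented services can often tolerate 500 ms. Example: -XX:MaxGCPauseMillis=200. Remember that this is a goal G1 tries to meet, not a guarantee.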
- Why it works: This flag guides G1’s heuristics in deciding when to start a GC cycle and how much work to do, balancing latency and throughput.
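
Because the pause target is a goal rather than a hard limit, it helps to measure how often real pauses exceed it. The sketch below does that for a list of pause durations (for instance, extracted from GC logs); the sample numbers in main are made up, and PauseGoalCheck is an illustrative name. The percentile uses the simple nearest-rank method.

```java
import java.util.Arrays;

final class PauseGoalCheck {

    /** Share of pauses that exceeded the goal, in the range 0..1. */
    static double fractionOver(double[] pausesMs, double goalMs) {
        if (pausesMs.length == 0) {
            return 0.0;
        }
        long over = Arrays.stream(pausesMs).filter(p -> p > goalMs).count();
        return (double) over / pausesMs.length;
    }

    /** Nearest-rank percentile of the recorded pauses, for p in (0, 100]. */
    static double percentile(double[] pausesMs, double p) {
        if (pausesMs.length == 0) {
            throw new IllegalArgumentException("no pauses recorded");
        }
        double[] sorted = pausesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        double[] pauses = {50, 120, 180, 250, 400}; // made-up sample values, in ms
        System.out.printf("over 200 ms goal: %.0f%%%n", 100 * fractionOver(pauses, 200));
        System.out.printf("p99 pause: %.0f ms%n", percentile(pauses, 99));
    }
}
```

If a large share of pauses misses the goal, the goal is probably unrealistic for the current heap size and allocation rate, and lowering it further will not help.
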
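G1’s region size not optimal: G1 divides the heap into equal-sized regions. By default the JVM derives the region size from the heap size (a power of two between 1 MB and 32 MB, aiming for roughly 2048 regions); -XX:G1HeapRegionSize=<size> overrides it. If the region size is too small for your typical object sizes, many allocations become humongous and space is wasted.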
- Diagnosis: Look for messages in GC logs about region size calculations or excessive humongous allocations (allocations larger than half a region).
- Fix: This is usually best left to JVM defaults. However, if you have very large objects, you might need to increase the heap size, which will automatically increase region size. Explicitly setting -XX:G1HeapRegionSize is rarely needed and can be complex.
- Why it works: Ensures that G1 can efficiently manage memory, especially for large objects, and minimizes wasted space within regions.
Promotion failures / Old Gen too small: If short-lived objects are promoted to the Old Generation too early and then become garbage there (premature promotion), or if the Old Gen simply fills up, the result can be excessive Full GCs.
- Diagnosis: Monitor O (Old Gen utilization) in jstat -gcutil and look for Full GC events.
- Fix: This often means increasing -Xmx (as covered in Heap Size) or tuning -XX:G1NewSizePercent and -XX:G1MaxNewSizePercent, which set the minimum and maximum size of the Young Generation as a percentage of the heap (defaults: 5 and 60). Both are experimental flags, so they must be unlocked first. For example, -XX:+UnlockExperimentalVMOptions -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=60.
- Why it works: A larger Young Generation gives short-lived objects more time to die before they are promoted, so only genuinely long-lived objects reach the Old Gen and it fills more slowly.
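
As a worked example of what those percentages mean, the sketch below converts them into approximate sizes for a given heap. G1 rounds the real values to whole regions, so treat the output as an estimate; G1YoungBounds and percentOfHeapMiB are names invented for this example.

```java
final class G1YoungBounds {

    /** Size in MiB that the given percentage of the heap corresponds to, rounded down. */
    static long percentOfHeapMiB(long heapMiB, int percent) {
        return heapMiB * percent / 100;
    }

    public static void main(String[] args) {
        long heapMiB = 8192; // assumed -Xmx8g
        System.out.printf("default bounds (5%%..60%%): %d..%d MiB%n",
                percentOfHeapMiB(heapMiB, 5), percentOfHeapMiB(heapMiB, 60));
        System.out.printf("tuned bounds (30%%..60%%): %d..%d MiB%n",
                percentOfHeapMiB(heapMiB, 30), percentOfHeapMiB(heapMiB, 60));
    }
}
```

On an 8 GiB heap, raising the minimum from 5% to 30% lifts the smallest possible Young Generation from roughly 0.4 GiB to roughly 2.4 GiB.
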
Excessive Humongous Allocations: G1 handles allocations larger than half a region specially (as "Humongous objects"). These are allocated directly into the Old Gen and are expensive to collect.
- Diagnosis: GC logs show "G1 Humongous Allocation" as the cause of collections triggered by such allocations, and the "Humongous regions" line in each collection’s heap summary shows how many regions these objects occupy.
- Fix: This is almost always a code-level issue. Applications should avoid creating extremely large, single objects (e.g., huge byte arrays, large serialized objects). If unavoidable, ensure your heap is large enough, and consider raising -XX:G1HeapRegionSize so that these objects fall below half a region.
- Why it works: Reduces the burden on the GC by avoiding these special, costly allocations.
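
The half-a-region rule is easy to check by hand. The sketch below applies it to a given object size and region size, and also estimates how many contiguous regions a humongous object would occupy. The sizes passed in should include the object header (a few extra bytes for an array); HumongousCheck and its methods are illustrative names.

```java
final class HumongousCheck {

    /** True if an object of this size exceeds half a region and is treated as humongous. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of whole regions needed to hold the object (rounded up). */
    static long regionsOccupied(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 4L << 20; // assumed 4 MiB regions
        long payload = 9L << 20; // e.g. a byte array of about 9 MiB
        System.out.printf("humongous=%b regions=%d%n",
                isHumongous(payload, region), regionsOccupied(payload, region));
    }
}
```

With 4 MiB regions, the 9 MiB object takes three regions, and the unused part of the last region cannot hold other objects while the humongous object is alive.
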
Using the wrong GC algorithm: While G1 is the default, if you have extreme latency requirements (sub-millisecond pauses), G1 might not be sufficient.
- Diagnosis: Persistent, unacceptable pause times even after tuning G1.
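- Fix: Switch to ZGC (-XX:+UseZGC) or Shenandoah (-XX:+UseShenandoahGC). These are concurrent collectors designed for very low pause times but may have slightly higher CPU overhead. Both have been production features since JDK 15; on older releases they also require -XX:+UnlockExperimentalVMOptions, and Shenandoah is not included in every JDK build (Oracle’s builds omit it, for example).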
- Why it works: These GCs perform almost all their work concurrently with the application threads, drastically reducing or eliminating stop-the-world pauses.
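
After switching collectors, it is worth verifying which one the JVM actually selected. A minimal sketch: read the names of the registered GarbageCollectorMXBeans and map them to a collector. The name prefixes below are the ones HotSpot builds commonly report (such as "G1 Young Generation" or "PS Scavenge"); they are an assumption rather than a documented contract, and CollectorDetector is a hypothetical name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class CollectorDetector {

    /** Maps a GC bean name to a collector family, based on commonly seen HotSpot names. */
    static String classify(String beanName) {
        if (beanName.startsWith("G1")) return "G1";
        if (beanName.startsWith("ZGC")) return "ZGC";
        if (beanName.startsWith("Shenandoah")) return "Shenandoah";
        if (beanName.startsWith("PS ")) return "Parallel";
        if (beanName.equals("Copy") || beanName.equals("MarkSweepCompact")) return "Serial";
        return "unknown";
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " -> " + classify(gc.getName()));
        }
    }
}
```

The first line of a GC log written with -Xlog:gc (for example "Using G1") gives the same answer without any code.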

The next thing you’ll likely encounter after tuning heap and GC is understanding thread pool saturation and how it interacts with GC pauses.