Java’s garbage collector isn’t a mysterious black box; it’s a highly configurable system that you can tune to perform better for your specific application’s workload.

Let’s look at a simple Java application that simulates some work, and then use JFR and Async Profiler to analyze its performance.

import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

public class ProfilingDemo {

    private static final int OBJECT_COUNT = 100000;
    private static final int VALUE_RANGE = 1000;

    public static void main(String[] args) throws InterruptedException {
        System.out.println("Starting profiling demo...");
        List<String> data = new ArrayList<>();
        Random random = new Random();

        // Simulate some work and object creation
        for (int i = 0; i < OBJECT_COUNT; i++) {
            data.add("Item-" + random.nextInt(VALUE_RANGE));
            if (i % 10000 == 0) {
                TimeUnit.MILLISECONDS.sleep(10); // Simulate some pauses
            }
        }

        // Keep the application running for a bit to allow profiling
        System.out.println("Data generated. Application running. Press Ctrl+C to exit.");
        while (true) {
            TimeUnit.SECONDS.sleep(5);
        }
    }
}

To profile this, we’ll use Java Flight Recorder (JFR) and Async Profiler.

First, let’s run the application with JFR enabled. JFR is built into the JVM and provides low-overhead profiling. You can start it with JVM arguments.

java -XX:StartFlightRecording=duration=60s,filename=myrecording.jfr -jar ProfilingDemo.jar

This command starts the ProfilingDemo application and simultaneously begins a JFR recording for 60 seconds, saving the data to myrecording.jfr. Once the recording is done, you can analyze this file using tools like Java Mission Control (JMC). JMC is a powerful GUI that visualizes JFR data, showing you CPU usage, allocation rates, GC activity, and more. You’ll see threads consuming CPU, objects being allocated, and potentially garbage collection pauses.
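A recording can also be started from inside the application with the `jdk.jfr` API that ships with JDK 11 and later. The following is a minimal sketch, not the only way to do it: the class name `InProcessRecording`, the output file name, and the small workload are illustrative assumptions, and it uses the JDK’s built-in `default` settings.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;
import jdk.jfr.Configuration;
import jdk.jfr.Recording;

public class InProcessRecording {

    /** Runs the workload under a JFR recording and writes the result to target. */
    public static Path record(Runnable workload, Path target)
            throws IOException, ParseException {
        // "default" is the low-overhead settings profile bundled with the JDK.
        Configuration settings = Configuration.getConfiguration("default");
        try (Recording recording = new Recording(settings)) {
            recording.start();
            workload.run();
            recording.stop();
            recording.dump(target); // write the captured events to disk
        }
        return target;
    }

    public static void main(String[] args) throws Exception {
        // Illustrative workload, similar in spirit to the demo loop.
        Path file = record(() -> {
            List<String> data = new ArrayList<>();
            for (int i = 0; i < 100_000; i++) {
                data.add("Item-" + i);
            }
        }, Path.of("inprocess.jfr"));
        System.out.println("Wrote " + Files.size(file) + " bytes to " + file);
    }
}
```

The resulting file can be opened in JMC just like one produced by the command-line flag.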

Now, let’s bring in Async Profiler. It’s a versatile tool that can sample CPU and memory allocations with very low overhead. It’s particularly good at pinpointing hot code paths.

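To use Async Profiler, you first download a release, then attach it to a running Java process. Let’s assume ProfilingDemo is running in the background and you know its PID (e.g., 12345). Keep in mind that the demo finishes its allocation loop in well under a second and then only sleeps, so a profile captured after that point shows an almost idle process. For a meaningful flame graph, attach while the workload is still running, for example by raising OBJECT_COUNT or repeating the loop.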
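If you don’t know the PID, the JDK’s `jps -l` command lists running Java processes. Another option is to have the application print its own PID at startup. A small sketch, assuming Java 9 or later; the class name `PidBanner` is hypothetical, and in the demo the same call could simply go at the top of `main`:

```java
public class PidBanner {

    /** Returns the operating-system process ID of the current JVM. */
    public static long currentPid() {
        return ProcessHandle.current().pid();
    }

    public static void main(String[] args) {
        // Print the PID so it can be passed to the profiler's attach command.
        System.out.println("Running with PID " + currentPid());
    }
}
```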

To get a CPU profile for 30 seconds:

./profiler.sh -d 30 -f cpu.svg 12345

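This command attaches Async Profiler to PID 12345, records CPU activity for 30 seconds, and writes a flame graph to cpu.svg. Flame graphs are excellent for visualizing CPU hotspots: the wider a frame, the more samples were taken in that function or in something it called. Note that Async Profiler 2.0 and later write flame graphs as HTML rather than SVG, so with a recent release use a file name ending in .html; from 3.0 onward the launcher is called asprof instead of profiler.sh.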
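To make the meaning of frame width concrete, here is a rough sketch of how widths can be derived from "collapsed" stack output, a text format in which each line is a semicolon-separated stack followed by a sample count (Async Profiler can emit it with `-o collapsed`). The class name and the sample data are made up for illustration, and this is not how the profiler itself renders graphs.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;

public class FlameWidthSketch {

    /** Per frame, the number of samples whose stack contains that frame. */
    public static Map<String, Long> inclusiveSamples(List<String> collapsedLines) {
        Map<String, Long> totals = new LinkedHashMap<>();
        for (String line : collapsedLines) {
            int split = line.lastIndexOf(' ');
            if (split < 0) {
                continue; // not a "stack count" line
            }
            long count = Long.parseLong(line.substring(split + 1).trim());
            String[] frames = line.substring(0, split).split(";");
            // Count each frame once per stack, even if recursion repeats it.
            for (String frame : new LinkedHashSet<>(Arrays.asList(frames))) {
                totals.merge(frame, count, Long::sum);
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical samples, not real profiler output.
        List<String> samples = List.of(
                "main;generate;StringBuilder.append 60",
                "main;generate;Random.nextInt 25",
                "main;sleep 15");
        // prints {main=100, generate=85, StringBuilder.append=60, Random.nextInt=25, sleep=15}
        System.out.println(inclusiveSamples(samples));
    }
}
```

In this made-up data, `generate` would span 85% of the graph’s width, most of it sitting under `StringBuilder.append`.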

To get a memory allocation profile for 30 seconds:

./profiler.sh -d 30 -f alloc.svg -e alloc 12345

This command does the same but profiles memory allocations (-e alloc), showing you where objects are being created. This is crucial for understanding memory pressure and potential GC issues.

The most surprising thing about garbage collection is how often it’s not the bottleneck. Many developers blame GC for performance issues when, in reality, the problem lies in inefficient object allocation patterns or excessive CPU usage by the application logic itself. Profiling tools like JFR and Async Profiler help you distinguish between these cases. For example, you might see frequent collections in JMC, but if the pauses are short and Async Profiler’s allocation profile shows that the allocations come from code paths where they are expected for the workload, the GC is probably doing its job efficiently. If instead the allocation profile points at a hot spot that creates far more objects than it needs to, the real fix is in the application code rather than in the GC settings.
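One quick sanity check before reaching for GC flags is to compare the time the JVM reports for collections with the wall-clock time of a workload, using the standard `java.lang.management` API. This is only a rough sketch: the class and method names are hypothetical, and the reported collection time is approximate and collector-dependent.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcShareSketch {

    /** Approximate total milliseconds spent in collections since JVM start. */
    public static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Rough fraction of wall-clock time attributed to GC while the workload ran. */
    public static double gcShare(Runnable workload) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();
        workload.run();
        long elapsedMillis = Math.max(1, (System.nanoTime() - start) / 1_000_000);
        return (double) (totalGcMillis() - gcBefore) / elapsedMillis;
    }

    public static void main(String[] args) {
        double share = gcShare(() -> {
            StringBuilder sink = new StringBuilder();
            for (int i = 0; i < 1_000_000; i++) {
                sink.setLength(0);
                sink.append("Item-").append(i); // allocation-heavy busy work
            }
        });
        System.out.printf("GC share of wall-clock time: %.1f%%%n", share * 100);
    }
}
```

A small share suggests looking at CPU and allocation profiles first; a large one is a hint that allocation pressure or GC configuration deserves a closer look.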

When analyzing JFR data in JMC, pay close attention to the "Allocation Statistics" and "Garbage Collection" tabs (the exact page names vary between JMC versions). You’ll see metrics like "Total Allocations," "Live Generations," and "Pause Times." In Async Profiler’s flame graphs, look for the widest frames: in a CPU profile they mark where most of the time is spent, and in an allocation profile they mark where most of the memory is allocated.
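Pause times can also be read from a recording without a GUI, using the `jdk.jfr.consumer` API. The sketch below assumes a recording that contains `jdk.GarbageCollection` events with a `longestPause` field, as current JDKs produce; the class name `GcPauseReport` is hypothetical.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.time.Duration;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class GcPauseReport {

    /** Returns the longest single GC pause found in the recording, or zero if none. */
    public static Duration longestPause(Path jfrFile) throws IOException {
        Duration longest = Duration.ZERO;
        try (RecordingFile recording = new RecordingFile(jfrFile)) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                boolean isGc = "jdk.GarbageCollection".equals(event.getEventType().getName());
                if (!isGc || !event.hasField("longestPause")) {
                    continue; // skip unrelated events
                }
                Duration pause = event.getDuration("longestPause");
                if (pause.compareTo(longest) > 0) {
                    longest = pause;
                }
            }
        }
        return longest;
    }

    public static void main(String[] args) throws IOException {
        // Usage: java GcPauseReport.java myrecording.jfr
        Duration longest = longestPause(Path.of(args[0]));
        System.out.println("Longest GC pause: " + longest.toMillis() + " ms");
    }
}
```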

A common pitfall is to focus solely on GC pauses shown in JMC. While long GC pauses are bad, they might be a symptom, not the root cause. If your application is creating millions of small, short-lived objects, the GC will be busy. Profiling allocations with Async Profiler will reveal this, and optimizing the code to reduce unnecessary object creation is often more effective than tuning GC parameters.
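As an illustration of reducing unnecessary object creation, consider the demo loop itself: it builds 100,000 strings even though only 1,000 distinct values are possible. The sketch below compares that approach with one that builds each label once and reuses it. The class and method names are hypothetical, and whether such a change pays off in a real application is something the allocation profile should confirm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class AllocationReductionSketch {

    /** Mirrors the demo loop: concatenates a new String for every element. */
    public static List<String> naive(int count, int range, long seed) {
        Random random = new Random(seed);
        List<String> data = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            data.add("Item-" + random.nextInt(range));
        }
        return data;
    }

    /** Builds each distinct label once up front and reuses the instances. */
    public static List<String> reused(int count, int range, long seed) {
        String[] labels = new String[range];
        for (int value = 0; value < range; value++) {
            labels[value] = "Item-" + value;
        }
        Random random = new Random(seed);
        List<String> data = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            data.add(labels[random.nextInt(range)]);
        }
        return data;
    }

    public static void main(String[] args) {
        List<String> a = naive(100_000, 1_000, 42L);
        List<String> b = reused(100_000, 1_000, 42L);
        System.out.println("Same contents: " + a.equals(b));
    }
}
```

Both methods produce the same list contents for the same seed, but the second allocates at most `range` strings instead of `count`.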

The next step in deep-diving into performance is often understanding thread contention and lock profiling.
