The Java Virtual Machine’s Just-In-Time (JIT) compilers are the unsung heroes of Java performance, and they don’t just compile your code once; they can compile it multiple times for increasingly better performance.

Let’s see this in action. Imagine a simple Java method:

public class Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

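When you first run this code, the JVM’s interpreter executes it. But the JVM is watching: it counts method invocations and loop iterations as the program runs. If add is called frequently enough to cross a compilation threshold, the JIT compiler steps in.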

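The first compiler, C1 (often called the "client" compiler), is fast. It quickly compiles the add method into native machine code, applying only light optimizations. This makes subsequent calls much faster than interpretation. C1 focuses on getting code running quickly, and in tiered mode its output can also carry profiling instrumentation that later guides C2.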
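To watch this hand-off from interpreter to compiled code, a method has to be called often enough to get hot. Below is a minimal sketch, assuming a hypothetical HotAdder class (not part of the original example) that calls add in a loop. Running it with the HotSpot flag -XX:+PrintCompilation should list add among the compiled methods, though the exact output and thresholds vary by JVM version.

```java
// Hypothetical demo class: drives add() often enough for the JIT to notice it.
public class HotAdder {
    public int add(int a, int b) {
        return a + b;
    }

    // Calls add() 'iterations' times and accumulates the results in a long.
    public static long runHotLoop(int iterations) {
        HotAdder adder = new HotAdder();
        long sum = 0;
        for (int i = 0; i < iterations; i++) {
            sum += adder.add(i, 1);
        }
        return sum;
    }

    public static void main(String[] args) {
        // One million calls is usually far more than enough to trigger compilation.
        System.out.println(runHotLoop(1_000_000)); // prints 500000500000
    }
}
```

Try it with: java -XX:+PrintCompilation HotAdder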

Here’s what C1 might generate (conceptually, not actual assembly):

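; C1 compiled code for Adder.add (simplified)
; Registers (illustrative: ignores the receiver 'this' and HotSpot's real calling convention):
;   %edi: first argument (a)
;   %esi: second argument (b)
;   %eax: return value

    mov %edi, %eax     ; Copy a to eax
    add %esi, %eax     ; Add b to eax (32-bit int addition)
    ret                ; Return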

But the JVM keeps monitoring. If the add method proves to be a "hot spot" (it’s called very often and is critical to performance), the second compiler, C2 (the "server" compiler), gets involved. C2 takes longer to compile but produces much more optimized code. It performs aggressive optimizations such as inlining, dead code elimination, and loop unrolling, guided by profiling data about how the method is actually being used.

Consider what C2 can do with add. If it is called within a tight loop, C2 can inline the method directly into the loop’s compiled code, eliminating the overhead of a method call entirely. Once inlined, any constant the caller passes (say, a literal 1) can be folded straight into the add instruction. Note that there are no overflow checks for C2 to remove here: Java’s int addition is defined to wrap around, so none are generated in the first place.
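One detail about add itself is worth pinning down: Java’s int addition never checks for overflow. It wraps around silently, and a checked addition has to be requested explicitly with Math.addExact, which throws ArithmeticException. A minimal sketch (the OverflowDemo class and its method names are hypothetical):

```java
// Hypothetical demo class contrasting wrapping and checked int addition.
public class OverflowDemo {
    // Plain '+' on ints wraps around on overflow (two's complement).
    public static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact throws ArithmeticException when the int result overflows.
    public static boolean overflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1)); // prints -2147483648
        System.out.println(overflows(Integer.MAX_VALUE, 1));   // prints true
        System.out.println(overflows(1, 2));                   // prints false
    }
}
```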

Here’s a conceptual glimpse of C2’s output, assuming add was inlined into a loop that adds the constant 1 to each of the 100 elements of an int array:

; C2 compiled code for a loop calling Adder.add
; Registers:
;   %r8: loop counter
;   %r9: base address of array 'data'

.loop_start:
    mov (%r9, %r8, 4), %eax  ; Load data[i] into eax
    add $1, %eax             ; Add 1 (assuming 'b' was always 1)
    mov %eax, (%r9, %r8, 4)  ; Store result back
    inc %r8                  ; Increment loop counter
    cmp $100, %r8            ; Loop 100 times
    jl .loop_start           ; Jump if less than

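This is where the "multiple compilations" come in. C1 gets code running fast now. C2 does not start from C1’s machine code: it recompiles the method from bytecode, guided by the profile gathered while the interpreter and C1-compiled code were running, and generates highly optimized code for sustained high performance. Because many of C2’s optimizations are speculative, the JVM can also deoptimize: if an assumption baked into the compiled code is violated (for example, a call site that had only ever seen one receiver type suddenly sees another, or a newly loaded class invalidates an inlining decision), execution falls back to the interpreter, and the method can later be recompiled with updated profile data.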
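Deoptimization can be provoked, at least in principle, with a call site that sees a single receiver type during warm-up and a different one afterwards. The sketch below uses hypothetical names (DeoptDemo, Op, Add, Sub). Whether the JVM actually speculates and then deoptimizes here depends on its version and flags. With -XX:+PrintCompilation, invalidated code typically shows up marked "made not entrant".

```java
// Hypothetical demo: a call site that is monomorphic during warm-up,
// then sees a second implementation, which may invalidate compiled code.
public class DeoptDemo {
    public interface Op {
        int apply(int a, int b);
    }

    public static final class Add implements Op {
        public int apply(int a, int b) { return a + b; }
    }

    public static final class Sub implements Op {
        public int apply(int a, int b) { return a - b; }
    }

    // The op.apply(...) call below is the call site the JIT may speculate on.
    public static long run(Op op, int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += op.apply(i, 1);
        }
        return total;
    }

    public static void main(String[] args) {
        // Phase 1: only Add has been seen, so the JIT may inline Add.apply.
        long warm = run(new Add(), 5_000_000);
        // Phase 2: Sub appears for the first time; earlier assumptions may no longer hold.
        long changed = run(new Sub(), 5_000_000);
        System.out.println(warm + " " + changed);
    }
}
```

The program’s results are the same whether or not deoptimization happens; only the compilation log differs.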

Perhaps surprisingly, the "best" compiler isn’t always C2. For short-running applications, or those where startup time matters most, C1’s faster compilation can give better overall results, because the program may finish before C2’s expensive optimizations pay for themselves. With tiered compilation, the default in modern HotSpot JVMs, the JVM moves each method between the interpreter, C1, and C2 based on profiling data, a sophisticated dance of interpreting, compiling, and recompiling.
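To see which JIT configuration a running JVM reports, the standard java.lang.management API exposes a CompilationMXBean. This is a minimal sketch with a hypothetical JitInfo class. The name string it prints is JVM-specific, and the bean can be absent when the JVM runs without a compiler.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper that reports the JIT name and the time spent compiling.
public class JitInfo {
    public static String describeJit() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null) {
            // getCompilationMXBean() returns null if the JVM has no compilation system.
            return "no JIT compiler available";
        }
        String time = jit.isCompilationTimeMonitoringSupported()
                ? jit.getTotalCompilationTime() + " ms spent compiling so far"
                : "compilation time monitoring not supported";
        return jit.getName() + ": " + time;
    }

    public static void main(String[] args) {
        System.out.println(describeJit());
    }
}
```

On HotSpot, running this with -XX:TieredStopAtLevel=1 (C1 only) or -XX:-TieredCompilation (tiering disabled) is one way to compare how the compiler choice affects a given workload.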

A more recent alternative is GraalVM’s Native Image, which takes a different route. Instead of JIT compilation at runtime, it performs Ahead-Of-Time (AOT) compilation: your Java application is compiled into a standalone executable binary before it ever runs, so there is no JVM startup, interpretation, or JIT warm-up phase. This results in much faster startup and a lower memory footprint, though it comes with its own set of challenges. Dynamic features such as reflection and dynamic class loading generally have to be known at build time, and without runtime profiling the compiler cannot make the same speculative optimizations a JIT does.

The next hurdle you’ll likely encounter is understanding how to tune these compilers, especially when dealing with very specific performance bottlenecks or unusual application patterns.
