Java’s virtual threads, born from Project Loom and finalized in JDK 21, let you write concurrent code in a plain sequential, blocking style while still scaling to very large numbers of tasks. They are lightweight threads that the JVM multiplexes onto a small number of OS threads.
Let’s see this in action. Imagine a service that fetches data from three external APIs. The calls are independent, so ideally the total time would be dominated by the slowest one.
// Platform thread, blocking I/O: the three calls run one after another
public class TraditionalService {

    public String fetchData() {
        String result1 = callApi1();
        String result2 = callApi2();
        String result3 = callApi3();
        return result1 + result2 + result3;
    }

    private String callApi1() {
        // Simulate blocking I/O
        try { Thread.sleep(2000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data1 ";
    }

    private String callApi2() {
        // Simulate blocking I/O
        try { Thread.sleep(3000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data2 ";
    }

    private String callApi3() {
        // Simulate blocking I/O
        try { Thread.sleep(1000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data3 ";
    }

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        TraditionalService service = new TraditionalService();
        String result = service.fetchData();
        long endTime = System.currentTimeMillis();
        System.out.println("Traditional Result: " + result);
        System.out.println("Traditional Time: " + (endTime - startTime) + "ms");
    }
}
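If you run this, it takes approximately 6 seconds (2s + 3s + 1s), because the calls run one after another and each one blocks the calling platform thread until it returns. You could run the calls in parallel on a pool of platform threads, but under load every in-flight call would occupy a pool thread for its full duration, and you would quickly exhaust the pool, leading to poor scalability.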
Now, let’s rewrite this using virtual threads.
// Virtual threads: the same blocking calls, each on its own virtual thread (JDK 21+)
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;

public class VirtualThreadService {

    public String fetchData() {
        // The executor starts one new virtual thread per submitted task;
        // leaving the try block closes it, which waits for the tasks to finish
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            var future1 = executor.submit(this::callApi1);
            var future2 = executor.submit(this::callApi2);
            var future3 = executor.submit(this::callApi3);
            // All three calls are already running; get() waits for each result
            return future1.get() + future2.get() + future3.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
    }

    private String callApi1() {
        // Simulate blocking I/O
        try { Thread.sleep(2000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data1 ";
    }

    private String callApi2() {
        // Simulate blocking I/O
        try { Thread.sleep(3000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data2 ";
    }

    private String callApi3() {
        // Simulate blocking I/O
        try { Thread.sleep(1000); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "Data3 ";
    }

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();
        VirtualThreadService service = new VirtualThreadService();
        String result = service.fetchData();
        long endTime = System.currentTimeMillis();
        System.out.println("Virtual Thread Result: " + result);
        System.out.println("Virtual Thread Time: " + (endTime - startTime) + "ms");
    }
}
When you run the virtual thread version, it takes around 3 seconds, matching the longest API call, because the three calls now run concurrently. A pool of platform threads would give the same timing for three calls; what virtual threads change is the cost of doing this at scale. When a virtual thread blocks in Thread.sleep() (or in most blocking network I/O), the JVM unmounts it from its underlying OS thread, and that OS thread is immediately available to run another virtual thread. When the sleep finishes, the virtual thread is scheduled again and mounted on an available OS thread to continue its execution. This allows thousands, or even millions, of virtual threads to run concurrently on a small pool of OS threads.
The core problem virtual threads solve is the mismatch between the cost of OS-backed threads and the common pattern of I/O-bound workloads. A traditional Java thread (now called a platform thread) is a thin wrapper around an OS thread, mapped 1:1. OS threads are expensive: each one reserves a sizable stack (commonly around 1 MB by default), and context switching between them costs CPU time. This caps the number of concurrent requests a thread-per-request application can handle efficiently, and it pushes developers toward more complex asynchronous programming models (like CompletableFuture or reactive frameworks) for I/O-bound tasks, just to avoid blocking threads.
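To make the concurrency cap concrete, here is a minimal sketch that runs the same sleeping tasks on a deliberately tiny fixed pool of platform threads and then on a virtual-thread-per-task executor. The class name PoolLimitDemo, the helper timeTasks, and the pool size of 2 are illustrative assumptions chosen to exaggerate the effect; it needs JDK 21 or later.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolLimitDemo {

    /** Runs `tasks` sleeping tasks on the given executor and returns the elapsed milliseconds. */
    static long timeTasks(ExecutorService executor, int tasks, long sleepMillis) {
        long start = System.nanoTime();
        try (executor) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    Thread.sleep(sleepMillis); // stand-in for a blocking I/O call
                    return null;               // returning a value makes this a Callable
                });
            }
        } // close() waits for all submitted tasks to finish
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        // Two platform threads can only run two of the eight tasks at a time
        long pooled = timeTasks(Executors.newFixedThreadPool(2), 8, 200);
        // Each task gets its own virtual thread, so all eight sleep concurrently
        long virtual = timeTasks(Executors.newVirtualThreadPerTaskExecutor(), 8, 200);
        System.out.println("fixed pool of 2: " + pooled + "ms");
        System.out.println("virtual threads: " + virtual + "ms");
    }
}
```

With eight 200 ms tasks, the fixed pool needs at least four rounds, while the virtual thread run should finish in roughly the time of a single task. Exact numbers vary by machine.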
Virtual threads are cheap. They have a small, dynamically sized stack that is stored on the heap and grows and shrinks as needed, typically only a few kilobytes. They are managed by the JVM and multiplexed onto a small pool of OS threads called "carrier threads." When a virtual thread performs a blocking operation (like Thread.sleep(), a lock acquisition in java.util.concurrent, or a read or write on a network socket), the JVM detects this, saves the state of the virtual thread, and unmounts it from its carrier thread. The carrier thread is then free to run another virtual thread. When the blocking operation completes, the virtual thread is handed back to the scheduler, mounted on an available carrier thread, and execution resumes. The scheduler is a ForkJoinPool dedicated to virtual threads (not ForkJoinPool.commonPool()). Its parallelism defaults to the number of available processors and can be tuned with the jdk.virtualThreadScheduler.parallelism system property, so a handful of carrier threads can serve a very large number of virtual threads.
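One way to observe mounting is to print the current thread from inside a virtual thread before and after it blocks. The sketch below is illustrative only: the class name CarrierDemo and the thread name demo-vt are made up, and the exact toString() format of a virtual thread is not specified. In current JDKs it usually includes the name of the carrier, which may or may not differ between the two prints.

```java
public class CarrierDemo {

    /** Runs a short blocking task on a virtual thread; returns true if the task saw itself as virtual. */
    static boolean runsOnVirtualThread() {
        boolean[] wasVirtual = new boolean[1];
        Thread vt = Thread.ofVirtual().name("demo-vt").start(() -> {
            System.out.println("before sleep: " + Thread.currentThread());
            try {
                // While sleeping, the virtual thread is expected to be unmounted from its carrier
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            // After waking, it may be mounted on the same carrier or a different one
            System.out.println("after sleep:  " + Thread.currentThread());
            wasVirtual[0] = Thread.currentThread().isVirtual();
        });
        try {
            vt.join(); // join() also makes the write to wasVirtual visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return wasVirtual[0];
    }

    public static void main(String[] args) {
        System.out.println("ran on a virtual thread: " + runsOnVirtualThread());
    }
}
```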
The key to understanding their scalability is recognizing that the number of concurrent tasks is no longer limited by the number of OS threads, but by the number of virtual threads the JVM can manage, which is practically limited only by heap memory. This allows you to write simple, sequential-style code for concurrent operations, even when dealing with massive numbers of concurrent I/O-bound requests, without resorting to callback hell or complex reactive streams.
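As a rough illustration of that scale, the following sketch submits thousands of blocking tasks, each on its own virtual thread, using nothing but sequential-style code. The class name ManyTasksDemo, the helper runBlockingTasks, and the task count are assumptions for the example; it needs JDK 21 or later.

```java
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyTasksDemo {

    /** Submits taskCount blocking tasks, one virtual thread each, and returns how many completed. */
    static int runBlockingTasks(int taskCount, Duration blockFor) {
        AtomicInteger completed = new AtomicInteger();
        try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < taskCount; i++) {
                executor.submit(() -> {
                    Thread.sleep(blockFor);             // stand-in for a blocking I/O call
                    return completed.incrementAndGet(); // plain sequential code, no callbacks
                });
            }
        } // close() waits until every task has finished
        return completed.get();
    }

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        int done = runBlockingTasks(10_000, Duration.ofSeconds(1));
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(done + " tasks finished in " + elapsed + "ms");
    }
}
```

Because the tasks spend their time blocked rather than computing, the whole run should take not much longer than a single task. Trying the same thing with one platform thread per task would be far more expensive.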
It helps to be precise about what "blocking" means here. A blocked platform thread keeps its OS thread parked and unavailable until the operation completes. A blocked virtual thread does not: the JVM captures its state and unmounts it from its underlying OS thread (the "carrier thread"), so the carrier is freed up entirely and the scheduler can mount another virtual thread on it. This is crucial. All of it is handled by the virtual thread scheduler and its ForkJoinPool of carrier threads; application code takes no part in it.
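The next step in mastering virtual threads is understanding how to manage their lifecycle and how they interact with existing blocking libraries that were not written with them in mind. For lifecycle control without an executor, the Thread.ofVirtual() builder creates a thread directly: start(task) runs it immediately, unstarted(task) returns a thread that you start() later, and join() waits for either to finish.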
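A minimal sketch of that lifecycle, using only the standard Thread.Builder API, might look like this. The class name VirtualThreadLifecycle and the thread name worker are hypothetical; it needs JDK 21 or later.

```java
public class VirtualThreadLifecycle {

    /** Creates a virtual thread, starts it later, waits for it, and returns what it logged. */
    static String runAndJoin() {
        StringBuilder log = new StringBuilder();

        // unstarted(...) needs the task up front and returns a thread that is not yet scheduled
        Thread worker = Thread.ofVirtual()
                .name("worker")
                .unstarted(() -> log.append("ran on ").append(Thread.currentThread().getName()));

        worker.start(); // hand the virtual thread to the scheduler
        try {
            worker.join(); // wait for completion; this also makes the task's writes visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(runAndJoin());
    }
}
```

Calling join() on a thread that was never started returns immediately, so the order matters: create, start, then join. Thread.ofVirtual().start(task) combines the first two steps when no delay is needed.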