The first thing the CSV makes clear is that the low-concurrency runs
do not tell the whole story. At 50 and 200 concurrent clients, the
classic pool and the virtual-thread server are essentially tied. That
is expected, because the fixed pool still has enough threads to
absorb the blocked requests.

The interesting behavior starts at 1,000 concurrent clients. At that
point, the classic thread pool has effectively run out of runway.
Throughput stays near 1,000 requests per second, but p95 latency
jumps above one second because requests begin waiting in line for a
free thread. Virtual threads keep the same blocking code path, but
they remove the pool as the immediate ceiling. Throughput rises to
about 4,808 requests per second while p95 latency stays close to the
downstream delay.
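
The contrast between the two servers can be reproduced in miniature.
The sketch below is not the benchmark's server code: the class name
BlockingExecutorSketch, the helpers blockingCall and runMillis, the
pool size of 4, and the 50 ms delay are all assumptions chosen to keep
the run short. It submits the same blocking task to a fixed pool and
to a virtual-thread-per-task executor and measures how long each
takes to drain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingExecutorSketch {

    // Stand-in for the blocking downstream call (assumed delay, not the benchmark's).
    static void blockingCall(long delayMillis) {
        try {
            Thread.sleep(delayMillis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while blocked", e);
        }
    }

    // Submits `tasks` blocking calls, waits for all of them, and returns
    // the elapsed wall-clock time in milliseconds. Closes the executor.
    static long runMillis(ExecutorService executor, int tasks, long delayMillis) {
        long start = System.nanoTime();
        try (executor) {
            List<Future<?>> futures = new ArrayList<>();
            for (int i = 0; i < tasks; i++) {
                futures.add(executor.submit(() -> blockingCall(delayMillis)));
            }
            for (Future<?> future : futures) {
                future.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("task failed", e.getCause());
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int tasks = 16;        // assumed number of concurrent requests
        long delayMillis = 50; // assumed downstream delay

        // Fixed pool: only 4 tasks can block at once, the rest wait in the queue.
        long fixed = runMillis(Executors.newFixedThreadPool(4), tasks, delayMillis);

        // Virtual threads: every task gets its own thread and blocks independently.
        long virtual = runMillis(Executors.newVirtualThreadPerTaskExecutor(), tasks, delayMillis);

        System.out.println("fixed pool:      " + fixed + " ms");
        System.out.println("virtual threads: " + virtual + " ms");
    }
}
```

With these assumed values the fixed pool needs at least four rounds
of the delay to finish, while the virtual-thread executor should
finish in roughly one round. Exact timings depend on the machine.
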

At 2,000 concurrent clients, the gap is no longer subtle. The classic
server remains trapped around 958 requests per second, with p95
latency above two seconds. The virtual-thread server reaches roughly
9,569 requests per second and keeps p95 latency at about 214 ms. In
other words, latency on the virtual-thread server is still close to
the downstream delay itself, which means the queueing penalty has
been pushed out of the hot path.
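
These figures can be sanity-checked with Little's Law, which relates
requests in flight, throughput, and mean latency. The sketch below
assumes a closed-loop load generator in which every client always has
exactly one request outstanding and no think time; that is an
assumption about the benchmark, not something the CSV states. It also
yields mean latency rather than p95, so it is only a rough
cross-check. The class name LittlesLawCheck and the method
meanLatencyMillis are illustrative.

```java
final class LittlesLawCheck {

    // Little's Law, rearranged: mean latency = requests in flight / throughput.
    // Assumes each concurrent client keeps exactly one request in flight.
    static double meanLatencyMillis(int concurrentClients, double requestsPerSecond) {
        if (requestsPerSecond <= 0) {
            throw new IllegalArgumentException("throughput must be positive");
        }
        return concurrentClients / requestsPerSecond * 1000.0;
    }

    public static void main(String[] args) {
        // Throughput figures are the ones quoted in the text.
        System.out.printf("virtual, 1000 clients: %.0f ms%n", meanLatencyMillis(1000, 4808));
        System.out.printf("virtual, 2000 clients: %.0f ms%n", meanLatencyMillis(2000, 9569));
        System.out.printf("classic, 2000 clients: %.0f ms%n", meanLatencyMillis(2000, 958));
    }
}
```

Under that assumption the implied mean latency is about 208 ms and
209 ms for the virtual-thread runs and about 2,088 ms for the classic
pool at 2,000 clients, which lines up with the reported p95 values.
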

The run at 5,000 concurrent clients is the most revealing one. The
classic pool hits saturation hard: throughput remains under 1,000
requests per second, and more than 15,000 requests fail. Virtual
threads still process more than 263,000 successful requests, but p95
latency rises to 588 ms and a small number of failures appear. That
does not mean virtual threads stopped working. It means the benchmark
has finally found the next bottleneck, most likely in the downstream
path, connection pressure, or timeout boundaries.

The shape of the results matters as much as the raw totals. Classic
threads show a plateau. Virtual threads show continued growth until
the surrounding system becomes the limiting factor. That is the
practical promise of virtual threads in Java 21: blocking code can
scale much further before the architecture has to become more
complex.