k6 Bottleneck Detection: Find Slow Services Under Load (2026)

The most surprising thing about k6’s bottleneck detection is that it doesn’t actually detect bottlenecks itself; it exposes the symptoms of bottlenecks in the services you’re testing, allowing you to pinpoint them.

Let’s see k6 in action. Imagine we have a simple API service running on http://localhost:8080 that processes user requests. We want to see how it behaves under load and identify if any part of the request lifecycle is slowing us down.

Here’s a basic k6 script:

import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 100,
  duration: '1m',
};

export default function () {
  http.get('http://localhost:8080/users/123');
  sleep(1);
}

When we run this with k6 run script.js, k6 will simulate 100 virtual users for one minute, each making a GET request to /users/123 and then sleeping for one second.

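The key to bottleneck detection here lies in k6’s end-of-test summary, specifically the http_req_duration trend and the http_req_failed rate.

http_req_duration shows how request latency is distributed over the run. A wide gap between the median and the upper percentiles is your first clue that some requests are hitting a slow path. The default summary aggregates every request in the test, so to compare endpoints you need to separate them, for example by filtering on the url or name tag.

An illustrative distribution is shown below. By default the summary prints avg, min, med, max, p(90), and p(95); other percentiles can be requested with the summaryTrendStats option.

http_req_duration: min: 12.34ms p(25): 55.67ms p(50): 98.76ms p(75): 150.12ms p(90): 210.45ms p(95): 280.78ms p(99): 400.90ms max: 850.11ms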

http_req_failed tells you whether errors are creeping in under load. By default k6 treats any response with a status outside the 200-399 range as failed. A sudden spike in 5xx responses, for instance, indicates your backend is struggling to keep up, while 429 responses point to rate limiting. The summary reports a single failure rate; a per-status breakdown such as the illustrative one below requires filtering on the status tag.

      status 500: 1.56% of requests
      status 503: 0.88% of requests
      status 429: 0.22% of requests

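Beyond these, k6 breaks each request into timing phases. http_req_duration is the time spent on the request itself: the sum of http_req_sending, http_req_waiting, and http_req_receiving. It does not include DNS lookup or connection setup; connection-related time is reported separately in http_req_blocked, http_req_connecting, and http_req_tls_handshaking. Spikes in http_req_waiting (time to first byte) usually point to the server processing the request, while a high http_req_receiving could indicate large response bodies, limited network bandwidth, or server-side buffering issues.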

The true power comes from correlating these metrics. If http_req_duration is high, http_req_waiting is the dominant component of that duration, and you’re seeing 5xx errors at the same time, you have strong evidence of a server-side processing bottleneck.
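As a rough illustration of that correlation step, the sketch below picks the phase that contributes most to a request’s duration. The field names (sending, waiting, receiving) match the timings object k6 attaches to each HTTP response; the helper name dominantPhase and the sample values are hypothetical.

```javascript
// Phase names follow the fields of k6's Response.timings object (milliseconds).
const PHASES = ['sending', 'waiting', 'receiving'];

// Return the phase that contributes most to the request duration,
// together with its share of the summed phases.
function dominantPhase(timings) {
  const total = PHASES.reduce((sum, p) => sum + timings[p], 0);
  let best = PHASES[0];
  for (const p of PHASES) {
    if (timings[p] > timings[best]) best = p;
  }
  return { phase: best, share: total > 0 ? timings[best] / total : 0 };
}

// Hypothetical sample values, for illustration only.
const sample = { sending: 0.4, waiting: 262.1, receiving: 18.3 };
const result = dominantPhase(sample);
console.log(`${result.phase} dominates (${(result.share * 100).toFixed(1)}% of the request)`);
```

Inside a k6 script, the same helper could be called as dominantPhase(res.timings), where res is the value returned by http.get.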

The levers you control in k6 are your load profile (vus, duration, stages) and the specific endpoints you target. By systematically increasing load and observing where latency and error rates climb first, you can isolate the problematic service or even specific code paths within that service.
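One way to increase load systematically is a stepped profile: ramp to a target, hold it long enough for latency to settle, then step up again. The sketch below generates such a profile in the shape k6 expects for its stages option (objects with duration and target). The helper name buildStepStages and the step sizes are assumptions chosen for illustration.

```javascript
// Build a k6-style `stages` array that raises load in equal steps.
// Each step ramps to the next target, then holds it so latency can settle.
function buildStepStages(stepVus, steps, rampDuration, holdDuration) {
  const stages = [];
  for (let i = 1; i <= steps; i++) {
    stages.push({ duration: rampDuration, target: stepVus * i }); // ramp up
    stages.push({ duration: holdDuration, target: stepVus * i }); // hold
  }
  stages.push({ duration: rampDuration, target: 0 }); // ramp down
  return stages;
}

// Four steps of 50 VUs each: 50, 100, 150, 200, then back to 0.
const stages = buildStepStages(50, 4, '30s', '2m');
console.log(JSON.stringify(stages, null, 2));
// In a k6 script, use it in place of vus/duration:
//   export const options = { stages };
```

Holding each level for a while makes it easier to see at which step latency or error rate first departs from the previous plateau.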

Many users focus solely on the p(99) of http_req_duration. However, it’s often more insightful to look at the percentile distribution of the components of that duration. For example, if p(99) of http_req_waiting is high but p(99) of http_req_receiving is low, the bottleneck is most likely in the server’s application logic or database queries rather than in network bandwidth or response body size. These component durations appear in k6’s end-of-test summary and are also available per request through the response object’s timings field (for example, timings.waiting).
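A minimal configuration sketch for watching the components separately: it adds p(99) to the summary columns and sets thresholds on the waiting and receiving phases. The numeric limits are hypothetical placeholders, not recommendations, and should be tuned to your own service.

```javascript
export const options = {
  vus: 100,
  duration: '1m',
  // Add p(99) to the default end-of-test summary columns.
  summaryTrendStats: ['avg', 'min', 'med', 'max', 'p(90)', 'p(95)', 'p(99)'],
  thresholds: {
    // Limits below are hypothetical; adjust them for your service.
    http_req_waiting: ['p(99)<300'],  // server processing time, in ms
    http_req_receiving: ['p(99)<50'], // response download time, in ms
    http_req_failed: ['rate<0.01'],   // fewer than 1% failed requests
  },
};
```

With separate thresholds, a failing run shows which phase breached its limit instead of only reporting that total latency was too high.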

Understanding how k6’s metrics map to the actual stages of an HTTP request is crucial for effective bottleneck diagnosis.