k6 thresholds are the silent guardians of your application’s performance, ensuring your load tests don’t just run, but mean something.

Let’s see them in action. Imagine you’re testing a critical API endpoint that should respond in under 100ms 95% of the time. Here’s how you’d set that up in k6:

import http from 'k6/http';
import { group, sleep } from 'k6';

export const options = {
  vus: 10,
  duration: '30s',
  thresholds: {
    // Requests made inside group('my_api_tests') carry the tag group=::my_api_tests
    'http_req_duration{group:::my_api_tests}': ['p(95)<100'], // 95th percentile response time for requests in 'my_api_tests' must be < 100ms
    'http_req_failed': ['rate<0.01'], // Error rate must be less than 1%
    'http_req_duration': ['avg<200'], // Average response time across all requests must be < 200ms
  },
};

export default function () {
  // k6 records http_req_duration for every request automatically, so no custom Trend is needed
  group('my_api_tests', function () {
    http.get('https://your-api.com/endpoint');
  });
  sleep(1);
}

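When you run this, k6 records http_req_duration for every request and re-evaluates the thresholds periodically during the run, then once more when the test finishes. If the 95th percentile of response times for requests tagged my_api_tests is not below 100ms, the overall error rate (http_req_failed) is not below 1%, or the overall average response time is not below 200ms, k6 marks the run as failed and exits with a non-zero exit code. By default the test still runs for its full duration; k6 only aborts early when a threshold is declared with abortOnFail: true. Either way, a failed threshold isn’t just a nice-to-have signal; it’s an unambiguous verdict that your service is misbehaving under load.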

The fundamental problem k6 thresholds solve is the ambiguity of raw load test output. A load test can generate gigabytes of data: request counts, response times, error codes, etc. Without thresholds, you’re left sifting through this data manually, trying to infer whether the test was "good" or "bad." Thresholds automate this decision-making process, turning raw metrics into a clear pass/fail indicator based on predefined Service Level Objectives (SLOs).

Internally, k6 collects metric samples throughout test execution and aggregates them according to the metric type. When a threshold is defined, k6 periodically evaluates the current aggregate against the specified condition. For a percentile threshold like p(95)<100, that conceptually means ordering the recorded response times and reading off the value below which 95% of them fall. For a rate threshold like rate<0.01, it means dividing the number of failed requests by the total number of requests. The aggregates cover all samples recorded so far in the run, not just a recent window. If any threshold is still breached when the test ends, k6 flags the test as failed.
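The two evaluations described above can be sketched in plain JavaScript. This is an illustration of the idea only, not k6’s implementation: the percentile and failureRate helpers, the nearest-rank method, and the sample data are all assumptions made for this example.

```javascript
// Illustrative sketch only; k6's internal percentile calculation may differ.

// Nearest-rank percentile: the smallest sample that has at least p% of
// all samples at or below it.
function percentile(samples, p) {
  if (samples.length === 0) return 0;
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Fraction of requests that failed (false means the request failed).
function failureRate(outcomes) {
  if (outcomes.length === 0) return 0;
  const failed = outcomes.filter((ok) => !ok).length;
  return failed / outcomes.length;
}

// Hypothetical samples: ten response times in ms and their success flags.
const durationsMs = [42, 51, 48, 95, 60, 55, 47, 120, 50, 58];
const outcomes = [true, true, true, true, true, true, true, false, true, true];

const p95 = percentile(durationsMs, 95);
const rate = failureRate(outcomes);

// Each threshold expression reduces to a comparison on one aggregate.
console.log(`p(95)=${p95}ms, p(95)<100 passes: ${p95 < 100}`);
console.log(`rate=${rate}, rate<0.01 passes: ${rate < 0.01}`);
```

With so few samples a single slow request dominates the 95th percentile, which is one reason short or low-traffic tests produce noisy threshold results.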

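The levers you control are the metric name, the aggregation method, the comparison operator (<, <=, >, >=, ==, !=), and the target value. Which aggregation methods are available depends on the metric type: avg, min, max, med, and p(N) for trends such as http_req_duration; rate for rates such as http_req_failed; count and rate for counters; value for gauges. You can also scope a threshold to a subset of requests by appending a tag selector in curly braces to the metric name, as the first threshold in the example does. This allows you to define different SLOs for different parts of your application under test. For instance, a high-priority checkout API might have stricter response time thresholds than a less critical user profile endpoint.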
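As a sketch of per-endpoint SLOs, the script below gives a checkout request a stricter latency threshold than a profile request. The endpoint tag name, its two values, the /checkout and /profile paths, and the 100ms and 500ms limits are assumptions chosen for illustration; the tag selector syntax and the tags request parameter are standard k6.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  vus: 10,
  duration: '30s',
  thresholds: {
    // Stricter SLO for the high-priority endpoint (hypothetical limit)
    'http_req_duration{endpoint:checkout}': ['p(95)<100'],
    // Looser SLO for the less critical endpoint (hypothetical limit)
    'http_req_duration{endpoint:profile}': ['p(95)<500'],
    // Unscoped threshold: applies to every request in the test
    http_req_failed: ['rate<0.01'],
  },
};

export default function () {
  // The custom 'endpoint' tag is attached to every metric sample
  // that these requests emit, so the selectors above can match it.
  http.get('https://your-api.com/checkout', { tags: { endpoint: 'checkout' } });
  http.get('https://your-api.com/profile', { tags: { endpoint: 'profile' } });
  sleep(1);
}
```

A custom tag keeps the selector independent of how the script is organized, whereas the built-in group tag changes whenever requests move between groups.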

A common misconception is that thresholds are only checked at the end of a test. In k6, thresholds are evaluated periodically throughout the run as well as at the end. On its own, though, a mid-run breach does not stop the test: by default the run continues for its configured duration, and the final evaluation decides whether it passes or fails. To fail fast, declare the threshold with abortOnFail: true, and k6 will abort the run as soon as that threshold is crossed. This provides much faster feedback on performance regressions.
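To have k6 stop a run as soon as a threshold is crossed, the threshold can be written in its object form. The fragment below is a minimal sketch of the options only and would sit alongside a default function like the one shown earlier; the 10s delay is an arbitrary value chosen for illustration.

```javascript
export const options = {
  vus: 10,
  duration: '30s',
  thresholds: {
    http_req_duration: [
      {
        threshold: 'p(95)<100',
        abortOnFail: true,     // abort the run when this threshold is crossed
        delayAbortEval: '10s', // wait before evaluating, so early samples cannot abort the run
      },
    ],
    // The short string form never aborts; it only affects the final pass/fail result
    http_req_failed: ['rate<0.01'],
  },
};
```

Delaying the evaluation matters because percentiles computed from the first handful of requests are noisy, and a single slow warm-up request could otherwise end the test prematurely.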

The next logical step after defining these core performance thresholds is to explore how to integrate k6 into your CI/CD pipeline for automated performance gatekeeping.
