k6 for SRE: Define SLOs and Enforce Them in Tests (2026)

Defining and enforcing Service Level Objectives (SLOs) in your k6 performance tests is how you translate abstract goals into concrete, actionable checks that keep your services reliable.

Here’s a k6 test that simulates user traffic, checks each response against SLO targets, and logs an error when availability drops below the target:

import http from 'k6/http';
import { check, sleep } from 'k6';

// Define SLO parameters
const TARGET_AVAILABILITY_PERCENT = 99.9;
const TARGET_RESPONSE_TIME_MS = 200;
const WINDOW_SECONDS = 60; // Check against a 1-minute window

export const options = {
    vus: 100,
    duration: '5m',
};

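// Note: each VU runs this script in its own JavaScript runtime, so these
// counters are per-VU and are not shared across the 100 VUs.
let successfulRequests = 0;
let totalRequests = 0;
let windowStartTime = Date.now();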

export default function () {
    const res = http.get('http://your-service.example.com');

    totalRequests++;

    // Check for successful requests (status code 2xx or 3xx)
    if (res.status >= 200 && res.status < 400) {
        successfulRequests++;
    }

    // Check response time against SLO
    const isFastEnough = res.timings.duration < TARGET_RESPONSE_TIME_MS;

    // Perform checks and report any failures
    let SLO_checks = {
        'response time < 200ms': isFastEnough,
        'request is successful': res.status >= 200 && res.status < 400,
    };
    check(res, SLO_checks);

    // Periodically check and enforce SLO
    const now = Date.now();
    if (now - windowStartTime >= WINDOW_SECONDS * 1000) {
        const availability = (successfulRequests / totalRequests) * 100;

        console.log(`Availability in last ${WINDOW_SECONDS}s: ${availability.toFixed(2)}%`);

        // Report a breach of the availability SLO (logging alone does not fail the run)
        if (availability < TARGET_AVAILABILITY_PERCENT) {
            console.error(`AVAILABILITY SLO BREACHED: ${availability.toFixed(2)}% < ${TARGET_AVAILABILITY_PERCENT}%`);
            // fail() (imported from 'k6') aborts only the current iteration.
            // To stop the whole run, import exec from 'k6/execution' and call:
            // exec.test.abort(`Availability SLO breached: ${availability.toFixed(2)}%`);
        }

        // Reset for the next window
        successfulRequests = 0;
        totalRequests = 0;
        windowStartTime = now;
    }

    sleep(1);
}

The most surprising truth about SLOs is that they are not about preventing all errors, but about defining an acceptable rate of errors over a stated measurement window. This shifts the focus from a perfect system to a predictably reliable one.
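To make the idea of an acceptable error rate concrete, here is a minimal sketch in plain JavaScript. The helper name errorBudget is hypothetical (it is not part of k6), and the request counts are assumed values chosen for illustration.

```javascript
// Hypothetical helper (not a k6 API): how many failed requests an
// availability target tolerates for a given number of total requests.
function errorBudget(targetPercent, totalRequests) {
    const allowedFraction = 1 - targetPercent / 100;
    // toFixed(6) trims floating-point noise before flooring to whole requests.
    return Math.floor(Number((totalRequests * allowedFraction).toFixed(6)));
}

// Assumed volumes, for illustration only.
console.log(errorBudget(99.9, 30000)); // 30
console.log(errorBudget(99.9, 60));    // 0
```

A 99.9% target over 30,000 requests leaves room for 30 failures, while a sample of only 60 requests leaves room for none, which is why the size of the measurement window matters as much as the target itself.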

Let’s look at how this test works. The http.get('http://your-service.example.com') line simulates a user fetching a resource. res.timings.duration gives you the time spent sending the request, waiting for the server, and receiving the response; it does not include the initial DNS lookup or connection setup, which k6 times separately. The check(res, SLO_checks) function is k6’s built-in way to assert conditions during a test. A failed check does not stop the iteration or fail the run by itself; k6 records each result in the checks rate metric and reports the pass and fail counts in the end-of-test summary.
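In k6, a failed check is recorded in the checks rate metric but does not by itself fail the run; a threshold on that metric does. A minimal sketch, reusing the placeholder URL from the example and assuming a 99.9% pass rate as the target:

```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
    thresholds: {
        // Fail the run if fewer than 99.9% of all checks pass (assumed target).
        checks: ['rate>0.999'],
    },
};

export default function () {
    const res = http.get('http://your-service.example.com');
    check(res, {
        'request is successful': (r) => r.status >= 200 && r.status < 400,
    });
}
```

The checks metric aggregates every check in the script, so this threshold covers all of them together.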

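The core SLO enforcement happens in the windowed check. We’re tracking successfulRequests and totalRequests over a WINDOW_SECONDS interval. Because each VU runs in its own JavaScript runtime, these counters are per-VU: with sleep(1), one VU makes at most about 60 requests per window, so a single failed request puts that VU’s window below 99.9%. When the interval elapses, we calculate the availability percentage. If it dips below TARGET_AVAILABILITY_PERCENT (e.g., 99.9%), we log an error, which on its own does not fail the run. To stop on a breach, you’d uncomment the call inside that branch, keeping in mind that fail() from the k6 module only aborts the current iteration, while exec.test.abort() from k6/execution ends the whole test. For an SLO evaluated across all VUs, thresholds in the options block are the usual tool.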

Beyond simple availability, you’d typically define SLOs for latency (e.g., 99% of requests served in under 200ms) and throughput. In the example above, we’re checking whether individual requests are below TARGET_RESPONSE_TIME_MS. For a true SLO check on latency, you’d evaluate a percentile across all requests rather than judging requests one at a time. k6’s thresholds do this on built-in metrics such as http_req_duration, and the k6/metrics module lets you define custom Trend, Rate, Counter, and Gauge metrics that thresholds can target. Note that thresholds aggregate over the whole test run, not over sliding windows.
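One way to express both the availability and the latency SLO across all VUs is with k6 thresholds on the built-in http_req_failed and http_req_duration metrics. A minimal sketch, assuming the same targets (99.9% availability, 200 ms) and placeholder URL as the example:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
    vus: 100,
    duration: '5m',
    thresholds: {
        // Availability: fewer than 0.1% of requests may fail (99.9% target).
        http_req_failed: ['rate<0.001'],
        // Latency: 99% of requests must finish in under 200 ms.
        http_req_duration: ['p(99)<200'],
    },
};

export default function () {
    http.get('http://your-service.example.com');
    sleep(1);
}
```

When a threshold is not met, k6 marks it as failed in the summary and exits with a non-zero status, so no manual counting in the script is needed.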

Consider the definition of "successful request." In this example, we’re using res.status >= 200 && res.status < 400. This covers typical success codes (2xx) and redirects (3xx). Note that k6 follows redirects by default, so res.status normally reflects the final response rather than the 3xx itself. Depending on your application, you might have specific error codes that are acceptable within a certain rate, or you might want to treat certain client errors (like 400 Bad Request) differently from server errors (like 500 Internal Server Error). The key is to align these definitions with your business’s understanding of service reliability.
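The definition of success can also be set once for the built-in http_req_failed metric, using http.setResponseCallback with http.expectedStatuses. A minimal sketch; treating 404 as acceptable is purely an assumption for illustration, not a recommendation:

```javascript
import http from 'k6/http';

// Assumption for illustration: this service treats 404 as an acceptable
// outcome, so count it as expected alongside 2xx and 3xx.
http.setResponseCallback(
    http.expectedStatuses({ min: 200, max: 399 }, 404)
);

export const options = {
    thresholds: {
        // Only responses outside the expected statuses count as failures here.
        http_req_failed: ['rate<0.001'],
    },
};

export default function () {
    http.get('http://your-service.example.com');
}
```

Keeping the definition in one place means the threshold and the business definition of "successful" cannot drift apart inside the script.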

The options block defines the virtual users (vus) and the total duration of the test. These are crucial for simulating realistic load and ensuring your SLOs are tested under pressure. A common practice is to run these tests against production or a production-like staging environment with a load that represents your expected peak traffic.

The most impactful, yet often overlooked, aspect of SLO enforcement in k6 is how you handle exceptions to the SLO. Your SLO is a target, not an absolute guarantee. You need a strategy for when an SLO is breached: is it an immediate test failure? Is it a warning that triggers an alert but allows the test to continue for a defined period? In k6, exec.test.abort() or a threshold with abortOnFail provides a hard stop (fail() only ends the current iteration), while logging an error and continuing allows for more nuanced alerting and analysis of transient issues.
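Both strategies can be expressed with thresholds. In this sketch, the availability threshold aborts the run on a breach but only after a grace period, while the latency threshold lets the run finish and fails it at the end. The one-minute delay and the targets are assumed values:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
    vus: 100,
    duration: '5m',
    thresholds: {
        http_req_failed: [
            // Hard stop: abort the run on breach, but wait 1 minute before
            // evaluating so a brief startup blip does not end the test.
            { threshold: 'rate<0.001', abortOnFail: true, delayAbortEval: '1m' },
        ],
        // Soft failure: the run continues and is marked failed at the end.
        http_req_duration: ['p(99)<200'],
    },
};

export default function () {
    http.get('http://your-service.example.com');
    sleep(1);
}
```

Which SLOs deserve a hard stop is a judgment call; a common reason to abort early is to avoid continuing to load a service that is already failing.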

The next logical step after defining and enforcing SLOs in your tests is to integrate these tests into your CI/CD pipeline, failing builds when SLOs are breached, and to set up external monitoring that also tracks your SLOs against real user traffic.