Read and Interpret Gatling HTML Performance Reports (2026)

The Gatling HTML report isn’t just a summary; it’s a diagnostic tool that tells you why your application is slow, not just that it is slow.

Let’s dive into a simulated Gatling run. Imagine you’ve just executed a Gatling test against a simple API endpoint, POST /users, that creates a new user.

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BasicUserSimulation extends Simulation {

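  // Shared HTTP settings applied to every request in this simulation.
  val httpProtocol = http
    .baseUrl("http://localhost:8080")
    .doNotTrackHeader("1") // value of the DNT header; "1" means "do not track"
    .acceptLanguageHeader("en-US,en;q=0.5")
    .contentTypeHeader("application/json")

  // One scenario with a single request; any status other than 201 is counted as KO.
  val scn = scenario("User Creation")
    .exec(http("Create User")
      .post("/users")
      .body(StringBody("""{"name": "Test User", "email": "test@example.com"}"""))
      .check(status.is(201))
    )

  // Start all 100 virtual users at the same instant.
  setUp(scn.inject(atOnceUsers(100))).protocols(httpProtocol)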
}

After running this, Gatling generates an HTML report. Each run gets its own folder, named after the simulation plus a timestamp, inside your Gatling output directory (typically target/gatling/). Open the index.html in that folder in your browser.

The first thing you see is the Global page, which acts as the report’s Dashboard. This gives you a high-level overview: total requests, successful (OK) requests, failed (KO) requests, minimum/maximum/mean response times, percentiles, and throughput. But this is just the tip of the iceberg. The real insights come from drilling down.

Click on the Details tab. Here, you’ll see a breakdown by request name. In our case, it’s "Create User". You’ll see metrics like:

Total # of requests: 100
Mean Response Time: 523 ms
Percentiles (50th, 75th, 95th, 99th): 400 ms, 600 ms, 950 ms, 1500 ms
Throughput: 15 requests/sec
Fails: 5 (which is 5% of total requests)

This immediately tells you that 5% of your user creation requests are failing, and the 99th percentile response time is a full second and a half. This is where you start asking why.
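To make these numbers less abstract, here is a minimal sketch of how a failure rate and nearest-rank percentiles can be computed from raw response times. The ReportMetrics object and the sample values are made up for illustration, and Gatling’s own percentile algorithm may round differently, so treat this as an aid for reading the table rather than a reimplementation of the report.

```scala
object ReportMetrics {

  /** Nearest-rank percentile: the smallest sample with at least p% of all samples at or below it. */
  def percentile(samplesMs: Seq[Int], p: Int): Int = {
    require(samplesMs.nonEmpty, "need at least one sample")
    require(p >= 1 && p <= 100, "p must be between 1 and 100")
    val sorted = samplesMs.sorted
    val rank = (p * sorted.size + 99) / 100 // integer form of ceil(p * n / 100)
    sorted(rank - 1)
  }

  /** Share of failed (KO) requests, in percent. */
  def failureRatePercent(total: Int, failed: Int): Double = {
    require(total > 0, "total must be positive")
    failed * 100.0 / total
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical response times in ms: mostly fast, with a slow tail.
    val samples =
      Seq.fill(50)(300) ++ Seq.fill(25)(500) ++ Seq.fill(20)(900) ++ Seq.fill(5)(1500)
    val mean = samples.sum.toDouble / samples.size
    println(s"mean = $mean ms")
    Seq(50, 75, 95, 99).foreach(p => println(s"p$p = ${percentile(samples, p)} ms"))
    println(s"failure rate = ${failureRatePercent(total = 100, failed = 5)} %")
  }
}
```

A mean well above the median, as in the report above (523 ms versus 400 ms), is the signature of a slow tail rather than of uniformly slow requests.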

Now, click on the "Create User" request itself. This brings you to the Request Details page. Here, you’ll find even more granular data.

The Response Time Percentiles over Time graph is crucial. It plots response time percentiles for successful (OK) requests against simulation time. Look for patterns:

Spikes: Did response times suddenly jump at a specific point in the simulation? This could indicate resource exhaustion (CPU, memory, network) or a garbage collection pause in your application.
Steadily Increasing Trend: If response times are consistently getting worse over the simulation, it suggests a resource leak or a database connection pool filling up.
Plateauing: If response times are high but stable, it might be that your application is simply operating at its current capacity limit.
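If you export per-interval response times (for example, one p95 value per second), the three patterns above can be told apart with a rough heuristic. The sketch below is only that: the TrendSketch object, the spike factor of 3 and the trend factor of 1.5 are arbitrary assumptions chosen for illustration, not anything Gatling computes.

```scala
object TrendSketch {
  sealed trait Pattern
  case object Spike extends Pattern
  case object RisingTrend extends Pattern
  case object Stable extends Pattern // "plateau" when the stable level is high

  /** Classify a series of per-interval response times (ms), oldest first. */
  def classify(seriesMs: Seq[Int], spikeFactor: Double = 3.0, trendFactor: Double = 1.5): Pattern = {
    require(seriesMs.size >= 6, "need at least six intervals")
    val median = seriesMs.sorted.apply(seriesMs.size / 2)
    val third = seriesMs.size / 3
    val headMean = seriesMs.take(third).sum.toDouble / third
    val tailMean = seriesMs.takeRight(third).sum.toDouble / third

    if (seriesMs.exists(_ > spikeFactor * median)) Spike      // isolated outlier far above the median
    else if (tailMean > trendFactor * headMean) RisingTrend   // end of the run clearly slower than the start
    else Stable
  }

  def main(args: Array[String]): Unit = {
    println(classify(Seq(100, 100, 100, 1000, 100, 100, 100, 100, 100)))
    println(classify(Seq(100, 150, 200, 250, 300, 350, 400, 450, 500)))
    println(classify(Seq(800, 810, 790, 805, 795, 800)))
  }
}
```

A Stable result says nothing about whether the level is acceptable; a flat line at 800 ms is the plateau case described above.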

The Response Time Distribution chart shows how many requests fall into each response time bucket. This helps you see the shape of your latency: are most requests fast, with a few extremely slow outliers? Or are most requests moderately slow?
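The same question can be asked of raw numbers by sorting response times into a few coarse buckets. The following is a small sketch with hypothetical sample data; the default bounds of 800 ms and 1200 ms are meant to mirror the ranges Gatling’s indicator chart uses out of the box, so adjust them if your configuration differs.

```scala
object DistributionSketch {
  final case class Buckets(fast: Int, medium: Int, slow: Int)

  /** Count successful response times (ms) below, between and above the two bounds. */
  def bucket(okTimesMs: Seq[Int], lowerBoundMs: Int = 800, higherBoundMs: Int = 1200): Buckets =
    Buckets(
      fast = okTimesMs.count(_ < lowerBoundMs),
      medium = okTimesMs.count(t => t >= lowerBoundMs && t < higherBoundMs),
      slow = okTimesMs.count(_ >= higherBoundMs)
    )

  def main(args: Array[String]): Unit = {
    // Hypothetical response times in ms.
    val times = Seq(120, 340, 410, 650, 799, 800, 950, 1199, 1200, 1500)
    val b = bucket(times)
    println(s"t < 800 ms:            ${b.fast}")
    println(s"800 ms <= t < 1200 ms: ${b.medium}")
    println(s"t >= 1200 ms:          ${b.slow}")
  }
}
```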

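The Errors section on this page is critical. It lists each distinct error message with its count and its share of all failures. Gatling marks a request as KO whenever it fails for any reason, so for our POST /users endpoint the table can mix several kinds of entries:

Failed checks (KO): The server answered, but a check did not pass. With status.is(201), any other status counts, whether it is a 400 for invalid input, a 500 Internal Server Error, or a 503 Service Unavailable. The message names the check and the status actually found.
NO_RESPONSE: Not a literal Gatling label, but shorthand here for requests that got no response at all: the request timed out, or the connection was refused or reset. The report shows the underlying timeout or connection message. This is a critical indicator of a server-side problem.
Malformed response: The response wasn’t valid HTTP.
Exception: A Java exception occurred on the client (Gatling) side, which is rare for basic tests.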

If you see NO_RESPONSE or server-side errors like 500 or 503, the problem is almost certainly not in your Gatling script, but in the application under test.

To debug a 500 error during user creation, you’d look at your application logs (e.g., catalina.out for Tomcat, application.log for Spring Boot). The Gatling report tells you how often the 500s occurred, and its over-time charts show when; your application logs will tell you why. Common causes for 500s in a user creation endpoint:

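Database constraint violation: Trying to insert a user with a duplicate email address when the email column is unique in your database. Note that the simulation above sends the same email, test@example.com, for every virtual user, so a unique constraint would reject every request after the first. Check your database logs or run SELECT count(*) FROM users WHERE email = 'test@example.com'; before and during the test.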
Null pointer exception: A bug in your application code where a variable is unexpectedly null, for example a call to .toString() on a null reference. The stack trace in your application logs will pinpoint this.
External service failure: If user creation depends on another service (e.g., an email verification service) that is down or returning errors. Check the logs of those dependent services.
Resource exhaustion in the application: The application server running out of memory or CPU, leading to unexpected crashes or errors. Monitor your application server’s resource utilization (e.g., using top, htop, or APM tools).
Configuration error: A missing or incorrect database connection string, API key, or other critical configuration. Verify your application’s configuration files.
Concurrency issues: A race condition in your application code that only manifests under load, leading to data corruption or crashes. This is harder to debug and might require more advanced application-level profiling.
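One way to rule out the duplicate-email cause from the load-test side is to give every virtual user its own address. A Gatling feeder is an iterator of key/value records, so a sketch can be written and checked in plain Scala. The UniqueUserFeeder object and the user-<uuid>@example.com format are assumptions for illustration, and the wiring into the simulation is shown only as a comment because it needs Gatling on the classpath.

```scala
import java.util.UUID

object UniqueUserFeeder {

  /** An endless feeder: each record carries a fresh, practically unique email. */
  def feeder: Iterator[Map[String, String]] =
    Iterator.continually(Map("email" -> s"user-${UUID.randomUUID()}@example.com"))

  // Wiring into the simulation (Gatling DSL, not executed by this sketch).
  // The #{email} placeholder syntax assumes Gatling 3.7 or later;
  // older versions use ${email} instead.
  //
  //   val scn = scenario("User Creation")
  //     .feed(UniqueUserFeeder.feeder)
  //     .exec(http("Create User")
  //       .post("/users")
  //       .body(StringBody("""{"name": "Test User", "email": "#{email}"}"""))
  //       .check(status.is(201)))

  def main(args: Array[String]): Unit =
    feeder.take(3).foreach(record => println(record("email")))
}
```

If the failures disappear once emails are unique, the 5% was a test-data problem rather than a capacity problem.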

For NO_RESPONSE errors, the issue is usually that the application server is too busy to even send a response, or it crashed before it could. This points to severe resource contention.

The Number of requests per second graph shows the load Gatling sends over time, and the Number of responses per second graph shows how much of it your system actually answers. If responses per second flatten out while you’re still injecting users, your system has hit its capacity limit. The Active Users graph shows how many virtual users are active at each moment. If active users keep climbing but throughput stays flat, again, capacity is reached.
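The flattening described above is easiest to see in a ramped test, where users are added gradually rather than all at once. Assuming you have paired readings of active users and throughput, a rough sketch for locating the knee could look like this. The SaturationSketch object, the sample readings and the 5% minimum gain are all hypothetical.

```scala
object SaturationSketch {
  final case class Sample(activeUsers: Int, requestsPerSec: Double)

  /**
   * Returns the last sample before throughput stopped keeping up:
   * the first point after which users grew but throughput gained less than minGain.
   */
  def saturationPoint(samples: Seq[Sample], minGain: Double = 0.05): Option[Sample] =
    samples.sliding(2).collectFirst {
      case Seq(prev, next)
          if next.activeUsers > prev.activeUsers &&
            next.requestsPerSec < prev.requestsPerSec * (1 + minGain) =>
        prev
    }

  def main(args: Array[String]): Unit = {
    // Hypothetical readings taken while ramping up users.
    val readings = Seq(
      Sample(10, 50.0),
      Sample(20, 100.0),
      Sample(40, 195.0),
      Sample(80, 200.0),
      Sample(160, 201.0)
    )
    saturationPoint(readings) match {
      case Some(s) => println(s"throughput levels off after ${s.activeUsers} users at about ${s.requestsPerSec} req/s")
      case None    => println("no saturation visible in this range")
    }
  }
}
```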

If your scenario involves multiple steps (e.g., login, create user, fetch user), the statistics table lists every request name on its own row, so you can compare them side by side and see which part of the workflow is the bottleneck. Wrapping related steps in a group adds a combined row for them as well.

The Global Information section at the bottom of the Dashboard shows the simulation duration and the number of users injected. This helps contextualize the throughput numbers.

One thing most people miss is that the response time percentile charts are built from successful (OK) requests only. A request that times out without ever receiving a response is counted as a NO_RESPONSE failure but does not appear in those charts, so frequent timeouts can hide the true extent of unresponsiveness. Compare the OK and KO counts in the statistics table to see the whole picture.
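A small worked example shows how large this masking effect can be. The sketch below uses invented data (90 successes at 400 ms and 10 requests that hit an assumed 60-second timeout) and a simple nearest-rank percentile; the TimeoutMaskingSketch object and its helpers are hypothetical and not part of Gatling.

```scala
object TimeoutMaskingSketch {
  final case class Result(responseTimeMs: Int, ok: Boolean)

  /** Nearest-rank percentile over response times in ms. */
  def percentile(timesMs: Seq[Int], p: Int): Int = {
    require(timesMs.nonEmpty, "need at least one sample")
    require(p >= 1 && p <= 100, "p must be between 1 and 100")
    val sorted = timesMs.sorted
    sorted((p * sorted.size + 99) / 100 - 1)
  }

  /** p99 as seen when only successful requests are charted. */
  def p99OkOnly(results: Seq[Result]): Int =
    percentile(results.filter(_.ok).map(_.responseTimeMs), 99)

  /** p99 when timed-out requests are included at their timeout duration. */
  def p99All(results: Seq[Result]): Int =
    percentile(results.map(_.responseTimeMs), 99)

  // Hypothetical run: 90 successes at 400 ms, 10 requests that waited out a 60 s timeout.
  val sampleRun: Seq[Result] =
    Seq.fill(90)(Result(400, ok = true)) ++ Seq.fill(10)(Result(60000, ok = false))

  def main(args: Array[String]): Unit = {
    println(s"p99 over OK requests only: ${p99OkOnly(sampleRun)} ms")
    println(s"p99 over all requests:     ${p99All(sampleRun)} ms")
  }
}
```

With this data the OK-only view looks healthy while one request in ten never came back, which is why the failure count has to be read alongside the percentiles.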

After fixing the 5% of failed user creation requests and optimizing the code to handle the load, the next thing you’ll likely encounter is a need to test more complex user journeys or introduce more realistic think times.