Gatling’s rampUsers doesn’t ramp up concurrent virtual users; it starts new users on a fixed schedule, and the "ramp" you see in the active-users chart is an emergent property of that schedule and of how long the target system keeps each user busy.

Let’s see what that looks like.

Here’s a Gatling simulation that looks like a linear ramp:

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class LinearRampSimulation extends Simulation {

  val httpProtocol = http
    .baseUrl("http://localhost:8080")
    .doNotTrackHeader("1") // the argument is the DNT header value; "1" means opt out of tracking
    .inferHtmlResources()

  val scn = scenario("LinearRamp")
    .exec(
      http("request_1")
        .get("/")
        .check(status.is(200))
    )

  setUp(
    scn.inject(
      rampUsers(100).during(10.seconds) // 100 user starts spread evenly over 10 seconds
    )
  ).protocols(httpProtocol)
}

When you run this, Gatling does parcel out user starts evenly: rampUsers(100) over 10 seconds schedules 100 new users across the window, roughly one every 100 ms, whether or not the target is keeping up. What it does not control is how many of those users are active at the same moment. That depends on how long each one takes to finish. If your system is lightning fast, each user completes before the next few arrive, and the active-users chart stays nearly flat. If your system is slow, users pile up faster than they finish, and concurrency climbs toward 100. The only thing that can slow the arrivals themselves is the injector: an overloaded load-generator machine may start users later than scheduled.
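
To get a feel for how response time shapes the concurrency curve, a rough estimate from Little’s law (average concurrency ≈ arrival rate × average time each user stays active) can help. The sketch below is plain Scala, not Gatling API. The object name, the helper, and the two session durations are illustrative assumptions, and the formula describes a steady state that a 10-second ramp only approximates.

```scala
// Rough estimate only: Little's law, L = lambda * W.
// ConcurrencyEstimate and expectedConcurrency are hypothetical names, not part of Gatling.
object ConcurrencyEstimate {

  // usersPerSec: arrival rate; sessionSeconds: average time one user stays active
  def expectedConcurrency(usersPerSec: Double, sessionSeconds: Double): Double =
    usersPerSec * sessionSeconds

  def main(args: Array[String]): Unit = {
    // 100 user starts over 10 seconds is an average of 10 starts per second
    val arrivalRate = 100.0 / 10.0

    // Assumed session durations, chosen purely for illustration
    val fastTarget = expectedConcurrency(arrivalRate, 0.05) // 50 ms per session
    val slowTarget = expectedConcurrency(arrivalRate, 2.0)  // 2 s per session

    println(f"fast target: about $fastTarget%.1f users active at once")
    println(f"slow target: about $slowTarget%.1f users active at once")
  }
}
```

Under these assumed numbers the same injection profile yields well under one concurrent user against the fast target and around 20 against the slow one, which is why the active-users chart can look so different between runs.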

The problem Gatling’s rampUsers solves is defining an arrival pattern: how many users start, and over what duration. It does not define a target number of concurrent users. Concurrency is a consequence of the arrival rate and the system’s response times. If you want to drive concurrency directly, Gatling’s closed-model steps, rampConcurrentUsers and constantConcurrentUsers, are the tools for that.
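
For comparison, here is a minimal sketch of a closed-model profile, where the injection steps describe concurrency rather than arrivals. The class name, scenario name, and the 30-second hold are assumptions made for illustration; the base URL is the same placeholder used above.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Hypothetical simulation: holds a target number of concurrent users.
class ConcurrencyRampSimulation extends Simulation {

  val httpProtocol = http.baseUrl("http://localhost:8080")

  val scn = scenario("ConcurrencyRamp")
    .exec(http("request_1").get("/").check(status.is(200)))

  setUp(
    scn.inject(
      // Closed model: Gatling starts a new user whenever one finishes,
      // so the number of active users follows the requested curve.
      rampConcurrentUsers(0).to(100).during(10.seconds),
      constantConcurrentUsers(100).during(30.seconds) // assumed hold time
    )
  ).protocols(httpProtocol)
}
```

With this kind of profile the concurrency is what you specify, and the arrival rate becomes the emergent quantity instead. Open-model and closed-model steps are not meant to be mixed in the same injection profile.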

Let’s look at the other ramp types and how they operate under the hood:

Staircase Ramp

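A staircase ramp is essentially a series of small, discrete steps. There is no dedicated staircase step in this example; the profile is built by chaining injection steps, which Gatling runs one after another. The example alternates atOnceUsers bursts with short rampUsers windows. (A pause between steps would be written with nothingFor.)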

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class StaircaseRampSimulation extends Simulation {

  val httpProtocol = http
    .baseUrl("http://localhost:8080")
    .doNotTrackHeader("1") // the argument is the DNT header value; "1" means opt out of tracking
    .inferHtmlResources()

  val scn = scenario("StaircaseRamp")
    .exec(
      http("request_1")
        .get("/")
        .check(status.is(200))
    )

  setUp(
    scn.inject(
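      atOnceUsers(10),                  // 10 users start immediately
      rampUsers(10).during(5.seconds),  // 10 more, spread over 5 seconds
      atOnceUsers(10),                  // another instant step of 10
      rampUsers(10).during(5.seconds)   // final 10, spread over 5 seconds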
    )
  ).protocols(httpProtocol)
}

In this example, Gatling first starts 10 users immediately. It then starts another 10, spread evenly over 5 seconds. After that, it starts 10 more at once, followed by another 5-second ramp, for 40 users in total over roughly 10 seconds. The key is that rampUsers(N) during D means "add N users over duration D." Gatling doesn’t look at the total user count; each injection step only describes its own increment, and the steps run in sequence. The arrival schedule itself does not bend to the target’s response times. What changes is the shape you observe: if the system is struggling, users from earlier steps are still active when later ones arrive, so the "steps" in the active-users chart blur together into a steady climb.
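
If you want visibly separated steps, one option is to put idle gaps between bursts with nothingFor. The following is a minimal sketch of that idea; the class name, scenario name, step sizes, and gap lengths are all illustrative assumptions.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Hypothetical variant of a staircase: instant bursts separated by idle gaps.
class SteppedBurstSimulation extends Simulation {

  val httpProtocol = http.baseUrl("http://localhost:8080")

  val scn = scenario("SteppedBurst")
    .exec(http("request_1").get("/").check(status.is(200)))

  setUp(
    scn.inject(
      atOnceUsers(10),       // step 1: 10 users start immediately
      nothingFor(5.seconds), // no new users for 5 seconds
      atOnceUsers(10),       // step 2
      nothingFor(5.seconds),
      atOnceUsers(10)        // step 3
    )
  ).protocols(httpProtocol)
}
```

Whether the steps stay distinct in the report still depends on the target: if each user takes longer than the gap to finish, consecutive steps overlap.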

Burst Ramp

A burst ramp is about injecting a large number of users very quickly, often to see how a system handles sudden spikes.

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class BurstRampSimulation extends Simulation {

  val httpProtocol = http
    .baseUrl("http://localhost:8080")
    .doNotTrackHeader("1") // the argument is the DNT header value; "1" means opt out of tracking
    .inferHtmlResources()

  val scn = scenario("BurstRamp")
    .exec(
      http("request_1")
        .get("/")
        .check(status.is(200))
    )

  setUp(
    scn.inject(
      rampUsers(100).during(1.second) // This is effectively a burst
    )
  ).protocols(httpProtocol)
}

When you use rampUsers(100) with a 1-second window, Gatling schedules 100 user starts across that second: an average of 100 users/sec, or about one every 10 ms. That makes it a very steep ramp rather than a truly simultaneous burst; atOnceUsers(100) is the step that starts everyone at the same instant. The injector does have limits: if the load-generator machine is maxed out on CPU, user starts can fall behind schedule. A slow target, on the other hand, does not slow the arrivals. It lengthens each user’s session, so more of the 100 users overlap. Injector permitting, you get 100 user starts within that second; how many of them are active at once depends on the system’s response time.
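
For a burst where every user starts at the same instant, a sketch along these lines may be closer to what you want. The class name, scenario name, and the 5-second quiet period are assumptions for illustration.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Hypothetical simulation: a single instantaneous spike.
class InstantBurstSimulation extends Simulation {

  val httpProtocol = http.baseUrl("http://localhost:8080")

  val scn = scenario("InstantBurst")
    .exec(http("request_1").get("/").check(status.is(200)))

  setUp(
    scn.inject(
      nothingFor(5.seconds), // optional quiet period before the spike (assumed value)
      atOnceUsers(100)       // all 100 users are started together
    )
  ).protocols(httpProtocol)
}
```

The difference from the 1-second ramp is small on paper, but it removes the spacing between user starts, which can matter when you are probing connection limits or queue depth.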

The "ramp" you observe in Gatling’s active-users chart reflects the arrival schedule combined with the system’s response times, not a concurrency target that Gatling enforces. With open-model steps such as rampUsers, the inject DSL defines when users start (e.g., "start 100 users over 10 seconds"), and the run reveals how many are active at once. This is why rampUsers(1000) over 60 seconds may show only a handful of concurrent users against a fast system, while the same profile against a slow system can show concurrency climbing far higher.

The most common misunderstanding is reading rampUsers(100) as "reach 100 concurrent users." It actually means "start 100 users," and each one leaves as soon as its scenario finishes. If your Gatling injector machine becomes the bottleneck, user starts and requests get delayed, so the load you actually generate can fall short of what your setUp configuration specifies.
