Design Gatling Injection Profiles for Realistic Load Patterns (2026)

Gatling’s injection profiles are the secret sauce to simulating realistic user behavior, not just brute force.

Let’s see it in action. Imagine we’re testing an e-commerce site. We don’t want to just slam it with 1000 users all at once. Real users trickle in, ramp up during peak hours, and then taper off.

import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

class ECommerceSimulation extends Simulation {

  val httpProtocol = http
    .baseUrl("http://localhost:8080") // Your application's base URL
    .inferHtmlResources()
    .acceptHeader("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
    .acceptEncodingHeader("gzip, deflate")
    .acceptLanguageHeader("en-US,en;q=0.5")
    .userAgentHeader("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:100.0) Gecko/20100101 Firefox/100.0")

  val scn = scenario("E-Commerce Browse")
    .exec(http("Homepage")
      .get("/"))
    .pause(2.seconds, 5.seconds) // Simulate thinking time (random pause between 2 and 5 seconds)
    .exec(http("Search Products")
      .get("/products?q=gadgets"))
    .pause(1.second, 3.seconds)
    .exec(http("View Product Detail")
      .get("/products/123"))
    .pause(3.seconds, 7.seconds)
    .exec(http("Add to Cart")
      .post("/cart/add/123"))

  // Realistic Injection Profile
  setUp(
    scn.inject(
      rampUsers(100).during(30.seconds), // Inject 100 users in total, spread evenly over 30 seconds
      constantUsersPerSec(50).during(60.seconds), // Maintain a steady arrival rate of 50 new users/sec for 60 seconds
      rampUsersPerSec(10).to(50).during(45.seconds), // Gradually increase the arrival rate from 10 to 50 users/sec over 45 seconds
      nothingFor(15.seconds), // A quiet period: no new users are injected
      atOnceUsers(200) // A sudden spike: 200 users start simultaneously
    ).protocols(httpProtocol)
  ).maxDuration(180.seconds) // Hard cap on total run time; the injection steps above span 150 seconds
}

This setUp block defines the heart of our load pattern. We’re not just throwing users at the system; we’re orchestrating how they arrive and behave.
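To get a feel for how much traffic this profile actually generates, it helps to add up the users each step injects. The sketch below is plain Scala and does not use Gatling; the `Step` hierarchy and its case classes are hypothetical names made up for this illustration, and the arithmetic assumes a linear rate ramp, so treat the result as an approximation of what Gatling will schedule.

```scala
import scala.concurrent.duration._

object InjectionTally {
  // Hypothetical model of injection steps for back-of-envelope math.
  // These are NOT Gatling classes.
  sealed trait Step {
    def duration: FiniteDuration
    def users: Double
  }
  final case class Ramp(total: Int, duration: FiniteDuration) extends Step {
    def users: Double = total.toDouble // rampUsers(n): n users in total
  }
  final case class ConstantRate(rate: Double, duration: FiniteDuration) extends Step {
    def users: Double = rate * duration.toSeconds
  }
  final case class RampRate(fromRate: Double, toRate: Double, duration: FiniteDuration) extends Step {
    // Linear ramp: users = average rate times duration.
    def users: Double = (fromRate + toRate) / 2 * duration.toSeconds
  }
  final case class Quiet(duration: FiniteDuration) extends Step {
    def users: Double = 0.0
  }
  final case class AtOnce(total: Int) extends Step {
    def duration: FiniteDuration = Duration.Zero
    def users: Double = total.toDouble
  }

  // Mirrors the inject(...) sequence in the simulation above.
  val profile: List[Step] = List(
    Ramp(100, 30.seconds),
    ConstantRate(50, 60.seconds),
    RampRate(10, 50, 45.seconds),
    Quiet(15.seconds),
    AtOnce(200)
  )

  def totalUsers(steps: List[Step]): Double = steps.map(_.users).sum

  def injectionWindow(steps: List[Step]): FiniteDuration =
    steps.map(_.duration).foldLeft(Duration.Zero: FiniteDuration)(_ + _)

  def main(args: Array[String]): Unit = {
    println(s"Total users injected: ${totalUsers(profile)}")
    println(s"Injection window: ${injectionWindow(profile).toSeconds} s")
  }
}
```

Under these assumptions the profile starts roughly 4,650 virtual users over a 150-second injection window (100 + 3,000 + 1,350 + 0 + 200), which is a useful sanity check against the 180-second maxDuration cap.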

The problem this solves is the "all at once" or "perfectly steady state" load test, which rarely reflects real-world traffic. Real systems experience:

Ramp-up: Users gradually arriving as a service becomes popular or a marketing campaign kicks off.
Peak Loads: Sustained periods of high activity.
Spikes: Sudden, short bursts of traffic (e.g., flash sales, breaking news).
Tapering: Traffic naturally decreasing.
Quiet Periods: Lulls in activity.

Gatling’s setUp block, specifically the inject method, allows us to model these with precision. Let’s break down the common injection profiles:

atOnceUsers(n): The simplest. n users start immediately. Good for simulating a sudden surge, like a website launch or a flash sale announcement.
rampUsers(n) during (duration): Injects n users in total, spread evenly over the specified duration. This is your basic "getting started" phase. Note that n is the number of users started, not a concurrency target: how many are active at once depends on how long each user's session lasts.
rampUsersPerSec(rate1) to rate2 during (duration): This is incredibly powerful. It starts with rate1 users arriving per second and ramps linearly to rate2 users per second over the duration. This is excellent for simulating growth in the arrival rate, such as traffic building toward a peak.
constantUsersPerSec(rate) during (duration): Maintains a steady arrival rate of rate users per second for the specified duration. Perfect for simulating sustained peak loads.
nothingFor(duration): Introduces a pause in the simulation, meaning no new users are injected during this time. Useful for observing system behavior under low load or between distinct traffic patterns.

Combining these, as shown in the example, lets you craft complex, multi-stage load profiles. You can model a morning trickle, a midday peak, an afternoon spike, and an evening slowdown all within a single simulation run. The setUp block defines a sequence of these injection steps. Gatling executes them in order, respecting the durations specified for each.
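As a rough illustration of such a day-shaped profile, here is a minimal sketch. The class name, scenario, rates, and durations are all illustrative assumptions rather than recommendations, and the evening taper assumes rampUsersPerSec accepts a target rate lower than its starting rate, which is worth confirming against the Gatling version you run.

```scala
import io.gatling.core.Predef._
import io.gatling.http.Predef._
import scala.concurrent.duration._

// Sketch only: names, URL, rates, and durations are illustrative assumptions.
class DayShapedSimulation extends Simulation {

  val httpProtocol = http.baseUrl("http://localhost:8080")

  val scn = scenario("Day-Shaped Browse")
    .exec(http("Homepage").get("/"))
    .pause(2.seconds, 5.seconds) // think time

  setUp(
    scn.inject(
      rampUsersPerSec(1).to(5).during(2.minutes),   // morning trickle
      rampUsersPerSec(5).to(20).during(1.minute),   // build-up toward midday
      constantUsersPerSec(20).during(5.minutes),    // midday peak
      atOnceUsers(100),                             // afternoon spike
      rampUsersPerSec(20).to(2).during(3.minutes)   // evening slowdown (descending rate)
    ).protocols(httpProtocol)
  )
}
```

Compressing a whole day into about eleven minutes is a deliberate simplification here; in practice you would scale the durations to whatever window your test environment can sustain.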

What makes rampUsersPerSec and constantUsersPerSec so useful is how granular you can get without needing complex scripting. You’re not just saying "X users," you’re saying "X users arriving per second." This distinction is critical because it models the rate at which new virtual users start the scenario, not the number of simulated users active at any given moment; concurrency then follows from the arrival rate and how long each user's session lasts. This is why constantUsersPerSec(100) during (10 seconds) generates far more load than rampUsers(100) during (10 seconds): the first starts 100 new users every second, about 1,000 in total, while the second starts 100 users in total, roughly 10 per second.
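The link between arrival rate and concurrency can be estimated with Little's law: average concurrent users is roughly the arrival rate multiplied by the average session length. The sketch below is plain Scala, not Gatling; the object and function names are hypothetical, the think-time figure takes the midpoints of the three pauses in the browse scenario, and the 0.5-second total response time is an assumed placeholder you would replace with a measured value.

```scala
object ConcurrencyEstimate {
  // Midpoints of the scenario's pauses: (2+5)/2, (1+3)/2, (3+7)/2 seconds.
  val meanThinkTimeSec: Double = 3.5 + 2.0 + 5.0

  // ASSUMPTION: total response time per session; measure this on your own system.
  val assumedResponseTimeSec: Double = 0.5

  // Little's law: average concurrency = arrival rate x average time in system.
  def expectedConcurrentUsers(arrivalsPerSec: Double, sessionSec: Double): Double =
    arrivalsPerSec * sessionSec

  def main(args: Array[String]): Unit = {
    val sessionSec = meanThinkTimeSec + assumedResponseTimeSec

    // constantUsersPerSec(50): 50 new users every second
    println(expectedConcurrentUsers(50, sessionSec))

    // rampUsers(100) during 30 seconds: about 3.3 new users per second
    println(expectedConcurrentUsers(100.0 / 30, sessionSec))
  }
}
```

With an 11-second session, a steady 50 users per second settles at around 550 concurrent users, while rampUsers(100) over 30 seconds averages closer to 37. The estimate only holds once the arrival rate has been steady for longer than one session, and only while the system keeps up; if response times grow under load, sessions lengthen and concurrency climbs with them.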

Once you’ve mastered these core injection profiles, you’ll want to explore how to combine them with Gatling’s pause statements to simulate realistic think times between user actions.