Locust’s gevent integration is what lets a single process simulate thousands of concurrent users. Instead of dedicating an OS thread or process to each user, Locust runs every user in a lightweight greenlet.

Here’s a Locust test running with gevent, simulating 10,000 users hitting an endpoint:

from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(0.1, 0.5)
    host = "http://localhost:8080"
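    # No gevent-specific setting is needed here: Locust runs each simulated
    # user in its own greenlet, so concurrency follows the user count you start.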

    @task
    def index(self):
        self.client.get("/index.html")

    @task
    def about(self):
        self.client.get("/about.html")

# To run this:
# 1. Install Locust (gevent is pulled in as a dependency):
#    pip install locust
# 2. Start a simple web server on http://localhost:8080 that serves index.html and about.html.
#    For example, using Python's http.server:
#    cd /path/to/your/html/files
#    python -m http.server 8080
# 3. Run locust from your terminal in the same directory as this script:
#    locust -f your_script_name.py
# 4. Open your browser to http://localhost:8089, set the number of users to 10000, and start swarming.

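Locust’s web UI will report the full 10,000 users running, and the load generator will be busy but typically far less strained than it would be with 10,000 OS threads. Keep in mind that a single Locust process runs on one CPU core; if that core saturates, Locust’s distributed mode (a master plus worker processes) spreads the users across cores or machines.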

The magic here is cooperative multitasking via gevent. Instead of relying on the operating system to preemptively switch between threads, gevent uses coroutines. Think of coroutines as lightweight, user-level threads that voluntarily yield control back to the gevent event loop. When one coroutine is waiting for an I/O operation (like an HTTP request to complete), it yields, and the event loop can immediately switch to another coroutine that’s ready to run. This is incredibly efficient for I/O-bound tasks like making many HTTP requests, as the CPU isn’t sitting idle waiting for network responses.
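The yield-and-resume idea can be sketched without gevent at all. The toy scheduler below is an illustration only, not gevent’s implementation: `fake_user` and `run_event_loop` are hypothetical names, plain generators stand in for greenlets, and a bare `yield` stands in for "waiting on the network".

```python
from collections import deque


def fake_user(name, requests, log):
    """A generator standing in for one simulated user (hypothetical helper)."""
    for i in range(requests):
        log.append(f"{name}: send request {i}")
        yield  # pretend to wait on the network: hand control back to the loop
        log.append(f"{name}: got response {i}")


def run_event_loop(tasks):
    """Toy round-robin loop: resume each task until it yields or finishes."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # run the task up to its next yield
        except StopIteration:
            continue            # task finished, do not reschedule it
        ready.append(task)      # task yielded, put it back in the queue


log = []
run_event_loop([fake_user("user-a", 2, log), fake_user("user-b", 2, log)])
for line in log:
    print(line)
```

Both users send their first request before either gets a response, which is the interleaving the paragraph above describes: whenever one task is waiting, the loop runs another one.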

Locust relies on gevent’s monkey patching: importing locust calls gevent.monkey.patch_all(), which replaces blocking parts of standard-library modules such as socket, ssl, time, and threading with gevent-aware versions. When you make a request using self.client.get(), the underlying socket operations go through those patched modules. If a call would block (e.g., waiting for a TCP connection or a response), gevent switches to another user’s greenlet (gevent’s term for its lightweight coroutines). Each simulated user runs in its own greenlet, so the number of greenlets follows the number of users you spawn. HttpUser has no documented pool_size setting for this, and defining such an attribute on the class does not change concurrency.
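To see what patching does in isolation, here is a minimal sketch that assumes gevent is installed. It patches only the socket module to keep the demo small (Locust itself patches far more), and `patch_and_check` is a hypothetical helper name. gevent’s documentation recommends patching as early as possible in a real program, which is why Locust does it at import time.

```python
import socket

from gevent import monkey


def patch_and_check():
    """Patch the socket module and report whether gevent now owns it."""
    before = monkey.is_module_patched("socket")
    monkey.patch_socket()  # swap blocking socket calls for cooperative ones
    after = monkey.is_module_patched("socket")
    # After patching, socket.socket is gevent's cooperative socket class.
    return before, after, socket.socket.__module__


if __name__ == "__main__":
    before, after, module_name = patch_and_check()
    print("patched before:", before)
    print("patched after:", after)
    print("socket.socket now comes from:", module_name)
```

Code that opens sockets after this point, including libraries that use the standard socket module internally, yields to other greenlets instead of blocking the whole process.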

The core problem gevent solves for Locust is scaling I/O-bound concurrency. Traditional threading in Python has overhead (memory per thread, kernel context switches), and the Global Interpreter Lock (GIL) prevents multiple threads from executing Python bytecode simultaneously. Gevent does not bypass the GIL. Instead, all greenlets in a worker run inside a single OS thread, so there is no lock contention between them and switching happens cooperatively in user space. The trade-off is that one Locust process can use at most one CPU core. Within that limit, a single worker process can manage thousands of concurrent HTTP requests with relatively low resource consumption compared to a multi-threaded approach.
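A rough way to see the scaling claim is to spawn many greenlets that each wait, then compare the wall-clock time with the sum of the waits. This is a sketch assuming gevent is installed; `simulated_request` and `run_many` are hypothetical names, and `gevent.sleep` stands in for waiting on a network response.

```python
import time

import gevent


def simulated_request(results, delay):
    """Stand-in for one HTTP request: wait cooperatively, then record a result."""
    gevent.sleep(delay)  # yields to the event loop instead of blocking the thread
    results.append(1)


def run_many(count=10_000, delay=0.1):
    """Spawn `count` greenlets in one OS thread and time how long they take."""
    results = []
    started = time.perf_counter()
    greenlets = [gevent.spawn(simulated_request, results, delay) for _ in range(count)]
    gevent.joinall(greenlets)  # wait for every greenlet to finish
    elapsed = time.perf_counter() - started
    return len(results), elapsed


if __name__ == "__main__":
    completed, elapsed = run_many()
    print(f"{completed} simulated requests finished in {elapsed:.2f}s")
```

Run sequentially, 10,000 waits of 0.1 s would take about 1,000 s. Because the waits overlap, the total should land much closer to a single wait plus scheduling overhead, though the exact figure depends on the machine.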

The exact levers you control are primarily:

  • Number of users: This is the most direct knob for gevent concurrency, because every simulated user gets its own greenlet. You set it in the web UI or with -u/--users on the command line, and in distributed mode the users are divided among the workers.
  • wait_time: While not directly gevent-related, this controls the rate at which users initiate new requests, influencing the overall load and how often gevent needs to switch contexts.
  • host and client: These define the target system and how requests are made. Gevent’s monkey-patching ensures that the network operations performed by self.client are cooperative.

Here’s the one detail that trips people up: gevent’s cooperative nature means that if a coroutine performs a CPU-bound task without yielding, it can block the entire event loop for that worker process. This is why Locust tests should primarily focus on I/O-bound operations. If you need to do heavy computation within a user task, you’d typically offload it to a separate process or use a library designed for asynchronous CPU-bound work.
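The starvation effect is easy to reproduce. In this sketch (assuming gevent is installed; `busy`, `ticker`, and `run` are hypothetical names), a CPU-bound greenlet runs next to a small "ticker" greenlet, first without yielding and then with an explicit `gevent.sleep(0)` every 1,000 iterations.

```python
import gevent


def busy(iterations, cooperative, log):
    """CPU-bound loop; only yields to other greenlets if `cooperative` is set."""
    total = 0
    for i in range(iterations):
        total += i * i
        if cooperative and i % 1000 == 0:
            gevent.sleep(0)  # explicit yield so other greenlets get a turn
    log.append("busy done")


def ticker(ticks, log):
    """Lightweight greenlet that wants to run regularly."""
    for _ in range(ticks):
        log.append("tick")
        gevent.sleep(0)


def run(cooperative):
    log = []
    gevent.joinall([
        gevent.spawn(busy, 50_000, cooperative, log),
        gevent.spawn(ticker, 3, log),
    ])
    return log


if __name__ == "__main__":
    print("without yielding:", run(cooperative=False))
    print("with yielding:   ", run(cooperative=True))
```

Without yielding, the ticker cannot log anything until the busy loop has finished. With the periodic `gevent.sleep(0)`, the ticks appear while the computation is still in progress. Yielding this way keeps other users responsive but does not make the computation itself any faster.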

The next step is understanding how gevent’s patching interacts with other asynchronous libraries or custom I/O that might not be automatically patched.
