Locust Distributed Mode: Scale Load with Master-Worker (2026)

Locust distributed mode lets you run load tests across multiple machines, which is crucial when your single-machine capacity isn’t enough.

Let’s see it in action. Imagine you have a web service at http://localhost:8080 that you want to hammer with 10,000 users.

First, you need to install Locust on all machines you’ll use:

pip install locust

On your "master" machine (the one that orchestrates everything), you’ll start Locust in master mode. You’ll need a locustfile.py that defines your load test. Here’s a simple one:

# locustfile.py
from locust import HttpUser, task, between

class WebsiteUser(HttpUser):
    wait_time = between(1, 5)  # Wait 1-5 seconds between tasks

    @task
    def index(self):
        self.client.get("/")

    @task
    def about(self):
        self.client.get("/about")

Now, start the master:

locust -f locustfile.py --master

This will start a web UI on port 8089. You can access it via http://localhost:8089.

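Next, on your "worker" machines (the ones that will actually generate load), you start Locust in worker mode, pointing it to your master. The command below uses localhost, which only works when the worker runs on the same machine as the master; on a separate machine, pass the master’s address instead: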

locust -f locustfile.py --worker --master-host localhost

You can start as many workers as you need. Each worker will connect to the master and wait for instructions.

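Back on the master’s web UI (http://localhost:8089), you’ll see the "Number of users to simulate" and "Spawn rate" fields, plus a "Host" field for the target (the exact labels differ slightly between Locust versions). Because the locustfile above doesn’t set a host, the target has to be entered here. For our 10,000 user goal, you’d enter:

Number of users to simulate: 10000
Spawn rate: 100 (This means 100 users will be spawned per second)
Host: http://localhost:8080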

Then click "Start Swarming".
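As a quick sanity check on those two numbers, the ramp-up time is simply the user count divided by the spawn rate. The helper below is a hypothetical illustration (it is not part of Locust) and assumes the spawn rate stays constant for the whole ramp-up:

```python
import math


def ramp_up_seconds(total_users: int, spawn_rate: float) -> int:
    """Seconds needed to reach total_users at spawn_rate users per second.

    Hypothetical helper for back-of-the-envelope planning only.
    """
    if total_users < 0 or spawn_rate <= 0:
        raise ValueError("total_users must be >= 0 and spawn_rate must be > 0")
    # Round up: a partial second still has to elapse before the last users exist.
    return math.ceil(total_users / spawn_rate)


print(ramp_up_seconds(10_000, 100))  # 100 seconds to reach 10,000 users
```

So with the values above, expect roughly 100 seconds before the test reaches its full 10,000 users.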

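The master will then instruct all connected workers to start spawning users and sending requests to your target, http://localhost:8080. Keep in mind that each worker resolves the target address itself, so localhost only reaches the service when the worker runs on the same machine as it; for workers on other machines, use an address they can reach. The results from all workers are aggregated and displayed in the master’s web UI.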

The core problem Locust distributed mode solves is overcoming the CPU and memory limitations of a single process and, beyond that, a single machine. A Locust process is ordinary Python, so it can only make full use of one CPU core; once that core, the machine’s memory, or its network stack is saturated, adding more simulated users just distorts the results. By running one worker process per core and spreading workers across multiple machines, you can simulate tens of thousands, hundreds of thousands, or even millions of concurrent users, provided you have enough machines and network bandwidth.

Each worker process is responsible for a subset of the total simulated users. The master process aggregates statistics from all workers and presents a unified view in its web UI. It also coordinates starting and stopping the swarm.
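To make the aggregation step concrete, here is a minimal sketch of how per-worker summaries could be merged into one view. The WorkerReport class and its fields are hypothetical and are not Locust’s internal message format; the point is only that a combined average has to be weighted by request count and cannot be computed by averaging the workers’ averages.

```python
from dataclasses import dataclass


@dataclass
class WorkerReport:
    """Hypothetical per-worker summary (not Locust's real wire format)."""
    num_requests: int
    num_failures: int
    total_response_time_ms: float


def aggregate(reports):
    """Merge worker summaries into a single combined summary."""
    total_requests = sum(r.num_requests for r in reports)
    total_failures = sum(r.num_failures for r in reports)
    total_time_ms = sum(r.total_response_time_ms for r in reports)
    # Divide summed time by summed requests so busy workers count for more.
    average_ms = total_time_ms / total_requests if total_requests else 0.0
    return {
        "num_requests": total_requests,
        "num_failures": total_failures,
        "avg_response_time_ms": average_ms,
    }


reports = [
    # Averages 120 ms over 1,000 requests.
    WorkerReport(num_requests=1000, num_failures=10, total_response_time_ms=120_000.0),
    # Averages 80 ms over 3,000 requests.
    WorkerReport(num_requests=3000, num_failures=0, total_response_time_ms=240_000.0),
]
summary = aggregate(reports)
print(summary["avg_response_time_ms"])  # 90.0, not the naive (120 + 80) / 2 = 100
```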

The --master-host flag is how workers find the master. If your master is running on a different IP address, say 192.168.1.100, you’d use --master-host 192.168.1.100. Workers connect to the master on port 5557 by default, so that port must be reachable from the worker machines. The master process itself doesn’t simulate any users; its sole job is to coordinate and collect data. The workers are the ones actually executing the locustfile.py and making HTTP requests.
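If you script worker startup, the command line can be assembled programmatically. The sketch below only builds and prints the commands; it does not launch anything. It uses the flags shown above (-f, --worker, --master-host); the helper names are hypothetical, and defaulting to one worker per CPU core is an assumption based on the common recommendation for Locust, not a requirement.

```python
import os
import shlex


def worker_command(master_host: str, locustfile: str = "locustfile.py") -> list[str]:
    """Build the argv for one Locust worker process (hypothetical helper)."""
    return ["locust", "-f", locustfile, "--worker", "--master-host", master_host]


def worker_commands_for_machine(master_host: str, worker_count=None) -> list[list[str]]:
    """Build one worker command per core unless worker_count is given."""
    # Assumption: one worker per CPU core; os.cpu_count() may return None.
    count = worker_count if worker_count is not None else (os.cpu_count() or 1)
    return [worker_command(master_host) for _ in range(count)]


# Print the commands you would run on a two-core worker machine.
for argv in worker_commands_for_machine("192.168.1.100", worker_count=2):
    print(shlex.join(argv))
```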

When you’re running a large distributed test, make sure all workers have connected before you start. The master’s web UI shows how many workers are currently connected, and the users you request are spread over whichever workers are connected at that point. If you start 10 workers and set a target of 10,000 users, each worker ends up with roughly 1,000 users (you never configure this per worker; it follows from the total target and the number of workers), and the user count in the UI climbs toward the target as the workers spawn their allocated users at the configured spawn rate.
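One way to avoid starting before every worker has connected is to run the master without the web UI and tell it how many workers to wait for. The sketch below assembles such a command using Locust’s --headless, --expect-workers, --users, --spawn-rate and --host options; treat the exact behavior as something to confirm against the documentation for your Locust version. The helper name and the example values are hypothetical.

```python
import shlex


def headless_master_command(
    users: int,
    spawn_rate: int,
    expected_workers: int,
    host: str,
    locustfile: str = "locustfile.py",
) -> list[str]:
    """Build the argv for a master that waits for workers, then starts (sketch)."""
    return [
        "locust", "-f", locustfile,
        "--master",
        "--headless",                                # no web UI; start from the CLI
        "--expect-workers", str(expected_workers),   # wait for this many workers
        "--users", str(users),
        "--spawn-rate", str(spawn_rate),
        "--host", host,
    ]


# Only prints the command; run it yourself on the master machine.
print(shlex.join(headless_master_command(10_000, 100, 10, "http://localhost:8080")))
```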

The locust command line arguments are key to this. The master runs locust --master, and workers run locust --worker --master-host <master_ip>. The locustfile.py is the same for both. The master splits the user count and spawn rate configured in its UI across all connected workers. Each worker then spawns its share of users and reports its individual statistics back to the master. The master aggregates these, so you get a single, cohesive view of the entire distributed load test.

What most people don’t realize is that the workers never negotiate the split among themselves. The master decides how many users each worker should run and sends every worker its own assignment. If you start 10 workers and set a target of 10,000 users, each worker is told to run about 1,000. If one worker fails to start, the target does not shrink: the master divides the 10,000 users among the nine workers that did connect, and in Locust 2.x it also rebalances users when a worker joins or drops out mid-test. The real risk of a missing worker is therefore that the remaining ones are overloaded, so keep an eye on worker CPU usage, or start a replacement worker.
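The arithmetic behind the 10,000 users over 10 workers example can be sketched as an even split with the remainder handed out one user at a time. This is an illustration only and is not Locust’s actual dispatcher code, which also has to account for user classes and weights; split_users is a hypothetical name.

```python
def split_users(total_users: int, worker_count: int) -> list[int]:
    """Divide total_users as evenly as possible across worker_count workers."""
    if worker_count <= 0:
        raise ValueError("worker_count must be positive")
    base, remainder = divmod(total_users, worker_count)
    # The first `remainder` workers each take one extra user.
    return [base + 1 if i < remainder else base for i in range(worker_count)]


print(split_users(10_000, 10))  # ten workers with 1000 users each
print(split_users(10_000, 9))   # one worker with 1112, eight with 1111
```

The second call shows how the per-worker share grows if the same total were divided among nine workers instead of ten.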

The next step after mastering distributed execution is exploring advanced statistics and reporting, such as using custom listeners to hook into events or integrating with external monitoring systems.