GitHub Actions can orchestrate load tests using Locust, but those runs often fail because of network isolation and resource constraints on the runner.
Here’s a breakdown of what’s really happening and how to fix it:
The Core Problem: Network Isolation and Resource Limits
GitHub Actions runners are ephemeral VMs. They exist in isolated network environments and have finite CPU/memory. When Locust tries to spin up its distributed mode (a master and multiple workers), these limitations become critical. The master needs to communicate with workers, and workers need to reach your application under test. If the network can’t connect or the runner runs out of juice, your load tests will fail.
Common Causes and Solutions
- Master and Worker Network Connectivity Issues
  - Diagnosis: Check the Locust logs for connection errors such as `Could not connect to <worker_ip>`. Workers initiate the connection to the master (port 5557 by default), so the worker logs are usually the most informative. On the runner, use `docker network inspect <network_name>` (if using Docker) or `ip addr` to see IP addresses and available networks.
  - Cause: By default, Docker containers (which Locust workers often run in) are on their own isolated bridge network. The master, also potentially in a container, can’t see them unless explicitly told to join the same network.
  - Fix: Ensure master and workers are on the same Docker network. If you’re running Locust directly on the runner, ensure they can `ping` each other.

        jobs:
          load_test:
            runs-on: ubuntu-latest
            steps:
              - name: Checkout code
                uses: actions/checkout@v3
              # Docker Compose v2 is preinstalled on GitHub-hosted Ubuntu runners,
              # so no separate setup step is needed.
              - name: Start Locust Master and Workers
                run: |
                  docker compose up -d locust-master locust-worker

    And in your `docker-compose.yml`:

        version: '3.7'
        services:
          locust-master:
            image: locustio/locust
            ports:
              - "8089:8089"
            volumes:
              - ./:/mnt/locust
            command: -f /mnt/locust/locustfile.py --master -H http://your-app-url.com
            networks:
              - locust-net
          locust-worker:
            image: locustio/locust
            volumes:
              - ./:/mnt/locust
            # Use the service name for internal DNS
            command: -f /mnt/locust/locustfile.py --worker --master-host locust-master
            networks:
              - locust-net
        networks:
          locust-net:
            driver: bridge

    This works because Docker Compose creates a network (`locust-net`) and allows services to resolve each other by their service names (e.g., `locust-master`).
  - Why it Works: Explicitly placing master and workers on the same user-defined Docker network allows them to discover and communicate with each other using Docker’s internal DNS resolution.
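To confirm that a worker can actually reach the master before starting a test, a small TCP probe can help. The following is a minimal sketch using only the Python standard library; the function names `can_connect` and `wait_for_port` are hypothetical helpers, and it assumes the master listens on Locust’s default worker port, 5557.

```python
import socket
import time


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refused connections, and timeouts
        return False


def wait_for_port(host: str, port: int, attempts: int = 10, delay: float = 2.0) -> bool:
    """Retry can_connect up to `attempts` times, sleeping `delay` seconds between tries."""
    for attempt in range(attempts):
        if can_connect(host, port):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Run from a container attached to the same Docker network, `wait_for_port("locust-master", 5557)` returning `False` suggests a network or DNS problem rather than a problem in the locustfile.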
- Application Under Test (AUT) Accessibility
  - Diagnosis: Locust workers log errors like `HTTPError: HTTPConnectionPool(host='your-app-url.com', port=80): Max retries exceeded with url: / (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at ...>: Failed to establish a new connection: [Errno -2] Name or service not known'))`. Use `curl your-app-url.com` from within the runner or a worker container.
  - Cause: The GitHub Actions runner (or its Docker network) cannot resolve or reach the hostname of your application under test. This is common if your AUT is only accessible from your internal network or a specific IP range.
  - Fix:
    - Expose AUT: If possible, deploy a temporary instance of your AUT in a publicly accessible cloud environment (e.g., a small EC2 instance, a Render service) and point Locust to its public URL.
    - VPN/Tunneling: If your AUT is internal, establish a VPN connection from the runner, or run a tool like `ngrok` inside your network to expose the AUT through a secure tunnel that the CI runner can reach.
    - Runner IP Whitelisting: Whitelist the outbound IP addresses of GitHub Actions runners in your AUT’s firewall. You can find these IPs in GitHub’s documentation (they change, so this is less reliable for long-term use).

    For example:

        jobs:
          load_test:
            runs-on: ubuntu-latest
            steps:
              # ... other steps
              - name: Run Locust
                run: locust --host http://your-public-app-url.com --users 100 --spawn-rate 10 --run-time 5m --headless

  - Why it Works: Ensures the Locust workers have a clear network path and DNS resolution to reach the target application.
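An error such as `Name or service not known` points at DNS, while a refused connection or timeout points at routing or a firewall. The sketch below separates the two cases with the Python standard library; `diagnose` is a hypothetical helper, and it checks only DNS resolution and the TCP handshake, not the HTTP response.

```python
import socket
from urllib.parse import urlsplit


def diagnose(url: str, timeout: float = 3.0) -> str:
    """Classify reachability of url as 'dns_failure', 'connect_failure', or 'ok'."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        addrinfo = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return "dns_failure"
    # Try every resolved address; one successful TCP handshake is enough.
    for family, socktype, proto, _canonname, sockaddr in addrinfo:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
                return "ok"
        except OSError:
            continue
    return "connect_failure"
```

Running `diagnose("http://your-app-url.com")` in a workflow step before starting Locust makes it clearer whether to look at DNS or at firewall and routing rules.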
- Insufficient Runner Resources (CPU/Memory)
  - Diagnosis: Locust master or worker logs show `Killed` messages, or the GitHub Actions job times out with an error indicating resource exhaustion. Monitor CPU and memory usage in the Actions UI if available.
  - Cause: Running a large number of users or high spawn rates on a standard `ubuntu-latest` runner can quickly consume all available CPU and RAM.
  - Fix:
    - Increase User Load Gradually: Start with a lower user count and spawn rate to establish a baseline.
    - Use Larger Runners: If available, switch to a runner with more CPU and memory. For example, use a self-hosted runner with more powerful hardware.
    - Optimize Locustfile: Ensure your `locustfile.py` is efficient. Avoid heavy computation within the `on_start` or `wait_time` methods if possible.
    - Limit Distributed Workers: If running many workers, consider if they are all necessary. Sometimes fewer, more powerful workers are better than many struggling ones.

    For example:

        jobs:
          load_test:
            runs-on: ubuntu-latest  # Consider a more powerful runner if available
            steps:
              # ...
              # If using Docker, ensure your docker-compose.yml doesn't overcommit resources
              # For direct execution:
              - name: Run Locust
                run: locust --host ... --users 500 --spawn-rate 50  # Start lower and increase

  - Why it Works: Provides the necessary computational power and memory for Locust to manage the simulated users and their requests without being OOM-killed or CPU-throttled.
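Locust’s documentation recommends roughly one worker process per CPU core, since a single Python process cannot make full use of more than one core. The sketch below applies that rule of thumb and also computes a stepped ramp of user counts. Both `plan_workers` and `ramp_stages` are hypothetical helpers, and the even-step ramp is an assumption rather than anything Locust prescribes.

```python
import os
from typing import List, Optional


def plan_workers(requested: int, cpu_count: Optional[int] = None) -> int:
    """Cap the number of Locust workers at the number of available CPU cores."""
    if requested < 1:
        raise ValueError("requested must be at least 1")
    # os.cpu_count() can return None, so fall back to a single core.
    cores = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return max(1, min(requested, cores))


def ramp_stages(target_users: int, stages: int) -> List[int]:
    """Split target_users into `stages` increasing user counts ending at the target."""
    if stages < 1 or target_users < stages:
        raise ValueError("need 1 <= stages <= target_users")
    return [target_users * i // stages for i in range(1, stages + 1)]
```

For instance, `ramp_stages(500, 5)` yields `[100, 200, 300, 400, 500]`; each value could be passed as `--users` in successive short runs to see where the runner starts to struggle.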
- Docker Daemon Issues or Configuration
  - Diagnosis: Errors related to Docker not starting, containers failing to pull images, or network interfaces not being created. Check `docker info` and `docker system df`.
  - Cause: The Docker daemon on the GitHub Actions runner might be in a bad state, or its configuration (e.g., storage driver, network settings) might be incompatible with Locust’s needs.
  - Fix:
    - Restart Docker: Sometimes a simple restart helps. In a workflow, you might not have direct control, but ensuring a clean runner environment helps.
    - Clean Docker Cache: `docker system prune -af` can clear out unused images, containers, and networks.
    - Specify Docker Network Driver: Ensure your `docker-compose.yml` uses a standard network driver like `bridge`.

    For example:

        jobs:
          load_test:
            runs-on: ubuntu-latest
            steps:
              # ...
              - name: Clean Docker Cache
                if: always()  # Run even if previous steps fail
                run: docker system prune -af
              # ...
              - name: Start Locust with Docker Compose
                run: docker compose up -d

  - Why it Works: Resets the Docker environment, removing potential conflicts or corrupted states that prevent proper container networking and operation.
- Firewall/Security Group Restrictions
  - Diagnosis: Locust workers report connection refused or timeouts when trying to reach the AUT. Network scans from the runner fail to connect to the AUT’s port.
  - Cause: Network firewalls, security groups (AWS, Azure, GCP), or host-based firewalls on the AUT server are blocking incoming connections from the IP addresses of GitHub Actions runners.
  - Fix:
    - Add Runner IPs to Whitelist: Identify the outbound IP addresses of GitHub Actions runners and add them to your AUT’s firewall rules.
    - Use a Proxy: Route traffic through a known proxy server whose IP is already whitelisted.
    - Disable Firewall Temporarily (for testing): As a temporary measure, disable the firewall on the AUT only for testing purposes and ensure it’s re-enabled afterwards.
  - Why it Works: Allows direct network traffic from the Locust workers to the application under test by explicitly permitting it at the network perimeter.
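GitHub publishes the IP ranges used by its hosted runners through the REST API’s meta endpoint, so a whitelist can be checked or refreshed programmatically. The following is a sketch, assuming the response still exposes the ranges under an `actions` key; `fetch_actions_ranges` and `ip_in_ranges` are hypothetical helper names.

```python
import ipaddress
import json
import urllib.request
from typing import Iterable, List

GITHUB_META_URL = "https://api.github.com/meta"


def fetch_actions_ranges(url: str = GITHUB_META_URL, timeout: float = 10.0) -> List[str]:
    """Download the CIDR ranges GitHub lists for hosted Actions runners."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        meta = json.load(response)
    # Assumption: the ranges live under the "actions" key of the response.
    return meta.get("actions", [])


def ip_in_ranges(ip: str, cidrs: Iterable[str]) -> bool:
    """Return True if ip falls inside any of the given CIDR ranges (IPv4 or IPv6)."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr, strict=False) for cidr in cidrs)
```

Calling `ip_in_ranges(observed_ip, fetch_actions_ranges())` with an address taken from the AUT’s access logs shows whether blocked traffic came from a hosted runner. Because the list is long and changes over time, the ranges should be fetched fresh rather than hard-coded.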
- Incorrect `locustfile.py` Path or Execution
  - Diagnosis: Locust master fails to start, reporting `No such file or directory: /mnt/locust/locustfile.py` or similar.
  - Cause: The `locustfile.py` is not present in the expected directory on the runner or within the Docker container, or the `command` argument in `docker-compose.yml` or the `run` step points to the wrong path.
  - Fix: Ensure your `locustfile.py` is checked out correctly and the path in your `command` or `run` step matches its location relative to the execution context. Use `ls` or `pwd` in your workflow steps to verify.

        jobs:
          load_test:
            runs-on: ubuntu-latest
            steps:
              - name: Checkout code
                uses: actions/checkout@v3
              - name: Verify locustfile exists
                run: ls -l locustfile.py  # Or wherever your locustfile is
              - name: Start Locust
                run: locust --host http://your-app-url.com -f locustfile.py  # Ensure -f points correctly

  - Why it Works: Guarantees that Locust can find and load the test script it needs to execute.
The next error you’ll likely encounter after fixing these is a 5xx error from your application under test, indicating that while Locust can reach your app, your app itself is struggling under the load.