Load testing with Locust often reveals that the database itself is the bottleneck, not your application code’s ability to make the requests.

Let’s look at a typical scenario. You’ve got a Locust script that’s churning through requests, hitting your API, and your API is dutifully translating those into database queries. You see your Locust users reporting high response times, but your application logs show the API handler is super fast. It’s a classic "finger-pointing" situation.

The underlying issue is usually that the database, under concurrent load from many Locust users, is struggling to execute the queries efficiently. This isn’t about your application’s logic; it’s about how the database engine is handling parallel execution, resource contention, and query optimization at scale.

Here’s how to diagnose and fix common database bottlenecks during Locust load tests:

1. Inefficient Queries (The Usual Suspect)

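Diagnosis: The most common culprit is a query that performs poorly when executed many times concurrently. Use your database’s built-in performance analysis tools: EXPLAIN ANALYZE for PostgreSQL, and EXPLAIN for MySQL (MySQL 8.0.18 and later also support EXPLAIN ANALYZE). Run the exact query your API issues on behalf of the Locust users directly against the database, ideally while the load test is running, so the timings reflect contention from concurrent sessions. Note that EXPLAIN ANALYZE actually executes the statement, so wrap data-modifying queries in a transaction that you roll back.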

Example PostgreSQL command:

EXPLAIN ANALYZE SELECT * FROM users WHERE email = 'user@example.com';

Look for:

  • Sequential Scans (Seq Scan) on large tables where an index should be used.
  • High cost or actual time values, especially when compared to other operations in the EXPLAIN ANALYZE output.
  • Nested Loop joins that are inefficient for large datasets.

Fix:

  • Add appropriate indexes: If you see a Seq Scan filtering on a column used in a WHERE clause, add an index on that column. This allows the database to quickly locate specific rows without scanning the entire table.
    PostgreSQL: CREATE INDEX idx_users_email ON users (email);
    MySQL: ALTER TABLE users ADD INDEX idx_users_email (email);
  • Rewrite queries: Sometimes, a query can be simplified or rewritten to use more efficient join strategies or fewer subqueries. For example, avoiding SELECT * and only fetching necessary columns can reduce I/O.

Why it works: Indexes create a lookup structure, like a book’s index, allowing the database to jump directly to the relevant data instead of reading every page. Rewriting queries reduces the computational work the database needs to do per request.
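
The effect of an index on the query plan can be reproduced locally. The following is a minimal sketch that uses Python’s built-in sqlite3 module as a stand-in for PostgreSQL or MySQL, so the plan text differs from what EXPLAIN ANALYZE prints (SQLite reports SCAN and SEARCH rather than Seq Scan and Index Scan). The table layout and row count are assumptions made up for illustration.

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in for the real server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO users (email, name) VALUES (?, ?)",
    [(f"user{i}@example.com", f"User {i}") for i in range(10_000)],
)

QUERY = "SELECT * FROM users WHERE email = ?"


def query_plan(connection):
    """Return SQLite's plan description for QUERY as a single string."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN " + QUERY, ("user42@example.com",)
    ).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_users_email ON users (email)")
plan_after = query_plan(conn)   # expected to describe a search using the index

print("before:", plan_before)
print("after: ", plan_after)
```

The same before-and-after comparison applies on the real database: capture the plan, add the index, and confirm the planner actually switches to it.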

2. Connection Pooling Exhaustion

Diagnosis: Your application (or Locust’s database client) might not be using connection pooling effectively, or the pool size is too small for the number of concurrent Locust users. Each database connection has overhead.

Check your database’s connection count.

PostgreSQL: SELECT count(*) FROM pg_stat_activity;

MySQL: SHOW PROCESSLIST; (lists every connection; SHOW STATUS LIKE 'Threads_connected'; returns just the count)

If the connection count is maxing out your database’s configured limit, or if your application logs show "too many connections" errors, this is likely the issue.

Fix:

  • Configure connection pooling on your application side: Ensure your ORM or database driver is configured with an adequate connection pool size. For example, in SQLAlchemy (Python), you might set pool_size=20 and max_overflow=10.
  • Increase database connection limit: If your database server itself has a max_connections setting that’s too low, increase it. For PostgreSQL, edit postgresql.conf: max_connections = 200 (the default is typically 100, and changing it requires a server restart). For MySQL, my.cnf: max_connections = 200 (the default is 151). Caution: Each connection consumes memory; don’t set this arbitrarily high without understanding your server’s resources.

Why it works: Connection pooling reuses existing database connections, avoiding the expensive setup and teardown for each request. A larger pool allows more concurrent requests to be serviced without waiting for a connection to become available.
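
To make the pooling behavior concrete, here is a minimal sketch of a fixed-size pool built from the standard library only. TinyPool is a hypothetical class written for this illustration, not part of SQLAlchemy or any driver, and SQLite stands in for the real database. It shows the two behaviors that matter under load: connections are reused rather than reopened, and a request fails (or waits) once every connection is checked out.

```python
import queue
import sqlite3
from contextlib import contextmanager


class TinyPool:
    """Hypothetical fixed-size connection pool, for illustration only."""

    def __init__(self, size, database=":memory:"):
        self.created = 0
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # Connections are opened once, up front, and then reused.
            self._idle.put(sqlite3.connect(database, check_same_thread=False))
            self.created += 1

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks up to `timeout` seconds; raises queue.Empty if the pool is exhausted.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the query raised.
            self._idle.put(conn)


def checkout_fails_when_exhausted(pool):
    """Hold two connections, then try for a third with a short timeout."""
    with pool.connection(), pool.connection():
        try:
            with pool.connection(timeout=0.05):
                return False
        except queue.Empty:
            return True


pool = TinyPool(size=2)

# 100 "requests" are served by the same 2 connections.
for _ in range(100):
    with pool.connection() as conn:
        conn.execute("SELECT 1").fetchone()

exhausted = checkout_fails_when_exhausted(pool)
print("connections opened:", pool.created, "| third checkout failed:", exhausted)
```

In a real application the equivalent knobs are the pool settings of your ORM or driver, such as the pool_size and max_overflow values mentioned above.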

3. Insufficient Database Resources (CPU/Memory/IOPS)

Diagnosis: The database server itself is overloaded. Monitor your database server’s CPU, RAM, and disk I/O during the Locust test.

  • CPU: Consistently high CPU usage (e.g., > 80-90%) indicates the processor is a bottleneck.
  • Memory: If your database is constantly swapping to disk (high swap usage), it means it doesn’t have enough RAM to cache data and query execution plans.
  • Disk I/O: High disk read/write latency or queue depth suggests the storage subsystem can’t keep up with the database’s demands.

Tools: top, htop, vmstat, iostat on Linux; Cloud provider monitoring dashboards (AWS CloudWatch, GCP Monitoring, Azure Monitor).

Fix:

  • Scale up the database server: Increase the CPU cores, RAM, or upgrade to faster storage (e.g., SSDs over HDDs, NVMe).
  • Optimize database configuration: Tune parameters like shared_buffers (PostgreSQL) or innodb_buffer_pool_size (MySQL) to better utilize available RAM for caching.
    PostgreSQL: shared_buffers = 2GB (adjust based on total RAM)
    MySQL: innodb_buffer_pool_size = 4GB (adjust based on total RAM)
  • Offload reads to replicas: For read-heavy workloads, direct read traffic to replica databases to reduce load on the primary.

Why it works: More powerful hardware can process more queries. Proper memory allocation allows the database to keep frequently accessed data and execution plans in RAM, drastically reducing disk I/O. Read replicas distribute the query load.
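
As a rough illustration of "adjust based on total RAM", the sketch below derives starting values from the server’s memory. The fractions are assumptions: 25% for shared_buffers and 50% for innodb_buffer_pool_size are commonly cited starting points for a dedicated database server, not values taken from your workload, and suggest_cache_sizes is a hypothetical helper.

```python
def suggest_cache_sizes(total_ram_gb, pg_fraction=0.25, mysql_fraction=0.5):
    """Return starting-point cache sizes in MB for a dedicated database server.

    The default fractions are rules of thumb (assumptions), to be validated
    by monitoring cache hit rates and swap usage under load.
    """
    total_mb = total_ram_gb * 1024
    return {
        "shared_buffers": f"{int(total_mb * pg_fraction)}MB",
        "innodb_buffer_pool_size": f"{int(total_mb * mysql_fraction)}MB",
    }


# An 8 GB server yields the 2GB / 4GB examples used above.
sizes = suggest_cache_sizes(8)
print(sizes)
```

Treat the result as a first guess, then rerun the Locust test and watch swap and disk I/O before raising it further.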

4. Lock Contention

Diagnosis: When multiple transactions try to access and modify the same rows or tables simultaneously, they can block each other, leading to timeouts or severe performance degradation. Check for sessions that are waiting on locks.

PostgreSQL: SELECT pid, usename, datname, query, wait_event_type, wait_event FROM pg_stat_activity WHERE wait_event_type = 'Lock';

MySQL: SHOW ENGINE INNODB STATUS; (look at the TRANSACTIONS and LATEST DETECTED DEADLOCK sections)

Fix:

  • Optimize transactions: Keep transactions as short as possible. Commit or rollback quickly.
  • Reduce lock granularity: Ensure your queries are not locking entire tables when only specific rows are needed. Use row-level locking where possible.
  • Review isolation levels: Understand the transaction isolation level your application is using. Higher isolation levels (like SERIALIZABLE) provide stronger guarantees but increase lock contention or, in PostgreSQL, serialization failures that force the application to retry. Consider if a lower level (like READ COMMITTED) is sufficient.

Why it works: Shorter, more granular transactions and appropriate isolation levels minimize the time windows where data is locked, allowing more concurrent access.
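
The blocking effect of an open transaction can be shown deterministically with two connections. This sketch again uses SQLite as a stand-in, with one important difference that is an assumption of the demo: SQLite locks the whole database file for writes, whereas PostgreSQL and InnoDB lock individual rows. The counters table and bump helper are made up for illustration.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "locks.db")

# isolation_level=None means BEGIN/COMMIT are issued explicitly.
# timeout=0 makes a blocked writer fail immediately instead of waiting.
writer_a = sqlite3.connect(path, isolation_level=None, timeout=0)
writer_b = sqlite3.connect(path, isolation_level=None, timeout=0)

writer_a.execute("CREATE TABLE counters (id INTEGER PRIMARY KEY, hits INTEGER)")
writer_a.execute("INSERT INTO counters VALUES (1, 0)")


def bump(conn):
    """Try to increment the counter; return False if another writer holds the lock."""
    try:
        conn.execute("UPDATE counters SET hits = hits + 1 WHERE id = 1")
        return True
    except sqlite3.OperationalError:
        return False


writer_a.execute("BEGIN IMMEDIATE")   # A opens a write transaction and holds it
bump(writer_a)
blocked = not bump(writer_b)          # B cannot write while A's transaction is open
writer_a.execute("COMMIT")            # A finishes quickly and releases the lock
unblocked = bump(writer_b)            # B now succeeds

hits = writer_a.execute("SELECT hits FROM counters WHERE id = 1").fetchone()[0]
print("blocked:", blocked, "| unblocked after commit:", unblocked, "| hits:", hits)
```

The longer writer A keeps its transaction open, the longer writer B is stuck, which is why committing or rolling back quickly matters so much under concurrent load.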

5. Network Latency Between App and DB

Diagnosis: While less common if your app and DB are co-located, significant network latency or packet loss between your application servers and the database server can add up, especially for chatty queries or applications with many small, frequent DB calls. Use ping and traceroute from your application server to the database server. Look for high RTT (Round Trip Time) or packet loss.

Fix:

  • Co-locate application and database servers: Place them in the same availability zone or datacenter.
  • Optimize network configuration: Ensure sufficient bandwidth and a stable network path.

Why it works: Reducing the physical distance and network hops decreases the time it takes for data packets to travel between the application and the database.
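
A quick back-of-the-envelope calculation shows why chatty access patterns amplify latency. The sketch below assumes the queries in a request run sequentially, so each one pays a full round trip; the RTT values and query counts are illustrative assumptions, not measurements.

```python
def network_overhead_ms(round_trips, rtt_ms):
    """Time spent purely on the network for sequential database calls."""
    return round_trips * rtt_ms


# 20 small sequential queries per API request across zones (assumed 2 ms RTT).
chatty_cross_zone = network_overhead_ms(20, 2.0)

# The same 20 queries with app and DB co-located (assumed 0.5 ms RTT).
chatty_same_zone = network_overhead_ms(20, 0.5)

# The same work batched into a single round trip across zones.
batched_cross_zone = network_overhead_ms(1, 2.0)

print(chatty_cross_zone, chatty_same_zone, batched_cross_zone)
```

Under these assumptions, co-locating the servers cuts the overhead to a quarter, and batching the calls cuts it further, which is why both the network path and the number of round trips are worth checking.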

6. Database Configuration Tuning (Beyond Memory)

Diagnosis: Default database configurations are often conservative. Parameters related to query planning, buffer management, and background worker processes might be suboptimal for high-throughput loads.

Fix:

  • PostgreSQL: Tune work_mem (for sorting and hashing), effective_cache_size (helps the query planner estimate available memory), and max_worker_processes and max_parallel_workers_per_gather (for parallel query execution).
    work_mem = 32MB (start here, increase if EXPLAIN ANALYZE shows sorts spilling to disk)
    effective_cache_size = 4GB (approx. 50-75% of total RAM)
    max_parallel_workers_per_gather = 4 (adjust based on CPU cores)
  • MySQL: Tune tmp_table_size and max_heap_table_size (for in-memory temporary tables), and thread-related settings like thread_cache_size. query_cache_size applies only to MySQL 5.7 and earlier; the query cache was removed in MySQL 8.0.

Why it works: These parameters influence how the database engine plans and executes queries, manages internal buffers, and utilizes available CPU for parallel operations.
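
Before raising work_mem, it helps to estimate the worst case, because PostgreSQL applies the limit per sort or hash operation in each backend and in each parallel worker, not per server. The helper below is a hypothetical, simplified estimate: it ignores hash_mem_multiplier and assumes every connection runs its operations at the same moment.

```python
def worst_case_work_mem_mb(
    work_mem_mb, active_connections, operations_per_query=1, parallel_workers=0
):
    """Upper-bound estimate of memory used by sorts and hashes, in MB.

    Each active connection, plus each of its parallel workers, may use up to
    work_mem for every sort or hash operation in its query plan.
    """
    processes = active_connections * (1 + parallel_workers)
    return work_mem_mb * operations_per_query * processes


# 100 Locust-driven connections, 2 sort/hash steps per query, work_mem = 32MB.
serial_mb = worst_case_work_mem_mb(32, 100, operations_per_query=2)

# The same load if every query also used 4 parallel workers.
parallel_mb = worst_case_work_mem_mb(
    32, 100, operations_per_query=2, parallel_workers=4
)

print(serial_mb, parallel_mb)
```

Real usage is normally far below this bound, but the estimate shows why work_mem and max_parallel_workers_per_gather should be raised together with an eye on max_connections and total RAM.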

After addressing these, the next challenge you’ll likely encounter is effectively managing and analyzing the sheer volume of performance metrics generated by your database under load.
