The Nginx worker process is failing to accept new connections because it has hit its configured limit for concurrent connections.
Common Causes and Fixes
- `worker_connections` is too low:
  - Diagnosis: Check the `worker_connections` directive in your `nginx.conf`, then compare it with the number of file descriptors a worker process actually has open.

    ```bash
    # Socket summary for the whole host:
    sudo ss -s
    # Or, to see file descriptors for a specific worker process:
    sudo lsof -p "$(pgrep -f 'nginx: worker process' | head -n 1)" | wc -l
    ```
  - Fix: Increase the `worker_connections` value in your `nginx.conf` file. For example, to allow 4096 connections per worker:

    ```nginx
    events {
        worker_connections 4096;
    }
    ```

    This directive sets the maximum number of simultaneous connections that a single worker process can handle. Each connection consumes a file descriptor.
  - Why it works: Raising this limit gives each worker process the capacity to manage more incoming connections before it starts rejecting them.
- System-wide file descriptor limit is too low:
  - Diagnosis: Nginx workers, like all processes, are limited by the operating system’s maximum number of open file descriptors. Check the current limit for the Nginx user:

    ```bash
    # Common values for <nginx_user> are 'www-data' and 'nginx'.
    sudo su - <nginx_user> -c 'ulimit -n'
    ```

    You can also check the system-wide limit:

    ```bash
    cat /proc/sys/fs/file-max
    ```
  - Fix: Raise the open-files limit for the Nginx user, and keep `worker_connections` in `nginx.conf` less than or equal to that `ulimit -n` value (and below `file-max`). Edit `/etc/security/limits.conf`:

    ```
    <nginx_user> soft nofile 65536
    <nginx_user> hard nofile 65536
    ```

    Replace `<nginx_user>` with your Nginx user (e.g., `www-data`). The new limits apply only to processes started afterwards, so restart Nginx for them to take effect. Because `limits.conf` is applied through PAM sessions, it may not reach an Nginx that runs as a systemd service; in that case, set `LimitNOFILE=` in the unit file or use the `worker_rlimit_nofile` directive in `nginx.conf`. To raise the limit of an already-running worker without a restart, you might use `prlimit`:

    ```bash
    sudo prlimit --pid "$(pgrep -f 'nginx: worker process' | head -n 1)" --nofile=65536:65536
    ```

    This changes only the single worker selected by `head -n 1`; repeat it for each worker PID.
  - Why it works: This ensures that the operating system itself doesn’t prevent Nginx workers from opening the number of file descriptors that `worker_connections` calls for. In practice, `worker_connections` is capped by the available file descriptors.
- `worker_processes` is too high relative to `worker_connections`:
  - Diagnosis: If you run many `worker_processes` with a low `worker_connections`, the total capacity across all workers (`worker_processes` × `worker_connections`) might still be insufficient. However, this is less common than the other causes of this error. More typically, `worker_connections` is the direct bottleneck.
  - Fix: This does not fix the error directly, but make sure your `worker_processes` setting is reasonable for your CPU cores. A common setting is `worker_processes auto;` or `worker_processes <number_of_cpu_cores>;`. If you must have many workers, ensure `worker_connections` is scaled appropriately. Note that `worker_processes` belongs in the main (top-level) context, not inside the `http` block.

    ```nginx
    worker_processes auto;  # Or set to the number of CPU cores

    events {
        worker_connections 4096;
    }

    http {
        # ...
    }
    ```
  - Why it works: This ensures a balanced distribution of work and resources. However, the primary error is about individual worker capacity, not the total number of workers.
- Too many long-lived connections (e.g., keepalive, WebSockets):
  - Diagnosis: If your application serves many persistent connections (like WebSockets, or clients with long `keepalive_timeout` values), these connections consume file descriptors even when idle. Monitor active connections:

    ```bash
    # -p adds the owning process name, which the grep relies on.
    sudo ss -tanp state established | grep -c nginx
    ```
  - Fix: Reduce `keepalive_timeout` in your `nginx.conf` (usually in the `http` or `server` block) to a more reasonable value, e.g., `keepalive_timeout 65;` (the default is 75 seconds). For WebSockets, ensure your application logic is efficient and closes connections when they are no longer needed.

    ```nginx
    http {
        keepalive_timeout 65;
        # ...
    }
    ```
  - Why it works: A shorter `keepalive_timeout` causes Nginx to close idle client connections sooner, freeing up file descriptors.
- Unclosed connections in upstream applications:
  - Diagnosis: If Nginx proxies to backend applications and those applications do not properly close their connections to Nginx, or hold them open unnecessarily, the lingering upstream connections still count against each worker’s connection limit, so Nginx can hit that limit sooner than client traffic alone would suggest. Check the number of connections from Nginx to your upstream:

    ```bash
    sudo ss -tan state established | grep '<upstream_ip>:<upstream_port>' | wc -l
    ```
  - Fix: Investigate your upstream application’s connection handling. Ensure it’s not leaking connections or holding them open longer than necessary. This is an application-level fix, not an Nginx configuration one.
  - Why it works: Properly managed upstream connections keep Nginx from holding resources for idle or unresponsive backend services.
- Nginx `worker_connections` is set to a very high value without a corresponding `ulimit`:
  - Diagnosis: If you’ve set `worker_connections` to an extremely high number (e.g., 100000) but your system’s `ulimit -n` is much lower (e.g., 4096), Nginx will keep accepting connections until the OS refuses to hand out more file descriptors. The resulting error can look, confusingly, like a problem with the Nginx workers themselves.

    ```bash
    # Check worker_connections in the effective configuration
    sudo nginx -T 2>/dev/null | grep worker_connections
    # Check the open-files limit of a running worker
    sudo grep 'Max open files' "/proc/$(pgrep -f 'nginx: worker process' | head -n 1)/limits"
    ```
  - Fix: Ensure `worker_connections` is less than or equal to the `ulimit -n` for your Nginx user. For example, if `ulimit -n` is 65536, setting `worker_connections 32768;` is safe.

    ```nginx
    events {
        worker_connections 32768;
    }
    ```
  - Why it works: This aligns the number of connections Nginx is willing to accept with the file descriptors the system can actually provide, so the OS does not deny connection attempts.
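
The sizing rules in the list above can be sketched as two small shell helpers. This is a rough illustration, not an official Nginx tool: the function names `max_clients` and `check_fd_headroom` are hypothetical, the limit is assumed to be a plain number (not `unlimited`), and the "proxy" halving is an approximation based on each proxied client also holding one upstream connection.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking Nginx connection sizing.

# max_clients WORKER_PROCESSES WORKER_CONNECTIONS [proxy]
# Prints the approximate number of simultaneous clients Nginx can serve.
# Pass "proxy" as the third argument when Nginx reverse-proxies, because
# each client then also occupies a connection slot for the upstream side.
max_clients() {
    total=$(( $1 * $2 ))
    if [ "${3:-}" = "proxy" ]; then
        total=$(( total / 2 ))
    fi
    echo "$total"
}

# check_fd_headroom WORKER_CONNECTIONS NOFILE_LIMIT
# Prints "ok" when worker_connections fits under the per-process
# open-files limit, and "too-high" otherwise.
check_fd_headroom() {
    if [ "$1" -le "$2" ]; then
        echo "ok"
    else
        echo "too-high"
    fi
}

max_clients 4 4096             # 4 workers serving static content
max_clients 4 4096 proxy       # same workers acting as a reverse proxy
check_fd_headroom 32768 65536  # the "safe" example from the list
check_fd_headroom 100000 4096  # the mismatched example from the list
```

Feeding in the real values (for example from `nginx -T` and a worker’s `/proc/<pid>/limits`) gives a quick check that a configuration change actually leaves headroom.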
After fixing these, you might encounter `connect() failed (111: Connection refused)` if your upstream applications are now overloaded or not running.
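
To see whether that follow-up error is actually occurring, one option is to count its occurrences in the error log. The sketch below assumes the log lives at `/var/log/nginx/error.log` (a common packaging default, but check your `error_log` directive); the helper name `count_refused` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count log lines that report a refused upstream connection.
count_refused() {
    # -F matches the message literally. grep exits non-zero when nothing
    # matches, so "|| true" keeps a count of 0 from looking like a failure.
    grep -cF 'connect() failed (111: Connection refused)' "$1" || true
}

# Assumed log location; adjust it to match your error_log directive.
log=/var/log/nginx/error.log
if [ -r "$log" ]; then
    count_refused "$log"
fi
```

A count that keeps growing after the connection limits were raised suggests the bottleneck has moved to the upstream service.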