Nginx worker processes are exiting because the operating system is terminating them due to a segmentation fault, meaning they tried to access memory they shouldn’t have.
Common Causes and Fixes for Nginx Worker Process Exited on Signal 11
A segmentation fault (Signal 11) in an Nginx worker process is a serious issue that indicates a bug or a misconfiguration leading to memory corruption or illegal memory access. This often manifests as worker processes crashing and being restarted by the master process, leading to intermittent service disruptions. Here are the most common causes and how to diagnose and fix them:
1. Corrupted Module or Third-Party Module Issues
Third-party Nginx modules, or even core modules that have become corrupted, are a frequent source of segmentation faults. These modules interact directly with Nginx’s internal memory structures, and a bug in their code can easily lead to memory violations.
Diagnosis:
- Check Nginx error logs: Look for specific messages preceding the "worker process exited on signal 11" that might indicate which module was being processed or where the fault occurred.
- Recompile Nginx: If you suspect a specific module, try recompiling Nginx without that module. If the crashes stop, you’ve found your culprit.
- Check module source: If it’s a third-party module, examine its bug tracker or recent commits for known issues.
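The error-log check above can be scripted. This is a minimal sketch, assuming the usual error-log wording ("worker process <pid> exited on signal 11") and a log timestamp in the first two fields. The function names and the log path in the usage comment are illustrative, not part of Nginx.

```shell
#!/usr/bin/env bash
# Count worker crashes caused by signal 11.
# Reads the error log from stdin, so it works with files or zcat output alike.
count_sig11() {
    # grep -c prints 0 and returns non-zero when nothing matches; keep the exit status clean.
    grep -c 'exited on signal 11' || true
}

# Print the timestamp (date and time fields) of each crash,
# which helps correlate crashes with deploys or traffic spikes.
sig11_times() {
    awk '/exited on signal 11/ {print $1, $2}'
}

# Usage (path is an assumption; use the file named in your error_log directive):
#   count_sig11 < /var/log/nginx/error.log
#   sig11_times < /var/log/nginx/error.log
```

If the count keeps growing after a module has been removed, that module was probably not the cause.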
Fix:
- Update or remove the module: If the module is outdated, update it to the latest stable version. If it’s a known buggy module, consider removing it or finding an alternative.
- Recompile Nginx with correct module flags: Ensure you are compiling Nginx with the correct --add-module=/path/to/module flags, and that the module’s source code is compatible with your Nginx version.
- Example recompilation (if ngx_http_fancyindex_module is suspected):

    cd /usr/local/src/nginx-1.22.1    # Your Nginx source directory
    # Pass your other configure options as usual, but temporarily leave out the suspect module:
    #   --add-module=/usr/local/src/ngx_http_fancyindex_module
    ./configure --prefix=/etc/nginx \
        --sbin-path=/usr/sbin/nginx \
        --modules-path=/usr/lib/nginx/modules
    make
    sudo make install
    sudo systemctl restart nginx

- Why it works: Recompiling with the correct flags ensures the module is integrated properly. Removing a faulty module eliminates the source of the memory corruption.
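Before recompiling, it helps to know which third-party modules the running binary was built with. The sketch below extracts them from the configure arguments that `nginx -V` prints to stderr. The function name is illustrative, and it assumes module paths contain no spaces or quotes.

```shell
#!/usr/bin/env bash
# List the source paths of modules added with --add-module or --add-dynamic-module.
# Reads `nginx -V` output from stdin.
list_added_modules() {
    grep -oE -- '--add(-dynamic)?-module=[^ ]+' | sed 's/^--add[^=]*=//'
}

# Usage (nginx -V writes to stderr, hence the redirect):
#   nginx -V 2>&1 | list_added_modules
```

Each path printed is a candidate to drop from the next `./configure` run when bisecting for the faulty module.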
2. Insufficient System Resources (Memory/Swap)
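Memory exhaustion is a less common cause of Signal 11: the kernel’s OOM killer terminates processes with SIGKILL (signal 9), not SIGSEGV. Even so, severe memory pressure can expose latent bugs, such as code that does not check for a failed allocation, and these can surface as memory access violations within Nginx worker processes.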
Diagnosis:
- Monitor system memory and swap: Use top, htop, or free -h to check current memory and swap usage.
- Check dmesg for OOM killer messages: Although Signal 11 is not directly caused by the OOM killer, a system under extreme memory stress might exhibit related issues.

    sudo dmesg | grep -i "out of memory"
Fix:
- Increase RAM: The most direct solution is to add more physical memory to the server.
- Increase swap space: If adding RAM is not feasible, increasing swap space can provide a temporary buffer.

    # Create a 2GB swap file (adjust size as needed)
    sudo fallocate -l 2G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    # Make it permanent by adding to /etc/fstab
    echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

- Optimize Nginx configuration: Reduce worker_connections, tune buffer sizes, or offload static content.
- Why it works: Providing more memory or swap space prevents processes from being starved of resources, reducing the likelihood of them accessing invalid memory regions due to memory contention.
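To judge whether memory pressure is plausible before adding RAM or swap, the available-memory share can be read directly from /proc/meminfo. This is a rough sketch for Linux; the function name is illustrative, and what counts as "too low" is a judgment call for your workload.

```shell
#!/usr/bin/env bash
# Print available memory as a whole-number percentage of total memory.
# Takes an optional path to a meminfo-format file (defaults to /proc/meminfo).
mem_available_pct() {
    awk '/^MemTotal:/     {total = $2}
         /^MemAvailable:/ {avail = $2}
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Usage:
#   mem_available_pct
```

Running this from cron or a monitoring agent around the time of the crashes shows whether they coincide with low available memory.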
3. Incorrectly Configured client_body_buffer_size or large_client_header_buffers
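These directives control how Nginx buffers client request bodies and headers. Stock Nginx handles undersized buffers gracefully: a body larger than client_body_buffer_size is written to a temporary file, and oversized headers are rejected with a 400 or 414 response. A crash tied to these settings therefore more likely involves a module that assumes the whole request fits in memory and writes beyond allocated buffer boundaries.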
Diagnosis:
- Analyze typical request sizes: Use Nginx access logs to understand the typical size of client request bodies and headers. The default log format does not record request size, so add the $request_length variable to your log_format if you need it.
- Examine client_body_buffer_size and large_client_header_buffers: Check your nginx.conf and any included server/location blocks for these directives.
Fix:
- Increase buffer sizes: Gradually increase these values. A common starting point for client_body_buffer_size is 128k or 256k, and for large_client_header_buffers, 2 8k or 4 16k.
- Example configuration:

    http {
        client_body_buffer_size 256k;
        large_client_header_buffers 4 16k;
        # ... other http settings ...
    }

- Why it works: Larger buffers provide sufficient space for Nginx to store incoming request data without overflowing allocated memory, preventing memory corruption.
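Because these directives can be set in several included files, it is easy to miss an override. The sketch below filters the full configuration dump produced by `nginx -T` down to the two directives discussed here. The function name is illustrative.

```shell
#!/usr/bin/env bash
# Show every active (non-commented) occurrence of the two buffer directives.
# Reads a configuration dump from stdin and strips leading indentation.
show_buffer_directives() {
    grep -E '^[[:space:]]*(client_body_buffer_size|large_client_header_buffers)[[:space:]]' \
        | sed 's/^[[:space:]]*//'
}

# Usage (nginx -T tests the configuration and prints it, including all included files):
#   sudo nginx -T 2>/dev/null | show_buffer_directives
```

No output means neither directive is set anywhere, so the compiled-in defaults apply.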
4. Kernel or System Library Bugs
In rare cases, the segmentation fault might be caused by a bug in the operating system kernel or a critical system library that Nginx relies upon (e.g., libc).
Diagnosis:
- Check dmesg and system logs: Look for any kernel-level errors or messages that coincide with the Nginx crashes.

    sudo dmesg | grep -i "segfault"
    sudo journalctl -xe | grep -i "segfault"

- Reproduce the issue with a minimal configuration: Temporarily revert to a very basic Nginx configuration to rule out application-level issues.
- Check for system updates: See if there are known kernel or library bugs for your specific OS version.
Fix:
- Update the operating system and kernel: Apply the latest stable updates for your Linux distribution.

    # For Debian/Ubuntu
    sudo apt update && sudo apt upgrade -y
    # For RHEL/CentOS/AlmaLinux
    sudo yum update -y
    # Reboot after kernel updates
    sudo reboot

- Why it works: OS and kernel updates often include patches for memory management bugs and stability fixes that can resolve underlying issues causing segmentation faults.
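The kernel's segfault messages usually end with the object in which the fault occurred, in the form "... error 4 in libc-2.31.so[base+size]". Tallying that field helps separate faults inside a system library from faults inside the nginx binary or a module. This sketch assumes that message layout; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Tally segfaults by the object (binary or shared library) named in the kernel log line.
# Reads kernel log lines from stdin and prints "<count> <object>", most frequent first.
fault_objects() {
    sed -n 's/.*segfault at .* in \([^[]*\)\[.*/\1/p' \
        | sort | uniq -c | sort -rn \
        | awk '{print $1, $2}'
}

# Usage:
#   sudo dmesg | grep -i nginx | fault_objects
```

Faults concentrated in a libc or libssl object make a system-library problem more plausible, while faults in nginx itself or a module's .so point back to the earlier sections.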
5. Incorrectly Compiled Nginx or Dependencies
If Nginx was compiled from source, an incorrect configuration during the ./configure step or issues with build dependencies can lead to unstable binaries.
Diagnosis:
- Review ./configure output: If you compiled Nginx yourself, examine the output of the ./configure script for any warnings or errors related to missing libraries or incompatible features.
- Check nginx -V: Ensure the reported build flags and modules match your expectations.

    nginx -V
Fix:
- Recompile Nginx carefully: Ensure all necessary development headers and libraries are installed (e.g., libpcre3-dev, zlib1g-dev, libssl-dev on Debian/Ubuntu). Run ./configure again with all required options and then make && sudo make install.
- Example dependencies for Debian/Ubuntu:

    sudo apt install build-essential libpcre3 libpcre3-dev zlib1g zlib1g-dev libssl-dev

- Why it works: A clean and correct compilation process ensures that Nginx is built with all its dependencies properly linked and configured, preventing runtime errors due to missing or incompatible components.
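One quick check on a self-built binary is whether every shared library it links against still resolves. The sketch below filters `ldd` output for unresolved entries. The function name is illustrative, and a clean result does not rule out version mismatches between headers and libraries.

```shell
#!/usr/bin/env bash
# Print the names of shared libraries that ldd reports as "not found".
# Reads ldd output from stdin.
missing_libs() {
    awk '/=> not found/ {print $1}'
}

# Usage:
#   ldd "$(command -v nginx)" | missing_libs
```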
6. Malformed or Malicious Client Requests
Although less common for Signal 11 directly (more often resulting in specific error codes), extremely malformed or crafted requests designed to exploit buffer overflows or other memory vulnerabilities in Nginx’s parsing logic could theoretically trigger a segmentation fault.
Diagnosis:
- Scrutinize access logs for unusual requests: Look for requests with excessively long headers, unusual character encodings, or strange request methods.
- Enable verbose error logging: Temporarily increase Nginx’s error log level to debug (use with caution in production) to capture more detailed information about request processing. Debug-level output requires an Nginx binary built with --with-debug.

    error_log /var/log/nginx/error.log debug;

- Use a Web Application Firewall (WAF): Tools like ModSecurity can help identify and block suspicious requests before they reach Nginx.
Fix:
- Update Nginx: Ensure you are running a recent, stable version of Nginx, as many security vulnerabilities are patched over time.
- Implement WAF rules: Configure a WAF to detect and reject malformed or malicious requests.
- Why it works: Updating Nginx addresses known vulnerabilities. A WAF acts as a gatekeeper, filtering out potentially harmful inputs that could trigger memory safety issues.
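As a starting point for the access-log review, the sketch below flags requests with unusually long URIs. It assumes the default "combined" log format, where the URI is the seventh whitespace-separated field. The function name and the 2048-character default threshold are illustrative choices, not Nginx limits.

```shell
#!/usr/bin/env bash
# Print "<client address> <URI length>" for every request whose URI exceeds a limit.
# Reads access-log lines in the combined format from stdin.
# $1: length threshold in characters (default 2048, an arbitrary illustrative value).
long_requests() {
    local limit="${1:-2048}"
    awk -v limit="$limit" 'length($7) > limit {print $1, length($7)}'
}

# Usage (path is an assumption; use the file named in your access_log directive):
#   long_requests 4096 < /var/log/nginx/access.log
```

Matching the flagged requests against crash times in the error log shows whether a particular client or request pattern precedes the crashes.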
Once the segmentation faults are resolved, the next problems you might encounter are socket errors or connection timeouts. These appear when the configuration was previously too aggressive or the server resource-starved, and worker processes are still struggling to keep up.