This happens when your load balancer receives requests successfully but cannot establish a healthy connection to any of its configured backend servers, so requests end in upstream timeouts and 502 Bad Gateway errors.

Common Causes and Fixes

1. Backend Server Unreachable (Network)

  • Diagnosis: From the load balancer’s perspective, can it ping or connect to the backend IP/port?
    nc -vz <backend_ip> <backend_port>
    
    If this fails, the load balancer cannot complete even a TCP connection to the backend.
  • Fix: Ensure firewall rules on the backend server (e.g., iptables, ufw, cloud provider security groups) explicitly allow inbound traffic from the load balancer’s IP address (or subnet) on the application port.
    # Example for iptables on backend server
    sudo iptables -A INPUT -s <load_balancer_ip> -p tcp --dport <backend_port> -j ACCEPT
    sudo iptables -A INPUT -p tcp --dport <backend_port> -j DROP # Drop others if restrictive
    
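    This lets the TCP handshake between the load balancer and the backend complete. Because -A appends to the end of the chain, an earlier DROP or REJECT rule still takes precedence; list the rules with sudo iptables -L INPUT -n --line-numbers and use -I to insert the ACCEPT rule above it if needed.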
  • Fix: Verify that the backend server’s network interface is active and has the correct IP address configured.
    ip addr show
    
    This confirms that the interface is up and holds the IP address the load balancer is trying to reach. It does not show whether anything is listening on that address; that is covered in the next section.
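
To run the reachability test against several backends in one pass, a small helper along these lines may be useful. It is a minimal sketch: check_backend is a hypothetical name, and it assumes bash (for the /dev/tcp pseudo-device) and the coreutils timeout command are available instead of nc.

```shell
# check_backend is a hypothetical helper, not part of any load balancer tooling.
# It attempts a TCP connect through bash's /dev/tcp and prints OK or FAIL.
check_backend() {
    local host=$1
    local port=$2
    # Give up after 2 seconds so a silently dropped SYN does not hang the caller.
    if timeout 2 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
        echo "OK $host:$port"
    else
        echo "FAIL $host:$port"
        return 1
    fi
}
```

Run from the load balancer host, check_backend 192.168.1.100 8080 prints one line per backend. A FAIL that takes the full two seconds usually points to a firewall silently dropping packets, while an immediate FAIL suggests the connection was refused because nothing is listening.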

2. Backend Application Not Running or Listening

  • Diagnosis: Is the application process actually running on the backend server and listening on the expected port?
    sudo netstat -tulnp | grep <backend_port>
    # Or on newer systems
    sudo ss -tulnp | grep <backend_port>
    
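    You should see a process listening on 0.0.0.0:<backend_port> or <backend_ip>:<backend_port>. If it is bound only to 127.0.0.1:<backend_port>, the application accepts local connections only and the load balancer cannot reach it.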
  • Fix: Start or restart the application service. For example, if using systemd:
    sudo systemctl restart <your_app_service_name>
    
    This ensures the application is active and ready to accept incoming connections.
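
The listening check can be scripted so that a loopback-only binding is flagged explicitly. The following is a sketch under stated assumptions: listen_scope is a hypothetical helper, and it expects the column layout of ss -tln (TCP only, so the state is in the first column and the local address in the fourth).

```shell
# listen_scope is a hypothetical helper: it reads `ss -tln` output on stdin
# and reports how the given TCP port is bound.
listen_scope() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # Field 4 is local address:port; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] != port) next
            addr = substr($4, 1, length($4) - length(parts[n]) - 1)
            if (addr == "127.0.0.1" || addr == "[::1]") loopback = 1
            else other = 1
        }
        END {
            if (other)          print "non-loopback"
            else if (loopback)  print "loopback-only"
            else                print "not-listening"
        }
    '
}
```

For example, ss -tln | listen_scope 8080 prints loopback-only when the application is bound to 127.0.0.1 alone. A non-loopback result only means the port is bound to some external or wildcard address; it is still worth confirming that it is the address the load balancer targets.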

3. Incorrect Backend Server IP/Port in Load Balancer Config

  • Diagnosis: Double-check the server directives in your load balancer configuration file (e.g., Nginx nginx.conf or HAProxy haproxy.cfg).
    • Nginx:
      # In http block or server block
      upstream backend_servers {
          server 192.168.1.100:8080;
          server 192.168.1.101:8080;
      }
      
    • HAProxy:
      backend my_app
          server app1 192.168.1.100:8080 check
          server app2 192.168.1.101:8080 check
      
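  • Fix: Correct any typos or outdated IP addresses/ports in the load balancer’s upstream or backend definitions, then validate and reload the configuration.
    # Nginx: test the configuration and reload only if it passes
    sudo nginx -t && sudo nginx -s reload
    # HAProxy: check the configuration file first (adjust the path if yours differs)
    sudo haproxy -c -f /etc/haproxy/haproxy.cfg && sudo systemctl reload haproxy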
    
    This ensures the load balancer is trying to connect to the correct destinations.
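
To see exactly which destinations Nginx has been told to use, the upstream entries can be pulled out of the configuration. This is a rough sketch, not a real parser: list_upstream_servers is a hypothetical helper that assumes one directive per line and an upstream block whose closing brace sits on its own line.

```shell
# list_upstream_servers is a hypothetical helper: it reads Nginx configuration
# on stdin and prints the address of each `server` entry inside an upstream block.
list_upstream_servers() {
    awk '
        $1 == "upstream"              { in_upstream = 1; next }
        in_upstream && $1 == "}"      { in_upstream = 0; next }
        in_upstream && $1 == "server" {
            sub(/;$/, "", $2)   # drop the trailing semicolon, if present
            print $2
        }
    '
}
```

One way to use it is sudo nginx -T 2>/dev/null | list_upstream_servers, since nginx -T dumps the full configuration including included files. Each printed address can then be compared against the backends you expect, or tested with nc -vz.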

4. Health Check Failures

  • Diagnosis: Load balancers often perform health checks. If all backends fail their health checks, the load balancer will route no traffic. Check the load balancer’s status page or logs for health check details.
    • Nginx (with the third-party nginx-module-vts, or the NGINX Plus dashboard): Check the VTS or NGINX Plus dashboard for upstream status.
    • HAProxy: Access the stats socket or port:
      echo "show stat" | sudo socat stdio /var/run/haproxy/admin.sock
      
      Look for backend servers in a DOWN or N/A state.
  • Fix: Ensure the health check endpoint on the backend application is correctly configured, accessible, and returns a success code (e.g., HTTP 200 OK).
    • Nginx (on the backend host, if Nginx serves the application there):
      location /healthz {
          access_log off;
          return 200 "OK";
      }
      
    • HAProxy (on the load balancer):
      backend my_app
          option httpchk GET /healthz
          http-check expect status 200
          server app1 192.168.1.100:8080 check
      
    This ensures the load balancer accurately determines backend availability.
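
The CSV returned by HAProxy's show stat command is wide, so filtering it down to the servers that are not UP can save some scrolling. The sketch below is one way to do that; unhealthy_servers is a hypothetical helper, and it finds the svname and status columns from the header line rather than assuming fixed positions.

```shell
# unhealthy_servers is a hypothetical helper: it reads the CSV from HAProxy's
# "show stat" on stdin and prints "proxy/server status" for every server row
# whose status does not start with UP. Rows with status "no check" are also
# printed, since they have no health checking at all.
unhealthy_servers() {
    awk -F, '
        /^# / {
            # Locate the columns by header name instead of hard-coding positions.
            for (i = 1; i <= NF; i++) {
                if ($i == "svname") sv = i
                if ($i == "status") st = i
            }
            next
        }
        # Skip blank lines and the aggregate FRONTEND/BACKEND rows.
        sv && st && NF >= st && $sv != "FRONTEND" && $sv != "BACKEND" && $st !~ /^UP/ {
            print $1 "/" $sv, $st
        }
    '
}
```

For example: echo "show stat" | sudo socat stdio /var/run/haproxy/admin.sock | unhealthy_servers. If every server of a backend appears in the output, HAProxy has nowhere to send traffic for that backend.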

5. Backend Application Crashing Under Load or Misbehaving

  • Diagnosis: Check the logs of the backend application instances for errors, stack traces, or indications of resource exhaustion (e.g., out of memory, too many open files).
    sudo journalctl -u <your_app_service_name> -f
    # Or check specific log files:
    tail -f /var/log/your_app.log
    
    Look for sudden spikes in errors correlating with the 502s.
  • Fix: Optimize the application code, increase server resources (CPU, RAM), or tune application-specific connection limits or thread pools. This addresses the root cause within the application that prevents it from responding to the load balancer’s health checks or application requests.
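
To line up backend errors with the 502s, it helps to know when the 502s occurred. The sketch below counts them per minute from the load balancer's access log. It assumes Nginx's default combined log format, where the fourth whitespace-separated field is the timestamp and the ninth is the status; count_502_per_minute is a hypothetical helper name.

```shell
# count_502_per_minute is a hypothetical helper. It assumes the Nginx
# "combined" log format: field 4 is "[day/month/year:HH:MM:SS" and
# field 9 is the response status. Output is "minute count" in log order.
count_502_per_minute() {
    awk '
        $9 == 502 {
            minute = substr($4, 2, 17)        # e.g. 10/Oct/2024:13:55
            if (!(minute in count)) order[++n] = minute
            count[minute]++
        }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }
    '
}
```

Feed it the access log on stdin, for example count_502_per_minute < /var/log/nginx/access.log (a common default path, but check your own configuration). Minutes with a high count are the windows to inspect in the backend logs. Lines with malformed request strings can shift the field positions, so treat the counts as approximate.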

6. Load Balancer Resource Exhaustion

  • Diagnosis: Monitor the load balancer’s CPU, memory, and network connections. High CPU or a large number of CLOSE_WAIT or ESTABLISHED connections can indicate it’s overwhelmed.
    top    # or htop, if installed
    netstat -an | grep -c ESTABLISHED
    
  • Fix: Scale up the load balancer instance (more CPU/RAM) or scale out by adding more load balancer instances behind a DNS round-robin or another higher-level load balancer. This ensures the load balancer itself has the capacity to process requests and manage connections.
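
Rather than counting one connection state at a time, all states can be tallied in a single pass. This is a minimal sketch with a hypothetical helper, count_by_state, that reads the output of ss -tan. Note that ss abbreviates state names (ESTAB, CLOSE-WAIT) where netstat prints ESTABLISHED and CLOSE_WAIT.

```shell
# count_by_state is a hypothetical helper: it reads `ss -tan` output on stdin
# and prints "STATE count" for each TCP state, sorted by state name.
count_by_state() {
    # NR > 1 skips the header line; field 1 is the connection state.
    awk 'NR > 1 { count[$1]++ } END { for (s in count) print s, count[s] }' | LC_ALL=C sort
}
```

Run it on the load balancer as ss -tan | count_by_state. A CLOSE-WAIT count that keeps growing between runs suggests the proxy is not closing connections the peer has already shut down.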

7. Incorrect proxy_pass or backend configuration (e.g., missing port, wrong protocol)

  • Diagnosis: Review the load balancer’s configuration for the directives responsible for forwarding traffic.
    • Nginx: proxy_pass http://backend_servers; or proxy_pass http://192.168.1.100:8080;
    • HAProxy: server app1 192.168.1.100:8080
    Ensure the protocol (http/https) and port match what the backend application is expecting. If the backend is listening on HTTPS, the load balancer must either open a TLS connection to the backend itself or pass the client's TLS traffic through untouched; forwarding plain HTTP to an HTTPS port will fail.
  • Fix: Correct the proxy_pass or server directive to specify the correct protocol and port.
    # Nginx: backend is on HTTPS
    proxy_pass https://backend_servers;
    
    # HAProxy: backend is on HTTPS ("verify none" skips certificate validation; prefer "verify required" with ca-file in production)
    server app1 192.168.1.100:8443 ssl verify none
    
    This ensures the load balancer speaks the correct "language" to the backend.

Once these are resolved, any errors that remain are likely to be application-level, such as missing dependencies or incorrect request routing within your application.
