Beyond Round Robin: Smarter Load Balancing

Nginx’s default load balancing method, Round Robin, hands out requests in a fixed rotation without regard to how busy each backend is, which can make it a poor fit even for moderately trafficked sites.

Let’s see Nginx in action. Imagine we have three backend web servers (web1, web2, web3) and Nginx is acting as the load balancer.

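# A complete nginx.conf also needs an events block, even an empty one.
events {}

http {
    # Pool of backends; with no method specified, Nginx uses round robin.
    upstream backend_servers {
        server web1.example.com;
        server web2.example.com;
        server web3.example.com;
    }

    server {
        listen 80;
        server_name yourdomain.com;

        location / {
            # Forward every request to the upstream group defined above.
            proxy_pass http://backend_servers;
        }
    }
}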

When requests come in for yourdomain.com/, Nginx sends the first to web1, the next to web2, then web3, and then cycles back to web1 (the servers here all have the default equal weight). This is Round Robin. Simple, predictable, and often problematic.
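The rotation is easy to model in a few lines of Python. This is only an illustrative sketch, not Nginx's implementation (which also supports per-server weights); the `round_robin` helper is a hypothetical name introduced for the example.

```python
from itertools import cycle, islice

def round_robin(servers, n_requests):
    """Return the server chosen for each of n_requests, in arrival order."""
    return list(islice(cycle(servers), n_requests))

print(round_robin(["web1", "web2", "web3"], 5))
# ['web1', 'web2', 'web3', 'web1', 'web2']
```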

The problem arises because Round Robin doesn’t care how busy a server is. If web1 is tied up with a complex, long-running request, it still receives every third incoming request. This can lead to uneven load, with some servers struggling while others sit idle.
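A small simulation makes the imbalance concrete. It is a toy model under stated assumptions: one request arrives per tick, `durations` gives how many ticks each request stays active, and the workload (every third request is slow) is invented for illustration. The function name `simulate_round_robin` is hypothetical.

```python
def simulate_round_robin(durations, servers):
    """Assign one request per tick in strict rotation and record the peak
    number of concurrent requests each server had to hold."""
    finish_times = {s: [] for s in servers}  # end tick of each active request
    peak = {s: 0 for s in servers}
    for tick, duration in enumerate(durations):
        # Drop requests that have completed by this tick.
        for s in servers:
            finish_times[s] = [t for t in finish_times[s] if t > tick]
        target = servers[tick % len(servers)]  # rotation ignores current load
        finish_times[target].append(tick + duration)
        peak[target] = max(peak[target], len(finish_times[target]))
    return peak

# Hypothetical workload: every third request takes 30 ticks, the rest take 1.
durations = [30, 1, 1] * 4
print(simulate_round_robin(durations, ["web1", "web2", "web3"]))
# {'web1': 4, 'web2': 1, 'web3': 1}
```

Because the slow requests happen to line up with web1's turn, it ends up holding four concurrent requests while the other two servers never hold more than one.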

This is where ip_hash and least_conn come in.

IP Hash: Sticky Sessions Without the Hassle

ip_hash is Nginx’s way of implementing "sticky sessions": requests from a particular client IP address go to the same backend server for as long as that server is available.

upstream backend_servers {
    ip_hash;
    server web1.example.com;
    server web2.example.com;
    server web3.example.com;
}

With ip_hash, Nginx calculates a hash from the client’s IP address: the first three octets of an IPv4 address, or the entire IPv6 address. This hash determines which server the client is directed to. If another request comes from the same address, the same hash is calculated, and the client is sent to the same server.
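The idea can be sketched in Python as follows. Note the assumptions: `pick_server` is a hypothetical helper, CRC32 is a stand-in and not the hash function Nginx uses, and the only detail taken from the Nginx documentation is the choice of key (the first three octets for IPv4, the whole address for IPv6).

```python
import ipaddress
import zlib

def pick_server(client_ip, servers):
    """Map a client address to a server deterministically.

    Key: first three octets for IPv4, full address for IPv6.
    CRC32 is a stand-in hash chosen for this sketch.
    """
    addr = ipaddress.ip_address(client_ip)
    key = addr.packed[:3] if addr.version == 4 else addr.packed
    return servers[zlib.crc32(key) % len(servers)]

servers = ["web1", "web2", "web3"]
# The same address always maps to the same server...
print(pick_server("203.0.113.7", servers) == pick_server("203.0.113.7", servers))
# ...and so does any other IPv4 address sharing its first three octets.
print(pick_server("203.0.113.7", servers) == pick_server("203.0.113.200", servers))
```

Both comparisons print `True`, since the key, and therefore the hash, is identical in each case.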

This is useful for applications that store session state locally on the web server. If a user is logged in on web1, they’ll continue to be directed to web1 for the duration of their session, preventing them from being logged out unexpectedly.

However, ip_hash can lead to uneven distribution if you have many clients behind a single NAT gateway (like a corporate proxy or mobile network). All those clients appear to Nginx as coming from the same IP address, so they are all sent to one server, potentially overwhelming it. Because only the first three octets of an IPv4 address are hashed, the same applies to clients whose addresses differ only in the last octet.
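To see how a single NAT address can skew the distribution, here is a rough sketch. The client population is invented (500 users behind one address plus 30 clients from distinct networks), `distribute` is a hypothetical helper, and CRC32 again stands in for Nginx's actual hash.

```python
import zlib

def distribute(client_ips, servers):
    """Count how many clients land on each server under an IP-keyed hash."""
    counts = {s: 0 for s in servers}
    for ip in client_ips:
        # Key on the first three IPv4 octets, as ip_hash is documented to do.
        key = ".".join(ip.split(".")[:3]).encode()
        counts[servers[zlib.crc32(key) % len(servers)]] += 1
    return counts

servers = ["web1", "web2", "web3"]
nat_users = ["198.51.100.10"] * 500          # everyone behind one gateway
others = [f"10.{i}.0.1" for i in range(30)]  # clients on distinct networks
counts = distribute(nat_users + others, servers)
print(counts)
```

Whichever server the NAT address hashes to receives at least 500 of the 530 clients, no matter how evenly the remaining 30 are spread.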

Least Conn: Sending Traffic to the Least Busy Server

Unlike Round Robin and ip_hash, least_conn takes current load into account. It directs each new request to the server with the fewest active connections, taking server weights into account if any are set.

upstream backend_servers {
    least_conn;
    server web1.example.com;
    server web2.example.com;
    server web3.example.com;
}

When a request arrives, Nginx compares the number of active connections it currently holds to each backend server and sends the new request to the server with the fewest. This balances the load dynamically and makes it much less likely that one server is overloaded while others sit idle.
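The selection step itself is tiny. The sketch below assumes equal weights and breaks ties by picking the first server in dictionary order, which is a simplification of this example; the `least_conn` function and the connection counts are hypothetical.

```python
def least_conn(active):
    """Return the server with the fewest active connections.

    `active` maps server name -> current connection count. Ties go to the
    first server in insertion order (a simplification of this sketch).
    """
    return min(active, key=active.get)

active = {"web1": 12, "web2": 3, "web3": 7}
chosen = least_conn(active)
active[chosen] += 1  # the new request now counts against the chosen server
print(chosen, active)
# web2 {'web1': 12, 'web2': 4, 'web3': 7}
```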

This is often a good default for web applications, especially those with varying request processing times, because it naturally handles situations where some requests take longer than others.

Nginx’s load balancing directives are powerful, but their effectiveness depends heavily on your application’s characteristics. Round Robin is simple but naive, ip_hash provides sticky sessions but can cause its own imbalances, and least_conn offers dynamic, intelligent load distribution.

The subtle point about least_conn is that it considers all connections that are still active, not just how many requests each server has been handed recently. If a server is busy processing long-lived requests, its connection count stays high, and new requests are diverted elsewhere, even if that server has received fewer requests overall than the others.
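A toy simulation shows this effect. The assumptions: one request arrives per tick, every third request is slow (30 ticks) while the rest take 1 tick, all weights are equal, and ties go to the first server listed. The name `simulate_least_conn` is hypothetical, and this models only the selection rule, not Nginx internals.

```python
def simulate_least_conn(durations, servers):
    """Send each request to the server with the fewest active requests and
    record the peak concurrency each server reached."""
    finish_times = {s: [] for s in servers}  # end tick of each active request
    peak = {s: 0 for s in servers}
    for tick, duration in enumerate(durations):
        # Drop requests that have completed by this tick.
        for s in servers:
            finish_times[s] = [t for t in finish_times[s] if t > tick]
        # Pick the least-loaded server; min() keeps the first one on ties.
        target = min(servers, key=lambda s: len(finish_times[s]))
        finish_times[target].append(tick + duration)
        peak[target] = max(peak[target], len(finish_times[target]))
    return peak

durations = [30, 1, 1] * 4
print(simulate_least_conn(durations, ["web1", "web2", "web3"]))
# {'web1': 2, 'web2': 2, 'web3': 1}
```

A server holding a slow request keeps a non-zero count, so the next slow request tends to land somewhere else, and no server ends up with more than two concurrent requests on this workload.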

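The next step is often exploring health checks to automatically remove unresponsive servers from the pool. In open source Nginx these are passive, controlled by the max_fails and fail_timeout parameters of the server directive; active health checks are part of the commercial NGINX Plus.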