The most surprising thing about HAProxy load balancing is that "least connections" counts connections, not the requests a server is actively working on.

Let’s see it in action. Imagine we have a simple backend pool of two web servers, web1 and web2, serving a popular API.

frontend http_frontend
    bind *:80
    default_backend web_backend

backend web_backend
    balance roundrobin
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check

When HAProxy is configured with balance roundrobin, it cycles through the servers in turn. If web1 is listed first, it gets the first request, web2 gets the second, web1 the third, and so on. roundrobin is the algorithm HAProxy uses when no balance directive is given, and it is often sufficient when requests cost roughly the same to process.
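
roundrobin also honors per-server weights. As a minimal sketch, assuming web1 has roughly twice the capacity of web2 (the weight values are illustrative, not taken from a real deployment), the weight keyword on each server line skews the rotation:

```haproxy
backend web_backend
    balance roundrobin
    # Illustrative weights: web1 should receive about two of every three requests
    server web1 192.168.1.10:80 check weight 2
    server web2 192.168.1.11:80 check weight 1
```

Without an explicit weight, every server defaults to a weight of 1, which gives the even alternation described above.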

Now, let’s switch to leastconn:

backend web_backend
    balance leastconn
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check

With leastconn, HAProxy tracks the number of connections currently open to each server. When a new request arrives, it’s sent to the server with the lowest count. This is most useful when requests have vastly different processing times. A long-running API call might tie up a server for minutes, while a simple GET request might complete in milliseconds. Under roundrobin, a server stuck on slow requests would still receive its full share of new ones; leastconn steers new work toward the server that has less in flight.

Finally, source:

backend web_backend
    balance source
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check

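The source algorithm hashes the client’s source IP address to determine which server it should go to. This means that a specific client IP will be directed to the same backend server for as long as the set of available servers stays the same. This matters for applications that maintain session state on the server side. If a user’s requests were bounced between web1 and web2, their session data might be lost, leading to a broken experience. The stickiness is not absolute, though: if a server goes down or comes back, the hash result changes and many clients can be moved to a different server. Clients that share one IP address, for example behind the same NAT gateway or proxy, also all land on the same server.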
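
One thing to watch with source is what happens when the pool changes. By default the hash is mapped onto the set of available servers, so a server going down or coming back can move many clients to a different server. A hedged sketch of one common mitigation is consistent hashing via the hash-type directive, which is intended to limit how many clients get remapped when the pool changes:

```haproxy
backend web_backend
    balance source
    # Consistent hashing: a pool change should remap only a fraction of clients
    hash-type consistent
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check
```

The trade-off is a somewhat less even distribution than the default map-based hashing, and the benefit grows with pool size: with only two servers, losing one sends every client to the survivor regardless of hash type.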

The mental model is simple: a load balancing algorithm is the rule HAProxy uses to decide which server in a pool gets the next incoming connection. roundrobin is purely sequential. leastconn is dynamic, reacting to the current load as measured by open connections. source is static, based on the client’s IP address.

Here’s the nuance about leastconn: it’s based on the number of established connections, not necessarily the number of requests actively being processed. A server might hold many established connections but be working on very few requests if those connections are idle. There is no separate weighted-leastconn algorithm in HAProxy; leastconn itself takes server weights into account, so you can direct more traffic to more powerful servers.
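
Weights are set per server with the weight keyword. A minimal sketch, assuming web1 is the more powerful machine (the values 3 and 1 are illustrative):

```haproxy
backend web_backend
    balance leastconn
    # Illustrative weights: web1 is allowed a proportionally larger share of connections
    server web1 192.168.1.10:80 check weight 3
    server web2 192.168.1.11:80 check weight 1
```

With these values, connection counts are compared relative to weight rather than as raw numbers, so web1 can carry roughly three times as many connections as web2 before web2 is preferred.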

Understanding these algorithms is key to preventing overloaded servers and ensuring a smooth user experience, especially with stateful applications or uneven request processing times.

The next logical step is to explore how HAProxy handles server health checks and what happens when a server becomes unhealthy.
