Random load balancing is the simplest distribution strategy, but its effectiveness hinges on a surprising truth: it’s often the best choice when your servers have wildly different performance characteristics and you can’t easily predict which one will be fastest.

Let’s see it in action. Imagine you have three web servers: web-01, web-02, and web-03. We’re using HAProxy, a popular load balancer, to distribute traffic.

Here’s a snippet of its configuration:

frontend http_front
    bind *:80
    mode http
    default_backend http_back

backend http_back
    mode http
    balance random
    server web-01 192.168.1.10:80 check
    server web-02 192.168.1.11:80 check
    server web-03 192.168.1.12:80 check

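When a request arrives on port 80, HAProxy routes it to the http_back backend. The balance random directive tells it to choose among the listed servers (web-01, web-02, web-03) at random; with equal weights, as here, each server is equally likely to be drawn. One detail matters: HAProxy’s random algorithm takes an optional number of draws, written random(<draws>), and the documented default is 2, meaning HAProxy draws two servers and sends the request to the less loaded of the two. A purely load-blind pick, one that ignores how busy or how fast each server is, corresponds to balance random(1).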
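To make the selection rule concrete, here is a minimal Python sketch. It is not HAProxy’s implementation (which, per its documentation, feeds a random number into its consistent-hashing function); the function names and the connection counts are made up for illustration. pick_uniform is the load-blind pick described above, and pick_two_draws is a simplified version of the "draw two, keep the less loaded" variant that HAProxy documents for random(<draws>).

```python
import random

SERVERS = ["web-01", "web-02", "web-03"]

def pick_uniform(servers, rng):
    """Pick one server uniformly at random, ignoring load entirely."""
    return rng.choice(servers)

def pick_two_draws(servers, active_conns, rng):
    """Draw two servers at random and keep the one with fewer active connections."""
    first = rng.choice(servers)
    second = rng.choice(servers)
    return first if active_conns[first] <= active_conns[second] else second

rng = random.Random(42)  # seeded only so the example is repeatable
active = {"web-01": 12, "web-02": 3, "web-03": 7}  # hypothetical connection counts

print(pick_uniform(SERVERS, rng))
print(pick_two_draws(SERVERS, active, rng))
```

With a single draw the connection counts are never consulted; with two draws, a server that is backing up loses most head-to-head comparisons and so receives fewer new requests.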

This seems counterintuitive. Why would you want to distribute traffic randomly when some servers might be struggling and others are idle? The answer lies in the limitations of other strategies.

Consider "round robin," where each server gets a request in turn. If web-01 is a super-fast machine and web-02 is a potato, round robin will send traffic to the potato just as often as the super-fast one. This leads to uneven load and slow response times for users hitting the potato.
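As a small illustration of why round robin is blind to speed, the Python sketch below rotates through the three server names and counts the assignments. The request count is an arbitrary value chosen for the example, and this is not how HAProxy implements its roundrobin algorithm.

```python
from collections import Counter
from itertools import cycle

servers = ["web-01", "web-02", "web-03"]

# Hand out requests strictly in turn, regardless of how fast each server is.
rotation = cycle(servers)
counts = Counter(next(rotation) for _ in range(9000))

# Each server is assigned exactly 3000 requests, fast machine and potato alike.
print(counts)
```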

"Least connections" is another common strategy. It sends the next request to the server with the fewest active connections. This works well when all servers are roughly equal in performance. However, if web-01 can handle 1000 connections while web-02 chokes at 100, "least connections" will still try to send traffic to web-02 when it’s already overloaded, simply because it has fewer active connections than web-01 might have at its peak. The system doesn’t inherently know the capacity of each server.

Random distribution sidesteps these issues by accepting that a perfect assignment is impossible or too complex to compute. Instead of guessing which server is best right now, it simply picks one at random. Over a large number of requests, each server ends up with roughly the same share of requests, whatever its capacity. The "fast" servers work through their share quickly and the "slow" ones more slowly, so a uniform pick does not by itself match load to capacity; what it offers is a choice that needs no shared state and makes no assumptions about the servers. That is most useful when request processing times are unpredictable or highly variable and predicting which server will be "least loaded" in the near future is difficult.
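The "evens out over many requests" claim can be checked with a short simulation. This is a sketch under simple assumptions: a uniform pick with Python’s random module, a fixed seed for repeatability, and a hypothetical helper named request_shares. It measures each server’s share of requests, not how loaded each server is relative to its capacity.

```python
import random
from collections import Counter

def request_shares(servers, n_requests, seed=0):
    """Assign n_requests uniformly at random and return each server's share."""
    rng = random.Random(seed)
    counts = Counter(rng.choice(servers) for _ in range(n_requests))
    return {name: counts[name] / n_requests for name in servers}

servers = ["web-01", "web-02", "web-03"]

# Small samples can be lopsided; large ones approach one third per server.
for n in (30, 3000, 300000):
    print(n, request_shares(servers, n))
```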

The check keyword on each server line is important here. It tells HAProxy to probe each server periodically to confirm it is healthy. With the configuration shown, the probe is a plain TCP connection attempt; an HTTP-level check (a request to a path such as /) requires adding option httpchk to the backend. If a server fails its health checks, HAProxy removes it from the pool of random choices until it passes again. This keeps traffic away from dead machines, which is a fundamental requirement regardless of your load balancing strategy.
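The effect of health checks on random selection can be sketched in Python as follows. The is_up dictionary stands in for the results of the balancer’s health checks, and pick_healthy is a hypothetical helper written for this example, not an HAProxy API.

```python
import random

SERVERS = ["web-01", "web-02", "web-03"]

def pick_healthy(servers, is_up, rng):
    """Pick at random among servers currently marked healthy, or None if none are."""
    candidates = [name for name in servers if is_up[name]]
    if not candidates:
        return None  # nothing healthy: the caller has to fail the request
    return rng.choice(candidates)

rng = random.Random(1)
is_up = {"web-01": True, "web-02": False, "web-03": True}  # web-02 failed its check

# web-02 is never selected while it is marked down.
print([pick_healthy(SERVERS, is_up, rng) for _ in range(5)])
```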

The real strength of random balancing shows when performance across the fleet is not uniform and you lack real-time monitoring of what each server can actually handle. If your servers have very different hardware, or some run more background tasks than others, and you cannot quantify the difference, random distribution can sometimes deliver better overall throughput than more elaborate algorithms that act on wrong assumptions about server capacity. It relies on the law of large numbers: over time, the random choices spread requests roughly evenly across the available servers, so no server is systematically overburdened or starved because of a bad guess.

The next logical step after mastering random distribution is exploring weighted random balancing, where you can assign different probabilities to servers based on their known performance characteristics.
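As a preview, weighted random selection can be sketched with Python’s random.choices. The weights below are hypothetical values chosen for the example (web-01 assumed to be about three times as capable as the others). In HAProxy, the corresponding knob is the weight parameter on each server line, which its documentation says the random algorithm respects.

```python
import random
from collections import Counter

# Hypothetical weights: web-01 is assumed to handle about 3x the traffic.
weights = {"web-01": 3, "web-02": 1, "web-03": 1}

rng = random.Random(7)  # seeded only so the example is repeatable
names = list(weights)
picks = rng.choices(names, weights=[weights[n] for n in names], k=50000)

counts = Counter(picks)
shares = {name: counts[name] / len(picks) for name in names}

# Shares approach weight / total weight: about 0.6, 0.2 and 0.2.
print(shares)
```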
