Round robin load balancing is often presented as a simple, even distribution, but it evens out request counts rather than actual load, which can leave servers unevenly utilized if you’re not careful.
Let’s see it in action. Imagine we have a simple web server cluster behind a load balancer.
# Example configuration snippet (hypothetical, syntax varies by LB)
load_balancer {
    name = "my-web-lb"
    frontend_port = 80
    backend_protocol = "http"
    algorithm = "round_robin"
    backend_servers {
        server {
            address = "192.168.1.101"
            port = 80
        }
        server {
            address = "192.168.1.102"
            port = 80
        }
        server {
            address = "192.168.1.103"
            port = 80
        }
    }
}
When a request comes in to my-web-lb:80, the load balancer picks the next server in sequence: 192.168.1.101, then 192.168.1.102, then 192.168.1.103, then back to 192.168.1.101, and so on. This seems perfectly even.
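The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular load balancer implements it; the `round_robin` helper and the `BACKENDS` list are names invented for this example.

```python
from itertools import cycle

# Backend addresses taken from the hypothetical configuration above.
BACKENDS = ["192.168.1.101", "192.168.1.102", "192.168.1.103"]

def round_robin(backends):
    """Return an iterator that yields backends in a fixed, repeating order."""
    return cycle(backends)

picker = round_robin(BACKENDS)
first_six = [next(picker) for _ in range(6)]
for address in first_six:
    print(address)
```

The only thing the picker remembers is its position in the list; it knows nothing about what the servers are doing.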
But what happens when requests aren’t uniform? Suppose 192.168.1.101 is a much faster server, or its requests are consistently shorter. It finishes each request quickly and then sits idle until its turn comes around again, while 192.168.1.102 and 192.168.1.103 may still be working through earlier requests. Every server receives the same number of requests, yet the load is uneven in practice, because round robin doesn’t account for how long each server takes to process a request.
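A small back-of-the-envelope simulation makes the gap between "equal request counts" and "equal load" concrete. The service times below are made-up numbers chosen for illustration, and `simulate` is a hypothetical helper, so treat this as a sketch rather than a benchmark.

```python
from itertools import cycle

# Hypothetical per-request service times in milliseconds (illustrative only).
SERVICE_MS = {
    "192.168.1.101": 10,  # the faster server
    "192.168.1.102": 30,
    "192.168.1.103": 30,
}

def simulate(num_requests):
    """Hand out requests in strict rotation and total up each server's busy time."""
    counts = {server: 0 for server in SERVICE_MS}
    busy_ms = {server: 0 for server in SERVICE_MS}
    picker = cycle(SERVICE_MS)  # iterates the addresses in listed order
    for _ in range(num_requests):
        server = next(picker)
        counts[server] += 1
        busy_ms[server] += SERVICE_MS[server]
    return counts, busy_ms

counts, busy_ms = simulate(300)
for server in SERVICE_MS:
    print(server, counts[server], "requests,", busy_ms[server], "ms busy")
```

Under these assumed numbers each server handles 100 requests, but the two slower servers spend three times as long doing work as the fast one.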
This is where the concept of weighted round robin comes into play, a common enhancement. Instead of each server getting one request in turn, you assign weights. A server with weight 2 gets two requests for every one request a server with weight 1 gets. This allows you to compensate for differences in server capacity or performance.
# Example weighted round robin configuration
load_balancer {
    # ... other settings ...
    algorithm = "weighted_round_robin"
    backend_servers {
        server {
            address = "192.168.1.101" # Faster server
            port = 80
            weight = 3
        }
        server {
            address = "192.168.1.102" # Standard server
            port = 80
            weight = 1
        }
        server {
            address = "192.168.1.103" # Standard server
            port = 80
            weight = 1
        }
    }
}
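In this setup, the sequence might look like: 101, 101, 102, 101, 103, 101, 101, 102, and so on (servers are identified here by the last octet of their address). The exact sequence depends on the implementation, but over each full cycle of five requests the ratio is 3:1:1 for servers 101, 102, and 103 respectively. This gives you a more practical kind of evenness, one that aligns traffic with capability. Keep in mind that the weights are static: they reflect the capacity you configured, not what the servers are experiencing at any given moment.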
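As one possible implementation, here is a sketch of the "smooth" weighted round robin variant, which spreads a heavy server's turns across the cycle instead of sending them back to back. It is an assumption that your load balancer works this way; the `smooth_wrr` function and the shortened server names are invented for this example, and the order it produces differs from the sequence shown above even though the ratio is the same.

```python
from collections import Counter
from itertools import islice

# Weights from the hypothetical configuration, keyed by last octet.
WEIGHTS = {"101": 3, "102": 1, "103": 1}

def smooth_wrr(weights):
    """Yield server names using smooth weighted round robin."""
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    while True:
        # Every server earns credit equal to its weight on each pick.
        for name, weight in weights.items():
            current[name] += weight
        # The server with the most credit wins; ties go to the first listed.
        chosen = max(current, key=current.get)
        # The winner pays back the total weight, so credit always sums to zero.
        current[chosen] -= total
        yield chosen

one_cycle = list(islice(smooth_wrr(WEIGHTS), 5))
tally = Counter(islice(smooth_wrr(WEIGHTS), 500))
print(one_cycle)
print(dict(tally))
```

With these weights one cycle comes out as 101, 102, 101, 103, 101, and 500 picks split 300/100/100, which matches the 3:1:1 ratio.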
The core problem round robin solves is the need to avoid sending all traffic to a single point of failure and to distribute load across multiple instances. The simple version is easy to implement and understand, making it a go-to for basic setups. It also needs almost no state: the load balancer only has to remember its position in the server list. It doesn’t need to track which server handled which connection (as sticky sessions would require) or how busy each server is, which simplifies its own architecture.
However, the most surprising aspect for many is how quickly the "even distribution" breaks down under real-world conditions. Network latency, varying request complexities, and differing server processing speeds all conspire to make simple round robin a potentially poor choice for anything beyond very homogeneous and predictable workloads. The load balancer blindly cycles through its list, oblivious to the actual state or capacity of the backend servers. It’s like a waiter taking orders in a strict circle of tables, regardless of whether table 3 is still chewing its first bite or has already finished its meal and is ready for dessert.
What most people don’t realize is that the order in which servers are listed in the configuration can influence the initial distribution pattern, especially when combined with server weights and the specific algorithm implementation. How ties between equally weighted servers are broken is also implementation-specific: many implementations simply fall back to configuration order, and some may consult connection counts instead, so check your load balancer’s documentation rather than assuming.
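To see how listing order can matter, consider a deliberately naive weighted round robin that just repeats each server according to its weight. This is a hypothetical sketch (the `naive_wrr` helper is invented here) and real load balancers may behave differently, but it shows how the same weights can produce very different opening sequences.

```python
from collections import Counter
from itertools import cycle, islice

def naive_wrr(servers):
    """servers: list of (name, weight) pairs in configuration order."""
    # Repeat each name `weight` times, preserving the listed order.
    expanded = [name for name, weight in servers for _ in range(weight)]
    return cycle(expanded)

order_a = [("101", 3), ("102", 1), ("103", 1)]
order_b = [("102", 1), ("103", 1), ("101", 3)]

first_a = list(islice(naive_wrr(order_a), 3))
first_b = list(islice(naive_wrr(order_b), 3))
print(first_a)
print(first_b)

# Over a full cycle of five requests the totals are identical either way.
cycle_a = Counter(islice(naive_wrr(order_a), 5))
cycle_b = Counter(islice(naive_wrr(order_b), 5))
print(cycle_a == cycle_b)
```

With the first ordering the opening three requests all land on 101; with the second, 102 and 103 are served before 101 sees any traffic. The long-run ratio is unchanged, but a short burst of requests right after a restart can be distributed quite differently.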
The next logical step after mastering weighted round robin is understanding health checks and how they dynamically remove unhealthy servers from the rotation.