Round Robin, Least Connection, and IP Hash are the most common load balancing algorithms, but they’re not interchangeable tools; each has a distinct impact on how traffic flows and which component eventually buckles under pressure.
Let’s see them in action. Imagine we have three backend web servers: web-01, web-02, and web-03. A load balancer sits in front of them.
Round Robin
This is the simplest. The load balancer just cycles through the list of servers, sending each new request to the next server in line.
- Config (Conceptual):
      load_balancer.algorithm = round_robin
      load_balancer.servers = [web-01, web-02, web-03]
- Request 1: Goes to web-01
- Request 2: Goes to web-02
- Request 3: Goes to web-03
- Request 4: Goes to web-01 (back to the start)
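The rotation above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer is implemented; the `RoundRobinBalancer` class and its `pick` method are hypothetical names introduced here.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hypothetical helper: hands out servers in a fixed rotation."""

    def __init__(self, servers):
        # cycle() repeats the list forever, in its original order.
        self._rotation = cycle(list(servers))

    def pick(self):
        # Each call returns the next server in line, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["web-01", "web-02", "web-03"])
picks = [balancer.pick() for _ in range(4)]
print(picks)  # ['web-01', 'web-02', 'web-03', 'web-01']
```

Note that the balancer keeps no information about the servers beyond their position in the list, which is exactly why it cannot react to slow or busy backends.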
Least Connection
This algorithm is smarter. It tracks the number of active connections to each server and sends the new request to the server with the fewest active connections. This is great for requests that take varying amounts of time to complete.
- Config (Conceptual):
      load_balancer.algorithm = least_connection
      load_balancer.servers = [web-01, web-02, web-03]
- Scenario:
  - Initially, all servers have 0 connections.
  - Request 1 goes to web-01. web-01 has 1 connection. web-02 has 0. web-03 has 0.
  - Request 2 goes to web-02. web-01 has 1. web-02 has 1. web-03 has 0.
  - Request 3 goes to web-03. web-01 has 1. web-02 has 1. web-03 has 1.
  - Request 4 goes to web-01 (arbitrarily, or based on tie-breaking rules).
  - Now, imagine web-01 gets a long-running request. It now has 2 connections. If a new request comes in, it will go to web-02 or web-03 because they still have only 1 connection.
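A rough Python sketch of the scenario follows. It assumes ties are broken by list order, which is only one possible tie-breaking rule; the `LeastConnectionBalancer` class and its `acquire`/`release` methods are hypothetical names for illustration.

```python
class LeastConnectionBalancer:
    """Hypothetical helper: tracks active connections per server."""

    def __init__(self, servers):
        # Dicts preserve insertion order, so ties resolve in list order.
        self.active = {name: 0 for name in servers}

    def acquire(self):
        # min() returns the first server with the lowest active count.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a request finishes and its connection closes.
        if self.active[server] > 0:
            self.active[server] -= 1


lc = LeastConnectionBalancer(["web-01", "web-02", "web-03"])
first_four = [lc.acquire() for _ in range(4)]
print(first_four)  # ['web-01', 'web-02', 'web-03', 'web-01']

# web-01 now holds 2 connections, so the next request avoids it.
fifth = lc.acquire()
print(fifth)       # web-02
```

In this sketch nothing is ever released, so it models the moment where all earlier requests are still in flight. A real balancer would call the equivalent of `release` as connections close.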
IP Hash
This algorithm uses the client’s IP address to determine which server receives the request. It hashes the IP address and uses the hash value to select a server. The key benefit is stickiness: all requests from the same IP address will consistently go to the same backend server.
- Config (Conceptual):
      load_balancer.algorithm = ip_hash
      load_balancer.servers = [web-01, web-02, web-03]
- Scenario:
  - Client IP 192.168.1.10 hashes to a value that maps to web-01. All requests from 192.168.1.10 go to web-01.
  - Client IP 10.0.0.5 hashes to a value that maps to web-02. All requests from 10.0.0.5 go to web-02.
  - Client IP 172.16.30.200 hashes to a value that maps to web-03. All requests from 172.16.30.200 go to web-03.
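One way to sketch this in Python is to hash the packed address bytes and take the result modulo the number of servers. The choice of SHA-256 and the modulo step are assumptions made for illustration, not the scheme of any specific load balancer, and `pick_server` is a hypothetical name. With this particular hash, the three example IPs may land on different servers than in the scenario above; what matters is that each IP always lands on the same one.

```python
import hashlib
import ipaddress

SERVERS = ["web-01", "web-02", "web-03"]


def pick_server(client_ip, servers=SERVERS):
    """Map a client IP to a server deterministically (illustrative only)."""
    # Normalise the textual address to raw bytes (4 for IPv4, 16 for IPv6).
    packed = ipaddress.ip_address(client_ip).packed
    digest = hashlib.sha256(packed).digest()
    # Reduce the first 8 bytes of the digest to an index into the server list.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


for ip in ["192.168.1.10", "10.0.0.5", "172.16.30.200"]:
    # Repeated lookups for the same IP always agree.
    assert pick_server(ip) == pick_server(ip)
    print(ip, "->", pick_server(ip))
```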
The Mental Model
Load balancing is about distributing incoming network traffic across multiple backend servers. The goal is to prevent any single server from becoming overwhelmed, thus improving responsiveness, availability, and scalability.
- Round Robin is like a waiter taking orders: "You, you, you, then back to you." It’s simple and ensures everyone gets a turn, but doesn’t account for how long each order takes.
- Least Connection is like a busy restaurant manager directing customers to the table with the fewest diners. It tries to keep the load evenly distributed at that moment, assuming that fewer active connections mean less work for the server.
- IP Hash is like a host assigning each regular customer to their favorite table. It guarantees that a returning customer (from the same IP) always gets the same experience, which is crucial for applications that maintain session state on the server.
What Happens Under the Hood (and Why It Matters)
When a load balancer uses IP Hash, it’s not picking a server at random. It applies a hash function to the source IP address, and that function always produces the same output for the same input. The output is then mapped to a server, typically by reducing it to an index into the server list. As a deliberately simplified illustration, imagine a scheme that looks only at the last octet of an IPv4 address: if the last octet is between 0-85, the request goes to server 1; 86-170, server 2; 171-255, server 3. Real implementations usually hash more of the address, but the principle is the same. This deterministic mapping is what ensures "stickiness." The load balancer doesn’t need to store session state itself; the client’s IP acts as the key.
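The last-octet ranges from the paragraph above can be written out directly. This is only the toy scheme from the text, and the function name and the sample addresses (taken from a documentation-reserved range) are hypothetical.

```python
import ipaddress


def server_for_last_octet(client_ip):
    """Toy mapping from the text: bucket an IPv4 address by its last octet."""
    last_octet = ipaddress.IPv4Address(client_ip).packed[-1]
    if last_octet <= 85:
        return "server 1"
    if last_octet <= 170:
        return "server 2"
    return "server 3"


print(server_for_last_octet("203.0.113.7"))    # server 1
print(server_for_last_octet("203.0.113.120"))  # server 2
print(server_for_last_octet("203.0.113.250"))  # server 3
```

Because the function depends only on its input, no lookup table is needed: the same address yields the same server on every call.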
The crucial point about IP Hash is that it’s highly sensitive to Network Address Translation (NAT). If multiple clients behind a single NAT device (like a home router or a corporate firewall) are accessing your service, they will all appear to the load balancer as having the same source IP address. This means all those clients will be directed to the same backend server, potentially overwhelming it while other servers sit idle.
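The NAT effect is easy to reproduce in a small simulation. The sketch below assumes a SHA-256-and-modulo hash purely for illustration; the addresses, the request counts, and the `pick_server` helper are all hypothetical.

```python
import hashlib
from collections import Counter

SERVERS = ["web-01", "web-02", "web-03"]


def pick_server(source_ip):
    """Illustrative IP hash: digest of the address text, modulo server count."""
    digest = hashlib.sha256(source_ip.encode("ascii")).digest()
    return SERVERS[int.from_bytes(digest[:8], "big") % len(SERVERS)]


# 300 requests from many users who all share one NAT public address.
NAT_PUBLIC_IP = "203.0.113.50"  # hypothetical address
nat_requests = Counter(pick_server(NAT_PUBLIC_IP) for _ in range(300))

# 254 requests from clients that each have their own source address.
direct_requests = Counter(
    pick_server(f"198.51.100.{host}") for host in range(1, 255)
)

print("behind NAT:", dict(nat_requests))    # a single server takes all 300
print("direct:    ", dict(direct_requests))  # spread depends on the hash
```

The load balancer cannot tell the NATed users apart, so the first counter always has exactly one entry no matter how many people sit behind that address.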
The choice of algorithm directly impacts how well your servers handle varying workloads and how gracefully your application scales. Round Robin can lead to uneven load if requests have drastically different processing times. Least Connection handles that better, but a connection count is only a proxy for load: one expensive long-running request counts the same as a cheap one, so a server with few connections can still be the busiest. IP Hash is excellent for stateful applications but can create "sticky" overload scenarios with NAT.
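To make the Round Robin weakness concrete, here is a small sketch with an invented workload in which every third request is expensive. The costs are made-up numbers chosen to make the effect obvious, not measurements.

```python
from collections import defaultdict
from itertools import cycle

SERVERS = ["web-01", "web-02", "web-03"]

# Hypothetical processing cost per request, in milliseconds.
request_costs_ms = [900, 100, 100] * 4

work_ms = defaultdict(int)
request_count = defaultdict(int)
# zip() stops when the request list is exhausted; cycle() supplies the rotation.
for server, cost in zip(cycle(SERVERS), request_costs_ms):
    work_ms[server] += cost
    request_count[server] += 1

print(dict(request_count))  # {'web-01': 4, 'web-02': 4, 'web-03': 4}
print(dict(work_ms))        # {'web-01': 3600, 'web-02': 400, 'web-03': 400}
```

Every server receives the same number of requests, yet web-01 ends up with nine times the work of the others because the expensive requests happen to line up with its turn in the rotation.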
Understanding the underlying hashing mechanism and its interaction with network topology like NAT is key to troubleshooting why certain servers might appear overloaded while others are underutilized, even with seemingly balanced traffic.
The next challenge you’ll face is handling server failures and ensuring your load balancer can detect and react to them automatically.