Azure Load Balancer’s default distribution is not the round-robin many people expect, which is why its mechanics are worth understanding.
Let’s see it in action. Imagine a simple setup: two backend VMs, vm1 and vm2, both running a web server on port 80. An Azure Load Balancer, myLoadBalancer, is fronting them, with a public IP 20.10.20.30.
```bash
# Simulate client requests to the public IP
curl http://20.10.20.30/
curl http://20.10.20.30/
curl http://20.10.20.30/
curl http://20.10.20.30/
curl http://20.10.20.30/
```
If you were to monitor the logs on vm1 and vm2, you’d see a pattern emerge. It’s not a simple round-robin. Azure Load Balancer uses a five-tuple hash to determine which backend instance receives a given connection. This tuple consists of:
- Source IP Address
- Source Port
- Destination IP Address
- Destination Port
- Protocol (TCP or UDP)
When a new connection request arrives, the Load Balancer calculates this hash. The resulting value then dictates which backend pool member will receive the traffic for that specific connection. This means all subsequent packets within the same connection will go to the same backend instance.
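The mapping from tuple to backend can be sketched in a few lines of shell. This is only a model: Azure does not publish its hash function, so `cksum` stands in for it here, and the client address `203.0.113.7`, the source ports, and the `pick_backend` helper are made up for illustration.

```bash
#!/usr/bin/env bash
# Illustrative model only: cksum (a CRC) stands in for Azure's
# undisclosed hash function.

backends=(vm1 vm2)

# pick_backend SRC_IP SRC_PORT DST_IP DST_PORT PROTO
# Hashes the five-tuple and maps the result onto the backend pool.
pick_backend() {
  local tuple="$1|$2|$3|$4|$5"
  local hash
  hash=$(printf '%s' "$tuple" | cksum | cut -d' ' -f1)
  echo "${backends[hash % ${#backends[@]}]}"
}

# Same five-tuple: the same backend is chosen every time.
pick_backend 203.0.113.7 50001 20.10.20.30 80 tcp
pick_backend 203.0.113.7 50001 20.10.20.30 80 tcp

# A new connection from the same client gets a new source port,
# so it may (or may not) land on a different backend.
pick_backend 203.0.113.7 50002 20.10.20.30 80 tcp
```

The first two calls always print the same name because the input is identical. The third differs only in source port, which is enough to change the hash.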
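This per-connection stickiness is crucial for applications that maintain state within a single TCP connection. It is not session affinity, though: when a client opens a new connection it normally gets a new source port, so the new connection can hash to a different backend. Distribution evens out over many connections, but it can look lopsided when there are only a handful of flows or a few long-lived, heavy ones, because the hash balances connections, not load.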
The problem a load balancer solves is providing a single point of access for a pool of backend resources, which supports high availability and scalability. Without it, clients would need to know the IP addresses of individual backend VMs, making it hard to scale out or replace servers transparently.
Internally, the Azure Load Balancer operates at Layer 4 (Transport Layer). It doesn’t inspect the application-level payload (like HTTP headers). This makes it very fast and efficient, but also means it has no awareness of application-specific routing rules.
The main levers you control are the backend pool configuration, the health probes, and the distribution mode set on each load-balancing rule (five-tuple hash by default, with source IP affinity as the alternative). The backend pool defines which VMs are eligible to receive traffic. The health probes tell the Load Balancer which VMs are healthy and capable of processing requests. If a VM fails its health probe, the Load Balancer stops sending new connections to it.
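As a rough sketch, an HTTP health probe for the setup above could be created with the Azure CLI as follows. The resource group `myResourceGroup` and probe name `myHealthProbe` are placeholders, not values from this setup, and the `run` wrapper is a hypothetical helper that prints the command instead of executing it unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Dry run by default: prints the command instead of calling Azure.
# Set DRY_RUN=0 to execute it with a logged-in Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

# myResourceGroup and myHealthProbe are placeholder names.
# The probe requests "/" on port 80 of each backend VM.
run az network lb probe create \
  --resource-group myResourceGroup \
  --lb-name myLoadBalancer \
  --name myHealthProbe \
  --protocol Http \
  --port 80 \
  --path /
```

A probe only takes effect once a load-balancing rule references it, so creating it is the first half of the job.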
Address translation on the client side also changes the hash inputs. If clients sit behind a NAT gateway, all their traffic appears to originate from the gateway’s IP address. Under the default five-tuple hash this is usually harmless, because each connection still carries a distinct source port. If the rule is switched to source IP affinity (a hash on source and destination IP only), every connection from that address maps to the same backend, which can produce a serious imbalance.
The default distribution algorithm, while consistent for a given connection, is not round-robin across incoming connections; it is a hash of the full 5-tuple. A client that keeps opening new connections normally uses a different ephemeral source port each time, so its connections tend to spread across backends. Nothing guarantees an even split, though: two different source ports can still hash to the same backend. Only packets that share the exact same 5-tuple are guaranteed to stay together.
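To see how much the source port matters, here is a small simulation of 200 connections from a single client IP, hashed once with the source port in the key and once without it. As before, `cksum` is only a stand-in for Azure’s undisclosed hash, and the client address, port range, and helper names are invented for the example.

```bash
#!/usr/bin/env bash
# Illustrative model only: cksum stands in for Azure's undisclosed hash.

backends=(vm1 vm2)

# hash_to_backend KEY: map an arbitrary key string onto the pool.
hash_to_backend() {
  local hash
  hash=$(printf '%s' "$1" | cksum | cut -d' ' -f1)
  echo "${backends[hash % ${#backends[@]}]}"
}

# count_by_backend MODE: simulate 200 connections from one client IP,
# each with a different ephemeral source port, and count per backend.
#   five_tuple -> key includes the source port
#   source_ip  -> key is source and destination IP only
count_by_backend() {
  local mode=$1 port key
  for port in $(seq 50000 50199); do
    if [ "$mode" = five_tuple ]; then
      key="203.0.113.7|$port|20.10.20.30|80|tcp"
    else
      key="203.0.113.7|20.10.20.30"
    fi
    hash_to_backend "$key"
  done | sort | uniq -c
}

echo "five-tuple hash:"
count_by_backend five_tuple
echo "source-IP-only hash:"
count_by_backend source_ip
```

With the source port left out of the key, all 200 connections necessarily go to one backend. With it included, the connections can be split between the two, though the exact ratio depends on the hash.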
For affinity based on application-level data, such as cookie-based affinity for HTTP, you’d typically look to Azure Application Gateway, which operates at Layer 7. Load Balancer itself can only vary the hash inputs: its source IP affinity modes hash on two values (source IP and destination IP) or three (adding the protocol) instead of five. For pure L4 distribution, where the goal is that a single TCP connection always goes to the same backend, the default hashing is what you work with.
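Before reaching for Application Gateway, note that a Load Balancer rule exposes a distribution mode setting of its own. The sketch below switches an existing rule to source IP affinity with the Azure CLI. The names `myResourceGroup` and `myHTTPRule` are placeholders, and the `run` wrapper is a hypothetical helper that prints the command unless `DRY_RUN=0` is set.

```bash
#!/usr/bin/env bash
# Dry run by default: prints the command instead of calling Azure.
# Set DRY_RUN=0 to execute it with a logged-in Azure CLI.
run() {
  if [ "${DRY_RUN:-1}" = 1 ]; then
    echo "$@"
  else
    "$@"
  fi
}

# --load-distribution accepts:
#   Default          five-tuple hash
#   SourceIP         source IP + destination IP
#   SourceIPProtocol source IP + destination IP + protocol
# myResourceGroup and myHTTPRule are placeholder names.
run az network lb rule update \
  --resource-group myResourceGroup \
  --lb-name myLoadBalancer \
  --name myHTTPRule \
  --load-distribution SourceIP
```

Source IP affinity trades evenness for stickiness: clients that share one public address, for example behind a NAT gateway, will all be sent to the same backend.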
The next thing you’ll likely encounter is how to manage health probes effectively to ensure only healthy instances receive traffic.