L4 load balancing distributes network traffic across multiple servers using only network- and transport-layer information: IP addresses and TCP or UDP ports.
Imagine a busy restaurant. The host at the front is your L4 load balancer. When a new customer (a network request) arrives, the host doesn’t care what the customer wants to order (the specific application logic). They just look at which dining room the customer asked for (the destination port) and direct the customer to an available table (a server) in that room.
Here’s a simplified view of how it works with a web server farm.
Scenario: A Cluster of Web Servers
Let’s say you have three web servers: webserver-1 (192.168.1.10), webserver-2 (192.168.1.11), and webserver-3 (192.168.1.12). They all serve HTTP traffic on port 80.
Your L4 load balancer, let’s call it lb-front (10.0.0.1), will receive all incoming requests on port 80.
Configuration Snippet (Conceptual)
# This is a conceptual representation; actual syntax varies by load balancer
stream {
    server {
        listen 80;                  # The port the load balancer listens on
        proxy_pass webserver_pool;  # Directs traffic to the defined pool
    }
    upstream webserver_pool {
        server 192.168.1.10:80;
        server 192.168.1.11:80;
        server 192.168.1.12:80;
    }
}
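When a client connects to lb-front:80, the load balancer sees a TCP SYN packet whose destination port is 80. Based on its configured algorithm (e.g., Round Robin, Least Connections), it selects one of the servers in the webserver_pool. What happens next depends on the load balancer’s mode. A proxy-style balancer, like the stream configuration above, completes the TCP handshake with the client itself, opens a second TCP connection to the selected server, and relays bytes between the two. A NAT-style balancer instead rewrites the destination address of each packet and forwards it to that server’s IP and port. Either way, the client only ever addresses lb-front, and every later packet on that connection goes to the same backend server.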
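One common way to build an L4 load balancer is as a TCP proxy: accept the client’s connection, open a separate connection to the chosen backend, and copy bytes in both directions without ever parsing them. The following is a minimal sketch of that idea using Python’s asyncio, not a production design. The names `pipe`, `make_handler`, and `serve` are illustrative, backends are picked in Round Robin order, and the default listen port is 8080 (an assumption, chosen so the sketch runs without root privileges).

```python
import asyncio
import itertools

# Backend addresses from the scenario above; swap in your own.
BACKENDS = [("192.168.1.10", 80), ("192.168.1.11", 80), ("192.168.1.12", 80)]


async def pipe(reader, writer):
    """Copy bytes in one direction until EOF, never inspecting them."""
    try:
        while data := await reader.read(65536):
            writer.write(data)
            await writer.drain()
        if writer.can_write_eof():
            writer.write_eof()  # pass the half-close along to the other side
    except OSError:
        pass  # the peer went away; stop copying in this direction


def make_handler(backends):
    """Build a connection handler that picks backends in Round Robin order."""
    rotation = itertools.cycle(backends)

    async def handle_client(client_reader, client_writer):
        host, port = next(rotation)  # the only "decision" an L4 proxy makes
        try:
            backend_reader, backend_writer = await asyncio.open_connection(host, port)
        except OSError:
            client_writer.close()  # chosen backend is unreachable
            return
        try:
            await asyncio.gather(
                pipe(client_reader, backend_writer),
                pipe(backend_reader, client_writer),
            )
        finally:
            backend_writer.close()
            client_writer.close()

    return handle_client


async def serve(listen_host="0.0.0.0", listen_port=8080, backends=BACKENDS):
    """Listen for clients and relay each connection to one backend."""
    server = await asyncio.start_server(make_handler(backends), listen_host, listen_port)
    async with server:
        await server.serve_forever()

# To try it: asyncio.run(serve())
```

Note that the handler never looks at `data`. That is what makes it Layer 4: the same code would relay HTTP, a database protocol, or anything else that runs over TCP.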
Algorithms in Action
- Round Robin: The simplest. It just cycles through the servers in order: webserver-1, then webserver-2, then webserver-3, then back to webserver-1.
  - Client 1 connects to lb-front:80. Load balancer picks webserver-1.
  - Client 2 connects to lb-front:80. Load balancer picks webserver-2.
  - Client 3 connects to lb-front:80. Load balancer picks webserver-3.
  - Client 4 connects to lb-front:80. Load balancer picks webserver-1 again.
- Least Connections: The load balancer tracks active TCP connections to each backend server. It sends new connections to the server with the fewest active connections. This is better for long-lived connections or when server processing times vary.
  - If webserver-1 has 5 connections, webserver-2 has 3, and webserver-3 has 7, the next connection goes to webserver-2.
- IP Hash: The load balancer calculates a hash based on the client’s source IP address. This hash determines which backend server receives the connection. This ensures that a specific client’s requests consistently go to the same backend server, which is crucial for applications that maintain session state on the server.
  - Client from 1.2.3.4 might consistently hash to webserver-1.
  - Client from 5.6.7.8 might consistently hash to webserver-2.
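The three algorithms above can be sketched in a few lines of Python. This is an illustration of the selection logic only, not how any particular load balancer implements it. The class and function names are made up for this example, and the choice of SHA-256 reduced modulo the pool size for IP Hash is an assumption; real products use their own hash functions.

```python
import hashlib
import itertools

SERVERS = ["webserver-1", "webserver-2", "webserver-3"]


class RoundRobin:
    """Hand out servers in a fixed, repeating order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Send each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {s: 0 for s in servers}

    def pick(self):
        # On a tie, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call this when a connection to `server` closes.
        self.active[server] -= 1


def ip_hash(client_ip, servers):
    """Map a client IP to a server; the same IP always gets the same server."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]


rr = RoundRobin(SERVERS)
print([rr.pick() for _ in range(4)])
# ['webserver-1', 'webserver-2', 'webserver-3', 'webserver-1']

lc = LeastConnections(SERVERS)
lc.active.update({"webserver-1": 5, "webserver-2": 3, "webserver-3": 7})
print(lc.pick())  # webserver-2

print(ip_hash("1.2.3.4", SERVERS) == ip_hash("1.2.3.4", SERVERS))  # True
```

One limitation of the modulo approach in `ip_hash`: when a server is added or removed, most clients get remapped to a different backend. Consistent hashing is the usual way to reduce that churn.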
UDP Traffic
The same principles apply to UDP, but without the connection state. For UDP, the load balancer typically uses Round Robin or IP Hash, as there’s no "connection count" to measure. A DNS server cluster is a common example. If you have multiple DNS servers (dns-1 to dns-3) behind lb-front on port 53, the load balancer will distribute UDP DNS queries across them.
Health Checks
A critical part of L4 load balancing is knowing when a backend server is unhealthy. The load balancer periodically sends a small probe (e.g., a TCP connection attempt to port 80, or a UDP packet to port 53) to each backend server. If a server fails to respond within a timeout (e.g., 3 seconds) or actively refuses the probe, that check counts as failed. After enough consecutive failures, the load balancer marks the server as down and stops sending traffic to it until it passes checks again.
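# Example health check configuration (modeled on NGINX Plus stream syntax; other load balancers differ)
stream {
    server {
        listen 80;
        proxy_pass webserver_pool;
        # Probe every 5s; mark down after 3 consecutive failures, up again after 2 successes
        health_check interval=5s fails=3 passes=2;
        health_check_timeout 3s;  # A probe with no response within 3s counts as a failure
    }
    upstream webserver_pool {
        zone webserver_pool 64k;  # Shared memory zone, needed for active health checks
        server 192.168.1.10:80;
        server 192.168.1.11:80;
        server 192.168.1.12:80;
    }
}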
This ensures that traffic is only sent to healthy, responsive servers, preventing clients from encountering errors due to a downed backend.
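The failure and recovery thresholds are what keep a single dropped probe from flapping a server in and out of rotation. Here is a rough sketch of that bookkeeping, assuming 3 consecutive failures to mark a server down and 2 consecutive successes to bring it back. `tcp_probe`, `HealthState`, `down_after`, and `up_after` are hypothetical names for this example. Note that `tcp_probe` completes a full TCP handshake rather than sending a bare SYN, which is the simplest check available from ordinary user-space code.

```python
import socket


def tcp_probe(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return False


class HealthState:
    """Track one backend's up/down status from a stream of probe results."""

    def __init__(self, down_after=3, up_after=2):
        self.down_after = down_after
        self.up_after = up_after
        self.healthy = True
        self._streak = 0  # consecutive results that contradict the current status

    def record(self, probe_ok):
        """Feed in one probe result; return the (possibly updated) status."""
        if probe_ok == self.healthy:
            self._streak = 0  # result agrees with current status
        else:
            self._streak += 1
            threshold = self.down_after if self.healthy else self.up_after
            if self._streak >= threshold:
                self.healthy = not self.healthy
                self._streak = 0
        return self.healthy


state = HealthState(down_after=3, up_after=2)
print([state.record(ok) for ok in [False, False, False, True, True]])
# [True, True, False, False, True]
```

In a real balancer, a timer would call something like `state.record(tcp_probe(host, port))` for each backend at every check interval, and the selection algorithm would skip any backend whose `healthy` flag is False.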
The magic of L4 load balancing lies in its speed and simplicity. It operates at a lower network level, making decisions quickly based on IP addresses and port numbers. It doesn’t need to inspect the actual content of the packets (as L7 load balancers do), making it highly performant for high-throughput scenarios. This is why it’s a good fit for any TCP or UDP service where the application-level data isn’t relevant to the distribution decision.
The next step in understanding load balancing is moving up the stack to L7, where decisions are made based on application-specific data like HTTP headers or URLs.