An active-active load balancer design does more than spread requests around: it directs traffic so that no single server becomes a bottleneck, and so that the remaining servers absorb the load as soon as one fails.
Let’s see it in action. Imagine a simple setup with two web servers, web1 and web2, and a load balancer, lb.
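```nginx
# /etc/nginx/conf.d/loadbalancer.conf on lb
# The stock nginx.conf includes conf.d/*.conf inside its http block,
# so this file must not open its own http { } wrapper.

# Pool of backend servers that can handle requests
upstream backend_servers {
    server web1.example.com:80;
    server web2.example.com:80;
}

server {
    listen 80;
    server_name example.com;

    location / {
        # Forward each request to a server from the pool
        proxy_pass http://backend_servers;

        # Pass the original host, client address, and scheme to the backend
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}
```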
When a user requests example.com, Nginx on lb consults the backend_servers upstream block. By default, it uses a round-robin algorithm, sending the first request to web1, the second to web2, the third to web1, and so on.
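The real magic happens when a server fails. Open source Nginx detects this passively, from live traffic, rather than by sending probes. If lb can’t connect to web1 on port 80, or the connection times out, the request is normally retried on web2 and the failure is counted against web1. With the defaults (max_fails=1, fail_timeout=10s), a single failed attempt marks web1 as unavailable for 10 seconds, and traffic goes only to web2 during that window. After the window expires, Nginx sends a live request to web1 again; if it succeeds, web1 returns to the rotation. Active, out-of-band health checks (the health_check directive) are part of the commercial NGINX Plus, not the open source build.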
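Failure detection in open source Nginx can be tuned per server and per location. The following is a minimal sketch that reuses the hostnames from the example above; the thresholds and timeout are illustrative assumptions, not recommended values.

```nginx
# Sketch only: thresholds are illustrative assumptions, not tuned values.
upstream backend_servers {
    # Mark a server unavailable after 3 failed attempts within 30s,
    # then keep it out of rotation for 30s before trying it again.
    server web1.example.com:80 max_fails=3 fail_timeout=30s;
    server web2.example.com:80 max_fails=3 fail_timeout=30s;
}

server {
    listen 80;
    server_name example.com;

    location / {
        proxy_pass http://backend_servers;

        # Give up on an unresponsive backend quickly (the default is 60s).
        proxy_connect_timeout 2s;

        # Conditions that count as a failed attempt and trigger a retry
        # on the next server; "error timeout" is the default.
        proxy_next_upstream error timeout;
    }
}
```

Raising max_fails makes Nginx more tolerant of a single transient error, at the cost of sending a few more requests to a server that is really down before it is taken out of rotation.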
This active-active setup fundamentally solves two problems: scalability and availability. Scalability comes from distributing incoming requests across multiple servers, preventing any single server from being overwhelmed. Availability is achieved by having redundant servers and a mechanism (the load balancer) that can detect failures and reroute traffic, ensuring continuous service even if one or more servers go down.
The mental model for active-active load balancing has two parts: the upstream group and failure detection. The upstream block defines the pool of servers that can handle requests, and the load balancer distributes traffic among the servers it currently considers available. Health checks are the eyes and ears of the load balancer. In open source Nginx they are passive, inferred from the outcome of real requests; active probing requires NGINX Plus or an external tool. Either way, a server that fails is temporarily removed from the pool.
The most common distribution algorithms are round-robin (the default), least connections (the least_conn directive, which sends each request to the server with the fewest active connections), and IP hash (the ip_hash directive, which routes requests from the same client IP to the same server, giving session persistence without cookies). The choice depends on the application’s needs. If your application keeps session state in local server memory rather than in a shared store, IP hash is a reasonable choice. If request durations vary widely and you want active connections spread as evenly as possible, least_conn is better.
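Switching algorithms is a one-line change inside the upstream block. The sketch below shows both alternatives side by side; the upstream names backend_least_conn and backend_ip_hash are hypothetical, and only the one referenced by proxy_pass would be used.

```nginx
# Hypothetical upstream names; reference one of them in proxy_pass.

# Least connections: each request goes to the server with the
# fewest active connections.
upstream backend_least_conn {
    least_conn;
    server web1.example.com:80;
    server web2.example.com:80;
}

# IP hash: requests from the same client address go to the same server.
# For IPv4, only the first three octets of the address form the key.
upstream backend_ip_hash {
    ip_hash;
    server web1.example.com:80;
    server web2.example.com:80;
}
```

Because ip_hash keys on the first three octets of an IPv4 address, clients behind the same NAT gateway or in the same /24 network all land on one server, which can skew the distribution.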
The term "active-active" can be confusing because it applies at two layers. At the backend layer, it means every server in the pool serves traffic at the same time, as opposed to active-passive, where a standby server sits idle until the primary fails. At the load balancer layer, it means running two or more load balancers that all direct traffic to the same backend pool simultaneously, rather than a primary/secondary failover pair. The single-lb example above is active-active only at the backend layer; lb itself is still a single point of failure.
A common, less obvious pitfall is sticky sessions. If you rely on a client always hitting the same server (e.g., for session state stored locally on that server), you need to configure the load balancer to ensure this. In open source Nginx, this is done with ip_hash or hash in the upstream block; the cookie-based sticky directive is available only in NGINX Plus. Even then, persistence lasts only as long as the server does: a client whose IP hashes to web1 is sent to web2 once web1 is marked unavailable, and any session stored only on web1 is lost.
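When many clients share one IP address, persistence can be keyed on a session cookie instead, using the generic hash directive. This is a sketch under assumptions: the cookie name sessionid and the upstream name backend_by_session are hypothetical, and the application is assumed to set that cookie itself.

```nginx
# Hypothetical names: "backend_by_session" and the "sessionid" cookie.
upstream backend_by_session {
    # Key on the value of the client's sessionid cookie. "consistent"
    # selects ketama consistent hashing, so adding or removing a server
    # remaps only a fraction of the keys.
    hash $cookie_sessionid consistent;
    server web1.example.com:80;
    server web2.example.com:80;
}
```

As with ip_hash, this only keeps a client on the same server while that server is available. If web1 is marked unavailable, its clients are routed to web2, so session data held only in web1’s memory is still lost.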
The next hurdle you’ll encounter is managing state across servers when sticky sessions aren’t feasible or desired.