HAProxy’s server weights aren’t just about directing more traffic to faster servers; they’re a dynamic, real-time dial that can prevent outages and optimize performance without downtime.
Let’s see it in action. Imagine you have two identical web servers, web01 and web02, behind HAProxy. By default, they’d split traffic 50/50.
frontend http_frontend
    bind *:80
    default_backend http_backend

backend http_backend
    balance roundrobin
    server web01 192.168.1.10:80 check
    server web02 192.168.1.11:80 check
If web01 turns out to have more headroom, for example because it consistently responds faster under the same load, you might want to shift more traffic to it. You can do that by adjusting the weights:
backend http_backend
    balance roundrobin
    server web01 192.168.1.10:80 check weight 2
    server web02 192.168.1.11:80 check weight 1
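Now, for every three requests, web01 will get two and web02 will get one. You don't have to edit the configuration and reload to change this ratio: weights can be changed on the fly via HAProxy's Runtime API without interrupting existing connections. Runtime changes are not written back to the configuration file, so update the file as well if the new weight should survive a restart.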
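To make the 2:1 split concrete, here is a minimal Python sketch of a generic smooth weighted round-robin scheduler. It is only an illustration of the proportion, not HAProxy's internal implementation, and the function name weighted_round_robin is made up for this example.

```python
from collections import Counter


def weighted_round_robin(weights, n):
    """Return n server names picked by smooth weighted round-robin.

    weights: dict mapping server name -> positive integer weight.
    Generic illustration only; this is not HAProxy's own scheduler.
    """
    current = {name: 0 for name in weights}
    total = sum(weights.values())
    picks = []
    for _ in range(n):
        # Every server earns credit proportional to its weight.
        for name, weight in weights.items():
            current[name] += weight
        # The server with the most credit is picked and pays back the total.
        chosen = max(current, key=current.get)
        current[chosen] -= total
        picks.append(chosen)
    return picks


picks = weighted_round_robin({"web01": 2, "web02": 1}, 300)
counts = Counter(picks)
print(counts["web01"], counts["web02"])
```

Over 300 simulated requests, web01 is chosen twice as often as web02, and the picks are interleaved rather than sent in bursts.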
This flexibility is crucial when dealing with servers that have different capacities, or when you’re performing rolling updates. Suppose you’re upgrading web02. You don’t want to just pull it out, as that would drop active connections. Instead, you can gracefully shift traffic away.
First, reduce the weight of web02 to 0. This tells HAProxy to stop selecting it for new load-balanced connections. Requests that are pinned to web02 by a persistence mechanism, such as a cookie, can still reach it.
# Using the HAProxy Runtime API (e.g., via socat or netcat)
echo "set server http_backend/web02 weight 0" | sudo socat stdio /var/run/haproxy.sock
Existing connections to web02 will continue to use that server until they complete or time out. Once web02 is no longer receiving new traffic, you can safely take it offline for maintenance.
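One way to tell when web02 has actually emptied out is to look at its current session count (the scur column) in the CSV that the Runtime API's show stat command returns. The sketch below parses that output in Python. The function name current_sessions is hypothetical, and the sample text is a hand-written, heavily shortened stand-in for real output, which has many more columns.

```python
import csv
import io


def current_sessions(show_stat_output, backend, server):
    """Return the scur (current sessions) value for backend/server, or None."""
    # The CSV header line starts with "# "; strip it so csv sees column names.
    text = show_stat_output.lstrip()
    if text.startswith("# "):
        text = text[2:]
    reader = csv.DictReader(io.StringIO(text))
    for row in reader:
        if row.get("pxname") == backend and row.get("svname") == server:
            return int(row["scur"])
    return None


# Hand-written, abbreviated sample (assumption: not captured from a real server).
sample = (
    "# pxname,svname,qcur,qmax,scur,smax,slim,stot\n"
    "http_backend,web01,0,0,12,40,,900\n"
    "http_backend,web02,0,0,0,35,,450\n"
    "http_backend,BACKEND,0,0,12,60,200,1350\n"
)

print(current_sessions(sample, "http_backend", "web02"))
```

When the value for web02 reaches 0, no client sessions remain on it and it can be stopped.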
After the upgrade, bring web02 back online and restore its weight. In this example that means setting it back to its original weight of 1; with a larger weight scale (say 200 and 100 instead of 2 and 1) you could raise it in several smaller steps instead:
echo "set server http_backend/web02 weight 1" | sudo socat stdio /var/run/haproxy.sock
HAProxy will then begin sending traffic to web02 again, respecting the new weight ratio.
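If you do want a gradual ramp, the sequence of commands can be generated ahead of time. The following is a small sketch under a few assumptions: the helper name ramp_commands is invented for this example, the steps are evenly spaced, and the target weight uses a larger scale (100) than the 1 used above so that intermediate values exist. It only builds the command strings; sending them, and waiting between steps, is left to you.

```python
def ramp_commands(backend, server, target_weight, steps):
    """Return Runtime API commands that raise a server's weight in even steps."""
    # HAProxy accepts weights from 0 to 256; a ramp targets a nonzero weight.
    if not 1 <= target_weight <= 256:
        raise ValueError("target_weight must be between 1 and 256")
    if steps < 1:
        raise ValueError("steps must be at least 1")
    commands = []
    previous = 0
    for i in range(1, steps + 1):
        # Ceiling division, so the first step is never weight 0.
        weight = max(1, (target_weight * i + steps - 1) // steps)
        if weight != previous:
            commands.append(f"set server {backend}/{server} weight {weight}")
            previous = weight
    return commands


for command in ramp_commands("http_backend", "web02", 100, 4):
    print(command)
```

With a target of 100 and four steps this yields weights 25, 50, 75, and 100. With a target of 1, as in the example above, it collapses to a single command.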
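The weight parameter sets a server's share of the load relative to the other servers in the backend, and each load balancing algorithm applies it in its own way. For roundrobin, servers are picked in turn in proportion to their weights. For leastconn, the weight is considered alongside each server's number of active connections, so a server with weight 2 is expected to carry roughly twice as many connections as one with weight 1. In both cases, a server with weight 2 is treated as twice as capable as one with weight 1. Weights range from 0 to 256, and the default is 1.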
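As a rough model of how weights and connection counts combine under leastconn, the sketch below picks the server with the lowest ratio of active connections to weight. This is a simplification for illustration, not HAProxy's actual data structure or tie-breaking logic, and the function name pick_leastconn is hypothetical.

```python
def pick_leastconn(servers):
    """Pick the server with the lowest active-connections-to-weight ratio.

    servers: dict mapping name -> (active_connections, weight).
    Servers with weight 0 are skipped. Returns None if none are eligible.
    """
    eligible = {
        name: conns / weight
        for name, (conns, weight) in servers.items()
        if weight > 0
    }
    if not eligible:
        return None
    return min(eligible, key=eligible.get)


# web01 has more connections in absolute terms, but fewer per unit of weight.
choice = pick_leastconn({"web01": (10, 2), "web02": (6, 1)})
print(choice)
```

Here web01 is chosen even though it holds more connections, because 10 connections at weight 2 is a lighter relative load than 6 connections at weight 1.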
When you set a server's weight to 0, HAProxy treats it as drained. It is no longer selected for new load-balanced connections, but existing connections are allowed to finish, and requests tied to it by persistence can still arrive. This is the key to zero-downtime maintenance. The check directive is still active, so HAProxy continues to track the server's health status, but a healthy server with weight 0 still receives no new load-balanced traffic.
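The Runtime API is your primary tool for this dynamic adjustment. You connect to the socket declared with the stats socket directive in HAProxy's global section and send commands. That socket is usually a Unix domain socket, such as /var/run/haproxy.sock in the examples here, though it can also be bound to a TCP address and port. Changing server settings requires the socket to be configured with level admin. The set server command is used to modify server parameters, including weight.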
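The socat one-liners can also be replaced by a few lines of Python, which is handy if you want to script maintenance. This is a minimal sketch, assuming the Runtime API is exposed on a Unix domain socket at /var/run/haproxy.sock and that the calling user has permission to open it; send_runtime_command is a made-up helper name, not part of any HAProxy library.

```python
import socket


def send_runtime_command(command, socket_path="/var/run/haproxy.sock", timeout=5.0):
    """Send one command to the HAProxy Runtime API and return its reply as text."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(socket_path)
        # Commands are plain text terminated by a newline.
        sock.sendall(command.encode("ascii") + b"\n")
        chunks = []
        # In non-interactive mode HAProxy closes the connection after replying,
        # so read until the peer closes.
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii", errors="replace")


# Example call (needs a running HAProxy with an admin-level stats socket):
# send_runtime_command("set server http_backend/web02 weight 0")
```

The function opens a new connection per command, which mirrors what the socat examples do.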
The most counterintuitive aspect of HAProxy's weight system is that a weight is not a fixed assignment of clients to servers; it is applied again every time HAProxy selects a server. Even if one server has a weight of 1 and another has 2, and HAProxy just sent a connection to the weight 2 server, it will still apply the weight ratio to the next selection. A weight change therefore takes effect from the next selection onward, while connections that are already established stay where they are.
Understanding how HAProxy queues and selects servers based on their weights, especially in conjunction with the check and draining states, unlocks sophisticated traffic management without service interruptions.
The next challenge is often managing server health checks and ensuring HAProxy itself doesn’t become a single point of failure.