HAProxy health checks aren’t just about checking if a server is "up" or "down"; they’re a sophisticated negotiation between HAProxy and your backend services about what "healthy" actually means.
Let’s watch HAProxy in action. Imagine a simple setup: a frontend listening on port 80, directing traffic to two backend web servers.

    frontend http_frontend
        bind *:80
        default_backend web_servers

    backend web_servers
        balance roundrobin
        server web1 192.168.1.10:80 check
        server web2 192.168.1.11:80 check

Here, the check keyword tells HAProxy to open a TCP connection to web1 and web2 at a regular interval (every 2 seconds by default). If the connection fails on 3 consecutive attempts (the default fall value), HAProxy marks the server as down. This is the most basic form of an active health check.
But what if your web server accepts TCP connections yet serves garbage or error pages? That’s where more descriptive active checks come in, along with passive checks that watch real client traffic instead of sending probes.
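As a rough sketch of the passive side, assuming an HTTP-mode backend, the observe server keyword lets HAProxy count errors seen on live traffic. The error-limit value of 10 and the mark-down action are illustrative choices, not recommendations:

```haproxy
backend web_servers
    mode http
    balance roundrobin
    # observe layer7: watch responses to real client requests for errors,
    #                 in addition to the active probes enabled by check.
    # error-limit 10: react after 10 consecutive errors.
    # on-error mark-down: take the server out of rotation immediately.
    server web1 192.168.1.10:80 check observe layer7 error-limit 10 on-error mark-down
    server web2 192.168.1.11:80 check observe layer7 error-limit 10 on-error mark-down
```

A server marked down this way comes back only after the active check succeeds rise times in a row, so passive observation complements active probing rather than replacing it.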
The core problem HAProxy health checks solve is preventing it from sending traffic to backend servers that cannot fulfill requests, whether due to a complete outage, a partial failure, or a misconfiguration. It’s about maintaining service availability and quality by dynamically adjusting the pool of available servers.
Internally, HAProxy maintains a state for each backend server. This state includes whether the server is currently considered healthy, how many consecutive health check failures it has had, and how many consecutive successes it needs to be marked healthy again. When a health check is configured, HAProxy initiates probes to the backend server at a specified interval. The response (or lack thereof) from the backend server dictates the state change.
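To make the state tracking concrete, here is a minimal sketch that writes the default timing values out explicitly. With these numbers, a server that stops answering is marked down after roughly fall × inter = 3 × 2s = 6s, and a recovered server returns after roughly rise × inter = 2 × 2s = 4s. These are approximations, since check timeouts add to each failed probe.

```haproxy
backend web_servers
    balance roundrobin
    # inter 2s: probe every 2 seconds
    # fall 3:   3 consecutive failures  -> server marked DOWN
    # rise 2:   2 consecutive successes -> server marked UP again
    server web1 192.168.1.10:80 check inter 2s fall 3 rise 2
    server web2 192.168.1.11:80 check inter 2s fall 3 rise 2
```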
The check keyword is the gateway. Without it, HAProxy never probes the server, always considers it up, and keeps sending it traffic even when requests fail. With check, HAProxy starts actively probing. You can customize this probing extensively:
- port <port>: If the health endpoint listens on a different port than the one that serves traffic, add port after check on the server line.
      server web1 192.168.1.10:80 check port 8080
  This tells HAProxy to open its health-check connection to 192.168.1.10 on port 8080, even though it forwards application traffic to 192.168.1.10:80.
- option httpchk <method> <uri>: For HTTP backends, this backend-level directive replaces the TCP connect check with an HTTP request. The method can be any HTTP method, such as GET, POST, or HEAD. HEAD is often efficient because only headers come back.
      option httpchk HEAD /healthz
- The <uri> argument: This is crucial. It names the endpoint HAProxy requests on every check.
      option httpchk GET /healthz
  HAProxy sends GET /healthz to each server in the backend that has check set and considers the server healthy if it receives an HTTP status code between 200 and 399 (inclusive).
- http-check expect <match> <pattern>: A backend-level directive that refines what counts as a healthy response.
      http-check expect string OK
  HAProxy now requires the response body to contain the string OK. An expect rule replaces the default 2xx/3xx status test, so add http-check expect status 200 as well if the status code must still be verified (chaining several expect rules requires HAProxy 2.2 or later).
- ssl / check-ssl: For HTTPS backends. When the server line carries ssl, health checks are sent over TLS as well. check-ssl forces TLS for the checks alone when regular traffic is unencrypted. crt <certfile> supplies a client certificate and is needed only if the backend requires mutual TLS.
      server web1 192.168.1.10:443 ssl verify none check
  verify none skips validation of the server certificate and keeps the example short; production setups normally use verify required with a ca-file.
- inter <milliseconds>: Sets the interval between health checks. Default is 2000ms (2 seconds).
      server web1 192.168.1.10:80 check inter 5000
- fall <count>: The number of consecutive failed checks before marking a server down. Default is 3.
      server web1 192.168.1.10:80 check fall 5
- rise <count>: The number of consecutive successful checks required to mark a server as up again after it was down. Default is 2.
      server web1 192.168.1.10:80 check rise 3
- timeout check <milliseconds>: Set in the backend or defaults section, this bounds how long HAProxy waits for the check response once the connection is established. The connection phase of a check is limited by timeout connect, capped at the inter value.
      timeout check 1000
- http-check send-state: A backend-level directive that adds an X-Haproxy-Server-State header to every HTTP health check request. The header describes how HAProxy currently sees the server, including whether it is UP or DOWN and its weight. The backend application can read this header and potentially take corrective action or log the event. It does not affect HAProxy’s own decision-making; it only lets the application learn its state as perceived by the load balancer. Passive health checks, which judge a server by its responses to real traffic, are configured separately with the observe server keyword.
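Putting several of these options together, a backend with HTTP health checks might look like the sketch below. It assumes HAProxy 2.2 or later (for http-check send and chained expect rules) and an application that exposes /healthz on port 8080 and returns a body containing OK; adjust both to match your service.

```haproxy
backend web_servers
    mode http
    balance roundrobin

    # Switch from the default TCP connect check to an HTTP check.
    option httpchk
    # Request issued on every probe.
    http-check send meth GET uri /healthz
    # Healthy only if the status is 200 AND the body contains "OK".
    http-check expect status 200
    http-check expect string OK
    # Tell the application how HAProxy currently sees it.
    http-check send-state
    # Give up on a probe if no response arrives within 1 second.
    timeout check 1s

    # Probe port 8080 every 5s; 5 failures mark DOWN, 3 successes mark UP.
    server web1 192.168.1.10:80 check port 8080 inter 5s fall 5 rise 3
    server web2 192.168.1.11:80 check port 8080 inter 5s fall 5 rise 3
```

Running haproxy -c -f <configfile> validates the syntax before a reload.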
A subtle but powerful aspect of HAProxy’s health checking is how the check type determines what counts as a failure. A plain TCP check passes as soon as the connection is established, so a backend that accepts the connection and then closes it without serving anything still looks healthy. With an HTTP check, a connection that the backend closes or resets before a complete, valid response arrives is a failed check and counts toward fall, just like a refused connection or a timeout. Choosing a check that exercises the application layer therefore allows more nuanced detection of backend degradation than outright unresponsiveness alone.
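For non-HTTP services, a scripted TCP check is one way to require an application-level reply rather than just a completed connection. The sketch below assumes a Redis-like service that answers PING with +PONG; the backend name, server name, and address are hypothetical.

```haproxy
backend redis_servers
    mode tcp
    # Replace the bare connect check with a send/expect script.
    option tcp-check
    tcp-check connect
    tcp-check send PING\r\n
    # The check fails if the server closes the connection or replies with anything else.
    tcp-check expect string +PONG
    server cache1 192.168.1.20:6379 check inter 2s fall 3 rise 2
```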
The next step is understanding how to aggregate these checks across multiple instances of a service for truly resilient deployments.