HAProxy doesn’t just serve requests; it can actively prevent them from ever reaching your backend servers, making your application feel faster and your infrastructure cheaper to run.
Let’s see HAProxy caching in action. Imagine we have a backend service that generates slightly expensive reports.
    cache report_cache
        total-max-size 100   # 100MB cache, held in RAM
        max-age 300          # Cache for 5 minutes

    frontend http_in
        mode http
        bind *:80
        acl is_report_request path_beg /reports
        http-request cache-use report_cache if is_report_request
        http-response cache-store report_cache
        default_backend report_servers

    backend report_servers
        mode http
        server webserver1 192.168.1.10:80 check
        server webserver2 192.168.1.11:80 check

Here’s what’s happening:

- cache report_cache: This declares a named cache section; the name report_cache is our own choice. total-max-size 100 sets the cache size to 100 megabytes of RAM, and older entries are evicted to make room when that limit is reached. max-age 300 caps the lifetime of each entry at 300 seconds (5 minutes). If the backend sends a shorter s-maxage or max-age in its Cache-Control header, the shorter value wins.
- frontend http_in: This defines our listening interface.
- bind *:80: HAProxy listens on port 80.
- acl is_report_request path_beg /reports: We define a condition. If the request path starts with /reports, the ACL is_report_request becomes true.
- http-request cache-use report_cache if is_report_request: This is the core directive. If the is_report_request ACL is true, HAProxy will attempt to serve the request from the cache. This rule also computes the cache key, so it has to run for a response to be stored later.
- http-response cache-store report_cache: When the request was not answered from the cache, the backend's response is stored here if it is cacheable. Only responses to GET requests with a 200 status are stored; POST, PUT, DELETE, etc., are never cached.
- The cache key is not something we configure: HAProxy derives it from the Host header and the URI (path plus query string). So, example.com/reports/user/123?format=pdf and example.com/reports/user/456?format=pdf are separate cache entries, and example.com/reports/user/123?format=pdf and example.com/reports/user/123?format=json are separate as well. Because the Host header is part of the key, several domains can share one cache without colliding.
- default_backend report_servers and the server webserver1 ... lines: These point to our actual backend servers that generate the reports. Server lines belong in a backend section, not in the frontend.
When a client requests http://example.com/reports/user/123?format=pdf:
- HAProxy checks the ACL is_report_request. It matches.
- It then checks its cache for an entry with the key example.com/reports/user/123?format=pdf that has not yet expired.
- Cache hit: If a valid entry is found, HAProxy immediately returns the cached response to the client without contacting webserver1 or webserver2.
- Cache miss: If no entry is found or it has expired, HAProxy forwards the request to one of the backend servers (webserver1 or webserver2). When the backend responds, HAProxy stores that response in its cache (if it is a cacheable response and fits within the cache's size limits) before returning it to the client.
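The hit/miss flow above can be sketched as a toy model in Python. This is not HAProxy's implementation: the class TinyCache, the fetch method, the slow_backend function, and the explicit now argument are all hypothetical names invented for illustration, and the key is simply modeled as a hash of the host plus the URI.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Entry:
    body: str
    stored_at: float  # time (in seconds) at which the response was cached


class TinyCache:
    """Toy TTL cache keyed on host + URI; an illustration, not HAProxy code."""

    def __init__(self, max_age):
        self.max_age = max_age  # lifetime of an entry, in seconds
        self.entries = {}

    @staticmethod
    def key(host, uri):
        # Model the key as a hash over the host and the full URI.
        return hashlib.sha1((host + uri).encode()).hexdigest()

    def fetch(self, host, uri, now, backend):
        k = self.key(host, uri)
        entry = self.entries.get(k)
        if entry is not None and now - entry.stored_at < self.max_age:
            return entry.body, "HIT"      # served without calling the backend
        body = backend(uri)               # miss or expired: ask the backend
        self.entries[k] = Entry(body, now)
        return body, "MISS"


def slow_backend(uri):
    # Stand-in for the expensive report generator.
    return f"report for {uri}"


cache = TinyCache(max_age=300)
uri = "/reports/user/123?format=pdf"
print(cache.fetch("example.com", uri, now=0, backend=slow_backend)[1])    # MISS
print(cache.fetch("example.com", uri, now=120, backend=slow_backend)[1])  # HIT
print(cache.fetch("example.com", uri, now=301, backend=slow_backend)[1])  # MISS
```

The third request falls outside the 300-second window, so the model goes back to the backend and refreshes the entry, which mirrors the expiry behavior described in the steps.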
This configuration dramatically reduces the load on your backend servers for repetitive report requests, as subsequent requests for the same report within the 5-minute window are served directly from HAProxy’s memory. The cache lives entirely in RAM; nothing is written to disk.
The most surprising thing about HAProxy’s caching is that it’s an extension to its core proxying logic, not a separate service. It intercepts requests, consults its local cache, and only if necessary, forwards to the backend. The configuration is declarative, meaning you describe what you want cached, and HAProxy handles the how.
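The cache key is the detail most likely to trip you up. HAProxy’s built-in cache has no cache-key directive: the key is always derived from the Host header and the full URI, query string included. If your backend application uses query parameters to determine content (e.g., ?sort=asc vs. ?sort=desc), those responses are therefore kept apart automatically, and you won’t serve the wrong data from the cache. The flip side is that query parameters that don’t affect the response (tracking parameters, for example) split one logical response across many entries and lower the cache hit rate. One way to counter that, assuming the rewritten URI is what gets hashed, is to normalize or strip the query string with a rule such as http-request set-query placed before the cache-use rule. Keep in mind that the backend then sees the rewritten query as well.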
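To see why the query string matters for the cache key, here is a small Python sketch. It only models the idea of building a key with or without the query; the function names key_with_query and key_without_query are hypothetical and do not correspond to anything in HAProxy.

```python
from urllib.parse import urlsplit


def key_with_query(host, uri):
    # Key built from the host and the full URI, query string included.
    return host + uri


def key_without_query(host, uri):
    # Key built from the host and the path only; the query string is dropped.
    return host + urlsplit(uri).path


pdf = "/reports/user/123?format=pdf"
json_ = "/reports/user/123?format=json"

# With the query in the key, the two formats get separate entries.
print(key_with_query("example.com", pdf) == key_with_query("example.com", json_))        # False

# Without it, both requests map to the same entry.
print(key_without_query("example.com", pdf) == key_without_query("example.com", json_))  # True
```

Dropping the query is only safe when the parameter really has no effect on the response; in this example it would hand a PDF to a client that asked for JSON.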
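The next logical step is to understand how to manage cache invalidation more dynamically. The open-source cache does not offer a purge directive, so in practice this means keeping entry lifetimes short, having the backend application send tighter Cache-Control headers for content that changes often, and inspecting what is currently stored with the show cache command on the runtime API.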