HAProxy can offload request filtering to external agents through its Stream Processing Offload Engine (SPOE), allowing for complex and dynamic filtering logic without bogging down the HAProxy process itself.
Let’s see this in action. Imagine a scenario where we want to block requests based on a dynamically updated list of malicious IP addresses. Instead of trying to manage this list within HAProxy’s ACLs (which can become unwieldy), we can use an external agent.
Here’s a simplified HAProxy configuration snippet:
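frontend http_in
    bind *:80
    mode http
    # Attach the SPOE filter; the engine name matches the [section] in the SPOE file
    filter spoe engine process_request_filter config /etc/haproxy/spoe_request_filter.conf
    # Reject the request if the agent flagged the client IP
    http-request deny deny_status 403 if { var(txn.reqfilter.blocked) -m bool }
    default_backend webservers

backend webservers
    mode http
    server web1 192.168.1.10:80 check
    server web2 192.168.1.11:80 check

# The agent is reached through an ordinary backend (example address)
# Note: recent HAProxy releases may expect "mode spop" here; check your version
backend spoe_agents
    mode tcp
    server agent1 127.0.0.1:12345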
And here’s a conceptual outline of what the external agent behind our process_request_filter engine might do:
- Receive request data from HAProxy (e.g., source IP, request URI, headers).
- Consult an external data source (like a Redis cache or a database) for a list of blocked IPs.
- If the source IP is in the blocked list, reply with a variable that marks the request as blocked, so that an HAProxy rule can reject it (e.g., return a 403 Forbidden).
- Otherwise, leave that variable unset or false, and HAProxy lets the request proceed.
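The steps above can be sketched as a small decision function. This is a minimal illustration in Python, not a complete agent: it leaves out the networking layer that talks to HAProxy, and the names decide, blocked_networks, and the "blocked" variable are hypothetical choices made for this example.

```python
import ipaddress


def decide(src_ip: str, blocked_networks: list) -> dict:
    """Return the variables a (hypothetical) agent would report to HAProxy."""
    addr = ipaddress.ip_address(src_ip)
    # Membership test works for single hosts (/32) and whole ranges alike.
    is_blocked = any(addr in net for net in blocked_networks)
    return {"blocked": is_blocked}


# Example blocklist; in practice this would come from Redis or a database.
blocklist = [
    ipaddress.ip_network("192.168.1.100/32"),
    ipaddress.ip_network("10.20.0.0/16"),
]
```

Returning a dictionary of variables rather than a status code mirrors how the agent only reports facts, while HAProxy's own rules decide what to do with them.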
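In HAProxy, SPOE is attached to a proxy with the directive filter spoe engine <name> config <file>. The engine name, process_request_filter in our example, selects the matching section of the SPOE configuration file, and that file names the agent, the messages to send, and the events that trigger them.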
The core problem SPOE solves here is separation of concerns. HAProxy is a high-performance network proxy. Its strength lies in efficiently routing, load balancing, and performing simple, fast checks. Complex, stateful, or dynamically changing filtering logic can quickly overwhelm it if implemented directly. SPOE allows us to delegate these heavier computations to dedicated external processes, which can be written in any language, run independently, and scale separately. This keeps HAProxy lean and fast while providing immense flexibility for sophisticated request manipulation.
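Internally, HAProxy talks to the agent using the Stream Processing Offload Protocol (SPOP), a binary, frame-based protocol. After a HELLO handshake, HAProxy sends NOTIFY frames carrying one or more messages and their arguments (for example, the source IP taken from the src sample fetch), and the agent answers with ACK frames containing actions. The available actions are set-var and unset-var, so an agent never returns a status code directly; it sets variables that regular HAProxy rules, such as http-request deny, then act on. This communication happens over a TCP or Unix socket to a server declared in a regular HAProxy backend. The SPOE configuration file referenced by the filter spoe line defines the agent, its timeouts, and the messages it receives.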
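To give a feel for the wire format: SPOP frames are binary, and integers inside them are stored as variable-length values. The sketch below follows my reading of the varint encoding described in HAProxy's SPOE documentation (values below 240 take one byte; larger values spill into continuation bytes). The function names are my own, and a real agent would normally rely on an existing SPOP library rather than hand-rolling this.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an SPOP-style varint."""
    if value < 0:
        raise ValueError("varints are unsigned")
    if value < 240:
        return bytes([value])
    # First byte: low 8 bits with the top four bits forced to 1.
    out = bytearray([(value & 0xFF) | 0xF0])
    value = (value - 240) >> 4
    # Continuation bytes: high bit set means "more bytes follow".
    while value >= 128:
        out.append((value & 0xFF) | 0x80)
        value = (value - 128) >> 7
    out.append(value)
    return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> tuple:
    """Decode a varint at `offset`; return (value, next_offset)."""
    value = data[offset]
    offset += 1
    if value < 240:
        return value, offset
    shift = 4
    while True:
        byte = data[offset]
        offset += 1
        value += byte << shift
        shift += 7
        if byte < 128:
            break
    return value, offset
```

Small values, which dominate in practice (string lengths, frame flags), cost a single byte, which is one reason the protocol stays compact.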
The levers you control sit in two places. In the SPOE configuration file, you choose which events trigger a message (e.g., on-client-session, on-frontend-http-request, on-http-response) and which sample expressions HAProxy sends as arguments for each message. In the agent, you define the logic: what data it checks, how it checks it, and which variables it sets in its reply. Arguments can be any sample fetch HAProxy can evaluate at that point, so you can send specific HTTP headers, cookie values, or even the buffered request body, as long as the encoded message fits within the maximum frame size.
Consider the SPOE configuration file for our example, /etc/haproxy/spoe_request_filter.conf:
[process_request_filter]
# Define the external agent and how HAProxy talks to it
spoe-agent request_filter_agent
    # Messages this agent receives
    messages req_filter
    # Variables set by the agent are prefixed, e.g. txn.reqfilter.blocked
    option var-prefix reqfilter
    # Example timeout values; tune them for your agent
    timeout hello 100ms
    timeout idle 30s
    timeout processing 15ms
    # Backend in haproxy.cfg that lists the agent's address
    use-backend spoe_agents

# Message sent from HAProxy to the agent on each HTTP request
# HAProxy sends the source IP address and the request path
spoe-message req_filter
    args srcip=src uri=path
    event on-frontend-http-request
There are no agent-side actions to declare in this file. The agent replies with set-var actions, and the http-request rules in the main HAProxy configuration decide what to do with the resulting variables.
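And the agent’s decision logic, sketched here as a Lua script, request_filter.lua. How such a script is hosted depends on the SPOP agent implementation you choose; this sketch assumes the agent passes the message arguments positionally and turns the return value into a set-var action for a variable named blocked:
local blocked_ips = {
    ["192.168.1.100"] = true,
    ["192.168.1.101"] = true
}

function process_request_filtering(ha_args)
    local srcip = ha_args[1]
    local uri = ha_args[2]  -- available for path-based rules; unused here
    if blocked_ips[srcip] then
        return true   -- blocked: the agent sets the variable so HAProxy returns 403
    else
        return false  -- not blocked: HAProxy lets the request proceed
    end
end
In this setup, ha_args contains the values sent by HAProxy in the req_filter message. The Lua script looks up srcip in its blocked_ips table and returns true if the address is listed and false otherwise. The agent reports that result to HAProxy as a variable, and an http-request deny rule in the frontend turns a true value into a 403 response. This is a very basic example; in a real-world scenario, blocked_ips would be populated from an external, dynamic source.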
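One way to make the list dynamic is to reload it periodically instead of hard-coding it. The following is a minimal Python sketch under stated assumptions: RefreshingBlocklist and its loader callback are hypothetical names, and the loader stands in for whatever actually fetches the list (a Redis query, a database read, a file).

```python
import time
from typing import Callable, Iterable, Optional, Set


class RefreshingBlocklist:
    """Cache a set of blocked IPs and reload it once the TTL has expired."""

    def __init__(self, loader: Callable[[], Iterable[str]],
                 ttl_seconds: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._loader = loader      # hypothetical: fetches the current list
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so the TTL can be tested
        self._ips: Set[str] = set()
        self._loaded_at: Optional[float] = None

    def is_blocked(self, ip: str) -> bool:
        now = self._clock()
        # Reload on first use or when the cached copy is older than the TTL.
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._ips = set(self._loader())
            self._loaded_at = now
        return ip in self._ips
```

Caching with a short TTL keeps the per-request lookup in memory, which matters because HAProxy waits for the agent's reply only for a bounded processing time.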
SPOE is not limited to simple filtering. Because an agent can set arbitrary variables, and HAProxy rules can read those variables wherever a sample expression is accepted, you can drive complex request handling and even custom load-balancing decisions, for example by having the agent pick a backend name that a use_backend rule then consumes. Agents are ordinary external programs, so this logic can live in any language that has an SPOP implementation, which makes SPOE a powerful way to tailor HAProxy’s behavior to very specific application needs.
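As an illustration of influencing backend selection, an agent could compute a backend name and hand it back as a variable. The Python sketch below is hypothetical throughout: pick_backend and the backend names are invented for this example, and it assumes HAProxy reads the result with a rule along the lines of use_backend %[var(txn.reqfilter.backend)].

```python
import hashlib


def pick_backend(client_ip: str, backends: list) -> str:
    """Map a client IP to one backend name, stably across requests."""
    if not backends:
        raise ValueError("no backends configured")
    # Hash the IP so the same client always lands on the same backend.
    digest = hashlib.sha256(client_ip.encode("ascii")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


# Hypothetical backend names that would have to exist in haproxy.cfg.
pools = ["webservers_a", "webservers_b", "webservers_c"]
```

A simple modulo like this reshuffles most clients when the pool list changes; a consistent-hashing scheme would be the natural refinement if that matters.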
The next concept to explore is using SPOE on the response path: sending a message on the on-http-response event and using the variables the agent returns to inject custom response headers based on external data.