Maintaining persistent WebSocket connections across a load-balanced environment is surprisingly tricky because the protocol itself is designed to keep a single, long-lived connection open between a client and a specific server.
Let’s see it in action. Imagine a simple WebSocket server that just echoes back whatever it receives.
# echo_server.py
import asyncio
import websockets

async def echo(websocket):
    # Recent releases of the websockets library call the handler with the
    # connection only; the old second `path` argument is no longer passed.
    async for message in websocket:
        print(f"Received message: {message}")
        await websocket.send(f"Echo: {message}")

async def main():
    # Serve until the process is stopped.
    async with websockets.serve(echo, "localhost", 8765):
        await asyncio.Future()

asyncio.run(main())
And a basic client:
// client.js
const ws = new WebSocket("ws://localhost:8765");

ws.onopen = () => {
    console.log("Connected!");
    ws.send("Hello Server!");
};

ws.onmessage = (event) => {
    console.log("Message from server:", event.data);
};

ws.onclose = () => {
    console.log("Disconnected.");
};

ws.onerror = (error) => {
    console.error("WebSocket Error:", error);
};
If you run this directly, the client connects to localhost:8765, and all messages go back and forth. Easy.
Now, introduce a load balancer in front of multiple instances of this echo_server.py. An established WebSocket rides on a single TCP connection, so the load balancer has to keep that connection pinned to the server instance that accepted the handshake for as long as it stays open. Any later connection from the same client, such as a reconnect after a drop, should also reach that instance if the server keeps per-client state in memory. If it doesn’t, the new server instance won’t have the session context the client expects.
The fundamental problem is that load balancers operate at either Layer 4 (TCP) or Layer 7 (HTTP), and WebSockets straddle both. A WebSocket starts as an HTTP/1.1 request carrying an Upgrade header; once the server answers with 101 Switching Protocols, the same TCP connection carries WebSocket frames instead of HTTP. A Layer 4 load balancer just forwards the TCP stream, but if the connection drops and the client has to re-establish it, the load balancer might send the new connection to a different backend server. A Layer 7 load balancer can inspect the HTTP upgrade request, but it needs to be configured to pass the upgrade through and keep the proxied connection open for the entire duration of the WebSocket, not just the initial handshake.
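To make the handshake concrete, here is a minimal sketch of the two checks involved: recognising an upgrade request from its headers (roughly what a Layer 7 balancer has to do), and computing the Sec-WebSocket-Accept value the server returns, as defined in RFC 6455. The function names are hypothetical and the sample key is the one used in the RFC; a real proxy does considerably more validation.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 defines for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"

def is_websocket_upgrade(headers: dict[str, str]) -> bool:
    """Return True if the request headers ask for a WebSocket upgrade."""
    lowered = {name.lower(): value for name, value in headers.items()}
    # Connection may hold several comma-separated tokens, e.g. "keep-alive, Upgrade".
    tokens = [t.strip().lower() for t in lowered.get("connection", "").split(",")]
    return lowered.get("upgrade", "").lower() == "websocket" and "upgrade" in tokens

def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept header for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")

request_headers = {
    "Host": "your.domain.com",
    "Upgrade": "websocket",
    "Connection": "Upgrade",
    "Sec-WebSocket-Key": "dGhlIHNhbXBsZSBub25jZQ==",  # sample key from RFC 6455
    "Sec-WebSocket-Version": "13",
}

print(is_websocket_upgrade(request_headers))               # True
print(accept_value(request_headers["Sec-WebSocket-Key"]))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

If a proxy drops the Upgrade or Connection header on the way to the backend, the first check fails on the server side and the connection never leaves plain HTTP.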
Here’s how you typically solve this. Most modern load balancers, like Nginx, HAProxy, or cloud provider load balancers (AWS ELB/ALB, Google Cloud Load Balancing), support "sticky sessions" or "session affinity" for TCP connections. For WebSockets, this means configuring the load balancer to use a method that ensures a client’s IP address (or a cookie, though less common for raw WebSockets) consistently maps to the same backend server.
If you’re using Nginx as a reverse proxy, you’d configure it like this:
# nginx.conf
http {
    upstream websocket_backend {
        ip_hash; # This is key for sticky sessions based on client IP
        server 192.168.1.100:8765;
        server 192.168.1.101:8765;
        server 192.168.1.102:8765;
    }

    server {
        listen 80;
        server_name your.domain.com;

        location / {
            proxy_pass http://websocket_backend;
            proxy_http_version 1.1; # Important for HTTP/1.1 upgrade
            proxy_set_header Upgrade $http_upgrade;
            proxy_set_header Connection "upgrade";
            proxy_set_header Host $host;
            proxy_read_timeout 86400; # Keep connection open for a long time
            proxy_connect_timeout 60s;
        }
    }
}
The ip_hash directive in the upstream block tells Nginx to hash the client’s IP address (for IPv4, only the first three octets) and use that hash to consistently select a backend server. This ensures that subsequent connections from the same client IP will be directed to the same server, as long as that server is available. Inside the location block, proxy_http_version 1.1; together with the Upgrade and Connection "upgrade" headers is crucial for the WebSocket upgrade handshake, because Nginx does not forward these hop-by-hop headers to the backend on its own. Setting a long proxy_read_timeout is also good practice: the default is 60 seconds, after which Nginx closes a proxied connection on which the backend has sent nothing.
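The idea behind IP-based affinity can be sketched in a few lines of Python. This is an illustration only, not Nginx's actual algorithm: SHA-256 stands in for the hash function, and hash_key and pick_backend are hypothetical names. The only detail borrowed from Nginx is that IPv4 addresses are keyed on their first three octets.

```python
import hashlib
import ipaddress

BACKENDS = ["192.168.1.100:8765", "192.168.1.101:8765", "192.168.1.102:8765"]

def hash_key(client_ip: str) -> bytes:
    """Key the hash on the first three octets for IPv4, the full address for IPv6."""
    addr = ipaddress.ip_address(client_ip)
    return addr.packed[:3] if addr.version == 4 else addr.packed

def pick_backend(client_ip: str, backends=BACKENDS) -> str:
    """Deterministically map a client address to one backend."""
    digest = hashlib.sha256(hash_key(client_ip)).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]

# The same client always lands on the same backend...
print(pick_backend("203.0.113.7") == pick_backend("203.0.113.7"))    # True
# ...and so does every other client in the same /24.
print(pick_backend("203.0.113.7") == pick_backend("203.0.113.200"))  # True
```

The second comparison shows the trade-off of this approach: many clients behind one NAT or corporate proxy share a key, so they all end up on the same backend.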
HAProxy offers a similar capability. In its configuration, you define your backend servers and choose a balancing algorithm. With balance roundrobin, persistence is not implicit; it has to be added explicitly, for example with a cookie or a stick table. A common and simpler pattern is to use balance source, which hashes the client’s source IP address and is analogous to Nginx’s ip_hash.
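# haproxy.cfg
defaults
    mode http
    # Example timeout values; tune them for your traffic
    timeout connect 5s
    timeout client  60s
    timeout server  60s
    timeout tunnel  1h    # Applies once a connection has been upgraded

frontend http_frontend
    bind *:80
    default_backend websocket_backend

backend websocket_backend
    balance source    # Similar to ip_hash
    option httpchk    # Backends must answer plain HTTP checks with 2xx/3xx
    server ws1 192.168.1.100:8765 check
    server ws2 192.168.1.101:8765 check
    server ws3 192.168.1.102:8765 check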
The balance source directive ensures that requests from the same source IP address are consistently sent to the same server. HAProxy also handles the HTTP upgrade gracefully by default when configured for HTTP proxying.
Cloud load balancers have their own methods. AWS Application Load Balancer (ALB), for instance, supports WebSockets natively and offers sticky sessions. Stickiness is enabled as an attribute of the target group, and it uses a cookie to track sessions. The ALB handles the upgrade request and keeps the connection attached to the chosen target for as long as it stays open, subject to the load balancer’s idle timeout. Google Cloud Load Balancing also offers backend service configurations for session affinity.
The core mechanic is that the load balancer has to accommodate the entire lifecycle of the WebSocket connection, not just the initial HTTP request. It must pick a backend server at handshake time and keep the client pinned to it for the life of the TCP connection that underlies the WebSocket. If the load balancer is configured with a short idle timeout, it will close quiet connections; if it doesn’t support session affinity, every reconnect may land on a server that knows nothing about the client.
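A small simulation makes the difference visible. It is a toy model under stated assumptions: each backend keeps session state only in its own memory, and Backend, round_robin, source_hash, and reconnect_finds_state are hypothetical names invented for this sketch, not part of any load balancer's API.

```python
import hashlib
import itertools

class Backend:
    """A server instance whose session state lives only in its own memory."""
    def __init__(self, name):
        self.name = name
        self.sessions = {}

def make_backends():
    return [Backend("ws1"), Backend("ws2"), Backend("ws3")]

def round_robin(backends):
    """No affinity: each new connection goes to the next backend in turn."""
    rotation = itertools.cycle(backends)
    return lambda client_ip: next(rotation)

def source_hash(backends):
    """Affinity: the same client IP always maps to the same backend."""
    def pick(client_ip):
        digest = hashlib.sha256(client_ip.encode("ascii")).digest()
        return backends[int.from_bytes(digest[:8], "big") % len(backends)]
    return pick

def reconnect_finds_state(pick, client_ip):
    """Connect, store state, reconnect, and report whether the state is found."""
    first = pick(client_ip)
    first.sessions[client_ip] = {"room": "lobby"}
    second = pick(client_ip)  # the connection dropped and the client reconnects
    return client_ip in second.sessions

print(reconnect_finds_state(round_robin(make_backends()), "203.0.113.7"))  # False
print(reconnect_finds_state(source_hash(make_backends()), "203.0.113.7"))  # True
```

With round robin the reconnect lands on the next server in the rotation, which has no record of the client. Hashing on the source address sends it back to the instance that still holds the state.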
Once you have session affinity working, the next issue you are likely to encounter is keeping the backend servers themselves healthy and available: when a pinned server fails or is taken out of rotation, its clients have to reconnect and will land on a different instance.