Sticky sessions, also known as session affinity or session persistence, are the mechanism that ensures a user’s requests consistently hit the same backend server.
Let’s see this in action. Imagine a simple web application where users log in and perform actions. Without sticky sessions, the load balancer might send the first request to backend-1 and the second request to backend-2. If the user’s session state (like their login status or shopping cart contents) is only stored on backend-1, the request to backend-2 will fail because it doesn’t have that context.
# User's first request hits backend-1
curl -b cookies.txt -c cookies.txt http://your-load-balancer.com/login -d "username=user&password=password"
# Response from backend-1: Session ID is set in cookies.txt
# User's second request, intended to stay on backend-1
curl -b cookies.txt -c cookies.txt http://your-load-balancer.com/view_cart
# If load balancer sends this to backend-2, it will likely fail!
Sticky sessions solve this by instructing the load balancer to look at a specific piece of information in the incoming request – usually a cookie or an IP address – and always route subsequent requests with that same identifier to the server that originally handled it.
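To make the routing rule concrete, here is a minimal sketch of cookie-based stickiness in Python. It is a toy model, not how any particular load balancer is implemented: the StickyRouter class, the route method, and the SERVERID cookie name are all hypothetical and chosen only for illustration.

```python
import itertools


class StickyRouter:
    """Toy load balancer: round-robin for new clients, cookie lookup for returning ones."""

    COOKIE_NAME = "SERVERID"  # hypothetical stickiness cookie name

    def __init__(self, backends):
        self.backends = list(backends)
        self._next_backend = itertools.cycle(self.backends)

    def route(self, cookies):
        """Return (backend, cookies_to_set) for a request carrying `cookies`."""
        pinned = cookies.get(self.COOKIE_NAME)
        if pinned in self.backends:
            # Returning client: honor the existing pin, set nothing new.
            return pinned, {}
        # New client (or stale pin): pick the next backend and pin it.
        backend = next(self._next_backend)
        return backend, {self.COOKIE_NAME: backend}


router = StickyRouter(["backend-1", "backend-2"])
jar = {}  # plays the role of cookies.txt from the curl example

first, set_cookies = router.route(jar)   # login request
jar.update(set_cookies)                  # client stores the stickiness cookie
second, _ = router.route(jar)            # view_cart request, same cookie jar
```

With the cookie in the jar, the second request lands on the same backend as the first, while a client with an empty jar would be handed the next server in the rotation.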
How it Works Internally
Load balancers typically implement sticky sessions in a few primary ways:
- Cookie-Based Persistence (Most Common): When a client's first request arrives, the load balancer picks a backend and adds a special "stickiness" cookie to the response that identifies that server. Subsequent requests from that client carry the cookie, so the load balancer routes them back to the same backend. A cookie generated by the load balancer is often named something like AWSALB (for AWS Application Load Balancers) or SERVERID.
- Source IP Address Persistence: The load balancer uses the client's source IP address as the sticky identifier. All requests originating from the same IP address are directed to the same backend server. This is simpler to implement but has limitations, especially in environments where many users share a single public IP (e.g., corporate networks, mobile carriers), because all of them end up on the same server.
- Application-Generated Cookie Persistence: Similar to cookie-based persistence, but the application itself generates a unique session ID and sets it in a cookie. The load balancer is configured to recognize this specific cookie and remembers which backend issued each value, so later requests carrying that value return to the same server.
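Source IP persistence is often implemented by hashing the client address onto the list of backends. The sketch below assumes a plain hash-modulo scheme. The function name pick_backend_by_ip and the example addresses are hypothetical, and real load balancers may use different hash functions or consistent hashing.

```python
import hashlib


def pick_backend_by_ip(client_ip, backends):
    """Map a client IP to a backend deterministically (hash modulo pool size)."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["backend-1", "backend-2", "backend-3"]

# Two different users behind the same corporate NAT present the same public IP,
# so they are always routed to the same backend.
office_user_a = pick_backend_by_ip("203.0.113.7", backends)
office_user_b = pick_backend_by_ip("203.0.113.7", backends)
```

Because the mapping depends only on the IP and the pool size, no per-client state is needed. The trade-off is that every user behind a shared address is pinned to the same server, and adding or removing a backend changes the mapping for many clients at once.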
The Levers You Control
When configuring sticky sessions on your load balancer (e.g., Nginx, HAProxy, AWS ELB/ALB, Azure Load Balancer), you’ll typically adjust:
- Persistence Type: Choose between cookie-based, IP-based, or other methods.
- Cookie Name (if applicable): Specify the name of the cookie the load balancer should look for or set. For example, in NGINX Plus, sticky cookie srv_id expires=1h; tells it to set and track a cookie named srv_id.
- Cookie Lifetime/Timeout: How long the stickiness should remain active. This is crucial; if the cookie expires too quickly, stickiness is lost mid-session. If it never expires, a user can stay pinned to a server long after it should have been drained, or to one that has gone down.
- Insertion Method (if applicable): Whether the load balancer inserts its own stickiness cookie or relies on the application to set one.
For example, in HAProxy, you might configure it like this:
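backend my_app
    balance roundrobin
    cookie SRVID insert indirect nocache
    server app1 192.168.1.10:80 check cookie app1
    server app2 192.168.1.11:80 check cookie app2
Here, cookie SRVID insert indirect nocache tells HAProxy to use a cookie named SRVID. insert means HAProxy adds the cookie to the response when the client does not already present a valid one. indirect means HAProxy strips the cookie from the request before forwarding it, so the persistence mechanism stays invisible to the application. nocache marks a response as non-cacheable whenever a cookie has to be inserted, so that a shared cache between the client and HAProxy does not hand one client's cookie to another. The cookie app1 and cookie app2 values on the server lines are what HAProxy stores in SRVID to identify each backend; without them, the cookie has nothing to map back to.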
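The lifetime lever is easiest to see in a load balancer that keeps its own table of pins. The following is a rough Python sketch of such a table with a time-to-live; the StickyTable class and its pin and lookup methods are hypothetical names, and the injectable clock exists only so the expiry can be demonstrated without waiting.

```python
import time


class StickyTable:
    """Toy stickiness table: each pin expires ttl_seconds after it was created."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}  # client key -> (backend, expiry timestamp)

    def pin(self, key, backend):
        self._entries[key] = (backend, self.clock() + self.ttl)

    def lookup(self, key):
        """Return the pinned backend, or None if there is no pin or it expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        backend, expires_at = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # stickiness lost; caller must re-balance
            return None
        return backend


# A fake clock (assumption for the demo) lets us step time forward by hand.
now = [0.0]
table = StickyTable(ttl_seconds=30, clock=lambda: now[0])
table.pin("client-a", "backend-1")

within_ttl = table.lookup("client-a")   # still pinned
now[0] = 31.0
after_ttl = table.lookup("client-a")    # pin has expired
```

A short TTL means clients are re-balanced often and may lose server-local session state; a long one keeps them pinned but slows down draining and rebalancing.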
The most surprising thing about sticky sessions is that they often degrade load balancing effectiveness, which matters most in highly available systems. While they ensure session state is maintained, they can create "hot spots" on specific servers when long-lived or heavy sessions pile up unevenly, or worse, leave users stranded if the server they're "stuck" to becomes unavailable. This is why many modern applications are designed to store session state externally (e.g., in Redis or a database) to eliminate the need for sticky sessions altogether, allowing the load balancer to distribute traffic evenly and making individual server failures less impactful.
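As an illustration of the external-state approach, the sketch below uses an in-memory dictionary as a stand-in for Redis or a database. All names here (SharedSessionStore, SessionBackend, and their methods) are hypothetical, and a real deployment would replace the dictionary with calls to an actual store.

```python
import uuid


class SharedSessionStore:
    """In-memory stand-in for an external store such as Redis or a database."""

    def __init__(self):
        self._sessions = {}

    def create(self, state):
        session_id = uuid.uuid4().hex
        self._sessions[session_id] = dict(state)
        return session_id

    def load(self, session_id):
        return self._sessions[session_id]  # raises KeyError for unknown sessions


class SessionBackend:
    """Toy application server that keeps no session state of its own."""

    def __init__(self, name, store):
        self.name = name
        self.store = store

    def login(self, username):
        return self.store.create({"user": username, "cart": []})

    def add_to_cart(self, session_id, item):
        self.store.load(session_id)["cart"].append(item)

    def view_cart(self, session_id):
        return list(self.store.load(session_id)["cart"])


store = SharedSessionStore()
backend_1 = SessionBackend("backend-1", store)
backend_2 = SessionBackend("backend-2", store)

session_id = backend_1.login("user")          # first request hits backend-1
backend_2.add_to_cart(session_id, "book")     # next request lands on backend-2
cart_seen_by_backend_1 = backend_1.view_cart(session_id)
```

Because both servers read and write the same store, it no longer matters which one the load balancer picks, and losing one server does not lose the session.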
The next concept you’ll likely grapple with is how to gracefully handle backend server failures when sticky sessions are still a requirement, often involving health checks and rebalancing strategies.