Load Balance HTTP/2 Traffic with Sticky Sessions and Stream Routing (2026)

HTTP/2 traffic can be load balanced with sticky sessions and stream routing, but the common understanding of "sticky sessions" doesn’t fully capture how it works here.

Let’s see it in action. Imagine a busy e-commerce site. A user browses, adds items to their cart, and proceeds to checkout. Each of these actions is a series of HTTP requests. With HTTP/2, these requests can be multiplexed over a single TCP connection.

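# Example of HTTP/2 requests multiplexed on a single connection
# Connection: 192.168.1.100:443
# Stream ID 1: GET /products/123
# Stream ID 3: POST /cart/add (item=456)
# Stream ID 5: GET /cart
# Stream ID 7: GET /products/789
# Stream ID 9: POST /checkout

Here, every request occupies its own stream: a stream carries exactly one request/response exchange and is never reused, and client-initiated streams take odd, increasing IDs. All five streams share one TCP connection, so they come from the same client; a different user would normally arrive on a separate connection. The request on stream 7 is independent of the cart operations and can be in flight at the same time as the others.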
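As a point of reference, the HTTP/2 specification (RFC 9113) gives each request/response exchange its own stream, and streams opened by the client use odd identifiers that only increase. The sketch below is a toy model of that numbering, not a protocol implementation; the `ClientConnection` class and its method names are hypothetical.

```python
from itertools import count
from typing import Dict


class ClientConnection:
    """Toy model of one HTTP/2 connection, seen from the client side."""

    def __init__(self, peer: str) -> None:
        self.peer = peer
        # Client-initiated streams use odd IDs: 1, 3, 5, ...
        self._stream_ids = count(start=1, step=2)
        self.streams: Dict[int, str] = {}

    def send_request(self, method: str, path: str) -> int:
        """Open a new stream for this request and return its ID."""
        stream_id = next(self._stream_ids)
        self.streams[stream_id] = f"{method} {path}"
        return stream_id


conn = ClientConnection("192.168.1.100:443")
ids = [
    conn.send_request("GET", "/products/123"),
    conn.send_request("POST", "/cart/add"),
    conn.send_request("GET", "/cart"),
]
```

After these calls, `ids` holds one distinct stream ID per request, and all three entries in `conn.streams` belong to the same connection.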

The problem we’re solving is ensuring that all requests belonging to a single user’s logical session consistently hit the same backend server, even though HTTP/2 allows these requests to be interleaved. This is crucial for stateful applications, like shopping carts or user authentication, where a server needs to maintain session data.

The traditional approach to sticky sessions involves embedding a session ID in a cookie sent by the server and then having the client resend that cookie with subsequent requests. The load balancer inspects this cookie and directs traffic to the server that originally set it.
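The cookie-based decision described above can be sketched in a few lines. This is a minimal illustration, assuming the load balancer stores a backend index in a cookie named `srv_id`; real products typically store an opaque or hashed value instead, and the `route` function and backend list here are hypothetical.

```python
import itertools
from http.cookies import SimpleCookie
from typing import Optional, Tuple

BACKENDS = ["10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"]
_round_robin = itertools.cycle(range(len(BACKENDS)))


def route(cookie_header: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (backend, Set-Cookie value or None) for one request."""
    if cookie_header:
        cookies = SimpleCookie()
        cookies.load(cookie_header)
        morsel = cookies.get("srv_id")
        if morsel is not None:
            try:
                index = int(morsel.value)
            except ValueError:
                index = -1
            if 0 <= index < len(BACKENDS):
                # Valid sticky cookie: reuse the recorded backend.
                return BACKENDS[index], None
    # No usable cookie: pick a backend and ask the client to remember it.
    index = next(_round_robin)
    return BACKENDS[index], f"srv_id={index}; Path=/; HttpOnly"
```

A request without a cookie gets a backend plus a `Set-Cookie` value. A request that presents a valid cookie is routed to the recorded backend, and no new cookie is issued.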

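With HTTP/2, this cookie-based approach still works. Every request travels on its own stream with its own header block (compressed with HPACK, but fully readable by a proxy that terminates HTTP/2), so a layer-7 load balancer can inspect the cookie on each request. What changes is the granularity of the routing decision: a layer-4 load balancer picks a backend once per TCP connection, so every stream multiplexed on that connection lands on the same server, while a layer-7 load balancer can route each stream independently.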

This is where "stream routing" comes in, and it’s not about routing based on the stream ID itself, but rather about how the load balancer identifies and preserves the session across multiple streams and connections.

The modern solution often involves the load balancer using a technique called "connection affinity" or "session persistence" that’s aware of HTTP/2’s multiplexing. Instead of just looking at cookies on every request, the load balancer establishes a mapping between a client’s connection (or a set of connections from the same IP over a short period) and a specific backend server.

When a new HTTP/2 connection comes in without any affinity information, the load balancer picks a backend server. It then records that choice, typically by setting a cookie (or injecting a header) that identifies the chosen backend, and remembers the mapping. If the client later opens a new HTTP/2 connection while the session is still active (e.g., within a configured timeout), the load balancer uses the cookie or its own mapping table to direct that connection to the same backend server. The backend, in turn, recognizes the user from its own application-level session identifier.
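The mapping-with-timeout behaviour can be sketched as a small affinity table. This is a simplified illustration under stated assumptions: the client key (for example a source IP or a cookie value), the round-robin fallback, and the `AffinityTable` name are all hypothetical, and a production load balancer would also evict stale entries and check backend health.

```python
import time
from typing import Callable, Dict, List, Tuple


class AffinityTable:
    """Maps a client key to a backend, forgetting it after an idle timeout."""

    def __init__(
        self,
        backends: List[str],
        ttl_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._backends = list(backends)
        self._ttl = ttl_seconds
        self._clock = clock
        self._next = 0
        # client key -> (backend, expiry timestamp)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def backend_for(self, client_key: str) -> str:
        now = self._clock()
        entry = self._entries.get(client_key)
        if entry is not None and entry[1] > now:
            backend = entry[0]  # session still active: stay on the same server
        else:
            # New or expired session: choose the next backend round-robin.
            backend = self._backends[self._next % len(self._backends)]
            self._next += 1
        # Each lookup refreshes the idle timeout.
        self._entries[client_key] = (backend, now + self._ttl)
        return backend
```

Passing the clock in as a parameter makes the timeout easy to exercise in tests without sleeping.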

Here’s a simplified conceptual configuration for Nginx, a popular load balancer:

http {
    upstream backend_servers {
        server 10.0.0.1:8080;
        server 10.0.0.2:8080;
        server 10.0.0.3:8080;

        # Session persistence. The "sticky" directive is part of the
        # commercial NGINX Plus; open-source nginx offers ip_hash or
        # "hash ... consistent" instead.
        # nginx sets the srv_id cookie on the first response and reads it on
        # every later request, whichever HTTP/2 stream or connection carries it.
        sticky cookie srv_id expires=1h httponly;
    }

    server {
        # Browsers only speak HTTP/2 over TLS, so listen on 443.
        listen 443 ssl;
        http2 on;  # nginx 1.25.1+; older versions use "listen 443 ssl http2;"
        server_name example.com;

        # Placeholder certificate paths; substitute your own.
        ssl_certificate     /etc/nginx/ssl/example.com.crt;
        ssl_certificate_key /etc/nginx/ssl/example.com.key;

        location / {
            proxy_pass http://backend_servers;
            # In this setup HTTP/2 ends at nginx; the upstream hop uses HTTP/1.1.
            proxy_http_version 1.1;
            proxy_set_header Connection "";  # clear the hop-by-hop header
            proxy_set_header Host $host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto $scheme;

            # The sticky cookie only selects a backend. The application must
            # still issue and manage its own session IDs (cookies or tokens).
        }
    }
}

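In this Nginx example, sticky cookie srv_id expires=1h httponly; instructs Nginx to issue a cookie named srv_id and use it to send later requests from the same client to the same backend server. The expires=1h parameter sets the cookie’s lifespan, and the httponly flag prevents JavaScript from accessing the cookie. Note that the sticky directive is available only in the commercial NGINX Plus; with open-source Nginx, the closest built-in options are ip_hash and hash.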

The "stream routing" aspect isn’t about the load balancer reading stream IDs. Instead, it’s about the load balancer’s ability to maintain the session affinity for the entire set of HTTP/2 streams that belong to a single logical user session. The load balancer ensures that when a client reconnects or establishes a new connection that’s part of an ongoing session, it’s directed to the same backend server that holds that session’s state. The backend application, using its own session management (e.g., via application-level cookies or tokens), then correctly associates the incoming HTTP/2 streams with the existing session.
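On the backend side, the association between streams and a session might look like the sketch below. It assumes the application has already extracted a session ID from its own cookie or token; the `SessionStore` class and the cart model are hypothetical and stand in for whatever session framework the application uses.

```python
from collections import defaultdict
from typing import Dict, List


class SessionStore:
    """In-memory session state held by a single backend server."""

    def __init__(self) -> None:
        self._carts: Dict[str, List[str]] = defaultdict(list)

    def add_to_cart(self, stream_id: int, session_id: str, item: str) -> List[str]:
        # The stream ID plays no part in the lookup: requests that arrive on
        # different streams (or connections) share state via the session ID.
        self._carts[session_id].append(item)
        return list(self._carts[session_id])


store = SessionStore()
store.add_to_cart(stream_id=1, session_id="s1", item="456")
cart = store.add_to_cart(stream_id=3, session_id="s1", item="789")
```

Because this state lives in one server's memory, it is only useful if the load balancer keeps sending that session to the same backend, which is the job of the affinity mechanism.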

The counterintuitive point is that, with HTTP/2, the load balancer’s "stickiness" is less about inspecting every single request for a session identifier and more about mapping a client’s connection lifecycle to a backend server. The work of tying individual streams to a session then falls to the application itself, which uses session identifiers to correlate the requests the load balancer forwards over that sticky connection.

The next logical problem you’ll encounter is how to manage session timeouts and gracefully handle backend server failures when stickiness is enabled.