HAProxy can load balance gRPC traffic, but it doesn’t understand gRPC’s protocol-level load balancing features.

Let’s see HAProxy in action. Imagine you have three gRPC backend servers running on 192.168.1.101:50051, 192.168.1.102:50051, and 192.168.1.103:50051. You want HAProxy to listen on 127.0.0.1:50050 and distribute traffic evenly.

Here’s a minimal HAProxy configuration:

global
    log /dev/log    local0
    maxconn 4096
    daemon

defaults
    log     global
    mode    tcp
    option  tcplog
    option  dontlognull
    timeout connect 5000
    timeout client  50000
    timeout server  50000

frontend grpc_frontend
    bind 127.0.0.1:50050
    default_backend grpc_backend

backend grpc_backend
    balance roundrobin
    option tcp-check
    tcp-check connect port 50051
    server srv1 192.168.1.101:50051 check
    server srv2 192.168.1.102:50051 check
    server srv3 192.168.1.103:50051 check

With this set up, HAProxy acts as a simple TCP proxy. When a gRPC client connects to 127.0.0.1:50050, HAProxy selects one of the backend servers (srv1, srv2, or srv3) based on the roundrobin algorithm and forwards the TCP connection to it. The tcp-check ensures that HAProxy only considers servers that are listening on port 50051 as healthy.

The problem HAProxy solves here is straightforward: distributing incoming requests across multiple backend gRPC servers to prevent any single server from being overloaded and to provide high availability. If one server fails, HAProxy will stop sending traffic to it and continue distributing to the healthy ones.

Internally, HAProxy operates at Layer 4 (TCP) for this configuration. It doesn’t inspect the gRPC messages themselves (which are typically Protocol Buffers over HTTP/2). It sees a TCP connection, forwards it to a backend, and relays data back and forth. This simplicity is its strength, but it also means it lacks gRPC-aware routing.

The balance roundrobin directive tells HAProxy to cycle through the listed servers sequentially. Other balancing algorithms like leastconn (send to the server with the fewest active connections) or source (hash the client’s source IP to consistently send a client to the same server) are also available and work at the TCP connection level.

The option tcp-check and tcp-check connect port 50051 lines are crucial for health checking. HAProxy will attempt to establish a TCP connection to port 50051 on each backend server. If the connection succeeds, the server is marked as healthy. If it fails, HAProxy will remove it from the pool of available servers.

A key limitation is that HAProxy, in this TCP mode, cannot leverage gRPC’s built-in load balancing mechanisms. gRPC itself has features for client-side load balancing and service discovery that allow clients to be aware of backend server health and distribution strategies. HAProxy, by default, is oblivious to these.

When HAProxy receives a client request on 127.0.0.1:50050, it performs a TCP connection establishment with one of the backend servers. The actual gRPC request (which is an HTTP/2 frame) is then encapsulated within this TCP connection. HAProxy simply forwards these bytes. If you’re using balance source, HAProxy will hash the client’s IP address. This means that a given client IP will always be directed to the same backend server. This can be problematic if you have many clients behind a single NAT gateway, as they will all hit the same backend.

The next challenge you’ll run into is managing sticky sessions when you need more advanced routing based on gRPC service or method names.

Want structured learning?

Take the full Haproxy course →