Load Balance and Proxy gRPC Traffic with Envoy (2026)

Envoy doesn’t just forward gRPC bytes; it terminates HTTP/2 and balances individual RPCs across backends, acting as a gRPC-aware Layer 7 proxy.

Let’s see Envoy in action. Imagine we have two simple gRPC services, greeter-a and greeter-b, both exposing a SayHello RPC. We want to load balance requests to these services using Envoy.

First, our gRPC services. They’re running on localhost:50051 and localhost:50052 respectively. Here’s a snippet of what greeter-a might look like (in Go, for illustration):

package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
	pb "path/to/your/proto" // Assuming you have a proto file
)

const (
	port = ":50051"
)

type server struct {
	pb.UnimplementedGreeterServer
}

func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	log.Printf("Received: %v from greeter-a", in.GetName())
	return &pb.HelloReply{Message: "Hello " + in.GetName() + " from greeter-a"}, nil
}

func main() {
	lis, err := net.Listen("tcp", port)
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &server{})
	log.Printf("server listening at %v", lis.Addr())
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}

greeter-b would be identical, just listening on port 50052 and returning a message from greeter-b.

Now, Envoy. We’ll configure it to listen on port 9090 (reachable locally as localhost:9090) and route gRPC requests to our two backend services. Here’s a minimal envoy.yaml:

static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9090
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: greeter_cluster
                  # No upgrade_configs needed: native gRPC clients start in HTTP/2
                  # (prior knowledge), which codec_type: AUTO detects on the listener.
          http_filters:
          - name: envoy.filters.http.grpc_web
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: greeter_cluster
    connect_timeout: 5s
    type: STRICT_DNS # LOGICAL_DNS allows only a single endpoint; use STATIC for fixed IPs
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: greeter_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: greeter-a # Or localhost if running locally
                port_value: 50051
        - endpoint:
            address:
              socket_address:
                address: greeter-b # Or localhost if running locally
                port_value: 50052
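    # Required for gRPC: speak HTTP/2 to the upstream servers.
    # (The bare cluster-level http2_protocol_options field is deprecated in the v3 API.)
    typed_extension_protocol_options:
      envoy.extensions.upstreams.http.v3.HttpProtocolOptions:
        "@type": type.googleapis.com/envoy.extensions.upstreams.http.v3.HttpProtocolOptions
        explicit_http_config:
          http2_protocol_options: {}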

With this setup, you can point a gRPC client at localhost:9090 and call any method on the backends. On the wire, each RPC is an HTTP/2 request whose path has the form /package.Service/Method (for example /helloworld.Greeter/SayHello, if the proto package is named helloworld). Envoy accepts the connection as HTTP/2 because codec_type: AUTO detects the protocol, matches the request against the "/" route, and load balances it to either greeter-a or greeter-b using the ROUND_ROBIN policy.
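To make the shape of a request concrete, here is a minimal sketch of the two things a gRPC-aware proxy looks at: the request path and the content-type header. The package name helloworld and both helper names are assumptions for illustration; the check is a simplified approximation, not Envoy’s actual implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// fullMethodPath builds the HTTP/2 :path for an RPC.
// gRPC uses "/<package>.<Service>/<Method>"; the package part is
// omitted when the proto file declares no package.
func fullMethodPath(pkg, service, method string) string {
	if pkg == "" {
		return "/" + service + "/" + method
	}
	return "/" + pkg + "." + service + "/" + method
}

// isGRPCContentType reports whether a content-type value marks a native
// gRPC request: "application/grpc", optionally followed by "+<codec>"
// or ";<parameters>".
func isGRPCContentType(ct string) bool {
	const base = "application/grpc"
	ct = strings.ToLower(strings.TrimSpace(ct))
	if !strings.HasPrefix(ct, base) {
		return false
	}
	rest := ct[len(base):]
	return rest == "" || rest[0] == '+' || rest[0] == ';'
}

func main() {
	// "helloworld" is an assumed proto package name.
	fmt.Println(fullMethodPath("helloworld", "Greeter", "SayHello")) // /helloworld.Greeter/SayHello
	fmt.Println(isGRPCContentType("application/grpc+proto"))         // true
	fmt.Println(isGRPCContentType("application/grpc-web"))           // false
	fmt.Println(isGRPCContentType("application/json"))               // false
}
```

Note that application/grpc-web is deliberately not treated as native gRPC: that content type belongs to browser clients, which is the case the grpc_web filter exists for.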

Two settings make this work for gRPC. On the listener side, codec_type: AUTO lets Envoy detect HTTP/2 from the start of the client connection. Native gRPC clients speak HTTP/2 from the first byte (prior knowledge over cleartext, or ALPN over TLS) rather than upgrading from HTTP/1.1, so no upgrade_configs entry (such as upgrade_type: h2c) is needed on the route. On the cluster side, http2_protocol_options: {} tells Envoy to use HTTP/2 for upstream connections, which gRPC requires. The envoy.filters.http.grpc_web filter translates requests from clients that cannot speak native gRPC (like web browsers); it passes native gRPC traffic through unchanged, so here it is optional but harmless. The router filter then forwards the request to the selected cluster.

The problem this solves is providing a single, stable entry point for your gRPC services, abstracting away the individual instances and their locations. Envoy can handle service discovery, load balancing, health checking, TLS termination, rate limiting, and even tracing, all while understanding the specifics of the gRPC protocol.

Internally, Envoy maintains a connection pool for each upstream host in a cluster. When a gRPC request arrives, it’s framed as an HTTP/2 stream. Envoy selects an upstream endpoint based on the configured lb_policy and forwards the stream over an existing or new HTTP/2 connection in its pool. The connect_timeout only bounds how long Envoy waits to establish a new upstream connection. How long a request may run is governed separately by the route’s timeout, which defaults to 15 seconds and usually needs to be raised or disabled for long-lived gRPC streams.

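The cluster type matters here. A LOGICAL_DNS cluster accepts only a single endpoint and connects to the first IP address its name resolves to, so it cannot list both greeter-a and greeter-b. With type: STRICT_DNS, Envoy periodically resolves every listed name (greeter-a, greeter-b), treats each returned IP as a separate host, and lb_policy: ROUND_ROBIN distributes requests evenly among them. If you were using a service discovery system like Kubernetes, you’d typically use type: EDS and configure Envoy to fetch endpoints dynamically.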
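The difference between the two DNS cluster types can be sketched as a toy model. This is not Envoy code: the lookup table stands in for real DNS answers, the IPs are documentation addresses chosen for illustration, and the function names are hypothetical.

```go
package main

import "fmt"

// endpoint is one configured name/port pair, as in load_assignment.
type endpoint struct {
	name string
	port int
}

// strictDNSHosts models STRICT_DNS: every IP returned for every
// configured name becomes its own load-balancing host.
func strictDNSHosts(eps []endpoint, lookup map[string][]string) []string {
	var hosts []string
	for _, ep := range eps {
		for _, ip := range lookup[ep.name] {
			hosts = append(hosts, fmt.Sprintf("%s:%d", ip, ep.port))
		}
	}
	return hosts
}

// logicalDNSHost models LOGICAL_DNS: a single configured name, and only
// the first resolved IP is used when a new connection is opened.
func logicalDNSHost(ep endpoint, lookup map[string][]string) (string, bool) {
	ips := lookup[ep.name]
	if len(ips) == 0 {
		return "", false
	}
	return fmt.Sprintf("%s:%d", ips[0], ep.port), true
}

func main() {
	// Made-up DNS answers (192.0.2.0/24 is reserved for documentation).
	lookup := map[string][]string{
		"greeter-a": {"192.0.2.10", "192.0.2.11"},
		"greeter-b": {"192.0.2.20"},
	}
	eps := []endpoint{{"greeter-a", 50051}, {"greeter-b", 50052}}

	fmt.Println(strictDNSHosts(eps, lookup)) // [192.0.2.10:50051 192.0.2.11:50051 192.0.2.20:50052]

	host, ok := logicalDNSHost(eps[0], lookup)
	fmt.Println(host, ok) // 192.0.2.10:50051 true
}
```

In this model, STRICT_DNS yields three hosts for the load balancer to rotate through, while LOGICAL_DNS yields one at a time, which suits large external services more than a fixed pair of backends.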

A common misconception is that Envoy simply proxies TCP connections for gRPC. With this configuration it does not. Envoy terminates HTTP/2 from the client and opens its own HTTP/2 connections to the upstream gRPC services, so it can balance each RPC independently and multiplex many RPCs over a single upstream connection. This is why http2_protocol_options: {} is non-negotiable for gRPC proxying. Without it, Envoy would speak HTTP/1.1 to the backends, and because gRPC servers require HTTP/2, the calls would fail.
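Why per-request balancing matters can be shown with a small simulation. It is a simplified model under stated assumptions: a strict round-robin rotation and hypothetical function names. Envoy’s real ROUND_ROBIN policy runs independently on each worker thread, so actual distributions are approximately rather than exactly even.

```go
package main

import "fmt"

// balancePerConnection models an L4 (TCP) proxy: a backend is chosen once
// per client connection, and every RPC on that connection goes to it.
func balancePerConnection(backends []string, connections, rpcsPerConn int) map[string]int {
	counts := make(map[string]int)
	for c := 0; c < connections; c++ {
		backend := backends[c%len(backends)]
		counts[backend] += rpcsPerConn
	}
	return counts
}

// balancePerRequest models an L7 (HTTP/2-aware) proxy: each RPC is its own
// HTTP/2 stream and is assigned round-robin, whatever connection it used.
func balancePerRequest(backends []string, connections, rpcsPerConn int) map[string]int {
	counts := make(map[string]int)
	next := 0
	for c := 0; c < connections; c++ {
		for r := 0; r < rpcsPerConn; r++ {
			counts[backends[next%len(backends)]]++
			next++
		}
	}
	return counts
}

func main() {
	backends := []string{"greeter-a", "greeter-b"}

	// One long-lived client connection carrying 100 RPCs,
	// which is typical for a gRPC client reusing its channel.
	fmt.Println(balancePerConnection(backends, 1, 100)) // map[greeter-a:100]
	fmt.Println(balancePerRequest(backends, 1, 100))    // map[greeter-a:50 greeter-b:50]
}
```

Because gRPC clients tend to hold one connection open and multiplex everything over it, connection-level balancing leaves greeter-b idle in this model, while request-level balancing splits the load evenly.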

Once you’ve mastered basic load balancing, the next logical step is to introduce health checking to automatically remove unhealthy instances from the rotation.