Envoy doesn’t just forward gRPC traffic; it parses the underlying HTTP/2 streams, so it can route, load balance, and observe individual RPCs as a gRPC-aware proxy.
Let’s see Envoy in action. Imagine we have two simple gRPC services, greeter-a and greeter-b, both exposing a SayHello RPC. We want to load balance requests to these services using Envoy.
First, our gRPC services. They’re running on localhost:50051 and localhost:50052 respectively. Here’s a snippet of what greeter-a might look like (in Go, for illustration):
```go
package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"

	pb "path/to/your/proto" // Assuming you have a proto file
)

const (
	port = ":50051"
)

// server implements the Greeter service defined in the proto file.
type server struct {
	pb.UnimplementedGreeterServer
}

// SayHello replies with a greeting that names this backend.
func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	log.Printf("Received: %v from greeter-a", in.GetName())
	return &pb.HelloReply{Message: "Hello " + in.GetName() + " from greeter-a"}, nil
}

func main() {
	// Listen on plain TCP; the gRPC server speaks HTTP/2 on top of it.
	lis, err := net.Listen("tcp", port)
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}

	s := grpc.NewServer()
	pb.RegisterGreeterServer(s, &server{})
	log.Printf("server listening at %v", lis.Addr())

	// Serve blocks until the listener fails or the server is stopped.
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}
```
greeter-b would be identical, just listening on port 50052 and returning a message from greeter-b.
Now, Envoy. We’ll configure it to listen on localhost:9090 and route gRPC requests to our two backend services. Here’s a minimal envoy.yaml:
```yaml
static_resources:
  listeners:
  - name: listener_0
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 9090
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          stat_prefix: ingress_http
          codec_type: AUTO
          route_config:
            name: local_route
            virtual_hosts:
            - name: local_service
              domains: ["*"] # Required: a virtual host must list at least one domain
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: greeter_cluster
          # Optional: native gRPC clients start HTTP/2 directly and do not rely on an h2c upgrade
          upgrade_configs:
          - enabled: true
            upgrade_type: h2c
          http_filters:
          - name: envoy.filters.http.grpc_web
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.grpc_web.v3.GrpcWeb
          - name: envoy.filters.http.router
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
  clusters:
  - name: greeter_cluster
    connect_timeout: 5s
    type: STRICT_DNS # LOGICAL_DNS accepts only a single endpoint; STATIC requires IP addresses
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: greeter_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: greeter-a # Or localhost if running locally
                port_value: 50051
        - endpoint:
            address:
              socket_address:
                address: greeter-b # Or localhost if running locally
                port_value: 50052
    # Crucial for gRPC: speak HTTP/2 to the upstream services
    http2_protocol_options: {}
```
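With this setup, a gRPC client can target localhost:9090, and each call arrives as an HTTP/2 POST to a path of the form /package.Service/Method (for example, /Greeter/SayHello when the proto file declares no package). Envoy accepts the HTTP/2 connection because the listener uses codec_type: AUTO, matches the request against the "/" prefix route, and load balances it to either greeter-a or greeter-b using the ROUND_ROBIN policy. Neither the upgrade_configs entry nor http2_protocol_options is what identifies the request as gRPC; the latter only selects the protocol Envoy uses toward the backends.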
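The upgrade_configs entry with upgrade_type: h2c deserves a caveat. gRPC runs over HTTP/2, and a native gRPC client on a cleartext connection starts HTTP/2 immediately (known as prior knowledge) rather than sending an HTTP/1.1 request and asking for an h2c upgrade. Because codec_type: AUTO already detects HTTP/2 on the listener, native gRPC clients don’t depend on this entry, and it can usually be removed. The http2_protocol_options: {} on the cluster tells Envoy to use HTTP/2 for upstream connections, which gRPC backends require and which enables multiplexing. Recent Envoy releases mark this cluster field as deprecated in favor of typed_extension_protocol_options, so check the documentation for your version. The envoy.filters.http.grpc_web filter translates gRPC-Web requests from clients that can’t speak native gRPC (such as web browsers); native gRPC requests pass through it unchanged, so it’s optional here. The router filter comes last in the chain and forwards each request to the cluster chosen by the route.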
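As an illustration of how a cleartext listener can tell HTTP/2 from HTTP/1.1 without any upgrade handshake, the sketch below checks the first bytes of a connection for the fixed HTTP/2 client connection preface defined in RFC 9113. The function name `startsWithHTTP2Preface` is hypothetical, and this shows the general idea of protocol inference rather than Envoy’s actual code.

```go
package main

import (
	"bytes"
	"fmt"
)

// clientPreface is the 24-byte sequence every HTTP/2 client sends first
// (RFC 9113, section 3.4).
const clientPreface = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n"

// startsWithHTTP2Preface reports whether the first bytes read from a
// connection begin with the HTTP/2 client preface.
func startsWithHTTP2Preface(first []byte) bool {
	return bytes.HasPrefix(first, []byte(clientPreface))
}

func main() {
	// A native gRPC client on cleartext sends the preface right away.
	h2 := []byte(clientPreface + "\x00\x00\x00\x04\x00\x00\x00\x00\x00")
	// An HTTP/1.1 client starts with a request line instead.
	h1 := []byte("POST /helloworld.Greeter/SayHello HTTP/1.1\r\n")

	fmt.Println(startsWithHTTP2Preface(h2))
	fmt.Println(startsWithHTTP2Preface(h1))
}
```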
The problem this solves is providing a single, stable entry point for your gRPC services, abstracting away the individual instances and their locations. Envoy can handle service discovery, load balancing, health checking, TLS termination, rate limiting, and even tracing, all while understanding the specifics of the gRPC protocol.
Internally, Envoy maintains connection pools to the hosts in each upstream cluster. Each gRPC call arrives as its own HTTP/2 stream. For every stream, Envoy selects an upstream endpoint based on the configured lb_policy and forwards it over an existing or new HTTP/2 connection from that host’s pool. The connect_timeout bounds only how long Envoy waits to establish a new upstream connection. How long a request may run is governed by the route’s timeout (15 seconds by default), which long-lived streaming RPCs often need to raise or disable.
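The per-stream selection step can be sketched as a simple round-robin picker. This is a deliberately simplified illustration: the `roundRobin` type is hypothetical, and Envoy’s own ROUND_ROBIN policy also takes host weights and health into account, which this sketch ignores.

```go
package main

import (
	"fmt"
	"sync"
)

// roundRobin hands out endpoints in rotation. It is safe for concurrent use.
type roundRobin struct {
	mu        sync.Mutex
	endpoints []string
	next      int
}

// newRoundRobin copies the endpoint list so later changes by the caller
// do not affect the picker.
func newRoundRobin(endpoints []string) *roundRobin {
	return &roundRobin{endpoints: append([]string(nil), endpoints...)}
}

// pick returns the next endpoint, or false if there are none.
func (r *roundRobin) pick() (string, bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if len(r.endpoints) == 0 {
		return "", false
	}
	ep := r.endpoints[r.next]
	r.next = (r.next + 1) % len(r.endpoints)
	return ep, true
}

func main() {
	lb := newRoundRobin([]string{"greeter-a:50051", "greeter-b:50052"})
	// One pick per RPC (HTTP/2 stream), not per client connection.
	for i := 0; i < 4; i++ {
		ep, _ := lb.pick()
		fmt.Printf("rpc %d -> %s\n", i, ep)
	}
}
```

Picking per stream rather than per connection is what lets a single long-lived client connection spread its RPCs across both backends.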
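The cluster’s discovery type determines how the endpoint names are turned into hosts. With more than one endpoint, as here, the type should be STRICT_DNS rather than LOGICAL_DNS: a LOGICAL_DNS cluster accepts only a single endpoint and connects to just the first IP address its name resolves to. With type: STRICT_DNS and lb_policy: ROUND_ROBIN, Envoy periodically resolves each DNS name (greeter-a, greeter-b), treats every returned IP as a host, and rotates requests across them. If you were using a service discovery system like Kubernetes, you’d typically use type: EDS and configure Envoy to fetch endpoints dynamically.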
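The difference between the two DNS discovery types can be sketched with a stubbed resolver. Everything here is an assumption for illustration: the `target` type, both function names, and the IP addresses are made up, and a real implementation would also re-resolve periodically. As documented for Envoy, STRICT_DNS treats every resolved address as a host, while LOGICAL_DNS uses only the first address of a single name when it needs a new connection.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// target is a DNS name and port, as listed under lb_endpoints.
type target struct {
	host string
	port int
}

// strictDNSHosts mimics STRICT_DNS: every IP returned for every target
// becomes its own upstream host.
func strictDNSHosts(targets []target, resolve func(string) []string) []string {
	var hosts []string
	for _, t := range targets {
		for _, ip := range resolve(t.host) {
			hosts = append(hosts, net.JoinHostPort(ip, strconv.Itoa(t.port)))
		}
	}
	return hosts
}

// logicalDNSHost mimics LOGICAL_DNS: a single target, and only the first
// resolved IP is used for a new connection.
func logicalDNSHost(t target, resolve func(string) []string) (string, bool) {
	ips := resolve(t.host)
	if len(ips) == 0 {
		return "", false
	}
	return net.JoinHostPort(ips[0], strconv.Itoa(t.port)), true
}

func main() {
	// Stub resolver with made-up addresses instead of real DNS lookups.
	records := map[string][]string{
		"greeter-a": {"10.0.0.1", "10.0.0.2"},
		"greeter-b": {"10.0.0.3"},
	}
	resolve := func(name string) []string { return records[name] }

	targets := []target{{"greeter-a", 50051}, {"greeter-b", 50052}}
	fmt.Println(strictDNSHosts(targets, resolve))
	fmt.Println(logicalDNSHost(targets[0], resolve))
}
```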
A common misconception is that Envoy simply proxies TCP connections for gRPC. In an HTTP connection manager setup like this one, it doesn’t: Envoy terminates the client’s HTTP/2 connection and opens its own HTTP/2 connections to the upstream gRPC services, so many RPCs can be multiplexed over one connection and each can be routed independently. This is why http2_protocol_options: {} on the cluster is required. Without it, Envoy speaks HTTP/1.1 to the upstream, and a gRPC server that only accepts HTTP/2 will fail those requests.
Once you’ve mastered basic load balancing, the next logical step is to introduce health checking to automatically remove unhealthy instances from the rotation.