A gRPC connection is surprisingly stateful and expensive to establish, which is why you want to avoid creating a new one for every single request in a high-throughput service.
Let’s watch a service do this. Imagine service-a needs to call service-b 1000 times per second.
// service-a/main.go
package main
import (
"context"
"log"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
pb "your_module/proto" // Assume this is your protobuf generated code
)
func main() {
// This is the naive, inefficient way
for i := 0; i < 1000; i++ {
conn, err := grpc.Dial("localhost:50051", grpc.WithTransportCredentials(insecure.NewCredentials()))
if err != nil {
log.Fatalf("did not connect: %v", err)
}
defer conn.Close() // Deferred calls run only when main returns, so every connection stays open until then
client := pb.NewYourServiceClient(conn)
_, err = client.SomeRPC(context.Background(), &pb.YourRequest{Message: "hello"})
if err != nil {
log.Printf("could not greet: %v", err)
}
// In a real app, you'd have a small sleep or other work here
// time.Sleep(1 * time.Millisecond)
}
}
Running this, you’d see service-b’s logs fill up with connection establishment messages, while service-a spends most of its time on connection setup (a TCP handshake and HTTP/2 negotiation, plus a TLS handshake whenever real transport credentials are in use) for every single request. This limits throughput dramatically.
The problem is that establishing a gRPC connection involves significant overhead. It’s not just a TCP handshake: there is also the HTTP/2 connection preface and SETTINGS exchange and, when transport security is enabled, a full TLS negotiation (the insecure credentials used here skip TLS, but everything else still applies). Doing this thousands of times per second is a recipe for disaster. The solution is connection pooling.
Instead of creating a new connection for each request, you maintain a pool of open connections. When service-a needs to make a call to service-b, it grabs an available connection from the pool, uses it, and then returns it to the pool.
Here’s how you might implement a simple pool by hand, using the go-grpc-middleware interceptors for timeouts, retries, and logging:
First, add the dependencies:
go get github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/logging
go get github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/retry
go get github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/timeout
go get google.golang.org/grpc
go get google.golang.org/grpc/credentials/insecure
Then, modify service-a:
// service-a/main.go
package main
import (
"context"
"log"
"sync"
"time"
"google.golang.org/grpc"
"google.golang.org/grpc/credentials/insecure"
pb "your_module/proto" // Assume this is your protobuf generated code
"github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/logging"
"github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/retry"
"github.com/grpc-ecosystem/go-grpc-middleware/v2/interceptors/timeout"
// This is a simplified example, real pooling might use a dedicated library
// or a more robust custom implementation.
// For demonstration, we'll manage a few connections manually in a pool.
)
// Global pool of connections
var (
grpcPool []*grpc.ClientConn
poolMutex sync.Mutex
maxPoolSize = 10 // Example pool size
connectionTTL = 5 * time.Minute // Example TTL for connections
)
// setupConnectionPool creates one new gRPC client connection for the pool.
func setupConnectionPool(target string) (*grpc.ClientConn, error) {
// Adapt the standard logger; a nil logging.Logger would panic on the first log call.
logger := logging.LoggerFunc(func(_ context.Context, lvl logging.Level, msg string, fields ...any) {
log.Printf("grpc level=%d msg=%q fields=%v", lvl, msg, fields)
})
conn, err := grpc.Dial(target,
grpc.WithTransportCredentials(insecure.NewCredentials()),
grpc.WithDefaultCallOptions(
grpc.CallContentSubtype("proto"), // "proto" is already the default; change it only if you register another codec
// Add other default call options here
),
// Add connection-level interceptors
grpc.WithChainUnaryInterceptor(
timeout.UnaryClientInterceptor(10*time.Second), // Timeout for each RPC
retry.UnaryClientInterceptor(retry.WithMax(3)), // Retry up to 3 times
logging.UnaryClientInterceptor(logger), // Basic logging interceptor
),
grpc.WithChainStreamInterceptor(
// The timeout package provides only a unary interceptor, so streams get retry and logging.
retry.StreamClientInterceptor(retry.WithMax(3)),
logging.StreamClientInterceptor(logger),
),
)
if err != nil {
return nil, err
}
return conn, nil
}
// nextConn is the round-robin cursor into grpcPool; it is guarded by poolMutex.
var nextConn int
// getConnectionFromPool returns a healthy pooled connection, creating or replacing one if needed.
func getConnectionFromPool(target string) (*grpc.ClientConn, error) {
poolMutex.Lock()
defer poolMutex.Unlock()
// Rotate through the pool so that load spreads across every healthy connection.
// (Always returning the first healthy entry would pin all traffic to a single connection.)
for range grpcPool {
i := nextConn % len(grpcPool)
nextConn++
if conn := grpcPool[i]; conn != nil && !isConnectionClosed(conn) {
return conn, nil
}
}
// No healthy connection is available, so dial a new one.
newConn, err := setupConnectionPool(target)
if err != nil {
return nil, err
}
// If the pool is not full, grow it.
if len(grpcPool) < maxPoolSize {
grpcPool = append(grpcPool, newConn)
return newConn, nil
}
// The pool is full of unhealthy connections: close one and reuse its slot.
// In a real pool, you'd have more sophisticated eviction strategies (e.g., LRU).
i := nextConn % len(grpcPool)
nextConn++
if old := grpcPool[i]; old != nil {
old.Close() // Release the old connection's resources before dropping it
}
grpcPool[i] = newConn
return newConn, nil
}
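// isConnectionClosed is a simplified check. In reality, you'd monitor health.
func isConnectionClosed(conn *grpc.ClientConn) bool {
// GetState reports the channel's connectivity state, which is not a perfect indicator of readiness.
// Real pools might send a health check ping.
// The state constants live in google.golang.org/grpc/connectivity (connectivity.Shutdown,
// connectivity.TransientFailure); comparing by name keeps this file's imports unchanged.
switch conn.GetState().String() {
case "SHUTDOWN", "TRANSIENT_FAILURE":
return true
}
return false
}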
// releaseConnectionToPool returns a connection to the pool (or closes it if stale).
func releaseConnectionToPool(conn *grpc.ClientConn) {
// In a real pool, you'd mark this connection as available.
// For this simple example, we don't explicitly "return" it to a specific slot,
// but getConnectionFromPool handles finding available ones.
// You'd also add logic here to close stale connections periodically.
}
func main() {
targetServiceB := "localhost:50051" // Address of service-b
// Pre-warm the pool (optional, but good for predictable startup)
for i := 0; i < maxPoolSize; i++ {
conn, err := setupConnectionPool(targetServiceB)
if err != nil {
log.Fatalf("failed to pre-warm connection pool: %v", err)
}
grpcPool = append(grpcPool, conn)
}
log.Println("Connection pool initialized.")
var wg sync.WaitGroup
numRequests := 1000
for i := 0; i < numRequests; i++ {
wg.Add(1)
go func(reqID int) {
defer wg.Done()
conn, err := getConnectionFromPool(targetServiceB)
if err != nil {
log.Printf("Request %d: failed to get connection from pool: %v", reqID, err)
return
}
// In a real scenario, you might have a mechanism to return connections
// to the pool after use, or the pool manages this internally.
// For this example, we assume getConnectionFromPool gives us a reusable one.
// releaseConnectionToPool(conn) // Not explicitly called here as pool management is simplified
client := pb.NewYourServiceClient(conn)
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second) // RPC timeout
defer cancel()
_, err = client.SomeRPC(ctx, &pb.YourRequest{Message: "hello from pooled connection"})
if err != nil {
log.Printf("Request %d: RPC failed: %v", reqID, err)
// If RPC fails due to connection issues, you might want to invalidate the connection in the pool
// For simplicity, we don't show that here.
} else {
log.Printf("Request %d: RPC succeeded", reqID)
}
}(i)
}
wg.Wait()
log.Println("All requests processed.")
// Clean up: Close all connections in the pool when the application exits
poolMutex.Lock()
for _, conn := range grpcPool {
if conn != nil {
conn.Close()
}
}
poolMutex.Unlock()
log.Println("Connection pool closed.")
}
The core idea is that getConnectionFromPool reuses existing connections. A connection is not closed after a request; it stays open, ready for the next one. The grpc.Dial call, which creates the ClientConn, now runs only a limited number of times (maxPoolSize in this example, plus any replacements for failed connections), not once per RPC.
The most surprising thing about gRPC connection pooling is that the standard grpc.Dial function doesn’t inherently do it for you; you have to manage it yourself, or use a library that abstracts it.
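Internally, grpc.Dial sets up a grpc.ClientConn. This ClientConn manages the underlying HTTP/2 connections, and every RPC you make on it runs as a separate stream multiplexed over them. When you make multiple RPC calls using the same ClientConn, they reuse the same underlying network resources, so a single shared ClientConn is already a large improvement over dialing per request. The problem is that creating many short-lived ClientConn instances, each with its own network connection, is what causes the overhead. True pooling means managing a small, long-lived set of these ClientConn objects, which helps once the concurrent-stream limit of one connection becomes the bottleneck.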
The levers you control are primarily:
- Pool Size (maxPoolSize): Too small, and you’ll still have contention for connections. Too large, and you waste resources. It should generally be related to your expected concurrency and the capacity of the downstream service.
- Connection Lifetime/Health Checks: In a real-world scenario, connections can become stale or broken. A robust pool needs mechanisms to detect this (e.g., sending a periodic health check ping) and replace unhealthy connections. The connectionTTL in the example hints at this, but the example never enforces it, and a full implementation is more complex.
- Load Balancing: If you have multiple instances of service-b, your pool should ideally distribute connections across them. grpc.Dial itself supports built-in load balancing policies (e.g., round_robin) when the name resolver returns multiple addresses.
- Interceptors: As shown with grpc.WithChainUnaryInterceptor, you can add cross-cutting concerns like timeouts, retries, and logging that apply to all requests made through that connection. These are configured once per ClientConn, saving overhead.
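As a rough way to reason about the pool size lever, you can estimate how many RPCs are in flight at once (request rate times average latency, by Little’s law) and divide by how many concurrent streams one connection may carry. The sketch below is only that arithmetic: poolSizeFor is a hypothetical helper, and both the 250 ms latency and the limit of 100 streams per connection are made-up inputs that you would need to measure or look up for your own servers and proxies.

```go
package main

import "fmt"

// poolSizeFor returns how many connections are needed so that peakInFlight
// concurrent RPCs fit under a per-connection stream limit.
// Both inputs are assumptions you must measure or look up for your own setup.
func poolSizeFor(peakInFlight, maxStreamsPerConn int) int {
	if peakInFlight <= 0 || maxStreamsPerConn <= 0 {
		return 1 // Always keep at least one connection.
	}
	// Ceiling division: a partially filled connection still counts as one.
	return (peakInFlight + maxStreamsPerConn - 1) / maxStreamsPerConn
}

func main() {
	// Hypothetical load: 1000 RPCs per second at 250 ms average latency
	// means about 1000 * 0.25 = 250 RPCs in flight at any moment.
	fmt.Println(poolSizeFor(250, 100)) // prints 3
}
```

Treat the result as a starting point and leave headroom for bursts; the capacity of the downstream service still sets the upper bound.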
The one thing most people don’t realize is that a gRPC client connection is a heavyweight resource, and simply calling grpc.Dial repeatedly within a request loop is a fundamentally flawed pattern for high-throughput applications. The defer conn.Close() inside a loop, as seen in the naive example, is particularly insidious: deferred calls run only when the surrounding function returns, not at the end of each iteration, so nothing is closed until the loop finishes. Every iteration still dials a fresh connection, and all of them stay open at once until main exits.
The next step after implementing connection pooling is often to consider more advanced load balancing strategies, especially for distributed systems.