The gRPC Health Checking Protocol allows load balancers to determine if a gRPC service is healthy and ready to receive traffic.

Let’s see it in action. Imagine we have two backend gRPC servers, backend-1 and backend-2, running on localhost:50051 and localhost:50052 respectively. A load balancer needs to know which of them can receive requests.

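First, we need a gRPC server that implements the health checking service, grpc.health.v1.Health, by providing its Check and Watch methods. The service is defined in the standard health.proto. In Go you don’t need to add it to your own .proto file, because the generated code ships with grpc-go in the google.golang.org/grpc/health/grpc_health_v1 package.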

Here’s a simplified Go example of a server implementing health checking:

package main

import (
	"context"
	"log"
	"net"

	"google.golang.org/grpc"
	healthpb "google.golang.org/grpc/health/grpc_health_v1"
)

const (
	port = ":50051"
)

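// server implements healthpb.HealthServer. Embedding UnimplementedHealthServer
// keeps it compiling if the Health service gains methods in newer grpc-go releases.
type server struct {
	healthpb.UnimplementedHealthServer
}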

// Check implements the health checking service
func (s *server) Check(ctx context.Context, req *healthpb.HealthCheckRequest) (*healthpb.HealthCheckResponse, error) {
	// In a real application, you'd check your service's actual health status here.
	// For this example, we'll assume it's always healthy.
	log.Printf("Received Check request for service: %s", req.GetService())
	return &healthpb.HealthCheckResponse{Status: healthpb.HealthCheckResponse_SERVING}, nil
}

// Watch implements the health checking service
func (s *server) Watch(req *healthpb.HealthCheckRequest, stream healthpb.Health_WatchServer) error {
	log.Printf("Received Watch request for service: %s", req.GetService())
	// In a real implementation, you'd stream health status changes.
	// For simplicity, we'll just send one initial status and then block.
	if err := stream.Send(&healthpb.HealthCheckResponse{Status: healthpb.HealthCheckResponse_SERVING}); err != nil {
		return err
	}
	// Keep the stream open indefinitely or until context is cancelled
	<-stream.Context().Done()
	return nil
}

func main() {
	lis, err := net.Listen("tcp", port)
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
	s := grpc.NewServer()
	healthpb.RegisterHealthServer(s, &server{})
	log.Printf("server listening at %v", lis.Addr())
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}

To use this with a load balancer, you’d configure the load balancer to periodically call the grpc.health.v1.Health/Check method on each backend server. The common practice is to check the health of the entire server, so the service field in HealthCheckRequest is often left empty. A SERVING status in the HealthCheckResponse indicates the server is healthy; any other status, an RPC error, or a timeout should be treated as unhealthy.

The load balancer might also use the Watch method for more efficient, stream-based health updates, subscribing to changes in the server’s health status.

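Here’s how you might test the health endpoint directly using grpcurl. grpcurl needs the service schema, so either enable server reflection on the server or pass the standard health.proto file with -proto; the example server above does not register reflection.

# Check health of backend-1 (requires reflection or -proto, see above)
grpcurl -plaintext localhost:50051 grpc.health.v1.Health.Check

# Expected Output:
# {
#   "status": "SERVING"
# }

grpcurl prints the enum by name. Numerically, SERVING is 1 (healthpb.HealthCheckResponse_SERVING); 0 is UNKNOWN, 2 is NOT_SERVING, and 3 is SERVICE_UNKNOWN.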

The fundamental problem this solves is that a running process with an open port is not necessarily a service that can handle requests. A load balancer needs to know that a server is not just running, but also capable of processing requests. Without a dedicated health check, it might send traffic to a server that has started but is stuck in initialization, or that has hit an internal error and can no longer serve requests even though its network port is still open.

The Watch method is particularly interesting because it moves from a pull-based model (the load balancer polls) to a push-based model (the server notifies the client of changes). Instead of answering a Check from every client at every interval, the server sends the current status once when the stream opens and then a new message only when the status changes. In large deployments with many servers this can reduce health-check traffic, at the cost of one long-lived stream per watcher.

The gRPC Health Checking Protocol is designed to be lightweight and efficient. The Check RPC is a simple unary call, and Watch is a server-streaming RPC. The protocol itself doesn’t carry any application-specific data, only the health status. This makes it easy to integrate into existing infrastructure.

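A less well-known feature is that the health check can report on specific services running on the server, not just the server as a whole. By providing a non-empty service name in the HealthCheckRequest, you can query the health of a particular gRPC service. This is useful if a single server instance hosts multiple distinct gRPC services and you need finer-grained control over which of them are considered healthy. For example, requesting {"service": "MyUserService"} would only return SERVING if MyUserService is healthy, even if other services on the same server are not. By convention the name is the fully qualified service name (package plus service), and the protocol specifies that Check fails with a NOT_FOUND error for a name the server has not registered. Note that the simplified server above ignores the service field and answers SERVING for every name; a real implementation has to track a status per service.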

The next concept to explore is how to integrate this with various load balancing solutions, such as Kubernetes Ingress, service meshes like Istio, or cloud provider load balancers.
