Expose gRPC Server Metrics to Prometheus (2026)

Prometheus doesn’t wait for your services to push metrics to it; it pulls them by scraping an HTTP endpoint that each service exposes, which changes how you think about service discovery and observability.

Let’s see what that looks like with a gRPC server. Imagine we have a simple gRPC service, Greeter, with a SayHello method. We’ll use Go for this example.

// main.go
package main

import (
	"context"
	"log"
	"net"
	"net/http"

	"github.com/grpc-ecosystem/go-grpc-prometheus"
	"github.com/prometheus/client_golang/prometheus/promhttp"
	"google.golang.org/grpc"
	pb "google.golang.org/grpc/examples/helloworld/helloworld" // Assuming this is your proto-generated package
)

const (
	grpcPort = ":50051"
	httpPort = ":9091" // Prometheus metrics endpoint
)

// server is used to implement helloworld.GreeterServer.
type server struct {
	pb.UnimplementedGreeterServer
}

// SayHello implements helloworld.GreeterServer.
func (s *server) SayHello(ctx context.Context, in *pb.HelloRequest) (*pb.HelloReply, error) {
	log.Printf("Received: %v", in.GetName())
	return &pb.HelloReply{Message: "Hello " + in.GetName()}, nil
}

func main() {
	// Start gRPC server
	lis, err := net.Listen("tcp", grpcPort)
	if err != nil {
		log.Fatalf("failed to listen: %v", err)
	}
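	// Record per-RPC handling time as a histogram; the library leaves this
	// off by default.
	grpc_prometheus.EnableHandlingTimeHistogram()

	// Interceptors can only be attached when the server is constructed.
	s := grpc.NewServer(
		grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor),
	)
	pb.RegisterGreeterServer(s, &server{})

	// Pre-initialize the gRPC metrics for every registered method so they
	// are exported as zero before the first call. This must run after all
	// services are registered.
	grpc_prometheus.Register(s)

	// Register Prometheus metrics handler
	http.Handle("/metrics", promhttp.Handler())

	// Start HTTP server for Prometheus metrics
	go func() {
		log.Printf("HTTP server listening on %s", httpPort)
		if err := http.ListenAndServe(httpPort, nil); err != nil {
			log.Fatalf("failed to start HTTP server: %v", err)
		}
	}()

	log.Printf("gRPC server listening on %v", lis.Addr())
	if err := s.Serve(lis); err != nil {
		log.Fatalf("failed to serve: %v", err)
	}
}

In this code:

We set up a standard gRPC server using google.golang.org/grpc.
We import github.com/grpc-ecosystem/go-grpc-prometheus, whose package name is grpc_prometheus. This library provides pre-built interceptors for Prometheus metrics.
grpc_prometheus.EnableHandlingTimeHistogram() turns on the request latency histogram, which the library disables by default because histograms are comparatively expensive for Prometheus to store and query.
grpc.UnaryInterceptor(grpc_prometheus.UnaryServerInterceptor) hooks the library’s default server metrics into our gRPC server’s request handling pipeline. grpc-go accepts interceptors only as options to grpc.NewServer, so this cannot be added after construction. Every unary call will now be observed by the Prometheus interceptor; a service with streaming RPCs would also need grpc.StreamInterceptor(grpc_prometheus.StreamServerInterceptor).
grpc_prometheus.Register(s) pre-initializes the counters for each registered method, so the series exist with a value of zero before any traffic arrives.
We start a separate HTTP server on :9091 that exposes the /metrics endpoint using promhttp.Handler(). This is where Prometheus will scrape. The library’s default metrics live in the default Prometheus registry, which is the one promhttp.Handler() serves.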

Now, after two SayHello calls and with the handling-time histogram enabled, a scrape of http://localhost:9091/metrics returns output along these lines (abridged: Go runtime samples and zero-valued series are omitted, values are illustrative, and HELP text can vary between library versions):

# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
# HELP grpc_server_handled_total Total number of RPCs completed on the server, regardless of success or failure.
# TYPE grpc_server_handled_total counter
grpc_server_handled_total{grpc_code="OK",grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 2
# HELP grpc_server_handling_seconds Histogram of response latency (seconds) of gRPC that had been application-level handled by the server.
# TYPE grpc_server_handling_seconds histogram
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.005"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.01"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.025"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.05"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.1"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.25"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="0.5"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="1"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="2.5"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="5"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="10"} 2
grpc_server_handling_seconds_bucket{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary",le="+Inf"} 2
grpc_server_handling_seconds_sum{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 0.000012345
grpc_server_handling_seconds_count{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 2
# HELP grpc_server_msg_received_total Total number of RPC stream messages received on the server.
# TYPE grpc_server_msg_received_total counter
grpc_server_msg_received_total{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 2
# HELP grpc_server_msg_sent_total Total number of gRPC stream messages sent by the server.
# TYPE grpc_server_msg_sent_total counter
grpc_server_msg_sent_total{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 2
# HELP grpc_server_started_total Total number of RPCs started on the server.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="SayHello",grpc_service="helloworld.Greeter",grpc_type="unary"} 2

Each series appears exactly once per scrape. Histogram buckets are cumulative, so two sub-millisecond calls are counted in every bucket from le="0.005" upward.

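The grpc_prometheus library registers its default server metrics with the default Prometheus registry automatically: grpc_server_started_total, grpc_server_handled_total, grpc_server_msg_received_total, and grpc_server_msg_sent_total. The latency histogram, grpc_server_handling_seconds, is opt-in and appears only after EnableHandlingTimeHistogram() is called. Every series is labelled with grpc_type, grpc_service, and grpc_method, and grpc_server_handled_total adds grpc_code, giving you granular visibility into your service’s performance.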

The key mental model shift is that Prometheus is a pull-based system. You don’t push metrics to Prometheus; you expose an HTTP endpoint, and Prometheus scrapes it at regular intervals. This means your gRPC server needs to be reachable by the Prometheus server, and you need to configure Prometheus to know where to find it.

In a real-world scenario, you’d likely use a service discovery mechanism (like Kubernetes, Consul, or DNS) to tell Prometheus where your gRPC servers are running. Prometheus would then periodically query the /metrics endpoint of each discovered instance.

A common pattern, and the one used above, is to serve gRPC and the metrics endpoint from the same process on different ports. The gRPC server handles your application logic, and the HTTP server on a separate port exposes the metrics for Prometheus. Keeping /metrics on its own port keeps scrape traffic off the gRPC listener and lets you restrict access to the metrics independently of your primary application traffic.

The grpc_prometheus library offers some configuration of its own, such as custom histogram buckets and constant labels passed as options. Alongside it, you can register your own custom metrics that are specific to your application logic. For instance, you might want to track the number of successful vs. failed business operations within your SayHello method, not just gRPC-level errors.

The fundamental insight here is that Prometheus stores what it scrapes as timestamped time series. Each scrape of your /metrics endpoint is a snapshot of cumulative values at that moment; the timestamps are what let Prometheus turn successive snapshots into rates and trends. The latency metric (grpc_server_handling_seconds in go-grpc-prometheus) is a histogram: it counts observations into cumulative buckets, each labelled with an upper bound (le). Prometheus estimates percentiles from those bucket counts, so the results are approximations whose accuracy depends on where the bucket boundaries sit.

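If you’re not seeing any metrics for your gRPC methods, first confirm that the metrics HTTP server is actually running and bound to its port: a request to http://localhost:9091/metrics should return at least the go_* runtime metrics. If it does but no grpc_server_* series appear, check that the interceptor was passed to grpc.NewServer and that the gRPC metrics are registered with the same registry that promhttp.Handler() serves.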