Linkerd’s per-route latency metrics are generated and exposed by linkerd-proxy itself, not by a separate "routes stat" component.
The core idea is that Linkerd’s sidecar proxy sits next to your application and intercepts all of its network traffic. It can inspect individual HTTP requests and responses, match them to specific routes (like /api/users or /health), and measure the time each one takes.
Here’s how it works in practice. Imagine you have a simple Go web server:
package main

import (
	"fmt"
	"log"
	"net/http"
	"time"
)

func main() {
	// /hello simulates 100 ms of work before responding.
	http.HandleFunc("/hello", func(w http.ResponseWriter, r *http.Request) {
		time.Sleep(100 * time.Millisecond)
		fmt.Fprint(w, "Hello, world!")
	})
	// /health responds immediately.
	http.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "OK")
	})

	fmt.Println("Starting server on :8080")
	// ListenAndServe only returns on failure, so surface the error.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
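When you run this application in a Linkerd service mesh, Linkerd injects a linkerd-proxy container into the pod. The proxy does not share your application’s port. Instead, an init container (or the Linkerd CNI plugin) installs iptables rules that redirect the pod’s inbound and outbound TCP traffic through the proxy, which then forwards inbound requests to your application on its original port. Because every request passes through it, the proxy can also observe and time the traffic.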
To see per-route data, you’d typically use the Linkerd CLI or query Prometheus directly. Let’s start with tap. On Linkerd 2.10 and later, tap and the other observability commands are part of the viz extension and are invoked as linkerd viz tap; older releases use linkerd tap. If your hello-world service is deployed and meshed, you can run:
linkerd viz tap deploy/hello-world
This shows a live stream of individual requests and responses. For aggregated metrics, you’d query Prometheus instead. The Prometheus instance that ships with the viz extension is configured to scrape the metrics endpoint of every linkerd-proxy sidecar.
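If you want to run such queries from code rather than from the Prometheus UI, they go through Prometheus’s HTTP API, whose instant-query endpoint is /api/v1/query. The sketch below only builds the request URL. The base address http://localhost:9090 is an assumption (for example, a local port-forward to wherever your Prometheus runs), the job name in the sample query is likewise an assumption about the scrape configuration, and instantQueryURL is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"net/url"
)

// instantQueryURL returns the URL for a Prometheus instant query.
// base is the Prometheus address; promQL is the expression to evaluate.
func instantQueryURL(base, promQL string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query"

	// url.Values takes care of escaping braces, quotes, and spaces in PromQL.
	q := url.Values{}
	q.Set("query", promQL)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	// Assumed address and job name; adjust both for your installation.
	u, err := instantQueryURL("http://localhost:9090", `up{job="linkerd-proxy"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(u)
}
```

A GET request to the resulting URL returns a JSON document containing the query result.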
The key metric is the latency histogram route_response_latency_ms, which Prometheus stores as route_response_latency_ms_bucket series. The linkerd-proxy labels it with details about the request, including rt_route, the name of the route the request matched. Route names come from a ServiceProfile for the service; without one, there is no per-route breakdown.
Here’s a sample PromQL query for the p95 latency of the /hello route of hello-world, assuming the service lives in the default namespace and its ServiceProfile names that route "/hello":
histogram_quantile(0.95, sum by (le, rt_route) (rate(route_response_latency_ms_bucket{dst=~"hello-world.default.svc.cluster.local.*", rt_route="/hello"}[5m])))
Let’s break this down:
route_response_latency_ms_bucket: The raw histogram buckets collected by the linkerd-proxy. Each bucket is a counter of requests that completed within its upper bound. histogram_quantile needs the _bucket series, because only they carry the le label.
dst=~"hello-world.default.svc.cluster.local.*": Selects the destination service the route belongs to. The pattern also matches an optional port suffix.
rt_route="/hello": The route label. Its value is the route name from the ServiceProfile, not the raw request path.
rate(...[5m]): The per-second rate of increase of each bucket over the last 5 minutes, which smooths out short-term spikes.
sum by (le, rt_route): Sums the buckets across pods, keeping the le (less than or equal to) label for the quantile calculation and the rt_route label to group by route.
histogram_quantile(0.95, ...): Estimates the 95th percentile latency from the bucket counts.
Exact label sets vary between Linkerd versions, so it’s worth running a bare route_response_latency_ms_bucket selector first to see which labels your installation emits.
When you run this query, you’d see output like:
{rt_route="/hello"} 123.45
This tells you that the estimated 95th percentile latency for requests to the /hello route of your hello-world service is 123.45 milliseconds.
The most surprising thing about how Linkerd handles per-route metrics is that it doesn’t require any application code changes. The linkerd-proxy already parses HTTP, so latency, request counts, and success rates per workload come for free, making it a powerful, zero-instrumentation observability tool for HTTP services. The one thing the proxy does not do is guess your routes: the per-route breakdown needs a declaration of what the routes are.
That declaration is a ServiceProfile resource. Each route in a profile has a name and a condition, typically an HTTP method plus a pathRegex. The proxy matches each request against those conditions and uses the name of the matching route as the route label; requests that match no route are grouped together, which linkerd viz routes reports as [DEFAULT]. The regular expression is what keeps the metrics useful: a pattern such as /users/[^/]+ collapses /users/123 and /users/456 into a single route instead of producing one time series per user ID.
Once you’ve mastered per-route latency, the next logical step is to explore how Linkerd can help you manage traffic based on these metrics, leading you into concepts like traffic splitting and retries.