Rate-Limit Requests with Istio and Envoy Rate Limiter (2026)

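The most surprising thing about rate limiting is that its job is less to prevent abuse outright than to manage excess load gracefully: requests over the limit are rejected quickly and cheaply, while everything under it is served normally.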

Let’s see this in action. Imagine a critical microservice, user-service, that handles user profile lookups. We want to ensure that no single client IP address can hammer it with more than 100 requests per minute.

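Here’s how we’d configure Istio to enforce this. First, we need a RateLimitConfig resource that defines our policy. One caveat: upstream Istio does not ship a RateLimitConfig kind under networking.istio.io, so the manifest below only applies on a platform that installs such a CRD. On stock Istio the same policy is expressed as an EnvoyFilter that enables Envoy’s rate limit filter, plus a descriptor configuration for the rate limit service.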

apiVersion: networking.istio.io/v1alpha3
kind: RateLimitConfig
metadata:
  name: user-service-ratelimit
spec:
  domain: "user-service"
  rateLimits:
    - rate:
        requestsPerUnit: 100
        unit: MINUTE
      match:
        - key: "source.ip"
          value: "*"

This config tells Envoy (Istio’s proxy) that for the user-service domain, we want to limit requests to 100 per minute, keyed on the source IP address. The value: "*" means the rule applies to every source IP, with each address counted separately.

Next, we pair it with an AuthorizationPolicy on user-service that controls which requests are admitted to the workload in the first place:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: enforce-user-service-ratelimit
  namespace: default
spec:
  selector:
    matchLabels:
      app: user-service
  action: ALLOW
  rules:
    - to:
        - operation:
            methods: ["GET"]
      when:
        - key: "request.headers[x-source-ip]" # custom header, assumed to be set by an upstream proxy; Envoy itself tracks client IPs via x-forwarded-for
          values: ["*"]

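This AuthorizationPolicy is attached to pods with the label app: user-service. Because its action is ALLOW, it admits only requests that match a rule, here GET requests that carry a non-empty x-source-ip header (the "*" value is a presence match), and denies everything else sent to those pods. Two caveats apply. An AuthorizationPolicy only allows or denies; it does not itself switch on rate limiting, which on stock Istio is done by an EnvoyFilter. And x-source-ip is not a header Envoy sets by itself, so this rule assumes an upstream proxy adds it; the client address is otherwise available through x-forwarded-for or the policy’s own source.ip and remote.ip condition keys.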

When a request hits an Istio-enabled user-service pod, the Envoy sidecar intercepts it and looks up the rate limit rules for the user-service domain. If the request matches a rule (in this case every request, keyed by source IP), Envoy consults the rate limiter: either its own in-process limiter for local rate limiting, or an external rate limit service for limits shared across proxies.

The rate limiter maintains a counter for each unique source IP and increments it for every incoming request from that address. If the count for an IP within the current minute exceeds 100, Envoy doesn’t forward the request to the user-service pod. Instead, it immediately returns an HTTP 429 Too Many Requests response to the client. This happens before the request even reaches your application logic.
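The counting described above can be modeled as a fixed-window counter. The sketch below is a toy illustration of that behavior, not Envoy’s implementation; the FixedWindowLimiter class, its check method, and the injectable clock are names invented for this example, and old windows are never evicted.

```python
import time
from collections import defaultdict


class FixedWindowLimiter:
    """Toy per-key fixed-window counter (illustrative only, not Envoy code)."""

    def __init__(self, limit=100, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the window can be advanced in tests
        self.counts = defaultdict(int)  # (key, window index) -> requests seen

    def check(self, key):
        """Count one request for key and return the status a proxy would send."""
        window = int(self.clock() // self.window_seconds)
        self.counts[(key, window)] += 1
        # Requests 1..limit pass; anything beyond that in this window is rejected.
        return 200 if self.counts[(key, window)] <= self.limit else 429
```

Calling check("10.0.0.1") 101 times within one minute yields 200 for the first 100 calls and 429 for the last, while a different IP, or the same IP in the next minute, starts again from zero.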

This is powerful because it offloads the rate-limiting logic entirely to the sidecar proxy (Envoy), keeping your application services lean and focused on their core business logic. You can also define more complex rate limits, for example by combining source IP with other request attributes such as a specific header value or the requested URL path.
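Combining attributes amounts to counting against a composite key instead of the IP alone. The following is a minimal sketch of that idea; the descriptor function and the x-api-key header are hypothetical names chosen for illustration, not part of any Istio or Envoy API.

```python
from collections import Counter


def descriptor(source_ip, path=None, api_key=None):
    """Build a hashable composite key from whichever attributes are supplied."""
    entries = [("source_ip", source_ip)]
    if path is not None:
        entries.append(("path", path))
    if api_key is not None:
        entries.append(("x-api-key", api_key))  # hypothetical header name
    return tuple(entries)


# Each distinct combination of attributes gets its own counter.
counters = Counter()
requests = [
    ("10.0.0.1", "/users/1"),
    ("10.0.0.1", "/users/1"),
    ("10.0.0.1", "/health"),
    ("10.0.0.2", "/users/1"),
]
for ip, path in requests:
    counters[descriptor(ip, path=path)] += 1
```

With this keying, the same client hitting /users/1 and /health is tracked under two separate counters, so a limit can be applied to one path without affecting the other.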

The key to understanding how this works under the hood is realizing that Envoy is not just a dumb proxy; it’s a sophisticated proxy with built-in capabilities for traffic management and security. The RateLimitConfig and AuthorizationPolicy are essentially declarative ways to instruct Envoy on how to apply these capabilities. The state that actually tracks request rates lives either in the proxy’s own memory (local rate limiting) or, for limits shared across many proxies, in an external rate limit service that typically keeps its counters in a store such as Redis.
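For the in-memory case, Envoy’s local rate limit filter is configured as a token bucket with max_tokens, tokens_per_fill, and fill_interval settings. The sketch below models a bucket with those three parameters to show how they interact; the TokenBucket class and its allow method are invented for this example and are not Envoy’s code.

```python
import time


class TokenBucket:
    """Toy token bucket using Envoy-style parameter names (illustrative only)."""

    def __init__(self, max_tokens, tokens_per_fill, fill_interval, clock=time.monotonic):
        self.max_tokens = max_tokens
        self.tokens_per_fill = tokens_per_fill
        self.fill_interval = fill_interval  # seconds between refills
        self.clock = clock
        self.tokens = max_tokens  # start with a full bucket
        self.last_fill = clock()

    def allow(self):
        """Consume one token if available; return False when the bucket is empty."""
        elapsed = self.clock() - self.last_fill
        fills = int(elapsed // self.fill_interval)
        if fills:
            # Add tokens for every whole interval that has passed, up to the cap.
            self.tokens = min(self.max_tokens, self.tokens + fills * self.tokens_per_fill)
            self.last_fill += fills * self.fill_interval
        if self.tokens > 0:
            self.tokens -= 1
            return True
        return False
```

Under these assumptions, TokenBucket(100, 100, 60) approximates the 100 requests per minute target, though the bucket is shared by all clients unless one bucket is kept per source IP.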

If you’re using a service mesh like Istio, you’re likely already familiar with AuthorizationPolicy. The connection here is that Istio leverages Envoy’s capabilities, and rate limiting is one of those capabilities that can be exposed and controlled through custom resources. The domain field in RateLimitConfig is critical; it’s how the rate limiter knows which set of rules to apply to a given request. It is an arbitrary namespace string rather than a Kubernetes service name or an Istio routing name: the domain that Envoy’s rate limit filter sends must match the domain in the rate limit configuration exactly.

Once you have rate limiting in place, the next thing you’ll likely encounter is how to handle the 429 responses gracefully in your client applications, perhaps with exponential backoff strategies.
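A minimal sketch of such a client-side retry loop follows. It assumes a hypothetical zero-argument send callable that returns a (status, headers) pair, honors a Retry-After header only when it is a whole number of seconds, and otherwise waits a random time up to a capped exponential bound; none of these names come from a specific HTTP library.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Retry send() while it returns 429; return the last status seen.

    send is a hypothetical callable returning (status, headers), where headers
    is a dict with exact-case keys.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status
        retry_after = headers.get("Retry-After")
        if retry_after is not None and retry_after.isdigit():
            # The server told us how long to wait, so use that value.
            delay = float(retry_after)
        else:
            # Full jitter: random wait up to the capped exponential bound.
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
        sleep(delay)
```

The jitter matters when many clients are throttled at once: without it they would all retry at the same instants and hit the limit together again.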