Canary deployments aren’t really about deploying new code; they’re about managing user traffic.
Let’s see what this looks like in the real world. Imagine we have a simple frontend service running two versions: v1 and v2. We want to send 5% of traffic to v2 and 95% to v1.
Here’s the Istio configuration for that:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: frontend-vs
spec:
  hosts:
  - frontend
  http:
  - route:
    - destination:
        host: frontend
        subset: v1
      weight: 95
    - destination:
        host: frontend
        subset: v2
      weight: 5
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: frontend-dr
spec:
  host: frontend
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2
This VirtualService tells Istio how to route traffic to the frontend service. It says: "If a request comes in for frontend, send 95% of it to the v1 subset and 5% to the v2 subset." The DestinationRule defines these v1 and v2 subsets, usually based on Kubernetes labels applied to the pods.
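To make the label matching concrete, here is a minimal sketch of what the v2 pods might look like on the Kubernetes side. The Deployment name, the app: frontend label, the image, and the port are all assumptions for illustration; only the version: v2 label comes from the DestinationRule above.

```yaml
# Hypothetical Deployment for the canary pods (names, image, and port are placeholders).
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend-v2
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
      version: v2
  template:
    metadata:
      labels:
        app: frontend      # assumed label that the frontend Service selects on
        version: v2        # the label the v2 subset in the DestinationRule matches
    spec:
      containers:
      - name: frontend
        image: example.com/frontend:2.0.0   # placeholder image reference
        ports:
        - containerPort: 8080               # assumed container port
```

In a setup like this, the Kubernetes Service for frontend would select only on app: frontend, so it covers both versions and leaves the v1/v2 split to Istio.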
The core problem canary deployments solve is reducing the blast radius of a bad deployment. Instead of pushing a new version to 100% of users and hoping for the best, you start with a tiny fraction. If things look good (metrics are stable, error rates are low), you gradually increase the traffic to the new version until it’s handling 100%. If something goes wrong, you can quickly roll back by shifting all traffic back to the stable version.
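A rollback in this setup can be sketched as the same VirtualService with the weights flipped to 100 and 0. This is one possible way to express it, reusing the names from the configuration above; removing the v2 destination entirely would have the same routing effect.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: frontend-vs
spec:
  hosts:
  - frontend
  http:
  - route:
    - destination:
        host: frontend
        subset: v1
      weight: 100   # all traffic back on the stable version
    - destination:
        host: frontend
        subset: v2
      weight: 0     # canary stays defined but receives no traffic
```

Keeping the v2 destination at weight 0 makes it easy to resume the rollout later by editing only the two numbers.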
Under the hood, Istio’s Envoy proxies intercept all traffic to and from the workloads in the mesh. The VirtualService and DestinationRule are Istio configuration objects that the control plane translates into Envoy routing rules and pushes to each proxy. When a request for frontend arrives at an Envoy proxy (either an ingress gateway or a sidecar proxy attached to a client pod), Envoy consults that routing configuration. Based on the weights defined in the VirtualService, it randomly picks which subset of the frontend service (that is, which pods with the matching labels) receives the request.
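For intuition, the route that Envoy ends up with looks roughly like the fragment below. This is a simplified, illustrative rendering rather than exact proxy output: the default namespace and port 80 are assumptions, and a real route carries many more fields. You can inspect the actual configuration of a proxy with istioctl proxy-config routes.

```yaml
# Simplified sketch of the Envoy route generated from frontend-vs.
# Namespace "default" and port 80 are assumed for illustration.
route:
  weighted_clusters:
    clusters:
    - name: "outbound|80|v1|frontend.default.svc.cluster.local"
      weight: 95
    - name: "outbound|80|v2|frontend.default.svc.cluster.local"
      weight: 5
```

Each subset from the DestinationRule becomes its own Envoy cluster, and the VirtualService weights become the weights of those clusters.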
You control the rollout by updating the weight field in the VirtualService. To move from 5% to 10% for v2, you’d change the VirtualService to:
# ... (previous parts of VirtualService)
  http:
  - route:
    - destination:
        host: frontend
        subset: v1
      weight: 90
    - destination:
        host: frontend
        subset: v2
      weight: 10
# ... (rest of VirtualService)
Then apply the updated VirtualService with kubectl apply -f updated-frontend-vs.yaml. The snippet above is abbreviated, so the file itself needs to contain the complete resource. The Envoy proxies pick up the change dynamically, and traffic starts shifting according to the new weights.
The real magic is that this happens without any application code changes or service restarts. The routing logic is externalized to Istio. You can even implement more sophisticated routing, like sending traffic only to specific users based on request headers (e.g., User-Agent or a custom X-User-ID header), which allows for targeted testing before a broader rollout.
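As a rough sketch of header-based routing, the VirtualService below sends requests with a particular x-user-id header to v2 and everything else to v1. The header value test-user-42 is hypothetical, and this assumes the header actually reaches the proxy that makes the routing decision.

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: frontend-vs
spec:
  hosts:
  - frontend
  http:
  # Rule 1: requests from the (hypothetical) test user always go to v2.
  - match:
    - headers:
        x-user-id:
          exact: "test-user-42"
    route:
    - destination:
        host: frontend
        subset: v2
  # Rule 2: all other requests stay on the stable version.
  - route:
    - destination:
        host: frontend
        subset: v1
```

HTTP rules are evaluated in order and the first match wins, so the catch-all route goes last. The second rule could also carry a weighted split like the 95/5 example if you want targeted testing and a percentage rollout at the same time.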
The most surprising thing about this mechanism is how it leverages probabilistic routing for what feels like a deterministic rollout. When you set a weight of 5% for v2, it’s not that every twentieth request goes to v2. Instead, each request has an independent 5% chance of being routed to v2. Over a large number of requests, this averages out to approximately 5%, but in a short burst you might see 0% or even 10% of traffic go to v2. Because the choice is random rather than tied to a particular client or time window, the canary sees an unbiased sample of requests, and any failures are limited to roughly that share of traffic.
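A quick back-of-the-envelope calculation shows how noisy small samples are. Assuming each request independently goes to v2 with probability p = 0.05, the number X of v2 requests out of n follows a binomial distribution:

```latex
\begin{aligned}
P(X = 0) &= (1 - p)^n, & \mathbb{E}[X] &= np, & \sigma_X &= \sqrt{np(1 - p)} \\
n = 20:\quad P(X = 0) &= 0.95^{20} \approx 0.36, & \mathbb{E}[X] &= 1, & \sigma_X &\approx 0.97 \\
n = 1000:\quad P(X = 0) &= 0.95^{1000} \approx 0, & \mathbb{E}[X] &= 50, & \sigma_X &\approx 6.9
\end{aligned}
```

So in a burst of 20 requests there is roughly a one-in-three chance that v2 sees no traffic at all, while over 1000 requests the observed share typically stays within about a percentage point of 5%.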
Once you’re comfortable with traffic splitting, the next step is automating the rollout based on observed metrics.