Canary deployments don’t just let you roll out new code; they fundamentally change how you think about risk by shifting from "all or nothing" to "gradual exposure."
Let’s see this in action. Imagine you have a web service running on Fly.io, and you want to deploy a new version.
Here’s a typical Fly.io fly.toml for a simple app:
app = "my-canary-app"
primary_region = "ord"
[build]
image = "my-dockerhub-user/my-app:v1.0.0"
[[services]]
internal_port = 8080
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["https"]
port = 443
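To start a canary deployment, we expose the new version (let’s say my-dockerhub-user/my-app:v1.1.0) to a small share of users while the old version keeps serving everyone else. This walkthrough expresses that share through the services.concurrency field in fly.toml. Be aware that Fly.io’s fly.toml reference describes services.concurrency as a per-instance load limit (type is "connections" or "requests", with soft_limit and hard_limit counts) rather than a traffic percentage, and that the platform’s built-in canary behavior is selected with strategy = "canary" under [deploy]. Check the configs below against the current reference before relying on them.
Before you start, make sure the flyctl CLI is installed and authenticated.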
We’ll create a new fly.toml for our canary. Let’s say we want to send 5% of traffic to the new version. We’ll set the concurrency for the new version to 5% and the old version to 95%.
Old fly.toml (for v1.0.0):
app = "my-canary-app"
primary_region = "ord"
[build]
image = "my-dockerhub-user/my-app:v1.0.0"
[[services]]
internal_port = 8080
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["https"]
port = 443
# This is the key for canary: set concurrency for the primary service
[services.concurrency]
type = "percent"
hard_limit = 95 # 95% of traffic goes to this version
New fly.toml (for v1.1.0):
app = "my-canary-app"
primary_region = "ord"
[build]
image = "my-dockerhub-user/my-app:v1.1.0" # Point to the new image
[[services]]
internal_port = 8080
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["https"]
port = 443
# Set concurrency for the canary service
[services.concurrency]
type = "percent"
hard_limit = 5 # 5% of traffic goes to this version
Now, deploy both versions. You’ll deploy the older version first, and then the newer one.
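# Deploy the older version (v1.0.0) if it's not already running.
# Assumes the old config above is saved as fly.stable.toml (a file name chosen for this example).
fly deploy -c fly.stable.toml --remote-only
# Now deploy the new version (v1.1.0).
# Assumes the new config above is saved as fly.canary.toml (also a file name chosen for this example).
fly deploy -c fly.canary.toml --remote-only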
Fly.io’s proxy will then start routing traffic to the running instances. You can confirm that both versions are up and passing their health checks with flyctl status:
flyctl status --app my-canary-app
You’ll see output along these lines, showing two versions of your app running side by side (the exact fields and layout depend on your flyctl version):
App: my-canary-app
ID: some_app_id
Status: Running
Version: some_version_id
Owner: some_org
Created: 2023-10-27T10:00:00Z
Updated: 2023-10-27T11:00:00Z
Services:
- ports: 80/tcp, 443/tcp
Instances:
ID VM ID REGION STATE LOADS IMPORTANT UPTIME
inst_1 vm_id_1 ord running 0 true 1h
inst_2 vm_id_2 ord running 0 false 5m # This is our canary instance
Checks:
service: success (3/3)
Instances by version:
v1.0.0 (95%): 1 instance (inst_1)
v1.1.0 (5%): 1 instance (inst_2)
In this walkthrough, the hard_limit values under services.concurrency are the lever that sets each version’s share of traffic. (Fly.io’s own reference defines hard_limit as a cap on concurrent connections or requests per instance, so confirm the semantics before depending on a percentage split.) Whatever mechanism enforces the split, the point is not only the percentage; it is isolating the blast radius of a faulty deployment. If v1.1.0 has a bug that causes it to crash under load, only about 5% of your users will experience it, and you can roll back quickly by removing the canary deployment or returning 100% of traffic to the stable version.
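The blast-radius argument above can be checked with a small simulation. This is a minimal sketch of weighted routing in general, not of how Fly.io’s proxy is implemented; the function names pick_version and simulate are hypothetical, and the 95/5 weights are taken from the example in the text.

```python
import random
from collections import Counter


def pick_version(weights, rng):
    """Pick one version name with probability proportional to its weight."""
    versions = list(weights)
    return rng.choices(versions, weights=[weights[v] for v in versions], k=1)[0]


def simulate(weights, total_requests, seed=0):
    """Route total_requests through the weighted picker and count hits per version."""
    rng = random.Random(seed)  # seeded so repeated runs give the same counts
    return Counter(pick_version(weights, rng) for _ in range(total_requests))


if __name__ == "__main__":
    total = 10_000
    counts = simulate({"v1.0.0": 95, "v1.1.0": 5}, total_requests=total)
    for version, hits in sorted(counts.items()):
        print(f"{version}: {hits} requests ({hits / total:.1%})")
```

If v1.1.0 fails on every request it receives, the share printed for v1.1.0 is an upper bound on the fraction of requests affected, which is the sense in which the canary limits the blast radius.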
Once you’ve monitored the canary for a period (hours, days, depending on your risk tolerance and the criticality of the change) and are confident in its stability, you can gradually increase its share of traffic. You might do this in stages: 25%, 50%, 75%, and finally 100%.
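The staged promotion described above can be sketched as a small decision function. This is an illustration under stated assumptions: the stage list mirrors the percentages in the text, while WindowStats, next_action, the 1% error-rate margin, and the 500-request minimum are hypothetical choices you would tune for your own service.

```python
from dataclasses import dataclass

# Canary traffic share per stage, in percent (5% start, then the stages from the text).
STAGES = [5, 25, 50, 75, 100]


@dataclass
class WindowStats:
    """Request and error counts observed for one version during a monitoring window."""
    requests: int
    errors: int

    @property
    def error_rate(self):
        return self.errors / self.requests if self.requests else 0.0


def next_action(current_share, canary, stable,
                max_extra_error_rate=0.01, min_requests=500):
    """Return (action, new_share) where action is hold, rollback, promote, or done."""
    if current_share >= STAGES[-1]:
        return ("done", current_share)  # canary already takes all traffic
    if canary.requests < min_requests:
        return ("hold", current_share)  # not enough data to judge yet
    if canary.error_rate > stable.error_rate + max_extra_error_rate:
        return ("rollback", 0)  # canary is clearly worse than stable
    later_stages = [s for s in STAGES if s > current_share]
    return ("promote", later_stages[0])


if __name__ == "__main__":
    print(next_action(5, WindowStats(1000, 2), WindowStats(19000, 40)))
    print(next_action(25, WindowStats(1000, 50), WindowStats(3000, 6)))
```

Comparing the canary against the stable version’s error rate in the same window, rather than against a fixed number, keeps an unrelated incident that affects both versions from triggering a rollback.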
To promote the canary to 100%, you’d update the fly.toml for the new version to hard_limit = 100 and the old version to hard_limit = 0. Then deploy again.
Promoted fly.toml (for v1.1.0):
app = "my-canary-app"
primary_region = "ord"
[build]
image = "my-dockerhub-user/my-app:v1.1.0" # This is now the stable version
[[services]]
internal_port = 8080
protocol = "tcp"
[[services.ports]]
handlers = ["http"]
port = 80
[[services.ports]]
handlers = ["https"]
port = 443
[services.concurrency]
type = "percent"
hard_limit = 100 # All traffic goes to this version
Deploy the promoted configuration, then remove the old version from the deployment or set its concurrency to 0.
# Deploy the promoted version
fly deploy -c fly.toml --remote-only
# Optional: Remove the old version's deployment configuration or scale it down
# (e.g., by removing its fly.toml or scaling its instances to 0)
The most counterintuitive part of canary deployments is that they require you to run multiple versions of your application simultaneously for an extended period. This means your infrastructure must be capable of handling this mixed state, and your application must be designed to tolerate requests being processed by both old and new code paths during the transition, especially if you have stateful operations or database schema changes.
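One way to tolerate the mixed state is for each version to read data written by the other. The sketch below assumes a hypothetical change in which v1.1.0 adds a locale field to a JSON record; the encode and decode function names, the field names, and the "en" default are all invented for illustration.

```python
import json


def encode_v1_0(user_id, name):
    """Record as written by the old code (v1.0.0)."""
    return json.dumps({"user_id": user_id, "name": name})


def encode_v1_1(user_id, name, locale):
    """Record as written by the new code (v1.1.0), with an added field."""
    return json.dumps({"user_id": user_id, "name": name, "locale": locale})


def decode_v1_0(payload):
    """Old reader: picks out the fields it knows and ignores anything extra."""
    data = json.loads(payload)
    return {"user_id": data["user_id"], "name": data["name"]}


def decode_v1_1(payload):
    """New reader: treats the new field as optional and falls back to a default."""
    data = json.loads(payload)
    return {
        "user_id": data["user_id"],
        "name": data["name"],
        "locale": data.get("locale", "en"),
    }


if __name__ == "__main__":
    # Either version may read what the other wrote while both are live.
    print(decode_v1_0(encode_v1_1(7, "Ada", "fr")))
    print(decode_v1_1(encode_v1_0(7, "Ada")))
```

The same idea applies to database changes: add new columns as optional first, deploy code that can handle both shapes, and only tighten the schema once the old version is gone.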
The next step after mastering canary deployments is understanding how to automate this process with feature flags.