Rolling deployments are the unsung heroes of modern application delivery, letting you update your services without a flicker of downtime.
Let’s see this in action with Fly.io. Imagine you have a simple web app running on Fly.io, currently at version v1.0.0.
Here’s a snippet of what your fly.toml might look like:
app = "my-cool-app"
primary_region = "ord"
[deploy]
strategy = "rolling"
Now, you’ve built a new version, v1.0.1, with some critical bug fixes. You push it up:
fly deploy --image your-dockerhub-username/my-cool-app:v1.0.1
Fly.io’s rolling deployment strategy kicks in. It doesn’t just swap out all your old machines for new ones instantly. Instead, it orchestrates a gradual transition.
Here’s how the mental model builds:
- The Goal: Zero downtime during deployments. This means at any given moment, at least one instance of your application must be available to serve traffic.
- The Mechanism: Fly.io manages a pool of "machines" (your application instances). For a rolling deployment, it introduces new machines running the new version while keeping old machines running.
- The Rollout:
  - Fly.io starts launching new machines for v1.0.1.
  - Once a new machine is healthy and ready (it passes health checks), Fly.io begins to send a fraction of the incoming traffic to it.
  - Simultaneously, it starts gracefully shutting down older machines running v1.0.0. Graceful shutdown means allowing any in-flight requests to complete before terminating.
  - This process repeats: launch a new machine, wait for it to be healthy, shift a bit more traffic, shut down an old machine.
  - This continues until all old machines are replaced by new ones.
- Levers You Control:
  - strategy = "rolling" in fly.toml: This is the fundamental switch.
  - max_unavailable: This setting (in the [deploy] section of fly.toml, or passed as --max-unavailable to fly deploy) controls how many machines can be out of service for an update at once. A lower number means a slower, more cautious rollout, while a higher number can speed things up but increases risk if many new machines fail.
  - min_machines_running: Sets a floor on the number of running instances. Note that this is an autostop setting under [http_service] rather than a deploy setting, so the number of machines kept in service during a rollout follows from max_unavailable.
  - Health Checks: Crucial! Fly.io relies on your application reporting its health. If a new machine fails its health checks, Fly.io won’t send traffic to it and won’t proceed with shutting down old machines until the issue is resolved or you intervene.
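Since the rollout hinges on health checks, here is a minimal sketch of an endpoint a check could poll, using only the Python standard library. The /healthz path, port 8080, and the ready flag are assumptions made for illustration; the check itself still has to be declared in your fly.toml and pointed at whatever path and port your app really uses.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical readiness flag: set it once startup work (migrations,
# cache warm-up, and so on) has finished.
ready = threading.Event()


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":  # assumed health check path
            self.send_error(404)
            return
        # Report 503 until the app is ready, so a check keeps failing
        # (and no traffic arrives) while the machine is still starting.
        status, body = (200, b"ok\n") if ready.is_set() else (503, b"starting\n")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_health_server(host="0.0.0.0", port=8080):
    """Build (but do not start) the server; call serve_forever() to run it."""
    return ThreadingHTTPServer((host, port), HealthHandler)
```

Returning 503 during startup is a design choice: it keeps a half-initialized machine out of rotation instead of letting it receive requests it cannot serve yet.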
Consider a scenario where you have 5 machines.
- Fly.io launches 1 new machine for v1.0.1. Until it passes its health checks, all traffic still goes to the 5 old machines.
- It becomes healthy. Traffic is now split 5/1 (old/new).
- Fly.io shuts down 1 old machine. Now you have 4 old, 1 new.
- It launches another new machine. Traffic is still split 4/1 while it starts up.
- It becomes healthy. Traffic is now split 4/2.
- Fly.io shuts down another old machine. Now you have 3 old, 2 new.
This continues until all 5 machines are running v1.0.1. The key is that at no point are there zero healthy machines serving traffic.
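To make the bookkeeping concrete, here is a toy simulation of the launch-then-retire loop. It is a sketch of the mental model, not of Fly.io's actual scheduler: it assumes machines are replaced one at a time and that every new machine passes its health checks. The function name is made up for this example.

```python
def simulate_rolling_deploy(total_machines: int) -> list[tuple[int, int]]:
    """Return the (old, new) healthy-machine counts after each rollout step."""
    old, new = total_machines, 0
    history = [(old, new)]
    while old > 0:
        new += 1  # a new machine launches and passes its health checks
        history.append((old, new))
        old -= 1  # only then is one old machine shut down
        history.append((old, new))
    return history


if __name__ == "__main__":
    for old, new in simulate_rolling_deploy(5):
        print(f"old={old} new={new} serving={old + new}")
```

Because an old machine is only retired after its replacement is healthy, the number of machines serving traffic in this model never drops below the starting count.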
What most people don’t realize is that the "graceful shutdown" part of the rolling deployment is entirely dependent on your application responding correctly to termination signals. When Fly.io signals a machine to stop, it sends SIGINT by default; you can switch to SIGTERM with kill_signal in fly.toml. Your application should catch this signal, stop accepting new connections, finish processing any current requests, and then exit cleanly. If your app ignores the signal or takes too long to shut down, Fly.io will force-kill it after a timeout (kill_timeout, 5 seconds by default), potentially leading to dropped requests during the transition.
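Here is a minimal sketch of that shutdown sequence using Python's standard library HTTP server. It handles both SIGINT and SIGTERM so it works whichever stop signal the machine is configured to send. The port, the handler, and the helper names are assumptions for illustration; a real app would use its own framework's shutdown hooks.

```python
import signal
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(host="0.0.0.0", port=8080):
    server = ThreadingHTTPServer((host, port), Handler)
    # Non-daemon request threads are tracked by the server, so
    # server_close() waits for in-flight requests to finish.
    server.daemon_threads = False
    return server


def request_shutdown(server):
    # shutdown() blocks until serve_forever() returns, so it must not be
    # called from the thread that is running serve_forever().
    threading.Thread(target=server.shutdown, daemon=True).start()


def main():
    server = make_server()
    for sig in (signal.SIGINT, signal.SIGTERM):
        signal.signal(sig, lambda signum, frame: request_shutdown(server))
    server.serve_forever()  # returns once shutdown() has been requested
    server.server_close()   # stop listening, then join in-flight requests
```

Calling main() starts the server. On a stop signal it leaves the accept loop, lets current requests complete, and then returns, so the process exits on its own before any force-kill timeout, provided the in-flight requests finish in time.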
Once your rolling deployment is complete, the next challenge is often managing the version history and potentially rolling back if the new version introduces unforeseen issues.