LiteFS is an interesting beast. Replicating SQLite across Fly.io regions isn’t as simple as pointing a tool at a database file; you have to embrace eventual consistency and understand how a single-primary distributed system really behaves.

Let’s see it in action. Imagine you have a Fly.io app with a SQLite database. You deploy it to two regions, lhr and ord.

# Create one volume per region (a one-time setup). Fly.io volumes are not
# replicated; each one stores LiteFS's local data for the machine in that region.
fly volumes create my-litefs-volume --app my-fly-app --region lhr --size 10
fly volumes create my-litefs-volume --app my-fly-app --region ord --size 10

# Attach Fly.io's managed Consul, which LiteFS uses to elect the primary
fly consul attach --app my-fly-app

# The volume is mounted at /data through fly.toml rather than with a CLI command

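Now, when your application opens its database through the LiteFS mount (assumed here to be /litefs, so the file is /litefs/my_database.sqlite), LiteFS takes over. It sees every transaction committed on the primary node and ships it to the instances in the other regions. The volume at /data holds LiteFS’s own internal files; your app shouldn’t open anything there directly.

Here’s the part of fly.toml that matters. LiteFS itself is configured separately, in litefs.yml.

app = "my-fly-app"
primary_region = "lhr"
[mounts]
  source = "my-litefs-volume"
  destination = "/data"
[env]
  # Point the app at the database inside the LiteFS FUSE mount, not at the raw volume.
  # The exact URL form depends on your SQLite driver.
  DATABASE_URL = "sqlite:///litefs/my_database.sqlite"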

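LiteFS allows only one node, the primary, to write at a time. If the lhr instance holds the primary lease, the instance in ord serves reads from its local replica, and any write request it receives has to be sent on to the primary, for example through LiteFS’s built-in HTTP proxy or Fly.io’s fly-replay response header. There is no last-writer-wins setting to configure, because two nodes never commit competing transactions in the first place.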
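If you handle write forwarding in your own app rather than through the LiteFS proxy, the decision can be sketched roughly as below. This is a minimal sketch under assumptions: it relies on LiteFS exposing a `.primary` file in its FUSE directory on replica nodes (containing the primary's hostname, and absent on the primary itself) and on Fly.io's `fly-replay` response header accepting `instance=<id>`. The function names and the `/litefs` path are hypothetical.

```python
from pathlib import Path
from typing import Dict, Optional

# HTTP methods treated as writes in this sketch (an assumption about your app).
WRITE_METHODS = {"POST", "PUT", "PATCH", "DELETE"}


def current_primary(fuse_dir: str) -> Optional[str]:
    """Return the primary's hostname if this node is a replica, else None.

    Assumes LiteFS writes a `.primary` file into the FUSE directory on
    replicas only.
    """
    try:
        return (Path(fuse_dir) / ".primary").read_text().strip()
    except FileNotFoundError:
        return None


def replay_headers(method: str, fuse_dir: str) -> Optional[Dict[str, str]]:
    """Return headers asking Fly.io to replay a write on the primary.

    Returns None when the request can be handled locally: either it is a
    read, or this node is the primary.
    """
    if method.upper() not in WRITE_METHODS:
        return None
    primary = current_primary(fuse_dir)
    if primary is None:
        return None
    return {"fly-replay": f"instance={primary}"}
```

In a request handler you would call something like `replay_headers(request.method, "/litefs")` and, when it returns headers, send an empty response carrying them so that Fly.io's proxy replays the request on the primary.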

The core problem LiteFS solves is the "SQLite in the cloud" dilemma. Traditionally, SQLite is a single-file database that lives on one machine. Run it directly on an ephemeral Fly.io instance and your data disappears when the instance dies. A persistent Fly.io volume fixes that, but a volume is tied to a single machine in a single region, so you are left with a single point of failure and no multi-region availability or automatic failover. LiteFS bridges this gap by giving every instance its own full local copy of the database and keeping those copies in sync across regions.

Internally, LiteFS runs as a FUSE (Filesystem in Userspace) process. When your application writes to the SQLite file, it’s actually writing to the LiteFS FUSE filesystem. That lets LiteFS detect transaction boundaries on the primary, package the pages each transaction changed into a transaction file (the LTX format), and stream those files to the replicas, which connect to the primary over HTTP. There is no gossip between nodes. Leader election is handled through a lease: typically a Consul lease (Consul itself is built on Raft, and Fly.io offers a managed Consul so you don’t have to run it yourself), or a static lease that pins the primary to one node.
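To make the replication flow concrete, here is a toy in-memory model in which a primary records each committed transaction under an increasing ID and a replica applies them strictly in order. It is not LiteFS's actual code or file format; every class and function name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Txn:
    txid: int                # monotonically increasing transaction ID
    pages: Dict[int, bytes]  # page number -> new page contents


@dataclass
class Node:
    pages: Dict[int, bytes] = field(default_factory=dict)
    txid: int = 0            # ID of the last transaction applied

    def apply(self, txn: Txn) -> None:
        # Refuse gaps and replays: only the next transaction in sequence is valid.
        if txn.txid != self.txid + 1:
            raise ValueError(f"expected txid {self.txid + 1}, got {txn.txid}")
        self.pages.update(txn.pages)
        self.txid = txn.txid


class Primary(Node):
    def __init__(self) -> None:
        super().__init__()
        self.log: List[Txn] = []  # committed transactions, oldest first

    def commit(self, pages: Dict[int, bytes]) -> Txn:
        # The write completes locally; replicas find out later.
        txn = Txn(self.txid + 1, dict(pages))
        self.apply(txn)
        self.log.append(txn)
        return txn

    def changes_since(self, txid: int) -> List[Txn]:
        # Transaction N sits at index N - 1, so slicing at txid skips applied ones.
        return self.log[txid:]


def catch_up(replica: Node, primary: Primary) -> None:
    """Pull and apply every transaction the replica has not seen yet."""
    for txn in primary.changes_since(replica.txid):
        replica.apply(txn)
```

Between a `commit` on the primary and the next `catch_up`, the replica's pages are stale. That window is the replication lag the rest of this article is concerned with.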

The magic is that, for reads, your application still thinks it’s just talking to a local SQLite file and doesn’t need to know that LiteFS or replication exists. The one thing it, or a proxy in front of it, does have to handle is sending writes to the primary. That still makes migrating an existing SQLite application to a distributed environment remarkably straightforward.

The surprising part is how LiteFS handles writes. It doesn’t lock the file across all regions for every write, and it doesn’t merge concurrent writers either. Only the primary commits transactions; each commit gets a monotonically increasing transaction ID and is shipped to the replicas asynchronously, which apply transactions in that order. A write is complete as soon as it commits on the primary, so a replica can briefly serve slightly stale reads. That is the eventual-consistency trade-off, and it is what keeps writes fast: imagine the latency if every write had to be acknowledged by every region before it was considered complete!
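One practical consequence of asynchronous replication is that a replica may not yet contain a write the client just made. A minimal sketch of a freshness check follows, assuming LiteFS exposes each database's replication position in a file next to it (for example /litefs/my_database.sqlite-pos) formatted as a hexadecimal transaction ID, a slash, and a hexadecimal checksum. The function names are hypothetical, and the file format should be checked against the LiteFS documentation for your version.

```python
from typing import Tuple


def parse_pos(text: str) -> Tuple[int, int]:
    """Split an assumed '<hex txid>/<hex checksum>' position string."""
    txid_hex, checksum_hex = text.strip().split("/")
    return int(txid_hex, 16), int(checksum_hex, 16)


def replica_is_fresh(pos_text: str, min_txid: int) -> bool:
    """True if the replica has applied at least transaction `min_txid`."""
    txid, _checksum = parse_pos(pos_text)
    return txid >= min_txid
```

An app could remember the transaction ID of a client's last write (in a cookie, say) and, on a replica, either wait or send the request to the primary when `replica_is_fresh` returns False.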

The next hurdle you’ll likely encounter is understanding how to manage failover scenarios and the implications of eventual consistency on your application’s read operations.
