Neon Compute-Storage Separation: Architecture Explained (2026)

Neon’s compute-storage separation is less about putting compute and storage on different machines and more about turning storage into a shared, elastic service that scales independently of compute and can serve many compute nodes at once.

Let’s see this in action. Imagine a Neon compute node, c-abcd1234efgh, attached to a branch whose history is a shared, durable log of operations. This log isn’t just a passive record; it’s an ordered sequence of every change to the database state. When a query comes in, c-abcd1234efgh looks for the pages it needs in its local cache and requests any missing ones from the storage layer, which reconstructs them from the log. When c-abcd1234efgh writes, it appends its changes to that log. Each branch has a single read-write compute, but a second compute node, c-ijkl5678mnop, can attach to the same branch as a read replica and see the same history. The beauty is that the storage layer, which in Neon is a set of purpose-built services (safekeepers and pageservers) rather than an off-the-shelf stream processor, keeps this log durably stored and available to every compute node, even those that weren’t running when the write occurred.

The core problem Neon solves is the traditional bottleneck of monolithic database architectures. In a single-node system, or even a sharded one with tightly coupled compute and storage, scaling one often means scaling the other, which leads to over-provisioning or under-utilization. Neon breaks this coupling. Compute nodes hold no durable state and are ephemeral: they can be started or stopped in seconds to follow query load without touching the underlying data. The storage layer, meanwhile, is built around a persistent, append-only log of all database operations. A separate service processes this log and materializes the database state into immutable page versions, which are then stored durably. Because of this separation, compute capacity and storage size grow independently: adding query capacity does not require moving data, and a growing data set does not require larger compute nodes.

Here’s a breakdown of the key components:

Compute Nodes: These are the engines that execute SQL queries. They fetch data from the shared storage layer, process it, and return results. Because they keep no durable state of their own, they can be added or removed dynamically. You can see them in the Neon console, or list them through the Neon API, where they appear as a project’s compute endpoints.
Shared Storage Layer (WAL): This is the heart of Neon’s separation. It’s an append-only log of all database operations (the Write-Ahead Log, or WAL). Think of it as an immutable history of every change made to the database. The compute streams its WAL to a small group of safekeeper processes, and a write counts as committed once a quorum of them has persisted it. From there, the log is handed to the pageserver.
Storage Layer (Pageserver): This component consumes the WAL and materializes the database state into immutable, versioned page data, which is then stored durably in object storage (e.g., S3). The pageserver is responsible for serving pages to compute nodes on demand. You can’t directly "see" the pageserver in the same way you see a compute node, but its health and performance are critical. In a self-hosted Neon, you’d monitor its logs for errors and watch its resource utilization.
Metadata Management: A control plane tracks projects, branches, and compute endpoints, while the storage layer keeps the mapping between branches, WAL positions, and the physical files that hold page data.
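
To make the relationship between the log and the pageserver concrete, here is a minimal Python sketch. It is a toy model, not Neon’s implementation: the names WalRecord, ToyPageserver, ingest, and get_page are invented for illustration, and a page is modeled as a small dictionary rather than an 8 KB Postgres page. The idea it shows is that the storage side only ever appends records, and any page version can be rebuilt by replaying the records up to a given log sequence number (LSN).

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class WalRecord:
    lsn: int       # log sequence number: position of this change in the log
    page_id: int   # which page the change touches
    key: str
    value: str


@dataclass
class ToyPageserver:
    """Hypothetical stand-in: stores WAL per page and rebuilds pages on request."""
    records: dict[int, list[WalRecord]] = field(default_factory=dict)
    last_lsn: int = 0

    def ingest(self, record: WalRecord) -> None:
        # The log is append-only, so LSNs must strictly increase.
        if record.lsn <= self.last_lsn:
            raise ValueError("WAL records must arrive in increasing LSN order")
        self.records.setdefault(record.page_id, []).append(record)
        self.last_lsn = record.lsn

    def get_page(self, page_id: int, lsn: int) -> dict[str, str]:
        # Replay every change to this page up to and including `lsn`.
        page: dict[str, str] = {}
        for rec in self.records.get(page_id, []):
            if rec.lsn > lsn:
                break
            page[rec.key] = rec.value
        return page


ps = ToyPageserver()
ps.ingest(WalRecord(lsn=10, page_id=1, key="alice", value="100"))
ps.ingest(WalRecord(lsn=20, page_id=1, key="bob", value="50"))
ps.ingest(WalRecord(lsn=30, page_id=1, key="alice", value="70"))

page_at_20 = ps.get_page(1, lsn=20)  # state before the last update
page_at_30 = ps.get_page(1, lsn=30)  # latest state
```

Nothing is overwritten in this model, which is why older versions stay readable. A real system would periodically store full page images so that a read does not have to replay the whole history.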

The magic happens when a compute node needs data. It first checks its local cache. If the page isn’t there, it requests it from the pageserver, naming the log position (the log sequence number, or LSN) it wants to read at. The pageserver, using the WAL and its stored page versions, reconstructs the page as of that position and sends it back. This means every compute node attached to a branch reads a consistent view of the data, derived from the same WAL. Writes are appended to the WAL and become durable there first; the pageserver then ingests them and updates durable page storage asynchronously.
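
The read path just described can be sketched as a small cache in front of a remote page fetch. This is an illustrative model only: ToyCompute, read_page, and fake_pageserver are hypothetical names, the cache is a plain LRU, and the "pageserver" is a local function standing in for a network call.

```python
from collections import OrderedDict
from typing import Callable

Page = bytes
FetchFn = Callable[[int, int], Page]  # (page_id, lsn) -> page contents


class ToyCompute:
    """Hypothetical compute node: local LRU cache, remote fetch on a miss."""

    def __init__(self, fetch_page: FetchFn, cache_size: int = 2) -> None:
        self.fetch_page = fetch_page
        self.cache_size = cache_size
        self.cache: OrderedDict[tuple[int, int], Page] = OrderedDict()
        self.remote_fetches = 0

    def read_page(self, page_id: int, lsn: int) -> Page:
        key = (page_id, lsn)
        if key in self.cache:
            self.cache.move_to_end(key)  # mark as most recently used
            return self.cache[key]
        # Cache miss: ask the storage layer for this page version.
        page = self.fetch_page(page_id, lsn)
        self.remote_fetches += 1
        self.cache[key] = page
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # evict the least recently used entry
        return page


def fake_pageserver(page_id: int, lsn: int) -> Page:
    # Stand-in for the storage layer; a real one would rebuild the page from WAL.
    return f"page {page_id} as of LSN {lsn}".encode()


compute = ToyCompute(fake_pageserver)
compute.read_page(1, 100)  # miss: fetched remotely
compute.read_page(1, 100)  # hit: served from the local cache
compute.read_page(2, 100)  # miss
compute.read_page(3, 100)  # miss: cache is full, page 1 is evicted
compute.read_page(1, 100)  # miss again after eviction
```

Losing the cache costs only latency, never data, which is what makes it safe to stop a compute node at any time.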

The truly surprising part of Neon’s architecture is how it handles point-in-time recovery and branching. Because the WAL is an immutable history, and the pageserver can reconstruct any page as of any retained log position, you can create a branch at any point in time within the history retention window, or rewind an existing branch to one. This isn’t just restoring from a backup; it’s starting a new, independent compute instance that sees the database as it existed at that precise moment, all without copying massive amounts of data. With the Neon CLI, creating such a branch looks roughly like neonctl branches create --name my-feature-branch --parent main@2023-10-27T10:00:00Z (check neonctl branches create --help for the exact syntax in your version).
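
Why branching needs no data copy can be shown with a small sketch. It is a simplified model under stated assumptions: the Branch class and its put, get, and branch methods are invented for illustration, writes are assumed to be appended in increasing LSN order, and key-value pairs stand in for pages. A child branch stores only its parent and the LSN at which it forked; reads fall through to the parent for anything the child has not changed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Branch:
    """Hypothetical copy-on-write branch: own writes plus a pointer to the parent."""
    name: str
    parent: Optional[Branch] = None
    branch_lsn: int = 0  # parent history is visible only up to this LSN
    writes: list[tuple[int, str, str]] = field(default_factory=list)  # (lsn, key, value)

    def put(self, lsn: int, key: str, value: str) -> None:
        self.writes.append((lsn, key, value))

    def get(self, key: str, lsn: int) -> Optional[str]:
        # The newest own write at or before `lsn` wins.
        for w_lsn, w_key, w_value in reversed(self.writes):
            if w_key == key and w_lsn <= lsn:
                return w_value
        # Otherwise fall back to the parent, capped at the branch point.
        if self.parent is not None:
            return self.parent.get(key, min(lsn, self.branch_lsn))
        return None

    def branch(self, name: str, at_lsn: int) -> Branch:
        # No data is copied: the child only records where it forked.
        return Branch(name=name, parent=self, branch_lsn=at_lsn)


main = Branch("main")
main.put(10, "plan", "free")
main.put(30, "plan", "pro")

# Fork at LSN 20, i.e. before main changed the value at LSN 30.
feature = main.branch("my-feature-branch", at_lsn=20)
seen_at_fork = feature.get("plan", lsn=25)

feature.put(40, "plan", "trial")
seen_on_feature = feature.get("plan", lsn=50)
seen_on_main = main.get("plan", lsn=50)
```

Creating the branch is a constant-time operation here regardless of how much data main holds, and later writes on either side stay invisible to the other.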

This architecture gives you fine-grained control over scaling and cost. You pay for compute while it is running, and idle compute can be suspended; you pay for storage based on the data actually stored. Because compute nodes start and stop in seconds, you can handle peak loads without paying for idle capacity the rest of the time.
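
As a rough illustration of that billing model, the arithmetic can be written out in a few lines of Python. The function name, the compute-unit abstraction, and every rate below are hypothetical placeholders chosen for easy arithmetic, not Neon’s actual prices; substitute the figures from your own plan.

```python
def monthly_cost(
    active_hours: float,
    compute_units: float,
    price_per_cu_hour: float,
    stored_gb: float,
    price_per_gb_month: float,
) -> float:
    """Compute is billed for the hours it runs, storage for the data held."""
    compute = active_hours * compute_units * price_per_cu_hour
    storage = stored_gb * price_per_gb_month
    return compute + storage


# Hypothetical rates, for illustration only.
RATE_CU_HOUR = 0.25
RATE_GB_MONTH = 0.5

# Same 50 GB database: compute active 120 hours versus running all month (730 hours).
scale_to_zero = monthly_cost(120, 1, RATE_CU_HOUR, 50, RATE_GB_MONTH)
always_on = monthly_cost(730, 1, RATE_CU_HOUR, 50, RATE_GB_MONTH)
```

The storage term is identical in both cases; only the compute term changes with how long the node runs, which is the practical effect of decoupling the two.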

The next frontier you’ll encounter is understanding how Neon manages data consistency across these independently scaled components, particularly in scenarios involving high-volume concurrent writes and reads.