Neon Compute: Shared vs Dedicated for Your Workload (2026)

Neon Compute, the backbone of your serverless PostgreSQL, offers two fundamental deployment models: Shared and Dedicated. For many workloads, the two differ less in raw performance than in predictability and isolation.

Let’s see this in action. Imagine two identical Python applications, both connecting to Neon.

import os
import time

import psycopg2  # PostgreSQL driver (pip install psycopg2-binary)

# Assumes DATABASE_URL holds the connection string for your Neon database
DATABASE_URL = os.environ["DATABASE_URL"]


def time_queries(dsn, runs=100):
    """Run a 10 ms query `runs` times and return the total wall-clock seconds."""
    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            start_time = time.perf_counter()
            # Simulate a read query that might be affected by noisy neighbors
            for _ in range(runs):
                cur.execute("SELECT pg_sleep(0.01)")
            return time.perf_counter() - start_time
    finally:
        conn.close()


# --- Scenario 1: Shared Compute ---
print("--- Running on Shared Compute ---")
print(f"Shared Compute duration: {time_queries(DATABASE_URL):.2f}s")

# --- Scenario 2: Dedicated Compute ---
print("\n--- Running on Dedicated Compute ---")
# In a real scenario, you'd create and attach a dedicated endpoint before this run.
# Here the same query loop is simply repeated, so the second number only
# illustrates how the measurement would be taken.
print(f"Dedicated Compute duration: {time_queries(DATABASE_URL):.2f}s")

Run against real endpoints, you would expect the "Shared Compute" duration to fluctuate more from run to run, sometimes faster and sometimes slower than the "Dedicated Compute" duration. This isn’t because the underlying PostgreSQL engine differs, but because shared compute resources are allocated dynamically and can be affected by other users on the same infrastructure. Dedicated compute, by contrast, provisions a fixed set of resources for your exclusive use.
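Because the difference is about predictability rather than average speed, it helps to compare the spread and tail of per-query latencies, not only the total duration. The following is a minimal sketch of such a summary. The function name summarize_latencies and both sample lists are invented for illustration and are not measurements from Neon.

```python
import statistics


def summarize_latencies(samples_ms):
    """Return mean, standard deviation, p95 and p99 for per-query latencies (ms)."""
    ordered = sorted(samples_ms)

    def percentile(p):
        # Nearest-rank percentile, using integer ceiling division for the rank.
        rank = max(1, (p * len(ordered) + 99) // 100)
        return ordered[rank - 1]

    return {
        "mean": statistics.fmean(ordered),
        "stdev": statistics.pstdev(ordered),
        "p95": percentile(95),
        "p99": percentile(99),
    }


# Made-up samples: identical mean latency, very different tails.
steady = [12.0] * 100
jittery = [10.0] * 95 + [50.0] * 5

steady_stats = summarize_latencies(steady)
jittery_stats = summarize_latencies(jittery)
print("steady :", steady_stats)
print("jittery:", jittery_stats)
```

Both sample sets average 12 ms, yet only the jittery one has a non-zero standard deviation and a p99 far above its mean. That is the kind of gap to look for when comparing endpoints.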

The core problem Neon Compute solves is the traditional trade-off between the operational simplicity of serverless and the predictable performance of dedicated infrastructure. Shared compute offers the "pay-per-use" elasticity of serverless, where you’re billed based on actual compute time consumed. It’s excellent for variable, spiky, or less latency-sensitive workloads where occasional micro-pauses or slight latency increases are acceptable.

Dedicated compute, however, provides a stable, consistent performance baseline. You provision compute endpoints with specific CPU and memory configurations (e.g., cpu_cores: 2, memory_gb: 4). This is ideal for latency-sensitive applications, production environments with strict SLAs, or workloads that require consistent throughput and predictable response times, even under load. You pay a fixed hourly rate for the provisioned endpoint, regardless of actual usage, ensuring your resources are always ready.
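The two billing models can be compared with a simple break-even calculation. The sketch below assumes a usage-based price per active compute hour and a fixed price per provisioned hour. Both rates and the 730-hour month are hypothetical placeholders, not Neon prices.

```python
HOURS_IN_MONTH = 730  # assumption: average month length in hours


def usage_based_cost(active_hours, rate_per_active_hour):
    """Pay-per-use: billed only for the hours compute is actually active."""
    return active_hours * rate_per_active_hour


def fixed_cost(rate_per_hour, hours=HOURS_IN_MONTH):
    """Fixed hourly rate: billed for every provisioned hour, used or not."""
    return hours * rate_per_hour


def break_even_active_hours(fixed_rate, usage_rate, hours=HOURS_IN_MONTH):
    """Active hours per month at which both models cost the same."""
    return hours * fixed_rate / usage_rate


# Hypothetical rates, chosen only to make the arithmetic easy to follow.
USAGE_RATE = 0.20  # per active hour
FIXED_RATE = 0.10  # per provisioned hour

break_even = break_even_active_hours(FIXED_RATE, USAGE_RATE)
print(f"Break-even at about {break_even:.0f} active hours per month")
for active in (100, 500):
    print(active, usage_based_cost(active, USAGE_RATE), fixed_cost(FIXED_RATE))
```

Below the break-even point the usage-based model is cheaper, and above it the fixed rate wins. Real pricing has more dimensions, such as storage and data transfer, so treat this only as a first-pass estimate.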

Internally, both shared and dedicated compute run on the same robust PostgreSQL engine. The difference lies in the resource allocation layer above it. Shared compute uses a dynamic pooling mechanism. When your query needs compute, a slot is allocated from a shared pool. If the pool is busy due to other users’ high demand, your query might experience a slight delay in getting its allocated resources, manifesting as increased latency or lower throughput. Dedicated compute bypasses this pooling for your specific endpoint; the resources are already reserved and waiting for your queries.
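The pooling behaviour described above can be illustrated with a toy model. This is not Neon's actual scheduler. It assumes time moves in discrete ticks, that one request is issued per tick, and that a request waits until the neighbours leave at least one slot free. The function name and the demand numbers are invented for the example.

```python
def waits_for_slot(pool_slots, neighbor_demand, reserved=False):
    """Return, per tick, how many ticks a request issued at that tick waits."""
    waits = []
    for tick in range(len(neighbor_demand)):
        wait = 0
        if not reserved:
            # Shared pool: scan forward until neighbours leave a slot free.
            later = tick
            while later < len(neighbor_demand) and neighbor_demand[later] >= pool_slots:
                wait += 1
                later += 1
        # Reserved (dedicated) capacity is always available, so wait stays 0.
        waits.append(wait)
    return waits


# Slots the neighbours want at each tick (made-up numbers).
neighbor_demand = [1, 2, 4, 5, 4, 1, 0, 3]

shared_waits = waits_for_slot(pool_slots=4, neighbor_demand=neighbor_demand)
dedicated_waits = waits_for_slot(pool_slots=4, neighbor_demand=neighbor_demand, reserved=True)
print("shared   :", shared_waits)
print("dedicated:", dedicated_waits)
```

In this model the shared requests issued while the neighbours saturate the pool pick up a delay, while the reserved endpoint never waits, regardless of neighbour activity.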

The real power of Neon’s architecture is how seamlessly you can switch between these models. You can start with shared compute for development and testing, then migrate to a dedicated endpoint for production with minimal code changes. The connection string remains the same; you simply provision and attach a different compute endpoint to your Neon database. This flexibility allows you to optimize for cost and performance as your application evolves.

A common misconception is that "shared" automatically means "slow." This isn’t always true. For many common read-heavy workloads, shared compute can be fast because Neon’s architecture allows it to scale out compute resources rapidly to meet demand. The potential for reduced performance comes not from the technology itself, but from the inherent variability of sharing a finite resource pool with potentially unpredictable neighbors. Shared compute is cost-effective because it oversubscribes compute resources, on the assumption that not all users will be at peak load simultaneously, and dynamically rebalances those resources as demand shifts.
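The oversubscription bet can be made concrete with a small probability estimate. The sketch below assumes each tenant is at peak load independently with the same probability, which real workloads rarely satisfy exactly. The tenant count, peak probability, and capacity are hypothetical.

```python
from math import comb


def prob_demand_exceeds_capacity(tenants, peak_probability, capacity):
    """P(X > capacity) for X ~ Binomial(tenants, peak_probability).

    X counts how many tenants are at peak load at the same moment, and
    capacity is how many simultaneous peaks the pool can serve.
    """
    return sum(
        comb(tenants, k)
        * peak_probability ** k
        * (1 - peak_probability) ** (tenants - k)
        for k in range(capacity + 1, tenants + 1)
    )


# Hypothetical pool: 20 tenants, each at peak 10% of the time,
# with capacity for only 5 simultaneous peaks (4:1 oversubscription).
contention = prob_demand_exceeds_capacity(tenants=20, peak_probability=0.1, capacity=5)
print(f"Chance that demand exceeds capacity: {contention:.2%}")
```

Under these assumptions contention is rare but not impossible, which matches the picture of shared compute being fast most of the time with occasional slowdowns. Correlated peaks, for example many tenants busy during the same business hours, would make contention more likely than this independent model suggests.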

When you’re ready to move beyond basic performance, you’ll want to explore Neon’s branching and merging capabilities.