Neon’s storage costs can be surprisingly high if you’re not careful, and often the culprit isn’t the raw data size, but how it’s being managed.

Let’s see this in action. Imagine you have a Neon project with a few databases. You’ve been happily inserting data, but suddenly your storage bill spikes.

-- Example: Creating a table and inserting some data
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    name VARCHAR(100),
    email VARCHAR(100)
);

INSERT INTO users (name, email)
SELECT 'User ' || n, 'user' || n || '@example.com'
FROM generate_series(1, 1000000) AS n;

You might think, "Okay, 1 million rows, that’s not that much data." But if you’ve also been deleting and updating rows without proper vacuuming, or if you have tables with many versions of rows lingering, your actual storage usage can be much larger than the pg_relation_size() suggests.

Here’s how Neon’s storage actually works and why those costs can creep up:

Neon, like PostgreSQL, uses a Multi-Version Concurrency Control (MVCC) system. When you update or delete a row, PostgreSQL (and by extension, Neon) doesn’t immediately remove the old version. Instead, it marks it as "dead" and creates a new version. This allows concurrent transactions to see consistent snapshots of the data. Over time, these dead rows, or "tuples," accumulate.

The storage cost in Neon is directly tied to the physical space consumed by your data files on disk, not just the logical amount of "current" data. This includes live data, dead tuples waiting to be cleaned up, and space reserved for future writes that hasn’t been reclaimed.

Here are the main drivers of unexpected storage costs and how to manage them:

1. Bloat from Unvacuumed Tables

PostgreSQL needs to periodically clean up these dead tuples. This process is called VACUUM. If VACUUM (or AUTOVACUUM) isn’t running effectively, dead tuples accumulate, making tables larger than they need to be.

  • Diagnosis: Check your table bloat. You can use a query like this:

    SELECT
        schemaname,
        relname,
        n_live_tup,
        n_dead_tup,
        pg_size_pretty(pg_table_size(oid)) AS table_size,
        pg_size_pretty(pg_total_relation_size(oid)) AS total_size,
        CASE
            WHEN n_live_tup > 0 THEN
                (n_dead_tup::numeric / n_live_tup) * 100
            ELSE 0
        END AS dead_tuple_ratio
    FROM pg_stat_user_tables
    ORDER BY n_dead_tup DESC
    LIMIT 20;
    

    Look for tables with a high n_dead_tup count or a high dead_tuple_ratio.

  • Fix: Manually run VACUUM on bloated tables. For a more aggressive cleanup, use VACUUM FULL.

    VACUUM (VERBOSE, ANALYZE) your_bloated_table_name;
    -- Or for more aggressive cleanup, but locks the table:
    -- VACUUM FULL your_bloated_table_name;
    

    VACUUM marks dead tuples as reusable space. VACUUM FULL rewrites the entire table and its indexes, reclaiming space by removing dead tuples and reorganizing data, but it requires an exclusive lock.

  • Why it works: VACUUM reclaims space occupied by dead tuples, making the physical file size smaller. VACUUM FULL rebuilds the table, removing all dead space and reorganizing it contiguously.

2. Ineffective Autovacuum Configuration

Neon has autovacuum enabled by default, but its thresholds might not be aggressive enough for your workload, especially if you have very high write/update/delete activity.

  • Diagnosis: Check autovacuum settings.

    SHOW autovacuum_vacuum_threshold;
    SHOW autovacuum_vacuum_scale_factor;
    SHOW autovacuum_analyze_threshold;
    SHOW autovacuum_analyze_scale_factor;
    

    If autovacuum_vacuum_scale_factor is too high (e.g., 0.2 or 20%), autovacuum might only trigger after a huge number of changes.

  • Fix: Adjust autovacuum parameters per table. It’s generally recommended to tune these on a per-table basis for tables with high churn.

    ALTER TABLE your_high_churn_table SET (autovacuum_vacuum_threshold = 1000, autovacuum_vacuum_scale_factor = 0.05);
    ALTER TABLE your_high_churn_table SET (autovacuum_analyze_threshold = 500, autovacuum_analyze_scale_factor = 0.02);
    

    These values mean autovacuum will run when at least 1000 rows + 5% of the table are dead, or 500 rows + 2% of the table are new/changed for analysis.

  • Why it works: Lowering the thresholds makes autovacuum more proactive, cleaning up dead tuples more frequently before they accumulate significantly, thus keeping physical file sizes smaller.

3. Large Transaction Logs (WAL)

While Neon handles WAL efficiently, extremely long-running transactions or very high write volumes can lead to temporary increases in storage due to WAL files that haven’t been fully archived or recycled. This is less about permanent bloat and more about temporary growth.

  • Diagnosis: Monitor WAL segment growth. While Neon abstracts much of this, observing a consistent upward trend in storage that doesn’t correlate with new data insertion might hint at WAL-related activity or retention issues. Neon’s dashboard or monitoring tools can show overall storage usage trends.

  • Fix: Ensure no extremely long-running transactions are open. Review application logic for BEGIN/COMMIT pairs.

    SELECT pid, datname, usename, state, query_start, query
    FROM pg_stat_activity
    WHERE state = 'active' AND backend_xid IS NOT NULL
    ORDER BY query_start;
    

    Identify and terminate any excessively long-running transactions if appropriate.

  • Why it works: Long-running transactions prevent old WAL segments from being recycled, as they might need to access older transaction states. Closing them allows WAL recycling.

4. Unused Indexes

Indexes are crucial for performance, but they also consume storage. If you have indexes on columns that are no longer queried or on tables that are no longer used, they contribute to storage costs without providing value.

  • Diagnosis: Identify unused indexes. PostgreSQL 13+ has pg_stat_user_indexes which tracks index usage.

    SELECT
        schemaname,
        relname,
        indexrelname,
        pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
        idx_scan,
        idx_tup_read
    FROM pg_stat_user_indexes
    WHERE idx_scan = 0 AND relname NOT IN (SELECT tablename FROM pg_views)
    ORDER BY pg_relation_size(indexrelid) DESC;
    

    Look for indexes with idx_scan = 0 and significant index_size.

  • Fix: Drop unused indexes.

    DROP INDEX IF EXISTS your_unused_index_name ON your_table_name;
    
  • Why it works: Removing an index directly reduces the physical storage footprint of your database.

5. Temporary Tables and Large Sort Operations

Complex queries, especially those involving ORDER BY or GROUP BY on large datasets without appropriate indexes, can spill to disk and create temporary files. These are usually cleaned up, but in some scenarios, they can contribute to temporary storage growth.

  • Diagnosis: Monitor pg_stat_database for temp_files and temp_bytes.

    SELECT datname, temp_files, temp_bytes
    FROM pg_stat_database
    WHERE temp_bytes > 0;
    
  • Fix: Optimize queries that generate large temporary files. This often involves adding indexes, rewriting subqueries, or using more efficient join methods.

    -- Example: Add an index to help a sorting operation
    CREATE INDEX idx_users_email ON users (email);
    
  • Why it works: Proper indexing can allow sorts and group-bys to be performed in memory or more efficiently, reducing the need for disk-based temporary file creation.

6. Large Object (LOB) Storage

Storing large binary objects (like images or documents) directly in tables using BYTEA or BLOB types can significantly increase table size.

  • Diagnosis: Identify large BYTEA columns or large row sizes.

    SELECT
        schemaname,
        relname,
        pg_size_pretty(pg_table_size(oid)) AS table_size,
        pg_size_pretty(pg_total_relation_size(oid)) AS total_size
    FROM pg_stat_user_tables
    ORDER BY pg_table_size(oid) DESC
    LIMIT 10;
    

    If a table with large objects is at the top, investigate its row sizes.

  • Fix: Consider storing LOBs externally (e.g., in S3) and only store references (URLs or IDs) in your database.

    -- Example: Instead of storing the image data directly:
    -- CREATE TABLE products (id SERIAL PRIMARY KEY, name VARCHAR(100), image BYTEA);
    -- Store a URL:
    CREATE TABLE products (id SERIAL PRIMARY KEY, name VARCHAR(100), image_url VARCHAR(255));
    
  • Why it works: Moving large binary data out of the database reduces the physical size of your tables and WAL, leading to lower storage costs.

By actively monitoring and managing these aspects, you can keep your Neon storage costs in check and avoid surprise bills.

The next error you’ll likely encounter if you’ve fixed all bloat issues is related to pg_wal segment growth if you have very high write throughput and long-running transactions, or potentially relation '...' does not exist if you drop something you shouldn’t have.

Want structured learning?

Take the full Neon course →