The MongoDB WiredTiger cache is designed to keep your working set in RAM, but its default configuration often leaves significant performance on the table.

Let’s see it in action. Imagine a busy e-commerce platform. We’ve got a collection orders with millions of documents.

// Simulate a read operation on a frequently accessed order
db.orders.findOne({ _id: ObjectId("60b8f2a4b1f8c8a3d4e5f6a7") });

If this order’s data, along with its index, is in the WiredTiger cache, the read is near-instantaneous. If not, MongoDB has to go to disk, which is orders of magnitude slower. The goal is to make sure that frequently accessed data is in that cache.

The core problem the WiredTiger cache addresses is the trade-off between disk I/O and RAM usage. MongoDB uses WiredTiger as its default storage engine, and reads pass through two layers of caching rather than a single buffer pool: WiredTiger’s own internal cache, which holds data and index pages in memory, and the operating system’s filesystem cache, which holds recently read blocks of the on-disk data files.

Here’s the mental model:

  1. WiredTiger Internal Cache: This is the primary cache for WiredTiger itself. It holds data and index pages in their uncompressed, in-memory form (indexes keep their prefix compression). When MongoDB needs data, it first checks this cache. If it’s there, great. If not, WiredTiger has to read the block from disk (or from the filesystem cache), decompress it, and then store the page in its internal cache.
  2. Filesystem Cache: WiredTiger also leverages the operating system’s filesystem cache, which holds blocks in the same compressed format as the data files on disk. MongoDB doesn’t directly control this, but the cache size you choose determines how much memory is left over for it.
  3. The storage.wiredTiger.engineConfig.cacheSizeGB setting: This is the big lever. It sets the maximum size of the WiredTiger internal cache. The default is the larger of 50% of (RAM minus 1 GB) and 256 MB, which is a sensible starting point but is not tuned to your working set or to whatever else runs on the host.
  4. Journaling: WiredTiger’s journaling mechanism is crucial for durability. It writes changes to a journal (a write-ahead log) before they are checkpointed to the data files. Journal buffers use some memory of their own, and write-heavy workloads leave dirty pages in the cache until they are written out.

The common advice is to set cacheSizeGB to a significant portion of your available RAM, but what matters is the exact percentage and how the cache coexists with the other processes on the system.

Consider a scenario where you’ve just restarted your MongoDB instance, or you’ve had a period of low activity. Your cache is cold. The first few reads for frequently accessed data will be slow. As those documents are read and cached, subsequent reads will be fast. This is the "warming up" process.

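The cacheSizeGB parameter caps the WiredTiger internal cache, not mongod’s total memory footprint: connections, in-memory sorts, and aggregations all need RAM on top of it, and the operating system’s filesystem cache needs memory too. The default already works out to roughly half of RAM (about 31.5GB on a 64GB server). A commonly cited range for a dedicated database server is 50% to 75% of the available RAM, leaving the rest for the OS and other processes. For a 64GB RAM server, this might mean setting cacheSizeGB to 32 or even 48. MongoDB’s own documentation is cautious about going above the default, so treat the upper end of that range as something to validate with measurements rather than a safe preset.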
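The arithmetic behind those numbers can be sketched in a few lines. This is only an illustration: the helper names are made up for this example, and the formula is the documented default (the larger of 50% of RAM minus 1 GB, and 256 MB), not something read from a live server.

```javascript
// Sketch: the documented default for the WiredTiger internal cache is
// the larger of 50% of (RAM - 1 GB) and 256 MB (0.25 GB).
// Function names here are hypothetical, chosen for this example.
function defaultCacheSizeGB(totalRamGB) {
  return Math.max(0.5 * (totalRamGB - 1), 0.25);
}

// Fraction of total RAM that a given cacheSizeGB value represents.
function cacheShareOfRam(cacheSizeGB, totalRamGB) {
  return cacheSizeGB / totalRamGB;
}

console.log(defaultCacheSizeGB(64));   // 31.5
console.log(cacheShareOfRam(32, 64));  // 0.5
console.log(cacheShareOfRam(48, 64));  // 0.75
```

On a 64GB server, 32 is barely above the default, while 48 takes a quarter of the machine away from the OS and everything else.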

To check your current configuration, you’d look at your mongod.conf file or query the running instance:

# On the server where mongod is running (no output means the default is in effect)
grep cacheSizeGB /etc/mongod.conf

Or via mongosh:

// Effective WiredTiger cache limit, in bytes
db.serverStatus().wiredTiger.cache["maximum bytes configured"]
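The cache section of db.serverStatus() reports the configured limit alongside current usage, both in bytes. A small helper makes those numbers easier to read. The function name and the sample values below are invented for illustration; the two statistic names are the ones serverStatus exposes.

```javascript
// Hypothetical helper: summarize the cache section of db.serverStatus().
function summarizeCache(cacheStats) {
  const GiB = 1024 ** 3;
  const maxBytes = cacheStats["maximum bytes configured"];
  const usedBytes = cacheStats["bytes currently in the cache"];
  return {
    configuredGB: maxBytes / GiB,
    usedGB: usedBytes / GiB,
    usedPercent: (usedBytes / maxBytes) * 100,
  };
}

// Made-up sample standing in for db.serverStatus().wiredTiger.cache
const sampleCacheStats = {
  "maximum bytes configured": 40 * 1024 ** 3,
  "bytes currently in the cache": 30 * 1024 ** 3,
};
console.log(summarizeCache(sampleCacheStats));

// In mongosh, pass the live document instead:
//   summarizeCache(db.serverStatus().wiredTiger.cache)
```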

The fix involves modifying the mongod.conf file and restarting mongod. For example, to set the cache to 40GB on a server with 64GB RAM:

storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 40

This change allows WiredTiger to hold more data blocks and index entries in memory. When a read operation occurs, there’s a higher probability that the required data is already in RAM, significantly reducing disk I/O and improving read latency. The OS will then manage the remaining RAM for its filesystem cache.
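If a restart is inconvenient, the cache can also be resized on a running instance through the wiredTigerEngineRuntimeConfig server parameter. The sketch below only builds the command document; the function name is hypothetical, and a change made this way does not survive a restart, so the config file still needs the same value.

```javascript
// Build the admin command that resizes the WiredTiger cache at runtime.
// Hypothetical helper; whole gigabytes only, to keep the sketch simple.
function buildCacheResizeCommand(sizeGB) {
  if (!Number.isInteger(sizeGB) || sizeGB <= 0) {
    throw new RangeError("sizeGB must be a positive integer in this sketch");
  }
  return {
    setParameter: 1,
    wiredTigerEngineRuntimeConfig: `cache_size=${sizeGB}G`,
  };
}

const resizeCommand = buildCacheResizeCommand(40);
console.log(resizeCommand);

// In mongosh, send it to the server:
//   db.adminCommand(buildCacheResizeCommand(40))
```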

The storage.wiredTiger.collectionConfig.blockCompressor setting is another related tuning knob. It does not change how much data fits in the WiredTiger internal cache, because pages are held there uncompressed, but the choice of compression algorithm (snappy, zlib, zstd, or none) determines the size of blocks on disk and in the filesystem cache. snappy offers a good balance of compression ratio and CPU overhead, which is why it is the default. If your CPU is maxed out, stay with snappy (or consider none) rather than a heavier compressor. If disk I/O is your bottleneck and you have spare CPU, a more aggressive compressor like zstd shrinks the on-disk data, so the filesystem cache can hold more of it and a WiredTiger cache miss is less likely to hit the physical disk.
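To see which compressor an existing collection actually uses, you can look at the creation string in its collection stats. The parser below is a minimal sketch: the function name is invented, and the sample string is a shortened, made-up example of what creationString looks like.

```javascript
// Extract the block compressor from a WiredTiger creation string.
// Returns null if the key is absent, "none" if it is present but empty.
function blockCompressorOf(creationString) {
  const match = /(?:^|,)block_compressor=([^,]*)/.exec(creationString);
  if (!match) return null;
  return match[1] || "none";
}

// Shortened, illustrative creation string (real ones are much longer)
const sampleCreationString =
  "access_pattern_hint=none,allocation_size=4KB,block_compressor=snappy,cache_resident=false";
console.log(blockCompressorOf(sampleCreationString)); // snappy

// In mongosh, for the orders collection:
//   blockCompressorOf(db.orders.stats().wiredTiger.creationString)
```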

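When you set cacheSizeGB, you’re not just telling MongoDB how much RAM it may use for the cache. You’re also influencing the eviction process. WiredTiger uses an approximation of a least-recently-used (LRU) policy to decide which pages to evict, and background eviction starts before the cache is completely full. If the cache keeps filling anyway, application threads are pulled in to help evict, which shows up as added latency on ordinary operations. A larger cache means less frequent evictions and a higher likelihood that frequently accessed data remains resident.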
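A rough way to tell how close the cache is to eviction pressure is to compare its fill level against WiredTiger’s eviction thresholds. The sketch below assumes the commonly cited defaults of 80% (background eviction target) and 95% (the point where application threads start helping); treat both numbers, and the function name, as assumptions to check against your version.

```javascript
// Assumed defaults; verify against your MongoDB/WiredTiger version.
const EVICTION_TARGET_PCT = 80;   // background eviction aims to stay below this
const EVICTION_TRIGGER_PCT = 95;  // application threads begin to evict here

// Hypothetical helper: classify cache fill from db.serverStatus().wiredTiger.cache.
function evictionPressure(cacheStats) {
  const fillPct =
    (100 * cacheStats["bytes currently in the cache"]) /
    cacheStats["maximum bytes configured"];
  let state = "below eviction target";
  if (fillPct >= EVICTION_TRIGGER_PCT) {
    state = "application threads may be evicting";
  } else if (fillPct >= EVICTION_TARGET_PCT) {
    state = "background eviction active";
  }
  return { fillPct, state };
}

// Made-up sample: 85 of 100 units in use
console.log(evictionPressure({
  "bytes currently in the cache": 85,
  "maximum bytes configured": 100,
}));

// In mongosh: evictionPressure(db.serverStatus().wiredTiger.cache)
```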

The real magic happens when you consider the working set. Your "working set" is the portion of your data and indexes that your application accesses most frequently. The goal of tuning the WiredTiger cache is to ensure your entire working set fits within the cacheSizeGB allocation. db.serverStatus().wiredTiger.cache does not report a hit ratio directly, but you can derive one from two of its counters: "pages requested from the cache" and "pages read into cache". Both are cumulative since startup, so compare two snapshots taken some time apart rather than reading the raw totals. A hit ratio consistently above 95% is generally considered good. If it’s lower, you likely need to increase cacheSizeGB (assuming you have available RAM and your CPU isn’t saturated by compression/decompression).
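One way to turn the cache counters into a hit ratio is to take two snapshots and work from the difference, since the counters only ever grow. This is a sketch under that assumption; the function name and the sample numbers are invented, while the two statistic names come from serverStatus.

```javascript
// Hit ratio over an interval, from two snapshots of
// db.serverStatus().wiredTiger.cache. Returns null if nothing was requested.
function cacheHitRatio(before, after) {
  const requested =
    after["pages requested from the cache"] -
    before["pages requested from the cache"];
  const readIn =
    after["pages read into cache"] - before["pages read into cache"];
  if (requested <= 0) return null;
  // Every page read into the cache corresponds to a miss.
  return 1 - readIn / requested;
}

// Made-up snapshots for illustration
const snapshotBefore = {
  "pages requested from the cache": 1000,
  "pages read into cache": 100,
};
const snapshotAfter = {
  "pages requested from the cache": 11000,
  "pages read into cache": 300,
};
const ratio = cacheHitRatio(snapshotBefore, snapshotAfter);
console.log((ratio * 100).toFixed(1) + "%"); // 98.0%

// In mongosh: capture db.serverStatus().wiredTiger.cache twice, a few
// minutes apart under normal load, and pass both documents in.
```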

A common misconception is that you should set cacheSizeGB to almost all available RAM. This is dangerous. The operating system needs RAM for its own operations, network buffers, and its own filesystem cache. If you starve the OS, you can see increased swapping, which will kill performance far worse than a slightly undersized WiredTiger cache.

The next thing you’ll likely encounter is understanding how to monitor and interpret the various metrics available in db.serverStatus() related to WiredTiger, particularly the cache statistics and eviction counts.
