Memcached Hit Ratio: Monitor Cache Effectiveness (2026)

As a rule of thumb, a memcached hit ratio below 90% means your backend servers are likely burning CPU on data that could have been served from cache.

Here’s a live memcached instance serving requests:

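# Start a memcached server in the foreground (requires memcached installed).
# -p: TCP port, -m: memory limit in MB, -vv: log each request and response.
memcached -p 11211 -m 64 -vv

# In a second terminal, simulate client requests.
# Commands end in \r\n; "quit" makes the server close the connection so nc exits.
while true; do
  # Set a 5-byte value with a 60-second TTL. The command line and the data
  # block must be sent over the same connection.
  printf 'set mykey 0 60 5\r\nhello\r\nquit\r\n' | nc 127.0.0.1 11211

  # Get the data
  printf 'get mykey\r\nquit\r\n' | nc 127.0.0.1 11211
  sleep 0.1
done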

With -vv, memcached logs each request and response. The exact log format varies by version, but the exchange for a get looks roughly like this:

<10 new connection>
-> get mykey
<- VALUE mykey 0 5
<- hello
<- END

This shows a successful get (a hit). If you try to get a key that hasn’t been set, you’ll see:

-> get non_existent_key
<- END

This is a miss. The hit ratio is the percentage of get operations that result in a VALUE (hit) versus just END (miss).

Memcached itself doesn’t expose a direct "hit ratio" metric. You have to calculate it. The key stats you need are get_hits and cmd_get.

get_hits: The number of successful retrievals of existing items.
cmd_get: The total number of get commands issued, regardless of whether the item existed. It equals get_hits + get_misses.

The formula is (get_hits / cmd_get) * 100.

To get these numbers, you can use stats commands. Connect to memcached using telnet or nc:

echo "stats" | nc 127.0.0.1 11211

This will output a wall of text. Look for lines like:

STAT get_hits 154876
STAT cmd_get 170000
STAT get_misses 15124
STAT evictions 0
STAT bytes 4096
END

From these, you can calculate the hit ratio: (154876 / 170000) * 100 = 91.1%.
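
If you want to script this calculation, here is a minimal sketch that parses the raw text of a stats response. The function names parse_stats and hit_ratio are illustrative and not part of any memcached client library; the sample text is the output shown above.

```python
def parse_stats(raw: str) -> dict:
    """Turn the text of a memcached `stats` response into a {name: value} dict."""
    stats = {}
    for line in raw.splitlines():
        parts = line.strip().split(" ", 2)
        # Each data line looks like: STAT <name> <value>
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def hit_ratio(stats: dict) -> float:
    """Return the hit ratio as a percentage, or 0.0 if no gets were issued yet."""
    cmd_get = int(stats.get("cmd_get", 0))
    if cmd_get == 0:
        return 0.0
    return int(stats.get("get_hits", 0)) / cmd_get * 100


sample = """STAT get_hits 154876
STAT cmd_get 170000
STAT get_misses 15124
STAT evictions 0
STAT bytes 4096
END"""

print(f"hit ratio: {hit_ratio(parse_stats(sample)):.1f}%")  # hit ratio: 91.1%
```

The zero check matters on a freshly started server, where cmd_get is still 0 and a plain division would fail.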

Why is a low hit ratio bad?

Every miss means memcached had to tell your application, "Nope, don’t have that." Your application then has to go to the source of truth (database, API, etc.) to fetch the data, process it, and then potentially store it back in memcached. This is significantly more work and latency than a cache hit.

Common Causes and Fixes for Low Hit Ratio:

Cache Evictions: Memcached has a fixed memory limit (-m flag). When it runs out of space, it evicts the least recently used items to make room for new ones. If your working set (the data that is accessed frequently) is larger than your cache size, you’ll see constant evictions and low hit rates.
- Diagnosis: Check STAT evictions in your stats output. If this number is high and increasing rapidly, you’re evicting.
- Fix: Increase the memory allocated to memcached. For example, if you’re using -m 64 (64MB), try -m 128 or -m 256.
- Why it works: More memory means memcached can hold more items, reducing the need to evict.
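
Because evictions is a cumulative counter since server start, "increasing rapidly" is really a statement about its rate of change. A rough sketch of turning two samples into a rate follows; the function name and the sample numbers are made up for illustration.

```python
def eviction_rate(evictions_before: int, evictions_after: int,
                  interval_seconds: float) -> float:
    """Evictions per second between two samples of the cumulative `evictions` stat."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = evictions_after - evictions_before
    # The counter starts over when memcached restarts; treat a drop as a fresh start.
    if delta < 0:
        delta = evictions_after
    return delta / interval_seconds


# Two hypothetical samples taken 60 seconds apart.
print(eviction_rate(1200, 1800, 60))  # 10.0
```

A rate that stays near zero suggests the cache is big enough for the working set; a sustained non-zero rate is the signal to look at -m.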
Short TTLs (Time To Live): Items expire from memcached based on their TTL. If your TTLs are too short for your access patterns, data expires while it is still being requested, long before memory pressure would have evicted it, and those requests become misses.
- Diagnosis: Examine your application code. What TTLs are being set for cache entries? Are they appropriate for how often the underlying data changes?
- Fix: Increase the TTL for your cache items. For example, if you’re setting set mykey 0 60 5 (60-second TTL), consider set mykey 0 300 5 (300-second TTL) if the data doesn’t change that often.
- Why it works: Longer TTLs keep items in the cache longer, increasing the chance they’ll be hit before expiring.
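
To get a feel for the effect, here is a toy simulation. It assumes a single key that is read at a fixed interval (10 seconds, a made-up figure) and re-cached on every miss. Real traffic is burstier, so treat the numbers as illustrative only.

```python
def simulated_hit_ratio(ttl_seconds: int, read_interval_seconds: int,
                        total_reads: int) -> float:
    """Hit ratio (percent) for one key read at a fixed interval and refilled on miss."""
    hits = 0
    expires_at = None  # None means the key is not cached yet
    for i in range(total_reads):
        now = i * read_interval_seconds
        if expires_at is not None and now < expires_at:
            hits += 1  # still cached: hit
        else:
            expires_at = now + ttl_seconds  # miss: reload and cache again
    return 100.0 * hits / total_reads


for ttl in (60, 300):
    print(f"TTL {ttl}s: {simulated_hit_ratio(ttl, 10, 600):.1f}%")
# TTL 60s: 83.3%
# TTL 300s: 96.7%
```

In this model each TTL window costs exactly one miss, so a longer window spreads that miss over more reads.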
"Thundering Herd" Problem / Cache Stampede: A single popular item expires. Many clients simultaneously request this item, all missing the cache. They all hit the backend source, overwhelming it, and then all try to re-populate the cache at once.
- Diagnosis: This is harder to see directly in memcached stats. You’ll see spikes in backend load and a temporary dip in hit ratio, followed by a recovery. Application-level monitoring is key.
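- Fix: Implement cache locking or probabilistic early expiration. A common technique is for the first client that misses on an expired item to acquire a lock by calling memcached’s add command on a dedicated lock key with a short TTL (add succeeds only if the key does not already exist, so exactly one client wins). That client fetches the data, updates the cache, and releases the lock by deleting the lock key. Clients that fail to get the lock wait briefly and retry the get, at which point they find the item in the cache.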
- Why it works: It serializes the expensive backend fetch and cache update for a single item, preventing a flood of requests.
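
The locking approach can be sketched as follows. This is an illustration under assumptions, not a production implementation: InMemoryCache is a dict-backed stand-in so the example runs without a server, and get_with_lock, the lock: key prefix, and the retry parameters are all hypothetical names and values. With a real client you would rely on the server-side atomicity of add.

```python
import time


class InMemoryCache:
    """Dict-backed stand-in for a memcached client (TTLs are ignored for brevity)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, ttl=0):
        self._data[key] = value

    def add(self, key, value, ttl=0):
        # Like memcached's `add`: store only if the key does not exist yet.
        if key in self._data:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        self._data.pop(key, None)


def get_with_lock(cache, key, load_from_backend, ttl=300, lock_ttl=10,
                  retries=20, retry_delay=0.05):
    """Return the value for `key`, letting only one caller rebuild it on a miss."""
    lock_key = f"lock:{key}"
    for _ in range(retries):
        value = cache.get(key)
        if value is not None:
            return value  # hit
        if cache.add(lock_key, "1", lock_ttl):
            # We won the lock, so we are the only caller hitting the backend.
            try:
                value = load_from_backend(key)
                cache.set(key, value, ttl)
                return value
            finally:
                cache.delete(lock_key)
        # Another caller is rebuilding the value: wait briefly, then re-check.
        time.sleep(retry_delay)
    # The lock holder took too long; fall back to the backend rather than fail.
    return load_from_backend(key)
```

The short lock TTL is a safety net: if the lock holder crashes before deleting the lock key, the lock still disappears on its own and another client can rebuild the item.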
Incorrect Key Naming or Usage: If your application uses many short-lived or very specific keys that aren’t reused, or if keys are being generated incorrectly, you might be setting and getting different keys for what should be the same data.
- Diagnosis: Review your application’s cache key generation logic. Are you accidentally creating unique keys for identical data? Are you setting a key and then immediately trying to get a slightly different one?
- Fix: Standardize your cache key naming conventions. Ensure that identical data always maps to the same key. For example, instead of user:123:profile:update_timestamp, use user:123:profile.
- Why it works: Consistent key usage ensures that when data is requested, the correct, existing cache entry is found.
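
One way to enforce a convention is to route every key through a single helper. The sketch below is hypothetical (cache_key is not from any library) and only shows the idea: build keys from stable identifiers and sort optional parameters so argument order can't create duplicates.

```python
def cache_key(*parts, **params) -> str:
    """Build a deterministic cache key from stable identifiers.

    Positional parts are joined in order. Keyword params are sorted by name so
    that argument order never produces two keys for the same data.
    Note: memcached keys are limited to 250 bytes and must not contain
    whitespace or control characters; this sketch does not enforce that.
    """
    pieces = [str(p) for p in parts]
    pieces += [f"{name}={params[name]}" for name in sorted(params)]
    return ":".join(pieces)


print(cache_key("user", 123, "profile"))   # user:123:profile
print(cache_key("search", q="shoes", page=2))  # search:page=2:q=shoes
```

Volatile values such as timestamps or request IDs should stay out of the arguments entirely, since each distinct value produces a key that will never be read again.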
Cache Sharding/Distribution Issues: If you have multiple memcached instances, but your application logic isn’t distributing keys evenly across them, one instance might be overloaded and evicting heavily while others are underutilized.
- Diagnosis: Monitor stats for each memcached instance. Look for significant variations in get_hits, cmd_get, and evictions across your memcached pool.
- Fix: Review your application’s memcached client library or custom sharding logic. Ensure a consistent hashing algorithm (such as Ketama) is used to distribute keys across your available memcached servers.
- Why it works: Even distribution ensures that the load and memory usage are spread across all memcached nodes, preventing hot spots.
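
To make the idea concrete, here is a minimal consistent-hash ring with virtual nodes. It is Ketama-like in spirit only and is not compatible with any real client's key placement; the HashRing class, the server names, and the choice of 100 points per server are assumptions for illustration.

```python
import bisect
import hashlib


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, servers, points_per_server=100):
        self._ring = []  # list of (hash, server), sorted by hash
        for server in servers:
            # Several points per server smooth out the key distribution.
            for i in range(points_per_server):
                self._ring.append((self._hash(f"{server}#{i}"), server))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    @staticmethod
    def _hash(value: str) -> int:
        # First 8 bytes of the MD5 digest, read as an unsigned integer.
        return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

    def server_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical server names.
ring = HashRing(["cache1:11211", "cache2:11211", "cache3:11211"])
print(ring.server_for("user:123:profile"))
```

The property that matters for hit ratio: when a server is removed, only the keys that lived on it move to a different server. With naive hash(key) % N sharding, changing N remaps most keys and the hit ratio collapses until the caches refill.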
Application Logic Not Caching at All: The most basic reason for a low hit ratio is that the application simply isn’t using memcached effectively, or at all, for frequently accessed data.
- Diagnosis: Trace your application’s data access patterns. Identify frequently hit data sources (databases, APIs) and verify if and how memcached is being used to cache responses.
- Fix: Implement or improve caching logic in your application. Before fetching from a slow backend, check memcached. If the data is there (hit), return it. If not (miss), fetch from the backend, store it in memcached with an appropriate TTL, and then return it.
- Why it works: Proactive caching reduces the load on backend systems and improves response times by serving data from the fast in-memory cache.
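
The check-then-fetch-then-store flow described in the fix is often called cache-aside. Here is a minimal sketch, again using a dict-backed stand-in rather than a real client; get_or_load and the hit and miss counters are illustrative names.

```python
class InMemoryCache:
    """Dict-backed stand-in for a memcached client (TTLs are ignored for brevity)."""

    def __init__(self):
        self._data = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._data:
            self.hits += 1
            return self._data[key]
        self.misses += 1
        return None

    def set(self, key, value, ttl=0):
        self._data[key] = value


def get_or_load(cache, key, load_from_backend, ttl=300):
    """Cache-aside read: try the cache first, fall back to the backend on a miss."""
    value = cache.get(key)
    if value is not None:
        return value  # hit: the backend is never touched
    value = load_from_backend(key)  # miss: do the expensive work once
    cache.set(key, value, ttl)  # store it so the next read is a hit
    return value
```

Calling get_or_load twice for the same key invokes load_from_backend only once; the second call is served from the cache, which is exactly what moves get_hits closer to cmd_get.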

Once the hit ratio is healthy, the next bottleneck is often connection handling: if your application opens and closes a memcached connection for every request, the overhead of establishing connections adds up, which is where connection pooling comes in.