Memcached’s strength is its speed, but that speed comes at a cost: it doesn’t inherently know when the data it stores has changed in your primary data store.

Let’s see Memcached in action with a simple Python example. Imagine you have a web application that frequently fetches user profiles.

# Requires the python-memcached package, which provides the memcache module
import memcache
import time

# Connect to Memcached (assuming it's running on localhost:11211)
mc = memcache.Client(['127.0.0.1:11211'], debug=0)

# Simulated primary data store; a real application would query a database here
fake_db = {}

def get_user_profile(user_id):
    cache_key = f"user_profile:{user_id}"
    profile_data = mc.get(cache_key)

    # mc.get returns None on a miss; compare explicitly so falsy values still count as hits
    if profile_data is not None:
        print(f"Cache hit for user {user_id}!")
        return profile_data

    print(f"Cache miss for user {user_id}. Fetching from DB...")
    # Simulate fetching from a database
    time.sleep(0.1)
    default_profile = {"id": user_id, "name": f"User {user_id}", "timestamp": time.time()}
    profile_data = fake_db.setdefault(user_id, default_profile)
    # Store in cache for 60 seconds
    mc.set(cache_key, profile_data, time=60)
    print(f"Stored user {user_id} in cache.")
    return profile_data

def update_user_profile_in_db(user_id, new_data):
    print(f"Updating user {user_id} in the database with: {new_data}")
    # Simulate database update
    time.sleep(0.05)
    current = fake_db.get(user_id, {"id": user_id})
    fake_db[user_id] = {**current, **new_data, "timestamp": time.time()}
    # The delete below is deliberately left commented out so the simulation
    # can show a stale read. In real code, this is the key step:
    # mc.delete(f"user_profile:{user_id}")
    print(f"Database update complete for user {user_id}.")

# --- Simulation ---
user_id_to_test = 123

# First call - cache miss
profile1 = get_user_profile(user_id_to_test)
print(f"Profile 1: {profile1}")

# Second call - cache hit
profile2 = get_user_profile(user_id_to_test)
print(f"Profile 2: {profile2}")

# Simulate an update in the database
update_user_profile_in_db(user_id_to_test, {"name": "Updated Name"})

# Now, if we fetch again, ideally we should get the updated data.
# Without cache invalidation, the key has not expired yet, so Memcached
# still returns the copy that was stored before the update.
print("\n--- Simulating update without cache invalidation ---")
profile3 = get_user_profile(user_id_to_test) # Cache hit: returns the stale, pre-update data
print(f"Profile 3 (after DB update, served stale from cache): {profile3}")

# If the update had invalidated the cache (by uncommenting mc.delete),
# then profile3 would have been a cache miss, fetched the *new* data from the DB,
# and then stored the new data.

# Let's re-run the update with proper invalidation
print("\n--- Simulating update WITH cache invalidation ---")
update_user_profile_in_db(user_id_to_test, {"name": "Another Update"})
mc.delete(f"user_profile:{user_id_to_test}") # *** This is the crucial invalidation step ***
print("Cache invalidated. Fetching again...")

profile4 = get_user_profile(user_id_to_test) # This is now a cache miss, fetches *new* data, and stores it.
print(f"Profile 4 (after DB update and cache invalidation): {profile4}")

# Subsequent calls will be cache hits with the new data
profile5 = get_user_profile(user_id_to_test)
print(f"Profile 5: {profile5}")

The core problem Memcached solves is reducing latency for frequently accessed data. It acts as a high-speed, in-memory key-value store. When your application needs data that’s expensive to fetch (like from a disk-based database or a remote API), it first checks Memcached. If the data is there (a cache hit), it’s returned almost instantaneously. If not (a cache miss), the application fetches it from the source, returns it to the user, and also stores a copy in Memcached for future requests.

The mental model for Memcached is a distributed hash table. You set(key, value, expiry_time) and later get(key). The expiry_time is an upper bound, not a guarantee: Memcached will not return the item once it has expired, but it may evict the item earlier if it needs the memory for newer entries, so your code must always be ready for a miss. It also makes no promises about what happens if the underlying data source changes before expiry. This is where cache invalidation comes in.
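To make the difference between expiry and eviction concrete, here is a minimal sketch of a toy cache with a TTL and a fixed capacity. TinyCache, its max_items parameter, and the injected clock are all made up for illustration; this is not how Memcached is implemented internally, only a small model of the behavior a caller can observe.

```python
from collections import OrderedDict


class TinyCache:
    """Toy model of a cache node: TTL expiry plus least-recently-used eviction."""

    def __init__(self, max_items, clock):
        self.max_items = max_items
        self.clock = clock              # callable returning the current time in seconds
        self._items = OrderedDict()     # key -> (value, expires_at)

    def set(self, key, value, ttl):
        self._items.pop(key, None)
        self._items[key] = (value, self.clock() + ttl)
        # Over capacity: drop the least recently used entries, whatever their TTL
        while len(self._items) > self.max_items:
            self._items.popitem(last=False)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None                 # never stored, or already evicted
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]
            return None                 # TTL elapsed
        self._items.move_to_end(key)    # mark as recently used
        return value


now = [0.0]                             # fake clock so the example needs no sleeping
cache = TinyCache(max_items=2, clock=lambda: now[0])

cache.set("a", 1, ttl=60)
cache.set("b", 2, ttl=60)
cache.set("c", 3, ttl=60)               # cache is full, so "a" is evicted early

evicted = cache.get("a")                # None, even though its 60 s TTL has not elapsed
fresh = cache.get("c")                  # 3
now[0] = 61.0
expired = cache.get("c")                # None, because the TTL has elapsed
```

From the caller's side, eviction and expiry look identical: get returns nothing, and the application falls back to the primary data store.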

Your primary data store is the source of truth, and your application is responsible for keeping the cache in line with it. Memcached is just a temporary copy. When the original data changes, your application must tell Memcached to forget its old copy. This is typically done by deleting the corresponding key from Memcached.

When you update the record behind user_profile:123 in your database, Memcached has no way of knowing this happened. If you don’t explicitly remove user_profile:123 from Memcached, subsequent get requests will keep returning the stale data until the item expires or is evicted, leading to inconsistencies. The mc.delete(f"user_profile:{user_id}") call is the mechanism that ensures the next get will be a cache miss, forcing a re-fetch of the current data from the primary source.
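One way to make the delete hard to forget is to put the database write and the invalidation behind a single function, so no caller can do one without the other. The sketch below assumes a cache client exposing get, set(key, value, time=...), and delete, which matches the calls used on mc earlier. ProfileStore and DictCache are hypothetical names, and DictCache is only an in-process stand-in so the example runs without a Memcached server.

```python
class DictCache:
    """In-process stand-in exposing the get/set/delete subset used here."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value, time=0):
        self._data[key] = value         # the TTL is ignored by this stand-in
        return True

    def delete(self, key):
        self._data.pop(key, None)
        return True


class ProfileStore:
    """Couples every write to the primary store with a cache invalidation."""

    def __init__(self, cache, db, ttl=60):
        self.cache = cache
        self.db = db                    # hypothetical: a dict standing in for the database
        self.ttl = ttl

    @staticmethod
    def _key(user_id):
        return f"user_profile:{user_id}"

    def get(self, user_id):
        key = self._key(user_id)
        profile = self.cache.get(key)
        if profile is None:             # miss: read the source of truth, then cache it
            profile = self.db[user_id]
            self.cache.set(key, profile, time=self.ttl)
        return profile

    def update(self, user_id, changes):
        # Write to the primary store first, then drop the cached copy
        self.db[user_id] = {**self.db[user_id], **changes}
        self.cache.delete(self._key(user_id))


store = ProfileStore(DictCache(), db={123: {"id": 123, "name": "User 123"}})
before = store.get(123)                 # miss, then cached
store.update(123, {"name": "Updated Name"})
after = store.get(123)                  # miss again, so the new data is read
```

Passing a real client in place of DictCache should work as long as it offers the same three methods; that substitution is an assumption, not something tested here.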

The most surprising thing about cache invalidation is how often it’s treated as an afterthought, leading to subtle but persistent data staleness bugs. Many systems rely solely on time-to-live (TTL) expiry, which is a passive form of invalidation: a changed record can be served stale for up to the full TTL. This is insufficient when data can change at any time and readers need to see the change promptly. Active invalidation, where the application explicitly deletes or updates cached items when the data is modified, closes most of that window. Concurrent readers and writers can still race, so it narrows staleness considerably rather than guaranteeing strong consistency on its own.
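The gap between passive and active invalidation can be put into numbers with a small single-threaded simulation. Everything here is an illustrative assumption: ClockedCache is a stand-in with a fake clock, and stale_seconds, write_at, and horizon are hypothetical names. It performs one read per simulated second and counts how many of them return a value that differs from the database.

```python
class ClockedCache:
    """Stand-in cache with TTL support, driven by an injected clock."""

    def __init__(self, clock):
        self.clock = clock
        self._data = {}                 # key -> (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None or self.clock() >= entry[1]:
            self._data.pop(key, None)
            return None
        return entry[0]

    def set(self, key, value, ttl):
        self._data[key] = (value, self.clock() + ttl)

    def delete(self, key):
        self._data.pop(key, None)


def read(cache, db, key, ttl):
    """Cache-aside read: try the cache, fall back to the database."""
    value = cache.get(key)
    if value is None:
        value = db[key]
        cache.set(key, value, ttl)
    return value


def stale_seconds(invalidate_on_write, ttl=60, write_at=10, horizon=120):
    """Count reads (one per simulated second) that return outdated data."""
    now = [0]
    cache = ClockedCache(lambda: now[0])
    db = {"k": "v1"}
    stale = 0
    for t in range(horizon):
        now[0] = t
        if t == write_at:
            db["k"] = "v2"              # the primary data changes
            if invalidate_on_write:
                cache.delete("k")       # active invalidation
        if read(cache, db, "k", ttl) != db["k"]:
            stale += 1
    return stale


ttl_only = stale_seconds(invalidate_on_write=False)
with_delete = stale_seconds(invalidate_on_write=True)
print(f"TTL only: {ttl_only} stale reads; delete on write: {with_delete}")
```

With these parameters the TTL-only run serves the old value from second 10 until the entry expires at second 60, while the delete-on-write run serves none. The simulation has a single reader and writer, so it does not model the races mentioned above.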

The next challenge is handling complex relationships and distributed invalidation across multiple services.
