Memcached query caching is a powerful technique to drastically speed up your application by avoiding redundant database queries.
Here’s a simplified view of a web request without caching:
- User requests a page.
- Your application code needs data.
- Application queries the database.
- Database executes the query, fetches data.
- Database returns data to the application.
- Application renders the page.
Now, with Memcached query caching:
- User requests a page.
- Your application code needs data.
- Application checks Memcached for the data.
- If data is in Memcached (a "cache hit"):
  - Memcached returns data to the application instantly.
  - Application renders the page.
- If data is NOT in Memcached (a "cache miss"):
  - Application queries the database.
  - Database executes the query, fetches data.
  - Database returns data to the application.
  - Application stores the fetched data in Memcached with a specific key and an expiration time.
  - Application renders the page.
The magic is in step 4, the cache hit: if the data is already in Memcached, you skip the database round trip entirely, and a database query is typically far slower than an in-memory lookup.
How it Works Internally
Memcached is an in-memory key-value store. Think of it as a giant, super-fast dictionary.
- Keys: These are strings you invent to identify your cached data. A good key is descriptive and unique to the specific query and its parameters. For example, user:123:profile or products:category:electronics:page:2.
- Values: This is the actual data you fetched from your database. It can be anything serializable: arrays, objects, JSON strings, etc.
- Expiration Time (TTL - Time To Live): You set a duration (in seconds) after which Memcached will automatically discard the key-value pair. This puts an upper bound on how long your cache can serve stale data.
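To make "anything serializable" concrete, here is a minimal sketch that turns a query result into a string suitable for storing as a cache value and back again. The `$row` data is made up for illustration, and the two formats shown (JSON and PHP's native `serialize()`) are just two common options, not a recommendation from any particular client library.

```php
<?php
declare(strict_types=1);

// Example row as it might come back from a database query (made-up data).
$row = ['id' => 123, 'username' => 'alice', 'tags' => ['admin', 'beta']];

// Option 1: JSON, which clients written in other languages can also read.
$asJson = json_encode($row);
$fromJson = json_decode($asJson, true); // true => decode JSON objects as arrays

// Option 2: PHP's native format. Disallow classes when unserializing cached data.
$asPhp = serialize($row);
$fromPhp = unserialize($asPhp, ['allowed_classes' => false]);

echo $asJson, "\n";
var_dump($fromJson === $row, $fromPhp === $row);
```

Both round trips return an array identical to the original here. That would not hold for PHP objects stored as JSON, which come back as plain arrays.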
When your application needs data, it first constructs a cache key. It then asks Memcached: "Do you have [this key]?"
- If Memcached replies with the value, you use it.
- If Memcached replies with "not found," you go to the database, get the data, store it in Memcached using set(key, value, ttl), and then use it.
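The lookup-then-store flow described above can be wrapped in one helper. The sketch below is only an illustration: `ArrayCache` is a hypothetical in-memory stand-in for a Memcached client (so the example runs without a server), and `remember()` is an assumed helper name, not part of any library. It assumes PHP 8.0 or later.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical in-memory stand-in for a Memcached client. It mimics
 * get() returning false on a miss and set(key, value, ttl) with a TTL
 * in seconds, where 0 means "no expiry".
 */
final class ArrayCache
{
    private array $items = [];

    public function get(string $key): mixed
    {
        if (!isset($this->items[$key])) {
            return false;
        }
        $item = $this->items[$key];
        if ($item['expiresAt'] !== 0 && $item['expiresAt'] <= time()) {
            unset($this->items[$key]); // expired: treat as a miss
            return false;
        }
        return $item['value'];
    }

    public function set(string $key, mixed $value, int $ttl = 0): bool
    {
        $this->items[$key] = [
            'value' => $value,
            'expiresAt' => $ttl > 0 ? time() + $ttl : 0,
        ];
        return true;
    }
}

/**
 * Return the cached value for $key, or call $loader, cache its result,
 * and return it.
 */
function remember(ArrayCache $cache, string $key, int $ttl, callable $loader): mixed
{
    $cached = $cache->get($key);
    if ($cached !== false) {
        return json_decode($cached, true); // cache hit
    }
    $fresh = $loader(); // cache miss: run the real query
    if ($fresh !== null) {
        $cache->set($key, json_encode($fresh), $ttl);
    }
    return $fresh;
}

$cache = new ArrayCache();
$dbCalls = 0;
$loader = function () use (&$dbCalls): array {
    $dbCalls++; // stands in for a database query
    return ['id' => 123, 'username' => 'alice'];
};

$first = remember($cache, 'user:123:profile', 300, $loader);  // miss: runs $loader
$second = remember($cache, 'user:123:profile', 300, $loader); // hit: skips $loader
echo "DB calls: {$dbCalls}\n";
```

The loader runs only on the first call, so the counter ends at 1. Passing the query as a callable keeps the caching logic in one place instead of repeating the get/set steps around every query.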
The Levers You Control
- Key Naming Strategy: This is arguably the most important lever. Your keys must be consistent and unambiguous. If two code paths request the same data, they should build the same key so that they share one cache entry. Conversely, if a slight change in parameters should result in different data, your key must reflect that. A common pattern is object_type:id:attribute.
- Time To Live (TTL): How long should data stay in the cache?
  - Short TTL (e.g., 60 seconds): Good for data that changes frequently or where near-real-time updates are critical.
  - Medium TTL (e.g., 300 seconds / 5 minutes): Suitable for data that changes less often, like product listings.
  - Long TTL (e.g., 3600 seconds / 1 hour): For relatively static data, like site configuration or user roles.
  - Zero TTL: This tells Memcached to keep the item indefinitely (or until it’s evicted due to memory pressure). Use with caution, only for data that never changes.
- Serialization Format: What format will you store the data in Memcached?
  - JSON: Human-readable, widely compatible. In PHP, use json_encode() and json_decode().
  - PHP serialize()/unserialize(): PHP-specific, can handle complex types but less interoperable.
  - Pickle (Python), Gob (Go), etc.: Language-specific serialization.
  - Plain Strings: If your database returns simple strings or numbers.
- Cache Invalidation Strategy: When does data change in the database? You need a way to tell Memcached to remove or update stale data before its TTL expires.
  - TTL-based: The simplest. Data expires automatically.
  - Event-driven: When you update data in the database (e.g., UPDATE users SET email = 'new@example.com' WHERE id = 123), you also explicitly delete the corresponding cache key from Memcached: delete('user:123:profile'). This is more complex but ensures fresher data.
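One way to keep keys consistent is to build them in a single function instead of by hand at each call site. The sketch below is an assumption-laden illustration: `buildCacheKey()` is a hypothetical helper, and the prefix is assumed to be short and free of whitespace. It sorts parameters so that the same logical query always maps to the same key, and it falls back to a hash when the key would break Memcached's key rules (at most 250 bytes, no whitespace or control characters).

```php
<?php
declare(strict_types=1);

/**
 * Build a deterministic cache key from a prefix and query parameters.
 * Hypothetical helper, not part of any Memcached client.
 */
function buildCacheKey(string $prefix, array $params): string
{
    ksort($params); // same parameters in any order => same key
    $parts = [$prefix];
    foreach ($params as $name => $value) {
        $parts[] = $name . ':' . $value;
    }
    $key = implode(':', $parts);

    // Memcached keys are limited to 250 bytes and may not contain
    // whitespace or control characters, so fall back to a hash of the key.
    if (strlen($key) > 250 || preg_match('/[\s\x00-\x1f\x7f]/', $key) === 1) {
        $key = $prefix . ':' . hash('sha256', $key);
    }
    return $key;
}

// Parameter order does not matter: both calls print the same key.
echo buildCacheKey('products', ['page' => 2, 'category' => 'electronics']), "\n";
echo buildCacheKey('products', ['category' => 'electronics', 'page' => 2]), "\n";
```

Both lines print products:category:electronics:page:2. The hashed fallback trades readability for safety, so it is worth logging the original parameters alongside it when debugging.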
Example: PHP with the Memcached Extension
Let’s say you have a function to get a user’s profile.
<?php
// Requires the PHP "memcached" extension and a Memcached server
// listening on 127.0.0.1:11211.
$memcached = new Memcached();
$memcached->addServer('127.0.0.1', 11211);

// Simulated database table, keyed by user ID
$fakeDb = [
    123 => ['id' => 123, 'username' => 'alice', 'email' => 'alice@example.com'],
];

function getUserProfile(int $userId): ?array
{
    global $memcached;
    $cacheKey = "user:{$userId}:profile";
    $ttl = 300; // 5 minutes

    // 1. Try to get from cache (get() returns false on a miss)
    $cachedData = $memcached->get($cacheKey);
    if ($cachedData !== false) {
        // Cache hit! Return deserialized data
        return json_decode($cachedData, true);
    }

    // 2. Cache miss - fetch from database (simulated)
    echo "Cache miss for user {$userId}. Fetching from DB...\n";
    $dbData = fetchUserFromDatabase($userId); // Your actual DB query function
    if ($dbData) {
        // 3. Store in cache
        $memcached->set($cacheKey, json_encode($dbData), $ttl);
        return $dbData;
    }
    return null;
}

// Simulated database function
function fetchUserFromDatabase(int $userId): ?array
{
    global $fakeDb;
    // In a real app, this would be a PDO, mysqli, or ORM call
    return $fakeDb[$userId] ?? null;
}

// --- Usage ---
// First call: Cache miss (assuming the key is not cached from an earlier run)
$profile1 = getUserProfile(123);
print_r($profile1);
/*
Cache miss for user 123. Fetching from DB...
Array
(
    [id] => 123
    [username] => alice
    [email] => alice@example.com
)
*/

// Second call: Cache hit (if within 5 minutes)
$profile2 = getUserProfile(123);
print_r($profile2);
/*
Array
(
    [id] => 123
    [username] => alice
    [email] => alice@example.com
)
*/

// Simulate an update and explicit cache invalidation
function updateUserEmail(int $userId, string $newEmail): void
{
    global $memcached, $fakeDb;
    // Update in DB (simulated)
    echo "Updating email for user {$userId} to {$newEmail} in DB...\n";
    if (isset($fakeDb[$userId])) {
        $fakeDb[$userId]['email'] = $newEmail;
    }
    // Invalidate cache
    $cacheKey = "user:{$userId}:profile";
    $memcached->delete($cacheKey);
    echo "Deleted cache key: {$cacheKey}\n";
}

updateUserEmail(123, 'alice.new@example.com');

// Third call: Cache miss again because we deleted it
$profile3 = getUserProfile(123);
print_r($profile3);
/*
Updating email for user 123 to alice.new@example.com in DB...
Deleted cache key: user:123:profile
Cache miss for user 123. Fetching from DB...
Array
(
    [id] => 123
    [username] => alice
    [email] => alice.new@example.com
)
*/
?>
The most common pitfall with Memcached query caching is not invalidating the cache correctly when the underlying data changes. This leads to serving stale data, which can be much harder to debug than a slow query.
The next step is understanding how to manage cache stampedes, where many requests simultaneously miss the cache for the same item and all hit the database at once.