Storing complex objects in Memcached isn’t as simple as just setting them directly; you need to serialize them first, and the library you choose has a significant impact on performance and compatibility.

Let’s see it in action. Imagine we have a Python object representing a user:

class User:
    def __init__(self, user_id, username, preferences):
        self.user_id = user_id
        self.username = username
        self.preferences = preferences

    def __str__(self):
        return f"User(id={self.user_id}, name='{self.username}', prefs={self.preferences})"

user_data = User(123, "alice", {"theme": "dark", "notifications": True})

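We want to store this user_data object in Memcached. What happens if we pass it to set() directly depends on the client library: the call may store a string representation that’s useless for retrieval, fail outright, or pickle the object implicitly (python-memcached, the client used below, does the latter). Serializing explicitly keeps the stored format under your control.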

Memcached provides fast, in-memory key-value storage, and every value it holds is an opaque blob of bytes. It has no notion of Python lists, dictionaries, or custom classes. Serialization converts your Python object into bytes (or a string) that can be stored; deserialization converts that data back into a Python object upon retrieval.

Here’s how you’d typically do it using Python’s built-in pickle module and the python-memcached client (imported as memcache):

import memcache
import pickle

# Assuming memcached is running on localhost:11211
mc = memcache.Client(['127.0.0.1:11211'], debug=0)

user_id_key = f"user:{user_data.user_id}"

# Serialize the object using pickle
serialized_user = pickle.dumps(user_data)

# Store it in memcached
mc.set(user_id_key, serialized_user)

print(f"Stored user data for {user_id_key}")

# Retrieve and deserialize
retrieved_serialized_user = mc.get(user_id_key)

if retrieved_serialized_user:
    retrieved_user = pickle.loads(retrieved_serialized_user)
    print(f"Retrieved: {retrieved_user}")
    print(f"Retrieved username: {retrieved_user.username}")
else:
    print("User not found in memcached.")

In this example, pickle.dumps() converts the User object into a byte stream, which mc.set() stores. pickle.loads() then reconstructs the original Python object from that byte stream when we mc.get() it.

However, pickle has two significant drawbacks: it’s Python-specific, so other languages can’t read the cached values, and unpickling data from an untrusted source can execute arbitrary code. For interoperability or better security, you might opt for formats like JSON or MessagePack.
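To make the security concern concrete, here is a harmless sketch. The Sneaky class is hypothetical and exists only for illustration. It shows that pickle.loads() calls whatever callable the payload names; here that callable is the benign built-in sorted, but anyone able to write to your cache could name something destructive instead.

```python
import pickle


class Sneaky:
    """Hypothetical class whose pickle payload names a callable to run on load."""

    def __reduce__(self):
        # Instead of object state, pickle records "call sorted([3, 1, 2])".
        return (sorted, ([3, 1, 2],))


payload = pickle.dumps(Sneaky())

# Unpickling invokes the callable named in the payload and returns its result,
# so what comes back is not a Sneaky instance at all.
result = pickle.loads(payload)
print(type(result).__name__, result)
```

The practical takeaway is to unpickle only values that your own trusted code wrote to the cache.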

Using JSON:

import memcache
import json

mc = memcache.Client(['127.0.0.1:11211'], debug=0)
user_id_key = f"user:{user_data.user_id}"

# JSON can only serialize basic types. We need a custom encoder or to convert
# complex objects to dictionaries. For this User class, we'd need to
# implement a method or handle it during serialization.
# A common approach is to convert to a dict:
user_dict = {
    "user_id": user_data.user_id,
    "username": user_data.username,
    "preferences": user_data.preferences
}

serialized_user_json = json.dumps(user_dict)
mc.set(user_id_key, serialized_user_json)
print(f"Stored user data (JSON) for {user_id_key}")

retrieved_serialized_user_json = mc.get(user_id_key)
if retrieved_serialized_user_json:
    retrieved_user_dict = json.loads(retrieved_serialized_user_json)
    # Reconstruct the object if needed, or just use the dict
    print(f"Retrieved (JSON): {retrieved_user_dict}")
    print(f"Retrieved username (JSON): {retrieved_user_dict['username']}")
else:
    print("User not found in memcached.")

JSON is human-readable and widely compatible, but it is verbose and usually slower than binary formats for large payloads. It also doesn’t natively support every Python type (dates and custom objects, for example), so those need custom handling.
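One way to do that custom handling is with the default and object_hook parameters of the standard json module. The sketch below is only an illustration: the helper names encode_extra and decode_extra and the "__datetime__" tag are assumptions made up for this example, not a standard convention.

```python
import json
from datetime import datetime, timezone


def encode_extra(obj):
    """Fallback for json.dumps: called only for types json cannot handle."""
    if isinstance(obj, datetime):
        # Tag the value so the decoder can recognize it later.
        return {"__datetime__": obj.isoformat()}
    raise TypeError(f"Cannot serialize {type(obj).__name__}")


def decode_extra(d):
    """object_hook for json.loads: rebuilds tagged datetime values."""
    if "__datetime__" in d:
        return datetime.fromisoformat(d["__datetime__"])
    return d


record = {
    "username": "alice",
    "last_login": datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc),
}

encoded = json.dumps(record, default=encode_extra)  # str, ready for mc.set()
decoded = json.loads(encoded, object_hook=decode_extra)
print(decoded)
```

The encoded string can be passed to mc.set() exactly like serialized_user_json above.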

Using MessagePack (via the msgpack package, formerly published as msgpack-python):

import memcache
import msgpack

mc = memcache.Client(['127.0.0.1:11211'], debug=0)
user_id_key = f"user:{user_data.user_id}"

# Similar to JSON, complex objects need conversion.
user_dict = {
    "user_id": user_data.user_id,
    "username": user_data.username,
    "preferences": user_data.preferences
}

serialized_user_msgpack = msgpack.dumps(user_dict)
mc.set(user_id_key, serialized_user_msgpack)
print(f"Stored user data (MsgPack) for {user_id_key}")

retrieved_serialized_user_msgpack = mc.get(user_id_key)
if retrieved_serialized_user_msgpack:
    retrieved_user_dict = msgpack.loads(retrieved_serialized_user_msgpack)
    print(f"Retrieved (MsgPack): {retrieved_user_dict}")
    print(f"Retrieved username (MsgPack): {retrieved_user_dict['username']}")
else:
    print("User not found in memcached.")

MessagePack is a binary serialization format that is typically more compact and faster than JSON, while still offering good cross-language compatibility. It handles the common data types (numbers, strings, bytes, lists, and maps) efficiently.

In practice, the size and speed of the serialized representation often dictate the best choice, not just compatibility. pickle is fast and handles most Python objects, but it is a security and compatibility risk. JSON is readable and compatible but verbose. MessagePack offers a good balance of speed and size for many use cases, but still requires your object to be convertible to a set of basic types.
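Because the right answer depends on your data, it is worth measuring with a representative payload. The following is a minimal sketch, assuming the small user dictionary from the earlier examples; msgpack is third-party and is included only if it happens to be installed. Timings vary by machine and payload, so treat the numbers as relative.

```python
import json
import pickle
import timeit

payload = {
    "user_id": 123,
    "username": "alice",
    "preferences": {"theme": "dark", "notifications": True},
}

# Each candidate returns the bytes that would be sent to Memcached.
candidates = {
    "pickle": lambda: pickle.dumps(payload, protocol=pickle.HIGHEST_PROTOCOL),
    "json": lambda: json.dumps(payload).encode("utf-8"),
}

try:
    import msgpack  # third-party; skipped when not installed

    candidates["msgpack"] = lambda: msgpack.packb(payload)
except ImportError:
    pass

results = {}
for name, dump in candidates.items():
    size = len(dump())
    seconds = timeit.timeit(dump, number=10_000)
    results[name] = (size, seconds)
    print(f"{name:8s} {size:4d} bytes  {seconds:.4f}s per 10,000 dumps")
```

Repeat the measurement for deserialization as well, since cache reads usually outnumber writes.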

The specific levers you control are the serialization library and how you adapt your complex objects to be compatible with that library’s requirements. For instance, you might add to_dict() and from_dict() methods to your custom classes to make them easily serializable/deserializable with JSON or MessagePack.
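As a sketch of that approach, here is the User class from earlier extended with to_dict() and from_dict(). The method bodies are one possible implementation, not the only one, and JSON is used here simply because it ships with Python; the same dictionary could be passed to msgpack instead.

```python
import json


class User:
    def __init__(self, user_id, username, preferences):
        self.user_id = user_id
        self.username = username
        self.preferences = preferences

    def to_dict(self):
        """Reduce the object to basic types that JSON or MessagePack accept."""
        return {
            "user_id": self.user_id,
            "username": self.username,
            "preferences": self.preferences,
        }

    @classmethod
    def from_dict(cls, data):
        """Rebuild a User from the dictionary produced by to_dict()."""
        return cls(data["user_id"], data["username"], data["preferences"])


user = User(123, "alice", {"theme": "dark", "notifications": True})

serialized = json.dumps(user.to_dict())             # value to pass to mc.set()
restored = User.from_dict(json.loads(serialized))   # value read from mc.get()
print(restored.username, restored.preferences)
```

Keeping both directions on the class means the cached format is defined in one place, which makes it easier to change later.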

The next thing you’ll likely encounter is managing cache invalidation when your original object changes, and how to ensure the data you retrieve from Memcached is still fresh.
