Kafka’s compression codecs aren’t just about saving disk space; they fundamentally alter your cluster’s performance characteristics, often in ways that surprise people.

Let’s see how this plays out. Imagine a Kafka producer sending messages to a broker, and then a consumer reading them.

// Producer Config (example)
{
  "acks": "all",
  "key.serializer": "org.apache.kafka.common.serialization.StringSerializer",
  "value.serializer": "org.apache.kafka.common.serialization.StringSerializer",
  "compression.type": "snappy" // This is what we're comparing
}

// Broker Config (relevant snippet)
// server.properties
log.message.format.version=2.8
# Other accepted values: uncompressed, gzip, snappy, lz4, zstd
compression.type=producer

// Consumer Config (example)
{
  "key.deserializer": "org.apache.kafka.common.serialization.StringDeserializer",
  "value.deserializer": "org.apache.kafka.common.serialization.StringDeserializer",
  "group.id": "my-consumer-group",
  "auto.offset.reset": "earliest"
}

When a producer sends data, it can compress record batches before they hit the network. Compression is applied to a whole batch rather than to individual messages, so larger batches usually compress better. A broker can also be configured to re-compress data when the producer’s codec differs from the broker’s configured compression.type, but this is less common and costs extra broker CPU on every write. The primary decision point is the producer’s compression.type.
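The re-compression rule can be sketched as a small predicate. This is a simplified model for illustration only: the function name broker_recompresses is hypothetical, and it covers just the codec comparison, not other reasons a broker might rewrite batches.

```python
def broker_recompresses(producer_codec: str, broker_codec: str = "producer") -> bool:
    """Return True if the broker would have to re-encode incoming batches.

    Hypothetical helper for illustration; it is not part of any Kafka client.
    producer_codec: the producer's compression.type
                    (none, gzip, snappy, lz4, zstd).
    broker_codec:   the broker or topic compression.type
                    (producer, uncompressed, gzip, snappy, lz4, zstd).
    """
    if broker_codec == "producer":
        # The broker keeps whatever codec the producer used.
        return False
    # The broker spells "no compression" as "uncompressed"; the producer uses "none".
    target = "none" if broker_codec == "uncompressed" else broker_codec
    return producer_codec != target


print(broker_recompresses("snappy"))           # broker keeps the producer's codec
print(broker_recompresses("snappy", "zstd"))   # codecs differ, broker re-compresses
print(broker_recompresses("zstd", "zstd"))     # codecs match, nothing to redo
```

Leaving the broker at its default of producer is what keeps this function returning False for every producer codec.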

The core problem Kafka compression solves is the trade-off between CPU usage and network/disk I/O. More compression means less data to transfer over the network and write to disk, but it requires more CPU cycles on the producer to compress and on the consumer to decompress. The "best" codec depends entirely on your cluster’s bottleneck.

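Here’s a breakdown of the common codecs and their characteristics. The ratios and CPU percentages below are rough, illustrative figures; actual numbers depend heavily on your payloads, batch sizes, and hardware.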

Snappy:

  • What it is: A fast, but not aggressively compressing, codec developed by Google.
  • Pros: Very low CPU overhead for compression and decompression. Great for CPU-bound producers or consumers where I/O isn’t the primary bottleneck.
  • Cons: Lower compression ratios than ZSTD, and typically similar to or somewhat below LZ4 depending on the data. You’ll transfer more data.
  • When to use: When your CPU is the limiting factor and you want to add as little work to it as possible. Good for high-throughput, low-latency scenarios where raw speed is paramount.
  • Typical Compression Ratio: Around 2.5:1 to 3:1 for general text data.
  • CPU Usage: Very Low.
  • Performance Example: A producer sending 100MB/sec might only use 5-10% CPU for Snappy compression. A consumer decompressing that same data would use a similar percentage.

LZ4:

  • What it is: Another fast compressor, often seen as a good balance between Snappy and more aggressive codecs.
  • Pros: Faster than ZSTD (usually) and compresses better than Snappy. Offers a sweet spot for many use cases.
  • Cons: Its compression ratio still falls short of ZSTD.
  • When to use: When you need better compression than Snappy without a significant CPU penalty. It is not Kafka’s default (the producer default is no compression), but it is often recommended as a good all-around balance.
  • Typical Compression Ratio: Around 3:1 to 4:1.
  • CPU Usage: Low to Medium.
  • Performance Example: For the same 100MB/sec, LZ4 might push producer/consumer CPU to 15-25%, but you’re sending less data.

ZSTD (Zstandard):

  • What it is: A modern, highly versatile compression algorithm from Facebook. It offers a wide range of compression levels.
  • Pros: Achieves the best compression ratios among these three. Can be tuned to be very fast at lower levels, approaching LZ4, or extremely high compression at higher levels.
  • Cons: Higher CPU usage at higher compression levels. Decompression is generally slower than LZ4’s, although it stays fast and is largely independent of the level used to compress.
  • When to use: When disk space or network bandwidth is your primary concern, and you have CPU headroom. Also excellent if you need to tune for specific trade-offs. In Kafka, compression.type=zstd without an explicit level uses the library’s default level, 3.
  • Typical Compression Ratio: 4:1 to 6:1 or even higher, depending on level and data.
  • CPU Usage: Medium to High (depending on level).
  • Performance Example: At a moderate level, 100MB/sec might use 25-40% CPU, but the data size on disk/network is significantly reduced.
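To put the three breakdowns side by side, the sketch below converts the ratio ranges quoted above into the data rate that actually leaves a producer pushing 100MB/sec of raw messages. The ratios are this article's illustrative figures, not measurements, and the function names are made up for the example.

```python
# Ratio ranges quoted in this article; illustrative, not measured here.
# ZSTD is capped at 6:1 although the text notes it can go higher.
RATIO_RANGES = {
    "snappy": (2.5, 3.0),
    "lz4": (3.0, 4.0),
    "zstd": (4.0, 6.0),
}


def wire_rate_mb_s(raw_mb_s: float, ratio: float) -> float:
    """Compressed MB/sec leaving the producer for a given raw rate and ratio."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_mb_s / ratio


def summarize(raw_mb_s: float) -> dict:
    """Map each codec to its (low, high) wire rate in MB/sec, rounded to 1 decimal."""
    out = {}
    for codec, (lo_ratio, hi_ratio) in RATIO_RANGES.items():
        # A higher ratio yields a lower wire rate, so the bounds swap.
        out[codec] = (
            round(wire_rate_mb_s(raw_mb_s, hi_ratio), 1),
            round(wire_rate_mb_s(raw_mb_s, lo_ratio), 1),
        )
    return out


for codec, (low, high) in summarize(100.0).items():
    print(f"{codec:>6}: {low}-{high} MB/sec on the wire")
```

The same division applies to disk usage on the broker, since batches are stored in their compressed form.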

The compression.type setting:

This is set on the producer, where it defaults to none (no compression). Brokers and topics have their own compression.type setting, which defaults to producer, meaning the broker retains whatever codec the producer used.

  • Producer Configuration:
    # In producer.properties or passed via Java API
    compression.type=zstd
    
  • Broker Configuration:
    # In server.properties (can also be overridden per topic)
    # "producer" (the default) keeps whatever codec the producer used.
    # Any other value makes the broker re-compress batches that arrive with a different codec.
    compression.type=producer
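A common slip is to give the producer a value that only the broker accepts, such as producer. The sketch below is one way to guard against that when generating a producer.properties file; the helper name producer_properties is hypothetical and the function only renders text, it does not talk to Kafka.

```python
from typing import Dict, Optional

# Values accepted by the producer's compression.type setting.
PRODUCER_CODECS = ("none", "gzip", "snappy", "lz4", "zstd")


def producer_properties(codec: str, extra: Optional[Dict[str, str]] = None) -> str:
    """Render producer settings as .properties text, rejecting unknown codecs."""
    if codec not in PRODUCER_CODECS:
        raise ValueError(f"unsupported producer compression.type: {codec!r}")
    settings = {"compression.type": codec}
    settings.update(extra or {})
    # Sort keys so the rendered file is stable between runs.
    return "\n".join(f"{key}={value}" for key, value in sorted(settings.items()))


print(producer_properties("zstd", {"acks": "all"}))
```

Note that each setting goes on its own line with no trailing comment: in a .properties file, a # in the middle of a line is part of the value rather than the start of a comment.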
    

Choosing the right codec is an empirical process. You need to monitor your Kafka cluster’s CPU, network, and disk I/O under realistic load.

  • If your producer/consumer CPU is maxing out: Try Snappy or a lower compression level of ZSTD.
  • If your network throughput is maxed out or disk I/O is the bottleneck: Try ZSTD, potentially at a higher level.
  • For a balanced approach: LZ4 is often a good starting point.
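The rules of thumb above can be written down as a small function, which is handy if you want to encode them in a runbook or a capacity-planning script. This is only a sketch: suggest_codec is a hypothetical name, utilization is expressed as a fraction between 0 and 1, and the 0.8 threshold is an arbitrary assumption rather than a Kafka recommendation.

```python
def suggest_codec(cpu_util: float, network_util: float, disk_util: float,
                  threshold: float = 0.8) -> str:
    """Suggest a starting codec from utilization fractions in [0, 1].

    The 0.8 threshold is an arbitrary assumption for illustration.
    """
    cpu_bound = cpu_util >= threshold
    io_bound = max(network_util, disk_util) >= threshold
    if cpu_bound and not io_bound:
        # CPU is the bottleneck: pick the cheapest codec.
        return "snappy"
    if io_bound and not cpu_bound:
        # Network or disk is the bottleneck: spend CPU to shrink the data.
        return "zstd"
    # Neither or both resources are saturated: start from the balanced option.
    return "lz4"


print(suggest_codec(cpu_util=0.95, network_util=0.30, disk_util=0.20))
print(suggest_codec(cpu_util=0.40, network_util=0.90, disk_util=0.50))
print(suggest_codec(cpu_util=0.50, network_util=0.50, disk_util=0.50))
```

Treat the result as a first candidate to benchmark, not a final answer; the monitoring described above still decides.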

ZSTD’s main advantage is that it offers a sliding scale. With compression.type=zstd and no explicit level, Kafka uses the library’s default level, 3. That level gives good compression ratios (often 4:1 or better on text-like data) while keeping CPU usage manageable, typically around 20-30% under heavy load, which makes it a sensible default for many workloads. If you need a different trade-off, recent Kafka versions (3.8 and later, via KIP-390) let the producer set the level explicitly, for example compression.zstd.level=5; older clients have no such setting.
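To get a feel for how a compression level trades CPU time for ratio, you can time a codec at several levels on a sample of your own data. The sketch below uses zlib from Python's standard library purely as a stand-in so that it runs without extra dependencies; it is not zstd, its levels do not map onto zstd's, and the synthetic payload is far more repetitive than most real topics.

```python
import time
import zlib


def measure_levels(payload: bytes, levels=(1, 6, 9)):
    """Return (level, ratio, seconds) per level, using zlib as a stand-in for zstd."""
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(payload, level)
        elapsed = time.perf_counter() - start
        results.append((level, len(payload) / len(compressed), elapsed))
    return results


# Synthetic, repetitive JSON-like payload; real message data will behave differently.
sample = b'{"user_id": 12345, "event": "page_view", "path": "/products/42"}\n' * 20000

for level, ratio, seconds in measure_levels(sample):
    print(f"level {level}: ratio {ratio:.1f}:1 in {seconds * 1000:.1f} ms")
```

Replacing sample with a dump of real messages, and zlib with a zstd binding if you have one installed, gives numbers that are much closer to what your producers will see.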

The next hurdle you’ll encounter after optimizing compression is understanding Kafka’s idempotent producer and transactional APIs.
