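Kafka’s isolation levels are a bit of a misnomer: they don’t isolate consumers from each other, and they never hide data that has already been committed. They only control whether a consumer can see messages that belong to transactions that have not committed, either because the transaction is still open or because it was aborted.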

Let’s see this in action. Imagine we have a Kafka topic named orders with three partitions.

# Create a topic with 3 partitions
kafka-topics --bootstrap-server localhost:9092 --create --topic orders --partitions 3 --replication-factor 1

Now, let’s simulate a producer writing some data. We’ll use kafka-console-producer for simplicity.

# Start an idempotent (but not transactional) producer; each line typed at the prompt is sent as one message
kafka-console-producer --bootstrap-server localhost:9092 --topic orders --producer-property enable.idempotence=true --producer-property acks=all
> {"order_id": 1, "status": "PENDING"}
> {"order_id": 2, "status": "PENDING"}
> {"order_id": 3, "status": "PENDING"}

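By default, Kafka consumers operate with read_uncommitted isolation; read_committed has to be requested explicitly by setting isolation.level=read_committed. A read_committed consumer reads non-transactional messages as usual, but it returns transactional messages only once their transaction has committed. While a transaction is still open, the consumer does not read past that transaction’s first message, and if the transaction aborts, its messages are skipped. Messages from a non-transactional producer like the one above are visible at either isolation level: network issues and missing acknowledgments are handled by acks, retries, and replication, not by the isolation level.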

Consider a consumer group order_processors reading from orders:

# Start a consumer with read_committed isolation (the default is read_uncommitted)
kafka-console-consumer --bootstrap-server localhost:9092 --topic orders --group order_processors --from-beginning --isolation-level read_committed

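Suppose the orders had come from a transactional producer instead (the console producer above is not transactional). If the first two orders were written in a transaction that committed, and the third belonged to a transaction that was still open when the consumer started, the consumer would only see: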

{"order_id": 1, "status": "PENDING"}
{"order_id": 2, "status": "PENDING"}

It would not see {"order_id": 3, "status": "PENDING"} until that transaction commits, and it would never see it if the transaction aborts. This prevents consumers from acting on incomplete or rolled-back transactions.

The alternative, and the default, is read_uncommitted. A consumer with this setting reads every message up to the partition’s high watermark, including messages from transactions that are still open or that were later aborted.

# Start a consumer with read_uncommitted isolation
kafka-console-consumer --bootstrap-server localhost:9092 --topic orders --group order_processors_uncommitted --from-beginning --isolation-level read_uncommitted

If two orders were committed and the third sat in a still-open transaction, a read_uncommitted consumer would see all three. The third message is already in the log; what is undecided is the outcome of its transaction. This is undesirable for applications that rely on transactions, because the consumer can end up processing data that is later aborted.

The primary problem Kafka’s isolation levels solve is consumer consistency in the face of producer failures or transactional complexities. When a producer uses transactional writes (which requires setting transactional.id, and that in turn requires enable.idempotence=true and acks=all), messages are only visible to read_committed consumers after the transaction is successfully committed. If the transaction aborts, those messages are never seen by read_committed consumers. read_uncommitted consumers, however, see those messages whether or not the transaction eventually commits, which can lead to inconsistency.

The actual mechanism involves a transaction coordinator on the brokers that maintains the transactional state of each producer. When a transactional producer initializes (initTransactions() in the Java client), it is assigned a producer ID (producer_id) and a producer epoch. Messages written within a transaction are tagged with this producer_id and epoch. When the transaction ends, a commit or abort control marker is written into every partition the transaction touched. Each partition also tracks its last stable offset (LSO), which is the offset of the first message of the earliest transaction that is still open. read_committed consumers are served data only up to the LSO and skip messages that belong to aborted transactions. read_uncommitted consumers bypass both checks and read everything up to the high watermark.
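To make that bookkeeping concrete, here is a toy in-memory model of a single partition. It is a sketch, not Kafka's implementation: it assumes transactions end with a commit or abort control marker written into the partition, and that read_committed consumers are limited to the last stable offset (LSO), the offset of the first message of the earliest open transaction. The names Entry, last_stable_offset, and read are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Entry:
    offset: int
    producer_id: Optional[int]    # None for a non-transactional record
    value: Optional[str] = None   # payload for data records
    marker: Optional[str] = None  # "COMMIT" or "ABORT" for control markers


def last_stable_offset(log: List[Entry]) -> int:
    """Offset of the first record of the earliest open transaction, else the log end."""
    first_open: Dict[int, int] = {}  # producer_id -> start offset of its open transaction
    for e in log:
        if e.marker is not None:
            first_open.pop(e.producer_id, None)       # transaction finished
        elif e.producer_id is not None:
            first_open.setdefault(e.producer_id, e.offset)
    # Toy assumption: offsets are 0..len(log)-1, so len(log) is the log end offset.
    return min(first_open.values(), default=len(log))


def read(log: List[Entry], isolation_level: str) -> List[str]:
    """Values a consumer would receive; control markers are never delivered."""
    data = [e for e in log if e.marker is None]
    if isolation_level == "read_uncommitted":
        return [e.value for e in data]

    # read_committed: collect offsets of records whose transaction aborted.
    aborted: Set[int] = set()
    pending: Dict[int, List[int]] = {}  # producer_id -> offsets in its current transaction
    for e in log:
        if e.marker is None and e.producer_id is not None:
            pending.setdefault(e.producer_id, []).append(e.offset)
        elif e.marker is not None:
            offsets = pending.pop(e.producer_id, [])
            if e.marker == "ABORT":
                aborted.update(offsets)

    lso = last_stable_offset(log)
    return [e.value for e in data if e.offset < lso and e.offset not in aborted]


log = [
    Entry(0, 1, value="order-1"),
    Entry(1, 1, value="order-2"),
    Entry(2, 1, marker="COMMIT"),      # producer 1 commits orders 1 and 2
    Entry(3, 2, value="order-3"),      # producer 2's transaction is still open
    Entry(4, None, value="order-4"),   # non-transactional write behind it
]

print(read(log, "read_uncommitted"))   # ['order-1', 'order-2', 'order-3', 'order-4']
print(read(log, "read_committed"))     # ['order-1', 'order-2']

log.append(Entry(5, 2, marker="ABORT"))  # producer 2 aborts
print(read(log, "read_committed"))     # ['order-1', 'order-2', 'order-4']
```

Note what happens to order-4 in this model: it is not transactional, yet a read_committed consumer cannot reach it while the earlier transaction is open, because it sits beyond the LSO. A long-running or stuck transaction therefore delays read_committed consumers on that partition.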

The one thing most people don’t realize is that read_committed isolation doesn’t magically prevent all forms of data duplication or out-of-order processing if your application logic isn’t designed for it. It specifically addresses the visibility of uncommitted messages within Kafka’s transactional framework. If your producer is sending duplicate messages (even if committed) due to retries without proper idempotence, read_committed won’t stop the consumer from seeing those duplicates. You still need to handle idempotence at the producer and potentially downstream at the consumer.
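As a rough illustration of handling duplicates downstream, the sketch below skips any order whose order_id has already been processed. It assumes order_id uniquely identifies an order, as in the sample messages above, and that the set of seen IDs fits in memory; a real consumer would need to keep that state somewhere durable. The function name dedupe_by_order_id is hypothetical.

```python
import json
from typing import Iterable, Iterator, Set


def dedupe_by_order_id(messages: Iterable[str], seen: Set[int]) -> Iterator[dict]:
    """Yield each order once, skipping any order_id already recorded in `seen`."""
    for raw in messages:
        order = json.loads(raw)
        if order["order_id"] in seen:
            continue  # duplicate delivery: this order was already processed
        seen.add(order["order_id"])
        yield order


messages = [
    '{"order_id": 1, "status": "PENDING"}',
    '{"order_id": 1, "status": "PENDING"}',  # committed duplicate, e.g. from an application-level resend
    '{"order_id": 2, "status": "PENDING"}',
]

seen_ids: Set[int] = set()
unique_orders = list(dedupe_by_order_id(messages, seen_ids))
print([o["order_id"] for o in unique_orders])  # [1, 2]
```

The isolation level plays no part here: both copies of order 1 are committed data, so read_committed delivers both and the consumer has to recognize the repeat itself.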

The next challenge you’ll likely encounter is managing consumer offsets correctly when dealing with transactional messages, especially if your consumer also needs to perform transactional writes.
