Kafka Replication: Fault Tolerance Explained

Kafka’s replication is the secret sauce that lets it shrug off broker failures like they’re mere inconveniences, keeping your data safe and your streams flowing.

Imagine a Kafka topic. It’s split into partitions, and each partition is a log of messages. Replication means that instead of just one copy of that log, there are multiple copies spread across different brokers. If one broker dies, another broker holding a replica of that partition can immediately take over, and your application doesn’t even notice a blip.

Let’s look at a simple setup. We have three brokers: broker-1, broker-2, and broker-3. We create a topic named my-events with one partition. By default, Kafka will try to make 3 copies (replicas) of this partition, one on each broker.

Here’s how we’d configure that in server.properties on each broker:

broker.id=1 # or 2, or 3
listeners=PLAINTEXT://:9092
log.dirs=/tmp/kafka-logs
num.partitions=1 # This is for the topic, but we're showing it here for context.
                  # The topic itself will be created with specific partition counts.
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
default.replication.factor=3 # This is the key one for your own topics!

The default.replication.factor=3 tells Kafka that for any new topic created without an explicit replication factor, it should aim for 3 replicas.

Now, let’s say broker-2 suddenly crashes. Your application is writing to my-events. Because broker-1 and broker-3 also have a copy of that partition’s log, Kafka automatically promotes one of them to be the new "leader" for that partition. Producers and consumers will start talking to the new leader, and they won’t know that broker-2 is gone.

The magic happens because of the In-Sync Replicas (ISRs). For each partition, there’s one broker designated as the leader. All other brokers holding a copy of that partition are followers. A follower is considered "in-sync" if it’s caught up to the leader’s log. When a producer sends a message, it only considers the write successful if the leader acknowledges it and if a configurable number of ISRs have also acknowledged it.

This "configurable number" is controlled by min.insync.replicas. If min.insync.replicas=2 and default.replication.factor=3, then a producer will wait for the leader and at least one follower to acknowledge a write. If broker-2 dies, and broker-1 is the leader and broker-3 is the only remaining ISR, the write will still succeed because min.insync.replicas (2) is met (leader + 1 ISR).

If min.insync.replicas was set to 3, and broker-2 died, the producer would stop receiving acknowledgments because only two replicas (leader + one follower) are available, failing to meet the min.insync.replicas requirement. This prevents writes from succeeding if a quorum of replicas cannot be guaranteed, thus preventing data loss.

Here’s how you’d check the ISRs for a partition using the command line:

kafka-topics.sh --bootstrap-server localhost:9092 --describe --topic my-events

The output would look something like this:

Topic:my-events PartitionCount:1 ReplicationFactor:3 Isr:1,2,3
  Topic: my-events Partition: 0 Leader: 1 Replicas: 1,2,3 Isrs: 1,2,3

Here, Leader: 1 means broker-1 is currently the leader for partition 0. Replicas: 1,2,3 means all three brokers are configured to hold a replica. Isrs: 1,2,3 means all three brokers are currently in sync and healthy.

If broker-2 went down, and broker-1 was still the leader, the Isrs might change to 1,3, and the output would show:

Topic:my-events PartitionCount:1 ReplicationFactor:3 Isr:1,3
  Topic: my-events Partition: 0 Leader: 1 Replicas: 1,2,3 Isrs: 1,3

Kafka’s internal controller constantly monitors broker health. If a leader broker fails, the controller triggers a leader election among the remaining ISRs. The partition will then have a new leader, and operations can continue.

The critical interaction is between default.replication.factor (or topic-specific replication.factor) and min.insync.replicas. For durability and availability, you typically want replication.factor to be at least 3 and min.insync.replicas to be set to replication.factor - 1. This ensures that even if one broker fails, you still have a majority of replicas available to acknowledge writes and serve reads, preventing data loss.

If you have replication.factor=3 and min.insync.replicas=1, and broker-2 fails, writes will still succeed as long as broker-1 (the leader) is up. However, if broker-1 then fails, and broker-3 becomes the leader, and then broker-3 fails, you’ve lost data because only one replica was required for acknowledgment.

The next thing you’ll grapple with is how Kafka handles unclean leader elections and the trade-offs between consistency and availability during network partitions.