Kafka’s KRaft mode isn’t just about ditching ZooKeeper; it fundamentally redefines how Kafka brokers achieve consensus, making Kafka itself the metadata controller.
Here’s a walkthrough of a simple Kafka cluster running in KRaft mode. There are no ZooKeeper processes to run; the nodes holding the controller role elect the active controller among themselves.
# Example Kafka directory structure (simplified)
/opt/kafka/
├── bin/
│   ├── kafka-topics.sh
│   ├── kafka-server-start.sh
│   └── ...
└── config/
    ├── kraft.properties
    └── server.properties
Let’s set up a minimal kraft.properties file for a single-node KRaft cluster. This file contains the essential configuration for KRaft to operate.
# kraft.properties
process.roles=broker,controller
node.id=1
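# The controller listener must be declared here alongside the client listener
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
advertised.listeners=PLAINTEXT://localhost:9092
listener.security.protocol.map=PLAINTEXT:PLAINTEXT,CONTROLLER:PLAINTEXT
log.dirs=/tmp/kraft-logs
# node.id@host:port of each controller, using the CONTROLLER listener port
# (properties files do not support trailing comments on a value line)
controller.quorum.voters=1@localhost:9093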
controller.listener.names=CONTROLLER
group.initial.rebalance.delay.ms=0
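Before the first start, format the storage directory with a cluster ID; a KRaft node will not start on an unformatted log.dirs. Then start the Kafka server with this configuration.
/opt/kafka/bin/kafka-storage.sh format -t "$(/opt/kafka/bin/kafka-storage.sh random-uuid)" -c /opt/kafka/config/kraft.properties
/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/kraft.properties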
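KRaft log directories have to be formatted once with kafka-storage.sh before a node's first start, and a formatted directory contains a meta.properties file. A start script can use that file to skip formatting on later runs. The wrapper below is a minimal sketch under those assumptions: the is_formatted helper and the variable names are illustrative rather than part of Kafka, the paths mirror this example, and generating a fresh cluster ID this way only suits a single-node setup (every node in a multi-node cluster must be formatted with the same ID).

```shell
#!/bin/sh
# Sketch of an idempotent start wrapper for a single-node KRaft setup.
# is_formatted and the variable names are illustrative, not Kafka tooling.
KAFKA_HOME="${KAFKA_HOME:-/opt/kafka}"
CONFIG="${CONFIG:-$KAFKA_HOME/config/kraft.properties}"
LOG_DIR="${LOG_DIR:-/tmp/kraft-logs}"   # must match log.dirs in CONFIG

# A formatted log directory contains a meta.properties file.
is_formatted() {
  [ -f "$1/meta.properties" ]
}

# Only invoke the Kafka tools when they are actually installed.
if [ -x "$KAFKA_HOME/bin/kafka-storage.sh" ]; then
  if ! is_formatted "$LOG_DIR"; then
    cluster_id=$("$KAFKA_HOME/bin/kafka-storage.sh" random-uuid)
    "$KAFKA_HOME/bin/kafka-storage.sh" format -t "$cluster_id" -c "$CONFIG"
  fi
  exec "$KAFKA_HOME/bin/kafka-server-start.sh" "$CONFIG"
else
  echo "kafka-storage.sh not found under $KAFKA_HOME/bin; nothing to do"
fi
```

Running the wrapper twice formats only on the first run; the second run finds meta.properties and goes straight to starting the server.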
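You’ll see logs indicating the Raft layer starting and, on a single combined node, the node’s own controller role becoming the active controller. The lines below are illustrative; exact messages and class names vary by Kafka version.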
[2023-10-27 10:00:00,123] INFO [KafkaServer id=1] Starting Kafka server (kafka.server.KafkaServer)
[2023-10-27 10:00:00,456] INFO [Controller id=1] Starting controller (kafka.controller.KafkaController)
[2023-10-27 10:00:00,789] INFO [KafkaRaftServer id=1] Starting Raft server (kafka.raft.KafkaRaftServer)
[2023-10-27 10:00:01,012] INFO [Controller id=1] Elected as controller (kafka.controller.KafkaController)
The problem KRaft solves is the operational overhead and complexity of managing a separate ZooKeeper ensemble. ZooKeeper, while robust, adds another distributed system to monitor, secure, and scale. KRaft integrates the consensus mechanism directly into Kafka brokers, simplifying deployment and management.
Internally, KRaft uses the Raft consensus algorithm. A node’s process.roles setting decides whether it acts as a broker, a controller, or both. When a node with the controller role starts, it joins the controller quorum defined by controller.quorum.voters, a list of node.id@host:port entries for every controller node. Raft ensures that only one controller is active at any given time; the active controller is the leader of the replicated metadata log and manages cluster metadata such as topics, partitions, configurations, and broker registrations. Consumer group offsets are not part of this metadata; they remain in the __consumer_offsets topic, which the brokers manage. When the active controller fails, the remaining quorum members elect a new one through Raft leader election, which requires a majority of the voters to be available.
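Because Raft elections need a majority of voters, the size of the quorum decides how many controller failures a cluster can survive. The sketch below works through that arithmetic for a controller.quorum.voters value. The function names and the controller1..3 hostnames are made up for illustration.

```shell
#!/bin/sh
# Sketch: Raft quorum arithmetic for a controller.quorum.voters value.
# Helper names and hostnames are illustrative, not part of Kafka.

# Count the voters in a comma-separated node.id@host:port list.
voter_count() {
  # Put one voter per line, then count the lines that contain '@'.
  printf '%s\n' "$1" | tr ',' '\n' | grep -c '@'
}

# Smallest number of voters that forms a majority.
majority() {
  echo $(( $1 / 2 + 1 ))
}

# Number of voters that can fail while a majority remains.
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

voters="1@controller1:9093,2@controller2:9093,3@controller3:9093"
n=$(voter_count "$voters")
echo "voters=$n majority=$(majority "$n") tolerated_failures=$(tolerated_failures "$n")"
# prints: voters=3 majority=2 tolerated_failures=1
```

The same functions show why even-sized quorums buy little: four voters need a majority of three and still tolerate only one failure, which is why odd sizes such as three or five are the usual choice.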
The key levers you control are:
process.roles: Determines whether a node acts as a controller, a broker, or both. Combined mode (process.roles=broker,controller) is convenient for development; for production you’d typically run at least three controller nodes for high availability, and Kafka’s documentation advises against combined mode for critical deployments in favor of dedicated controllers (process.roles=controller).
node.id: A unique identifier for each node in the cluster, whether it is a broker or a controller.
controller.quorum.voters: The list of node.id@host:port entries for all controller nodes. This is crucial for bootstrapping and leader election.
controller.listener.names: The name of the listener that carries controller traffic. On controller nodes it must also be defined in listeners.
listeners and advertised.listeners: Standard Kafka listener configurations.
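These settings constrain one another, and two mismatches are easy to make: a controller whose node.id is missing from controller.quorum.voters, and a controller that never binds the listener named in controller.listener.names. The script below is a rough pre-flight check for both. It is a sketch rather than Kafka tooling: prop and check_kraft_config are hypothetical helpers, and it assumes a statically configured quorum and a single name in controller.listener.names.

```shell
#!/bin/sh
# Sketch: sanity-check a KRaft properties file before starting a node.
# prop and check_kraft_config are illustrative helpers, not Kafka tooling.

# Print the value of a key from a properties file (last occurrence wins).
prop() {
  key_re=$(printf '%s' "$2" | sed 's/\./\\./g')   # escape dots for sed
  sed -n "s/^$key_re=//p" "$1" | tail -n 1
}

check_kraft_config() {
  file=$1
  roles=$(prop "$file" process.roles)
  node_id=$(prop "$file" node.id)
  voters=$(prop "$file" controller.quorum.voters)
  ctrl_listener=$(prop "$file" controller.listener.names)
  listeners=$(prop "$file" listeners)

  case ",$roles," in
    *,controller,*)
      # A controller's node.id must appear among the quorum voters.
      case ",$voters" in
        *",$node_id@"*) ;;
        *) echo "node.id=$node_id is not listed in controller.quorum.voters"
           return 1 ;;
      esac
      # A controller must bind the listener named by controller.listener.names.
      case ",$listeners" in
        *",$ctrl_listener://"*) ;;
        *) echo "listeners does not define $ctrl_listener"
           return 1 ;;
      esac
      ;;
  esac
  echo "ok"
}

# Usage: check a small single-node configuration written to a temp file.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
process.roles=broker,controller
node.id=1
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.quorum.voters=1@localhost:9093
controller.listener.names=CONTROLLER
EOF
check_kraft_config "$cfg"
rm -f "$cfg"
```

Kafka performs its own validation at startup, so a check like this is only useful for catching mistakes earlier, for example in a deployment pipeline.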
KRaft mode introduces a required bootstrapping step rather than a new configuration flag; Kafka has no initial.metadata.enable property. Before a node’s first start, its log.dirs must be formatted with kafka-storage.sh format, passing a cluster ID (kafka-storage.sh random-uuid generates one) that is shared by every node in the cluster. Formatting writes a meta.properties file recording the cluster ID and node ID into each log directory. The cluster metadata itself is not spread across internal topics such as __cluster_id or __controller_epoch; it lives in a single Raft-replicated log, __cluster_metadata. If the log directory has not been formatted, the node fails at startup because it has no cluster ID to establish a baseline for cluster state. Formatting is a one-time step: subsequent restarts of an existing cluster reuse the formatted directories.
The next step is to understand how to manage topics and consumer groups in this new metadata management paradigm.