Kafka brokers don’t just stop when you tell them to; they try to be good citizens and clean up after themselves.

Let’s see what that looks like in practice. Imagine we have a simple Kafka setup: two brokers, broker-1 and broker-2, and a topic my-topic with one partition.

{
  "version": 1,
  "brokers": {
    "broker-1": {
      "host": "localhost",
      "port": 9092
    },
    "broker-2": {
      "host": "localhost",
      "port": 9093
    }
  },
  "topics": {
    "my-topic": {
      "partitions": {
        "0": {
          "replicas": [1, 2],
          "leader": 1
        }
      }
    }
  }
}

Here, broker-1 is the leader for my-topic-0. Now, if we wanted to shut down broker-1, a naive approach would be to just kill the process. But Kafka has a built-in mechanism to handle this more gracefully.
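Before stopping a broker, it helps to know which partitions it currently leads, since those are the ones whose leadership has to move. The following is a minimal Python sketch that treats the JSON layout above purely as illustrative data; the function name partitions_led_by is made up for this example and is not part of any Kafka tooling.

```python
import json

# Illustrative cluster layout, trimmed from the JSON sketch above.
# This is sample data for the example, not a file that Kafka reads.
LAYOUT = json.loads("""
{
  "topics": {
    "my-topic": {
      "partitions": {
        "0": {"replicas": [1, 2], "leader": 1}
      }
    }
  }
}
""")


def partitions_led_by(layout, broker_id):
    """Return sorted (topic, partition) pairs whose leader is broker_id."""
    led = []
    for topic, spec in layout["topics"].items():
        for partition, state in spec["partitions"].items():
            if state["leader"] == broker_id:
                led.append((topic, int(partition)))
    return sorted(led)


print(partitions_led_by(LAYOUT, 1))  # partitions that must move off broker 1
print(partitions_led_by(LAYOUT, 2))  # broker 2 leads nothing yet
```

With this layout, broker 1 leads my-topic partition 0 and broker 2 leads nothing, so shutting down broker 1 requires exactly one leadership handover.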

When you initiate a shutdown of a Kafka broker, it first attempts to transfer leadership of any partitions it currently leads to another replica. This is crucial because if the leader goes down abruptly without transferring leadership, clients (producers and consumers) will experience errors and delays as they discover the new leader.

The process looks something like this:

  1. Shutdown Signal Received: When a broker receives a shutdown signal (e.g., SIGINT from Ctrl+C or SIGTERM from a systemctl stop kafka command), it starts a graceful shutdown sequence.
  2. Partition Leader Handover: The shutting-down broker asks the cluster controller to move leadership away from it. In ZooKeeper mode the controller is a broker that records the change in ZooKeeper; in KRaft mode it is the active controller in the Raft quorum.
  3. New Leader Takes Over: For each partition the departing broker leads, the controller selects a new leader from the remaining in-sync replicas. This new leader will then begin serving requests.
  4. Broker Shuts Down: Once leadership has been successfully transferred for all partitions, the broker proceeds with its shutdown.
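The handover steps above can be sketched as a small in-memory model. This is a simplified illustration, not Kafka's actual controller code: the Partition class and controlled_shutdown function are hypothetical names, and the rule used here (pick the first in-sync replica in assignment order that is not the departing broker) is an assumption about the selection policy made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    replicas: list  # replica broker IDs in assignment order
    isr: list       # in-sync replica broker IDs
    leader: int     # current leader broker ID


def controlled_shutdown(partitions, broker_id):
    """Move leadership off broker_id in place.

    Returns the names of partitions that had no safe successor.
    """
    stuck = []
    for name, p in partitions.items():
        if p.leader != broker_id:
            # The broker is only a follower here: it just leaves the ISR.
            p.isr = [r for r in p.isr if r != broker_id]
            continue
        # Candidates: in-sync replicas other than the departing broker,
        # tried in assignment order (assumed policy for this sketch).
        candidates = [r for r in p.replicas if r in p.isr and r != broker_id]
        if not candidates:
            stuck.append(name)  # no in-sync successor; leave state untouched
            continue
        p.leader = candidates[0]
        p.isr = [r for r in p.isr if r != broker_id]
    return stuck


state = {"my-topic-0": Partition(replicas=[1, 2], isr=[1, 2], leader=1)}
stuck = controlled_shutdown(state, broker_id=1)
print(state["my-topic-0"].leader, state["my-topic-0"].isr, stuck)
```

In this model, a partition whose only in-sync replica is the departing broker ends up in the stuck list, which mirrors why a controlled shutdown can fail to complete when followers are lagging.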

Let’s simulate this. Suppose we have a producer sending messages to my-topic.

./bin/kafka-console-producer.sh --bootstrap-server localhost:9092,localhost:9093 --topic my-topic
> message 1
> message 2

Now, let’s shut down broker-1 gracefully.

# On the machine running broker-1
./bin/kafka-server-stop.sh

If you monitor your Kafka logs, you’ll see messages indicating that broker-1 is attempting to transfer leadership. You can also check the partition leadership using the kafka-topics.sh tool before the shutdown completes.

# Run this command in a separate terminal while broker-1 is shutting down
./bin/kafka-topics.sh --bootstrap-server localhost:9092,localhost:9093 --describe --topic my-topic

Before the shutdown is finalized, you’d see something like this:

Topic: my-topic  PartitionCount: 1  ReplicationFactor: 2  Configs:
        Topic: my-topic Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2

And as broker-1 (ID 1) successfully transfers leadership, you’d see the leader change:

Topic: my-topic  PartitionCount: 1  ReplicationFactor: 2  Configs:
        Topic: my-topic Partition: 0  Leader: 2  Replicas: 1,2  Isr: 2

Notice how the Leader field changes from 1 to 2, and how broker 1 drops out of the in-sync replica (Isr) list because it is leaving the cluster. Now, broker-2 (ID 2) is the leader. Producers and consumers learn about the new leader from cluster metadata rather than from their bootstrap list: a request sent to the old leader fails with a retriable "not leader" error, the client refreshes its metadata, and it retries against broker-2, usually with only a brief pause. If broker-1 had simply been killed, the partition would have had no leader until the cluster detected the failure and elected a replacement, so clients would have seen a longer window of failed or delayed requests.
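If you want to watch for the leader change from a script rather than by eye, the per-partition lines of the describe output can be parsed. The sketch below assumes the line format shown above; other Kafka versions may print additional fields, and parse_partition_line is a hypothetical helper name, not part of the Kafka tooling.

```python
import re

# Matches per-partition lines in the format shown above. The header line
# ("PartitionCount: ...") does not match because it lacks "Partition:".
LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn '1,2' into [1, 2]; an empty string becomes []."""
    return [int(x) for x in csv.split(",") if x]


def parse_partition_line(line):
    """Return a dict for a per-partition line, or None for any other line."""
    m = LINE.search(line)
    if m is None:
        return None
    return {
        "topic": m.group("topic"),
        "partition": int(m.group("partition")),
        "leader": int(m.group("leader")),
        "replicas": _ids(m.group("replicas")),
        "isr": _ids(m.group("isr")),
    }


# Sample lines written for this example, modeled on the output above.
before = parse_partition_line(
    "        Topic: my-topic Partition: 0  Leader: 1  Replicas: 1,2  Isr: 1,2"
)
after = parse_partition_line(
    "        Topic: my-topic Partition: 0  Leader: 2  Replicas: 1,2  Isr: 2"
)
print(before["leader"], "->", after["leader"])
```

Feeding each line of the command's output through parse_partition_line and comparing the leader field between two runs is enough to detect the handover.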

This graceful shutdown is controlled by several configurations in server.properties:

  • controlled.shutdown.enable: Set to true (which is the default) to enable this feature. If false, the broker will not attempt to transfer leadership.
  • controlled.shutdown.max.retries: How many times the broker attempts a controlled shutdown before giving up and shutting down uncleanly. The default is 3.
  • controlled.shutdown.retry.backoff.ms: How long in milliseconds the broker waits between attempts, which gives the cluster time to recover from whatever caused the failure (for example, a controller failover or lagging replicas). The default is 5000 (5 seconds).
  • zookeeper.connection.timeout.ms: Relevant in ZooKeeper mode, where the controller records leadership changes in ZooKeeper. If ZooKeeper is slow or unresponsive, controlled shutdown attempts can fail. When this setting is left unset, the broker falls back to zookeeper.session.timeout.ms, which defaults to 18000 (18 seconds) in Kafka 2.5 and later.

The most counterintuitive aspect of controlled shutdown is that it’s not instantaneous. It relies on the cluster controller, its coordination layer (ZooKeeper or Raft), and network communication to negotiate leadership changes. This means that the shutdown time is dependent on the health and responsiveness of the cluster and its coordination mechanism, not just the local machine. If the controller or ZooKeeper is overloaded or unreachable, the broker will eventually exhaust its retries and fall back to a more abrupt shutdown, leaving the partitions it led without a leader until the controller notices the failure and elects replacements.
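To get a feel for how long a broker might keep retrying before it gives up, you can multiply the retry count by the backoff. Kafka's broker configuration includes controlled.shutdown.max.retries (default 3) and controlled.shutdown.retry.backoff.ms (default 5000). The sketch below reads those keys from server.properties-style text and computes only the time spent waiting between attempts; it ignores the time each request itself takes, so treat it as a rough lower bound under that assumption. The helper names are hypothetical.

```python
# Defaults as documented for the broker settings named above.
DEFAULTS = {
    "controlled.shutdown.enable": "true",
    "controlled.shutdown.max.retries": "3",
    "controlled.shutdown.retry.backoff.ms": "5000",
}


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def backoff_budget_ms(props):
    """Estimate total backoff time across all controlled shutdown attempts.

    Rough estimate only: excludes the duration of each request.
    """
    cfg = {**DEFAULTS, **props}
    if cfg["controlled.shutdown.enable"].lower() != "true":
        return 0  # feature disabled: the broker does not wait at all
    retries = int(cfg["controlled.shutdown.max.retries"])
    backoff = int(cfg["controlled.shutdown.retry.backoff.ms"])
    return retries * backoff


sample = """
# hypothetical overrides for this example
controlled.shutdown.max.retries=5
controlled.shutdown.retry.backoff.ms=2000
"""
print(backoff_budget_ms(parse_properties(sample)))
print(backoff_budget_ms({}))
```

With the hypothetical overrides the estimate is 10 seconds of backoff, and with the defaults it is 15 seconds, which is a useful number to compare against whatever stop timeout your service manager enforces.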

The next thing you’ll likely encounter is handling scenarios where the controlled shutdown fails to complete within the timeout.
