Kafka’s disk balancing isn’t about moving data around after it’s written; it’s about preventing uneven disk usage from the start by intelligently placing new partitions.
Let’s see it in action. Imagine we have a Kafka cluster with three brokers (broker-1, broker-2, broker-3) and we’re creating a new topic, user_activity, with 6 partitions.
# List existing topics and their partition assignments
kafka-topics.sh --bootstrap-server kafka-broker-1:9092 --list
# (Assume user_activity isn't there yet)
# Create the topic with 6 partitions and 3 replicas
kafka-topics.sh --bootstrap-server kafka-broker-1:9092 --create \
--topic user_activity \
--partitions 6 \
--replication-factor 3
# Describe the topic to see partition distribution
kafka-topics.sh --bootstrap-server kafka-broker-1:9092 --describe --topic user_activity
If the placement works as intended, the output looks something like this. The Leader, Replicas, and Isr columns show numeric broker IDs; in this example, IDs 0, 1, and 2 belong to broker-1, broker-2, and broker-3:
Topic: user_activity PartitionCount: 6 ReplicationFactor: 3 Configs:
Topic: user_activity Partition: 0 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: user_activity Partition: 1 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Topic: user_activity Partition: 2 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Topic: user_activity Partition: 3 Leader: 0 Replicas: 0,1,2 Isr: 0,1,2
Topic: user_activity Partition: 4 Leader: 1 Replicas: 1,2,0 Isr: 1,2,0
Topic: user_activity Partition: 5 Leader: 2 Replicas: 2,0,1 Isr: 2,0,1
Notice how partitions 0 and 3 are led by broker-1, 1 and 4 by broker-2, and 2 and 5 by broker-3. Each broker is a leader for two partitions, and each broker holds a replica for all six partitions. This is the ideal scenario for even disk usage and load distribution.
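On a topic with many partitions, counting leaders and replicas by eye gets tedious. The following is a minimal sketch that tallies them from the describe output. It assumes the Leader and Replicas columns hold numeric broker IDs, as kafka-topics.sh prints them; tally_brokers is a hypothetical helper name, not part of Kafka.

```shell
#!/usr/bin/env bash
# Tally leaders and replicas per broker ID from `kafka-topics.sh --describe` output.
# Reads the describe output on stdin and prints one line per broker:
#   broker <id> leaders=<n> replicas=<n>
tally_brokers() {
  awk '
    # Only per-partition lines contain "Partition: "; the summary line does not.
    /Partition: / {
      for (i = 1; i <= NF; i++) {
        if ($i == "Leader:") leaders[$(i + 1)]++
        if ($i == "Replicas:") {
          n = split($(i + 1), ids, ",")
          for (j = 1; j <= n; j++) replicas[ids[j]]++
        }
      }
    }
    END {
      for (b in replicas)
        printf "broker %s leaders=%d replicas=%d\n", b, leaders[b], replicas[b]
    }
  ' | sort
}
```

Usage would be along the lines of piping the describe command into it: kafka-topics.sh --bootstrap-server kafka-broker-1:9092 --describe --topic user_activity | tally_brokers. Roughly equal counts per broker suggest an even spread; the sketch says nothing about how many bytes each partition actually holds.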
The core problem Kafka’s disk balancing mechanism addresses is the "hotspotting" of brokers. If all partitions for a popular topic are concentrated on a few brokers, those brokers’ disks will fill up faster, and their network I/O will become a bottleneck, impacting the entire cluster’s performance. Conversely, if partitions are spread out, the read and write load, as well as the storage footprint, are distributed more evenly across all available brokers.
Kafka achieves this primarily through the replica placement logic that runs when topics are created or expanded. This logic is built into the controller. There is no auto.partition.assignment.strategy broker setting that selects it, and the similarly named partition.assignment.strategy is a consumer setting: it governs how a consumer group divides partitions among its members, not where replicas are stored. The built-in placement aims to:
- Distribute partition leaders evenly: It tries to assign the leader for each partition to a different broker, cycling through the available brokers.
- Distribute replicas evenly: For each partition, it assigns replicas to different brokers, again aiming for an even spread and never placing two replicas of the same partition on one broker.
- Consider rack awareness (if configured): If brokers are tagged with rack IDs (via broker.rack in server.properties), the placement tries to spread each partition’s replicas across racks to improve fault tolerance.
The one placement-related setting you are likely to touch in server.properties is broker.rack. The rack name below is only an example value:
# In server.properties on each broker
broker.rack=rack-a
When you create a topic (kafka-topics.sh --create), Kafka’s controller takes the number of partitions, the replication factor, and the available brokers into account. It then decides which brokers hold each partition’s replicas; the first replica in each list is the preferred leader. This is a one-time decision for a partition’s initial placement, and it is based on replica counts, not on how full each disk is. Kafka does not move partitions on its own to rebalance disk usage. If you need to rebalance after the fact, you have to reassign partitions explicitly, either with kafka-reassign-partitions.sh or with tooling that automates reassignment.
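To make the round-robin idea concrete, here is a deliberately simplified sketch. It is not Kafka’s actual placement code, which differs in detail (for example, it does not necessarily start at the first broker, and it accounts for racks). The function name assign_replicas and its output format are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Simplified round-robin placement (illustrative only, not Kafka's implementation):
# partition p starts at broker index p mod N, and its remaining replicas
# are the next brokers in order, wrapping around the broker list.
# Usage: assign_replicas <partitions> <replication_factor> <broker_id>...
assign_replicas() {
  local partitions=$1 rf=$2
  shift 2
  local brokers=("$@")
  local n=${#brokers[@]}
  if (( rf > n )); then
    echo "replication factor $rf exceeds broker count $n" >&2
    return 1
  fi
  local IFS=,   # makes "${replicas[*]}" join the IDs with commas
  local p r replicas
  for (( p = 0; p < partitions; p++ )); do
    replicas=()
    for (( r = 0; r < rf; r++ )); do
      replicas+=("${brokers[(p + r) % n]}")
    done
    # The first replica in the list is treated as the preferred leader.
    echo "partition $p leader ${replicas[0]} replicas ${replicas[*]}"
  done
}
```

Calling assign_replicas 6 3 0 1 2 produces the same leader and replica pattern as the describe output shown earlier. Because the sketch only counts replicas, two partitions of very different sizes weigh the same, which is exactly why count-based placement cannot guarantee equal disk usage.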
One crucial detail often missed is how Kafka handles broker failures and recoveries. When a broker goes down, the partitions it was leading are briefly unavailable until the controller elects a new leader for each of them from the remaining in-sync replicas (ISR). This election maintains availability, but it doesn’t rebalance disk usage. If a broker is permanently removed, its replicas must be reassigned to other brokers. This is typically done via kafka-reassign-partitions.sh, which takes a JSON file describing the desired replica list for each partition. Kafka then creates any missing replicas on their new brokers, waits for them to catch up, and deletes the replicas that are no longer listed.
For example, suppose broker-2 (ID 1) has failed for good and only broker-1 (ID 0) and broker-3 (ID 2) remain. Because every partition already has a replica on both surviving brokers, the reassignment simply drops ID 1 from each replica list. broker-2 can be added back with a second reassignment once it returns with empty disks:
# Write a JSON file describing the desired replica list for each partition
# (by hand or with a script). JSON does not allow comments, so notes stay outside the file.
# Example: reassignment.json
{
  "version": 1,
  "partitions": [
    {
      "topic": "user_activity",
      "partition": 0,
      "replicas": [0, 2]
    },
    {
      "topic": "user_activity",
      "partition": 1,
      "replicas": [2, 0]
    }
  ]
}
# ... plus one entry for each of the other partitions
# Execute the reassignment
kafka-reassign-partitions.sh --bootstrap-server kafka-broker-1:9092 \
--execute --reassignment-json-file reassignment.json
# Check progress (rerun until every partition is reported as complete)
kafka-reassign-partitions.sh --bootstrap-server kafka-broker-1:9092 \
--verify --reassignment-json-file reassignment.json
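Writing the reassignment file by hand is error-prone once a topic has more than a few partitions. As a rough sketch, the file could be derived from the describe output instead. The helper below, drop_broker_json, is a hypothetical name; it assumes numeric broker IDs in the Replicas column and simply removes one ID from every replica list.

```shell
#!/usr/bin/env bash
# Build a reassignment JSON that removes one broker ID from every replica list.
# Reads `kafka-topics.sh --describe` output on stdin and writes JSON to stdout.
# Usage: drop_broker_json <broker_id_to_remove>
drop_broker_json() {
  awk -v drop="$1" '
    BEGIN { printf "{\"version\": 1, \"partitions\": [" }
    # Only per-partition lines contain "Partition: ".
    /Partition: / {
      topic = ""; part = ""; list = ""
      for (i = 1; i <= NF; i++) {
        if ($i == "Topic:")     topic = $(i + 1)
        if ($i == "Partition:") part = $(i + 1)
        if ($i == "Replicas:") {
          n = split($(i + 1), ids, ",")
          for (j = 1; j <= n; j++)
            if (ids[j] != drop) list = list (list == "" ? "" : ", ") ids[j]
        }
      }
      printf "%s\n  {\"topic\": \"%s\", \"partition\": %s, \"replicas\": [%s]}", sep, topic, part, list
      sep = ","
    }
    END { print "\n]}" }
  '
}
```

One way to use it would be to pipe the describe command into drop_broker_json 1 and redirect the result to reassignment.json. Review the generated file before running --execute, since the sketch does not check that enough replicas remain for each partition.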
Kafka’s placement logic runs only when a topic is created or when partitions are added to an existing topic. If you add new brokers to the cluster, they won’t automatically receive any existing partitions. You must explicitly use kafka-reassign-partitions.sh to move partitions onto the new brokers to achieve a better distribution.
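For the new-broker case, kafka-reassign-partitions.sh can propose a plan itself through its --generate mode, which takes a list of topics and a target broker list. The sketch below only builds the topic list file; topics_to_move_json is a hypothetical helper name, and broker ID 3 in the commented command stands in for a newly added broker.

```shell
#!/usr/bin/env bash
# Print the "topics to move" JSON that the --generate mode reads.
# Usage: topics_to_move_json <topic>...
topics_to_move_json() {
  local sep="" topic
  printf '{"version": 1, "topics": ['
  for topic in "$@"; do
    printf '%s{"topic": "%s"}' "$sep" "$topic"
    sep=", "
  done
  printf ']}\n'
}

# Example workflow (requires a running cluster, so it is left commented out):
#   topics_to_move_json user_activity > topics.json
#   kafka-reassign-partitions.sh --bootstrap-server kafka-broker-1:9092 \
#     --topics-to-move-json-file topics.json --broker-list "0,1,2,3" --generate
```

The --generate mode only prints the current assignment and a proposed one; nothing moves until the proposed JSON is saved to a file and passed to --execute.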
The next step after ensuring your partitions are well-distributed is understanding how consumer group rebalancing works.