The UnknownTopicOrPartitionException means a Kafka client tried to interact with a topic or partition that doesn’t exist on the Kafka cluster, or the broker it contacted doesn’t know about it yet. This usually happens when a new topic is created and a client tries to produce or consume before all brokers have registered the topic’s metadata.
Here are the common causes and how to fix them:
1. Topic Creation Lag/Replication Issues
- Diagnosis: Check the topic’s status and partition leadership across brokers.

      kafka-topics.sh --bootstrap-server <your_broker_address>:9092 --describe --topic <your_topic_name>

  Look for partitions that are Offline or have no leader (Leader: none). If the replication factor is 1 and you see Offline, the broker hosting that partition is likely down. If the replication factor is greater than 1 and some replicas are missing or Offline, it points to replication lag or broker unavailability.
- Fix:
  - If partitions are Offline or show Leader: none: This often means the broker responsible for the partition (or every broker, in a small cluster) is down or unhealthy. Restart the broker(s). If you just created the topic, wait a moment for metadata to propagate.
  - If the replication factor is greater than 1 and replicas are missing or offline: Ensure all brokers are running and healthy. If a broker is down, bring it back online. Once it is healthy, Kafka should automatically re-elect leaders and sync replicas. If a broker has been permanently removed, you might need to manually reassign partitions using kafka-reassign-partitions.sh.
  - Why it works: Kafka relies on brokers having up-to-date metadata about topics and partitions. If a partition is offline, no broker can serve requests for it. Bringing brokers back online or allowing metadata to propagate gives the cluster a consistent view of the topic.
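The leader check above can be automated when many partitions are involved. The following is a minimal sketch, not an official tool: it assumes the tab-separated row layout that kafka-topics.sh --describe usually prints, and it treats both "none" and "-1" as a missing leader on the assumption that older releases print the latter. The function name and the sample text are made up for illustration.

```python
import re

# Matches one partition row of `kafka-topics.sh --describe` output.
# Assumption: rows look like "Topic: t  Partition: 0  Leader: 1  Replicas: ...".
_ROW = re.compile(r"Partition:\s*(\d+)\s+Leader:\s*(\S+)")


def leaderless_partitions(describe_output: str) -> list[int]:
    """Return the partition numbers whose leader is reported as missing."""
    missing = []
    for line in describe_output.splitlines():
        match = _ROW.search(line)
        # "none" and "-1" are both treated as "no leader" (assumption, see lead-in).
        if match and match.group(2).lower() in {"none", "-1"}:
            missing.append(int(match.group(1)))
    return missing


# Hypothetical sample output for a two-partition topic.
sample = (
    "Topic: orders\tPartitionCount: 2\tReplicationFactor: 1\n"
    "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1\tIsr: 1\n"
    "\tTopic: orders\tPartition: 1\tLeader: none\tReplicas: 2\tIsr: \n"
)
print(leaderless_partitions(sample))  # [1]
```

In practice you would feed it the captured output of the describe command; an empty result means every listed partition currently has a leader.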
2. Client Connecting to the Wrong Broker
- Diagnosis: Verify the bootstrap.servers configuration for your Kafka client. Ensure it points to at least one, and ideally several, active brokers in the cluster you intend to use. If you are using a load balancer, ensure it forwards to healthy Kafka brokers and not a stale list.

      # Example producer config snippet
      bootstrap.servers=broker1:9092,broker2:9092,broker3:9092

  Check broker logs for connection attempts from clients to see which broker the client is actually talking to.
- Fix: Correct the bootstrap.servers list in your client’s configuration to include valid, reachable broker addresses. If using a load balancer, reconfigure it to point to healthy brokers.
  - Why it works: Clients use bootstrap.servers only to discover the Kafka cluster’s metadata. If the list points at a different cluster (for example, staging instead of production) or at a broker whose metadata has not caught up yet, the client asks about a topic that broker does not know and gets this error.
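A quick sanity check of the bootstrap.servers value can catch malformed entries before the client starts. This is a rough sketch using only the Python standard library; parse_bootstrap_servers and is_reachable are illustrative helper names, not part of any Kafka client, and IPv6 literals are not handled.

```python
import socket


def parse_bootstrap_servers(value: str) -> list[tuple[str, int]]:
    """Split a bootstrap.servers string into (host, port) pairs."""
    servers = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue  # tolerate a trailing comma
        host, sep, port = entry.rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {entry!r}")
        servers.append((host, int(port)))
    return servers


def is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers DNS failures, refusals, and timeouts
        return False


print(parse_bootstrap_servers("broker1:9092, broker2:9092"))
# [('broker1', 9092), ('broker2', 9092)]
```

Calling is_reachable(host, port) for each parsed pair shows which entries accept connections. A successful TCP connection only proves that something is listening; it does not prove the address belongs to the intended cluster.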
3. auto.create.topics.enable Setting
- Diagnosis: Check the Kafka broker configuration file (server.properties) for the auto.create.topics.enable setting.

      auto.create.topics.enable=true

  If this is true, Kafka auto-creates topics on first use. If it’s false, topics must be explicitly created beforehand.
- Fix:
  - If auto.create.topics.enable=false and you want auto-creation: Set it to true in server.properties and restart the broker(s). Be aware that auto-created topics use the broker defaults (num.partitions, default.replication.factor), which might not be optimal.
  - If auto.create.topics.enable=true and you’re still getting the error: This implies a timing issue. The client tried to access the topic before the broker had a chance to auto-create it and propagate that metadata. Waiting a few seconds and retrying usually resolves it. As a last resort, lowering metadata.max.age.ms on the client makes it refresh metadata more often. More reliably, explicitly create the topic using kafka-topics.sh before clients access it.
  - Why it works: This setting controls whether Kafka automatically creates a topic when a producer or consumer requests one that doesn’t exist. If it is disabled, explicit creation is required. If it is enabled but the error persists, the cause is often a race condition.
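One way to handle the race condition described above is to poll until the topic becomes visible instead of sleeping for a fixed time. The sketch below is a generic backoff loop, not a Kafka API: list_topics is a hypothetical callable that you would supply (for example, a wrapper around your admin client's topic listing), and the timeout and delay values are arbitrary defaults.

```python
import time
from typing import Callable, Iterable


def wait_for_topic(
    topic: str,
    list_topics: Callable[[], Iterable[str]],  # hypothetical: returns visible topic names
    timeout_s: float = 30.0,
    initial_delay_s: float = 0.5,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_topics() with exponential backoff until the topic appears."""
    deadline = time.monotonic() + timeout_s
    delay = initial_delay_s
    while True:
        if topic in set(list_topics()):
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False  # gave up: topic never became visible
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay_s)  # back off, capped at max_delay_s


# Demo with a fake lister that "sees" the topic on the third poll.
responses = iter([[], [], ["orders"]])
print(wait_for_topic("orders", lambda: next(responses), sleep=lambda s: None))  # True
```

Injecting sleep keeps the loop testable without real delays. Start producers or consumers only after the function returns True.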
4. Incorrect Topic Name or Case Sensitivity
- Diagnosis: Double-check the exact spelling and casing of the topic name used by your client against the topic name that was actually created or intended. Kafka topic names are case-sensitive.

      # List all topics to verify exact names
      kafka-topics.sh --bootstrap-server <your_broker_address>:9092 --list

- Fix: Correct the topic name in your client’s configuration to match the exact name registered in Kafka.
  - Why it works: A simple typo or case mismatch means the client is asking for a topic that does not exist.
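Comparing the requested name against the output of the list command by eye is error-prone when there are many topics. Here is a small sketch that flags case mismatches and likely typos; diagnose_topic_name is an illustrative helper, and the similarity threshold is simply the default of Python's difflib.

```python
import difflib


def diagnose_topic_name(requested: str, existing: list[str]) -> str:
    """Explain how a requested topic name relates to the topics that exist."""
    if requested in existing:
        return "exact match"
    # Topic names are case-sensitive, so "Orders" and "orders" are different topics.
    case_matches = [name for name in existing if name.lower() == requested.lower()]
    if case_matches:
        return f"case mismatch: did you mean {case_matches[0]!r}?"
    close = difflib.get_close_matches(requested, existing, n=1)
    if close:
        return f"possible typo: did you mean {close[0]!r}?"
    return "no similar topic found"


topics = ["orders", "payments"]  # e.g. the lines printed by `kafka-topics.sh --list`
print(diagnose_topic_name("Orders", topics))  # case mismatch: did you mean 'orders'?
```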
5. ZooKeeper Connectivity or Health
- Diagnosis: In ZooKeeper-based clusters (not KRaft mode), Kafka brokers rely on ZooKeeper for cluster coordination, including storing and retrieving topic metadata. Check the health of your ZooKeeper ensemble and ensure brokers can connect to it. Look for ZooKeeper unavailable or similar errors in Kafka broker logs.

      # Check ZK status on one server
      echo 'stat' | nc localhost 2181 | grep Mode
      # Expected output: Mode: leader, Mode: follower, or Mode: standalone

- Fix: Resolve any ZooKeeper connectivity or health issues. Restart ZooKeeper ensemble members if necessary. Ensure Kafka brokers have the correct zookeeper.connect property pointing to a healthy ensemble.
  - Why it works: If brokers cannot communicate with ZooKeeper, they cannot update or retrieve critical metadata, including topic information, which leads to UnknownTopicOrPartitionException.
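The same check can be scripted across every ensemble member. The sketch below sends a four-letter command over a plain socket and extracts the Mode line. It defaults to srvr on the assumption that your ZooKeeper version allows it; four-letter commands may need to be whitelisted in the server configuration. Both function names and the sample response are illustrative.

```python
import re
import socket
from typing import Optional


def zk_mode(four_letter_output: str) -> Optional[str]:
    """Extract the value of the "Mode:" line from a stat/srvr response."""
    match = re.search(r"^Mode:\s*(\w+)\s*$", four_letter_output, re.MULTILINE)
    return match.group(1) if match else None


def query_zk(host: str, port: int = 2181, command: bytes = b"srvr",
             timeout: float = 2.0) -> str:
    """Send a four-letter command to a ZooKeeper server and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(command)
        chunks = []
        while chunk := conn.recv(4096):  # server closes the socket when done
            chunks.append(chunk)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Hypothetical response text, shortened for illustration.
sample = "Latency min/avg/max: 0/0/0\nMode: follower\nNode count: 42\n"
print(zk_mode(sample))  # follower
```

A call such as zk_mode(query_zk("localhost")) returning None suggests the server answered without a Mode line, which is worth investigating before looking at the brokers.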
6. Broker Not Yet Aware of Topic Metadata (New Cluster/Broker)
- Diagnosis: If you’ve just started a new Kafka cluster or added new brokers, it takes time for metadata to propagate. New brokers need to connect to ZooKeeper and fetch the existing cluster state, including all topics. Check broker logs for Received new cluster ID or Registering broker messages.
- Fix: Wait. The metadata propagation time depends on cluster size and network latency. For critical applications, explicitly create topics and wait for confirmation before starting clients.
  - Why it works: In a distributed system, it takes time for all nodes to learn about new information. New brokers need to "catch up" on the cluster’s state.
Once all of the above is fixed, the next errors you are likely to see on the first writes are a NotEnoughReplicasAfterAppendException, if the topic’s min.insync.replicas is higher than the number of in-sync replicas while the producer uses acks=all, or a LeaderNotAvailableException, if a partition leader election is still in progress.