The Kafka broker failed to fulfill a Fetch request because the requested data was unavailable or corrupted, and it reported the failure in the FetchResponse.
Common Causes and Fixes:
- Log Segment Deletion:
  - Diagnosis: Check broker logs for messages like `Log segment ... has been deleted` or `Log is not found`. You can also check the `log.dirs` configuration on your broker and navigate to the specific topic partition’s directory to see if the relevant segment files (e.g., `00000000000000000000.log`) are missing.
  - Fix: If the log segment was intentionally deleted (e.g., by retention or compaction policies), you may need to re-produce the lost data if it’s critical. If it was an accidental deletion, restore the segment from a backup or a healthy replica.
  - Why it works: Kafka stores messages in immutable log segments. If a segment is missing, the broker cannot retrieve the requested messages, leading to this error.
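The directory check described above can be sketched in Python. This is a minimal illustration, not a Kafka tool: it assumes the usual layout in which each `.log` file is named after the zero-padded base offset of its segment, and the helper names `segment_base_offsets` and `segment_for_offset` are hypothetical.

```python
import bisect
from pathlib import Path


def segment_base_offsets(partition_dir):
    """Return the sorted base offsets of the .log segments in a partition directory."""
    offsets = []
    for path in Path(partition_dir).glob("*.log"):
        # Segment files are named after their base offset,
        # e.g. 00000000000000000000.log
        if path.stem.isdigit():
            offsets.append(int(path.stem))
    return sorted(offsets)


def segment_for_offset(base_offsets, offset):
    """Return the base offset of the segment that would hold `offset`, or None.

    `base_offsets` must be sorted. None means the offset is older than the
    earliest segment still on disk, which is what you would expect to see
    after retention has removed the data.
    """
    idx = bisect.bisect_right(base_offsets, offset) - 1
    return base_offsets[idx] if idx >= 0 else None
```

For example, with segments starting at offsets 0, 1000, and 2000, `segment_for_offset([0, 1000, 2000], 1500)` returns `1000`. File names alone do not reveal where a segment ends, so this check can confirm that an offset is older than everything on disk but cannot prove that a given offset is actually present.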
- Corrupted Log Segment:
  - Diagnosis: Broker logs might show `Failed to read from offset ... in ...` or mention CRC errors. You can also dump the segment with `kafka-dump-log.sh --files <log_dir>/<topic_name>-<partition_id>/<segment>.log --print-data-log` and look for `CRC mismatch` or `Invalid message format`.
  - Fix: Kafka does not ship a command that truncates a segment to a chosen offset. If the corruption only affects a few messages at the end of the active segment, the broker’s own log recovery, which runs on startup after an unclean shutdown, typically truncates the invalid tail. For severe corruption, you’ll likely need to restore the partition from a healthy replica or re-produce the data.
  - Why it works: Log segments are indexed files. If the data within a segment is corrupted, Kafka cannot read it reliably, and attempting to do so results in an error.
- Leader Not Available (for the partition):
  - Diagnosis: Run `kafka-topics.sh --describe --bootstrap-server <broker_list>` and look for partitions where the leader is listed as `-1` (shown as `none` in some versions), or check whether the controller has recently logged `Partition ... leader is ... but is not available`. You might also see `LeaderNotAvailable` errors in consumer or producer logs.
  - Fix: This usually indicates a broker failure or network partition affecting the leader. Restart the failed broker. If the issue persists, move the partition’s replicas to healthy brokers with `kafka-reassign-partitions.sh`, or trigger a new election with `kafka-leader-election.sh`.
  - Why it works: By default, consumers fetch from, and producers write to, the leader replica of a partition. If the leader is down or unreachable, no fetches can be served.
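Scanning the describe output by eye gets tedious on large clusters. The following Python sketch filters it for leaderless partitions. It assumes you have captured the text output of `kafka-topics.sh --describe` yourself; the function name `leaderless_partitions` is hypothetical, and `SAMPLE` is illustrative text rather than output captured from a real cluster.

```python
import re

# Matches the per-partition lines of the describe output. Spacing differs
# between Kafka versions, so whitespace is matched loosely.
PARTITION_LINE = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+Leader:\s*(?P<leader>\S+)"
)


def leaderless_partitions(describe_output):
    """Return (topic, partition) pairs whose leader is reported as -1 or none."""
    missing = []
    for line in describe_output.splitlines():
        match = PARTITION_LINE.search(line)
        if match and match.group("leader").lower() in {"-1", "none"}:
            missing.append((match.group("topic"), int(match.group("partition"))))
    return missing


# Illustrative sample only (assumed format, not real cluster output).
SAMPLE = (
    "Topic: orders\tPartitionCount: 2\tReplicationFactor: 2\n"
    "\tTopic: orders\tPartition: 0\tLeader: 1\tReplicas: 1,2\tIsr: 1,2\n"
    "\tTopic: orders\tPartition: 1\tLeader: none\tReplicas: 2,3\tIsr: \n"
)

if __name__ == "__main__":
    print(leaderless_partitions(SAMPLE))
```

The summary line is skipped because it contains `PartitionCount:` rather than `Partition:`, so only per-partition lines are inspected.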
- Out of Disk Space on Leader Broker:
  - Diagnosis: Monitor disk usage on the broker hosting the partition leader. Broker logs might contain `No space left on device` errors. Use `df -h` on the broker’s host.
  - Fix: Free up disk space or add more storage to the broker. Consider adjusting `log.retention.bytes` or `log.retention.ms` to manage disk usage.
  - Why it works: Kafka needs disk space to write new log segments. If the disk is full, it cannot append new messages, and the resulting I/O errors can cause the broker to take the affected log directory offline, which makes the partitions stored there unavailable for fetches.
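If you prefer a scripted check over running `df -h` by hand, a rough Python equivalent might look like the sketch below. The function names and the 85% threshold are assumptions chosen for illustration; pass in the directories listed under `log.dirs` in your broker configuration.

```python
import shutil


def used_fraction(total_bytes, free_bytes):
    """Fraction of the volume that is not free, between 0.0 and 1.0."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return (total_bytes - free_bytes) / total_bytes


def volumes_over_threshold(log_dirs, threshold=0.85):
    """Return (path, used_fraction) for each log directory above `threshold`."""
    flagged = []
    for path in log_dirs:
        usage = shutil.disk_usage(path)  # named tuple: total, used, free
        fraction = used_fraction(usage.total, usage.free)
        if fraction > threshold:
            flagged.append((path, fraction))
    return flagged
```

Because the fraction is computed from free space, blocks reserved by the filesystem count as used, so the result can differ slightly from the percentage `df` prints.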
- Incorrect `fetch.min.bytes` Configuration:
  - Diagnosis: This is less likely to cause a direct `FetchResponse` error and more likely to show up as client-side lag. If `fetch.min.bytes` is set very high on the consumer and too little new data has arrived to meet that threshold, the consumer might appear to hang or report issues. Check the consumer configuration, including `fetch.max.wait.ms`.
  - Fix: Lower `fetch.min.bytes` on the consumer back toward the default of `1`, or to another reasonable value.
  - Why it works: `fetch.min.bytes` tells the broker to hold a fetch request until at least that many bytes are available, or until `fetch.max.wait.ms` elapses. If set too high, it can lead to perceived unresponsiveness when data volume is low.
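To make the interaction between `fetch.min.bytes` and `fetch.max.wait.ms` concrete, here is a toy Python model of the waiting rule. It is a simplification written for illustration, not Kafka code, and the function name and the arrival schedule format are hypothetical.

```python
def fetch_response_time_ms(arrivals, fetch_min_bytes, fetch_max_wait_ms):
    """Toy model of when a broker answers a fetch request sent at t=0.

    `arrivals` is a list of (time_ms, num_bytes) pairs, sorted by time,
    describing when new data lands on the partition. The reply is sent as
    soon as fetch_min_bytes are available, or when fetch_max_wait_ms
    elapses, whichever comes first.
    Returns (reply_time_ms, bytes_returned).
    """
    available = 0
    if available >= fetch_min_bytes:
        return 0, available  # a threshold of 0 means "reply immediately"
    for time_ms, num_bytes in arrivals:
        if time_ms > fetch_max_wait_ms:
            break  # the wait limit is reached before this data arrives
        available += num_bytes
        if available >= fetch_min_bytes:
            return time_ms, available
    return fetch_max_wait_ms, available
```

With 200 bytes arriving at 120 ms, the default threshold of 1 byte yields a reply at 120 ms, while a threshold of 1,000,000 bytes delays the reply until the 500 ms wait limit, which is the sluggishness a consumer experiences on a low-volume topic.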
- ZooKeeper Issues:
  - Diagnosis: Broker logs might show `ZooKeeper connection lost` or `Session expired`. Check the health of your ZooKeeper ensemble. You can also use `echo "stat" | nc <zookeeper_host> <zookeeper_port>` to check ZooKeeper status (on recent ZooKeeper versions, `stat` must be enabled through `4lw.commands.whitelist`).
  - Fix: Ensure your ZooKeeper ensemble is healthy and accessible from all Kafka brokers. Restart ZooKeeper nodes if necessary or investigate network connectivity.
  - Why it works: In ZooKeeper-based clusters, Kafka relies on ZooKeeper for cluster metadata, including partition leader elections and broker registration. If ZooKeeper is unavailable, Kafka cannot maintain cluster state, leading to various errors, including the inability to serve fetches.
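The `nc` check above can also be done from Python using only the standard library. The sketch below sends a four-letter command over a plain TCP socket and pulls the `Mode:` line out of the reply. The function names are hypothetical, and the host and port you pass in are your own.

```python
import socket


def zk_four_letter(host, port, command="stat", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_mode(stat_reply):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in stat_reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

A call such as `zk_mode(zk_four_letter("<zookeeper_host>", 2181))` returns the server's role when the command is accepted. A result of `None` means the reply contained no `Mode:` line, for example because the server refused the command.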
- Broker Undergoing Leader Election:
  - Diagnosis: Broker logs will show messages related to `[Partition state change listener]` and leader election attempts for specific partitions. Consumers might briefly see `LEADER_NOT_AVAILABLE` errors.
  - Fix: This is usually a transient state. If it persists, investigate the underlying cause of the leader election (e.g., broker failure, network issues).
  - Why it works: During a leader election, there’s a brief period where no partition leader is actively serving requests.
Once fetches succeed again, the next error you may encounter is `TopicAuthorizationException`, which points to misconfigured ACLs rather than to any of the causes above.