The Kafka broker is reporting its log directories as offline because it cannot access the underlying file system where its topic partitions are stored.
Common Causes and Fixes:
1. File System Full:
- Diagnosis: Check disk space on the Kafka broker’s server.

    df -h /path/to/your/kafka/logs

- Fix: Free up space by deleting old logs or unneeded files, lowering topic retention so Kafka removes old segments itself, or expanding the disk. For example, if /data/kafka-logs is 98% full:

    # Example: delete old logs older than 7 days.
    # Stop the broker first: removing segment files under a running broker
    # can leave the partition in an inconsistent state.
    find /data/kafka-logs/my_topic-0/ -type f -mtime +7 -delete
    # Or, if using a cloud provider, resize the volume.

- Why it works: Kafka cannot write new data or metadata to a full disk, so it marks the affected log directories as offline to prevent data loss. Freeing space, and restarting the broker if the directory stays offline, restores normal operation.
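The disk check above can be wrapped in a small helper that flags a log directory once usage crosses a threshold. This is a minimal sketch: the function names and the 90% default are illustrative assumptions, not anything Kafka prescribes.

```shell
# Print the usage percentage (digits only) of the file system holding $1.
# -P selects the POSIX output format so each file system stays on one line.
disk_use_pct() {
    df -P "$1" 2>/dev/null | awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Return 0 (true) if usage of $1 is at or above the threshold $2.
# The default of 90 is an assumption for illustration; tune it to your setup.
disk_nearly_full() {
    local pct
    pct=$(disk_use_pct "$1")
    [ -n "$pct" ] && [ "$pct" -ge "${2:-90}" ] 2>/dev/null
}

# Usage (hypothetical path):
#   disk_nearly_full /data/kafka-logs 90 && echo "log dir is nearly full"
```

Running a check like this from cron or a monitoring agent gives warning before the broker takes the directory offline.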
2. Incorrect File System Permissions:
- Diagnosis: Verify that the user running the Kafka process has read/write/execute permissions on the log directory.

    # Check ownership and permissions.
    ls -ld /path/to/your/kafka/logs

- Fix: Change ownership and permissions to grant the Kafka user access. If Kafka runs as user 'kafka' and group 'kafka':

    sudo chown -R kafka:kafka /path/to/your/kafka/logs
    sudo chmod -R ug+rwX,o-rwx /path/to/your/kafka/logs

- Why it works: The Kafka broker process needs to read and write to these directories. If the user ID (UID) or group ID (GID) running Kafka doesn’t have the necessary permissions, the file system will appear inaccessible.
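Permission bits do not always tell the whole story, so it can help to test access directly as the broker's account. The sketch below uses hypothetical helper names; it checks the bits first and then tries to create a probe file.

```shell
# Return 0 if the current user can read, write, and traverse directory $1.
# Run this as the broker's service account; 'kafka' as the account name
# is an assumption carried over from the example above.
dir_is_usable() {
    [ -d "$1" ] && [ -r "$1" ] && [ -w "$1" ] && [ -x "$1" ]
}

# Stronger check: create and remove a probe file. This also catches cases
# the permission bits alone miss, such as read-only mounts or restrictive ACLs.
dir_accepts_writes() {
    local probe
    probe=$(mktemp "$1/.write-probe.XXXXXX" 2>/dev/null) || return 1
    rm -f "$probe"
}

# Usage (hypothetical path), after switching to the broker's account:
#   dir_accepts_writes /path/to/your/kafka/logs || echo "broker cannot write here"
```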
3. File System Mounted Read-Only:
- Diagnosis: Check the mount options for the file system where Kafka logs reside.

    mount | grep /path/to/your/kafka/logs
    # Look for '(ro, ...)' in the output.

- Fix: Remount the file system as read-write. This often requires root privileges, and a file system that has switched to read-only on its own usually points to a deeper issue with the storage or underlying system.

    # Use the mount point here if the log directory is a subdirectory of it.
    sudo mount -o remount,rw /path/to/your/kafka/logs
    # To make it permanent, edit /etc/fstab.
    # Find the line for your log directory and change 'ro' to 'rw'.

- Why it works: If the file system is mounted read-only, Kafka cannot create new segment files, append to existing ones, or perform necessary metadata updates, leading to the offline error.
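Grepping the mount table for the log path finds nothing when the log directory is a subdirectory of the mount point. A sketch that resolves the path to its mount first, assuming findmnt from util-linux is installed (the helper names are hypothetical):

```shell
# Return 0 if the comma-separated mount option string $1 contains 'ro'.
# Wrapping the string in commas avoids matching values like errors=remount-ro.
has_ro_option() {
    case ",$1," in
        *,ro,*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Print the mount options of the file system containing path $1.
# -T resolves a path to its mount point; -n drops the header row.
mount_options_for() {
    findmnt -n -o OPTIONS -T "$1"
}

# Usage (hypothetical path):
#   has_ro_option "$(mount_options_for /data/kafka-logs)" && echo "read-only"
```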
4. Inode Exhaustion:
- Diagnosis: Check if the file system has run out of inodes, which are data structures used to store information about files and directories.

    df -i /path/to/your/kafka/logs
    # If 'IUse%' is 100%, you've run out of inodes.

- Fix: Delete a large number of small, unnecessary files or reformat the partition with more inodes (a more drastic measure).

    # Example: find and delete empty files if they are numerous
    find /path/to/your/kafka/logs -type f -empty -delete

- Why it works: Each file and directory requires an inode. Even if there’s disk space, running out of inodes prevents the creation of new files (like Kafka segment files) or directories.
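Before deleting anything, it helps to see which directories actually hold the files. The following is a rough sketch with hypothetical helper names, assuming GNU df (for the -i flag) and one subdirectory per topic partition under the log directory.

```shell
# Print the inode usage percentage of the file system holding $1, or nothing
# if the file system does not report inode counts (IUse% shown as '-').
inode_use_pct() {
    df -Pi "$1" 2>/dev/null |
        awk 'NR == 2 && $5 ~ /^[0-9]+%$/ { sub(/%/, "", $5); print $5 }'
}

# Count regular files under each immediate subdirectory of $1, largest first,
# to show which partition directories consume the most inodes.
files_per_subdir() {
    local d
    for d in "$1"/*/; do
        [ -d "$d" ] || continue
        printf '%s %s\n' "$(find "$d" -type f | wc -l | tr -d ' ')" "$d"
    done | sort -rn
}

# Usage (hypothetical path):
#   files_per_subdir /path/to/your/kafka/logs | head
```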
5. Network File System (NFS) Issues:
- Diagnosis: If Kafka logs are on an NFS mount, check the NFS server’s health and network connectivity, and confirm that the mount options are appropriate (e.g., rw,hard; the older intr option is ignored by Linux kernels after 2.6.25).

    # On the Kafka broker:
    ping <nfs_server_ip>
    showmount -e <nfs_server_ip>
    mount | grep /path/to/your/kafka/logs

- Fix: Resolve NFS server issues or network problems, or adjust mount options. For example, if soft mounts are causing timeouts, switch to hard. NFS-specific options generally cannot be changed with a remount, so stop the broker, unmount, and mount again.

    # Example: switch to a hard mount (<export_path> is a placeholder)
    sudo umount /path/to/your/kafka/logs
    sudo mount -t nfs -o rw,hard <nfs_server_ip>:<export_path> /path/to/your/kafka/logs

- Why it works: NFS mounts are susceptible to network interruptions, server unresponsiveness, or incorrect export/mount configurations. These can make the directory appear inaccessible to the Kafka broker.
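A hung NFS mount can block any command that touches it, including the diagnostic ones. One way to probe the path safely is to put a time limit on the access. This sketch assumes GNU coreutils (timeout, stat); the helper name and the 5-second default are illustrative.

```shell
# Return 0 if $1 can be stat'ed within $2 seconds (default 5).
# SIGKILL is used because a process blocked on a hard NFS mount
# may not react to gentler signals.
path_responds() {
    timeout -s KILL "${2:-5}" stat "$1" >/dev/null 2>&1
}

# Usage (hypothetical path):
#   path_responds /path/to/your/kafka/logs 5 || echo "mount is hung or missing"
```

A non-zero result means either that the path does not exist or that the call did not return in time, so it is worth following up with the server-side checks above.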
6. Corrupted File System or Disk Errors:
- Diagnosis: Check system logs (/var/log/syslog, /var/log/messages) for disk-related errors or file system corruption messages (e.g., EXT4-fs error, XFS errors). Run a file system check tool.

    # Use -n for a non-destructive check first.
    # Results on a mounted file system can be unreliable.
    sudo fsck -n /dev/sdXn
    # If errors are found, stop the broker, unmount, and run fsck -y
    sudo umount /path/to/your/kafka/logs
    sudo fsck -y /dev/sdXn

- Fix: Repair the file system using fsck or similar tools (XFS file systems are repaired with xfs_repair rather than fsck). If the disk is failing, it needs to be replaced.
- Why it works: Physical disk errors or logical file system corruption can make directories and files unreadable or unwritable, directly impacting Kafka’s ability to operate.
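Scanning the kernel log by eye is tedious, so a filter can narrow it down. The pattern list in this sketch is an assumption based on commonly seen kernel messages and is not exhaustive; the function name is hypothetical.

```shell
# Read log lines on stdin and print those that look like storage faults.
# -E enables alternation, -i ignores case.
filter_storage_errors() {
    grep -E -i 'EXT4-fs error|XFS \(.*\): .*(error|corrupt)|I/O error|Remounting filesystem read-only'
}

# Usage:
#   dmesg -T | filter_storage_errors
#   journalctl -k | filter_storage_errors
```

An empty result does not prove the disk is healthy. It only means none of the listed patterns appeared in the log that was read.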
7. Kafka Broker Configuration Incorrect:
- Diagnosis: Verify the log.dirs setting in your server.properties file points to the correct, accessible directories.

    grep log.dirs /path/to/your/kafka/config/server.properties
    # Ensure the paths listed are valid and match the actual directories.

- Fix: Correct the log.dirs path in server.properties and restart the Kafka broker.

    log.dirs=/data/kafka-logs-1,/data/kafka-logs-2

- Why it works: If the log.dirs configuration is mistyped, points to a non-existent directory, or points to a directory that is not mounted, Kafka will report those paths as offline.
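The check against log.dirs can be automated by reading the setting and testing each listed path. This is a minimal sketch with hypothetical function names. It assumes a plain key=value line and, if the key appears more than once, takes the last occurrence.

```shell
# Print each directory listed in log.dirs of properties file $1, one per line,
# with surrounding whitespace trimmed.
list_log_dirs() {
    sed -n 's/^[[:space:]]*log\.dirs[[:space:]]*=[[:space:]]*//p' "$1" |
        tail -n 1 | tr ',' '\n' |
        sed 's/^[[:space:]]*//; s/[[:space:]]*$//; /^$/d'
}

# Report directories that are missing or not writable by the current user.
# Returns non-zero if any problem was found.
check_log_dirs() {
    local dir status=0
    while IFS= read -r dir; do
        [ -n "$dir" ] || continue
        if [ ! -d "$dir" ]; then
            echo "MISSING: $dir"; status=1
        elif [ ! -w "$dir" ]; then
            echo "NOT WRITABLE: $dir"; status=1
        fi
    done <<EOF
$(list_log_dirs "$1")
EOF
    return "$status"
}

# Usage, run as the broker's account:
#   check_log_dirs /path/to/your/kafka/config/server.properties
```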
After resolving these issues, you might still see "Controller maybe not available" errors if the controller was unable to communicate with brokers because of the underlying storage problems.