Fluent Bit’s tail input plugin can stall, skip records, or re-send old ones if it’s not configured to handle log file rotation and truncation gracefully.

Common Causes and Fixes

  1. Fluent Bit losing track of file position after rotation:

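    • Diagnosis: Check Fluent Bit’s own log output (raise Log_Level to debug in the [SERVICE] section if needed). If old records are re-sent, or records written around a rotation or a restart never arrive, Fluent Bit has likely lost its position.
    • Cause: When a log file is rotated (e.g., app.log becomes app.log.1), the renamed file keeps its inode and a new app.log appears with a different one. The tail plugin tracks each file by inode and byte offset, and by default it holds those offsets only in memory, so they do not survive a restart.
    • Fix: Persist offsets with the DB option and set Rotate_Wait, which controls how many seconds Fluent Bit keeps reading a file after it has been rotated. The DB path below is only an example location.
      [INPUT]
          Name              tail
          Path              /var/log/app.log
          Tag               app.log
          # Example path; any location writable by Fluent Bit works
          DB                /var/lib/fluent-bit/app-tail.db
          Rotate_Wait       360
      
    • Why it works: DB stores the inode and offset of every watched file, so Fluent Bit resumes from the saved position after a restart. Rotate_Wait 360 keeps the rotated file open for up to 360 seconds (the default is 5), so lines the application writes to it after the rename are still collected before the file is dropped.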
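
The inode-based tracking described above can be illustrated with a minimal Python sketch. It is not Fluent Bit’s code; the helper name is_rotated is hypothetical, and the sketch only assumes a POSIX-style file system where a rename keeps the inode and a newly created file receives a different one.

```python
import os
import tempfile


def is_rotated(path: str, known_inode: int) -> bool:
    """Return True if `path` no longer refers to the inode we were reading."""
    try:
        return os.stat(path).st_ino != known_inode
    except FileNotFoundError:
        # The name is gone: renamed away and not yet recreated.
        return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        log = os.path.join(tmp, "app.log")
        with open(log, "w") as fh:
            fh.write("first line\n")
        inode = os.stat(log).st_ino
        print("before rotation:", is_rotated(log, inode))

        os.rename(log, log + ".1")      # rotate: app.log -> app.log.1
        with open(log, "w") as fh:      # the application recreates app.log
            fh.write("second line\n")
        print("after rotation:", is_rotated(log, inode))
        # The old inode is still reachable under the rotated name.
        print("app.log.1 keeps the inode:", os.stat(log + ".1").st_ino == inode)
```

A tailer that remembers the inode can therefore keep draining app.log.1 for a while and start on the new app.log separately, which is the window that Rotate_Wait controls.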
  2. Fluent Bit reprocessing logs after truncation:

    • Diagnosis: You see duplicate log entries in your output, or the log volume processed by Fluent Bit suddenly jumps significantly without a corresponding increase in actual log generation.
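    • Cause: If a log file is truncated in place (its size drops, often back to zero, as with logrotate’s copytruncate), the saved read offset points past the end of the file and Fluent Bit starts again from the beginning. If the copied file also matches Path, its contents are read a second time. If the file grows back past the old offset before the truncation is noticed, the lines below that offset can be skipped instead.
    • Fix: Prefer rotation that renames the file and creates a new one over in-place truncation. If truncation cannot be avoided, keep Path narrow enough that the rotated copies are not tailed, and persist offsets with DB (the path below is only an example location).
      [INPUT]
          Name              tail
          Path              /var/log/app.log
          Tag               app.log
          # Example path; any location writable by Fluent Bit works
          DB                /var/lib/fluent-bit/app-tail.db
      
    • Why it works: With rename-based rotation the file is never shortened underneath Fluent Bit, so the saved offset stays valid until the file is dropped. An exact Path means copies such as app.log.1 are never picked up, which removes the main source of duplicates, and DB keeps a restart from adding skips or re-reads of its own.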
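
The size-versus-offset check behind truncation handling can be sketched in a few lines of Python. This is an illustration of the general technique, not Fluent Bit’s implementation; the function name resume_offset is hypothetical.

```python
def resume_offset(saved_offset: int, current_size: int) -> int:
    """Pick the byte offset to continue reading from.

    If the file is now shorter than the saved offset, it must have been
    truncated, so reading restarts at 0. Otherwise the saved offset is kept.
    """
    if current_size < saved_offset:
        return 0
    return saved_offset


if __name__ == "__main__":
    # Truncated to empty: restart from the beginning.
    print(resume_offset(saved_offset=500, current_size=0))
    # Truncated, then partially refilled: still detected.
    print(resume_offset(saved_offset=500, current_size=120))
    # Truncated, then refilled past the old offset: looks like normal growth.
    print(resume_offset(saved_offset=500, current_size=800))
```

The last case shows why in-place truncation is fragile under high volume: a check based only on size cannot tell a refilled file from one that simply grew.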
  3. Fluent Bit reading the wrong file after a rapid rotation/truncation cycle:

    • Diagnosis: Inconsistent log delivery, missing logs, or logs appearing out of order, especially under high log volume.
    • Cause: When rotations happen in quick succession, a new file can be created, filled, and rotated away between two scans of Path, so Fluent Bit opens it late or never sees it under its original name.
    • Fix: Lower Refresh_Interval, which sets how often (in seconds) Fluent Bit rescans Path for files to watch. Parser is unrelated to file tracking; keep it only if the lines really are in the format the parser expects.
      [INPUT]
          Name              tail
          Path              /var/log/app.log
          Tag               app.log
          Parser            docker
          Refresh_Interval  5
      
    • Why it works: Refresh_Interval 5 makes Fluent Bit refresh its list of watched files every 5 seconds instead of the default 60, so a freshly created app.log is picked up soon after rotation. Parser docker only decodes each line as Docker-style JSON and has no effect on which file is read.
  4. Fluent Bit not following log file renames (e.g., app.log -> app.log.old):

    • Diagnosis: Fluent Bit stops processing logs for a specific file, and no new logs appear in the output for that source.
    • Cause: After the rename, Fluent Bit keeps reading the renamed file (same inode) for Rotate_Wait seconds and then stops watching it. If the application is never told to reopen its log and keeps writing to app.log.old, or the new app.log is created only later, nothing new reaches the output.
    • Fix: Make sure your log rotation setup signals the application to reopen its log file so that a new app.log is created promptly. On the Fluent Bit side, combine Rotate_Wait with Refresh_Interval and Read_from_Head.
      [INPUT]
          Name              tail
          Path              /var/log/app.log
          Tag               app.log
          Rotate_Wait       360
          Refresh_Interval  10
          Read_from_Head    On
      
    • Why it works: Rotate_Wait 360 gives the application time to finish writing to the renamed file. Refresh_Interval 10 rescans Path every 10 seconds so the recreated app.log is noticed, and Read_from_Head On makes Fluent Bit read a file that has no stored position from the beginning instead of skipping to its end. Note that this also applies at first startup, when existing content is ingested in full.
  5. Fluent Bit consuming too much CPU or memory during rotation checks:

    • Diagnosis: High CPU/memory usage on the Fluent Bit agent, correlating with frequent log file rotations.
    • Cause: Frequent polling and stat calls on many files, especially in directories with thousands of log files, can become a performance bottleneck.
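    • Fix: Raise Refresh_Interval so Path is rescanned less often, and bound memory use with Buffer_Chunk_Size, Buffer_Max_Size, and Mem_Buf_Limit.
      [INPUT]
          Name              tail
          Path              /var/log/app.log
          Tag               app.log
          Refresh_Interval  120
          Buffer_Chunk_Size 32k
          Buffer_Max_Size   64k
          Mem_Buf_Limit     10MB
      
    • Why it works: Increasing Refresh_Interval to 120 seconds (the default is 60) reduces how often the file system is scanned. Buffer_Chunk_Size and Buffer_Max_Size cap the per-file read buffer, and Mem_Buf_Limit pauses the input once that much data is waiting to be flushed, so memory stays bounded during bursts.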
  6. Fluent Bit failing to pick up new log files if the Path pattern is too broad:

    • Diagnosis: Fluent Bit processes logs from unintended files or fails to process logs from newly created files.
    • Cause: Using a very general Path like /var/log/* without proper rotation handling can lead to Fluent Bit trying to track too many files, including temporary ones or old rotated logs it shouldn’t be actively tailing.
    • Fix: Be specific with your Path (it accepts wildcards directly), use Exclude_Path to skip rotated or compressed copies when a wildcard is unavoidable, and keep Rotate_Wait set.
      [INPUT]
          Name              tail
          Path              /var/log/myapp/app.log
          Tag               myapp.log
          Rotate_Wait       360
      
    • Why it works: A specific path (/var/log/myapp/app.log) directs Fluent Bit to only monitor that exact file. After a rotation, Rotate_Wait lets it finish reading app.log.1, and the next rescan of Path picks up the new app.log.
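
Before deploying a wildcard Path, it can help to preview which files it would select. The sketch below uses Python’s glob and fnmatch modules as a rough stand-in; their matching rules may differ in detail from Fluent Bit’s, and the helper name preview_matches and the file names in the demo are hypothetical.

```python
import fnmatch
import glob
import os
import tempfile


def preview_matches(path_pattern: str, exclude_patterns: list) -> list:
    """List files matching `path_pattern`, minus any that match an exclude pattern."""
    matched = glob.glob(path_pattern)
    kept = [
        p for p in matched
        if not any(fnmatch.fnmatch(p, ex) for ex in exclude_patterns)
    ]
    return sorted(kept)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical directory contents after a couple of rotations.
        for name in ("app.log", "app.log.1", "app.log.2.gz", "other.txt"):
            open(os.path.join(tmp, name), "w").close()

        pattern = os.path.join(tmp, "app.log*")    # plays the role of Path
        excludes = [                               # plays the role of Exclude_Path
            os.path.join(tmp, "*.gz"),
            os.path.join(tmp, "*.1"),
        ]
        for path in preview_matches(pattern, excludes):
            print(os.path.basename(path))
```

If the preview lists rotated or compressed copies, the pattern is broader than intended and either Path or the exclude list needs tightening.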

After fixing these, you might encounter issues with output buffering if your downstream system is slow.
