The Fluentd tail input plugin (in_tail) keeps following a log file across rotation, so you don’t lose lines when logrotate or a manual mv swaps the file out from under it.

Let’s see it in action. Imagine you have a web server spitting out access logs into /var/log/nginx/access.log. We want Fluentd to pick these up as they are written and send them somewhere (for this demo, we’ll just tag them and let them be).

Here’s a minimal Fluentd configuration:

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/pos/nginx-access.log.pos
  tag nginx.access
  <parse>
    @type nginx
  </parse>
</source>

<match nginx.access>
  @type stdout
</match>

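When Fluentd starts, it reads the pos_file. If no position is recorded for access.log, it starts at the end of the file by default and only picks up lines written from then on; set read_from_head true if you want the existing content read from the beginning. If a position is recorded, it resumes from the byte offset stored there. This pos_file is crucial; it’s Fluentd’s memory of where it left off.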
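
To make that "memory" concrete: in the Fluentd versions I’m aware of, each pos_file entry is a single tab-separated line holding the path, the byte offset in hexadecimal, and the inode in hexadecimal. The bash sketch below decodes such a line. The function name decode_pos_entry and the sample entry are made up for this demo, and the on-disk layout is an internal detail that may differ in your version, so treat it as an illustration rather than a supported interface.

```shell
#!/usr/bin/env bash
# Decode one pos_file entry of the assumed form:
#   <path> TAB <offset in hex> TAB <inode in hex>
decode_pos_entry() {
  local path offset_hex inode_hex
  # Split the entry on tabs into its three fields.
  IFS=$'\t' read -r path offset_hex inode_hex <<< "$1"
  # 16#... tells bash arithmetic to read the digits as base 16.
  printf '%s offset=%d inode=%d\n' \
    "$path" "$((16#$offset_hex))" "$((16#$inode_hex))"
}

# Hypothetical entry; a real one lives in the file named by pos_file.
decode_pos_entry $'/var/log/nginx/access.log\t000000000000005c\t00000000000a1b2c'
```

Running it on the sample entry prints the path followed by offset=92 and inode=662316, which is the byte position and file identity Fluentd would resume from.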

Now, let’s simulate log rotation. We’ll manually move the current access.log and create a new, empty one.

  1. Initial Log:

    echo '192.168.1.1 - - [10/Oct/2023:10:00:00 +0000] "GET / HTTP/1.1" 200 1234 "-" "curl/7.68.0"' >> /var/log/nginx/access.log
    

    Fluentd picks this up. The pos_file now points to the end of this line.

  2. Simulate Rotation:

    mv /var/log/nginx/access.log /var/log/nginx/access.log.1
    touch /var/log/nginx/access.log
    

    The old log is now access.log.1, and a new, empty access.log exists.

  3. New Log Entry:

    echo '192.168.1.2 - - [10/Oct/2023:10:01:00 +0000] "GET /about HTTP/1.1" 200 567 "-" "curl/7.68.0"' >> /var/log/nginx/access.log
    

When Fluentd next checks the file specified by path (/var/log/nginx/access.log), it sees that the inode has changed: the name now refers to a different file from the one it was reading. Fluentd treats this as a rotation. It keeps reading the old file for a short grace period (rotate_wait, 5 seconds by default) so that lines written just before the move are not lost, then starts reading the new file from the beginning and records the new inode and offset in the pos_file. This is why it can seamlessly pick up logs from the newly created file.
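
You can watch the inode behavior yourself without touching the real nginx logs. The sketch below repeats the mv and touch steps in a temporary directory and prints the inode numbers. The helper inode_of and the file names are just for this demo; it only shows what the filesystem does, not Fluentd’s internal logic.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the inode number of a file (ls -i is portable across Linux and macOS).
inode_of() { ls -i "$1" | awk '{print $1}'; }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
log="$workdir/access.log"

echo 'first line' >> "$log"
before=$(inode_of "$log")

# Rotate by rename: the old file keeps its inode under the new name...
mv "$log" "$log.1"
# ...and the freshly created file gets a different one.
touch "$log"

after=$(inode_of "$log")
rotated=$(inode_of "$log.1")

echo "before rotation:  access.log   inode $before"
echo "after rotation:   access.log.1 inode $rotated"
echo "after rotation:   access.log   inode $after"
```

The first two numbers match and the third differs, which is the signal a tailer can use to tell "same name, new file" apart from "same file, more data".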

If you use logrotate with copytruncate instead of replacing the file, Fluentd still works. copytruncate first copies the log, then truncates the original in place, so the inode stays the same. Fluentd, still reading the original, sees that the file is now smaller than its stored offset and concludes it has been truncated. It then resets its position to the beginning of the now-empty file, and picks up new lines as they are written. The copytruncate method is generally less reliable for tailing than moving/renaming: anything written between the copy and the truncate is lost, and so is anything Fluentd had not yet read when the truncate happened.
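
The copytruncate case can be sketched the same way. The script below imitates copytruncate with cp followed by an in-place truncate, then applies the kind of check a tailer can use: same inode, but the file is now shorter than the remembered offset. The variable names and the verdict strings are illustrative only; this is not Fluentd’s actual code.

```shell
#!/usr/bin/env bash
set -euo pipefail

inode_of() { ls -i "$1" | awk '{print $1}'; }
# $(( ... )) strips the padding some wc implementations add.
size_of() { echo $(( $(wc -c < "$1") )); }

workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
log="$workdir/access.log"

printf 'line one\nline two\n' >> "$log"
inode_before=$(inode_of "$log")
offset=$(size_of "$log")   # pretend the reader has consumed everything so far

# What copytruncate does: copy the contents, then empty the original in place.
cp "$log" "$log.1"
: > "$log"

inode_after=$(inode_of "$log")
size=$(size_of "$log")

if [ "$inode_before" = "$inode_after" ] && [ "$size" -lt "$offset" ]; then
  verdict="truncated: restart from offset 0"
else
  verdict="unchanged: continue from offset $offset"
fi
echo "inode $inode_before -> $inode_after, size $size, stored offset $offset"
echo "$verdict"
```

Note that the inode never changes here, so the only clue is the size dropping below the stored offset. Any bytes appended between the cp and the truncate would exist in neither file, which is the loss window described above.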

The key is the pos_file. For each watched path it stores the byte offset together with the inode of the file being read. When the inode changes (due to rotation via mv or creation of a new file), Fluentd detects this. The pos_file parameter takes the full path of the position file, and the Fluentd process must be able to write to that location. The related rotate_wait parameter controls how long Fluentd keeps reading the old file after a rotation before letting it go.

The refresh_interval parameter dictates how often Fluentd refreshes the list of files matching path, which matters when path contains a wildcard such as /var/log/nginx/*.log. By default, it’s 60 seconds. If new files appear frequently and you want to minimize the lag before they are picked up, you might lower this. For instance, refresh_interval 5s will make Fluentd rescan every 5 seconds.

Without the pos_file, Fluentd has no record of its position across restarts: after a restart it would start from the end of the file again and miss whatever was written while it was down.

The next thing you’ll likely encounter is needing to handle different log formats or parse specific fields for richer data.
