The concat plugin in Fluentd can sometimes produce logs that are missing lines, despite appearing to be correctly configured.

This usually comes down to how the plugin buffers lines. concat is a filter: it holds incoming lines in memory until it recognizes the start of the next entry (or the end of the current one), or until a flush timer expires. If the boundary pattern is wrong, if buffered lines time out, or if another filter changes the field concat reads before concat runs, lines end up in the wrong event or never reach your output.

Here’s a breakdown of common causes and how to fix them:

1. Incorrect multiline Configuration:

The most frequent culprit is an imprecise multiline regular expression. The concat plugin works by identifying the start of a new log entry. If your regex isn’t precisely capturing the beginning of each log, concat might incorrectly assume a line is part of the previous entry or a completely new one.

  • Diagnosis: Check that your start pattern matches only the first line of an entry. Take a Java stack trace, where continuation lines start with whitespace: a start pattern like /^\s+/ is wrong because it matches the continuation lines, not the first line of an entry. A better pattern anchors on something that appears only at the start of an entry, such as the timestamp: /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/.
  • Fix: Make the start pattern specific to the first line of a log message. For example, if entries begin with a timestamp like YYYY-MM-DD HH:MM:SS,mmm, set multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/. The plugin’s README documents multiline_start_regexp, multiline_end_regexp, and continuous_line_regexp for defining boundaries; it does not list a multiline_mode or format option, so look for those if your configuration was copied from elsewhere. With only a start pattern set, a line that does not match is appended to the entry currently being buffered.
  • Why it works: This precise regex ensures that Fluentd knows exactly when a new log entry begins, allowing concat to reliably group subsequent lines until the next definitive start is encountered.
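
A minimal sketch of such a filter, assuming the multiline_start_regexp parameter from the plugin’s README. The tag pattern app.** and the field name message are hypothetical and need to match your own pipeline.

```
# Hypothetical tag pattern; adjust to your own tags.
<filter app.**>
  @type concat
  # Field that holds the raw log line (assumed to be "message" here).
  key message
  # A new entry starts only on lines that begin with a timestamp.
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}/
</filter>
```

Stack trace lines such as "    at com.example.Foo.bar(Foo.java:42)" do not match the pattern, so they are appended to the entry that precedes them.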

2. flush_interval Too Small:

The concat plugin has its own flush_interval: the number of seconds it waits after the last received line before giving up on the entry it is buffering (the README gives a default of 60). If this is set too low, an entry whose lines arrive slowly is flushed before its remaining lines show up. This setting is separate from the flush_interval of an output <buffer> section, which runs after all filters and does not affect how concat assembles entries.

  • Diagnosis: Check flush_interval inside your concat filter. If it is only a few seconds (e.g., 1s or 5s), it might be the issue.
  • Fix: Increase flush_interval to a value that leaves enough time for all lines of an entry to arrive. For example, set flush_interval 10s or flush_interval 30s.
  • Why it works: A larger flush_interval gives concat more time to collect the remaining lines of an entry before it stops waiting for them.
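
A minimal sketch of raising the wait time on the filter itself, assuming flush_interval is set on the concat filter rather than on an output buffer. The tag pattern and field name are hypothetical, and the start pattern is only a placeholder.

```
<filter app.**>
  @type concat
  key message
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} /
  # Wait up to 30 seconds after the last line before flushing the entry.
  flush_interval 30s
</filter>
```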

3. Intervening Plugins Disrupting Chunking:

Filters run in the order they appear in your configuration. A filter placed before concat that parses, renames, or rewrites the field concat reads can leave concat with nothing to match against. A filter placed after concat that expects one line per event can split or mangle the reassembled entry.

  • Diagnosis: Review the filter sections around your concat configuration, both before and after it. Look for plugins that modify the record or the message field, or that perform any form of splitting or merging.
  • Fix: Reorder your plugins. Ensure that the concat plugin is placed as early as possible in your pipeline, ideally right after the input plugin, and before any other filters that might alter the log structure or chunking. If a filter is essential, ensure it doesn’t interfere with the record field that concat is assembling.
  • Why it works: By processing concat early, you ensure the multiline log is reassembled before subsequent plugins have a chance to break it apart or alter the structure it relies on.
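
The ordering can be sketched as follows. This is an illustration under assumptions, not a drop-in configuration: the paths, the tag app.logs, and the choice of record_transformer as the "other" filter are all hypothetical.

```
<source>
  @type tail
  # Hypothetical paths.
  path /var/log/app/app.log
  pos_file /var/log/fluentd/app.log.pos
  tag app.logs
  <parse>
    # Leave each line unparsed; it is stored in the "message" field.
    @type none
  </parse>
</source>

# 1. Reassemble multiline entries first.
<filter app.logs>
  @type concat
  key message
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} /
</filter>

# 2. Only then apply filters that add or change fields.
<filter app.logs>
  @type record_transformer
  <record>
    hostname "#{Socket.gethostname}"
  </record>
</filter>

<match app.logs>
  @type stdout
</match>
```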

4. n_lines Set Incorrectly:

The line-count option documented in the concat plugin’s README is n_lines; the README does not list a max_lines option. n_lines is not an upper limit: it tells concat to join a fixed number of lines into each event, and it cannot be combined with multiline_start_regexp. If your entries are longer than n_lines, or vary in length, lines spill into the next event and entries look truncated.

  • Diagnosis: Check whether your concat filter sets n_lines, and whether every entry really has exactly that many lines.
  • Fix: If every entry has the same length, set n_lines to that length, e.g. n_lines 50 for 50-line entries. If the length varies, drop n_lines and use multiline_start_regexp (optionally with multiline_end_regexp) instead.
  • Why it works: A start pattern lets concat follow the actual boundaries of each entry rather than cutting the stream at a fixed count.
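
A minimal sketch of a fixed-count setup, assuming the count option is spelled n_lines as in the fluent-plugin-concat README and that every entry is exactly three lines long. The tag pattern and field name are hypothetical.

```
<filter app.**>
  @type concat
  key message
  # Join every 3 consecutive lines into one event.
  # Do not set multiline_start_regexp in the same filter.
  n_lines 3
</filter>
```

If entry length varies, a start pattern is usually the safer choice, since a fixed count drifts out of step as soon as one entry is longer or shorter than expected.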

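5. Timed-Out Lines Not Routed (timeout_label):

The concat plugin has no timeout parameter; the wait time is flush_interval, and timeout_label controls what happens when it expires. When the timer fires, concat does not pass the buffered lines along as a normal event. It flushes them as a timeout, and unless timeout_label is set they are handled as an error event, which in many setups means they are logged as a warning and never reach your output. With slow application logging, where pauses occur naturally between lines, this looks exactly like missing lines.

  • Diagnosis: Look for timeout warnings from the concat filter in Fluentd’s own log, and observe the typical time between lines of a multiline entry from your application. If that interval can exceed flush_interval, it’s a likely cause.
  • Fix: Raise flush_interval above the longest pause you observe (for pauses of up to 15 seconds, set flush_interval 20s), and set timeout_label so that timed-out lines are routed to a label that handles them like normal logs.
  • Why it works: A longer flush_interval covers natural delays in log generation, and timeout_label ensures that whatever does time out is still delivered rather than lost.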
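
One way to route timed-out lines the same way as normal ones, modeled on the pattern shown in the plugin’s README and assuming its flush_interval and timeout_label parameters. The label name @NORMAL, the tag pattern, and the stdout output are placeholders.

```
<filter app.**>
  @type concat
  key message
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} /
  flush_interval 20s
  # Send timed-out entries to the same label as normal events.
  timeout_label @NORMAL
</filter>

# Normal events are relabeled into @NORMAL as well.
<match app.**>
  @type relabel
  @label @NORMAL
</match>

<label @NORMAL>
  <match app.**>
    @type stdout
  </match>
</label>
```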

6. key Field Mismatch:

The concat plugin uses a key field (often message) to identify the content to be reassembled. If this key field is not consistently present or is being renamed by other plugins, concat won’t be able to find the data it needs.

  • Diagnosis: Inspect your logs before and after the concat plugin. Verify that the field specified in the key parameter of concat (e.g., key message) is present and contains the multiline content. Check for any filters that might rename or remove this field.
  • Fix: Ensure the key parameter in your concat configuration matches the actual field containing the log message. If another plugin renames it, adjust either the concat key or the renaming plugin to maintain consistency. For instance, if a record_transformer renames message to log_content, your concat should use key log_content.
  • Why it works: concat operates on a specific field. Consistency in this field’s name and content is paramount for its correct functioning.
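
To see which field actually carries the log text, one option is to print events just before concat and then point key at that field. The sketch below assumes the record_transformer scenario above, where the text ends up in log_content; the tag pattern is hypothetical, and the stdout filter is meant to be removed once the field name is confirmed.

```
# Temporary: print each event as it arrives, before concat sees it.
<filter app.**>
  @type stdout
</filter>

<filter app.**>
  @type concat
  # Must name the field that really holds the line text.
  key log_content
  multiline_start_regexp /^\d{4}-\d{2}-\d{2} /
</filter>
```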

After addressing these points, you might encounter issues with field extraction from the now-reassembled multiline logs. The next common hurdle is ensuring your downstream parsers (like parser plugins) are configured to correctly interpret the combined log message.
