Fluent Bit’s multiline parser is fundamentally a state machine that stitches together log lines based on regular expression matching and timeouts.

Here’s how you can reassemble fragmented Java stack traces in Fluent Bit. Consider these records as Fluent Bit sees them without any multiline handling:

[
  {
    "log": "2023-10-27 10:00:00 INFO com.example.MyApplication - Starting application...",
    "stream": "stdout",
    "time": "2023-10-27T10:00:00.123Z"
  },
  {
    "log": "2023-10-27 10:00:05 ERROR com.example.MyApplication - Uncaught exception in thread \"main\"",
    "stream": "stdout",
    "time": "2023-10-27T10:00:05.456Z"
  },
  {
    "log": "java.lang.NullPointerException: Attempted to invoke virtual method 'void java.lang.String.length()' on a null object reference",
    "stream": "stdout",
    "time": "2023-10-27T10:00:05.789Z"
  },
  {
    "log": "\tat com.example.MyService.doSomething(MyService.java:42)",
    "stream": "stdout",
    "time": "2023-10-27T10:00:05.889Z"
  },
  {
    "log": "\tat com.example.MyApplication.run(MyApplication.java:25)",
    "stream": "stdout",
    "time": "2023-10-27T10:00:05.989Z"
  },
  {
    "log": "\tat java.base/java.lang.Thread.run(Thread.java:3045)",
    "stream": "stdout",
    "time": "2023-10-27T10:00:06.089Z"
  }
]

In this example, the ERROR line, the java.lang.NullPointerException line, and the subsequent lines starting with \tat all belong to a single logical event: one uncaught exception and its stack trace. Fluent Bit, by default, treats each of these as a separate log entry. To reassemble them, we need to tell Fluent Bit how to identify the start and the continuation of a multiline log.
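To make the goal concrete, here is a minimal Python sketch of the grouping we want. It is not Fluent Bit code. It assumes, purely for illustration, that a new logical record always starts with a YYYY-MM-DD HH:MM:SS timestamp and that every other line continues the record that is currently open; the function name reassemble is hypothetical.

```python
import re

# Assumption: a new logical record starts with a "YYYY-MM-DD HH:MM:SS " timestamp.
START = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} ")

def reassemble(lines):
    """Group raw lines into records: a START line opens a record, any other line is appended."""
    records, current = [], None
    for line in lines:
        if START.match(line) or current is None:
            if current is not None:
                records.append("\n".join(current))  # flush the finished record
            current = [line]
        else:
            current.append(line)  # continuation of the open record
    if current is not None:
        records.append("\n".join(current))  # flush the last record
    return records

# The "log" values from the sample records above.
lines = [
    "2023-10-27 10:00:00 INFO com.example.MyApplication - Starting application...",
    '2023-10-27 10:00:05 ERROR com.example.MyApplication - Uncaught exception in thread "main"',
    "java.lang.NullPointerException: Attempted to invoke virtual method 'void java.lang.String.length()' on a null object reference",
    "\tat com.example.MyService.doSomething(MyService.java:42)",
    "\tat com.example.MyApplication.run(MyApplication.java:25)",
    "\tat java.base/java.lang.Thread.run(Thread.java:3045)",
]

records = reassemble(lines)
print(len(records))  # 2: the INFO line, and the ERROR line with its stack trace
```

The sketch flushes the last record when the input ends. A long-running log processor cannot know that the input has ended, which is why a flush timeout is needed in practice.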

The core of this is a [MULTILINE_PARSER] definition, loaded through Parsers_File and applied to the log key by the multiline filter. First, fluent-bit.conf:

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File parsers.conf

# Assumes each line of /var/log/app.log is one JSON object
[INPUT]
    Name          tail
    Path          /var/log/app.log
    Parser        json
    Tag           app.log
    Mem_Buf_Limit 10MB

[FILTER]
    Name                  multiline
    Match                 app.log
    multiline.key_content log
    multiline.parser      java_stacktrace

[OUTPUT]
    Name         stdout
    Match        app.log
    Format       json

And in parsers.conf:

[PARSER]
    Name        json
    Format      json

[MULTILINE_PARSER]
    name          java_stacktrace
    type          regex
    flush_timeout 1000
    # rule  <state name>   <regex>   <next state>
    rule    "start_state"  "/^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} /"  "cont"
    rule    "cont"         "/^(?:\s+(?:at |\.{3} \d+ more)|Caused by: |[\w.$]+(?:Exception|Error)\b)/"  "cont"

Each rule has three quoted fields: a state name, a regex, and the state to move to when the regex matches. The first rule must be named start_state. Here it matches any line that opens with a YYYY-MM-DD HH:MM:SS timestamp and moves the parser into the cont state. The cont rule then decides whether the next line belongs to the same record. It accepts three kinds of line:

  1. \s+(?:at |\.{3} \d+ more): leading whitespace followed by at (a stack frame) or by ... N more (the marker Java prints when it elides repeated frames).
  2. Caused by: : the header line of a chained exception.
  3. [\w.$]+(?:Exception|Error)\b: a line that begins with a qualified exception class name, such as java.lang.NullPointerException.

As long as incoming lines match cont, they are appended to the buffered record. The first line that does not match ends the record: the buffer is emitted and the new line is tested against start_state again. There is no separate closing regex. flush_timeout is given in milliseconds and sets how long Fluent Bit waits for another line before emitting a buffered record anyway, so 1000 means one second.

The multiline filter applies java_stacktrace to the key named by multiline.key_content, here log, for records tagged app.log. The concatenated lines become the new value of log in the emitted record.

If your logs are not JSON, adjust the INPUT section accordingly, for example by removing the json parser so that the raw line is what gets matched. If the first line of a stack trace carries no distinctive marker such as a timestamp, you will need a more general regex for the start of a record.

The most common pitfall is a regex that either fails to match the start of the trace or misclassifies continuation lines. For instance, a continuation pattern that only looks for \tat will miss Caused by: sections and split one trace into several records. Another common issue is a flush timeout that’s too short, which causes partial stack traces to be emitted before the remaining lines arrive.
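The first pitfall is easy to check offline before touching the Fluent Bit configuration. The sketch below uses Python's re module, which is close to but not identical to the regex engine Fluent Bit uses, so treat it as a sanity check rather than proof. Both patterns and the sample lines for the Caused by: and ... 3 more cases are made up for illustration.

```python
import re

# Hypothetical continuation patterns, for illustration only.
NARROW = re.compile(r"^\s+at ")  # only matches stack frames
BROADER = re.compile(r"^(?:\s+(?:at |\.\.\. \d+ more)|Caused by: )")

# Made-up lines of the kind that appear in a chained Java stack trace.
trace_lines = [
    "\tat com.example.MyService.doSomething(MyService.java:42)",
    "Caused by: java.lang.IllegalStateException: not initialised",
    "\t... 3 more",
]

narrow_hits = [bool(NARROW.match(line)) for line in trace_lines]
broader_hits = [bool(BROADER.match(line)) for line in trace_lines]

print(narrow_hits)   # [True, False, False]
print(broader_hits)  # [True, True, True]
```

With the narrow pattern, the Caused by: line would end the record, and the frames that follow it would be emitted as separate entries.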

If you’re seeing \n sequences inside your reassembled log messages on stdout, that is usually not a parsing failure. The original lines are joined with newline characters, and the JSON output format escapes each newline as \n. Decoding the JSON gives you the separate lines back.

The next problem you’re likely to hit is further down the pipeline: a downstream system that expects single-line entries may split or mis-parse the reassembled multi-line messages.
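One way around this, sketched below in Python, is to ship each record as a single JSON object per line, so that embedded newlines travel as escaped \n sequences and the consumer restores them when decoding. The record content is abbreviated from the sample above, and the approach itself is an assumption about what your downstream system accepts.

```python
import json

# A reassembled record whose "log" value spans several lines.
record = {
    "log": "java.lang.NullPointerException\n"
           "\tat com.example.MyService.doSomething(MyService.java:42)\n"
           "\tat com.example.MyApplication.run(MyApplication.java:25)",
    "stream": "stdout",
}

# json.dumps escapes the newlines and tabs, so the serialized record is one physical line.
wire_line = json.dumps(record)

# The consumer gets the original multi-line text back after decoding.
decoded = json.loads(wire_line)
print("\n" in wire_line)            # False
print(decoded["log"].count("\n"))   # 2
```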
