Lua is the unsung hero of Fluent Bit’s data transformation pipeline, letting you do more than just filter and route.
Let’s see what this looks like in practice. Imagine you’re getting JSON logs from a Kubernetes pod, and you want to extract a specific field, say request_id, and prepend it to the message field.
Here’s a sample log entry:
{
  "log": "{\"level\":\"info\",\"message\":\"User logged in\",\"request_id\":\"abc123xyz\"}",
  "stream": "stdout",
  "time": "2023-10-27T10:00:00.123456789Z"
}
And here’s the Lua script that accomplishes the transformation:
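-- /etc/fluent-bit/filters/transform.lua

-- cjson is not part of stock LuaJIT; this assumes the lua-cjson module is
-- installed where Fluent Bit's Lua runtime can load it.
local cjson = require("cjson")

function parse_log_message(tag, timestamp, record)
    local log_data = record['log']
    if type(log_data) ~= 'string' then
        -- Nothing to parse: code 0 keeps the record unchanged
        return 0, timestamp, record
    end

    -- cjson.decode raises an error on malformed JSON, so wrap it in pcall
    local ok, parsed_log = pcall(cjson.decode, log_data)
    if not ok or type(parsed_log) ~= 'table' then
        return 0, timestamp, record
    end

    if parsed_log['request_id'] and parsed_log['message'] then
        -- Construct the new message and store it on the original record
        record['message'] = "[" .. parsed_log['request_id'] .. "] " .. parsed_log['message']
        -- Optionally, remove the original 'log' field if no longer needed
        -- record['log'] = nil
        -- Code 2: record modified, original timestamp kept
        return 2, timestamp, record
    end

    return 0, timestamp, record
end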
And here’s how you’d configure Fluent Bit to use it in fluent-bit.conf:
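[SERVICE]
    Flush        5
    Daemon       off
    Log_Level    info
    # Assumes the stock parsers.conf sits next to this file
    Parsers_File parsers.conf

[INPUT]
    Name   tail
    Path   /var/log/containers/*.log
    # The sample record above is in Docker's json-file format;
    # use the cri parser instead on containerd or CRI-O nodes
    Parser docker
    Tag    k8s.*

[FILTER]
    Name   lua
    Match  k8s.*
    Script /etc/fluent-bit/filters/transform.lua
    Call   parse_log_message

[OUTPUT]
    Name   stdout
    Match  k8s.*
    Format json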
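With this setup, the record emitted for the sample log would contain the following fields (the original log field is still present, because the script leaves the line that removes it commented out):
{
  "log": "{\"level\":\"info\",\"message\":\"User logged in\",\"request_id\":\"abc123xyz\"}",
  "message": "[abc123xyz] User logged in",
  "stream": "stdout",
  "time": "2023-10-27T10:00:00.123456789Z"
}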
The core problem Fluent Bit’s Lua filter solves is providing a flexible, programmatic way to manipulate log records after they’ve been ingested and before they’re sent to an output. While built-in filters handle common tasks like adding metadata or simple regex replacements, complex logic (parsing nested JSON, performing calculations, or conditionally altering fields based on intricate rules) quickly becomes unwieldy or impossible with standard filters. Lua, being a lightweight and embeddable scripting language, is embedded in Fluent Bit’s C core, allowing you to write custom filter callbacks in Lua that have direct access to the log record’s data.
Internally, when Fluent Bit encounters a FILTER section with Name lua, it loads the specified Script. For each log record that matches the Match pattern, Fluent Bit calls the Lua function named by Call. The function receives the tag, the timestamp, and the log record as a Lua table. It can modify that table in place or build a new one, and the record it returns, together with a return code, is what continues through the pipeline. For JSON parsing and serialization, a library such as cjson is commonly used. It is not part of stock LuaJIT, so it has to be available to Fluent Bit’s Lua runtime before a script can require it.
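To make the calling convention concrete, here is a minimal sketch of a callback and a single hand-made invocation of it in plain Lua, which is a handy way to try a script before wiring it into Fluent Bit. The function name tag_record, the source_tag field, and the sample values are hypothetical, and the manual call only imitates what Fluent Bit does for each matching record.

```lua
-- Callback in the shape the Lua filter expects: (tag, timestamp, record).
-- 'tag_record' and the 'source_tag' field are hypothetical names for this sketch.
function tag_record(tag, timestamp, record)
    record["source_tag"] = tag      -- modify the record table in place
    -- Return code 2: the record changed, the original timestamp is kept
    return 2, timestamp, record
end

-- Imitate one invocation by hand (a stand-in, not Fluent Bit's internals).
local code, ts, rec = tag_record("k8s.demo", 1698400800.5, { stream = "stdout" })
print(code, rec.source_tag, rec.stream)
```

Because the callback is an ordinary global Lua function, the same file can be run with a standalone Lua interpreter for a quick check, provided it does not depend on modules that only exist on the Fluent Bit host.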
The key levers you control are the Lua script itself and the Match pattern in the Fluent Bit configuration. The Match pattern determines which logs your Lua script will operate on, allowing for targeted transformations. Within the Lua script, you have full programmatic control. You can access any field in the record table (record['field_name']), add new fields, modify existing ones, or delete fields by setting them to nil. You can use Lua’s control structures (if/else, loops) and its standard libraries, plus any JSON module such as cjson that you have made loadable. The function you Call must return three values: a return code, the timestamp, and the record. A code of -1 drops the record, 0 keeps the record unchanged (any modifications are ignored), 1 means both the timestamp and the record were modified, and 2 means only the record was modified and the original timestamp is kept.
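The return codes are easier to remember when seen side by side. The following is a small sketch, not taken from the setup above: the callback name filter_by_level is hypothetical, and it assumes records may carry an optional level field. It uses the codes as the Fluent Bit Lua filter documentation describes them (-1 drop, 0 keep unchanged, 2 record modified with the original timestamp kept).

```lua
-- Hypothetical callback illustrating three of the return codes.
-- Assumes records may carry an optional 'level' field.
function filter_by_level(tag, timestamp, record)
    local level = record["level"]

    if level == "debug" then
        -- -1: drop the record entirely
        return -1, timestamp, record
    end

    if type(level) ~= "string" then
        -- 0: keep the record exactly as it arrived
        return 0, timestamp, record
    end

    -- 2: the record was modified, the original timestamp is kept
    record["level"] = string.upper(level)
    return 2, timestamp, record
end
```

Code 1 is only needed when the callback also wants to change the timestamp; for field edits alone, 2 avoids touching it.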
Many users, when dealing with nested JSON within a log field (like the log field in the Kubernetes example), attempt to access nested keys directly without first parsing the string. This fails because the log field contains a string representation of JSON, not a Lua table. In Lua, indexing a string with an unknown key simply yields nil, so the lookup fails silently rather than loudly. The correct approach, as shown in the example, is to use cjson.decode() to convert that string into a Lua table before accessing its keys. Because cjson.decode() raises an error on malformed JSON, it is worth wrapping the call in pcall so that one bad line does not break the filter.
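The pitfall and the defensive fix can be sketched in a few lines. The helper safe_decode below is hypothetical and not part of Fluent Bit. It takes the decoder as an argument so the sketch runs in plain Lua; in a real script you would pass cjson.decode or whichever JSON decoder you have installed.

```lua
-- Hypothetical helper (not part of Fluent Bit): wraps any JSON decoder so
-- that malformed or non-string input yields nil instead of raising an error.
local function safe_decode(decode, text)
    if type(text) ~= "string" then
        return nil
    end
    local ok, result = pcall(decode, text)
    if ok and type(result) == "table" then
        return result
    end
    return nil
end

-- The 'log' field is a plain string, so indexing it finds no such key.
local record = { log = '{"request_id":"abc123xyz"}' }
local direct = record["log"]["request_id"]  -- nil, with no error raised

-- Intended use inside a filter callback, assuming cjson is loadable:
--   local parsed = safe_decode(cjson.decode, record["log"])
--   if parsed then ... end
```

Passing the decoder in keeps the helper independent of any particular JSON library, which also makes it easy to exercise with a stub.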
The next logical step is to explore how to use Lua to interact with external data sources or perform more complex stateful transformations across multiple log records.