Fluent Bit’s grep filter is more capable than it first appears. It can keep records that match one pattern and drop records that match another, and it does so inside the pipeline, before anything is handed to a destination.
Let’s see it in action. Imagine you’re collecting logs from a Kubernetes pod and want to send them to Elasticsearch, but only want to see your application’s INFO level messages and above, while also excluding any logs that mention "debug" or "trace" specifically.
Here’s a simplified fluent-bit.conf snippet:
[SERVICE]
    Flush      1
    Daemon     off
    Log_Level  info

[INPUT]
    Name    tail
    Path    /var/log/containers/my-app-*.log
    Tag     my-app.log
    Parser  docker

[FILTER]
    Name   grep
    Match  my-app.log
    Regex  level (info|warn|error|fatal)
# If you want to exclude, use this:
#   Exclude message debug
#   Exclude message trace

[OUTPUT]
    Name             es
    Match            my-app.log
    Host             elasticsearch.example.com
    Port             9200
    Logstash_Prefix  my-app
    Logstash_Format  On
In this setup, the [FILTER] section is key.
- Name grep: This tells Fluent Bit to use the grep filter plugin.
- Match my-app.log: This ensures the filter only applies to logs tagged with my-app.log.
- Regex level (info|warn|error|fatal): This is the core of our inclusion. It keeps only records whose level field contains "info", "warn", "error", or "fatal". The pattern is unanchored and case-sensitive, so a record whose level is "INFO" would not match as written. If a record doesn’t match this pattern, it’s dropped.
- Regex engine: There is nothing to select here. The grep filter has no Regex_type option; Fluent Bit’s regex matching uses the Onigmo library, which follows Ruby-style syntax.
Now, let’s say you also want to exclude lines containing specific keywords, even if they match the level regex. You’d add Exclude directives:
[FILTER]
    Name     grep
    Match    my-app.log
    Regex    level (info|warn|error|fatal)
    Exclude  message debug
    Exclude  message trace
Here, Exclude message debug means if a log line also has a message field containing the word "debug", it will be dropped, regardless of its level. The same applies to Exclude message trace. This allows for fine-grained control, letting you keep generally useful logs while shedding specific noise.
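If you would rather not depend on how a single filter combines Regex and Exclude rules, the same logic can be split across two grep filters. This is only a sketch: it assumes filters are applied in the order they appear in the file, and it reuses the level and message field names from the example above.

```ini
# First pass: keep only records at info level or above.
[FILTER]
    Name     grep
    Match    my-app.log
    Regex    level (info|warn|error|fatal)

# Second pass: drop any surviving record whose message mentions debug or trace.
[FILTER]
    Name     grep
    Match    my-app.log
    Exclude  message (debug|trace)
```

Folding both keywords into one alternation keeps the second filter to a single rule, and reading the two stages separately makes it obvious which one dropped a given record.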
The grep filter’s Match directive is crucial for performance. Applying a filter to all incoming logs unnecessarily can be a bottleneck. Always scope your filters to the specific tags they need to operate on.
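As a rough illustration of scoping, the sketch below adds a second, hypothetical input tagged other-app.log (the path and tag are made up for this example). The grep filter is matched to my-app.log only, so records from the other input skip the regex evaluation entirely.

```ini
[INPUT]
    Name    tail
    Path    /var/log/containers/my-app-*.log
    Tag     my-app.log
    Parser  docker

# Hypothetical second source, used only to show scoping.
[INPUT]
    Name    tail
    Path    /var/log/containers/other-app-*.log
    Tag     other-app.log
    Parser  docker

# Scoped to one tag. Using "Match *" here would run the regex
# against records from both inputs.
[FILTER]
    Name   grep
    Match  my-app.log
    Regex  level (info|warn|error|fatal)
```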
The most surprising thing about grep is that it operates before the output stage. This means you can filter logs out entirely, rather than just conditionally sending them. If a log line is dropped by the grep filter, it never even makes it to the OUTPUT plugin. This is significantly more efficient than sending everything to the output and then filtering there, especially if your output destination has rate limits or is expensive.
What many users don’t realize is that the Regex and Exclude rules in a filter do not have to target the same field. Each rule names its own key, so you are not limited to filtering on a single field. For instance, you could have a Regex that matches specific error codes in a code field and an Exclude that removes logs from a particular internal service identified by a service_name field.
This powerful combination allows you to sculpt your log stream precisely.
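A minimal sketch of that multi-field idea follows. The code and service_name fields come from the description above; the specific patterns (5xx codes, a service called internal-healthcheck) are hypothetical, and the example assumes both fields are stored as strings, since the regex is matched against string values.

```ini
[FILTER]
    Name     grep
    Match    my-app.log
    # Keep only records whose code field is a 5xx value (hypothetical pattern).
    Regex    code ^5[0-9][0-9]$
    # Then drop anything emitted by one internal service (hypothetical name).
    Exclude  service_name ^internal-healthcheck$
```

Anchoring both patterns with ^ and $ avoids accidental substring hits, such as a code of 1500 or a service whose name merely contains the excluded one.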
The next step after mastering grep is often learning how to enrich your logs with the modify filter before applying your filtering logic.