Fluent Bit can write log records to a SQL database. The point is not simply to dump logs, but to turn unstructured log data into structured, queryable information for deeper analysis and auditing.

Let’s see it in action. Imagine you’re running a web server and want to capture each incoming HTTP request, along with its timestamp, source IP, and requested URL, into a PostgreSQL database.

Here’s a simplified Fluent Bit configuration for this:

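[SERVICE]
    Flush        5
    Daemon       Off
    Log_Level    info
    # In the classic config format, parsers are loaded from a separate file.
    # The file name parsers.conf is an assumption; use your own path.
    Parsers_File parsers.conf

[INPUT]
    Name         tail
    Path         /var/log/nginx/access.log
    Tag          http.request
    Parser       nginx

# parsers.conf (the separate file referenced by Parsers_File above)
[PARSER]
    Name         nginx
    Format       regex
    Regex        ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>.*?) (?<path>.*?) (?<protocol>.*?)" (?<status>\d*) (?<bytes>\d*) "(?<referer>.*?)" "(?<agent>.*?)"
    Time_Key     time
    Time_Format  %d/%b/%Y:%H:%M:%S %z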

[OUTPUT]
    Name         sql
    Match        http.request
    Host         localhost
    Port         5432
    Database     weblogs
    User         log_user
    Password     securepassword
    Table        access_logs
    Schema       public
    # Optional: Specify column mapping if parser names don't match DB column names
    # Map_Key    remote_addr:remote
    # Map_Key    request_path:path

And here’s the corresponding PostgreSQL table schema:

CREATE TABLE public.access_logs (
    timestamp timestamp with time zone,
    remote text,
    method text,
    path text,
    protocol text,
    status integer,
    bytes bigint,
    referer text,
    agent text
);

When Fluent Bit processes an Nginx access log line like this:

192.168.1.10 - - [10/Oct/2023:14:30:05 +0000] "GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"

The tail input reads it, and the nginx parser extracts the named captures into a record like:

{
  "remote": "192.168.1.10",
  "host": "-",
  "user": "-",
  "time": "10/Oct/2023:14:30:05 +0000",
  "method": "GET",
  "path": "/index.html",
  "protocol": "HTTP/1.1",
  "status": "200",
  "bytes": "1234",
  "referer": "-",
  "agent": "Mozilla/5.0"
}

Because the parser sets Time_Key and Time_Format, Fluent Bit converts the time string into the record's timestamp at parse time and, unless Time_Keep is On, drops the time key from the record. The sql output plugin then takes this record and inserts a new row into the access_logs table in your PostgreSQL database.
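To make the parsing step concrete, here is a minimal Python sketch of what the parser does with the sample line. It is an illustration, not Fluent Bit's implementation: the regex is a port of the one in the configuration (Python spells named groups as (?P<name>...)), and parse_line is a hypothetical helper name.

```python
import re
from datetime import datetime

# Python port of the parser regex; only the named-group syntax differs.
NGINX_RE = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>.*?) (?P<path>.*?) (?P<protocol>.*?)" '
    r'(?P<status>\d*) (?P<bytes>\d*) "(?P<referer>.*?)" "(?P<agent>.*?)"'
)
TIME_FORMAT = "%d/%b/%Y:%H:%M:%S %z"  # same value as Time_Format


def parse_line(line):
    """Return (timestamp, record) for a matching line, or None otherwise."""
    match = NGINX_RE.match(line)
    if match is None:
        return None
    record = match.groupdict()  # every capture is still a string here
    # Mirror Time_Key handling: the time field becomes the record timestamp
    # and is removed from the record itself.
    timestamp = datetime.strptime(record.pop("time"), TIME_FORMAT)
    return timestamp, record


line = ('192.168.1.10 - - [10/Oct/2023:14:30:05 +0000] '
        '"GET /index.html HTTP/1.1" 200 1234 "-" "Mozilla/5.0"')
timestamp, record = parse_line(line)
print(timestamp.isoformat())
print(record)
```

Note that status and bytes come out as the strings "200" and "1234"; nothing has been converted to a number at this stage.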

The core problem this solves is the ephemeral and often unwieldy nature of raw log files. Instead of sifting through gigabytes of text, you get structured data that can answer questions with plain SQL, such as:

  • "Show me all 500 errors from the last hour."
  • "What are the top 10 most requested URLs?"
  • "Count requests by IP address to detect potential brute-force attempts."
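The three questions above translate into short queries. The sketch below runs them from Python against an in-memory SQLite table that stands in for access_logs, so it is self-contained; the sample rows are invented for illustration. Against PostgreSQL the same SELECT statements apply, with the time filter typically written as timestamp >= now() - interval '1 hour'.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE access_logs ("
    " timestamp TEXT, remote TEXT, method TEXT, path TEXT, protocol TEXT,"
    " status INTEGER, bytes INTEGER, referer TEXT, agent TEXT)"
)

now = datetime.now(timezone.utc).replace(microsecond=0)


def ts(minutes_ago):
    """ISO-8601 UTC timestamp for a point in the recent past."""
    return (now - timedelta(minutes=minutes_ago)).isoformat()


# Invented sample rows, purely for demonstration.
rows = [
    (ts(5), "192.168.1.10", "GET", "/index.html", "HTTP/1.1", 200, 1234, "-", "Mozilla/5.0"),
    (ts(10), "192.168.1.10", "POST", "/login", "HTTP/1.1", 500, 0, "-", "curl/8.0"),
    (ts(20), "10.0.0.7", "POST", "/login", "HTTP/1.1", 401, 0, "-", "curl/8.0"),
    (ts(90), "10.0.0.7", "GET", "/index.html", "HTTP/1.1", 500, 0, "-", "curl/8.0"),
    (ts(3), "10.0.0.7", "POST", "/login", "HTTP/1.1", 401, 0, "-", "curl/8.0"),
]
conn.executemany("INSERT INTO access_logs VALUES (?,?,?,?,?,?,?,?,?)", rows)

# 1. All 500 errors from the last hour.
errors_last_hour = conn.execute(
    "SELECT timestamp, remote, path FROM access_logs"
    " WHERE status = 500 AND timestamp >= ? ORDER BY timestamp",
    (ts(60),),
).fetchall()

# 2. Top 10 most requested URLs.
top_urls = conn.execute(
    "SELECT path, COUNT(*) AS hits FROM access_logs"
    " GROUP BY path ORDER BY hits DESC LIMIT 10"
).fetchall()

# 3. Requests per IP address, busiest clients first.
requests_by_ip = conn.execute(
    "SELECT remote, COUNT(*) AS requests FROM access_logs"
    " GROUP BY remote ORDER BY requests DESC"
).fetchall()

print(errors_last_hour)
print(top_urls)
print(requests_by_ip)
```

For brute-force detection you would usually narrow the third query to a path such as /login and a recent time window before counting.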

The sql output plugin supports multiple SQL database backends (PostgreSQL, MySQL, SQLite, etc.) and handles connection pooling and retries automatically. Output plugins and their option names vary between Fluent Bit versions and builds (upstream releases, for example, ship a PostgreSQL output named pgsql), so confirm the options shown above against the documentation for the version you run. The Map_Key directive matters when your parser’s field names don’t align with your database column names: it maps a record key (parsed field) to a database column name.

One detail often overlooked is how data types are handled. The regex parser extracts every capture as a string, and the sql output plugin attempts to cast those strings to the appropriate SQL types based on your table schema. For example, if your status column is an INTEGER and Fluent Bit receives "200", it will attempt to cast "200" to 200 before insertion. This automatic casting saves manual effort, but it is also a source of errors when the data doesn’t conform to the expected type (e.g., trying to insert "N/A" into an integer column). If you would rather convert at parse time, the parser’s Types option (for example, Types status:integer bytes:integer) turns those captures into integers before the record reaches the output.
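The failure mode is easy to reproduce outside Fluent Bit. The following Python sketch illustrates the idea of casting string fields to column types before an insert; it does not show the plugin's actual code, and COLUMN_TYPES and coerce_record are hypothetical names chosen for this example.

```python
# Hypothetical mapping from column name to the Python type standing in
# for the SQL column type (INTEGER, BIGINT -> int).
COLUMN_TYPES = {"status": int, "bytes": int}


def coerce_record(record, column_types=COLUMN_TYPES):
    """Cast string fields to their column types; raise ValueError on bad data."""
    typed = {}
    for key, value in record.items():
        caster = column_types.get(key)
        if caster is None:
            typed[key] = value  # text columns pass through unchanged
            continue
        try:
            typed[key] = caster(value)
        except ValueError as exc:
            raise ValueError(
                f"cannot cast {key}={value!r} to {caster.__name__}"
            ) from exc
    return typed


good = coerce_record({"status": "200", "bytes": "1234", "path": "/index.html"})
print(good)

try:
    coerce_record({"status": "N/A", "bytes": "0", "path": "/index.html"})
except ValueError as err:
    print("rejected:", err)
```

Whether a bad value should fail the whole record, be replaced with NULL, or be routed elsewhere is a design choice; the sketch simply raises so the problem is visible.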

The next logical step is to explore how to aggregate logs before sending them to the database to reduce the load and noise.
