Fluent Bit’s Kafka output plugin doesn’t generate logs itself; it acts as a Kafka producer, forwarding the log records it receives from input plugins to Kafka topics.

Here’s Fluent Bit shipping logs to Kafka, using the tail input plugin to read from a file and the kafka output plugin to send them:

[SERVICE]
    Flush        1
    Daemon       off
    Log_Level    info
    Parsers_File parsers.conf

[INPUT]
    Name         tail
    Path         /var/log/myapp.log
    Tag          myapp.logs
    Parser       json

[OUTPUT]
    Name         kafka
    Match        myapp.logs
    Brokers      kafka-broker-1:9092,kafka-broker-2:9092
    Topics       myapp-topic
    Message_Key_Field event_id

The [SERVICE] section tells Fluent Bit to:

  • Flush buffered records every second (Flush 1).
  • Run in the foreground (Daemon off).
  • Log at the info level.
  • Use the parsers.conf file for parsing log lines.

The [INPUT] section defines a tail input plugin that monitors /var/log/myapp.log. It tags these incoming logs with myapp.logs and expects them to be in JSON format, as specified by the Parser json directive.

The [OUTPUT] section uses the kafka output plugin.

  • It only processes records tagged with myapp.logs (Match myapp.logs).
  • It connects to Kafka brokers at kafka-broker-1:9092 and kafka-broker-2:9092.
  • It sends records to the myapp-topic.
  • It uses the value of the event_id field from each log record as the Kafka message key (Message_Key_Field event_id).
  • It needs no setting for asynchronous delivery: the underlying librdkafka producer queues and batches messages asynchronously on its own.

When Fluent Bit starts, it tails /var/log/myapp.log. Each new line is parsed as JSON and tagged myapp.logs. Because that tag matches the Match rule, the record is routed to the kafka output plugin, which formats it as a Kafka message keyed by the value of its event_id field and produces it to myapp-topic on the specified brokers.
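The Parser json directive only works if a parser with that name is defined in parsers.conf. A minimal sketch of such a definition follows, modeled on the json parser in the stock parsers.conf. It assumes each log line carries a time field in the format shown; if yours does not, drop Time_Key and Time_Format.

```ini
[PARSER]
    Name        json
    Format      json
    # Assumption: records contain a "time" field in this format.
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
```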

A convenient property of Fluent Bit’s Kafka output is that it doesn’t require a separate Kafka client library to be installed on the host: the producer logic comes from librdkafka, which is bundled with Fluent Bit and compiled into the plugin.

Consider a scenario where you have application logs in /var/log/app/service.log and you want to send them to a Kafka topic named app-logs on kafka-01:9092.

[INPUT]
    Name         tail
    Path         /var/log/app/service.log
    Tag          app.service
    # Assumes the logs are Docker-formatted JSON
    Parser       docker

[OUTPUT]
    Name         kafka
    Match        app.service
    Brokers      kafka-01:9092
    Topics       app-logs
    # No message key is set, so the producer's partitioner picks the partition

This configuration sets up another input, this time for Docker-formatted JSON logs in /var/log/app/service.log, tagged as app.service. These logs are then directed to the app-logs Kafka topic. Since no message key is specified, messages are produced without a key and the librdkafka partitioner spreads them across the topic’s partitions.

The message key, set with Message_Key_Field (or Message_Key for a fixed value), is powerful because it lets you control partitioning. If you key on a field that is common across related log events (like a user ID or a request ID), all logs with the same key value land on the same Kafka partition. Kafka only guarantees ordering within a partition, so this is crucial for ordered processing of related events downstream.
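As a sketch of keying by a shared field, the output below takes the message key from a request_id field. The field name is hypothetical; substitute whatever field groups related events in your records.

```ini
[OUTPUT]
    Name              kafka
    Match             myapp.logs
    Brokers           kafka-broker-1:9092,kafka-broker-2:9092
    Topics            myapp-topic
    # Hypothetical field: every record with the same request_id
    # value is produced with the same key, hence the same partition.
    Message_Key_Field request_id
```

Message_Key, by contrast, sets one fixed key for every message, which sends all of them to a single partition.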

If your Kafka brokers use TLS encryption, you’ll need to configure TLS settings in the output plugin. For example, to use TLS with a CA certificate at /etc/ssl/certs/ca.crt:

[OUTPUT]
    Name         kafka
    Match        myapp.logs
    Brokers      kafka-tls-broker:9092
    Topics       myapp-topic-tls
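    Message_Key_Field         event_id
    rdkafka.security.protocol ssl
    rdkafka.ssl.ca.location   /etc/ssl/certs/ca.crt

The Kafka output passes any option prefixed with rdkafka. through to librdkafka, so TLS is configured with librdkafka properties: rdkafka.security.protocol ssl enables TLS, and rdkafka.ssl.ca.location points to the certificate authority file used to verify the broker’s identity.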

The acks setting, exposed in Fluent Bit as rdkafka.request.required.acks, controls how many broker acknowledgments a produce request needs before it counts as successful. A value of 0 means the producer doesn’t wait for any acknowledgment, offering the highest throughput but risking data loss if a broker fails right after receiving the message. A value of 1 waits for an acknowledgment from the partition leader, a reasonable balance. A value of -1 (all) waits for all in-sync replicas, giving maximum durability at the cost of higher latency. In librdkafka 1.0 and later the default is -1 (all); older versions defaulted to 1, so set the value explicitly if it matters.
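A minimal sketch of setting the acknowledgment level explicitly, reusing the brokers and topic from the first example:

```ini
[OUTPUT]
    Name    kafka
    Match   myapp.logs
    Brokers kafka-broker-1:9092,kafka-broker-2:9092
    Topics  myapp-topic
    # -1 = all in-sync replicas, 1 = leader only, 0 = no acknowledgment
    rdkafka.request.required.acks -1
```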

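Buffering happens at two levels. On the Fluent Bit side, Mem_Buf_Limit on the input caps how much unflushed data that input may hold in memory, and Flush in the [SERVICE] section sets how often buffered records are handed to the output plugin. On the producer side, librdkafka properties such as rdkafka.queue.buffering.max.messages and rdkafka.linger.ms control how many messages may queue up and how long they wait to be batched. For high-volume scenarios, larger buffers and batches can improve throughput by reducing the number of produce requests to Kafka, but they also increase memory usage and the amount of data lost if Fluent Bit crashes before delivery.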
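The following sketch shows where such buffering settings would go. All numeric values are illustrative assumptions, not recommendations; tune them against your own volume and memory budget.

```ini
[INPUT]
    Name          tail
    Path          /var/log/myapp.log
    Tag           myapp.logs
    Parser        json
    # Cap on in-memory data for this input (illustrative value).
    Mem_Buf_Limit 50MB

[OUTPUT]
    Name    kafka
    Match   myapp.logs
    Brokers kafka-broker-1:9092,kafka-broker-2:9092
    Topics  myapp-topic
    # librdkafka producer queue size and batching delay (illustrative values).
    rdkafka.queue.buffering.max.messages 200000
    rdkafka.linger.ms                    100
    # Local retries when the librdkafka queue is full.
    queue_full_retries                   10
```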

For example, if each parsed JSON record carries an id field that identifies a user session, using id as the message key ensures that all log messages for that session land on the same partition, so downstream consumers process them in order.

The next logical step after reliably sending logs to Kafka is to consume and process them, which often involves setting up Kafka consumers or using stream processing frameworks like Apache Flink or Spark Streaming.
