Kubernetes doesn’t keep logs around for you: the container runtime writes each container’s stdout and stderr to files on the node, and those files disappear when the pod is deleted.

Let’s see how Fluent Bit can grab those ephemeral logs and send them somewhere useful, like Elasticsearch.

Imagine this: a simple Nginx pod spitting out access logs.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    ports:
    - containerPort: 80

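The official nginx image symlinks /var/log/nginx/access.log to /dev/stdout (and error.log to /dev/stderr), so the access log ends up on the container’s standard output. The kubelet and container runtime capture these stdout/stderr streams into files on the node, but if you want the logs parsed into structured fields, shipped off the node, or read from other files, you need a log collector. That’s where Fluent Bit comes in.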

Fluent Bit runs as a DaemonSet on your Kubernetes cluster. This means one Fluent Bit pod runs on each node, ensuring that no matter where your application pods land, there’s a Fluent Bit agent there to watch them.

The core idea is that Fluent Bit uses "inputs" to collect logs and "outputs" to send them. For Kubernetes, the most common input is tail for log files.

Here’s a simplified Fluent Bit configuration for collecting Nginx logs from a file:

[SERVICE]
    Flush        1
    Daemon       Off
    Log_Level    info
    Parsers_File /fluent-bit/etc/parsers.conf

[INPUT]
    Name         tail
    Tag          kube.*
    Path         /var/log/containers/nginx-pod_default_nginx-*.log
    Parser       nginx
    DB           /var/log/flb_kube.db
    Mem_Buf_Limit 1MB
    Skip_Long_Lines On
    Refresh_Interval 10

[OUTPUT]
    Name         es
    Match        kube.*
    Host         elasticsearch.logging.svc.cluster.local
    Port         9200
    Logstash_Format On
    Logstash_Prefix kubernetes
    Retry_Limit    False

Let’s break this down.

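The [SERVICE] section is pretty standard for Fluent Bit. Flush 1 means it flushes buffered records to the outputs every second. Daemon Off keeps it in the foreground, which is what you want in a container. Log_Level info sets the verbosity of Fluent Bit’s own log, and Parsers_File points to the file where parsers such as nginx are defined.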

The [INPUT] section is where the magic happens.

  • Name tail: We’re using the tail input plugin, which reads from files like a tail -f command.
  • Tag kube.*: This is a crucial part. Fluent Bit uses tags to route logs. In Kubernetes, logs from containers are typically symlinked into /var/log/containers/ on the node. In the tail plugin, a * in the Tag is not a glob: it is replaced with the path of the file being read, with slashes turned into dots, so every file under /var/log/containers/ gets a tag that starts with kube.var.log.containers.
  • Path /var/log/containers/nginx-pod_default_nginx-*.log: This is the file pattern Fluent Bit watches, and here the * is a glob. The kubelet names these files <pod>_<namespace>_<container>-<container-id>.log, so the pattern matches the nginx container of nginx-pod in the default namespace, whatever its container ID. Note that this path is on the host node, not inside the application container. Fluent Bit, running as a DaemonSet, reaches it through a hostPath volume mount.
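  • Parser nginx: This tells Fluent Bit to use a pre-defined parser to structure the log lines. The nginx parser is defined in parsers.conf and matches Nginx’s access log format, extracting fields such as time, remote IP, method, URL, protocol, status, bytes, referer, and user agent. One caveat: the files under /var/log/containers/ hold lines in the container runtime’s format (a timestamp and stream prefix with containerd, JSON with Docker), so that wrapper has to be decoded before the Nginx pattern can match the original line.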
  • DB /var/log/flb_kube.db: This is a SQLite database file where Fluent Bit records how far it has read in each file, so if Fluent Bit restarts, it can pick up where it left off. Keeping it in a hostPath-mounted directory such as /var/log lets it survive a restart of the Fluent Bit pod.
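  • Mem_Buf_Limit 1MB: This caps the memory the input may use for records that have not been flushed yet. When the limit is reached, the input pauses reading until the buffered data is flushed. (The tail plugin’s option is spelled Mem_Buf_Limit; Mem_Buffer_Size is not an option it recognizes.)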
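  • Skip_Long_Lines On: If a line is longer than the read buffer (Buffer_Max_Size), Fluent Bit skips that line and keeps reading the file instead of ceasing to monitor it.
  • Refresh_Interval 10: How often, in seconds, Fluent Bit re-scans the Path pattern for new files to watch.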
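To make the Parser nginx reference concrete, here is a minimal sketch of what such an entry in parsers.conf might look like. It assumes Nginx’s default "combined" access log format; the regex and field names are illustrative rather than a copy of the parser shipped with Fluent Bit, so compare it against the parsers.conf in your image before relying on it.

```ini
# Example line this pattern is meant to match:
# 10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.68.0"
[PARSER]
    Name        nginx
    Format      regex
    Regex       ^(?<remote>[^ ]*) - (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+) (?<path>\S+) (?<protocol>[^"]*)" (?<code>[^ ]*) (?<size>[^ ]*) "(?<referer>[^"]*)" "(?<agent>[^"]*)"$
    # Use the bracketed timestamp as the record's time
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
```

Lines that don’t fit the pattern (malformed requests, for example) are not parsed and keep their raw form, so it’s worth testing the regex against a sample of your real logs.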

The [OUTPUT] section defines where the logs go.

  • Name es: We’re using the Elasticsearch output plugin.
  • Match kube.*: This is the routing rule. It means "send any log record whose tag starts with kube. to this output." Our input tag was kube.var.log.containers..., so it matches.
  • Host elasticsearch.logging.svc.cluster.local: The hostname of your Elasticsearch cluster.
  • Port 9200: The port Elasticsearch is listening on.
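  • Logstash_Format On: This switches on Logstash-style index naming: records are written to a date-stamped index (prefix plus date) and get an @timestamp field, the convention that Kibana and other Logstash-oriented tooling expect.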
  • Logstash_Prefix kubernetes: This adds a prefix to the Elasticsearch index name, helping to organize your logs (e.g., kubernetes-YYYY.MM.DD).
  • Retry_Limit False: This tells Fluent Bit to keep retrying indefinitely if it can’t send logs to Elasticsearch.

You’d typically deploy Fluent Bit as a DaemonSet, with this configuration supplied through a ConfigMap. The path in the [INPUT] section and the host/port in the [OUTPUT] section are the most common things to adjust. You also need to ensure the Fluent Bit pod can read the host’s log files, usually via a hostPath volume mount for /var/log, which covers both /var/log/containers and the /var/log/pods directory its symlinks point to. Nodes that run Docker also need /var/lib/docker/containers mounted.
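A DaemonSet along these lines is one way to wire that up. Treat it as a sketch under assumptions: the logging namespace, the fluent-bit-config ConfigMap name, and the image tag are placeholders, and it assumes the image reads its configuration from /fluent-bit/etc/.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging            # assumed namespace
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:latest   # pin a specific version in practice
        volumeMounts:
        # Host logs; left writable because the DB file lives under /var/log
        - name: varlog
          mountPath: /var/log
        # fluent-bit.conf and parsers.conf come from the ConfigMap
        - name: config
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: config
        configMap:
          name: fluent-bit-config         # hypothetical ConfigMap name
```

Because the ConfigMap is mounted over /fluent-bit/etc/, it has to contain every file the configuration refers to, including parsers.conf. If you enrich records with metadata from the Kubernetes API, the pod also needs a ServiceAccount that is allowed to read pods.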

The tail input plugin, when used with Kubernetes, actually watches the symlinks in /var/log/containers/ on the host. These symlinks point to the actual log files managed by the kubelet and the container runtime (like containerd or Docker) on the node. Fluent Bit follows these symlinks to read the container’s log output. When the Tag in the input configuration contains a *, Fluent Bit builds each record’s tag from the path of the file under /var/log/containers/, and it’s this tag that gets matched by the Match directive in the output.
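Because those files are written by the runtime, each line carries a wrapper around the application’s output. With containerd, for example, a line looks like `2023-10-10T13:55:36.123456789Z stdout F <original line>`. One common arrangement, sketched here as an assumption about your setup rather than the only option, is to let the tail input decode the wrapper with the built-in docker and cri multiline parsers and then apply the nginx parser to the resulting log field with a parser filter.

```ini
[INPUT]
    Name              tail
    Tag               kube.*
    # Assumes the kubelet's <pod>_<namespace>_<container>-<id>.log naming
    Path              /var/log/containers/nginx-pod_default_nginx-*.log
    # Decode the runtime wrapper; the original line ends up in the "log" key
    multiline.parser  docker, cri

[FILTER]
    Name              parser
    Match             kube.*
    # Run the nginx parser on the unwrapped line
    Key_Name          log
    Parser            nginx
    # Keep the other keys (such as stream) next to the parsed fields
    Reserve_Data      On
```

If a line does not match the nginx parser, the filter leaves the record as it was, so unparsed lines still reach the output with their raw log field.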

Fluent Bit can also attach metadata like the pod name, namespace, and container name to each record. These names are encoded in the log file name, and therefore in the tag; the kubernetes filter extracts them and can look up the pod’s labels and annotations from the API server. This metadata is invaluable for filtering and searching logs later. The es output sends each record to Elasticsearch as a JSON document, so these fields are indexed alongside the log message.
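A minimal sketch of that enrichment step uses the kubernetes filter. It assumes tags of the form kube.var.log.containers.<file name>, which is what the tail input produces with Tag kube.*, and a pod ServiceAccount that may read pods from the API server.

```ini
[FILTER]
    Name             kubernetes
    Match            kube.*
    # Strip this prefix from the tag to recover the log file name
    Kube_Tag_Prefix  kube.var.log.containers.
    # If the log field holds a JSON map, lift its keys into the record
    Merge_Log        On
```

With this in place, each record gains a kubernetes map containing fields such as pod_name, namespace_name, container_name, and the pod’s labels.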

The real power comes from Fluent Bit’s ability to parse and enrich logs. You can define custom parsers for application-specific log formats, or use built-in parsers for common ones like Nginx, Apache, or JSON. You can also add filters to modify log records before they are sent to the output, such as adding environment variables or dropping sensitive fields.
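As a small illustration of such a filter, the modify filter can add and remove keys. In this sketch the NODE_NAME environment variable and the user field are assumptions: NODE_NAME would have to be set on the Fluent Bit container (for example from the node name), and user stands in for whatever field you consider sensitive.

```ini
[FILTER]
    Name    modify
    Match   kube.*
    # Add a field from an environment variable (NODE_NAME is hypothetical)
    Add     node ${NODE_NAME}
    # Drop a field that may hold sensitive data (field name is an example)
    Remove  user
```

Add only sets the key when it is not already present, and Remove is a no-op when the key is missing, so the filter is safe to apply to records of mixed shape.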

The next thing you’ll want to tackle is handling log rotation and ensuring you don’t lose logs if Fluent Bit itself crashes.
