Kubernetes itself doesn’t provide long-term log storage; the container runtime writes each container’s stdout and stderr to files on the node, and those files are cleaned up once the pod is deleted.
Let’s see how Fluent Bit can grab those ephemeral logs and send them somewhere useful, like Elasticsearch.
Imagine this: a simple Nginx pod spitting out access logs.
    apiVersion: v1
    kind: Pod
    metadata:
      name: nginx-pod
    spec:
      containers:
        - name: nginx
          image: nginx:latest
          ports:
            - containerPort: 80
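The official nginx image symlinks /var/log/nginx/access.log to /dev/stdout (and error.log to /dev/stderr), so the access logs end up on the container’s stdout. The container runtime writes those stdout/stderr streams to files on the node, which is what kubectl logs reads. If you want those logs parsed into structured fields and shipped off the node, you need a log collector. That’s where Fluent Bit comes in.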
Fluent Bit runs as a DaemonSet on your Kubernetes cluster. This means one Fluent Bit pod runs on each node, ensuring that no matter where your application pods land, there’s a Fluent Bit agent there to watch them.
The core idea is that Fluent Bit uses "inputs" to collect logs and "outputs" to send them. For Kubernetes, the most common input is tail for log files.
Here’s a simplified Fluent Bit configuration for collecting Nginx logs from a file:
    [SERVICE]
        Flush        1
        Daemon       Off
        Log_Level    info
        Parsers_File /fluent-bit/etc/parsers.conf
    [INPUT]
        Name             tail
        Tag              kube.*
        Path             /var/log/containers/nginx-pod_default_nginx-*.log
        Parser           nginx
        DB               /var/log/flb_kube.db
        Mem_Buf_Limit    1MB
        Skip_Long_Lines  On
        Refresh_Interval 10
    [OUTPUT]
        Name            es
        Match           kube.*
        Host            elasticsearch.logging.svc.cluster.local
        Port            9200
        Logstash_Format On
        Logstash_Prefix kubernetes
        Retry_Limit     False
Let’s break this down.
The [SERVICE] section is pretty standard for Fluent Bit. Flush 1 means it flushes collected records to the outputs every second. Daemon Off keeps it in the foreground, which is what you want in a container. Log_Level info sets the verbosity of Fluent Bit’s own logging, and Parsers_File points to the file where parsers such as nginx are defined.
The [INPUT] section is where the magic happens.
- Name tail: We’re using the tail input plugin, which reads from files much like a tail -f command.
- Tag kube.*: This is a crucial part. Fluent Bit uses tags to route records. In the tail plugin, a * in the tag is replaced by the path of the file being read, with slashes turned into dots, so records from our pod are tagged kube.var.log.containers.nginx-pod_default_nginx-<container ID>.log.
- Path /var/log/containers/nginx-pod_default_nginx-*.log: The files to read, as a glob. Container logs are symlinked into /var/log/containers/ on the node under the name <pod name>_<namespace>_<container name>-<container ID>.log, so the * here stands in for the container ID. Note that this is a path on the host node, not inside the application container. The Fluent Bit pod can see it because the DaemonSet mounts the host’s log directory into it (via a hostPath volume).
- Parser nginx: This tells Fluent Bit to use a named parser to structure each log line. The nginx parser is defined in parsers.conf and splits an access log line into fields such as the client address, timestamp, method, path, status code, response size, referer, and user agent. Keep in mind that the container runtime wraps each line before writing it to disk (JSON under Docker, a timestamp and stream prefix under containerd and CRI-O), so in a real cluster you would usually decode that wrapper first with the docker or cri parser and then apply the nginx parser to the extracted message.
- DB /var/log/flb_kube.db: This is a SQLite database file where Fluent Bit records which files it is tracking and how far it has read in each. If Fluent Bit restarts, it can pick up where it left off, provided the file lives on storage that survives the restart.
- Mem_Buf_Limit 1MB: This caps the memory this input may use for records that have not been flushed yet. When the limit is reached, the input pauses reading until data has been flushed.
- Skip_Long_Lines On: If a line is longer than the read buffer (Buffer_Max_Size), Fluent Bit skips that line and keeps reading the file instead of giving up on it.
- Refresh_Interval 10: How often, in seconds, Fluent Bit rescans Path for new files that match the glob.
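To make the Parser nginx reference concrete, here is a sketch of what the matching entry in parsers.conf can look like. It is modeled on the nginx parser that ships in Fluent Bit's default parsers.conf; treat the regex as illustrative and check it against the log_format your Nginx actually uses.

```text
[PARSER]
    # Name referenced by "Parser nginx" in the [INPUT] section
    Name        nginx
    Format      regex
    # Named groups become fields of the structured record
    Regex       ^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)")
    # Use the captured "time" field as the record timestamp
    Time_Key    time
    Time_Format %d/%b/%Y:%H:%M:%S %z
```

A line that does not match the regex is not dropped; it is forwarded unparsed, which is a quick way to spot a format mismatch in Elasticsearch.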
The [OUTPUT] section defines where the logs go.
- Name es: We’re using the Elasticsearch output plugin.
- Match kube.*: This is the routing rule. It means "send any record whose tag starts with kube. to this output." The tail input produces tags starting with kube.var.log.containers., so its records match.
- Host elasticsearch.logging.svc.cluster.local: The hostname of your Elasticsearch cluster.
- Port 9200: The port Elasticsearch is listening on.
- Logstash_Format On: This makes the output follow Logstash conventions: records go to a date-stamped index and get an @timestamp field.
- Logstash_Prefix kubernetes: This sets the prefix of the index name, helping to organize your logs (e.g., kubernetes-YYYY.MM.DD).
- Retry_Limit False: This tells Fluent Bit to keep retrying indefinitely if it can’t send logs to Elasticsearch.
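You’d typically put this configuration in a ConfigMap and mount it into a Fluent Bit DaemonSet. The Path in the [INPUT] section and the Host and Port in the [OUTPUT] section are the most common things to adjust. You also need to give the Fluent Bit pod access to the host’s log files, usually by mounting /var/log with a hostPath volume (plus /var/lib/docker/containers on nodes that use Docker, since that is where the symlinks ultimately resolve). Mounting /var/log from the host also means the DB file at /var/log/flb_kube.db survives a restart of the Fluent Bit pod.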
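A minimal sketch of such a DaemonSet follows. The names, the logging namespace, the fluent-bit-config ConfigMap, and the image tag are assumptions made for illustration; none of them come from the configuration above.

```yaml
# Sketch only: names, namespace, ConfigMap, and image tag are hypothetical.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      containers:
        - name: fluent-bit
          image: fluent/fluent-bit:latest   # pin a specific version in practice
          volumeMounts:
            # Host log directory: contains /var/log/containers and the DB file
            - name: varlog
              mountPath: /var/log
            # Directory the image reads its configuration from by default
            - name: config
              mountPath: /fluent-bit/etc/
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: config
          configMap:
            name: fluent-bit-config   # assumed to hold fluent-bit.conf and parsers.conf
```

Because the ConfigMap is mounted over /fluent-bit/etc/, it hides the files shipped in the image, so it needs to contain parsers.conf as well as fluent-bit.conf. The /var/log mount is left writable so Fluent Bit can create its DB file there.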
In a Kubernetes cluster, the files the tail input watches under /var/log/containers/ are symlinks. They point to the actual log files written by the container runtime (like containerd or Docker) elsewhere on the node, and Fluent Bit follows them to read each container’s output. When the input’s Tag contains a *, Fluent Bit fills it in with the path of the file being read, replacing slashes with dots, so every record’s tag identifies its source file. It’s this tag that the Match directive in the output is compared against.
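To attach metadata like the pod name, namespace, and container name to each record, Fluent Bit relies on its kubernetes filter. The filter extracts those names from the tag, which in turn comes from the log file path, and can look up the pod’s labels and annotations from the Kubernetes API server. This metadata is invaluable for filtering and searching logs later. The simplified configuration above does not include that filter, so you would add it between the input and the output. Logstash_Format only affects index naming and the @timestamp field; the metadata travels as ordinary fields of the record, which Elasticsearch indexes like any others.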
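Here is a minimal sketch of the kubernetes filter, the component that attaches this metadata. It assumes records are tagged kube.var.log.containers.<file name>, which is what a tail input with Tag kube.* produces; the option values shown are illustrative.

```text
[FILTER]
    Name            kubernetes
    # Apply to every record coming from the container log files
    Match           kube.*
    # Part of the tag to strip before parsing pod, namespace, and container names
    Kube_Tag_Prefix kube.var.log.containers.
    # If the log message is itself JSON, lift its keys into the record
    Merge_Log       On
    # Drop the raw "log" field once it has been merged
    Keep_Log        Off
```

For the API lookups to work, the Fluent Bit pod needs a service account that is allowed to read pods and namespaces. Without that, the filter can still derive the names from the tag but cannot add labels or annotations.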
The real power comes from Fluent Bit’s ability to parse and enrich logs. You can define custom parsers for application-specific log formats, or use built-in parsers for common ones like Nginx, Apache, or JSON. You can also add filters to modify log records before they are sent to the output, such as adding environment variables or dropping sensitive fields.
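As one possible illustration of such a filter, the sketch below uses the record_modifier filter to add a field and remove another. The CLUSTER_NAME environment variable and the authorization field name are hypothetical and stand in for whatever applies in your setup.

```text
[FILTER]
    Name        record_modifier
    Match       kube.*
    # Add a field whose value comes from an environment variable (CLUSTER_NAME is hypothetical)
    Record      cluster ${CLUSTER_NAME}
    # Drop a field that may carry secrets (the field name is hypothetical)
    Remove_key  authorization
```

Filters run in the order they appear in the configuration file, so a filter like this would normally come after any parsing or enrichment step whose fields it modifies.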
The next thing you’ll want to tackle is handling log rotation and ensuring you don’t lose logs if Fluent Bit itself crashes.