Fluentd on Kubernetes is resilient, but its core strength is that it parses, filters, and enriches logs before they reach your primary storage, rather than acting as a dumb pipe.
Let’s see it in action. Imagine you have a Kubernetes cluster and you want to collect logs from all your application pods, parse them, and send them to Elasticsearch.
First, you’ll need to add the Fluentd Helm repository.
helm repo add fluent https://fluent.github.io/helm-charts
helm repo update
Now, let’s install Fluentd using the official chart. We’ll configure it to send logs to a local Elasticsearch instance that’s also running in Kubernetes.
# fluentd-values.yaml
elasticsearch:
  enabled: true
  host: elasticsearch-master  # Assuming your Elasticsearch service is named this
  port: 9200
  index: "kubernetes-logs-%Y.%m.%d"
serviceAccount:
  create: true
  name: fluentd
rbac:
  create: true
image:
  repository: fluent/fluentd-kubernetes-daemonset
  tag: v1.14.2-debian-elasticsearch7-1.1
resources:
  requests:
    cpu: 200m
    memory: 200Mi
  limits:
    cpu: 500m
    memory: 500Mi
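Apply these values when you install the chart. Chart names vary between repositories, so run helm search repo fluent first and confirm that the chart name used here is the one your repository actually publishes: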
helm install fluentd fluent/fluentd-kubernetes-daemonset -f fluentd-values.yaml
This command deploys Fluentd as a DaemonSet, so one Fluentd pod runs on every node and collects logs from the containers on that node. The fluentd-values.yaml file points Fluentd's output at Elasticsearch: elasticsearch.enabled: true turns that output on, and elasticsearch.host and elasticsearch.port name the service to connect to. This walkthrough assumes Elasticsearch is already running in the cluster. Whether elasticsearch.enabled also provisions a basic Elasticsearch for you depends on the chart, so check its values.yaml before relying on that behavior.
The Fluentd DaemonSet pods mount the host's /var/log directory (or wherever your container runtime stores logs) as a hostPath volume and use the tail input plugin to read new log entries as they are written. Each entry is then parsed according to a built-in format or a custom parser configuration. For example, if your application writes its logs as JSON, Fluentd can parse them into structured fields instead of storing one opaque string.
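To make the parsing step concrete, here is a minimal Python sketch (not Fluentd code) of what happens to a single line. It assumes the Docker json-file log format, where every line is a JSON object with log, stream, and time fields. The function name parse_container_line is made up for illustration, and in Fluentd itself unwrapping a JSON payload nested inside the log field usually takes an extra parser filter.

```python
import json
from datetime import datetime, timezone


def parse_container_line(line: str) -> dict:
    """Parse one json-file log line and, if the payload is JSON, merge it in."""
    outer = json.loads(line)
    record = {"stream": outer["stream"]}

    # The runtime writes nanosecond timestamps. Python's datetime keeps only
    # microseconds, so parse the whole seconds and trim the fraction by hand.
    stamp = outer["time"].rstrip("Z")
    whole, _, fraction = stamp.partition(".")
    parsed = datetime.strptime(whole, "%Y-%m-%dT%H:%M:%S")
    micros = int((fraction or "0")[:6].ljust(6, "0"))
    record["time"] = parsed.replace(microsecond=micros, tzinfo=timezone.utc)

    payload = outer["log"].rstrip("\n")
    try:
        app = json.loads(payload)
    except json.JSONDecodeError:
        app = None

    if isinstance(app, dict):
        record.update(app)        # structured application log: keep its fields
    else:
        record["log"] = payload   # plain-text application log: keep the string
    return record
```

A JSON payload such as {"level": "info", "msg": "started"} ends up as separate level and msg fields, while a plain-text payload stays in a single log field.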
Here’s a peek at the internal workings. The fluentd-kubernetes-daemonset chart deploys a specific configuration for Fluentd. The core of this configuration is a fluentd.conf file that defines input, filter, and output plugins.
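# Example snippet from fluentd.conf within the chart
<source>
  @type tail
  path /var/log/containers/*.log
  # in_tail stores its read offsets in the file named by pos_file
  pos_file /var/log/td-agent/pos/containers.pos
  tag kubernetes.*
  <parse>
    @type json
    time_key time
    time_format %Y-%m-%dT%H:%M:%S.%NZ
  </parse>
</source>
<filter kubernetes.**>
  @type kubernetes_metadata
</filter>
<match kubernetes.var.log.containers.**>
  @type elasticsearch
  host elasticsearch-master
  port 9200
  index_name kubernetes-logs-%Y.%m.%d
  # Time placeholders in index_name are resolved from the buffer chunk's
  # time key, so the buffer must be chunked by time.
  <buffer time>
    timekey 1d
    flush_mode interval
    flush_interval 5s
  </buffer>
</match>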
The tail source reads log files from /var/log/containers/. In the tag kubernetes.*, the asterisk is replaced by the path of the file being read, with slashes turned into dots, so records arrive tagged kubernetes.var.log.containers.<file name>. That is why the match pattern starts with kubernetes.var.log.containers. The parse directive tells Fluentd to expect JSON lines and how to read the timestamp in the time field. The kubernetes_metadata filter is crucial: it enriches each log record with Kubernetes metadata such as pod name, namespace, labels, and container name. Finally, the match directive sends the records to Elasticsearch, writing them to one index per day.
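The tag and match mechanics can be sketched in a few lines of Python. This is an illustration of the rules, not Fluentd's implementation: expand_tag and tag_matches are hypothetical helper names, and the matcher covers only the * (exactly one tag part) and ** (zero or more tag parts) wildcards.

```python
def expand_tag(tag_template: str, path: str) -> str:
    """Replace '*' in the tag with the file path, using dots for slashes."""
    return tag_template.replace("*", path.strip("/").replace("/", "."))


def _match(pattern_parts: list, tag_parts: list) -> bool:
    if not pattern_parts:
        return not tag_parts
    head, rest = pattern_parts[0], pattern_parts[1:]
    if head == "**":
        # '**' may consume zero or more tag parts.
        return any(_match(rest, tag_parts[i:]) for i in range(len(tag_parts) + 1))
    if not tag_parts:
        return False
    if head == "*" or head == tag_parts[0]:
        return _match(rest, tag_parts[1:])
    return False


def tag_matches(pattern: str, tag: str) -> bool:
    """Simplified matcher for Fluentd-style tag patterns."""
    return _match(pattern.split("."), tag.split("."))


tag = expand_tag("kubernetes.*", "/var/log/containers/web-1_default_app.log")
print(tag)
print(tag_matches("kubernetes.var.log.containers.**", tag))
```

Note that a pattern such as kubernetes.* would not match this tag, because a single asterisk stands for exactly one dot-separated part.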
The most surprising thing about Fluentd's Kubernetes integration is how seamlessly it attaches container and pod metadata without requiring any application-level instrumentation. In the json-file format assumed here, the log lines themselves carry only the message, the stream, and a timestamp. The Kubernetes context comes from the file names under /var/log/containers/, which encode the pod name, namespace, container name, and container ID. Because the tag is derived from the file path, the kubernetes_metadata filter can extract those values from the tag and then query the Kubernetes API server for further details such as labels and annotations, which is why the values file creates a service account and RBAC rules. The result is a log record that is easy to search and analyze by Kubernetes context.
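As a rough illustration of the file-name step, the sketch below pulls the identifying fields out of a path shaped like <pod>_<namespace>_<container>-<container ID>.log. The regular expression is a simplified stand-in for the stricter one the filter uses, and metadata_from_path and the flat result keys are hypothetical names chosen for this example.

```python
import re
from typing import Optional

# Simplified pattern: pod and namespace contain no underscores, and the
# container ID is assumed to be 64 lowercase alphanumeric characters.
CONTAINER_LOG = re.compile(
    r"(?P<pod_name>[^_]+)_(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[a-z0-9]{64})\.log$"
)


def metadata_from_path(path: str) -> Optional[dict]:
    """Return the fields encoded in a container log file name, or None."""
    file_name = path.rsplit("/", 1)[-1]
    match = CONTAINER_LOG.match(file_name)
    return match.groupdict() if match else None


example = "/var/log/containers/web-7d9c_default_app-" + "a" * 64 + ".log"
print(metadata_from_path(example))
```

Labels and annotations are not part of the file name, so they cannot be recovered this way. They have to be looked up from the Kubernetes API.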
Once deployed, you can verify logs are flowing by checking your Elasticsearch index. You’ll see documents appearing with fields like kubernetes.pod_name, kubernetes.namespace_name, and kubernetes.container_name, alongside your actual log messages.
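One way to run that check is a small script against the Elasticsearch search API. The sketch below is built on assumptions: Elasticsearch is reachable at a base URL you supply (for example http://localhost:9200 through a port-forward), no authentication is required, and the index follows the kubernetes-logs-%Y.%m.%d pattern configured earlier. The helper names daily_index, search_request, and fetch_hits are made up for illustration.

```python
import json
import urllib.request
from datetime import date


def daily_index(day: date, pattern: str = "kubernetes-logs-%Y.%m.%d") -> str:
    """Resolve the daily index name for a given date."""
    return day.strftime(pattern)


def search_request(base_url: str, day: date, size: int = 5) -> urllib.request.Request:
    """Build a search for documents that carry Kubernetes metadata."""
    body = {"size": size, "query": {"exists": {"field": "kubernetes.pod_name"}}}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/{daily_index(day)}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_hits(base_url: str, day: date) -> list:
    """Send the search and return the matching documents (needs network access)."""
    with urllib.request.urlopen(search_request(base_url, day), timeout=10) as resp:
        return json.load(resp)["hits"]["hits"]
```

Calling fetch_hits("http://localhost:9200", date.today()) should return a handful of enriched documents once logs are flowing. An empty list suggests that either nothing has been indexed for that day or the metadata filter is not enriching records.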
The next logical step is to secure this log stream with TLS.