Kubernetes Pods can generate a lot of logs, and without context, they’re just a wall of text. This article shows you how to automatically add namespace and label metadata to your pod logs, making them far easier to search and debug.
Let’s see this in action. Imagine you have a simple Nginx deployment.
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: nginx-app
      labels:
        app: nginx
    spec:
      replicas: 2
      selector:
        matchLabels:
          app: nginx
      template:
        metadata:
          labels:
            app: nginx
            environment: staging
        spec:
          containers:
            - name: nginx
              image: nginx:latest
              ports:
                - containerPort: 80
When Nginx hits a configuration error at startup, the raw line it writes looks like this by default:

    2023/10/27 10:30:00 [emerg] 1#1: unknown directive "daemon" in /etc/nginx/nginx.conf:3
After enrichment, the same log line might look like this (depending on your logging agent configuration):
    {
      "log": "2023/10/27 10:30:00 [emerg] 1#1: unknown directive \"daemon\" in /etc/nginx/nginx.conf:3\n",
      "stream": "stderr",
      "time": "2023-10-27T10:30:00.123456789Z",
      "kubernetes": {
        "namespace_name": "default",
        "pod_name": "nginx-app-abcdef-12345",
        "container_name": "nginx",
        "labels": {
          "app": "nginx",
          "environment": "staging"
        }
      }
    }
Notice how namespace_name, pod_name, container_name, and labels are now part of the log record. This is typically handled by a cluster-level logging agent, like Fluentd, Fluent Bit, or the Vector agent, often deployed as a DaemonSet on each node.
The core idea is that the logging agent running on the node inspects the running containers and their associated Kubernetes metadata. It then "tags" or "appends" this metadata to the log records it collects before sending them to your central logging backend (like Elasticsearch, Loki, or Splunk).
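The enrichment step can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular agent is implemented. It assumes the kubelet's usual `/var/log/containers/<pod>_<namespace>_<container>-<container-id>.log` naming, and the `enrich` helper and `labels_by_pod` lookup table are hypothetical stand-ins for the agent's metadata lookup.

```python
import re

# Assumed kubelet naming: <pod>_<namespace>_<container>-<64-hex-id>.log
LOG_NAME = re.compile(
    r"(?P<pod_name>[^_/]+)_(?P<namespace_name>[^_]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def parse_log_path(path):
    """Extract pod, namespace, and container names from a container log path."""
    match = LOG_NAME.search(path)
    if match is None:
        raise ValueError(f"unrecognized container log path: {path}")
    return match.groupdict()


def enrich(record, path, labels_by_pod):
    """Return a copy of the record with a 'kubernetes' block attached."""
    meta = parse_log_path(path)
    key = (meta["namespace_name"], meta["pod_name"])
    # Hypothetical lookup table; a real agent would ask the API server.
    meta["labels"] = dict(labels_by_pod.get(key, {}))
    return {**record, "kubernetes": meta}


labels_by_pod = {
    ("default", "nginx-app-abcdef-12345"): {"app": "nginx", "environment": "staging"},
}
path = "/var/log/containers/nginx-app-abcdef-12345_default_nginx-" + "a" * 64 + ".log"
enriched = enrich({"log": "hello\n", "stream": "stderr"}, path, labels_by_pod)
print(enriched["kubernetes"]["namespace_name"], enriched["kubernetes"]["labels"])
```

The original record is copied rather than mutated, so the raw line stays available next to the added metadata.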
Here’s how you’d typically configure this with Fluent Bit, a popular choice for Kubernetes logging. The DaemonSet mounts a ConfigMap whose fluent-bit.conf pulls in an input plugin (e.g., tail for log files), a filter plugin, and an output plugin (e.g., es for Elasticsearch). The magic happens in the filter plugin.
    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: fluent-bit-config
      namespace: kube-system
    data:
      fluent-bit.conf: |
        [SERVICE]
            Flush         5
            Daemon        Off
            Log_Level     info
            Parsers_File  parsers.conf
        @INCLUDE input-kubernetes.conf
        @INCLUDE filter-kubernetes.conf
        @INCLUDE output-elasticsearch.conf
      input-kubernetes.conf: |
        [INPUT]
            Name              tail
            Tag               kube.*
            Path              /var/log/containers/*.log
            # The docker parser expects Docker's JSON log files;
            # containerd and CRI-O nodes need the cri parser instead.
            Parser            docker
            DB                /var/log/flb_kube.db
            Mem_Buffer_Limit  10MB
            Skip_Long_Lines   On
            Refresh_Interval  10
      filter-kubernetes.conf: |
        [FILTER]
            Name                 kubernetes
            Match                kube.*
            Kube_URL             https://kubernetes.default.svc:443
            Kube_CA_File         /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
            Kube_Token_File      /var/run/secrets/kubernetes.io/serviceaccount/token
            Merge_Log            On
            Merge_Log_Key        log_processed
            K8S-Logging.Parser   On
            K8S-Logging.Exclude  Off
            Labels               On
            Annotations          Off
      output-elasticsearch.conf: |
        [OUTPUT]
            Name             es
            Match            kube.*
            Host             elasticsearch.logging.svc.cluster.local
            Port             9200
            Logstash_Format  On
            Replace_Dots     On
            Retry_Limit      False
In this filter-kubernetes.conf:
- Name kubernetes: This tells Fluent Bit to use its built-in Kubernetes filter.
- Match kube.*: Applies this filter to logs tagged with kube.*, which our tail input does.
- Kube_URL, Kube_CA_File, Kube_Token_File: These point to the Kubernetes API server and the credentials used to reach it, allowing the agent to fetch metadata for the pods whose logs it’s reading. The agent runs with a Service Account that has permission to read pods and namespaces from the API.
- Labels On: This is the key setting that tells the filter to append the Kubernetes labels associated with the pod.
- Annotations Off: We’re not enriching with annotations in this example, but you could turn this On too.
- Merge_Log On: When the log field itself contains JSON, the filter parses it and adds the parsed keys to the record, so your backend can index them as fields. With Merge_Log_Key log_processed, those keys are nested under log_processed instead of being placed at the top level. Plain-text lines, like the Nginx example above, are left as they are.
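As a rough illustration of what Merge_Log and Merge_Log_Key do when a container writes JSON lines, here is a small Python sketch. The `merge_log` function is hypothetical and only mimics the documented behaviour; it is not Fluent Bit's implementation, and it keeps the original log field for simplicity.

```python
import json


def merge_log(record, merge_key=None):
    """Lift JSON found in the 'log' field into the record, if there is any."""
    try:
        parsed = json.loads(record.get("log", ""))
    except (TypeError, ValueError):
        return dict(record)  # not JSON: leave the record as it is
    if not isinstance(parsed, dict):
        return dict(record)  # JSON, but not an object: nothing to merge
    merged = dict(record)
    if merge_key:
        merged[merge_key] = parsed  # comparable to Merge_Log_Key log_processed
    else:
        merged.update(parsed)  # parsed keys land at the top level
    return merged


plain = merge_log({"log": "GET / 200\n"}, "log_processed")
structured = merge_log({"log": '{"level": "error", "msg": "boom"}\n'}, "log_processed")
print(plain)
print(structured)
```

Nesting under a dedicated key avoids collisions between application fields and fields the agent adds, at the cost of slightly longer field names in queries.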
The most surprising thing about this process is how much the logging agent, running as a DaemonSet on each Kubernetes node, does beyond reading log files. It is an active client of the Kubernetes API: the filter works out the pod name, namespace, and container name from the log file’s name, then asks the API server for that pod’s labels and annotations. The results are cached per pod, so the API is not queried for every line. One consequence is that a record carries the metadata the agent holds when the line is processed, which can differ from the pod’s metadata at the moment the line was written.
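One way to keep those API lookups cheap is to cache the result per pod. The sketch below is a hypothetical illustration of that pattern rather than code from any agent: `PodMetadataCache`, its `fetch` callback, and the optional `ttl_seconds` are all assumed names, and `fake_fetch` stands in for a real API call.

```python
import time


class PodMetadataCache:
    """Cache pod metadata so the API is asked once per pod, not once per line."""

    def __init__(self, fetch, ttl_seconds=None, clock=time.monotonic):
        self._fetch = fetch      # callable(namespace, pod) -> dict (hypothetical)
        self._ttl = ttl_seconds  # None means entries never expire
        self._clock = clock
        self._entries = {}

    def get(self, namespace, pod):
        key = (namespace, pod)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None:
            fetched_at, meta = hit
            if self._ttl is None or now - fetched_at < self._ttl:
                return meta
        # Cache miss or expired entry: fetch again and remember when.
        meta = self._fetch(namespace, pod)
        self._entries[key] = (now, meta)
        return meta


calls = []


def fake_fetch(namespace, pod):
    """Stand-in for a Kubernetes API request; records each call."""
    calls.append((namespace, pod))
    return {"labels": {"app": "nginx", "environment": "staging"}}


cache = PodMetadataCache(fake_fetch)
for _ in range(1000):
    meta = cache.get("default", "nginx-app-abcdef-12345")
print(len(calls))
```

With no expiry, a label edited on a running pod would not show up in new records until the agent restarts. A finite `ttl_seconds` trades a few extra API calls for fresher metadata.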
Once you have this set up, you can query your logs using your backend’s query language. For example, in Elasticsearch using Kibana, you could search for all logs from pods with app: nginx in the staging environment:

    kubernetes.labels.app: "nginx" AND kubernetes.labels.environment: "staging"
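The same filter can also be sent to Elasticsearch programmatically. The following sketch builds a search body using the `query_string` query type. The `label_query` helper is a hypothetical name, and it assumes label keys and values contain no characters that need escaping in query-string syntax.

```python
import json


def label_query(labels, size=50):
    """Build an Elasticsearch search body matching every given pod label."""
    clauses = [
        f'kubernetes.labels.{key}:"{value}"'
        for key, value in sorted(labels.items())
    ]
    return {
        "size": size,
        "query": {"query_string": {"query": " AND ".join(clauses)}},
    }


body = label_query({"app": "nginx", "environment": "staging"})
print(json.dumps(body, indent=2))
```

The resulting body would be posted to the `_search` endpoint of whichever index pattern your output plugin writes to.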
This turns a chaotic stream of logs into a structured, searchable dataset, allowing you to quickly pinpoint issues within specific applications and environments.
The next step you’ll likely encounter is handling multiline logs, like stack traces, and ensuring consistent log formatting across all your applications.