A DaemonSet ensures that a copy of a pod runs on all (or a subset of) nodes in your cluster, making it the go-to for node-level agents.
Let’s see it in action. Imagine you want to run a log collector, like Fluentd, on every node in your Kubernetes cluster.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-daemonset
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      name: fluentd
  template:
    metadata:
      labels:
        name: fluentd
    spec:
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.14-debian
        resources:
          limits:
            cpu: 200m
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 100Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: fluentd-data
          mountPath: /var/lib/fluentd
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: fluentd-data
        hostPath:
          path: /var/lib/fluentd
When you apply this DaemonSet manifest:
kubectl apply -f fluentd-daemonset.yaml
Kubernetes doesn’t just create one pod; by default it schedules one Fluentd pod on every node that the pod is allowed to run on (nodes with taints the pod doesn’t tolerate are skipped). You can verify this:
kubectl get pods -o wide | grep fluentd
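You’ll see a fluentd pod running on each of your worker nodes, each collecting logs from that specific node’s /var/log directory, thanks to the hostPath volume. Control plane nodes are usually tainted, so the manifest above won’t place a pod there unless you add a matching toleration.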
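If you also want logs from control plane nodes, the pod template needs a toleration for their taint. The fragment below is a minimal sketch that assumes your cluster uses the standard node-role.kubernetes.io/control-plane taint with the NoSchedule effect; check your nodes with kubectl describe node before relying on it.

```yaml
# Fragment to merge into the DaemonSet above (not a complete manifest).
spec:
  template:
    spec:
      tolerations:
      # Assumption: control plane nodes carry this taint in your cluster.
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
```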
The core problem a DaemonSet solves is distributing essential, node-specific services that need to run everywhere. Think of monitoring agents, log forwarders, storage daemons, or even security agents. Without a DaemonSet, you’d have to create a pod pinned to each node by hand and repeat that every time a node joins or leaves, which is obviously not scalable or manageable.
Internally, the DaemonSet controller watches for node additions and removals. When a new node joins the cluster, the controller automatically creates a new pod for that node. When a node is removed, the pods that were bound to it are garbage collected. You can also control which nodes the DaemonSet runs on using nodeSelector or affinity rules in the pod template (spec.template.spec), allowing you to target specific groups of nodes.
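As a rough illustration of node targeting, the fragment below restricts the DaemonSet to nodes carrying a particular label. The label logging=enabled is a hypothetical example, not something Kubernetes sets for you; you would apply it yourself, for instance with kubectl label nodes <node-name> logging=enabled.

```yaml
# Fragment to merge into the DaemonSet above (not a complete manifest).
spec:
  template:
    spec:
      nodeSelector:
        # Hypothetical label: only nodes labeled logging=enabled get a pod.
        logging: enabled
```

Because the controller keeps watching nodes, adding the label to a node later should cause a pod to be created there, and removing it should cause that pod to be deleted.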
The updateStrategy field is a crucial lever you control. By default, its type is RollingUpdate, which is generally what you want; the alternative, OnDelete, only replaces a pod after you delete it yourself. Under updateStrategy.rollingUpdate you can configure maxUnavailable (default 1) and maxSurge (default 0) to control how the pods are updated across your nodes. For instance, setting maxUnavailable: 1 means only one node’s pod is down at a time during an update, so the agent keeps running on every other node.
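Spelled out in the manifest, the rolling update settings described above might look like the following sketch. The values simply make the defaults explicit; tune maxUnavailable to how many nodes you can afford to have without the agent at once.

```yaml
# Fragment to merge into the DaemonSet above (not a complete manifest).
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      # At most one node is without a running agent pod during an update.
      maxUnavailable: 1
```

You can follow the progress of an update with kubectl rollout status daemonset/fluentd-daemonset.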
What most people don’t realize is how terminationGracePeriodSeconds in the DaemonSet’s pod template (spec.template.spec) directly impacts the shutdown behavior of your node-level agents. It defaults to 30 seconds. If your agent needs to flush logs or shut down gracefully, a grace period that is too short can lead to data loss or incomplete operations: once it expires, the container is killed outright, whether the pod is being replaced during a rolling update or evicted when a node is drained. Make sure this value is long enough for your agent’s shutdown sequence to complete.
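The shutdown grace period for a pod is set with terminationGracePeriodSeconds in the pod spec, next to the containers list. The sketch below uses 60 seconds purely as an illustrative value; the right number depends on how long your agent actually takes to flush its buffers.

```yaml
# Fragment to merge into the DaemonSet above (not a complete manifest).
spec:
  template:
    spec:
      # Illustrative value; the Kubernetes default is 30 seconds.
      terminationGracePeriodSeconds: 60
      containers:
      - name: fluentd
        image: fluent/fluentd:v1.14-debian
```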
The next step is often configuring the DaemonSet to run on a subset of nodes using node selectors or affinity.