Fix Kubernetes Node Not Ready Errors (2026)

A node in the NotReady state (reported as a NodeNotReady event) means the control plane can no longer confirm the node’s health: either the kubelet on that node has stopped sending heartbeats to the Kubernetes API server, or the kubelet itself is reporting that the node is not ready. The scheduler stops placing new pods on the node, and if the condition persists, existing pods are evicted and rescheduled elsewhere.
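Before digging into causes, it helps to confirm which nodes are affected. The following is a minimal sketch, assuming kubectl is configured for the cluster; the helper name not_ready_nodes is made up for illustration. It filters the output of kubectl get nodes so that only nodes whose status is not Ready are printed.

```shell
# Sketch: list nodes whose STATUS column is neither "Ready" nor "Ready,<flag>".
# Reads the output of `kubectl get nodes --no-headers` on stdin, so the
# filter itself does not need cluster access.
not_ready_nodes() {
  awk '$2 != "Ready" && $2 !~ /^Ready,/ { print $1 }'
}

# Typical use (assumes kubectl is configured for the cluster):
#   kubectl get nodes --no-headers | not_ready_nodes
#   kubectl describe node <node-name>   # then read the Conditions section
```

The Conditions section of kubectl describe node usually states why the node is not ready, which narrows down which of the causes below applies.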

The most common cause is the kubelet service crashing or becoming unresponsive. Check its status with systemctl status kubelet, and read its recent logs with journalctl -u kubelet. If the service is inactive or failed, restart it with systemctl restart kubelet. This works because the kubelet is the node agent responsible for reporting node status and managing pods; while it is down, no heartbeats reach the control plane.
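The status check and log lookup can be combined into one small helper. This is a sketch under the assumption that the node uses systemd; the function name check_unit is hypothetical, while systemctl is-active and journalctl -u are standard systemd commands.

```shell
# Sketch: report whether a systemd unit is active; if not, show recent logs.
# The helper name check_unit is illustrative, not part of any tool.
check_unit() {
  if systemctl is-active --quiet "$1"; then
    echo "$1: active"
  else
    echo "$1: not active"
    # Last 50 journal lines for the unit, without opening a pager.
    journalctl -u "$1" --no-pager -n 50
    return 1
  fi
}

# Typical use on the affected node (reading the journal may need root):
#   check_unit kubelet
```

Reading the logs before restarting is worthwhile: if the kubelet is crash-looping because of a bad certificate or config value, a restart alone will not help.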

Network connectivity issues between the node and the control plane are another frequent culprit. The kubelet needs to reach the API server over port 6443 (or your configured API server port). Use ping <api-server-ip> from the affected node to check basic reachability, keeping in mind that some networks block ICMP, so a failed ping alone is not conclusive. To test the port itself, run nc -vz <api-server-ip> 6443. If that fails, check your network security groups, firewalls, and routing tables to ensure traffic on port 6443 is allowed. Without a working connection to the API server, the kubelet cannot post its status, and the control plane marks the node NotReady.
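Rather than typing the API server address by hand, you can read it from the kubeconfig the kubelet uses. The sketch below assumes a kubeadm-style layout where that kubeconfig is /etc/kubernetes/kubelet.conf and its server entry is an http(s) URL with a hostname or IPv4 address; the function names api_endpoint and check_api are made up for this example.

```shell
# Sketch: print "host port" for the first `server:` entry in a kubeconfig.
# Assumes an http(s) URL with a hostname or IPv4 address (no IPv6 literal).
api_endpoint() {
  awk '$1 == "server:" { print $2; exit }' "${1:-/etc/kubernetes/kubelet.conf}" |
    sed -E 's#^https?://##; s#/.*$##' |
    awk -F: '{ print $1, ($2 != "" ? $2 : 443) }'
}

# Test TCP reachability of that endpoint with a 5-second timeout.
check_api() {
  endpoint=$(api_endpoint "$@")
  [ -n "$endpoint" ] || { echo "no server entry found" >&2; return 1; }
  # Split "host port" into positional parameters $1 and $2.
  set -- $endpoint
  nc -vz -w 5 "$1" "$2"
}

# Typical use on the affected node:
#   check_api                      # uses /etc/kubernetes/kubelet.conf
#   check_api /path/to/kubeconfig  # or an explicit kubeconfig path
```

Testing the exact endpoint from the kubeconfig also catches the case where the kubelet is pointed at a stale load balancer address.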

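Disk pressure can also take a node out of service. The kubelet monitors the filesystem that holds /var/lib/kubelet (which backs pod ephemeral storage) and the filesystem that holds container images. When free space drops below its eviction thresholds, the kubelet sets the DiskPressure condition and starts evicting pods; a completely full disk can also crash the kubelet or the container runtime, which leaves the node NotReady. Check disk usage with df -h. If a filesystem is over 90% full (by default, the kubelet starts evicting when less than 10% of the node filesystem is available), you’ll need to free up space. This might involve deleting unused images with docker image prune -a (if using Docker) or crictl rmi --prune (if using containerd/CRI-O), or cleaning up pod logs.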
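To spot the full filesystems quickly, the df output can be filtered by usage. This is a minimal sketch; the helper name full_filesystems is hypothetical, and it assumes mount points without spaces in their names.

```shell
# Sketch: print mount points whose usage is above a threshold (default 90%).
# Reads `df -P` output on stdin; -P keeps each filesystem on one line.
full_filesystems() {
  awk -v limit="${1:-90}" '
    NR > 1 {
      use = $5                 # Capacity column, e.g. "95%"
      sub(/%/, "", use)
      if (use + 0 > limit + 0) print $6, use "%"
    }'
}

# Typical use on the affected node:
#   df -P | full_filesystems 90
```

If nothing is printed, no filesystem is above the threshold and disk space is unlikely to be the cause.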

Container runtime issues are another significant cause. The kubelet relies on the container runtime (such as containerd or CRI-O, or Docker Engine through the cri-dockerd adapter on Kubernetes 1.24 and later) to start and stop containers. Check the status of your container runtime service, e.g., systemctl status containerd or systemctl status docker. If it’s not running, restart it: systemctl restart containerd or systemctl restart docker. If the kubelet cannot reach a healthy runtime, it reports the node as not ready.
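If you are unsure which runtime a node uses, the node object records it in status.nodeInfo.containerRuntimeVersion, in a form such as containerd://1.7.13. The sketch below maps that string to the systemd unit to inspect; the function name runtime_unit is made up, and the unit names are the common defaults, which your distribution may override.

```shell
# Sketch: map a containerRuntimeVersion string to the usual systemd unit name.
# Unit names are common defaults and may differ on your distribution.
runtime_unit() {
  case "$1" in
    containerd://*) echo containerd ;;
    cri-o://*)      echo crio ;;
    docker://*)     echo docker ;;
    *)              echo "unknown runtime: $1" >&2; return 1 ;;
  esac
}

# Typical use (assumes kubectl access; <node-name> is the affected node):
#   version=$(kubectl get node <node-name> \
#     -o jsonpath='{.status.nodeInfo.containerRuntimeVersion}')
#   systemctl status "$(runtime_unit "$version")"
```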

Incorrect kubelet configuration can lead to readiness problems. On kubeadm-provisioned nodes, the kubelet’s settings usually live in /var/lib/kubelet/config.yaml, while /etc/kubernetes/kubelet.conf is the kubeconfig that holds the API server address and the kubelet’s credentials. Verify that config.yaml is valid YAML and that parameters like clusterDNS and clusterDomain are accurate, and that the server field in kubelet.conf points to the correct API server endpoint. A typo or incorrect value in either file can prevent the kubelet from initializing properly or connecting to the cluster. After correcting the files, restart the kubelet: systemctl restart kubelet.
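One concrete check is to compare the kubelet’s clusterDNS value with the ClusterIP of the cluster DNS service. The sketch below assumes the block-style YAML list that kubeadm writes and a DNS service named kube-dns in kube-system (the kubeadm default); the function name kubelet_cluster_dns is hypothetical.

```shell
# Sketch: print the clusterDNS addresses from a KubeletConfiguration file.
# Assumes a block-style list ("clusterDNS:" followed by "- <ip>" lines).
kubelet_cluster_dns() {
  awk '
    /^clusterDNS:/              { in_list = 1; next }
    in_list && /^[[:space:]]*-/ { print $2; next }
    in_list                     { exit }
  ' "${1:-/var/lib/kubelet/config.yaml}"
}

# Typical use: the two values printed below should match.
#   kubelet_cluster_dns
#   kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}'
```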

Resource exhaustion on the node itself, beyond just disk, can impact the kubelet. High CPU or memory utilization can starve the kubelet process, causing it to miss its heartbeats. Use top or htop to check resource usage. If the node is consistently maxed out, you may need to add more resources to the node or reduce the workload by moving pods to other nodes. The kubelet, like any process, needs sufficient CPU and memory to function.
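For a quick number instead of an interactive view, the available memory can be read from /proc/meminfo. This is a sketch for Linux nodes; the function name mem_available_pct is made up, and the result is a rough indicator rather than the exact figure the kubelet uses for its own memory-pressure decisions.

```shell
# Sketch: print available memory as a whole-number percentage of total.
# Reads /proc/meminfo by default; a different file can be passed for testing.
mem_available_pct() {
  awk '
    /^MemTotal:/     { total = $2 }
    /^MemAvailable:/ { avail = $2 }
    END { if (total > 0) printf "%d\n", avail * 100 / total }
  ' "${1:-/proc/meminfo}"
}

# Typical use on the affected node:
#   mem_available_pct
# From a workstation, if metrics-server is installed in the cluster:
#   kubectl top node
```

A value in the low single digits suggests the node is close to memory exhaustion and the kubelet may be competing with workloads for what is left.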

Finally, issues with the CNI (Container Network Interface) plugin can also leave a node NotReady. If the CNI plugin fails to initialize or no CNI configuration is present on the node, the kubelet reports the node as not ready, typically with the reason NetworkPluginNotReady in the output of kubectl describe node. Check the logs of your CNI daemonset pods (e.g., Calico, Flannel, Cilium), which usually run in the kube-system namespace, although some installations use their own namespace such as calico-system. A common fix is to restart the CNI pods or, in some cases, redeploy the CNI. The kubelet waits for the network plugin to be operational before reporting the node as Ready.
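On the node itself, a quick sanity check is whether any CNI configuration file exists at all. The sketch below assumes the conventional configuration directory /etc/cni/net.d, which your runtime may be configured to override; the function name cni_configs is hypothetical.

```shell
# Sketch: list CNI config files in the conventional directory.
# /etc/cni/net.d is the usual default; your runtime config may point elsewhere.
cni_configs() {
  find "${1:-/etc/cni/net.d}" -maxdepth 1 -type f \
    \( -name '*.conf' -o -name '*.conflist' -o -name '*.json' \) 2>/dev/null
}

# Typical use on the affected node:
#   cni_configs
# Then, from a workstation, inspect the CNI pod scheduled on that node:
#   kubectl -n kube-system get pods -o wide
#   kubectl -n kube-system logs <cni-pod-name>
```

An empty result usually means the CNI daemonset pod never ran successfully on this node, since that pod is normally what writes the configuration file.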

Once the node is Ready again, the next problem you are likely to hit is ImagePullBackOff, which appears when your image registry is inaccessible or misconfigured.