The Kubernetes scheduler is refusing to place pods onto nodes because the nodes have taints that the pods do not tolerate.

Common Causes and Fixes

  1. Node taints preventing scheduling:

    • Diagnosis: Run kubectl describe node <node-name> and look for the Taints field. You’ll see entries like node-role.kubernetes.io/master:NoSchedule (node-role.kubernetes.io/control-plane:NoSchedule on newer clusters) or custom taints.
    • Fix: Add a toleration to your pod’s spec. For example, to tolerate a node-role.kubernetes.io/master:NoSchedule taint, add:
      spec:
        tolerations:
        - key: "node-role.kubernetes.io/master"
          operator: "Exists"
          effect: "NoSchedule"
      
      This tells the scheduler that this pod is allowed to run on nodes with this specific taint.
    • Why it works: Taints are applied to nodes to repel pods. Tolerations are applied to pods to allow them to be scheduled onto tainted nodes. Without a matching toleration, the scheduler will not place the pod on that node, and if every candidate node carries such a taint the pod stays Pending.
  2. NoExecute taint effect:

    • Diagnosis: Same as above, but observe the effect is NoExecute. This means not only will new pods not be scheduled, but existing pods on the node that don’t tolerate the taint will be evicted.
    • Fix: Add a toleration with the matching effect. For example, for a taint special-gpu=true:NoExecute, add:
      spec:
        tolerations:
        - key: "special-gpu"
          operator: "Equal"
          value: "true"
          effect: "NoExecute"
      
      This explicitly allows the pod to run on (or remain on) nodes with this NoExecute taint.
    • Why it works: The NoExecute effect is a stronger form of repulsion, actively evicting pods. Tolerating it prevents eviction and allows scheduling.
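    • Bounding the toleration: If the pod should remain on a tainted node only for a limited time, a NoExecute toleration can also carry the tolerationSeconds field. A minimal sketch, assuming the same special-gpu=true:NoExecute taint and an illustrative 300-second window (the number is an assumption, not a recommendation):

      ```yaml
      spec:
        tolerations:
        - key: "special-gpu"
          operator: "Equal"
          value: "true"
          effect: "NoExecute"
          # Assumed value for illustration: stay bound for 300 seconds
          # after the taint appears, then get evicted.
          tolerationSeconds: 300
      ```

      With this setting, a pod already running when the taint is added stays on the node for 300 seconds and is then evicted. Omitting tolerationSeconds means the pod tolerates the taint indefinitely.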
  3. Incorrect taint key or value:

    • Diagnosis: Double-check the exact key and value (if operator is Equal) in the node’s taints against your pod’s tolerations. Typos are common. For instance, a node might be tainted with gpu=true while your toleration specifies the value True.
    • Fix: Correct the toleration so that its key, value, and effect match the taint exactly. For a node tainted with gpu=true:NoSchedule, the toleration should be:
      spec:
        tolerations:
        - key: "gpu"
          operator: "Equal"
          value: "true"
          effect: "NoSchedule" # Or whatever effect is on the node
      
    • Why it works: Kubernetes performs an exact string match for the Equal operator. Any discrepancy, including a difference in letter case, means the toleration does not match the taint.
  4. Using the Exists operator incorrectly:

    • Diagnosis: If a taint has a value (e.g., dedicated=gpu:NoSchedule), using operator: "Exists" in your toleration will match any taint with that key, regardless of its value. This might not be what you intended if you need to match a specific value.
    • Fix: Use operator: "Equal" if you need to match a specific value associated with the taint key. For a taint dedicated=gpu:NoSchedule, the toleration should be:
      spec:
        tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "gpu"
          effect: "NoSchedule"
      
    • Why it works: The Exists operator checks for the presence of the taint key, while Equal checks for both the key and its associated value.
  5. Taints applied automatically by tooling or controllers:

    • Diagnosis: Some Kubernetes components (like kubeadm for control-plane nodes, or cloud provider integrations) automatically apply taints. For example, kubeadm adds node-role.kubernetes.io/control-plane:NoSchedule to control-plane nodes.
    • Fix: Understand why the taint is there. If it’s a control-plane node, you likely should tolerate it for specific pods (like monitoring agents). If it’s a custom taint applied by a controller, consult that controller’s documentation.
    • Why it works: Taints are not always manually applied. Understanding the source helps determine if you should tolerate it or if the taint itself is misconfigured.
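    • Example: For the monitoring-agent case, the toleration goes in the pod template of whatever workload runs the agent. A minimal sketch, assuming the agent is deployed as a DaemonSet; the name node-agent and the image example.com/node-agent:1.0 are placeholders, not real artifacts:

      ```yaml
      apiVersion: apps/v1
      kind: DaemonSet
      metadata:
        name: node-agent                  # hypothetical name
      spec:
        selector:
          matchLabels:
            app: node-agent
        template:
          metadata:
            labels:
              app: node-agent
          spec:
            tolerations:
            # Allow the agent onto kubeadm control-plane nodes.
            - key: "node-role.kubernetes.io/control-plane"
              operator: "Exists"
              effect: "NoSchedule"
            containers:
            - name: agent
              image: example.com/node-agent:1.0   # placeholder image
      ```

      Exists is used here because the kubeadm control-plane taint has no value to compare against.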
  6. Node not ready or unhealthy:

    • Diagnosis: kubectl get nodes. If the node status is not Ready, it might not be scheduling pods, regardless of taints. Check kubectl describe node <node-name> for Conditions like Ready: False.
    • Fix: Troubleshoot the node’s health. This could involve checking kubelet logs (journalctl -u kubelet), container runtime status, or network connectivity to the control plane.
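    • Why it works: A node must be in a Ready state for the scheduler to consider it for pod placement. The node controller automatically taints nodes that are not ready (node.kubernetes.io/not-ready) or unreachable (node.kubernetes.io/unreachable), which keeps new pods off them until they recover.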
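
The toleration fragments above all sit under a pod’s spec, alongside its containers. As a rough sketch of how a complete manifest might look, assuming the dedicated=gpu:NoSchedule taint from item 4 and a hypothetical pod name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job                       # hypothetical name
spec:
  tolerations:
  # Matches the taint dedicated=gpu:NoSchedule on the node.
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
  containers:
  - name: main
    image: example.com/gpu-job:1.0    # placeholder image
```

A toleration only permits placement on the tainted node; it does not steer the pod there, so the scheduler may still pick any other node that fits.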

Once the pod is finally scheduled, the next error you’ll likely hit is CrashLoopBackOff if the container itself has a problem.
