K3s nodes are failing to join the cluster and kubectl get nodes shows them as NotReady.

The k3s-agent process on the worker nodes is failing to establish a connection to the K3s server, usually due to network issues or misconfiguration.

Common Causes and Fixes

  1. Incorrect Server URL: The server URL configured on the agent is wrong, preventing it from finding the control plane.

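    • Diagnosis: Check the server value in /etc/rancher/k3s/config.yaml on the agent node: sudo cat /etc/rancher/k3s/config.yaml. If the agent was installed with the K3S_URL environment variable, the install script typically stores it in the service’s environment file, so check that instead: sudo cat /etc/systemd/system/k3s-agent.service.env. Running sudo printenv K3S_URL in an interactive shell will usually print nothing, because the variable is set only for the service.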
    • Fix: Ensure the server URL in /etc/rancher/k3s/config.yaml or the K3S_URL environment variable points to the correct IP address or hostname of the K3s server node, including the correct port (default 6443). For instance, if your server is at 192.168.1.100, the URL should be https://192.168.1.100:6443. Edit the file, using | as the sed delimiter so the slashes in the URL need no escaping: sudo sed -i 's|server: https://old-ip:6443|server: https://192.168.1.100:6443|' /etc/rancher/k3s/config.yaml. Then restart the agent so it picks up the change: sudo systemctl restart k3s-agent.
    • Why it works: The agent needs the precise address to know where to send its registration requests.
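
The check in this item can be sketched as a pair of small shell functions. This is a minimal sketch, not an official K3s tool: the function names k3s_server_url and k3s_url_looks_valid are hypothetical, and it assumes the config file is flat YAML with a top-level server: key, as in the example above.

```shell
# Print the "server:" value from a K3s agent config file.
# Assumption: flat YAML with a top-level "server:" key.
k3s_server_url() {
  sed -n '/^server:/{s/^server:[[:space:]]*//;s/[[:space:]]*$//;p;}' \
    "${1:-/etc/rancher/k3s/config.yaml}" | head -n 1 | tr -d "\"'"
}

# Succeed only if the value looks like https://<host>:<port>.
# IPv6 literals such as https://[::1]:6443 are not handled in this sketch.
k3s_url_looks_valid() {
  printf '%s\n' "$1" | grep -Eq '^https://[^/:]+:[0-9]+/?$'
}

# Example (run as root on the agent, since the file is usually root-only):
#   url=$(k3s_server_url)
#   k3s_url_looks_valid "$url" || echo "suspicious server URL: $url"
```

The format check only catches a missing scheme or port; it cannot tell whether the address is the right one, so the value still needs to be compared against the server’s actual IP or hostname.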
  2. Firewall Blocking Port 6443: Network firewalls (on the nodes or in between) are blocking the essential Kubernetes API server port.

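    • Diagnosis: From the agent node, run curl -k https://<server_ip>:6443 (the -k flag skips verification of the server’s self-signed certificate). Any HTTP response, including 401 Unauthorized, means the port is reachable. A timeout or a "connection refused" error means the port is likely blocked or the server is not listening. Use sudo ufw status on Ubuntu or sudo firewall-cmd --list-all on CentOS/RHEL to check local firewall rules.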
    • Fix: Open port 6443 for TCP traffic on the server node’s firewall. For ufw: sudo ufw allow 6443/tcp. For firewalld: sudo firewall-cmd --zone=public --add-port=6443/tcp --permanent && sudo firewall-cmd --reload.
    • Why it works: Unblocking the port allows the agent to communicate with the server’s API endpoint.
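
To make the curl result easier to read, the probe can be wrapped in a function that translates curl’s exit code into a short message. This is a rough sketch: explain_curl_exit and check_k3s_port are hypothetical names, and the messages are interpretations of curl’s documented exit codes 6, 7, and 28, not output from K3s itself.

```shell
# Translate a curl exit code into a hint about the K3s API port.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable: the server answered on the port" ;;
    6)  echo "could not resolve the server hostname" ;;
    7)  echo "connection failed: port closed, rejected, or host unreachable" ;;
    28) echo "timed out: traffic is probably being dropped by a firewall" ;;
    *)  echo "other curl error (exit code $1)" ;;
  esac
}

# Probe https://<server>:<port>/ and explain the outcome.
# -k skips certificate verification; without -f, any HTTP status
# (including 401) still counts as a successful connection.
check_k3s_port() {
  rc=0
  curl -k -s -o /dev/null --connect-timeout 5 "https://$1:${2:-6443}/" || rc=$?
  explain_curl_exit "$rc"
}

# Example: check_k3s_port 192.168.1.100
```

A timeout usually points at a firewall silently dropping packets, while an immediate failure suggests that nothing is listening on the port or that a firewall is actively rejecting the connection.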
  3. K3s Agent Service Not Running or Crashing: The k3s-agent service itself has failed to start or is repeatedly crashing due to an internal error or resource constraint.

    • Diagnosis: Check the status of the K3s agent service: sudo systemctl status k3s-agent. Look for "active (running)" or "inactive (dead)" and any error messages in the output. Also, check the agent’s logs: sudo journalctl -u k3s-agent -f.
    • Fix: If the service is inactive, start it: sudo systemctl start k3s-agent. If it is failing, investigate the journalctl output for specific errors. Common causes include insufficient memory, low disk space, or corrupted configuration files. After correcting the cause, restart the service: sudo systemctl restart k3s-agent.
    • Why it works: Ensures the agent process is actively running and attempting to connect.
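
When the journal is noisy, it can help to filter it down to error lines. The sketch below assumes the agent’s log lines carry a level=error or level=fatal field; that is how K3s messages commonly appear in the journal, but treat the pattern as an assumption and adjust it to what your logs actually contain. The function name filter_k3s_errors is hypothetical.

```shell
# Keep only error/fatal lines from K3s logs read on standard input.
# Assumption: log lines contain a "level=<severity>" field.
filter_k3s_errors() {
  # "|| true" keeps the exit status at 0 when no line matches.
  grep -E 'level=(error|fatal)' || true
}

# Example (on the agent node):
#   sudo journalctl -u k3s-agent --no-pager -n 200 | filter_k3s_errors
```

An empty result does not prove the agent is healthy; it only means the last 200 lines held no entries at those severities.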
  4. Incorrect Token: The token specified in the agent’s configuration does not match the token on the server, preventing authentication.

    • Diagnosis: Compare the token in /etc/rancher/k3s/config.yaml (or K3S_TOKEN env var) on the agent with the token used when starting the K3s server (often found in /var/lib/rancher/k3s/server/node-token on the server, or specified via --token flag).
    • Fix: Ensure the token value in the agent’s /etc/rancher/k3s/config.yaml exactly matches the server’s token. You can update it on the agent: sudo sed -i 's/token: <old-token>/token: <correct-token>/' /etc/rancher/k3s/config.yaml and then restart the agent: sudo systemctl restart k3s-agent.
    • Why it works: The token is a shared secret used for the agent to authenticate itself to the server.
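
One way to compare tokens across nodes without pasting the secret into a terminal or chat is to compare SHA-256 fingerprints instead. This is a sketch under assumptions: token_fingerprint and k3s_config_token are hypothetical helper names, sha256sum must be available, and the comparison is only meaningful if the agent was configured with the full contents of the server’s node-token file rather than a shortened form of the token.

```shell
# Print a SHA-256 fingerprint of a token so it can be compared safely.
token_fingerprint() {
  printf '%s' "$1" | sha256sum | cut -d ' ' -f 1
}

# Print the "token:" value from a K3s agent config file.
# Assumption: flat YAML with a top-level "token:" key.
k3s_config_token() {
  sed -n '/^token:/{s/^token:[[:space:]]*//;s/[[:space:]]*$//;p;}' \
    "${1:-/etc/rancher/k3s/config.yaml}" | head -n 1 | tr -d "\"'"
}

# Example (run as root):
#   on the agent:  token_fingerprint "$(k3s_config_token)"
#   on the server: token_fingerprint "$(cat /var/lib/rancher/k3s/server/node-token)"
```

If the two fingerprints differ, copy the token from the server again rather than retyping it, since a single wrong character is enough to fail authentication.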
  5. Network Connectivity Issues (General): While firewalls are common, more general network problems like incorrect subnet masks, routing issues, or DNS resolution failures can also prevent communication.

    • Diagnosis: From the agent node, try basic network checks: ping <server_ip> and traceroute <server_ip>. Ensure DNS resolution works for the server’s hostname if you’re using one: nslookup <server_hostname>.
    • Fix: Correct any IP addressing, subnetting, or routing misconfigurations on the agent’s network interface. If using hostnames, ensure the DNS server is correctly configured (/etc/resolv.conf) and accessible. For example, if ping fails, check /etc/netplan/*.yaml or /etc/sysconfig/network-scripts/ifcfg-* for IP/gateway settings.
    • Why it works: Establishes a reliable IP-level path for K3s traffic.
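
The name-resolution and reachability checks can be combined into one helper that starts from the server URL. This is a minimal sketch with hypothetical function names (url_host, check_network_path); it assumes a Linux host where getent is available and uses getent ahosts in place of nslookup. IPv6 literals in the URL are not handled.

```shell
# Extract the host part from a URL such as https://host:6443/.
url_host() {
  printf '%s\n' "$1" | sed -E 's#^[a-zA-Z]+://##; s#[:/].*$##'
}

# Report DNS and ICMP problems for the host in the given server URL.
# Note: some networks block ICMP, so a failed ping is a hint, not proof.
check_network_path() {
  host=$(url_host "$1")
  getent ahosts "$host" >/dev/null || echo "DNS: cannot resolve $host"
  ping -c 3 "$host" >/dev/null 2>&1 || echo "ICMP: no reply from $host"
}

# Example: check_network_path https://192.168.1.100:6443
```

No output means both checks passed; if only the ping line appears, run traceroute to see where the path stops.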
  6. K3s Server Not Ready or Unhealthy: The K3s server itself might be experiencing issues, preventing it from accepting new agent connections.

    • Diagnosis: SSH into the K3s server node and check its service status: sudo systemctl status k3s. Also, examine its logs: sudo journalctl -u k3s -f. Check if kubectl get nodes on the server shows the server node itself as Ready.
    • Fix: If the K3s server is not running or showing errors, resolve those issues first. This might involve restarting the server (sudo systemctl restart k3s), checking its configuration, or ensuring it has sufficient resources.
    • Why it works: A healthy control plane is a prerequisite for any agents to join.
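
On the server, the node list can be reduced to just the nodes that are not Ready. This is a small sketch: not_ready_nodes is a hypothetical helper, and it assumes the default column layout of kubectl get nodes --no-headers, where the first column is the node name and the second is its status.

```shell
# Read "kubectl get nodes --no-headers" output on standard input and
# print the names of nodes whose status does not start with "Ready".
# "Ready,SchedulingDisabled" counts as Ready; "NotReady" does not.
not_ready_nodes() {
  awk 'NF >= 2 && $2 !~ /^Ready/ { print $1 }'
}

# Example (on the server node):
#   sudo k3s kubectl get nodes --no-headers | not_ready_nodes
```

If the server’s own name appears in the output, fix the control plane before spending more time on the agents.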

After resolving these, you may next encounter a No CNI configuration found error if the CNI plugin (Flannel, which K3s bundles by default) has been disabled or is not properly configured on the nodes.
