The K3s agent process on your node is failing to connect to the K3s server, preventing it from joining the cluster.

Common Causes and Fixes:

1. Incorrect Server IP/Hostname in Agent Config

  • Diagnosis: Check the K3S_URL environment variable or the server field in /etc/rancher/k3s/config.yaml on the agent node. It must precisely match the IP address or resolvable hostname of the K3s server.
  • Fix:
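    • If using environment variables (common in systemd services):
      # On the agent node
      # The install script normally stores these variables in an env file next to the unit;
      # edit k3s-agent.service itself only if you added Environment= lines by hand.
      sudo vi /etc/systemd/system/k3s-agent.service.env
      # Find and correct the K3S_URL line, e.g.:
      # K3S_URL='https://192.168.1.100:6443'
      sudo systemctl daemon-reload
      sudo systemctl restart k3s-agent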
      
    • If using config.yaml:
      # On the agent node
      sudo vi /etc/rancher/k3s/config.yaml
      # Ensure the server line is correct:
      # server: https://192.168.1.100:6443
      sudo systemctl restart k3s-agent
      
  • Why it works: The agent needs the exact network endpoint to initiate the connection. A typo or incorrect IP means it’s trying to connect to the wrong place.
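
The check above can be partly automated. The following is a minimal sketch, not part of K3s: `configured_server` and `looks_like_k3s_url` are hypothetical helper names, and the pattern assumes a plain hostname or IPv4 address followed by an explicit port.

```shell
#!/bin/sh
# Hypothetical helpers (not shipped with K3s) for sanity-checking the agent's server URL.

# Print the value of the top-level "server:" key from a config.yaml-style file,
# with trailing whitespace and surrounding quotes removed.
configured_server() {
  sed -n -e 's/[[:space:]]*$//' -e 's/^server:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

# Succeed only if the URL has the shape https://<host>:<port>.
# Bracketed IPv6 literals are not handled by this simple pattern.
looks_like_k3s_url() {
  printf '%s\n' "$1" | grep -Eq '^https://[A-Za-z0-9.-]+:[0-9]{1,5}/?$'
}

# Example usage on an agent node:
#   url=$(configured_server /etc/rancher/k3s/config.yaml)
#   looks_like_k3s_url "$url" || echo "suspicious server URL: $url"
```

A URL that passes this check can still point at the wrong machine; the sketch only catches shape errors such as a missing port or an http:// scheme.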

2. Firewall Blocking Agent-to-Server Communication

  • Diagnosis: The K3s server typically listens on port 6443 (TCP) for API requests from agents and other clients. If a firewall on the server or the agent node, or any network device in between, is blocking this port, the agent cannot connect.
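    • On the agent node, try telnet <server_ip> 6443 or nc -vz <server_ip> 6443. A timeout usually means a firewall is silently dropping the traffic. "Connection refused" usually means the host is reachable but nothing is listening on the port (for example, the K3s server is not running), or a firewall is actively rejecting the connection.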
  • Fix:
    • UFW (Ubuntu/Debian):
      # On the server node
      sudo ufw allow 6443/tcp
      sudo ufw reload
      
    • firewalld (CentOS/RHEL/Fedora):
      # On the server node
      sudo firewall-cmd --add-port=6443/tcp --permanent
      sudo firewall-cmd --reload
      
    • iptables:
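      # On the server node
      # -I inserts the rule at the top of the chain; a rule appended with -A
      # would sit after any existing DROP/REJECT rule and never match.
      sudo iptables -I INPUT -p tcp --dport 6443 -j ACCEPT
      # Save rules if using iptables-persistent
      # sudo netfilter-persistent save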
      
  • Why it works: This explicitly opens the necessary communication channel, allowing the agent to send its connection requests to the server.
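
To make the probe result easier to read, the output of nc can be mapped to a rough verdict. This is only a sketch: `classify_probe` is a hypothetical helper, and the matched phrases are assumptions about common nc/ncat messages, which vary between implementations and versions.

```shell
#!/bin/sh
# Hypothetical helper: turn the text printed by a port probe into a rough verdict.
# The phrases matched below are assumptions about typical nc/ncat output.
classify_probe() {
  case "$1" in
    *succeeded*|*"Connected to"*) echo "reachable" ;;
    *refused*)                    echo "refused: host up, nothing listening or actively rejected" ;;
    *"timed out"*|*TIMEOUT*)      echo "filtered: traffic is probably being dropped" ;;
    *)                            echo "unknown" ;;
  esac
}

# Example usage on the agent node (5-second timeout):
#   classify_probe "$(nc -vz -w 5 <server_ip> 6443 2>&1)"
```

A "refused" verdict points at the server process rather than the firewall, so check that the k3s service is running on the server before changing firewall rules.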

3. Incorrect Token

  • Diagnosis: K3s uses a token for agent authentication. It is set via K3S_TOKEN in the agent’s systemd environment (the service file or its accompanying .env file) or via token in /etc/rancher/k3s/config.yaml, and it must match the token the server expects. The server’s token is typically found in /var/lib/rancher/k3s/server/node-token (on the server node).
  • Fix:
    • On the agent node, ensure K3S_TOKEN or the token field in config.yaml is set to the correct value from the server’s /var/lib/rancher/k3s/server/node-token.
      # On the agent node
      # If using environment variables (the install script writes them to this env file):
      sudo vi /etc/systemd/system/k3s-agent.service.env
      # K3S_TOKEN='YOUR_SHARED_SECRET'
      # If using config.yaml:
      sudo vi /etc/rancher/k3s/config.yaml
      # token: YOUR_SHARED_SECRET
      
      sudo systemctl daemon-reload
      sudo systemctl restart k3s-agent
      
  • Why it works: The token acts as a shared secret, verifying the agent’s identity to the server. An incorrect token is rejected by the server.
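
One way to compare the two tokens without pasting the secret into a terminal or chat is to compare hashes. The sketch below uses a hypothetical helper, `token_fingerprint`, and assumes the agent was given the full contents of node-token; if the agent was configured with a shortened form of the token, the fingerprints will differ even though the token may be valid.

```shell
#!/bin/sh
# Hypothetical helper: print a SHA-256 fingerprint of a token read from stdin.
# All whitespace is stripped first so a stray trailing newline does not matter.
token_fingerprint() {
  tr -d '[:space:]' | sha256sum | cut -d' ' -f1
}

# Example usage:
#   On the server:  sudo cat /var/lib/rancher/k3s/server/node-token | token_fingerprint
#   On the agent:   printf '%s' "$K3S_TOKEN" | token_fingerprint
# Matching fingerprints mean both nodes hold the same token string.
```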

4. Agent Running on a Server Node (or vice-versa)

  • Diagnosis: A K3s node runs as either a server or an agent. By default a server node already runs the agent components itself, so the k3s (server) and k3s-agent services should never both be active on the same node. A node joining as a worker should run only k3s-agent; a server node should run only k3s. Check sudo systemctl status k3s and sudo systemctl status k3s-agent.
  • Fix:
    • If you intended this node to be an agent, stop and disable the k3s service and ensure k3s-agent is running and configured correctly with K3S_URL and K3S_TOKEN.
      # On the agent node
      sudo systemctl stop k3s
      sudo systemctl disable k3s
      sudo systemctl start k3s-agent
      sudo systemctl enable k3s-agent
      
    • If you intended this node to be an additional server, do the reverse: stop and disable k3s-agent and run only the k3s service, configured to join the existing server. A cluster has one or more servers and any number of agents.
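  • Why it works: K3s ships as a single binary that runs in either server or agent mode. Both services start the same node-level components, so running them side by side on one node makes them compete for the same ports and data directories, and the agent’s connection attempts never succeed cleanly.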
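
The two status checks can be folded into a single verdict. A minimal sketch, assuming the default unit names k3s and k3s-agent; `node_role` is a hypothetical helper that takes the states printed by systemctl is-active.

```shell
#!/bin/sh
# Hypothetical helper: decide a node's role from the two service states.
#   $1 = state of k3s.service, $2 = state of k3s-agent.service
node_role() {
  if [ "$1" = "active" ] && [ "$2" = "active" ]; then
    echo "conflict"   # both services running: stop and disable one of them
  elif [ "$1" = "active" ]; then
    echo "server"
  elif [ "$2" = "active" ]; then
    echo "agent"
  else
    echo "none"       # neither service is running
  fi
}

# Example usage:
#   node_role "$(systemctl is-active k3s)" "$(systemctl is-active k3s-agent)"
```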

5. Network Connectivity Issues (General)

  • Diagnosis: Beyond firewalls, basic network reachability might be the issue. Can the agent node ping the server node’s IP address?
    # On the agent node
    ping <server_ip>
    
    If ping fails, you may have a more fundamental network problem (routing, physical connectivity, IP configuration) that needs to be resolved first. Some networks block ICMP, though, so a failed ping on its own is not conclusive. Also, check DNS resolution if you’re using hostnames.
    # On the agent node
    nslookup <server_hostname>
    
  • Fix: Troubleshoot standard networking issues. Ensure IP addresses are correctly assigned, routes exist between the agent’s and server’s subnets, and DNS is functional.
  • Why it works: The agent and server must be able to establish and maintain a TCP connection over the network.
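
When the agent is configured with a hostname, it helps to test exactly the name it will use. The sketch below pulls the host part out of the server URL so it can be passed to a resolver or to ping; `url_host` is a hypothetical helper and does not handle bracketed IPv6 literals.

```shell
#!/bin/sh
# Hypothetical helper: print the host part of a URL such as https://host:6443/path.
url_host() {
  # Drop the scheme, then everything from the first ':' or '/' onwards.
  printf '%s\n' "$1" | sed -E 's#^[a-z]+://##; s#[:/].*$##'
}

# Example usage on the agent node:
#   host=$(url_host "$K3S_URL")
#   getent hosts "$host" || echo "cannot resolve $host"
#   ping -c 3 "$host"
```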

6. SELinux or AppArmor Interference

  • Diagnosis: Security modules like SELinux or AppArmor can sometimes prevent K3s processes from binding to ports or making network connections, especially if they have custom or restrictive policies. Check system logs (sudo journalctl -xe, /var/log/audit/audit.log for SELinux, /var/log/syslog or dmesg for AppArmor).
  • Fix:
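    • Temporarily disable (for testing only, then re-enable):
      • SELinux: sudo setenforce 0 (permissive mode until the next reboot; revert with sudo setenforce 1)
      • AppArmor: sudo systemctl stop apparmor (on some distributions, loaded profiles stay active until you also run sudo aa-teardown; revert with sudo systemctl start apparmor)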
    • Permanent fix: Adjust the SELinux/AppArmor policies to allow K3s network operations. This is highly environment-specific. For SELinux, you might look for avc: denied messages in audit.log and use audit2allow to create a custom policy module.
  • Why it works: Security modules enforce access controls. If K3s operations violate these controls, they are blocked. Disabling them temporarily confirms if they are the cause.
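
Before disabling anything, it is often enough to look for denials that mention K3s. A minimal sketch: `k3s_denials` is a hypothetical filter, and matching on the string "k3s" is an assumption, since a denial may be logged under another process name such as containerd.

```shell
#!/bin/sh
# Hypothetical filter: keep only SELinux/AppArmor denial lines that mention k3s.
# Reads log text on stdin and prints matching lines.
k3s_denials() {
  grep -E 'avc: +denied|apparmor="DENIED"' | grep -i 'k3s'
}

# Example usage:
#   sudo cat /var/log/audit/audit.log | k3s_denials   # SELinux
#   sudo dmesg | k3s_denials                          # AppArmor
```

No output does not prove the security module is innocent; it only means no denial line contained "k3s".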

Once the agent can connect, the next error you might encounter comes from the Kubernetes API server itself, which can be unhealthy or overloaded if the server node is undersized or has problems of its own.
