The K3s agent process on your node is failing to connect to the K3s server, preventing it from joining the cluster.
Common Causes and Fixes:
1. Incorrect Server IP/Hostname in Agent Config
- Diagnosis: Check the K3S_URL environment variable or the server field in /etc/rancher/k3s/config.yaml on the agent node. It must exactly match the IP address or resolvable hostname of the K3s server, including the https:// scheme and the port.
- Fix:
  - If using environment variables (common in systemd services):

        # On the agent node
        sudo vi /etc/systemd/system/k3s-agent.service
        # Find and correct the K3S_URL line, e.g.:
        # Environment="K3S_URL=https://192.168.1.100:6443"
        sudo systemctl daemon-reload
        sudo systemctl restart k3s-agent

    Installs done with the official install script usually keep K3S_URL in /etc/systemd/system/k3s-agent.service.env instead; edit that file if the unit has no Environment= line.
  - If using config.yaml:

        # On the agent node
        sudo vi /etc/rancher/k3s/config.yaml
        # Ensure the server line is correct:
        # server: https://192.168.1.100:6443
        sudo systemctl restart k3s-agent
- Why it works: The agent needs the exact network endpoint to initiate the connection. A typo or incorrect IP means it’s trying to connect to the wrong place.
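A typo in the URL is easy to miss by eye. The following is a minimal sketch, not part of K3s, that reads the server value from a config file and checks that it has the https://HOST:PORT shape. The function names config_server and parse_server_url are hypothetical, the reader only handles a simple top-level server: key, and treating a missing port as an error is an assumption of this sketch.

```shell
#!/bin/sh
# Read the `server:` value from a K3s config file.
# Assumption: a simple top-level key on one line, optionally quoted.
config_server() {
    sed -n 's/^server:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

# Split a URL of the form https://HOST:PORT into "HOST PORT".
# Prints nothing and returns 1 if the URL does not have that shape.
parse_server_url() {
    case $1 in
        https://*) rest=${1#https://} ;;
        *) return 1 ;;
    esac
    rest=${rest%%/*}                      # drop any trailing path
    case $rest in
        *:*) host=${rest%:*}; port=${rest##*:} ;;
        *) return 1 ;;                    # no port given
    esac
    case $port in
        ''|*[!0-9]*) return 1 ;;          # port must be all digits
    esac
    [ -n "$host" ] || return 1
    printf '%s %s\n' "$host" "$port"
}

# Example (on the agent node):
#   parse_server_url "$(config_server /etc/rancher/k3s/config.yaml)"
```

A non-zero exit status from parse_server_url points at a malformed value (wrong scheme, missing port) before any network debugging starts.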
2. Firewall Blocking Agent-to-Server Communication
- Diagnosis: The K3s server typically listens on port 6443 (TCP) for API requests from agents and other clients. If a firewall on the server or the agent node, or any network device in between, is blocking this port, the agent cannot connect.
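  - On the agent node, try telnet <server_ip> 6443 or nc -vz <server_ip> 6443. A timeout usually means a firewall is silently dropping the traffic. "Connection refused" means the host was reached but nothing accepted the connection: either a firewall is actively rejecting it or the K3s server is not listening on that port.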
- Fix:
  - UFW (Ubuntu/Debian):

        # On the server node
        sudo ufw allow 6443/tcp
        sudo ufw reload

  - firewalld (CentOS/RHEL/Fedora):

        # On the server node
        sudo firewall-cmd --add-port=6443/tcp --permanent
        sudo firewall-cmd --reload

  - iptables:

        # On the server node
        # -I inserts at the top of the chain; -A would append the rule
        # after any existing REJECT/DROP rule, where it has no effect.
        sudo iptables -I INPUT -p tcp --dport 6443 -j ACCEPT
        # Save rules if using iptables-persistent
        # sudo netfilter-persistent save
- Why it works: This explicitly opens the necessary communication channel, allowing the agent to send its connection requests to the server.
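To re-test the port after changing firewall rules, the manual nc check can be wrapped in a small function. This is a sketch under assumptions: probe_tcp is a hypothetical helper name, and it relies on an nc that supports the -z (scan only) and -w (timeout in seconds) options, as the OpenBSD and Nmap variants do.

```shell
#!/bin/sh
# Probe HOST PORT [TIMEOUT_SECONDS] over TCP.
# Returns 0 if the port accepts a connection, 1 if it does not within
# the timeout, and 2 if nc is not installed.
probe_tcp() {
    command -v nc >/dev/null 2>&1 || { echo "nc not found" >&2; return 2; }
    if nc -z -w "${3:-3}" "$1" "$2" >/dev/null 2>&1; then
        echo "open: $1:$2"
    else
        echo "unreachable: $1:$2" >&2
        return 1
    fi
}

# Example (on the agent node, <server_ip> is a placeholder):
#   probe_tcp <server_ip> 6443
```

Running it before and after the firewall change shows whether the rule actually took effect.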
3. Incorrect Token
- Diagnosis: K3s uses a token for agent authentication. This token is specified via K3S_TOKEN in the agent’s service file or token in /etc/rancher/k3s/config.yaml. It must match the token used by the server. The server’s token is typically found in /var/lib/rancher/k3s/server/node-token (on the server node).
- Fix:
  - On the agent node, ensure K3S_TOKEN or the token field in config.yaml is set to the correct value from the server’s /var/lib/rancher/k3s/server/node-token.

        # On the server node: print the token to copy
        sudo cat /var/lib/rancher/k3s/server/node-token

        # On the agent node
        # If using environment variable in systemd service:
        sudo vi /etc/systemd/system/k3s-agent.service
        # Environment="K3S_TOKEN=YOUR_SHARED_SECRET"
        # If using config.yaml:
        sudo vi /etc/rancher/k3s/config.yaml
        # token: YOUR_SHARED_SECRET
        sudo systemctl daemon-reload
        sudo systemctl restart k3s-agent
- Why it works: The token acts as a shared secret, verifying the agent’s identity to the server. An incorrect token is rejected by the server.
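Comparing tokens by eye is error-prone, and pasting them into a chat or ticket leaks the secret. One option is to compare short fingerprints instead. The sketch below is illustrative only: token_secret, token_fingerprint, and config_token are hypothetical helper names, and token_secret assumes the full-token layout K10<CA hash>::<user>:<password> described in the K3s token documentation, so that a full token and its short form produce the same fingerprint.

```shell
#!/bin/sh
# Reduce a full K3s token (assumed layout K10<hash>::<user>:<password>)
# to its password part; pass short tokens through unchanged.
token_secret() {
    case $1 in
        K10*::*:*) printf '%s' "${1##*:}" ;;
        *)         printf '%s' "$1" ;;
    esac
}

# Print the first 12 hex digits of the SHA-256 of the given string.
token_fingerprint() {
    printf '%s' "$1" | sha256sum | cut -c1-12
}

# Read the `token:` value from a K3s config file.
# Assumption: a simple top-level key on one line, optionally quoted.
config_token() {
    sed -n 's/^token:[[:space:]]*//p' "$1" | tr -d "\"'" | head -n 1
}

# Example:
#   server: token_fingerprint "$(token_secret "$(sudo cat /var/lib/rancher/k3s/server/node-token)")"
#   agent:  token_fingerprint "$(token_secret "$(config_token /etc/rancher/k3s/config.yaml)")"
```

If the two fingerprints differ, the agent is holding a different secret than the server.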
4. Agent Running on a Server Node (or vice-versa)
- Diagnosis: A K3s node runs as either a server or an agent. You cannot run both the k3s (server) and k3s-agent services on the same node. A node joining an existing cluster as a worker should run only k3s-agent; a server node should run only k3s. Check sudo systemctl status k3s and sudo systemctl status k3s-agent.
- Fix:
  - If you intended this node to be an agent, stop and disable the k3s service and ensure k3s-agent is running and configured correctly with K3S_URL and K3S_TOKEN.

        # On the agent node
        sudo systemctl stop k3s
        sudo systemctl disable k3s
        sudo systemctl start k3s-agent
        sudo systemctl enable k3s-agent

  - If you intended this node to be a server, stop and disable k3s-agent instead and keep only k3s running. A K3s server already runs the agent components itself, so it never needs a separate k3s-agent service. A cluster has one or more servers and many agents.
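- Why it works: Both roles are started from the same k3s binary but as separate services, and a server already includes the agent components. Running both services on one node makes them compete for the same local ports (for example the kubelet port, 10250), which leads to conflicts and failed connection attempts.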
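The two status checks can be combined into a single verdict. A minimal sketch, assuming the unit names k3s and k3s-agent used above; k3s_role is a hypothetical helper, and any state other than active (including activating or failed) is treated as not running.

```shell
#!/bin/sh
# Map the two unit states, as printed by `systemctl is-active`,
# to a verdict: conflict, server, agent, or none.
k3s_role() {
    case "$1/$2" in
        active/active) echo conflict ;;
        active/*)      echo server ;;
        */active)      echo agent ;;
        *)             echo none ;;
    esac
}

# Example:
#   k3s_role "$(systemctl is-active k3s)" "$(systemctl is-active k3s-agent)"
```

A result of conflict means one of the two services needs to be stopped and disabled.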
5. Network Connectivity Issues (General)
- Diagnosis: Beyond firewalls, basic network reachability might be the issue. Can the agent node ping the server node’s IP address?

      # On the agent node
      ping <server_ip>

  If ping fails, you may have a more fundamental network problem (routing, physical connectivity, IP configuration) that needs to be resolved first. Some networks block ICMP, so a failed ping alone is not conclusive; confirm with the TCP check on port 6443 from cause 2. Also, check DNS resolution if you’re using hostnames.

      # On the agent node
      nslookup <server_hostname>

- Fix: Troubleshoot standard networking issues. Ensure IP addresses are correctly assigned, routes exist between the subnets involved, and DNS is functional.
- Why it works: The agent and server must be able to establish and maintain a TCP connection over the network.
6. SELinux or AppArmor Interference
- Diagnosis: Security modules like SELinux or AppArmor can sometimes prevent K3s processes from binding to ports or making network connections, especially if they have custom or restrictive policies. Check system logs (sudo journalctl -xe, /var/log/audit/audit.log for SELinux, /var/log/syslog or dmesg for AppArmor).
- Fix:
  - Temporarily disable (for testing):
    - SELinux: sudo setenforce 0 (permissive until the next reboot; revert with sudo setenforce 1)
    - AppArmor: sudo systemctl stop apparmor && sudo systemctl disable apparmor (disable persists across reboots, so re-enable and start the service once the test is done)
  - Permanent fix: Adjust the SELinux/AppArmor policies to allow K3s network operations. This is highly environment-specific. For SELinux, you might look for avc: denied messages in audit.log and use audit2allow to create a custom policy module.
- Why it works: Security modules enforce access controls. If K3s operations violate these controls, they are blocked. Disabling them temporarily confirms if they are the cause.
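For the SELinux case, it helps to narrow the audit log down to denials that mention K3s before generating a policy. The filter below is a sketch: k3s_avc_denials is a hypothetical helper that simply matches lines containing both an AVC denial and the string k3s, which is an assumption about how the process shows up in the log.

```shell
#!/bin/sh
# Read audit log lines on stdin and keep only AVC denials that mention k3s.
k3s_avc_denials() {
    grep -E 'avc: +denied' | grep -F 'k3s'
}

# Example (on the affected node):
#   sudo cat /var/log/audit/audit.log | k3s_avc_denials
```

The filtered lines can then be piped into audit2allow -M k3s_local (k3s_local is a placeholder module name) and the resulting module loaded with sudo semodule -i k3s_local.pp. Review the generated .te file before loading it, since audit2allow allows everything it was shown.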
Once the agent connects, the next error you may encounter is an unhealthy or overloaded Kubernetes API server, typically when the server node is undersized or has problems of its own.