K3s worker nodes can join your cluster through a surprisingly simple mechanism that relies on a pre-shared secret and a persistent connection.
Let’s see this in action. Imagine you have a K3s server already running and want to add a new worker node. Start on the server node and read its join token:
sudo cat /var/lib/rancher/k3s/server/node-token
This will output something like K10abcdefg1234567890::server:xyz9876543210.
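Before pasting the token anywhere, a quick shape check can catch copy-and-paste truncation. The sketch below assumes the layout visible in the sample above (a K10 prefix, then `::`, a user field, `:`, and a secret). The helper names `is_full_token` and `token_user` are made up for illustration, and the check only guards against typos; it says nothing about whether the server will accept the token.

```shell
#!/bin/sh
# Sketch: sanity-check a K3s join token against the layout shown above
# (K10<hash>::<user>:<secret>). The layout is inferred from the sample
# token; both function names are hypothetical helpers.

# is_full_token TOKEN -> exit status 0 when TOKEN matches the layout.
is_full_token() {
    case "$1" in
        K10?*::?*:?*) return 0 ;;
        *)            return 1 ;;
    esac
}

# token_user TOKEN -> prints the field between "::" and the next ":".
token_user() {
    rest=${1#*::}                 # drop everything up to the first "::"
    printf '%s\n' "${rest%%:*}"   # keep the text before the next ":"
}
```

For the sample token, `token_user` prints `server`, which matches the `::server:` marker in the middle of the string.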
Now, on the worker node, you’ll install K3s, but this time you’ll tell it where the server is and give it that token.
curl -sfL https://get.k3s.io | K3S_URL=https://<server_ip>:6443 K3S_TOKEN=K10abcdefg1234567890::server:xyz9876543210 sh -
Replace <server_ip> with the actual IP address of your K3s server. This command downloads the K3s installer and runs it with two environment variables set: K3S_URL, pointing to the server’s API endpoint, and K3S_TOKEN, holding the secret you just retrieved. The presence of K3S_URL is what tells the installer to set K3s up as an agent rather than as another server.
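If you join workers from a script, it can help to validate the two values before handing them to the installer. The following is a minimal sketch, not part of K3s itself: `check_join_args` and `join_worker` are hypothetical helper names, and the checks are deliberately shallow.

```shell
#!/bin/sh
# Sketch: validate join parameters before piping the installer into sh.
# check_join_args and join_worker are hypothetical helper names.

# check_join_args URL TOKEN -> status 0 when both values look usable.
check_join_args() {
    case "$1" in
        https://?*) ;;   # the server URL is expected to use https
        *) echo "K3S_URL must start with https://" >&2; return 1 ;;
    esac
    if [ -z "$2" ]; then
        echo "K3S_TOKEN must not be empty" >&2
        return 1
    fi
}

# join_worker URL TOKEN -> runs the same install command shown above.
join_worker() {
    check_join_args "$1" "$2" || return 1
    curl -sfL https://get.k3s.io | K3S_URL="$1" K3S_TOKEN="$2" sh -
}
```

A call would look like `join_worker "https://<server_ip>:6443" "<token>"`, with both placeholders filled in by you.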
Once installed and running, the K3s agent authenticates to the server with the provided K3S_TOKEN, registers itself as a node, and opens a persistent WebSocket tunnel to the server. The kubelet on the worker reports status and watches for assigned pods through the Kubernetes API behind K3S_URL, while the tunnel gives the server a way to reach back to the node. It’s this continuous, authenticated communication that keeps the worker node part of the cluster.
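A simple way to confirm the join worked is to ask the API server whether the new node reports Ready. This sketch assumes `kubectl` is available and configured wherever you run it (typically the server node). `node_ready` and `wait_for_node` are illustrative names, and the 5-second polling interval is arbitrary.

```shell
#!/bin/sh
# Sketch: poll the cluster until a newly joined node reports Ready.
# Assumes a working kubectl; both function names are hypothetical.

# node_ready NAME -> status 0 when the node's Ready condition is "True".
node_ready() {
    status=$(kubectl get node "$1" \
        -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}') || return 1
    [ "$status" = "True" ]
}

# wait_for_node NAME [TRIES] -> checks every 5 seconds, gives up after TRIES.
wait_for_node() {
    tries=${2:-24}
    while [ "$tries" -gt 0 ]; do
        node_ready "$1" && return 0
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}
```

For example, `wait_for_node <node-name> 12` waits up to about a minute before reporting failure.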
The beauty here is that it hides the certificate handling that node registration normally involves in Kubernetes. Certificates are still in play, but the agent uses the single shared token to verify the server and obtain its node certificates automatically, so you never create or distribute them by hand. All the agent needs is the token and the ability to reach the server directly over the network. This makes initial setup very streamlined, especially for edge deployments or environments where managing certificate lifecycles is a burden.
The K3S_TOKEN isn’t just for the initial join. The agent presents it to the server whenever it starts, so it remains a long-lived credential. If this token is compromised, an attacker could impersonate a legitimate worker node, and through it potentially receive scheduled workloads or gain access to cluster information. For this reason, it’s crucial to protect the node-token file on the server and to keep K3S_TOKEN out of shell history and world-readable files on the worker.
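One way to keep the token off the command line is to store it in a file only its owner can read and point the agent at that file. The sketch below assumes your K3s version supports the K3S_TOKEN_FILE variable (the environment form of the agent’s token-file option). `write_token_file` is a hypothetical helper and the path in the comment is only an example.

```shell
#!/bin/sh
# Sketch: store the join token in an owner-only file instead of passing
# it inline. write_token_file is a hypothetical helper name.

# write_token_file PATH TOKEN -> writes TOKEN to PATH with mode 0600.
write_token_file() {
    # The restrictive umask covers newly created files ...
    ( umask 077 && printf '%s\n' "$2" > "$1" ) || return 1
    # ... and chmod tightens a file that already existed.
    chmod 600 "$1"
}

# Example usage on the worker (path is illustrative; run as root):
#   write_token_file /etc/rancher/k3s/agent-token "<token>"
#   curl -sfL https://get.k3s.io | \
#       K3S_URL=https://<server_ip>:6443 \
#       K3S_TOKEN_FILE=/etc/rancher/k3s/agent-token sh -
```

This avoids leaving the token in shell history, though it still sits on disk and so needs the same care as the node-token file on the server.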
The server component, k3s server, acts as the central brain. It runs the Kubernetes API server, scheduler, and controllers, along with the cluster datastore (SQLite by default, with embedded etcd or an external database as alternatives). When a worker node joins, the server registers it and begins scheduling pods onto it based on resource availability and pod specifications. The worker node, running k3s agent, is essentially a lightweight Kubernetes node that takes its instructions from the server. It runs the kubelet, the container runtime (containerd by default), and the CNI plugin.
The K3S_URL is more than a join address; it is the agent’s ongoing lifeline to the control plane. If the server IP changes or port 6443 becomes unreachable, the agent loses its connection and the node is effectively isolated from the cluster’s management: the server eventually marks it NotReady and it receives no new instructions. For the same reason, if you put a load balancer in front of your K3s server, K3S_URL should point to the load balancer’s address.
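When an agent will not connect, a quick reachability probe from the worker narrows things down. The sketch below requests the `/cacerts` path, which K3s servers are understood to serve without authentication so that agents can fetch the cluster CA; treat that path as an assumption and substitute any endpoint you know your server answers. `server_reachable` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: check from the worker that the address in K3S_URL answers HTTPS.
# The /cacerts path is an assumption about the K3s server; the function
# name is hypothetical.

# server_reachable URL -> status 0 when the server responds within 5 seconds.
server_reachable() {
    # -k skips certificate verification: this is a connectivity probe only.
    curl -ksf --max-time 5 -o /dev/null "${1%/}/cacerts"
}

# Example usage:
#   server_reachable "https://<server_ip>:6443" || echo "cannot reach server" >&2
```

A failure here points at routing, firewalls, or a wrong address rather than at the token.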
What’s often overlooked is that kubectl on the server does not use the node-token at all. It authenticates with the admin kubeconfig that K3s writes to /etc/rancher/k3s/k3s.yaml, which by default only root can read. So, if you ever run kubectl from the server node itself and it complains about permissions or authentication, check that it can read that file (for example, by running it with sudo) before suspecting the token. You can also generate new tokens on the server if the existing one is compromised or needs rotation.
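For issuing fresh join tokens, recent K3s releases include a `k3s token` subcommand. The sketch below assumes your version has `k3s token create` with the `--ttl` and `--description` flags; check `k3s token create --help` before relying on it. `new_agent_token` is a hypothetical wrapper, and the one-hour default is an arbitrary choice.

```shell
#!/bin/sh
# Sketch: mint a short-lived join token on the server.
# Assumes a K3s release that provides "k3s token create" with --ttl and
# --description; new_agent_token is a hypothetical wrapper name.

# new_agent_token [TTL] -> prints a new join token valid for TTL (default 1h).
new_agent_token() {
    k3s token create --ttl "${1:-1h}" --description "worker join"
}

# Example usage on the server (as root):
#   new_agent_token 30m
```

A short-lived token limits the damage if a join command leaks, since it stops working once the TTL passes.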
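The persistent WebSocket connection is key because it gives the server a path back to the node, even when the worker sits behind NAT or a firewall. Requests that the control plane initiates toward the kubelet, such as fetching container logs or opening an exec session, travel through this tunnel. Pod assignments, status updates, and events go the other way: the kubelet watches the API server for work and posts its status back to it. Without these two paths, the agent wouldn’t know when to start new containers or be able to report that existing ones have failed. Together they let Kubernetes react to changes and maintain desired state.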
When you need to remove a worker node, there is no "unjoin" command to run on the agent. Instead, you stop the K3s agent process on the worker node itself, ideally after draining its workloads with kubectl drain <node-name>. The K3s server will eventually detect that the node has stopped reporting in and mark it as NotReady. You can then manually delete the node object from the cluster using kubectl delete node <node-name>. This keeps removal simple: because it is driven from the control plane, it does not depend on a de-registration step on the agent side, which might fail if the network is the issue.
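The removal sequence can be sketched as two small helpers run from wherever `kubectl` is configured, with the agent stopped on the worker in between. The drain step and its flags are standard kubectl usage, but whether you want `--delete-emptydir-data` depends on your workloads. The function names are hypothetical, and the `k3s-agent` service name assumes a systemd install via the script above.

```shell
#!/bin/sh
# Sketch: remove a worker in three steps. Assumes a working kubectl;
# drain_worker and delete_worker are hypothetical helper names.

# Step 1: drain_worker NAME -> evict pods so they reschedule elsewhere.
drain_worker() {
    kubectl drain "$1" --ignore-daemonsets --delete-emptydir-data
}

# Step 2 happens on the worker itself, for example:
#   sudo systemctl stop k3s-agent

# Step 3: delete_worker NAME -> remove the node object from the cluster.
delete_worker() {
    kubectl delete node "$1"
}
```

If the worker is already unreachable, skip straight to `delete_worker <node-name>`.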
The next step after joining worker nodes is managing their resources and ensuring they are ready to receive workloads.