K3s in HA mode with embedded etcd is not a set of independent servers that magically synchronize state; it’s a single etcd cluster that must keep a quorum of healthy members, or the entire control plane grinds to a halt.
Let’s see it in action. Take two of the server nodes in an HA cluster with embedded etcd, k3s-server-1 and k3s-server-2. (A real cluster needs an odd number of servers, at least three, to tolerate a failure.)
On k3s-server-1:
curl -s --cacert /var/lib/rancher/k3s/server/tls/server-ca.crt --cert /var/lib/rancher/k3s/server/tls/client-admin.crt --key /var/lib/rancher/k3s/server/tls/client-admin.key https://127.0.0.1:6443/api/v1/namespaces/kube-system
This request hits the local API server, which reads the object from the etcd cluster that both servers are members of. The response includes the namespace’s metadata.uid and metadata.resourceVersion, both of which come from the state stored in etcd.
Now, on k3s-server-2, run the same command. It should return the same uid and resourceVersion for the namespace.
curl -s --cacert /var/lib/rancher/k3s/server/tls/server-ca.crt --cert /var/lib/rancher/k3s/server/tls/client-admin.crt --key /var/lib/rancher/k3s/server/tls/client-admin.key https://127.0.0.1:6443/api/v1/namespaces/kube-system
The key here is that every API server answers from the shared etcd state. Raft keeps the committed data consistent across members, so a server does not hand out divergent data; a member that cannot reach a quorum fails or times out instead.
The core problem K3s HA mode with embedded etcd solves is providing a resilient Kubernetes control plane without external dependencies like managed etcd clusters or separate database solutions. It bundles etcd directly into the K3s server binary, simplifying deployment and management for small to medium-sized clusters.
Internally, K3s uses etcd for all its state management: cluster configuration, pod definitions, service endpoints, secrets, and everything else that makes up your Kubernetes cluster. When you run K3s in HA mode, you’re essentially setting up a clustered etcd instance where multiple K3s server nodes participate as etcd members. This cluster requires a quorum (a majority of members) to be available and in agreement to operate. For a 3-node etcd cluster, this means at least 2 members must be healthy.
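The quorum arithmetic above can be sketched in a few lines of shell. This is only an illustration of the majority rule; the function names quorum and fault_tolerance are made up for this example and are not part of K3s or etcd.

```shell
#!/usr/bin/env bash
# Quorum is a strict majority of members: floor(n / 2) + 1.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Fault tolerance is how many members can fail while a quorum survives.
fault_tolerance() {
  echo $(( $1 - $(quorum "$1") ))
}

# Print the numbers for a few cluster sizes.
for n in 1 2 3 4 5; do
  echo "members=$n quorum=$(quorum "$n") tolerated_failures=$(fault_tolerance "$n")"
done
```

Note that going from 3 to 4 members raises the quorum without raising the number of tolerated failures, which is why odd member counts are the usual choice.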
The config.yaml on each server node is crucial. For HA with embedded etcd, the first server’s file typically looks like this:
write-kubeconfig-mode: "0644"
token: "YOUR_CLUSTER_TOKEN"
cluster-init: true # first server only; joining servers replace this line with: server: "https://<server-ip-1>:6443"
disable:
- "traefik" # Or other components you want to manage externally
tls-san:
- "<server-ip-1>"
- "<server-ip-2>"
- "<server-ip-3>"
- "kubernetes.default.svc"
- "kubernetes.default"
- "kubernetes"
The token is the shared secret used for joining the cluster. cluster-init tells the first server to initialize a new embedded etcd cluster. Every additional server omits cluster-init and instead sets server to the URL of one server that is already in the cluster. datastore-endpoint is not set at all with embedded etcd; that option is for an external datastore such as MySQL, PostgreSQL, or an external etcd cluster. The tls-san field adds extra hostnames and IP addresses to the API server’s TLS certificate, so clients can validate it when connecting through any of them.
Replication between the etcd members runs over the etcd peer port, which defaults to 2380; the client port, 2379, is what API servers use to read and write etcd data. When a K3s server starts in HA mode, it either initializes a new etcd cluster (the first server) or joins the existing one through the server it was pointed at. Each node advertises its peer URL to the other members. For the cluster to form and stay healthy, ports 2379 and 2380, along with the API port 6443, must be reachable between all server nodes, and the etcd cluster must maintain its quorum.
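A quick way to test the reachability requirement is to try opening a TCP connection to each port from every server node. The following is a minimal sketch, assuming bash and the coreutils timeout command are available; the helper names port_open and check_servers and the example addresses are hypothetical.

```shell
#!/usr/bin/env bash
# Ports taken from the text: 6443 (K3s API), 2379 (etcd client), 2380 (etcd peer).
PORTS="6443 2379 2380"

# Succeeds if a TCP connection to host:port opens within 2 seconds.
# Relies on bash's built-in /dev/tcp redirection, so no netcat is required.
port_open() {
  timeout 2 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# Prints one status line per host/port pair for the hosts given as arguments.
check_servers() {
  local host port
  for host in "$@"; do
    for port in $PORTS; do
      if port_open "$host" "$port"; then
        echo "$host:$port open"
      else
        echo "$host:$port UNREACHABLE"
      fi
    done
  done
}

# Example with hypothetical addresses, run from each server node in turn:
#   check_servers 10.0.0.11 10.0.0.12 10.0.0.13
```

An open port only shows that the network path and firewall rules are in place; it does not prove that the etcd member behind it is healthy.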
When you add a new node to an existing HA K3s cluster, you provide the server URL of an existing node and the shared token. The new node contacts the existing API server, which then orchestrates its joining into the etcd cluster. The new node will download the necessary etcd member information and begin participating.
K3s relies on etcd’s Raft consensus protocol for leader election and log replication. When a node is added, it’s assigned a unique member ID and begins participating in the protocol. If a node becomes unhealthy or is removed, the remaining members keep operating as long as they still form a quorum. If quorum is lost, etcd can no longer elect a leader or commit writes, so writes to the Kubernetes API fail and consistent reads stop working as well, leaving the API effectively unavailable.
A common pitfall is assuming that restarting several K3s server nodes at once is as safe as restarting one. In an HA setup, the cluster continues to operate as long as the remaining nodes constitute a quorum, so restarting a single node in a three-node cluster is fine. However, if you take down more nodes than the quorum allows (e.g., two nodes in a three-node cluster), etcd loses quorum and the control plane becomes unavailable until enough members return. The surviving members will try to elect a leader, but without a majority, no leader can be established.
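The restart rule can be turned into a small pre-flight check. This is a sketch of the reasoning only, not a K3s feature: safe_to_restart_one is a hypothetical helper, and you have to supply the member counts yourself.

```shell
#!/usr/bin/env bash
# Succeeds if taking one more member offline still leaves a quorum.
# $1 = total etcd members, $2 = members that are currently healthy.
safe_to_restart_one() {
  local total=$1 healthy=$2
  local quorum=$(( total / 2 + 1 ))
  [ $(( healthy - 1 )) -ge "$quorum" ]
}

# In a 3-member cluster with all members healthy, one restart is safe.
safe_to_restart_one 3 3 && echo "3/3 healthy: safe to restart one node"

# With one member already down, a second restart would break quorum.
safe_to_restart_one 3 2 || echo "2/3 healthy: do NOT restart another node"
```

One way to obtain the healthy count is to count Ready server nodes with kubectl get nodes, filtered by the node-role.kubernetes.io/etcd=true label; that label is an assumption here, so confirm it on your cluster with kubectl get nodes --show-labels.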
The most surprising thing about K3s HA mode is that the API server and etcd are not separate services: K3s bundles both into the same binary and process on each server node. The API server talks directly to the embedded etcd member on the same machine, and the etcd members replicate to each other over the network on the peer port. This is why no datastore-endpoint is needed with embedded etcd, and why losing a server node removes both an API endpoint and an etcd member at once.
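If you see etcdserver: leader changed errors after bringing up a new K3s server node, leadership is changing hands while requests are in flight. A likely cause is unreliable connectivity between the etcd members, so check that ports 2379 and 2380 are reachable between all servers; slow disks on a member can trigger the same symptom.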
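To judge whether the error is a one-off or a recurring problem, it helps to count how often it appears in the server logs. A minimal sketch: count_leader_changes is a hypothetical helper that reads log lines on stdin and counts those containing the error string.

```shell
#!/usr/bin/env bash
# Counts log lines on stdin that contain the leader-change error.
# grep -c exits non-zero when the count is 0, so "|| true" keeps the exit status clean.
count_leader_changes() {
  grep -c 'etcdserver: leader changed' || true
}

# Example, assuming K3s runs as a systemd unit named "k3s":
#   journalctl -u k3s --since "10 minutes ago" | count_leader_changes
```

A count that keeps growing on a settled cluster points to an ongoing problem rather than the brief churn that can accompany a membership change.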