K3s’s multi-master setup is less about running redundant Kubernetes API servers than about replicating the state of the cluster across multiple etcd members.
Let’s see it in action. Imagine you have two K3s masters, master1 and master2. On master1, you start K3s with:
curl -sfL https://get.k3s.io | sh -s - server \
--cluster-init \
--node-ip 192.168.1.100 \
--tls-san 192.168.1.100 \
--write-kubeconfig-mode 0644
The --cluster-init flag is key here. It tells K3s to bootstrap a new etcd cluster. Now, on master2, you start K3s pointing to the existing etcd cluster:
curl -sfL https://get.k3s.io | sh -s - server \
--node-ip 192.168.1.101 \
--server https://192.168.1.100:6443 \
--token <token_from_master1> \
--write-kubeconfig-mode 0644
The --server flag points to master1’s API server, and --token is the registration token K3s uses to allow nodes to join; master1 writes it to /var/lib/rancher/k3s/server/node-token. K3s on master2 will automatically detect that an etcd cluster already exists and join it, becoming a peer. You can verify this by checking the etcd members on master1 (etcdctl is not bundled with K3s, so install it first):
sudo ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 \
--cacert=/var/lib/rancher/k3s/server/tls/etcd/server-ca.crt \
--cert=/var/lib/rancher/k3s/server/tls/etcd/server-client.crt \
--key=/var/lib/rancher/k3s/server/tls/etcd/server-client.key \
member list -w table
You should see both master1 and master2 listed as members of the etcd cluster.
This setup addresses the critical need for a fault-tolerant data store for Kubernetes. Every K3s master still runs its own API server, but what keeps the control plane alive through a failure is etcd’s distributed consensus, which ensures data consistency and availability. If one master node goes down, the remaining etcd peers can continue to serve read and write requests to the cluster state as long as they still form a majority, preventing a complete control plane outage. In practice that means at least three masters: a two-member etcd cluster loses quorum as soon as either member fails. The Kubernetes API server itself, running on each master, keeps no state of its own, so clients can be directed to any healthy master.
The magic behind distributing etcd is Raft. Each etcd member maintains a persistent log of all state changes. The members use Raft to elect a single leader, and a new election happens only when the cluster starts or the current leader stops responding, not on every write. When a write operation occurs, it is routed to the leader, which appends it to its log and proposes the change to the other peers. Once a majority of members (counting the leader) have persisted the entry, it’s committed and applied to the state machine. This consensus mechanism guarantees that all etcd members agree on the order of operations and the final state, even in the presence of network partitions or node failures, as long as a majority of nodes remain operational.
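The majority rule at the heart of Raft’s commit step can be sketched in a few lines of shell. This is only an illustration of the arithmetic, not how etcd implements it; the function name entry_committed is made up for this example.

```shell
#!/usr/bin/env bash
# entry_committed ACKS MEMBERS
# Succeeds (exit status 0) when ACKS members, counting the leader itself,
# have persisted a log entry and that count is a strict majority of MEMBERS.
# Hypothetical helper for illustration; etcd does this internally.
entry_committed() {
  local acks="$1" members="$2"
  [ $(( acks * 2 )) -gt "$members" ]
}

# Three-member cluster: the leader and one follower have the entry.
if entry_committed 2 3; then
  echo "2 of 3 acknowledged: committed"
fi

# Three-member cluster: the leader is cut off from both followers.
if ! entry_committed 1 3; then
  echo "1 of 3 acknowledged: not committed"
fi
```

An isolated leader can still accept a proposal, but it can never commit it, which is why a partitioned minority cannot change the cluster state.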
Consider the --datastore-endpoint flag if you’re not using the embedded etcd. This is how you’d point K3s to an external, pre-existing etcd cluster (for example, a separately deployed and managed etcd cluster). For example, on each master:
curl -sfL https://get.k3s.io | sh -s - server \
--datastore-endpoint https://etcd-0.example.com:2379,https://etcd-1.example.com:2379,https://etcd-2.example.com:2379 \
--datastore-cafile /path/to/etcd-ca.crt \
--datastore-certfile /path/to/k3s-client.crt \
--datastore-keyfile /path/to/k3s-client.key \
--tls-san <master_ip_address>
This allows K3s to leverage existing etcd infrastructure for its state, offering an alternative to managing etcd within the K3s cluster itself.
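The same settings can also live in the K3s configuration file instead of on the command line, where each key is the flag name without the leading dashes. The sketch below writes such a file to a scratch directory so it can be inspected safely; on a real server it would go to /etc/rancher/k3s/config.yaml. The K3S_CONFIG_DIR variable and the scratch directory name are assumptions made for this example, and the endpoints and paths are the same placeholders used above.

```shell
#!/usr/bin/env bash
# Scratch location for the example; K3S_CONFIG_DIR is a hypothetical
# variable, not something K3s reads. On a real server the file belongs
# at /etc/rancher/k3s/config.yaml (root required).
out_dir="${K3S_CONFIG_DIR:-./k3s-config-example}"
mkdir -p "$out_dir"

# Keys mirror the CLI flags; repeatable flags such as tls-san become lists.
cat > "$out_dir/config.yaml" <<'EOF'
datastore-endpoint: "https://etcd-0.example.com:2379,https://etcd-1.example.com:2379,https://etcd-2.example.com:2379"
datastore-cafile: /path/to/etcd-ca.crt
datastore-certfile: /path/to/k3s-client.crt
datastore-keyfile: /path/to/k3s-client.key
tls-san:
  - "<master_ip_address>"
EOF

echo "wrote $out_dir/config.yaml"
```

With the file in place at the real path, the install command shrinks to the plain server invocation, and every master can share the same file apart from its own tls-san entry.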
The --tls-san flag is unrelated to etcd: it adds extra hostnames or IP addresses as Subject Alternative Names to the TLS certificate served by the K3s API server, so that clients can reach the server under those names (a load balancer address or a public IP, for instance) without certificate errors. The TLS connection between K3s and etcd is configured separately. When using embedded etcd, K3s auto-generates the etcd certificates. If you’re using an external etcd, you’ll need to provide your own CA certificate (--datastore-cafile) and client certificates (--datastore-certfile, --datastore-keyfile) that are trusted by your etcd cluster.
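To see which names and addresses ended up in the API server’s serving certificate (the list that --tls-san extends), you can pull the certificate off the wire and print its Subject Alternative Names. This is a sketch that assumes openssl is installed and that port 6443 is reachable from where you run it; show_sans is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# show_sans HOST [PORT]
# Print the Subject Alternative Name section of the certificate served at
# HOST:PORT (default port 6443). Hypothetical helper; requires openssl.
show_sans() {
  local host="$1" port="${2:-6443}"
  # s_client prints the server certificate in PEM form; x509 decodes it.
  openssl s_client -connect "${host}:${port}" </dev/null 2>/dev/null \
    | openssl x509 -noout -text \
    | grep -A1 'Subject Alternative Name'
}
```

Calling show_sans 192.168.1.100 against master1 should list the address passed to --tls-san alongside the entries K3s adds on its own.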
When you add a third master, master3, you’d use the same command as for master2 with master3’s own --node-ip, pointing it at master1 (or master2, as they are now peers):
curl -sfL https://get.k3s.io | sh -s - server \
--node-ip 192.168.1.102 \
--server https://192.168.1.100:6443 \
--token <token_from_master1> \
--write-kubeconfig-mode 0644
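master3 will join the existing etcd cluster, bringing the total number of etcd members to three. With three members, quorum is two, meaning the cluster can tolerate the failure of one etcd member while still maintaining consensus. The earlier two-member cluster could not: its quorum was also two, so losing either member would have halted writes.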
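The relationship between cluster size and fault tolerance is plain majority arithmetic, and a short loop makes it easy to compare sizes. The function names quorum and tolerated are chosen for this example only.

```shell
#!/usr/bin/env bash
# quorum N: members needed for a majority in an N-member etcd cluster.
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# tolerated N: members that can fail while a majority still survives.
tolerated() {
  echo $(( $1 - $(quorum "$1") ))
}

for n in 1 2 3 4 5; do
  printf '%d member(s): quorum %d, tolerates %d failure(s)\n' \
    "$n" "$(quorum "$n")" "$(tolerated "$n")"
done
```

The output shows that two members tolerate no failures and four tolerate only one, the same as three, which is why odd cluster sizes are the usual choice.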
The most surprising aspect for many is that K3s doesn’t strictly require a separate load balancer for the API servers in a multi-master setup when using embedded etcd. The --server flag used by worker nodes and subsequent masters simply points to one of the master nodes, and it is only needed for the initial registration. Once an agent has joined, its built-in client-side load balancer learns the addresses of all masters and fails over to another one if the master it was using becomes unavailable. kubectl gets no such help: a kubeconfig names a single server address, so if that master goes down you have to point the client at another one yourself. The true HA is in the data, not the API endpoint itself.
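For clients that are configured with a single server address, such as kubectl, one simple stopgap is to probe the masters in turn and use the first one that answers. The sketch below only checks that TCP port 6443 accepts a connection, not that the API server is healthy; it assumes bash (for /dev/tcp) and the coreutils timeout command, and first_reachable is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# first_reachable HOST...
# Print the first HOST that accepts a TCP connection on port 6443 and
# return 0; return 1 if none does. Hypothetical helper: this is a
# reachability probe only, not a health check of the API server.
first_reachable() {
  local host
  for host in "$@"; do
    # /dev/tcp is a bash feature; timeout bounds each attempt to 2 seconds.
    if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/6443" 2>/dev/null; then
      echo "$host"
      return 0
    fi
  done
  return 1
}
```

A caller could run first_reachable 192.168.1.100 192.168.1.101 192.168.1.102 and pass the result to kubectl through its --server option, for example --server https://192.168.1.101:6443. A load balancer or virtual IP in front of the masters remains the sturdier answer for anything long-lived.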
The next challenge you’ll encounter is managing client access to the HA control plane, often involving a load balancer in front of the API servers for stable kubectl and application connectivity.