The most surprising thing about MariaDB Galera is that it’s not a database, but a (virtually) synchronous multi-master replication plugin that can attach to MariaDB (or Percona Server, or MySQL).
Let’s see it in action. Imagine we have two MariaDB nodes, node1 and node2, already set up with Galera. We want to add a third node, node3, and have it join the existing cluster.
On node1 (which is already part of the cluster), we’d check its Galera configuration. A typical my.cnf or galera.cnf might look like this:
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_galera_cluster"
wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102"
wsrep_node_name="node1"
wsrep_node_address="192.168.1.101"
wsrep_sst_method=rsync
Here’s what each of these means for our new node3:
wsrep_on=ON: This is the fundamental switch. It tells MariaDB to enable Galera replication through the configured provider. For node3, this must also be ON.
wsrep_provider=/usr/lib/galera/libgalera_smm.so: This points to the actual Galera library file. The path might differ depending on your installation, but it’s crucial that node3 has this file and that this directive points to it.
wsrep_cluster_name="my_galera_cluster": All nodes in a Galera cluster must share the same wsrep_cluster_name. This is how nodes identify which cluster they belong to. If node3 has a different name, the existing nodes will reject its connection and it will fail to join.
wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102": This is the heart of cluster discovery. It’s a comma-separated list of addresses of existing nodes in the cluster. node3 will attempt to connect to one of these addresses to join, so for node3 this should be gcomm://192.168.1.101,192.168.1.102. You don’t need to add node3’s IP to this list on node1 or node2 for node3 to join, because node3 initiates the connection. The list is only read at startup, so it is still worth adding node3 there later so that a restarted node1 or node2 can find it.
wsrep_node_name="node3": Each node in a Galera cluster should have a unique wsrep_node_name. This is primarily for identification and logging. It doesn’t affect cluster operation as long as it’s unique.
wsrep_node_address="192.168.1.103": This specifies the IP address that this specific node (node3) will use for Galera communication. It should be the IP address of node3. If omitted, Galera tries to guess, which can lead to problems if the node has multiple network interfaces.
wsrep_sst_method=rsync: This defines the State Snapshot Transfer (SST) method. When a new node joins, it needs a full copy of the data. rsync is a common, simple method. Other options include xtrabackup-v2 (for Percona XtraDB Cluster) or mariabackup. The method chosen must be available on all nodes. If rsync is used, the rsync binary must be installed on every node; the SST script starts its own rsync process on the joining node (port 4444 by default), so no permanently running rsync daemon is needed.
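Putting those values together, node3’s configuration can be sketched as follows. The helper name render_galera_cnf is made up for this example, and the provider path and addresses are simply the ones used above; adjust them to your installation.

```shell
# Hypothetical helper: print a [galera] section for one node.
# $1 = wsrep_node_name, $2 = wsrep_node_address.
# Cluster name, cluster address and provider path are the example
# values from this article, not universal defaults.
render_galera_cnf() {
    node_name="$1"
    node_addr="$2"
    cat <<EOF
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_galera_cluster"
wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102"
wsrep_node_name="$node_name"
wsrep_node_address="$node_addr"
wsrep_sst_method=rsync
EOF
}
```

Running render_galera_cnf node3 192.168.1.103 prints the section for node3; redirect the output to wherever your distribution keeps its MariaDB configuration snippets.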
To bootstrap a new cluster, you start the first node without any peers to connect to, either by running the galera_new_cluster script (on systemd-based installations) or by starting the mysqld daemon with --wsrep-new-cluster. For example, on node1 if it were the very first node:
mysqld_safe --wsrep-new-cluster &
Once node1 is up and running as a single-node cluster, subsequent nodes (node2, node3, etc.) join by pointing to node1’s address in their wsrep_cluster_address.
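The ordering matters: only the first node is bootstrapped, and every other node is started normally. A small sketch that prints the start command per node, assuming a systemd installation where the service is called mariadb (the function name print_start_plan is invented here, and nothing is actually started):

```shell
# Hypothetical helper: print which command to run on each node when
# bringing up a brand-new cluster. The first argument is the node that
# bootstraps; all others join through their wsrep_cluster_address.
# Assumes systemd and a service named "mariadb".
print_start_plan() {
    first=1
    for node in "$@"; do
        if [ "$first" -eq 1 ]; then
            echo "$node: galera_new_cluster"
            first=0
        else
            echo "$node: systemctl start mariadb"
        fi
    done
}
```

Calling print_start_plan node1 node2 node3 lists node1 with galera_new_cluster and the other two with a plain service start. Bootstrapping a second node by mistake would create a second, separate cluster.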
The actual data synchronization happens via the method set in wsrep_sst_method. When node3 starts and connects to the cluster, one of the existing nodes (node1 or node2) is selected as the SST donor and transfers a state snapshot to node3. For rsync, this means the entire data directory is copied from the donor to node3, and the donor is blocked for writes while the copy runs. Once the SST is complete, node3 applies the write-sets it received in the meantime and can then start processing transactions.
The wsrep_provider_options directive allows finer-grained tuning of Galera’s internal behavior. For example, evs.keepalive_period controls how often a node sends keepalive messages to its peers when there is no other traffic. The value is written as an ISO 8601 duration: the default is PT1S (one second), and evs.keepalive_period=PT3S raises it to three seconds. Increasing this value can reduce network chatter but might delay detection of a failed node. Another common option is gcache.size, which determines the size of the write-set cache (GCache), a ring-buffer file in which Galera keeps recently replicated write-sets. A larger gcache.size lets a node that has been offline for a while catch up through an Incremental State Transfer (IST) instead of a full SST, but it consumes more disk space. For instance, gcache.size=1G sets it to 1 gigabyte (the default is 128M).
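Multiple provider options go into a single wsrep_provider_options value, separated by semicolons. As a rough illustration, the helper below (build_provider_options is a name made up for this sketch) joins key=value pairs into that line:

```shell
# Hypothetical helper: join key=value pairs with ';' and print the
# resulting wsrep_provider_options line for the config file.
build_provider_options() {
    opts=""
    for kv in "$@"; do
        if [ -z "$opts" ]; then
            opts="$kv"
        else
            opts="$opts;$kv"
        fi
    done
    printf 'wsrep_provider_options="%s"\n' "$opts"
}

build_provider_options evs.keepalive_period=PT3S gcache.size=1G
# prints: wsrep_provider_options="evs.keepalive_period=PT3S;gcache.size=1G"
```

The helper does not validate option names or values; Galera itself does that when the provider is loaded.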
When node3 successfully joins the cluster, its wsrep_cluster_status will change from Disconnected to Primary. You can check this with:
SHOW STATUS LIKE 'wsrep_cluster_status';
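wsrep_cluster_status alone does not tell you whether node3 has finished syncing, so it helps to look at wsrep_local_state_comment and wsrep_cluster_size as well. The sketch below is one way to combine them. The function name node_is_joined is invented for this example, and it expects tab-separated name/value pairs on standard input, which is what the mysql client prints in batch mode.

```shell
# Hypothetical helper: read tab-separated wsrep status variables from
# stdin and succeed only if the node is in the Primary component, is
# Synced, and sees the expected number of nodes.
# $1 = expected cluster size (3 once node3 has joined).
node_is_joined() {
    expected_size="$1"
    awk -F '\t' -v want="$expected_size" '
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_local_state_comment" { state = $2 }
        $1 == "wsrep_cluster_size"        { size = $2 }
        END { exit !(status == "Primary" && state == "Synced" && size == want) }
    '
}
```

On node3 it could be used like this: mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | node_is_joined 3 && echo "node3 is in the cluster".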
The most common pitfall when configuring Galera is a misconfigured wsrep_cluster_address. If node3 cannot reach any of the IPs listed, or if the wsrep_cluster_name doesn’t match, it will fail to join. Another frequent issue is the SST method not being properly configured or accessible between nodes, especially firewall rules blocking the necessary ports (by default 3306 for MariaDB client connections, 4567 for Galera replication traffic, 4568 for IST, and 4444 for SST, which is also the port the rsync method uses).
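A quick reachability probe from node3 toward an existing node can narrow down firewall problems. The following is a sketch only: the function names are invented, it assumes bash and the coreutils timeout command are installed, and it uses the default ports listed above. Ports 4568 and 4444 normally have a listener only while a transfer is in progress, so a refused connection on those is expected on an idle node, whereas a timeout is more suggestive of a firewall silently dropping packets.

```shell
# Default Galera-related ports: client, replication, IST, SST.
galera_ports() {
    echo "3306 4567 4568 4444"
}

# Hypothetical helper: try a TCP connect to each port on host $1.
# Uses bash's /dev/tcp pseudo-device; timeout exits with 124 when the
# attempt does not complete within 3 seconds.
check_galera_ports() {
    host="$1"
    for port in $(galera_ports); do
        if timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null; then
            echo "$host:$port open"
        else
            rc=$?
            if [ "$rc" -eq 124 ]; then
                echo "$host:$port timed out (possibly firewalled)"
            else
                echo "$host:$port refused or unreachable (no listener?)"
            fi
        fi
    done
}
```

For example, check_galera_ports 192.168.1.101 run on node3 probes node1. If 3306 and 4567 are open there, the basic path for joining is in place.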
After successfully configuring all nodes and joining them, the next challenge you’ll likely face is managing network partitions and understanding Galera’s quorum behavior.