The most surprising thing about MariaDB Galera is that it’s not a database, but a synchronous multi-master replication plugin that MariaDB loads as a library (Percona XtraDB Cluster and wsrep-patched MySQL builds use the same plugin).

Let’s see it in action. Imagine we have two MariaDB nodes, node1 and node2, already set up with Galera. We want to add a third node, node3, and have it join the existing cluster.

On node1 (which is already part of the cluster), we’d check its Galera configuration. A typical my.cnf or galera.cnf might look like this:

[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_galera_cluster"
wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102"
wsrep_node_name="node1"
wsrep_node_address="192.168.1.101"
wsrep_sst_method=rsync

Here’s what each of these means for our new node3:

  • wsrep_on=ON: This is the fundamental switch. It tells MariaDB to load and use the Galera replication provider. For node3, this must also be ON.
  • wsrep_provider=/usr/lib/galera/libgalera_smm.so: This points to the actual Galera library file. The path might differ depending on your installation, but it’s crucial that node3 has this file and this directive correctly points to it.
  • wsrep_cluster_name="my_galera_cluster": All nodes in a Galera cluster must share the same wsrep_cluster_name. This is how nodes identify which cluster they belong to. If node3 has a different name, the existing nodes will reject its connection and it will fail to join.
  • wsrep_cluster_address="gcomm://192.168.1.101,192.168.1.102": This is the heart of cluster discovery. It’s a comma-separated list of addresses of existing nodes in the cluster. node3 will attempt to connect to one of these addresses to join, so on node3 this should be gcomm://192.168.1.101,192.168.1.102. You don’t need to add node3’s IP to this list on node1 or node2 for node3 to join, because node3 initiates the connection. It is still worth adding it there afterwards, so that node1 or node2 can find node3 if they are restarted later.
  • wsrep_node_name="node3": Each node in a Galera cluster should have a unique wsrep_node_name. This is primarily for identification and logging. It doesn’t affect cluster operation as long as it’s unique.
  • wsrep_node_address="192.168.1.103": This specifies the IP address that this specific node (node3) will use for Galera communication. It should be the IP address of node3. If omitted, Galera tries to guess, which can lead to problems if the node has multiple network interfaces.
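  • wsrep_sst_method=rsync: This defines the State Snapshot Transfer (SST) method. When a new node joins, it needs a full copy of the data. rsync is a common, simple method, but the donor node is blocked for writes while the transfer runs. The other common option on MariaDB is mariabackup, which keeps the donor available; xtrabackup-v2 is the equivalent for Percona XtraDB Cluster. The method chosen must be installed and configured on all nodes. With rsync you do not need a permanently running rsync daemon: the SST script starts one on the joiner for the duration of the transfer, so the rsync binary must be installed on every node and the SST port (4444 by default) must be reachable on node3.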
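
Putting the directives above together, node3’s configuration would differ from node1’s only in its node name and node address. The following is a minimal sketch, not a definitive setup: the helper name render_galera_cnf is hypothetical, and the provider path and cluster name are simply copied from the example above.

```shell
#!/bin/sh
# Hypothetical helper: print a [galera] section for one node.
# Arguments: node name, node IP, comma-separated list of existing members.
# The provider path and cluster name are copied from the node1 example;
# adjust them to match your installation.
render_galera_cnf() {
    node_name=$1
    node_addr=$2
    members=$3
    cat <<EOF
[galera]
wsrep_on=ON
wsrep_provider=/usr/lib/galera/libgalera_smm.so
wsrep_cluster_name="my_galera_cluster"
wsrep_cluster_address="gcomm://${members}"
wsrep_node_name="${node_name}"
wsrep_node_address="${node_addr}"
wsrep_sst_method=rsync
EOF
}
```

Calling render_galera_cnf node3 192.168.1.103 192.168.1.101,192.168.1.102 prints the section for node3; where you redirect it (for example a galera.cnf under your distribution’s MariaDB include directory) depends on the installation.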

To bootstrap a new cluster, you start the first node so that it forms a new primary component instead of trying to join an existing one: either run the galera_new_cluster script (on systemd-based installations) or start the mysqld daemon with --wsrep-new-cluster. For example, on node1 if it were the very first node:

mysqld_safe --wsrep-new-cluster &

Once node1 is up and running as a single-node cluster, subsequent nodes (node2, node3, etc.) join by pointing to node1’s address in their wsrep_cluster_address.

The actual data synchronization happens via the wsrep_sst_method. When node3 starts and connects to node1, node1 (or whichever node is designated as the SST donor) will initiate an SST to node3. For rsync, this means rsync will be invoked to copy the entire data directory from the donor to node3. Once the SST is complete, node3 can start processing transactions.

The wsrep_provider_options directive allows finer-grained tuning of Galera’s internal behavior. For example, evs.keepalive_period=PT3S controls how often nodes send keepalive packets when there is no other traffic. Durations are written in ISO 8601 form, and the default is PT1S (one second). Increasing this value can reduce network chatter but might delay detection of a failed node. Another common option is gcache.size, which sets the size of the write-set cache (GCache), an on-disk ring buffer of recently replicated transactions. A larger gcache.size lets a node that has been offline for longer catch up through an Incremental State Transfer (IST) instead of a full SST, at the cost of more disk space. For instance, gcache.size=1G sets it to 1 gigabyte (the default is 128M).
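
wsrep_provider_options takes all of its settings as a single string of key=value pairs separated by semicolons, which is easy to get wrong by hand. A small sketch of assembling that line, assuming a hypothetical helper named join_provider_options (it does not validate the option names or values it is given):

```shell
#!/bin/sh
# Hypothetical helper: join key=value arguments with ';' and print
# them as one wsrep_provider_options line for the config file.
join_provider_options() {
    result=""
    for opt in "$@"; do
        if [ -z "$result" ]; then
            result=$opt
        else
            result="${result};${opt}"
        fi
    done
    printf 'wsrep_provider_options="%s"\n' "$result"
}
```

For example, join_provider_options gcache.size=1G evs.keepalive_period=PT3S prints wsrep_provider_options="gcache.size=1G;evs.keepalive_period=PT3S". Note that Galera expects durations such as the keepalive period in ISO 8601 form (PT3S is three seconds).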

When node3 successfully joins the cluster, its wsrep_cluster_status will change from Disconnected to Primary. You can check this with:

SHOW STATUS LIKE 'wsrep_cluster_status';
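
wsrep_cluster_status on its own only says that the node belongs to the primary component. To confirm that node3 has fully joined, it helps to also look at wsrep_cluster_size (expected to be 3 here) and wsrep_local_state_comment (expected to be Synced). A minimal sketch that checks all three, assuming a hypothetical helper named galera_node_joined that reads tab-separated name/value pairs on standard input:

```shell
#!/bin/sh
# Hypothetical helper: read "name<TAB>value" lines on stdin, as produced by
#   mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'"
# and succeed only if the node is a synced member of a Primary component
# with the expected number of nodes (first argument).
galera_node_joined() {
    expected_size=$1
    awk -F '\t' -v want="$expected_size" '
        $1 == "wsrep_cluster_status"      { status = $2 }
        $1 == "wsrep_cluster_size"        { size = $2 }
        $1 == "wsrep_local_state_comment" { state = $2 }
        END {
            ok = (status == "Primary" && size == want && state == "Synced")
            exit (ok ? 0 : 1)
        }'
}
```

On node3 this could be used as mysql -N -B -e "SHOW GLOBAL STATUS LIKE 'wsrep_%'" | galera_node_joined 3 && echo "node3 joined", with whatever credentials your mysql client needs.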

The most common pitfall when configuring Galera providers is misconfiguration of the wsrep_cluster_address. If node3 cannot reach any of the IPs listed, or if the wsrep_cluster_name doesn’t match, it will fail to join. Another frequent issue is the SST method not being installed or reachable between nodes, especially when firewall rules block the necessary ports: by default 3306 for MariaDB clients, 4567 for Galera replication (TCP, plus UDP if multicast is used), 4568 for IST, and 4444 for SST, including rsync-based SST.
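
A quick way to rule out firewall problems is to probe a peer on Galera’s default TCP ports: 3306 (clients), 4567 (replication), 4568 (IST) and 4444 (SST). The sketch below assumes those defaults have not been changed and that your nc (netcat) build supports -z and -w; both helper names are hypothetical.

```shell
#!/bin/sh
# Default Galera-related TCP ports (assumed unchanged from the defaults):
# 3306 client, 4567 replication, 4568 IST, 4444 SST.
galera_default_ports() {
    echo "3306 4567 4568 4444"
}

# Hypothetical helper: report which default ports on a peer accept
# TCP connections. Returns non-zero if any port could not be reached.
# Assumes an nc build that supports -z (probe only) and -w (timeout).
check_galera_ports() {
    host=$1
    failed=0
    for port in $(galera_default_ports); do
        if nc -z -w 2 "$host" "$port" >/dev/null 2>&1; then
            echo "$host:$port reachable"
        else
            echo "$host:$port NOT reachable"
            failed=1
        fi
    done
    return "$failed"
}
```

Running check_galera_ports 192.168.1.101 from node3 gives a first indication. Keep in mind that ports 4444 and 4568 only have a listener while a transfer is in progress, so a "NOT reachable" result for them outside an SST or IST does not by itself prove a firewall problem.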

After successfully configuring all nodes and joining them, the next challenge you’ll likely face is managing network partitions and understanding Galera’s quorum behavior.
