Semi-synchronous replication in MariaDB ensures that a committed transaction has been received by at least one replica before the primary acknowledges the commit to the client. This greatly reduces the risk of losing data if the primary crashes or becomes unavailable.
Here’s a look at how it works and how to set it up:
How it Works
Normally, MariaDB asynchronous replication works like this:
- Client commits a transaction on the primary.
- Primary writes to its own binary log, then immediately acknowledges the commit to the client.
- Replica(s) read the binary log from the primary and apply the transactions.
The problem is that if the primary crashes after acknowledging the commit but before any replica has received that transaction, the data exists only on the failed primary and may be lost for good.
Semi-synchronous replication adds a crucial step:
- Client commits a transaction on the primary.
- Primary writes to its own binary log, then sends the transaction data to at least one replica.
- Replica(s) receive the transaction data, write it to their relay log, and acknowledge receipt back to the primary.
- Primary waits for at least one replica’s acknowledgment before acknowledging the commit to the client.
This ensures that at least one replica has a copy of the committed data before the client even knows the commit was successful.
Setting Up Semi-Synchronous Replication
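In MariaDB 10.3.3 and later, semi-synchronous replication is built into the server, so there is nothing to install. In MariaDB 10.3.2 and earlier, it is provided by the rpl_semi_sync_master and rpl_semi_sync_slave plugins.
1. Install the Plugins (MariaDB 10.3.2 and earlier only)
On older versions, the plugin libraries ship with the standard packages but are not loaded by default. You can check whether they are loaded with:
SHOW PLUGINS;
If they are not listed, load them as described in the next two steps. On MariaDB 10.3.3 and later they never appear in this list, because the feature is part of the server itself, and you can skip the "Load the plugin" steps below.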
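Because the plugin list looks different depending on the server version, it can help to check for the semi-sync system variables as well. The following is a minimal sketch, assuming a MariaDB server and an account allowed to read information_schema; if the first statement returns rows, the feature is available, whether built in or loaded as a plugin.

```sql
-- Lists the semi-sync settings if the feature is available on this server.
SHOW GLOBAL VARIABLES LIKE 'rpl_semi_sync%';

-- Shows the semi-sync plugins only when they were loaded as plugins
-- (this is expected to return no rows on servers where the feature is built in).
SELECT PLUGIN_NAME, PLUGIN_STATUS, PLUGIN_LIBRARY
FROM information_schema.PLUGINS
WHERE PLUGIN_NAME LIKE 'rpl_semi_sync%';
```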
2. Configure the Primary (Master)
On your primary server, you need to enable the semi-sync master functionality and configure it.
- Load the plugin (MariaDB 10.3.2 and earlier only):
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master';
- Configure the plugin:
The key setting is rpl_semi_sync_master_timeout. This is the time (in milliseconds) the primary will wait for a replica acknowledgment before it gives up and reverts to asynchronous mode. If you set it too low, a brief network blip is enough to drop the primary out of semi-synchronous mode. If you set it too high, commits can stall for that long whenever no replica responds. The default is 10000 (10 seconds); 5000 (5 seconds) is a reasonable starting point.
SET GLOBAL rpl_semi_sync_master_timeout = 5000;
You also need to set rpl_semi_sync_master_enabled to ON.
SET GLOBAL rpl_semi_sync_master_enabled = ON;
To make these settings persistent across restarts, add them to your my.cnf or my.ini file under the [mariadb] or [mysqld] section:
[mariadb]
plugin_load_add = semisync_master
rpl_semi_sync_master_timeout = 5000
rpl_semi_sync_master_enabled = ON
On MariaDB 10.3.3 and later, omit the plugin_load_add line.
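After changing the settings, it is worth confirming what the running server actually uses, since a typo in the configuration file can go unnoticed. A minimal check, run on the primary (the column aliases are only illustrative):

```sql
-- Reads the live values of the two primary-side settings discussed above.
SELECT @@GLOBAL.rpl_semi_sync_master_enabled AS semi_sync_enabled,
       @@GLOBAL.rpl_semi_sync_master_timeout AS timeout_ms;
```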
3. Configure the Replica (Slave)
On your replica server(s), you need to enable the semi-sync slave functionality.
- Load the plugin (MariaDB 10.3.2 and earlier only):
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave';
- Enable the plugin:
SET GLOBAL rpl_semi_sync_slave_enabled = ON;
For persistence, add to your my.cnf or my.ini:
[mariadb]
plugin_load_add = semisync_slave
rpl_semi_sync_slave_enabled = ON
On MariaDB 10.3.3 and later, omit the plugin_load_add line.
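A replica announces itself as semi-sync capable when its I/O thread connects to the primary. If replication was already running when you enabled the setting, the change only takes effect after the I/O thread reconnects. The sketch below assumes an already configured replication connection and restarts only the I/O thread:

```sql
-- Run on the replica: reconnect to the primary so it registers as semi-sync.
STOP SLAVE IO_THREAD;
START SLAVE IO_THREAD;

-- Should report ON once the replica is operating in semi-sync mode.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave_status';
```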
4. Monitor Semi-Synchronous Replication
You can check the status of your semi-synchronous replication using these status variables:
- On the primary:
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master%';
Key variables to watch:
  - Rpl_semi_sync_master_clients: Number of semi-sync replicas currently connected.
  - Rpl_semi_sync_master_status: ON while the primary is operating in semi-synchronous mode, OFF when it has fallen back to asynchronous mode.
  - Rpl_semi_sync_master_yes_tx: Number of commits acknowledged by a replica.
  - Rpl_semi_sync_master_no_tx: Number of commits that were not acknowledged by a replica.
  - Rpl_semi_sync_master_get_ack: Number of acknowledgments received by the primary.
  - Rpl_semi_sync_master_net_wait_time: Total time (in microseconds) spent waiting for acknowledgments.
  - Rpl_semi_sync_master_net_avg_wait_time: Average time (in microseconds) spent waiting for acknowledgments.
  - Rpl_semi_sync_master_no_times: Number of times the primary turned off semi-synchronous replication, for example after a timeout.
- On the replica:
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_slave%';
Key variables to watch:
  - Rpl_semi_sync_slave_status: ON when semi-synchronous replication is active on this replica.
  - Rpl_semi_sync_slave_send_ack: Number of acknowledgments sent by this replica.
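To turn the raw counters into a single health figure, you can compare acknowledged and unacknowledged commits. This is a sketch, assuming the Rpl_semi_sync_master_yes_tx and Rpl_semi_sync_master_no_tx counters and MariaDB's information_schema.GLOBAL_STATUS table; the column aliases are illustrative.

```sql
-- Run on the primary: commits acknowledged by a replica vs. commits that were not.
SELECT
  SUM(IF(VARIABLE_NAME = 'RPL_SEMI_SYNC_MASTER_YES_TX',
         CAST(VARIABLE_VALUE AS UNSIGNED), 0)) AS acked_commits,
  SUM(IF(VARIABLE_NAME = 'RPL_SEMI_SYNC_MASTER_NO_TX',
         CAST(VARIABLE_VALUE AS UNSIGNED), 0)) AS unacked_commits
FROM information_schema.GLOBAL_STATUS
WHERE VARIABLE_NAME IN ('RPL_SEMI_SYNC_MASTER_YES_TX',
                        'RPL_SEMI_SYNC_MASTER_NO_TX');
```

A steadily growing unacked_commits value suggests the primary is spending time in asynchronous mode.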
Important Considerations:
- Network Latency: Semi-synchronous replication adds latency to your commits because the primary waits for acknowledgment. High network latency between your primary and replicas will directly impact your application’s write performance.
- Replica Availability: If all your replicas become unavailable, the primary will eventually time out (based on rpl_semi_sync_master_timeout) and revert to asynchronous replication. This is a safety mechanism to prevent your primary from halting all writes.
- Multiple Replicas: For true data loss prevention, you should ideally have at least two replicas. This way, if one replica goes down, the primary can still receive an acknowledgment from the other.
- Failover: Semi-synchronous replication significantly improves the safety of failover. When you promote a replica, you can be much more confident that it has all the transactions the old primary committed.
The One Thing Most People Don’t Know
When the primary times out waiting for an acknowledgment, it doesn’t stop accepting writes. Instead, it logs a message in the error log and switches to asynchronous replication. The transaction that timed out is still committed and acknowledged to the client. Once a semi-sync replica reconnects and catches up, the primary switches back to semi-synchronous mode automatically. This fallback is always available: MariaDB has no setting that makes the primary refuse commits outright, and the closest equivalent is a very large rpl_semi_sync_master_timeout. A related variable, rpl_semi_sync_master_wait_no_slave (which defaults to ON), controls what happens when the number of connected semi-sync replicas drops to zero. If it is ON, the primary still waits for the timeout period before falling back. If it is OFF, the primary reverts to asynchronous replication as soon as the last semi-sync replica disconnects.
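Because the fallback happens silently from the application's point of view, it is useful to check for it explicitly. A minimal sketch, assuming the standard MariaDB status variables Rpl_semi_sync_master_status and Rpl_semi_sync_master_no_times:

```sql
-- Run on the primary.
-- OFF here means commits are currently NOT waiting for a replica acknowledgment.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';

-- Counts how often the primary has turned semi-sync off since startup.
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_no_times';
```

Alerting on either value gives early warning that the durability guarantee is temporarily reduced.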
Next Steps
Once you have semi-synchronous replication working, the next logical step is to explore more advanced replication topologies, such as multi-source replication or setting up read replicas for load balancing.