MariaDB delayed replication is your safety net for when things go spectacularly wrong on your primary database. It’s not just about backing up; it’s about having a point-in-time copy that’s intentionally behind, ready to be promoted when your main server takes a nosedive or, more commonly, when a bad query or accidental DROP TABLE wipes out your production data.
Let’s say you’ve accidentally run DELETE FROM users; without a WHERE clause on your primary. If you have a replica configured with a delay, you can stop replication on that replica, promote it to become the new primary, and then fix whatever went wrong on the old primary before re-joining it as a replica.
Here’s how to set it up and recover:
Setting Up Delayed Replication
The core of delayed replication is the CHANGE REPLICATION SOURCE TO command (or CHANGE MASTER TO in older versions) with the SOURCE_DELAY option.
1. Configure SOURCE_DELAY on your replica:
On your replica server, execute:
CALL mysql.rds_set_replication_source_delay(3600); -- Set a 1-hour delay
This command sets the delay in seconds. In this example, the replica will wait for 3600 seconds (1 hour) before applying any events it receives from the primary.
2. Start replication:
CALL mysql.rds_start_replication;
3. Verify the delay:
You can check the status of replication and the delay using:
SHOW REPLICA STATUS\G
Look for these key fields:
Replica_IO_Running: Should beYes.Replica_SQL_Running: Should beYes.Seconds_Behind_Source: This will show the current delay, which should be close to the value you set (e.g., 3595-3600 seconds).
Recovering from a Mistake
When disaster strikes (e.g., accidental DROP DATABASE), here’s the recovery process:
1. Stop replication on the delayed replica:
This is crucial. You don’t want the bad changes to propagate to your recovery point.
CALL mysql.rds_stop_replication;
2. Check the replica’s status to confirm it has stopped:
SHOW REPLICA STATUS\G
Ensure Replica_SQL_Running is No.
3. Identify the exact point of divergence:
You need to know when the mistake happened. The SHOW REPLICA STATUS\G output is your friend here.
Exec_Source_Log_Pos: The position in the primary’s binary log that the replica has just applied.Exec_Source_Log_File: The binary log file corresponding toExec_Source_Log_Pos.
The mistake occurred after this position. You’ll also want to look at the Seconds_Behind_Source value just before you stopped replication; this gives you a rough idea of how far back in time you are.
4. Promote the replica to primary:
This is the critical step where your delayed replica becomes the new source of truth.
-
On the delayed replica:
- Crucially, reset the replication credentials. You don’t want the old primary (which might still be running with bad data) trying to send it logs.
-- If you have binary logging enabled on the replica (which you should for promotion) -- and it's trying to be a primary, it needs a unique server-id. -- If you are promoting it, you might want to stop it from acting as a replica. -- The exact commands depend on your setup, but often involves stopping it and reconfiguring. -- For RDS, you might just promote it via the AWS console. -- For self-hosted, you might need to: -- 1. Stop the MySQL service. -- 2. Edit my.cnf to remove or comment out replication-related settings if it was configured as a replica. -- 3. Ensure it has a unique server-id. -- 4. Start the MySQL service. - Set it as read-write: By default, replicas are read-only.
SET GLOBAL read_only = OFF; - Take a backup of this promoted replica. This is your new, clean primary.
- Crucially, reset the replication credentials. You don’t want the old primary (which might still be running with bad data) trying to send it logs.
-
On the old primary (which is now offline or has bad data):
- Stop replication:
CALL mysql.rds_stop_replication; - Take a backup. This is your snapshot of the "bad" state.
- Clean up the bad data: This is where you fix the mistake. If it was a
DELETEwithoutWHERE, you might need to restore specific tables from a backup taken before the mistake, or carefully re-insert data.
- Stop replication:
5. Reconfigure the old primary as a new replica:
Once the old primary is cleaned up, you can reconfigure it to replicate from the new primary (your promoted delayed replica).
- On the old primary:
- Ensure it has a unique
server-iddifferent from the new primary. - Configure it to replicate from the new primary: You’ll need the new primary’s IP address, a replication user and password, and the correct binary log file and position. You can get the current binary log file and position from the new primary using
SHOW MASTER STATUS;.-- Example for self-hosted MySQL/MariaDB -- First, get the current binlog position from the NEW primary: -- SHOW MASTER STATUS; -- Note down File and Position -- On the OLD primary (now to be a replica): CALL mysql.rds_set_replication_source_delay(0); -- No delay for the new replica CALL mysql.rds_set_replication_source('new_primary_ip', 'replication_user', 'replication_password'); CALL mysql.rds_start_replication;
- Ensure it has a unique
6. Verify replication is working:
On the old primary (now the new replica), check SHOW REPLICA STATUS\G. Seconds_Behind_Source should be low, and Replica_IO_Running and Replica_SQL_Running should both be Yes.
The next hurdle you’ll likely face is ensuring consistent server-ids across your cluster and understanding how to manage replication users and permissions securely when re-establishing replication links.