MySQL 5.6 Delayed Replication – Making a Slave Deliberately Lag Behind a Master

In the majority of MySQL replication scenarios, you want your slave databases to be a mirror of your master databases. You usually don’t want your slave to be behind your master by more than a few seconds – and your main goal is for your slave to always be in sync with your master. Would you ever want your slave to deliberately be a few seconds, minutes or even hours behind your master? There have been several suggestions from MySQL users over the years regarding this functionality as “feature request” (even though most of the requests were submitted as MySQL “bugs”, which was the easiest way to submit such a request).

The first request (that I could find) was by Jason Garrett, back in August of 2006, and was logged as “bug 21639″. Jason wanted MySQL to “provide a parameter/setting which allows an administrator to specify how many seconds a replication slave will be behind a replication master. This will have the effect of delaying the replcation slave for this number of seconds. This is useful in circumstances where the replication master is at risk of major data change/loss, and allow an administrator to intervene and isolate the slave from the impact. ie. In an environment where an adminstrator may accidentally drop a table.”

Anders Henke followed up the next month (September 2006) with this request (bug 22072) – “Sometimes it’s also very nice being able to look on how some specific record did look last week – before you’ve made those critical changes yesterday, which might (or might not) relate to some issue being reported to you. And of course also without reloading that large database from backups.”

There have been a couple more requests related to delayed replication, and while there were some work-arounds, it is now a feature of 5.6.

On the MySQL web site (in blue below), it gives three examples of why you would want to use time-delayed replication (including what Kay and Anders wanted to do).

Scenario #1 – To protect against user mistakes on the master. A DBA can roll back a delayed slave to the time just before the disaster.

If you have worked with databases for any period of time, I am sure that you have had the experience of accidentally deleting some rows or truncating a table. And, if your database hasn’t had a recent backup, then that data could be lost. And, if you have a fast slave, then that errant command will be executed before you can figure out what you just did. So, you can then forget about using your slave as a backup for that lost data. By setting your slave to lag behind your master for 10 minutes (more or less), you should have enough time to go to the slave, stop it, and export the data that you just lost for import into the master database. Of course you can set this delay for a longer or shorter period, but it should be long enough for you to do what you need to do in order to at least stop the replication event from executing on the slave.

Scenario #2 – To test how the system behaves when there is a lag. For example, in an application, a lag might be caused by a heavy load on the slave. However, it can be difficult to generate this load level. Delayed replication can simulate the lag without having to simulate the load. It can also be used to debug conditions related to a lagging slave.

You can test how your particular application and system behaves when the slave is lagging behind the master. Normally you would generate a huge load on the master so that the slave is bogged down and thus a lag is generated. By delaying the slave, you can simulate this lag without having to generate the load on the master. You can also use this to debug any applications or conditions that are related to a lagging slave.

Scenario #3 – To inspect what the database looked like long ago, without having to reload a backup. For example, if the delay is one week and the DBA needs to see what the database looked like before the last few days’ worth of development, the delayed slave can be inspected.

Having a slave delayed by a long time – for example, a few days or longer – can give you a snapshot of what the database looked like before the last round of database development. If you are working on an application and are making changes that effect the database, you can compare the two database states. You might wonder “how would the application perform if I made these changes to the database?”. You could then make the changes and test the application on both servers to see the differences.

To set the amount of time that your slave will log behind the master, simply execute this command on the slave (you have to stop the slave first):

CHANGE MASTER TO MASTER_DELAY = N;

- where “N” is the number of seconds that you want the slave to lag behind the master.

On a slave, let’s execute this statement so that the slave is 10 minutes (600 seconds) behind the master. But, for comparison, let’s do a “show slave status\G” on the slave before we make the change:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.121
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000019
          Read_Master_Log_Pos: 798706
               Relay_Log_File: WEB_SERVER_01-relay-bin.000052
                Relay_Log_Pos: 63004
        Relay_Master_Log_File: mysql-bin.000019
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 798706
              Relay_Log_Space: 63491
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 4
                  Master_UUID: bf0fc6b6-3b3a-11e2-99fd-32f021d3be40
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 6CD03F68-3B38-11E2-99FA-588CB3DE3E9D:1-6:8-24:26-1497,
BF0FC6B6-3B3A-11E2-99FD-32F021D3BE40:1-1036
            Executed_Gtid_Set: 6CD03F68-3B38-11E2-99FA-588CB3DE3E9D:1-6:8-11:13-16:18:20-24:26-1497,
BF0FC6B6-3B3A-11E2-99FD-32F021D3BE40:1-1036,
E6DF8108-4BB2-11E2-AB25-ECEFE5B0C1B4:1-378
1 row in set (0.00 sec)

Now, let’s execute the statement: (we have to stop the slave first, execute the statement, and restart the slave)

mysql> STOP SLAVE;
Query OK, 0 rows affected (0.07 sec)

mysql> CHANGE MASTER TO MASTER_DELAY = 600;
Query OK, 0 rows affected (0.06 sec)

mysql> START SLAVE;
Query OK, 0 rows affected (0.01 sec)

Here is the “show slave status\G” command after the statement was executed:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.121
                  Master_User: replicate
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000019
          Read_Master_Log_Pos: 798706
               Relay_Log_File: WEB_SERVER_01-relay-bin.000002
                Relay_Log_Pos: 314
        Relay_Master_Log_File: mysql-bin.000019
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 798706
              Relay_Log_Space: 526
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 4
                  Master_UUID: bf0fc6b6-3b3a-11e2-99fd-32f021d3be40
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 600
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 6CD03F68-3B38-11E2-99FA-588CB3DE3E9D:1-6:8-11:13-16:18:20-24:26-1497,
BF0FC6B6-3B3A-11E2-99FD-32F021D3BE40:1-1036,
E6DF8108-4BB2-11E2-AB25-ECEFE5B0C1B4:1-378
1 row in set (0.01 sec)

SHOW SLAVE STATUS has three fields that provide information about the delay, and you can see their values in the above output: (from http://dev.mysql.com/doc/refman/5.6/en/replication-delayed.html in blue)

1. SQL_Delay: A nonnegative integer indicating the number of seconds that the slave must lag the master.

                    SQL_Delay: 600

You can see above that it has changed to 600.

2. SQL_Remaining_Delay: When Slave_SQL_Running_State is Waiting until MASTER_DELAY seconds after master executed event, this field contains an integer indicating the number of seconds left of the delay. At other times, this field is NULL.

The value of “SQL_Remaining_Delay” from my “show slave status\G” command shows “NULL“.

          SQL_Remaining_Delay: NULL

The value is “NULL” because nothing has changed on the master that hasn’t been updated on the slave, so it doesn’t show the remaining seconds that the slave will lag behind the master. I changed a record on the master, executed the “show slave status\G” command again, and you can see the SQL_Remaining_Delay status changed:

                    SQL_Delay: 600
          SQL_Remaining_Delay: 592
      Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event

Also, if you are running the Enterprise version of MySQL and are using MySQL Enterprise Monitor (MEM), you will now see a change in the “Time Behind” column under the replication tab for this particular server, which is labeled as VM-Mac-1081-123.

Here is a snapshot from MEM before I updated the master:

And here is a screenshot from MEM after I updated the master, which shows the slave starting to lag behind the master, as it is waiting 600 seconds before applying the update:

Once the slave has executed all statements on the master, in MEM you will see that the “Time Behind” column for VM-Mac-1081-123 will reset back to 00:00:00. This does not mean that the slave isn’t still lagging behind the master by 600 seconds. It just shows that the slave is current with the master.

In my MEM example, I am running a slave (VM-Mac-1081-123) off another slave (with IP address of 192.168.1.121), which explains why MEM is stating that 192.168.1.121 is not being monitored. That MySQL instance is being monitored, but it is a slave to another master (192.168.1.2), and is being monitored below the screenshot that I provided and is not shown in this example.

3. Slave_SQL_Running_State: A string indicating the state of the SQL thread (analogous to Slave_IO_State). The value is identical to the State value of the SQL thread as displayed by SHOW PROCESSLIST.

You can see that the Slave_SQL_Running_State value from my “show slave status\G” command is as follows:

      Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event

The above value shows that we have pending statements on the master that have not been applied to the slave.

Let’s run the SHOW PROCESSLIST statement, but at this time we do not have any pending statements on the master: (I removed all other processes that aren’t related to this discussion)

mysql> show processlist;
+-----+-------------+-----------------+-------+---------+------+-----------------------------------------------------------------------------+------------------+
| Id  | User        | Host            | db    | Command | Time | State                                                                       | Info             |
+-----+-------------+-----------------+-------+---------+------+-----------------------------------------------------------------------------+------------------+
| 130 | system user |                 | NULL  | Connect |  379 | Slave has read all relay log; waiting for the slave I/O thread to update it | NULL             |
| 129 | system user |                 | NULL  | Connect |  379 | Waiting for master to send event                                            | NULL             |
+-----+-------------+-----------------+-------+---------+------+-----------------------------------------------------------------------------+------------------+

So, even with the slave delay, since we do not have any pending statements from the master, the value of “Slave_SQL_Running_State” looks like the normal value that you have when you are not running a delay and the slave is current with the master. Now I will change something on the master, and re-run the SHOW PROCESSLIST statement:

mysql> show processlist;
+-----+-------------+-----------------+-------+---------+------+----------------------------------------------------------------+------------------+
| Id  | User        | Host            | db    | Command | Time | State                                                          | Info             |
+-----+-------------+-----------------+-------+---------+------+----------------------------------------------------------------+------------------+
| 130 | system user |                 | NULL  | Connect | 1784 | Waiting until MASTER_DELAY seconds after master executed event | NULL             |
| 129 | system user |                 | NULL  | Connect | 2503 | Waiting for master to send event                               | NULL             |
+-----+-------------+-----------------+-------+---------+------+----------------------------------------------------------------+------------------+

You can see that the value of “State” has changed from:

Slave has read all relay log; waiting for the slave I/O thread to update it

to:

Waiting until MASTER_DELAY seconds after master executed event“.

After our MASTER_DELAY value of 600 seconds has passed, the value of “State” will revert back.

And when I run my “show slave status\G” statement again, the value for “Slave_SQL_Running_State” has changed, since I have a pending statement on the master:

      Slave_SQL_Running_State: Waiting until MASTER_DELAY seconds after master executed event

I now have a slave that is lagging 10 minutes behind the master. If I ever have a situation where I accidentally delete some data, I now have 10 minutes to stop the slave and try to retrieve my data from the slave. In a situation like this, I could try to stop the slave from executing the delete statement, but it might be easier to just export the data I need and re-import it back into the master. My deletion command will eventually be executed on the slave, but so will my importing of the deleted data.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
About these ads

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 33 other followers

%d bloggers like this: