Using MySQL Utilities Workbench Script mysqldbcompare To Compare Two Databases In Replication

In my last two posts, I wrote about setting up replication with MySQL 5.6 using Global Transaction Identifiers. Even when I set up replication “the old-fashioned way”, one thought always enters my mind – did all of the data copy over to the slave? And even after the master/slave pair has been running for a while, I still wonder whether the data on the slave matches the master. Did the change that I made to that table make it over to the slave? This is probably paranoia on my part, as MySQL replication is very reliable and works really well.

A few months ago, I started writing about the MySQL Utilities. If you haven’t heard about the MySQL Utilities:

“MySQL Utilities is a package of utilities that are used for maintenance and administration of MySQL servers. These utilities encapsulate a set of primitive commands, and bundles them so they can be used to perform macro operations with a single command. MySQL Utilities may be installed via MySQL Workbench, or as a standalone package. The utilities are written in Python, available under the GPLv2 license, and are extendable using the supplied library. They are designed to work with Python 2.x greater than 2.6.” (from the introduction to MySQL Utilities page)

In order to reduce my paranoia (I will never be able to eliminate it), I can simply use the mysqldbcompare utility. The mysqldbcompare utility “compares the objects and data from two databases to find differences. It identifies objects having different definitions in the two databases and presents them in a diff-style format of choice. Differences in the data are shown using a similar diff-style format. Changed or missing rows are shown in a standard format of GRID, CSV, TAB, or VERTICAL.” (from: mysqldbcompare — Compare Two Databases and Identify Differences)

You don’t have to use mysqldbcompare to compare two databases in replication. You may compare any two databases, even if they are on the same server. But for this post, I will be comparing two databases on two separate servers in a master/slave replication topology.

The utility is fairly easy to use. You just identify the two servers and which databases you want to compare. You will need to refer to my earlier post on using the MySQL Utilities for more information on how to execute the scripts.

I will be comparing a database that is on my master server (at 192.168.1.2) and the same database that is on one of the slaves (at 192.168.1.122) connected to that master. Instead of using root to execute the scripts, I create and use a MySQL user named “scripts” to use when I run a script. The syntax for mysqldbcompare is fairly easy, and you can refer to the mysqldbcompare man page for more of the commands and their usage:

I need to specify the servers, user name and passwords:

--server1=scripts:scripts999@192.168.1.2 
--server2=scripts:scripts999@192.168.1.122 

The name of the databases to compare (database_server1:database_server2):

cbgc:cbgc 

Do not stop the script at the first difference that is found. Process all objects.

--run-all-tests 

Specify the server to show transformations to match the other server.

--changes-for=server2 

Specify the difference display format. Permitted format values are unified, context, differ, and sql. The default is unified.

--difftype=sql

Now that I have decided on which options to use, I can run the scripts from within the MySQL Workbench Utilities shell:

$ mysqldbcompare --server1=scripts:scripts999@192.168.1.2 \
  --server2=scripts:scripts999@192.168.1.122 \
  cbgc:cbgc --run-all-tests --changes-for=server2 --difftype=sql
# server1 on 192.168.1.2: ... connected.
# server2 on 192.168.1.122: ... connected.
# Checking databases cbgc on server1 and cbgc on server2
#
#                                                   Defn    Row     Data   
# Type      Object Name                             Diff    Count   Check  
# ------------------------------------------------------------------------- 
# TABLE     Activity_Affiliate                      pass    pass    pass    
# TABLE     FedEx_2nd_Day                           pass    pass    pass    
# TABLE     FedEx_Express_Saver                     pass    pass    pass    
# TABLE     FedEx_Ground                            pass    pass    pass    
# TABLE     FedEx_Home                              pass    pass    pass    
# TABLE     FedEx_Priority_Overnight                pass    pass    pass    
# TABLE     FedEx_Standard_Overnight                pass    pass    pass    
# TABLE     Orders                                  pass    pass    pass    
# TABLE     USPS                                    pass    pass    pass    
# TABLE     activity                                pass    pass    pass    
# TABLE     comics                                  pass    pass    pass    
# TABLE     coupons                                 pass    pass    pass    
# TABLE     customer                                pass    pass    pass    
# TABLE     giftcert                                pass    pass    pass   

Databases are consistent.
#
# ...done

The output shows that my databases are consistent. I have a rather small database, so I re-ran the script with the “time” command. Here are the timing results:

real	0m4.519s
user	0m0.429s
sys	0m0.068s

It took about 4.5 seconds to execute on my database which is about 25 megabytes in size. Obviously, the time will increase relative to the complexity and size of your database. And the time will increase relative to the number of differences that the script finds.
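
Because a run this short is cheap to repeat, the check can be scripted. The following is a minimal sketch, not part of MySQL Utilities: the function name failing_objects and the file name compare.out are hypothetical, and the parsing assumes the summary table layout shown above (a "#", the object type, the object name, then the three result columns).

```shell
#!/bin/sh
# failing_objects: read saved mysqldbcompare output on stdin and print the
# name of every object whose summary row shows FAIL in any result column.
# Assumes rows shaped like "# TABLE  name  defn  rowcount  data".
failing_objects() {
  awk '$1 == "#" && NF == 6 && ($4 == "FAIL" || $5 == "FAIL" || $6 == "FAIL") { print $3 }'
}

# Demonstration with two rows in the format shown above.
# Real use (hypothetical file name):
#   mysqldbcompare ... > compare.out 2>&1
#   failing_objects < compare.out
printf '%s\n' \
  '# TABLE     Orders                                  pass    pass    pass' \
  '# TABLE     activity                                FAIL    FAIL    FAIL' |
  failing_objects
```

An empty result means no row in the summary table reported a failure. The full output is still worth reading, since this sketch looks only at the summary rows.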

In order to show you what happens when the databases are not in sync, I will now stop the slave database (at 192.168.1.122), add a row to one of the tables in the master database (at 192.168.1.2), and then re-run the script (while the slave is still stopped).

On the slave server:

mysql> stop slave;
Query OK, 0 rows affected, 1 warning (0.00 sec)

On the master server:

mysql> insert into `cbgc`.`activity` ( `Customer_Serial_Number`, `Customer_Activity_Action`) \
          values ( '1201201', 'Test Visit');
Query OK, 1 row affected (0.96 sec)

Now I can run the mysqldbcompare script again. The slave is still turned off:

$ mysqldbcompare --server1=scripts:scripts999@192.168.1.2 \
  --server2=scripts:scripts999@192.168.1.122 cbgc:cbgc \
  --run-all-tests --changes-for=server2 --difftype=sql
# server1 on 192.168.1.2: ... connected.
# server2 on 192.168.1.122: ... connected.
# Checking databases cbgc on server1 and cbgc on server2
#
#                                                   Defn    Row     Data   
# Type      Object Name                             Diff    Count   Check  
# ------------------------------------------------------------------------- 
# TABLE     Activity_Affiliate                      pass    pass    pass    
# TABLE     FedEx_2nd_Day                           pass    pass    pass    
# TABLE     FedEx_Express_Saver                     pass    pass    pass    
# TABLE     FedEx_Ground                            pass    pass    pass    
# TABLE     FedEx_Home                              pass    pass    pass    
# TABLE     FedEx_Priority_Overnight                pass    pass    pass    
# TABLE     FedEx_Standard_Overnight                pass    pass    pass    
# TABLE     Orders                                  pass    pass    pass    
# TABLE     USPS                                    pass    pass    pass    
# TABLE     activity                                FAIL    FAIL    FAIL    
#
# Transformation for --changes-for=server2:
#

ALTER TABLE cbgc.activity 
  DROP PRIMARY KEY, 
  ADD PRIMARY KEY(serial_id), 
AUTO_INCREMENT=7542;

# Row counts are not the same among cbgc.activity and cbgc.activity.
#
# Transformation for --changes-for=server2:
#

INSERT INTO cbgc.activity (serial_id, Customer_Serial_Number, Customer_Activity_Action, 
Customer_Activity_Date_Time, Customer_Activity_Info, Notes, HTTP_REFERER) 
VALUES('7541', '1201201', 'Test Visit', NULL, NULL, NULL, NULL);


# TABLE     comics                                  pass    pass    pass    
# TABLE     coupons                                 pass    pass    pass    
# TABLE     customer                                pass    pass    pass    
# TABLE     giftcert                                pass    pass    pass   

# Database consistency check failed.
#
# ...done

The script alerted me to two issues. It tells me that one row of data is missing on the slave (from the insert statement), and it also notices that the auto-increment value on the activity table needs to be updated. If I simply run the insert statement, my auto-increment value will be incorrect. So, I need to run the ALTER TABLE statement with the auto-increment change first, and then I can execute the insert statement.
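
To review the suggested statements on their own, they can be pulled out of a saved copy of the output. This is a rough sketch under stated assumptions: extract_sql is a hypothetical helper, a statement is assumed to start with a SQL keyword at the beginning of a line, and it is assumed to end at the first line that finishes with a semicolon. The statements come out in the order the utility printed them, which in the run above puts the ALTER TABLE ahead of the INSERT.

```shell
#!/bin/sh
# extract_sql: read saved "mysqldbcompare --difftype=sql" output on stdin and
# print only the SQL statements, preserving their original order.
# Assumption: statements begin with one of the keywords below at column 1
# and end on the first line that finishes with a semicolon.
extract_sql() {
  awk '
    /^(ALTER|INSERT|UPDATE|DELETE|CREATE|DROP) / { in_stmt = 1 }
    in_stmt                                      { print }
    in_stmt && /;[ \t]*$/                        { in_stmt = 0 }
  '
}

# Demonstration on a fragment of the output shown above.
extract_sql <<'EOF'
# TABLE     activity                                FAIL    FAIL    FAIL
#
# Transformation for --changes-for=server2:
#

ALTER TABLE cbgc.activity
  DROP PRIMARY KEY,
  ADD PRIMARY KEY(serial_id),
AUTO_INCREMENT=7542;

# Row counts are not the same among cbgc.activity and cbgc.activity.
EOF
```

The result is meant for reading before anything is applied to the slave. A string value that spans several lines and happens to end a line with a semicolon would confuse this simple filter.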

There are other choices of output that you can use besides sql – unified, context and differ. You can try each one and see which one will fit your needs. I prefer the sql output.
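
One way to try each format is to run the same comparison once per --difftype value and save each result to its own file. The sketch below is only an illustration: the function name compare_all_difftypes, the MYSQLDBCOMPARE override variable and the compare_*.out file names are all hypothetical. The options themselves are the ones used earlier in this post.

```shell
#!/bin/sh
# compare_all_difftypes: run the same comparison once per --difftype value
# and write each result to compare_<difftype>.out in the current directory.
# MYSQLDBCOMPARE may be overridden (for example with "echo") for a dry run.
MYSQLDBCOMPARE=${MYSQLDBCOMPARE:-mysqldbcompare}

compare_all_difftypes() {
  # $1 = server1 connection, $2 = server2 connection, $3 = db1:db2
  for difftype in unified context differ sql; do
    "$MYSQLDBCOMPARE" --server1="$1" --server2="$2" "$3" \
      --run-all-tests --changes-for=server2 --difftype="$difftype" \
      > "compare_${difftype}.out" 2>&1
  done
}

# Example call, using the servers from this post:
#   compare_all_difftypes scripts:scripts999@192.168.1.2 \
#     scripts:scripts999@192.168.1.122 cbgc:cbgc
```

Setting MYSQLDBCOMPARE=echo before calling the function writes the command line that would have been run into each file, which is a quick way to check the arguments without touching either server.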

Since I stopped the slave, I can just start it again and it will catch up with the master. But, if you are working with an active master, you might want to put a read lock on the database, run the script again, and then make your changes. This is especially true if the changes involve something dynamic like a primary key or auto-increment column, as those values could change while you are trying to run the statements.

Let’s see what happens when someone else makes some changes and we aren’t aware of them. But first, I will start the slave so it can catch up to the master (for the earlier missing statements). We will then stop the slave and let someone else make a few changes. And then we can run the mysqldbcompare utility again:

$ mysqldbcompare --server1=scripts:scripts999@192.168.1.2 \
  --server2=scripts:scripts999@192.168.1.122 cbgc:cbgc \
  --run-all-tests --changes-for=server2 --difftype=sql
# server1 on 192.168.1.2: ... connected.
# server2 on 192.168.1.122: ... connected.
# Checking databases cbgc on server1 and cbgc on server2
#
#                                                   Defn    Row     Data   
# Type      Object Name                             Diff    Count   Check  
# ------------------------------------------------------------------------- 
# TABLE     Activity_Affiliate                      pass    pass    pass    
# TABLE     FedEx_2nd_Day                           pass    pass    pass    
# TABLE     FedEx_Express_Saver                     pass    pass    pass    
# TABLE     FedEx_Ground                            pass    pass    pass    
# TABLE     FedEx_Home                              pass    pass    pass    
# TABLE     FedEx_Priority_Overnight                pass    pass    pass    
# TABLE     FedEx_Standard_Overnight                pass    pass    pass    
# TABLE     Orders                                  pass    pass    pass    
# TABLE     USPS                                    pass    pass    pass    
# TABLE     activity                                FAIL    pass    pass    
#
# Transformation for --changes-for=server2:
#

ALTER TABLE cbgc.activity 
  DROP PRIMARY KEY, 
  DROP COLUMN Dummy_Field, 
  ADD PRIMARY KEY(serial_id);


# TABLE     comics                                  pass    pass    pass    
# TABLE     coupons                                 FAIL    pass    pass    
#
# Transformation for --changes-for=server2:
#

ALTER TABLE cbgc.coupons 
  CHANGE COLUMN Coupon_Notes Coupon_Notes varchar(100) NULL;


# TABLE     customer                                pass    pass    pass    
# TABLE     giftcert                                pass    pass    pass   

Databases are consistent.
#
# ...done

This time, the script took about 20 seconds to run:

real	0m20.058s
user	0m0.452s
sys	0m0.224s

We can see from the output that a column named Dummy_Field was dropped. We can also see that the Coupon_Notes column has changed (or is different on the master). I can now take these changes and implement them on the slave (again assuming that these changes aren’t waiting to be sent to the slave). In my case, once I start the slave, the changes will propagate over to the slave, but there may be cases where that transaction was lost or skipped on the slave (for example, if you had to do a SET GLOBAL sql_slave_skip_counter = N on the slave). You now have the ability to easily make the slave the same as the master by executing these differences. If the changes are too complicated, then you might want to look at re-creating your slave.
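
Before treating a difference as lost rather than merely pending, it helps to confirm that the slave is running and not lagging. The following is a minimal sketch, assuming the output of SHOW SLAVE STATUS\G has been captured from the slave by a user with the privilege to run it. The function name slave_is_caught_up is hypothetical, while the three field names are the ones MySQL 5.6 reports.

```shell
#!/bin/sh
# slave_is_caught_up: read the output of SHOW SLAVE STATUS\G on stdin and
# succeed only if both replication threads are running and the slave
# reports zero seconds of lag.
slave_is_caught_up() {
  awk -F': ' '
    { gsub(/^[ \t]+/, "", $1) }          # strip the leading padding
    $1 == "Slave_IO_Running"      { io  = $2 }
    $1 == "Slave_SQL_Running"     { sql = $2 }
    $1 == "Seconds_Behind_Master" { lag = $2 }
    END { exit !(io == "Yes" && sql == "Yes" && lag == "0") }
  '
}

# Demonstration with sample status lines.
# Real use might look like (run against the slave):
#   mysql -h 192.168.1.122 -u scripts -p -e 'SHOW SLAVE STATUS\G' | slave_is_caught_up
printf '%s\n' \
  '             Slave_IO_Running: Yes' \
  '            Slave_SQL_Running: Yes' \
  '        Seconds_Behind_Master: 0' |
  slave_is_caught_up && echo "slave is caught up"
```

A zero in Seconds_Behind_Master is a useful hint rather than proof that every event has been applied, so a difference reported right after this check passes is still worth a second look.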

There are many different options and output possibilities, so I would encourage you to just try mysqldbcompare and see how it works for you. And as always, feel free to post your feedback in the comments section.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.

4 Responses to Using MySQL Utilities Workbench Script mysqldbcompare To Compare Two Databases In Replication

  1. 2 questions…

    I’m trying to get this to work between our development database and our production database. We make changes to development, and before pushing the code changes I need to verify that the databases are the same and that all schema changes have been applied to production.

    1. Is it possible to pass a test when the auto-increment value changes? Many tables are failing because their auto-increment values differ. This is fine for our needs, since all I want to check for is new tables and changes to table schemas.

    2. Is it possible to run this on databases with different names, i.e. database_dev:database_prod – the same database, just different names?

    • Tony Darnell says:

      You can run this on individual tables or databases with different names. Just separate the names with a colon: database_server1:database_server2

      I am not sure what you are asking in question number one.

  2. Anand Roongta says:

    I am trying to compare two local databases using the mysqldbcompare utility. The check always reports that the database consistency check failed. I have tried the check and repair commands, but to no avail.

    I have used the following code and get

    C:\Program Files\MySQL\MySQL Workbench CE 5.2.47\utilities>mysqldbcompare --server1=root@localhost --server2=root@localhost ebookshop:ebookpub --run-all-tests

    # server1 on localhost: ... connected.
    # server2 on localhost: ... connected.
    # Checking databases ebookshop on server1 and ebookpub on server2
    #
    # Object definitions differ. (--changes-for=server1)
    #

    --- ebookshop
    +++ ebookpub
    @@ -1,1 +1,1 @@
    -CREATE DATABASE `ebookshop` /*!40100 DEFAULT CHARACTER SET latin1 */
    +CREATE DATABASE `ebookpub` /*!40100 DEFAULT CHARACTER SET latin1 */

    # WARNING: Objects in server1.ebookshop but not in server1.ebookpub:
    # TABLE: books
    #
    # WARNING: Objects in server1.ebookpub but not in server1.ebookshop:
    # TABLE: books1
    #
    #                                                   Defn    Row     Data
    # Type      Object Name                             Diff    Count   Check
    # -------------------------------------------------------------------------

    # Database consistency check failed.
    #
    # ...done
    I am fairly new to this area. Help is really appreciated.

    • Tony Darnell says:

      The program is working properly – it stops after it finds a mismatch. It is showing you that the table “books” is in the ebookshop database but not in the ebookpub database – and vice versa for books1.

      If you rename the books1 table to “books” and re-run the script, it will then proceed and check the remaining tables.
