Uncategorized | Scripting MySQL

Adding a replicated MySQL database instance using a Group Replication server as the source

March 25, 2019 Leave a comment

You say you want a Replication?

One of the best features of MySQL is the ability to use MySQL‘s built-in database replication feature to automatically replicate data from one server (source/master) to another (slave/replica). Group Replication was added in MySQL 5.7 as a way to provide a high-availability solution using a new variation of MySQL replication.

(In some earlier posts, I explained how to setup Group Replication using three MySQL database servers and how to create a sandbox for testing MySQL 8.0 InnoDB Cluster.)

The “regular version” of MySQL replication is still very powerful and relatively easy to setup. Some advantages of replication in MySQL include:

Scale-out solutions – spreading the load among multiple slaves to improve performance. In this environment, all writes and updates must take place on the master server. Reads, however, may take place on one or more slaves. This model can improve the performance of writes (since the master is dedicated to updates), while dramatically increasing read speed across an increasing number of slaves.
Data security – because data is replicated to the slave, and the slave can pause the replication process, it is possible to run backup services on the slave without corrupting the corresponding master data.
Analytics – live data can be created on the master, while the analysis of the information can take place on the slave without affecting the performance of the master.
Long-distance data distribution – you can use replication to create a local copy of data for a remote site to use, without permanent access to the master.

Source: https://dev.mysql.com/doc/refman/8.0/en/replication.html

Even if you have Group Replication installed, you still might want to replicate that data to a separate server for various reasons. This post will explain a little about MySQL replication, and show you how to connect a replica (slave) server to a single server in a Group Replication group.

What is replication and how does it work?

To use MySQL replication, you need to understand something called the binary log (also called the binlog).

The binary log contains “events” that describe database changes such as table creation operations or changes to table data. It also contains events for statements that potentially could have made changes (for example, a DELETE which matched zero rows), unless row-based logging is used. The binary log also contains information about how long each statement took that updated data. The binary log has two important purposes:

For replication, the binary log on a master replication server provides a record of the data changes to be sent to slave servers. The master server sends the events contained in its binary log to its slaves, which execute those events to make the same data changes that were made on the master. See Section 17.2, “Replication Implementation”.
Certain data recovery operations require use of the binary log. After a backup has been restored, the events in the binary log that were recorded after the backup was made are re-executed. These events bring databases up to date from the point of the backup. <a target=new href=https://dev.mysql.com/doc/refman/8.0/en/point-in-time-recovery.htmlSee Section 7.5, “Point-in-Time (Incremental) Recovery Using the Binary Log”.

Source: https://dev.mysql.com/doc/refman/8.0/en/binary-log.html

The replication process is simple. The replica (slave) database connects to the source (master) database via the I/O thread and retrieves the events from the binary log which have occurred since the last time it connected to the source (master) database. The source (master) database performs a binlog dump (which contains the new events), and the replica’s I/O thread transfers this dump to the replica (slave) server, and then stores these events on the replica’s MySQL instance in a “binary-log-type” file known as the relay log. Another thread (the SQL thread) then reads the events from the relay log and applies these changes to the database. This type of replication is asynchronous replication, because it is a one-way transfer of data (See the image below). There is another replication option called semi-synchronous, which ensures that the transaction on the source (master) database is written to the relay log of the replica (slave) before the transaction is committed.

For more information about the threads, see: https://dev.mysql.com/doc/refman/8.0/en/replication-implementation-details.html

The replica (slave) database has two options for retrieving the transactions from the source (master) database. The first is an older method, which requires the replica (slave) server to keep track of the last binary log accessed on the master for retrieving the transactions. The replica (slave) also has to keep track of the last position within the last binary log it used. For example, the last binary log was mysql-bin.000343 and the last position in that log was 49583. So, the next time the replica (slave) connects to the source (master) database, it asks for all of the transactions which occurred after this position (49583) in this binary log (mysql-bin.000343).

Global Transaction Identifiers

The second method involves using global transaction identifiers (GTID), which was introduced in MySQL version 5.6. A GTID is a unique identifier created and associated with each transaction committed on the server of origin (the source/master database). This identifier is unique not only to the server on which it originated, but is unique across all servers in a given replication topology. Source: https://dev.mysql.com/doc/refman/8.0/en/replication-gtids-concepts.html

The GTID is comprised of a server’s UUID (a 36-character unique server identifier), a colon, and an incremental number. A GTID for a single transaction would look something like this: 3E11FA47-71CA-11E1-9E33-C80AA9429562:23 (as in: UUID:transaction number).

With GTID replication, the replica (slave) doesn’t need to keep track of the last binary log being used, nor does it need to remember the position with that binary log. Since the GTID’s contain incremental numbers, the replica (slave) only has to remember the last GTID it processed. For example, the source (master) server may have processed a group of GTID’s such as 3E11FA47-71CA-11E1-9E33-C80AA9429562:1-30493. If the last GTID the replica (slave) retrieved was 3E11FA47-71CA-11E1-9E33-C80AA9429562:30201, then the replica (slave) will tell the source (master) that it needs every transaction (GTID) after 3E11FA47-71CA-11E1-9E33-C80AA9429562:30201.

Also, using GTID’s makes it much easier to see how many transactions have been executed on the source (master) database, and how many transactions have been retrieved and applied by the replica (slave) database. On the replica (slave) database, after replication has been started, the “SHOW SLAVE STATUS” command will display something like this:

mysql> SHOW SLAVE STATUS\G
…
Retrieved_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-35683
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-34652
…

This tells the database administrator (DBA) that the replica (slave) server has retrieved 35683 transactions from the source (master) database, but has only applied (executed) 34652 transactions.

GTID’s also make it easier to skip bad transactions on the slave. Let’s say that the last transaction, 34652, was “bad”. The DBA would need to view that transaction by looking at it from the relay log using the mysqlbinlog tool (example: mysqlbinlog –include-gtids=8e2f4761-c55c-422f-8684-d086f6a1db0e:34652). And, let’s assume it was safe to delete this transaction, then the DBA would issue this command, effectively skipping this transaction and setting the next GTID to be used as 8e2f4761-c55c-422f-8684-d086f6a1db0e:34653:

STOP SLAVE;
SET GTID_NEXT=”8e2f4761-c55c-422f-8684-d086f6a1db0e:34653″;
BEGIN; COMMIT;
SET GTID_NEXT=”AUTOMATIC”;
START SLAVE;

Group Replication

Group Replication (GR) was introduced in MySQL version 5.7, and GR allows you to have a minimum of three servers in a high-availability group (with a maximum of nine servers). Groups can operate in a single-primary mode with automatic primary election, where only one server accepts updates at a time. Alternatively, for more advanced users, groups can be deployed in multi-primary mode, where all servers can accept updates, even if they are issued concurrently. Source: https://dev.mysql.com/doc/refman/8.0/en/group-replication.html

Group Replication requires the use of Global Transaction Identifiers (GTID’s). With “regular” single-source replication (master/slave), you can also use GTID’s – or the older method of specifying the binary log and position in the binary log. However, for replicating from a Group Replication group member, you must use GTID’s.

Replicating from a member in Group Replication

To replicate the data from a member of a Group Replication group, you simply need to point the replica (slave) database to one of the nodes in the group. This is the same method you would use if you were simply replicating off a single server. However, if the Group Replication server you are using for the source (master) goes down, then you will need to point the replica (slave) database to another member of the group. Since Group Replication uses the same UUID in the GTIDs, you can simply point the replica (slave) to any of the other servers in the group and replication will continue without any problems.

Configuring the replica (slave) server

You will need to add the following to your MySQL configuration file (my.cnf or my.ini) under the [mysqld] section, to enable replication using GTID’s:

gtid-mode = on
enforce-gtid-consistency = 1
master-info-repository=TABLE
relay-log-info-repository=TABLE

binlog_format=row
server-id = 3456 # this must be a unique number for each server
log-slave-updates
relay-log-recovery

For this example, I will be using a newly-created (without any data) Group Replication group, and I will be adding a new replica (slave) server to it.

NOTE: If you have a Group Replication group with data on it, then you will need to import a copy of your current data over to the new replica (slave) server before starting replication. For more information – see “What if my Group Replication (GR) group already has data?”

After you have added the variables above to the MySQL server and rebooted the instance, you are now ready to make the server a replica (slave) of one of the servers in the Group Replication group. But first, we need to add a replication user to the Group Replication group.

Configuring the source (master) server

The new replica (slave) server will be replicating off a new three-node Group Replication (GR) group. I have already started Group Replication, and the GR is in single-primary mode.

Before I turn on replication to my new replica (slave) server, I can use the SHOW MASTER STATUS command to look at the primary server to see if any transactions have been executed yet (under the Executed_Gtid_Set heading):

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
File: mysql-bin.000001
Position: 1164
Binlog_Do_DB:
Binlog_Ignore_DB:
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3
1 row in set (0.00 sec)

The Executed_Gtid_Set shows three transactions. When I look at these transactions using mysqlbinlog, I can see all three transactions are a SET TIMESTAMP command:

# mysqlbinlog mysql-bin.000001
...
SET TIMESTAMP=1553124003/*!*/;
COMMIT
...

When you start Group Replication and a single member to the group, a view of the group is created, and this SET TIMESTAMP is executed. Since I have three members in this GR group, I will have three SET TIMESTAMP transactions. These transactions will be sent over to the new slave, but the transactions are harmless. I could tell the replica (slave) to ignore the transactions, but for this example, I will use them to see if replication is working.

When I created the Group Replication group, I already created a replication user named rpl_user on all three servers in the group. I can use this same user for my new single replica (slave). You don’t need to create this user on the replica (slave) server. However, I created the user with a domain wildcard (%) – so you might want to restrict user access via a domain or IP address.

Starting the replica (slave) server

To start replication, you need to tell the replica (slave) server which server you will use to retrieve data. You do this with the CHANGE MASTER command:

mysql> CHANGE MASTER TO 
    -> MASTER_HOST = '192.168.1.152',
    -> MASTER_PORT = 3306,
    -> MASTER_USER = 'rpl_user',
    -> MASTER_PASSWORD = 'R3plic4tion!',
    -> MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.04 sec)

You are now ready to start the replica (slave) server with the START SLAVE command:

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

Next, we need to see if replication has started. You can check this with the SHOW SLAVE STATUS command:

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.152
                  Master_User: rpl_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 1164
               Relay_Log_File: Rep01-relay-bin.000002
                Relay_Log_Pos: 1370
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1164
              Relay_Log_Space: 1570
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 152
                  Master_UUID: 247898e0-4cb7-11e9-97a9-12f28adcadd1
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
1 row in set (0.00 sec)

As you can see above under the Retrieved_Gtid_Set – the replica (slave) has already retrieved and processed the three transactions which were on the Group Replication group.

           Retrieved_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3

You can also see that the I/O and SQL threads are running:

             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Let’s test replication by adding a database on one of the Group Replication nodes, and then checking to see if it replicates to the replica (slave).

I am going to create the new database on the primary write server – which has an IP address of 192.168.1.151. We pointed the replica (slave) to 192.168.1.152, so the new database will get replicated to 192.168.1.152, and then over to our replica (slave) server (which has an IP address of 192.168.1.220).

On 192.168.1.151:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 1164
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-3
1 row in set (0.00 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
4 rows in set (0.00 sec)

mysql> create database rep_test;
Query OK, 1 row affected (0.01 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep_test           |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

And we can see the new database has been replicated to the MySQL server on 192.168.1.152:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep_test           |
| sys                |
+--------------------+
5 rows in set (0.00 sec)

And it is now also on the new replica (slave) server:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep_test           |
| sys                |
+--------------------+
5 rows in set (0.02 sec)

That’s all you need to do. Replication is now up and running!

What if my Group Replication (GR) group already has data?

If you have an existing GR group with data, you need the new replica (slave) to have a copy of the data that is already in the group – to give the new replica (slave) a starting point for replication. The binary logs will probably not have all of the transactions that have been applied to the database – and even if it did, it would take an extremely long time to replicate those events. It is much easier to restore a backup to the new replica (slave) server.

Once you load the new replica (slave) with the data from a backup, you can then start replication from the last GTID that was executed on the source (master) before you performed the backup. To get a copy of the data to move over to the new replica (slave), you can use the mysqldump utility – or if you are an Enterprise Edition subscriber, you can use the mysqlbackup utility (which is much faster than mysqldump). Using mysqldump will take a considerable amount of time if you have a lot of data (Example: mysqldump will take about an hour to dump 15-20 gigabytes and 4-5 hours to restore 15-20 gigabytes – while mysqlbackup would take less than 10 minutes for both processes. Your actual results may vary.)

After you restore the backup to the replica (slave) server, you can then start replication. Prior to the backup, I executed a few transactions on the Group Replication servers to increase the GTID numbers. I used this mysqlbackup command to backup one of the read-only servers in the Group Replication group:

/usr/local/meb/bin/mysqlbackup --user=mysqlbackup --password --backup-dir=/Users/tonydarnell/hotbackups backup-and-apply-log --with-timestamp --encrypt-password

Also look at the screen output from the backup, and you will see that it contains the GTID information of the source (master) database:

...
190322 21:07:48 MAIN    INFO: GTID_EXECUTED is 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-45695
...

This number also matches the SHOW MASTER STATUS\G output from one of the Group Replication servers:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 13044
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-45695
1 row in set (0.00 sec)

I can now restore this backup to the new replica (slave) server using the mysqlbackup utility. First, I will shut down the MySQL instance, copy the backup to this server, and then I can run the copy-back command:

# /usr/local/meb/bin/mysqlbackup --defaults-file=/etc/my.cnf --backup-dir=/Users/tonydarnell/Desktop/2019-03-22_21-07-46 copy-back
MySQL Enterprise Backup  Ver 8.0.15-commercial for macos10.14 on x86_64 (MySQL Enterprise - Commercial)
Copyright (c) 2003, 2019, Oracle and/or its affiliates. All rights reserved.
...
mysqlbackup completed OK! with 7 warnings

Here are a few things to consider when restoring data:

If you don’t delete the old data directory, you will need to use the force variable at the end of the command to overwrite the existing MySQL data directory.
The MySQL config files (my.cnf or my.ini) are not restored. If you don’t have the same settings for variables like innodb_data_file_path, you will get an error when you try and restart MySQL after the restore.
Be sure to change the owner of your MySQL directories. If you run mysqlbackup as root, the directories will be owned by root and you won’t be able to start MySQL.
```
# cd /usr/local/mysql
# ls -l
total 1224
...
drwxr-x---  26 root    wheel      884 Mar 22 21:30 data
...
# chown -R mysql data
```

The backup from the source (master) server has been restored to the replica (slave), and I have restarted the MySQL database instance. Now I can reset the slave, use the CHANGE MASTER TO command, start the slave, and I will have the same starting point as the source (master) database.

mysql> reset slave;
Query OK, 0 rows affected (0.01 sec)

mysql> CHANGE MASTER TO 
    -> MASTER_HOST = '192.168.1.152',
    -> MASTER_PORT = 3306,
    -> MASTER_USER = 'rpl_user',
    -> MASTER_PASSWORD = 'R3plic4tion!',
    -> MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW SLAVE STATUS\G
... 
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
...
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
... 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-45695
...
mysql>

If you look at the Executed_Gtid_Set, you can see the replica (slave) database has 45695 transactions from the source (master) database. But, since we haven’t started replication yet, and the replica (slave) database has not connected to the source (master) database, the Retrieved_Gtid_Set is blank. Since we restored the first 45695 transactions, those will already be included in the Executed_Gtid_Set.

After more transactions are executed on the source (master) database, when you execute the SHOW SLAVE STATUS command later, you will see something like the following – where the Retrieved_Gtid_Set doesn’t begin with the number one, but instead it begins with the next number (45696) after the original set of GTID’s (45695) that were restored from the backup. Since the replica (slave) didn’t actually retrieve the first 45695 transactions, those won’t appear in the Retrieved_Gtid_Set, but you can see all of the GTID’s are in the Executed_Gtid_Set.

 
mysql> SHOW SLAVE STATUS\G
...
           Retrieved_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:45696-45753
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-45753
...

And we can compare this list to the source (master) database:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 13044
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-45753
1 row in set (0.00 sec)

Now we know the replica (slave) database is caught up to the source (master) database.

Restoring a slave using mysqldump

Using mysqldump to backup your source (master) database and restore it to a replica (slave) database isn’t as easy as using mysqlbackup. With mysqldump, you can’t backup and restore the mysql.user table, which contains all of your user information. So, you can’t backup the mysql database and restore it. Since I only have one database (rep_test) on my Group Replication source (master) database, I can use the following command to create a mysqldump backup file.

/usr/local/mysql/bin/mysqldump --databases rep_test --set-gtid-purged=ON --master-data --add-drop-database --add-drop-table --triggers --user=root -p > /users/tonydarnell/Desktop/2019_03_22_2300_dbdump.txt

This creates a backup file which also includes the GTID information from the source (master) server. If you had more databases, you would need to list them after the databases variable. I have a new install of MySQL version 8.0.15 that I will be using to restore this data, and after copying the backup file to the new replica (slave) server, the restore command is very simple:

# mysql -uroot -p < 2019_03_22_2300_dbdump.txt
Enter password:

I can now take a look at the transactions which have been applied to the server by using the SHOW MASTER STATUS\G command:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 155
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-26
1 row in set (0.00 sec)

And – this Executed_Gtid_Set matches the source (master) database (this was before I added thousands of records as shown in the mysqlbackup example). But – I don’t have any of the users that were created on the master, because I can’t export and import the mysql database. It is easier and probably more secure if you just add whatever users you need on this new replica (slave) database.

If you don’t have access to the users you need, there is a way to get the information from the mysql.user table.

mysql> select user, host from mysql.user;
+------------------+----------------+
| user             | host           |
+------------------+----------------+
| mysqlbackup      | %              |
| root             | %              |
| rpl_user         | %              |
| mysql.infoschema | localhost      |
| mysql.session    | localhost      |
| mysql.sys        | localhost      |
| root             | localhost      |
| root             | macvm151.local |
+------------------+----------------+
8 rows in set (0.00 sec)

In this example, I am only going to re-create the user named mysqlbackup. I could use the SHOW CREATE USER command, but it doesn’t give me the exact format I need to re-create the user:

mysql> show create user mysqlbackup\G
*************************** 1. row ***************************
CREATE USER for mysqlbackup@%: CREATE USER 'mysqlbackup'@'%' IDENTIFIED WITH 'caching_sha2_password' AS '$A$005$:\n*~t^;+s/,+g1L6EDOPfmWHQxa/z7C.mHStg15xDyMq7UzHsi2hTKM10' REQUIRE NONE PASSWORD EXPIRE DEFAULT ACCOUNT UNLOCK PASSWORD HISTORY DEFAULT PASSWORD REUSE INTERVAL DEFAULT PASSWORD REQUIRE CURRENT DEFAULT
1 row in set (0.00 sec)

Instead of trying to extract what I need from the mysql table, I have found it is easier to just re-create the user, and then re-apply the grants. BUT – I don’t really want this to be stored in the binary log, so I can prevent it from writing to the binary log with the SET SQL_LOG_BIN=0 command:

SET SQL_LOG_BIN=0;
CREATE USER 'mysqlbackup'@'%' IDENTIFIED WITH sha256_password BY 'new-password';
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

(I turned on writing to the binary log with the SET SQL_LOG_BIN=1 command.

As for the grants, I have found it is easier to just grab the grants from the mysql.tables_priv table:

mysql> show grants for mysqlbackup;
+----------------------------------------------------------------------------------------------------+
| Grants for mysqlbackup@%                                                                           |
+----------------------------------------------------------------------------------------------------+
| GRANT RELOAD, PROCESS, SUPER, REPLICATION CLIENT ON *.* TO `mysqlbackup`@`%`                       |
| GRANT INSERT, CREATE, DROP, ALTER ON `mysql`.`backup_history_new` TO `mysqlbackup`@`%`             |
| GRANT INSERT, CREATE, DROP ON `mysql`.`backup_history_old` TO `mysqlbackup`@`%`                    |
| GRANT SELECT, INSERT, UPDATE, CREATE, DROP, ALTER ON `mysql`.`backup_history` TO `mysqlbackup`@`%` |
| GRANT INSERT, UPDATE, CREATE, DROP ON `mysql`.`backup_progress` TO `mysqlbackup`@`%`               |
| GRANT SELECT ON `performance_schema`.`replication_group_members` TO `mysqlbackup`@`%`              |
+----------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

And, with a little editing, I can extract the GRANT commands and run these as well – and again, I am going to suppress these from writing to the binary logs.

SET SQL_LOG_BIN=0;
GRANT RELOAD, PROCESS, SUPER, REPLICATION CLIENT ON *.* TO `mysqlbackup`@`%`;
GRANT INSERT, CREATE, DROP, ALTER ON `mysql`.`backup_history_new` TO `mysqlbackup`@`%`;
GRANT INSERT, CREATE, DROP ON `mysql`.`backup_history_old` TO `mysqlbackup`@`%`;
GRANT SELECT, INSERT, UPDATE, CREATE, DROP, ALTER ON `mysql`.`backup_history` TO `mysqlbackup`@`%`;
GRANT INSERT, UPDATE, CREATE, DROP ON `mysql`.`backup_progress` TO `mysqlbackup`@`%`;
GRANT SELECT ON `performance_schema`.`replication_group_members` TO `mysqlbackup`@`%`;
FLUSH PRIVILEGES;
SET SQL_LOG_BIN=1;

I now have all the users I need, and I can start replication:

mysql> select user, host from mysql.user;
+------------------+-----------+
| user             | host      |
+------------------+-----------+
| mysqlbackup      | %         |
| mysql.infoschema | localhost |
| mysql.session    | localhost |
| mysql.sys        | localhost |
| root             | localhost |
+------------------+-----------+
5 rows in set (0.00 sec)

mysql> CHANGE MASTER TO 
    -> MASTER_HOST = '192.168.1.152',
    -> MASTER_PORT = 3306,
    -> MASTER_USER = 'rpl_user',
    -> MASTER_PASSWORD = 'R3plic4tion!',
    -> MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.152
                  Master_User: rpl_user
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000002
          Read_Master_Log_Pos: 9463
               Relay_Log_File: Rep01-relay-bin.000002
                Relay_Log_Pos: 408
        Relay_Master_Log_File: mysql-bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 9463
              Relay_Log_Space: 616
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 152
                  Master_UUID: 247898e0-4cb7-11e9-97a9-12f28adcadd1
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-26
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
       Master_public_key_path: 
        Get_master_public_key: 0
1 row in set (0.00 sec)

I am going to test replication by adding another database on the source (master):

mysql> create database rpl_test2;
Query OK, 1 row affected (0.01 sec)

I can now see the new database on the replica (slave) server:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| rep_test           |
| rpl_test2          |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

And when I check the GTID’s, I can see that GTID 27 (the transaction which created the database rpl_test2) has been retrieved and applied:

mysql> show slave status\G
...
           Retrieved_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:27
            Executed_Gtid_Set: 8e2f4761-c55c-422f-8684-d086f6a1db0e:1-27
...

Since I imported the first 26 GTID’s, those will not appear in the Retrieved_Gtid_Set. But, the Executed_Gtid_Set shows that the new replica (slave) database has all of the same transactions as the source (master) database.

Note: For more information on mysqldump replication options, see: https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html#mysqldump-replication-options

Using a trigger to parse non-conforming data in MySQL

September 21, 2018 1 Comment

In another post, I explained how to use the MySQL Query Re-write Plugin to manipulate data that didn’t exactly match SQL standards of MySQL. In this post, I am going to give you another example on how to use a trigger to parse non-conforming data so the data can be written to a MySQL database.

A customer came to me with a problem. They were using third-party software which produced multiple rows of comma-separated data (like a .csv file), but all of the data wasn’t on a single row. Instead, the data consisted of multiple rows of data, separated by commas and with line feeds after each row of data. Here is a shortened example of the data (the original data was 32-lines long):

Header information which may, or may not, contain one or more commas
Item Name,Type,Description,Value
Potato Chips - Sealed Bag,,,
Manifest Number,Number,MANIFEST_NUMBER,8480014
Manufacturer Code,Number,BRAND_CODE,80375993
Information 1,Number,INFO_ONE,15869563
Information 2,Number,INFO_TWO,6569569665
Information 3,Number,INFO_THREE,562
Planned Start Date,String,PROD_START_DATE,9/13/2018
Planned Start Time,String,PROD_START_TIME,081234

For some reason, the output couldn’t be separated, but the software could use an ODBC connection. They needed to be able to insert this data into a MySQL database, but they only needed certain values – they didn’t need a lot of the descriptive information. Of the example above, they only wanted to insert the information in yellow:

“Header information which may, or may not, contain one or more commas.”
Item Name	Type	Description	Value
Potato Chips – Sealed Bag
Manifest Number	Number	MANIFEST_NUMBER	8480014
Manufacturer Code	Number	BRAND_CODE	80375993
Information 1	Number	INFO_ONE	15869563
Information 2	Number	INFO_TWO	6569569665
Information 3	Number	INFO_THREE	562
Planned Start Date	String	PROD_START_DATE	9/13/2018
Planned Start Time	String	PROD_START_TIME	081234

At first, I suggested using a script (Python, Perl, PHP, etc.) to parse the data before inserting into the database. But, the software which produced the data had the ability to insert the data directly into a MySQL database – but all of the data could only be inserted into one field. They didn’t have an option to add an extra step outside of the database connection – and they were new to MySQL, so they needed some help.

I decided to write a trigger for the table, in order to parse the data as it came into the database, but before it was inserted into a row. I also wanted to keep the original data in a column as well.

The first step was to create a database and a table. I mapped the column names to the first value of each row, skipping the header information which wasn’t needed. The column names are in blue, and the data is in yellow:

“Header information which may, or may not, contain one or more commas.”
Item Name	Type	Description	Value
Potato Chips – Sealed Bag
Manifest Number	Number	MANIFEST_NUMBER	8480014
Manufacturer Code	Number	BRAND_CODE	80375993
Information 1	Number	INFO_ONE	15869563
Information 2	Number	INFO_TWO	6569569665
Information 3	Number	INFO_THREE	562
Planned Start Date	String	PROD_START_DATE	9/13/2018
Planned Start Time	String	PROD_START_TIME	081234

I can then match the columns I want to the data values:

“Header information which may, or may not, contain one or more commas.”
Item Name	Potato Chips – Sealed Bag
Manifest Number	8480014
Manufacturer Code	80375993
Information 1	15869563
Information 2	6569569665
Information 3	562
Planned Start Date	9/13/2018
Planned Start Time	081234

Now, I can create the database and the table. I substituted the spaces in the field names with underscores (_). I also added a primary key column (id_MANIFEST_DATA) and a proper DATETIME column – as I want to combine the two columns PROD_START_DATE and PROD_START_TIME into one column to match MySQL’s DATETIME format. This way, they can perform a search on this column later (if necessary).

mysql> create database MANIFEST_DATA_DB;
Query OK, 1 row affected (0.00 sec)

mysql> use MANIFEST_DATA_DB;
Database changed

mysql> CREATE TABLE `MANIFEST_DATA_DB`.`MANIFEST_DATA_TABLE` (
    ->   `id_MANIFEST_DATA` INT NOT NULL AUTO_INCREMENT,
    ->   `MANIFEST_DATA` VARCHAR(4096) NULL,
    ->   `ITEM_NAME` VARCHAR(1024) NULL,
    ->   `MANIFEST_NUMBER` INT NULL,
    ->   `MANUFACTURER_CODE` VARCHAR(1024) NULL,
    ->   `INFO_ONE` CHAR(32) NULL,
    ->   `INFO_TWO` CHAR(32) NULL,
    ->   `INFO_THREE` CHAR(32) NULL,
    ->   `PROD_START_DATE` CHAR(10) NULL,
    ->   `PROD_START_TIME` CHAR(6) NULL,
    ->   `PROD_TIMESTAMP` DATETIME NULL,
    ->   PRIMARY KEY (`id_MANIFEST_DATA`))
    -> AUTO_INCREMENT = 1000000;
Query OK, 0 rows affected (0.00 sec)

The initial “non-conforming data” will be inserted into the MANIFEST_DATA field, so all I have to do is to create a trigger to parse this field before the data is inserted.

Even though the data is on separate lines, parsing this data will be relatively easy, since the data is comma-delimited. I can use the SUBSTRING_INDEX function to create an array to store all of the data. But, since the first line of the data may or may not contain a comma, instead of counting commas from the beginning of the data, I will start at the end. Also, in this example, they don’t need the first line of data, as it is header information.

Let’s take a look at why I want to count backwards. Here are three rows of data – where the first column may or may not contain a comma – or it might contain two or more commas. I really only want to capture the last two columns of data.

"Hello, it's me",12345,ABCDE
"Hello it's me",67890,FGHIJ
"Hello, yes, it's me",112233,A1B2C3

If I parse the data based upon commas and start at the beginning, I will get different results when counting commas from the beginning of the data when the first line of data contains a comma. And I only want the data in green:

1	2	3	4	5
“Hello it’s me”	67890	FGHIJ
“Hello	it’s me”	12345	ABCDE
“Hello	yes	it’s me”	112233	A1B2C3

But if I count backwards, I will get the same result set regardless of how many commas are in the first line of data:

-5	-4	-3	-2	-1
		“Hello it’s me”	67890	FGHIJ
	“Hello	it’s me”	12345	ABCDE
“Hello	yes	it’s me”	112233	A1B2C3

In the actual data I want to sue, I don’t want to store the first row of data anyway – so it can be ignored. If I did need the first line, I would have to search for the quotes and parse that column separately.

Since the initial data will contain a line feed, I will want to replace the line feeds with a comma, so I can have all of the data on a single line and be able to use the SUBSTRING_INDEX function. Here is the original data again, but this time, I have added a column for the line feeds. Also, I am going to count backwards from the last data value. The numbers are in (bold):

/tr>

(-37)“Header information which may, or may not, contain one or more commas.”
(-36) Item Name	(-35) Type	(-34) Description	(-33) Value	-line feed-
(-32) Potato Chips – Sealed Bag	(-31)	(-30)	(-29)	-line feed-
(-28) Manifest Number	(-27) Number	(-26) MANIFEST_NUMBER	(-25) 8480014	-line feed-
(-24) Manufacturer Code	(-23) Number	(-22) BRAND_CODE	(-21) 80375993	-line feed-
(-20) Information 1	(-19) Number	(-18) INFO_ONE	(-17) 15869563	-line feed-
(-16) Information 2	(-15) Number	(-14) INFO_TWO	(-13) 6569569665	-line feed-
(-12) Information 3	(-11) Number	(-10) INFO_THREE	(-9) 562	-line feed-
(-8) Planned Start Date	(-7) String	(-6) PROD_START_DATE	(-5) 9/13/2018	-line feed-
(-4) Planned Start Time	(-3) String	(-2) PROD_START_TIME	(-1) 081234

Of course, you don’t have to map out the fields like I did – you can just count backwards.

The SUBSTRING_INDEX function works similar to grabbing a value out of an array – except the count value is where you want to stop grabbing data. You specify the string you want to parse, the delimiter you want to use, and the count of the value from the string where you want to stop grabbing data.

The format is:

SUBSTRING_INDEX(string,delimiter,count)

Note: When the count value is a positive number, the count value is the ending position within the string. It doesn’t select the element itself.

Here is an example using the phrase “Hello, I must be going, see you later, bye.” which contains four values separated by commas. I want to capture the third value of “see you later”.

1	2	3	4
Hello	I must be going	See you later	bye

To select “see you later”, I choose the third column as the place where I will stop grabbing data – but, since this is the stopping point, I will get the rest of the phrase up until the third column.

mysql> select SUBSTRING_INDEX("Hello, I must be going, see you later, bye.",",",3) as PARSE;
+---------------------------------------+
| PARSE                                 |
+---------------------------------------+
| Hello, I must be going, see you later |
+---------------------------------------+
1 row in set (0.00 sec)

I don’t want all of the phrase, I only want “see you later”. With a positive count value of three, I am getting the ending point – it stops at the third delimited value. If I use a negative count value, then I am getting the starting point of the string. If I count backwards, which would give me a negative count value (and I want to do this anyway because the first line of data in the main example may have multiple commas), I can use the value of negative two (-2) and get this:

mysql> select SUBSTRING_INDEX("Hello, I must be going, see you later, bye.",",",-2) as PARSE;
+----------------------+
| PARSE                |
+----------------------+
|  see you later, bye. |
+----------------------+
1 row in set (0.00 sec)

That gives me a starting point for grabbing “see you later”, but I don’t want any data (the single word “bye”) after this. So, I can wrap an additional SUBSTRING_INDEX around the first one, and then only choose the first delimited set of data from the inner SUBSTRING_INDEX.

It looks like this:

The statement select SUBSTRING_INDEX(“Hello, I must be going, see you later, bye.”,”,”,-2) grabs both of these columns in blue:

1	2	3	4
Hello	I must be going	See you later	bye

But then I can use the second SUBSTRING_INDEX command to only select the first column of the remaining two columns:

SUBSTRING_INDEX(SUBSTRING_INDEX(“Hello, I must be going, see you later, bye.”,”,”,-2), “,”, 1)

1	2
See you later	bye

I can test it:

mysql> select SUBSTRING_INDEX(SUBSTRING_INDEX("Hello, I must be going, see you later, bye.",",",-2), ",", 1) as PARSE;
+----------------+
| PARSE          |
+----------------+
|  see you later |
+----------------+
1 row in set (0.01 sec)

This is essentially how I am going to parse the long line of data (after I convert the line feeds to commas). I will go backwards to the starting point, and then only select the first column beginning at that starting point.

I can then use the table I created to map put the data values I want to extract.

Column Name	Backwards Value
ITEM_NAME	-32
MANIFEST_NUMBER	-25
MANUFACTURER_CODE	-21
INFO_ONE	-17
INFO_TWO	-13
INFO_THREE	-9
PROD_START_DATE	-5
PROD_START_TIME	-1

Example: To extract the PROD_START_TIME, which is at position -1, I will use this in the trigger:

SET NEW.PROD_START_TIME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -1), ",", 1);

I can now create my trigger to parse all of the data being inserted into the MANIFEST_DATA column. In the trigger, I will use some temporary variables to parse the MANIFEST_DATA, and I will also use the SUBSTRING function to parse the PROD_START_DATE and PROD_START_TIME fields to rearrange them into the correct MySQL DATETIME format. I will use the CONCAT function to combine them into a new DATETIME field. I have some notes in the stored procedure to help you figure out what I am doing.

DELIMITER $$
  
CREATE TRIGGER _convert_MANIFEST_DATA
BEFORE INSERT ON MANIFEST_DATA_DB.MANIFEST_DATA_TABLE
FOR EACH ROW
BEGIN

# Declare temporary variables

# This is the temporary field which contains all of the data
DECLARE _parse_MANIFEST_DATA varchar(4096);

# The temporary date and time fields needed to create the DATETIME field named PROD_TIMESTAMP
DECLARE _parse_TIME_DATE_DATE char(10);
DECLARE _parse_TIME_DATE_TIME char(10);

DECLARE _parse_TIME_DATE_date_final char(10);

DECLARE _parse_TIME_DATE_hour char(2);
DECLARE _parse_TIME_DATE_minutes char(2);
DECLARE _parse_TIME_DATE_sec char(2);

# Take the incoming large dataset which has line feeds and
# Replace the line feeds with a comma
set _parse_MANIFEST_DATA = replace(NEW.MANIFEST_DATA, "\n", ",");

# Parse each new column from the temporary field
SET NEW.ITEM_NAME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -35), ",", 1);
SET NEW.MANIFEST_NUMBER = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -25), ",", 1);
SET NEW.MANUFACTURER_CODE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -21), ",", 1);
SET NEW.INFO_ONE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -17), ",", 1);
SET NEW.INFO_TWO = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -13), ",", 1);
SET NEW.INFO_THREE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -9), ",", 1);
SET NEW.PROD_START_DATE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -5), ",", 1);
SET NEW.PROD_START_TIME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -1), ",", 1);

# Get the values from these two fields in order to combine them into a DATETIME field
SET _parse_TIME_DATE_DATE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -5), ",", 1);
SET _parse_TIME_DATE_TIME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -1), ",", 1);

# Convert the date from MM/DD/YYYY to YYYY-MM-DD
SET _parse_TIME_DATE_date_final = STR_TO_DATE(_parse_TIME_DATE_DATE, '%c/%d/%Y');

# Parse the time so we can add colons between the hour, minutes and seconds
SET _parse_TIME_DATE_hour = SUBSTRING(_parse_TIME_DATE_TIME, 1, 2);
SET _parse_TIME_DATE_minutes = SUBSTRING(_parse_TIME_DATE_TIME, 3, 2);
SET _parse_TIME_DATE_sec = SUBSTRING(_parse_TIME_DATE_TIME, 5, 2);

# Combine the newly-formatted date with the newly-formatted time so it conforms to the DATETIME format
SET NEW.PROD_TIMESTAMP = CONCAT(_parse_TIME_DATE_date_final, " ", _parse_TIME_DATE_hour, ":", _parse_TIME_DATE_minutes, ":", _parse_TIME_DATE_sec) ;

END$$

DELIMITER ;

Now I can insert the sample data into the database – where all of the data will go into the MANIFEST_DATA column, and the trigger will populate the rest of the columns by parsing the data.

mysql> INSERT INTO MANIFEST_DATA_TABLE (MANIFEST_DATA) VALUES ('Header information which may, or may not, contain one or more commas
    '> Item Name,Type,Description,Value
    '> Potato Chips - Sealed Bag,,,
    '> Manifest Number,Number,MANIFEST_NUMBER,8480014
    '> Manufacturer Code,Number,BRAND_CODE,80375993
    '> Information 1,Number,INFO_ONE,15869563
    '> Information 2,Number,INFO_TWO,6569569665
    '> Information 3,Number,INFO_THREE,562
    '> Planned Start Date,String,PROD_START_DATE,9/13/2018
    '> Planned Start Time,String,PROD_START_TIME,081234');
Query OK, 1 row affected (0.07 sec)

When I look at the contents of the table, I can see the trigger executed successfully. All of the “non-conforming” data is stored in the MANIFEST_DATA field, but the other fields were populated:

mysql> select * from MANIFEST_DATA_TABLE\G
*************************** 1. row ***************************
 id_MANIFEST_DATA: 1000000
    MANIFEST_DATA: Header information which may, or may not, contain one or more commas
Item Name,Type,Description,Value
Potato Chips - Sealed Bag,,,
Manifest Number,Number,MANIFEST_NUMBER,8480014
Manufacturer Code,Number,BRAND_CODE,80375993
Information 1,Number,INFO_ONE,15869563
Information 2,Number,INFO_TWO,6569569665
Information 3,Number,INFO_THREE,562
Planned Start Date,String,PROD_START_DATE,9/13/2018
Planned Start Time,String,PROD_START_TIME,081234
        ITEM_NAME: Type
  MANIFEST_NUMBER: 8480014
MANUFACTURER_CODE: 80375993
         INFO_ONE: 15869563
         INFO_TWO: 6569569665
       INFO_THREE: 562
  PROD_START_DATE: 9/13/2018
  PROD_START_TIME: 081234
   PROD_TIMESTAMP: 2018-09-13 08:12:34
1 row in set (0.00 sec)

Of course, this is only a short example of all of the parsing you can do with triggers in MySQL. And, if you want to look at your trigger, you can query the INFORMATION_SCHEMA.TRIGGERS table:

mysql> SELECT * FROM INFORMATION_SCHEMA.TRIGGERS WHERE TRIGGER_NAME='_convert_MANIFEST_DATA'\G
*************************** 1. row ***************************
           TRIGGER_CATALOG: def
            TRIGGER_SCHEMA: MANIFEST_DATA_DB
              TRIGGER_NAME: _convert_MANIFEST_DATA
        EVENT_MANIPULATION: INSERT
      EVENT_OBJECT_CATALOG: def
       EVENT_OBJECT_SCHEMA: MANIFEST_DATA_DB
        EVENT_OBJECT_TABLE: MANIFEST_DATA_TABLE
              ACTION_ORDER: 1
          ACTION_CONDITION: NULL
          ACTION_STATEMENT: BEGIN

# Declare temporary variables

# This is the temporary field which contains all of the data
DECLARE _parse_MANIFEST_DATA varchar(4096);

# The temporary date and time fields needed to create the DATETIME field named PROD_TIMESTAMP
DECLARE _parse_TIME_DATE_DATE char(10);
DECLARE _parse_TIME_DATE_TIME char(10);

DECLARE _parse_TIME_DATE_date_final char(10);

DECLARE _parse_TIME_DATE_hour char(2);
DECLARE _parse_TIME_DATE_minutes char(2);
DECLARE _parse_TIME_DATE_sec char(2);

# Take the incoming large dataset which has line feeds and
# Replace the line feeds with a comma
set _parse_MANIFEST_DATA = replace(NEW.MANIFEST_DATA, "
", ",");

# Parse each new column from the temporary field
SET NEW.ITEM_NAME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -35), ",", 1);
SET NEW.MANIFEST_NUMBER = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -25), ",", 1);
SET NEW.MANUFACTURER_CODE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -21), ",", 1);
SET NEW.INFO_ONE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -17), ",", 1);
SET NEW.INFO_TWO = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -13), ",", 1);
SET NEW.INFO_THREE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -9), ",", 1);
SET NEW.PROD_START_DATE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -5), ",", 1);
SET NEW.PROD_START_TIME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -1), ",", 1);

# Get the values from these two fields in order to combine them into a DATETIME field
SET _parse_TIME_DATE_DATE = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -5), ",", 1);
SET _parse_TIME_DATE_TIME = SUBSTRING_INDEX(SUBSTRING_INDEX(_parse_MANIFEST_DATA, ',', -1), ",", 1);

# Convert the date from MM/DD/YYYY to YYYY-MM-DD
SET _parse_TIME_DATE_date_final = STR_TO_DATE(_parse_TIME_DATE_DATE, '%c/%d/%Y');

# Parse the time so we can add colons between the hour, minutes and seconds
SET _parse_TIME_DATE_hour = SUBSTRING(_parse_TIME_DATE_TIME, 1, 2);
SET _parse_TIME_DATE_minutes = SUBSTRING(_parse_TIME_DATE_TIME, 3, 2);
SET _parse_TIME_DATE_sec = SUBSTRING(_parse_TIME_DATE_TIME, 5, 2);

# Combine the newly-formatted date with the newly-formatted time so it conforms to the DATETIME format
SET NEW.PROD_TIMESTAMP = CONCAT(_parse_TIME_DATE_date_final, " ", _parse_TIME_DATE_hour, ":", _parse_TIME_DATE_minutes, ":", _parse_TIME_DATE_sec) ;

END
        ACTION_ORIENTATION: ROW
             ACTION_TIMING: BEFORE
ACTION_REFERENCE_OLD_TABLE: NULL
ACTION_REFERENCE_NEW_TABLE: NULL
  ACTION_REFERENCE_OLD_ROW: OLD
  ACTION_REFERENCE_NEW_ROW: NEW
                   CREATED: 2018-09-20 22:13:28.54
                  SQL_MODE: ONLY_FULL_GROUP_BY,STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_ENGINE_SUBSTITUTION
                   DEFINER: root@localhost
      CHARACTER_SET_CLIENT: utf8mb4
      COLLATION_CONNECTION: utf8mb4_0900_ai_ci
        DATABASE_COLLATION: utf8mb4_0900_ai_ci
1 row in set (0.02 sec)

	Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
	Tony is the author of Twenty Forty-Four: The League of Patriots Visit http://2044thebook.com for more information.
	Tony is the editor/illustrator for NASA Graphics Standards Manual Remastered Edition Visit https://amzn.to/2oPFLI0 for more information.

Filed under MySQL, Scripting, Uncategorized

Manipulating queries with non-conforming data via MySQL Query Rewrite Plugin, triggers and stored procedures

September 7, 2018 Leave a comment

The MySQL database is used in thousands of third-party applications, but what can you do when you want to use MySQL with an application, but that application’s queries or data doesn’t match MySQL’s data type or SQL format?

This post will show you three ways to alter a query or mismatched data when you don’t have control of the application’s source code. Of course, there are hundreds of different ways to do what I am about to show you. In this example, I will show you how to use the MySQL Query Rewrite Plugin along with a trigger to alter the non-conforming data. I will also show you an example of manipulating data with a stored procedure.

A customer emailed me with a problem. They wanted to use MySQL for a third-party application, but they didn’t have access to the source code. Their main problem was the application’s TIMESTAMP format didn’t conform to MySQL’s TIMESTAMP format. To be specific, this application produced a TIMESTAMP value that included a trailing time zone, such as “2018-09-05 17:00:00 EDT”. MySQL has two column data types where you can store both the date and time in one column: TIMESTAMP and DATETIME – but MySQL cannot handle TIMESTAMP or DATETIME data with a trailing time zone.

When a TIMESTAMP value is being inserted into a row, MySQL converts the TIMESTAMP value from the current time zone set by the MySQL server (see Time Zone Support) to UTC (Coordinated Universal Time) for storage, and converts the data back from UTC to the current time zone (of the server) when retrieved. (This conversion does not occur for other types such as DATETIME.) By default, the current time zone for each connection is the server’s local time. The time zone can be set on a per-connection basis, and as long as the time zone setting remains constant, you will get back the same value you stored. If you store a TIMESTAMP value, and then change the time zone and retrieve the value, the retrieved value is different from the value you stored. This occurs because the same time zone was not used for conversion in both directions. The current time zone is available as the value of the time_zone system variable. For more information, see Section 5.1.12, “MySQL Server Time Zone Support”.

(From: https://dev.mysql.com/doc/refman/8.0/en/datetime.html)

The customer told me that this application would only be sending data with two different trailing time zones – Central and Eastern. With daylight-savings in use in both of these time zones, this would give us four possible trailing time zone values – CDT, CST, EDT and EST. What we want to do is to intercept the query, and write this TIMESTAMP data to a different column, and then convert the value to UTC time to be stored in the correct column in the database. Because we don’t have access to the source code, I am assuming we have full access to the MySQL database.

NOTE: Since we are using time zone information, if you want to duplicate this post, be sure to load the MySQL time zone information. See: https://dev.mysql.com/doc/refman/8.0/en/time-zone-support.html

The MySQL Rewrite Plugin

In MySQL version 5.7, a plugin named the “Query Rewrite Plugin” was introduced. This plugin can examine SQL statements received by the server and modify those statements before the server executes them. In other words, this gives you the ability to intercept “bad” queries and re-format them to be “good” queries for use with MySQL – or to rewrite the queries to do whatever you need. Think of it as a way to change the source code without actually having the source code.

Installing the plugin is fairly easy. In MySQL version 8.0, you install (or uninstall) the plugin via an SQL script provided with your MySQL installation. The script is named install_rewriter.sql and is located in the “share” directory under your MySQL home directory.

# cd /usr/local/mysql/share  (your directory may be different)
# mysql -u root -p < install_rewriter.sql
Enter password: (enter root password here)

The script only takes a few seconds to load (The uninstall script is named uninstall_rewriter.sql). To check and make sure the plugin was installed, run this command from within MySQL:

mysql> SHOW GLOBAL VARIABLES LIKE 'rewriter_enabled';
+------------------+-------+
| Variable_name    | Value |
+------------------+-------+
| rewriter_enabled | ON    |
+------------------+-------+
1 row in set (0.00 sec)

The plugin was installed correctly if the column named “Value” is set to “ON“.

For this example, I am going to create a small table with three columns, and assume that this is an table from a third-party application. The date_time_value column is where the application would normally store the timestamp information.

mysql> create database test;
 Query OK, 1 row affected (0.01 sec)
mysql> use test;
 Database changed
mysql> CREATE TABLE `time_example` (
  `idtime` int(11) NOT NULL AUTO_INCREMENT,
  `action_record` varchar(30) NOT NULL,
  `date_time_value` timestamp NULL DEFAULT NULL,
  PRIMARY KEY (`idtime`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=UTF8MB4;
Query OK, 0 rows affected (0.03 sec)

The date_time_value column will obviously not be able to store timestamp data with a trailing time zone, but let’s see what happens when we try and insert a row of data – and let’s pretend that this is the query the application uses.

mysql> insert into test.time_example (action_record, date_time_value) 
 values ('Arrived at work', '2018-09-05 17:00:00 EDT');
Error Code: 1292. Incorrect datetime value: '2018-09-05 17:00:00 EDT' 
 for column 'date_time_value' at row 1

Of course, we get an error because the format for the timestamp is incorrect.

What we want to do is to alter the table and add a column to store this improperly-formatted timestamp data.

mysql> ALTER TABLE `test`.`time_example` 
    -> ADD COLUMN `date_time_storage` VARCHAR(23) NULL AFTER `date_time_value`;
Query OK, 0 rows affected (0.05 sec)
Records: 0  Duplicates: 0  Warnings: 0

Now that we have a column (date_time_storage) to store the “bad” timestamp data, we need to modify the incoming query so that it writes the timestamp data into the new column.

Note: In MySQL 8.0+, with the Query Rewrite Plugin, you can modify SELECT, INSERT, REPLACE, UPDATE, and DELETE statements. (Prior to MySQL 8.0.12 you could only modify SELECT statements)

This is the query the application is sending to the database:

insert into test.time_example (action_record, date_time_value) values (?, ?);

We want to modify the query to use the new date_time_storage column, instead of the date_time_value column. The new query would look like this:

insert into test.time_example (action_record, date_time_storage) values (?, ?);

Now that we have our old (bad) and new (good) queries, we can insert this into the rewrite_rules table of the query_rewrite database.

INSERT INTO query_rewrite.rewrite_rules
    (pattern, replacement, pattern_database) VALUES(
    'insert into test.time_example (action_record, date_time_value) values (?, ?)',
    'insert into test.time_example (action_record, date_time_storage) values (?, ?)',
    'time_example'
    );
1 row(s) affected, 1 warning(s): 1105 Query 'insert into test.time_example 
 (action_record, date_time_value) values ('Left building', '2018-09-05 17:00:00 EDT')' 
 rewritten to 'insert into test.time_example (action_record, date_time_storage) 
 values ('Left building', '2018-09-05 17:00:00 EDT')' by a query rewrite plugin

(More examples may be found on this page: Query Rewrite Plugin Usage)

We need to execute a stored procedure named flush_rewrite_rules to make this query-rewrite change permanent: (See: https://dev.mysql.com/doc/refman/8.0/en/rewriter-query-rewrite-plugin-usage.html)

mysql> CALL query_rewrite.flush_rewrite_rules();
Query OK, 1 row affected (0.00 sec)

We can confirm the INSERT INTO query_rewrite.rewrite_rules by looking at the rewrite_rules table:

mysql> SELECT * FROM query_rewrite.rewrite_rules\G
*************************** 1. row ***************************
                id: 1
           pattern: insert into test.time_example (action_record, date_time_value) values (?, ?)
  pattern_database: time_example
       replacement: insert into test.time_example (action_record, date_time_storage) values (?, ?)
           enabled: YES
           message: NULL
    pattern_digest: e823e987338aeae6d47f7a729e78f532d3ff3721237c15981bcd11fc2607efda
normalized_pattern: insert into `test`.`time_example` (`action_record`,`date_time_value`) values (?,?)
1 row in set (0.00 sec)

Next, let’s run the same query as before, and see if it puts the timestamp data that is supposed to go into the date_time_value column into the new date_time_storage column:

mysql> insert into test.time_example (action_record, date_time_value) 
 values ('Arrived at work', '2018-09-05 17:00:00 EDT');
Query OK, 1 row affected, 1 warning (0.01 sec)

And now the table contains this data:

mysql> select * from time_example;
+--------+-----------------+-----------------+-------------------------+
| idtime | action_record   | date_time_value | date_time_storage       |
+--------+-----------------+-----------------+-------------------------+
|      1 | Arrived at work | NULL            | 2018-09-05 17:00:00 EDT |
+--------+-----------------+-----------------+-------------------------+
1 rows in set (0.00 sec)

We now have the timestamp with the time zone data stored in the MySQL database, but we need to convert this to a proper format, and put the result into the date_time_value column.

To do this, we can use a trigger.

Normally, you would want your application to produce data in the correct format, but in this example, we don’t have access to the source code. So, we can create a trigger to convert the “incorrectly-formatted” data in date_time_storage to the correct data and store it in date_time_value.

NOTE: These examples won’t work if your TIMESTAMP uses microseconds (6-digits) precision (example: ‘1970-01-01 00:00:01.000000’) – but you can modify the code to accommodate microseconds.

Here is the SQL to create the trigger:

DELIMITER $$
  
CREATE TRIGGER _time_zone_convert_insert2
AFTER INSERT ON time_example
FOR EACH ROW
BEGIN

DECLARE _date_time_no_tz varchar(20);

SET _date_time_no_tz = SUBSTRING(NEW.date_time_storage, 1, 20);

IF NEW.date_time_storage like '%EDT' THEN
    SET NEW.date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT');
END IF;

IF NEW.date_time_storage like '%EST' THEN
    SET NEW.date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT');
END IF;

IF NEW.date_time_storage like '%CDT' THEN
    SET NEW.date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT');
END IF;

IF NEW.date_time_storage like '%CST' THEN
    SET NEW.date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT');
END IF;

END$$

DELIMITER ;

Now that we have a trigger in place, let’s insert another line into the database – BUT, we still want to use the SQL from the application. The query will try and write to the date_time_value column, but the Query Rewrite Plugin will intercept the original query and substitute our new query instead – which will insert the timestamp data into the date_time_storage column, and then the trigger will convert the timestamp and place the correct value into the date_time_value column.

mysql> INSERT INTO time_example (action_record, date_time_value) 
 VALUES ('Lunch Break', '2018-09-05 18:00:00 EDT');
Query OK, 1 row affected (0.00 sec)

The table now contains a true timestamp column with the correct timestamp value in UTC. (The old row didn’t change)

mysql> SELECT * FROM test.time_example;
+--------+------------+---------------------+-------------------------+
| idtime | product_id | date_time_value     | date_time_storage       |
+--------+------------+---------------------+-------------------------+
|      1 | time now1  | NULL                | 2018-09-05 18:00:00 EDT |
|      2 | time now2  | 2018-09-05 22:00:00 | 2018-09-05 18:00:00 EDT |
+--------+------------+---------------------+-------------------------+
2 rows in set (0.00 sec)

But what about stored procedures?

The easiest way to handle the time zone conversion is with a trigger. But, to show you how stored procedures can do the same thing, I have an example of a stored procedure. In this example, I will be passing the values of the idtime and date_time_storage columns.

This example will be similar to the one above – I created a table named time_example, but this time, I am including the extra column:

'CREATE TABLE `time_example` (
  `idtime` int(11) NOT NULL AUTO_INCREMENT,
  `action_record` varchar(30) NOT NULL,
  `date_time_value` timestamp NULL DEFAULT NULL,
  `date_time_storage` varchar(23) DEFAULT NULL,
  PRIMARY KEY (`idtime`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8'

I then inserted a row, where I am storing the time stamp with the time zone information:

mysql> insert into test.time_example (action_record, date_time_storage) 
 values ('Left work', '2018-09-05 17:00:00 EDT’);
Query OK, 1 row affected (0.00 sec)

Here is what the row looks like:

mysql> SELECT * FROM test.time_example;
+--------+------------+-----------------+-------------------------+
| idtime | product_id | date_time_value | date_time_storage       |
+--------+------------+-----------------+-------------------------+
|      1 | Left work  | NULL            | 2018-09-05 17:00:00 EDT |
+--------+------------+-----------------+-------------------------+
1 row in set (0.00 sec)

Again, the date_time_storage column is a temporary storage column. I will call the stored procedure, and provide the idtime and date_time_storage values. The stored procedure which will look at the last three characters in the date_time_storage column, and then convert the time to UTC, which is then stored in the date_time_value column.

call _check_time_zone('1','2018-09-05 17:00:00 EDT');

Now the row looks like this, where the date_time_value column is now stored as UTC:

mysql> SELECT * FROM test.time_example;
+--------+------------+---------------------+-------------------------+
| idtime | product_id | date_time_value     | date_time_storage       |
+--------+------------+---------------------+-------------------------+
|      1 | Left work  | 2018-09-05 21:00:00 | 2018-09-05 17:00:00 EDT |
+--------+------------+---------------------+-------------------------+
1 row in set (0.00 sec)

And here is the code to create the stored procedure:

DELIMITER $$
CREATE DEFINER=`root`@`localhost` 
PROCEDURE `_check_time_zone`(IN _id_time INT, IN _date_time_storage VARCHAR(23))
BEGIN

DECLARE _date_time_no_tz varchar(20);

SET _date_time_no_tz = SUBSTRING(_date_time_storage, 1, 20);

IF _date_time_storage like '%EDT' THEN 
UPDATE time_example SET date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT')
WHERE idtime = _id_time;
END IF;

IF _date_time_storage like '%EST' THEN 
UPDATE time_example SET date_time_value = CONVERT_TZ(_date_time_no_tz,'EST5EDT','GMT')
WHERE idtime = _id_time;
END IF;

IF _date_time_storage like '%CDT' THEN 
UPDATE time_example SET date_time_value = CONVERT_TZ(_date_time_no_tz,'CST5CDT','GMT')
WHERE idtime = _id_time;
END IF;

IF _date_time_storage like '%CST' THEN 
UPDATE time_example SET date_time_value = CONVERT_TZ(_date_time_no_tz,'CST5CDT','GMT')
WHERE idtime = _id_time;
END IF;

IF _date_time_storage like '%UTC' THEN 
UPDATE time_example SET date_time_value = _date_time_no_tz
WHERE idtime = _id_time;
END IF;

END $$
DELIMITER ;

	Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
	Tony is the author of Twenty Forty-Four: The League of Patriots Visit http://2044thebook.com for more information.
	Tony is the editor/illustrator for NASA Graphics Standards Manual Remastered Edition Visit https://amzn.to/2oPFLI0 for more information.

Filed under MySQL, Uncategorized

MySQL 8.0 InnoDB Cluster – Creating a sandbox and testing MySQL Shell, Router and Group Replication

May 30, 2018 Leave a comment

MySQL’s InnoDB Cluster was released on Apr 12, 2017, with version 5.7 and is also included in MySQL version 8.0.

MySQL InnoDB cluster provides a complete high availability solution for MySQL. MySQL Shell includes AdminAPI which enables you to easily configure and administer a group of at least three MySQL server instances to function as an InnoDB cluster. Each MySQL server instance runs MySQL Group Replication, which provides the mechanism to replicate data within InnoDB clusters, with built-in failover. AdminAPI removes the need to work directly with Group Replication in InnoDB clusters, but for more information see Chapter 18, Group Replication which explains the details. MySQL Router can automatically configure itself based on the cluster you deploy, connecting client applications transparently to the server instances. In the event of an unexpected failure of a server instance the cluster reconfigures automatically. In the default single-primary mode, an InnoDB cluster has a single read-write server instance – the primary. Multiple secondary server instances are replicas of the primary. If the primary fails, a secondary is automatically promoted to the role of primary. MySQL Router detects this and forwards client applications to the new primary. Advanced users can also configure a cluster to have multiple-primaries. (source: Introducing InnoDB Cluster )

The following diagram shows an overview of how these technologies work together:

I am not going to go into any more details about how the InnoDB Cluster works, as there are plenty of articles over on the MySQL Server Team and MySQL High Availability blogs.

This post will walk you through setting up MySQL InnoDB Cluster in a sandbox environment. I used a Mac running 10.13, but these commands should also work on Linux. The only differences will be the location of the root directory and where MySQL is installed.

To begin, you will need to install the MySQL server (version 8.0), MySQL Router and MySQL Shell. You may download these from the MySQL downloads page. And, I am not going to walk you through the steps of installing these – as on the Mac, the installation is very easy. For any other OS, you will need to refer to the man pages for each package.

Okay – so I have installed all three components of the MySQL InnoDB Cluster – the MySQL server, MySQL Router and MySQL Shell (all three were versions 8.0.11). For ease of going through this tutorial, I would recommend opening five terminal windows – one for the MySQL Shell, one for regular OS access, and three for each of the sandboxes that I will create. The sandboxes will all run on one server, and each sandbox will have a different port number for the MySQL instance. In this example, I am using ports 3310, 3320 and 3330.

To make this tutorial easy to follow, simply enter/use the commands which appear as bold red. I have included the output from these commands separately.

One note: before you start, make sure that all of the directories in the the mysql home directory – /usr/local – are owned by the mysql user. You don’t have to create the mysql user separately, but you will notice the user name will be _mysql.

cd /usr/local
chown -R mysql mysql*

Let’s begin by logging in as root on my Mac and open the MySQL shell:

mysqlsh

~:root # mysqlsh
MySQL Shell 8.0.11

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.

I will be creating three sandbox instances to demo how MySQL InnoDB Cluster works. I will deploy the first sandbox instance using port 3310 and will need to enter a root password

dba.deploySandboxInstance(3310);

 MySQL  JS > dba.deploySandboxInstance(3310);
A new MySQL sandbox instance will be created on this host in 
/var/root/mysql-sandboxes/3310

Warning: Sandbox instances are only suitable for deploying and 
running on your local machine for testing purposes and are not 
accessible from external networks.

Please enter a MySQL root password for the new instance: 
Deploying new MySQL instance...

Instance localhost:3310 successfully deployed and started.
Use shell.connect('root@localhost:3310'); to connect to the instance.

Next, deploy the second sandbox instance using port 3320 – and all three instances will need to have the same root password.

dba.deploySandboxInstance(3320);

 MySQL  JS > dba.deploySandboxInstance(3320);
A new MySQL sandbox instance will be created on this host in 
/var/root/mysql-sandboxes/3320

Warning: Sandbox instances are only suitable for deploying and 
running on your local machine for testing purposes and are not 
accessible from external networks.

Please enter a MySQL root password for the new instance: 
Deploying new MySQL instance...

Instance localhost:3320 successfully deployed and started.
Use shell.connect('root@localhost:3320'); to connect to the instance.

Finally, deploy the third sandbox instance using port 3330.

dba.deploySandboxInstance(3330);

 MySQL  JS > dba.deploySandboxInstance(3330);
A new MySQL sandbox instance will be created on this host in 
/var/root/mysql-sandboxes/3330

Warning: Sandbox instances are only suitable for deploying and 
running on your local machine for testing purposes and are not 
accessible from external networks.

Please enter a MySQL root password for the new instance: 
Deploying new MySQL instance...

Instance localhost:3330 successfully deployed and started.
Use shell.connect('root@localhost:3330'); to connect to the instance.

From within the mysql shell, while in Javascript mode, connect to the first host on port 3310.

shell.connect(“root@localhost:3310”);

 MySQL  JS > shell.connect("root@localhost:3310");
Please provide the password for 'root@localhost:3310': 
Creating a session to 'root@localhost:3310'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 20
Server version: 8.0.11 MySQL Community Server - GPL
No default schema selected; type \use  to set one.

I will create the cluster starting with the 3310 sandbox – and the name of the cluster will be simply ‘mycluster’.

cluster = dba.createCluster(‘mycluster’);

MySQL localhost:3310 ssl JS > cluster = dba.createCluster('mycluster');
A new InnoDB cluster will be created on instance 'root@localhost:3310'.

Validating instance at localhost:3310...
Instance detected as a sandbox.
Please note that sandbox instances are only suitable for deploying test clusters for use within the same host.

This instance reports its own address as MacVM151.local

Instance configuration is suitable.
Creating InnoDB cluster 'mycluster' on 'root@localhost:3310'...
Adding Seed Instance...

Cluster successfully created. Use Cluster.addInstance() to add MySQL instances.
At least 3 instances are needed for the cluster to be able to withstand up to
one server failure.

Next, add the other two sandboxes to the cluster – the ones on ports 3320 and 3330.

Adding 3320…

cluster.addInstance(“root@localhost:3320”);

 MySQL  localhost:3310 ssl  JS > cluster.addInstance("root@localhost:3320");
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@localhost:3320': 
Adding instance to the cluster ...

Validating instance at localhost:3320...
Instance detected as a sandbox.
Please note that sandbox instances are only suitable for deploying test clusters for use within the same host.

This instance reports its own address as MacVM151.local

Instance configuration is suitable.
The instance 'root@localhost:3320' was successfully added to the cluster.

And 3330…

cluster.addInstance(“root@localhost:3330”);

 MySQL  localhost:3310 ssl  JS > cluster.addInstance("root@localhost:3330")
A new instance will be added to the InnoDB cluster. Depending on the amount of
data on the cluster this might take from a few seconds to several hours.

Please provide the password for 'root@localhost:3330': 
Adding instance to the cluster ...

Validating instance at localhost:3330...
Instance detected as a sandbox.
Please note that sandbox instances are only suitable for deploying test clusters for use within the same host.

This instance reports its own address as MacVM151.local

Instance configuration is suitable.
The instance 'root@localhost:3330' was successfully added to the cluster.

And now I can check the cluster status – and you can see 3310 is read-write, and the other two are read-only.

cluster.status();

 MySQL  localhost:3310 ssl  JS > cluster.status();
{
    "clusterName": "mycluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "localhost:3310", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3320": {
                "address": "localhost:3320", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3330": {
                "address": "localhost:3330", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@localhost:3310"
}

I have opened another terminal window, where I can also check the processes for mysqld, and you can see all three sandbox instances are running.

ps -ef|grep mysqld

~:root # ps -ef|grep mysqld
  501  2124   853   0  9:25PM ttys000    0:00.00 grep mysqld
  501  2078     1   0  8:55PM ttys002    0:07.28 /usr/local/mysql/bin/mysqld --defaults-file=/Users/tonydarnell/mysql-sandboxes/3310/my.cnf
  501  2098     1   0  9:16PM ttys002    0:03.98 /usr/local/mysql/bin/mysqld --defaults-file=/Users/tonydarnell/mysql-sandboxes/3320/my.cnf
  501  2106     1   0  9:16PM ttys002    0:03.67 /usr/local/mysql/bin/mysqld --defaults-file=/Users/tonydarnell/mysql-sandboxes/3330/my.cnf

From the terminal, go to /var/root and you can see the mysql-sandboxes directory, and the contents of each directory

cd /var/root
pwd
ls -ld mysql-sandboxes
ls -ld mysql-sandboxes/*
ls -ld mysql-sandboxes/3310/*

~:root # cd /var/root

~:root # pwd
/var/root

~:root # ls -ld mysql-sandboxes
drwxr-xr-x  5 _mysql  wheel  170 Apr 24 11:25 mysql-sandboxes

~:root # ls -ld mysql-sandboxes/*
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 12:07 mysql-sandboxes/3310
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 mysql-sandboxes/3320
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 mysql-sandboxes/3330

~:root # ls -ld mysql-sandboxes/3310/*
-rw-r-----   1 root    wheel     5 Apr 24 12:07 mysql-sandboxes/3310/3310.pid
-rw-------   1 _mysql  wheel   746 Apr 24 12:07 mysql-sandboxes/3310/my.cnf
drwxr-xr-x   2 _mysql  wheel    68 Apr 24 11:24 mysql-sandboxes/3310/mysql-files
drwxr-x---  41 _mysql  wheel  1394 Apr 24 12:07 mysql-sandboxes/3310/sandboxdata
-rwx------   1 _mysql  wheel   126 Apr 24 11:24 mysql-sandboxes/3310/start.sh
-rwx------   1 _mysql  wheel   196 Apr 24 11:24 mysql-sandboxes/3310/stop.sh

I want make sure I change ownership of the sandboxes to the mysql user.

cd /var/root
chown -R mysql mysql-sandboxes/
ls -ld mysql-sandboxes/

~:root # cd /var/root
~:root # chown -R mysql mysql-sandboxes/
~:root # ls -ld mysql-sandboxes/
drwxr-xr-x  5 _mysql  wheel  170 Apr 24 11:25 mysql-sandboxes/

Now, I want to verify that mysql router isn’t running.

ps -ef|grep router

~:root # ps -ef|grep router
    0  2766  2356   0 11:31AM ttys002    0:00.00 grep router

Before I start mysqlrouter I will want to include the router bin directory in the $PATH for root
I will edit root’s .profile file and add: /usr/local/mysql-router/bin to the $PATH.

I can now start the mysqlrouter in bootstrap mode. If you run this as root you will need to specify the user with the –user variable.

mysqlrouter –bootstrap localhost:3310 –directory /usr/local/myrouter –user=mysql

/usr/local:root # mysqlrouter --bootstrap localhost:3310 --directory /usr/local/myrouter --user=mysql
Please enter MySQL password for root: 

Bootstrapping MySQL Router instance at '/usr/local/myrouter'...
MySQL Router  has now been configured for the InnoDB cluster 'mycluster'.

The following connection information can be used to connect to the cluster.

Classic MySQL protocol connections to cluster 'mycluster':
- Read/Write Connections: localhost:6446
- Read/Only Connections: localhost:6447
X protocol connections to cluster 'mycluster':
- Read/Write Connections: localhost:64460
- Read/Only Connections: localhost:64470

/usr/local/myrouter/start.sh

Next, I will start the router.

/usr/local:root # /usr/local/myrouter/start.sh

I want to check the processes to make sure that router is running.

ps -ef|grep router

/usr/local:root # ps -ef|grep router
    0  2145     1   0  9:32PM ttys000    0:00.02 sudo ROUTER_PID=/usr/local/myrouter/mysqlrouter.pid /usr/local/bin/mysqlrouter -c /usr/local/myrouter/mysqlrouter.conf --user=mysql
   74  2146  2145   0  9:32PM ttys000    0:00.22 /usr/local/bin/mysqlrouter -c /usr/local/myrouter/mysqlrouter.conf --user=mysql
    0  2148  2136   0  9:32PM ttys000    0:00.00 grep router

I can now connect to the router and see which of the sandbox instances I connect to from the router.

If you already have a mysql shell window open – use this command: shell.connect(“root@localhost:6446”).

Or, from the command prompt – use this mysqlsh –uri root@localhost:6446

/usr/local/myrouter:root # mysqlsh --uri root@localhost:6446
Creating a session to 'root@localhost:6446'
Enter password: 
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 135
Server version: 8.0.11 MySQL Community Server - GPL
No default schema selected; type \use  to set one.
MySQL Shell 8.0.11

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.


 MySQL  localhost:6446 ssl  JS >

Switch to sql mode and check to see which port is being used.

\sql

select @@port;

 MySQL  localhost:6446 ssl  JS > \sql
Switching to SQL mode... Commands end with ;
MySQL  localhost:6446 ssl  SQL > select @@port;
+--------+
| @@port |
+--------+
|   3310 |
+--------+
1 row in set (0.0005 sec)

I can see that I am connected to port 3310 – which is the read/write instance of the InnoDB cluster.

If you want to check the status of the cluster – you can’t do that from router – you will have to connect to ports 3310, 3320 or 3330.

If you try to check the status of the cluster while in SQL mode, you get this error:

 MySQL  localhost:6446 ssl  SQL > cluster.status();
ERROR: 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'cluster.status()' at line 1

If you try while in javascript mode, you get this error:

 MySQL  localhost:6446 ssl  JS > cluster.status();
ReferenceError: cluster is not defined

So, I will want to connect back to the cluster itself – but first I need to go to javascript mode.

\js
shell.connect(“root@localhost:3310”);

 MySQL  localhost:6446 ssl  SQL > \js
Switching to JavaScript mode...

 MySQL  JS > shell.connect("root@localhost:3310");
Please provide the password for 'root@localhost:3310': 
Creating a session to 'root@localhost:3310'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 193
Server version: 8.0.11 MySQL Community Server - GPL
No default schema selected; type \use  to set one.

I can check the status of the cluster again from javascript mode.

cluster=dba.getCluster();
cluster.status();

 MySQL  localhost:3310 ssl  JS > cluster.status();
{
    "clusterName": "mycluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "localhost:3310", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3320": {
                "address": "localhost:3320", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3330": {
                "address": "localhost:3330", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@localhost:3310"
}

I am going to create a database, a table and then insert data into the table so I can see how group replication will replicate the changes from the read-write server to the other two servers.

I am opening three terminal windows each open to a separate port – 3310, 3320, and 3330.

mysql -uroot -p -P3310 -h127.0.0.1

/usr/local/myrouter:root #  mysql -uroot -p -P3310 -h127.0.0.1
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 219
Server version: 8.0.11 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

mysql -uroot -p -P3320 -h127.0.0.1

~:root # mysql -uroot -p -P3320 -h127.0.0.1
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 109
Server version: 8.0.11 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

mysql -uroot -p -P3330 -h127.0.0.1

/usr/local:root # mysql -uroot -p -P3330 -h127.0.0.1
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 99
Server version: 8.0.11 MySQL Community Server - GPL

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>

On each of the three instances – 3310, 3320 and 3330 – I will look at what databases I already have on each instance (they should only contain the default mysql databases).

show databases;

mysql> show databases;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
+-------------------------------+
5 rows in set (0.00 sec)

On the read-write server, which is on port 3310, I will create a database named test_01.

create database test_01;

mysql> create database test_01;
Query OK, 1 row affected (0.05 sec)

Now, I can check to see if the database was created on 3310, and then check on the other two to see that it has been replicated. I will run this on all three instances.

show databases;

mysql> show databases;
+-------------------------------+
| Database                      |
+-------------------------------+
| information_schema            |
| mysql                         |
| mysql_innodb_cluster_metadata |
| performance_schema            |
| sys                           |
| test_01                       |
+-------------------------------+
6 rows in set (0.00 sec)

I can see the test_01 database, and the new database doesn’t have any tables, so I will run this on all three run this to show zero tables:

use test_01;show tables;

mysql> use test_01;show tables;
Database changed
Empty set (0.01 sec);

I am going to create a table named “employees” on 3310.

CREATE TABLE `employees` (
`id_emp` int(11) NOT NULL AUTO_INCREMENT,
`name_first` varchar(45) DEFAULT NULL,
`name_middle` varchar(45) DEFAULT NULL,
`name_last` varchar(45) DEFAULT NULL,
`phone_home` varchar(45) DEFAULT NULL,
`phone_cell` varchar(45) DEFAULT NULL,
PRIMARY KEY (`id_emp`)
) ENGINE=InnoDB AUTO_INCREMENT=10000;

mysql> CREATE TABLE `employees` (
    ->   `id_emp` int(11) NOT NULL AUTO_INCREMENT,
    ->   `name_first` varchar(45) DEFAULT NULL,
    ->   `name_middle` varchar(45) DEFAULT NULL,
    ->   `name_last` varchar(45) DEFAULT NULL,
    ->   `phone_home` varchar(45) DEFAULT NULL,
    ->   `phone_cell` varchar(45) DEFAULT NULL,
    ->   PRIMARY KEY (`id_emp`)
    -> ) ENGINE=InnoDB AUTO_INCREMENT=10000;
Query OK, 0 rows affected (0.10 sec)

On all three instances – 3310, 3320 and 3330 – I will run this to show that the employee table creation was propagated to the other two servers via replication.

use test_01;show tables;

mysql> use test_01;show tables;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
+-------------------+
| Tables_in_test_01 |
+-------------------+
| employees         |
+-------------------+
1 row in set (0.00 sec)

And, on all three instances – 3310, 3320 and 3330 – I will run this to show that the employee table is empty.

select * from employees;

mysql> select * from employees;
Empty set (0.01 sec)

Now I can insert a row into the employees table on 3310.

INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES (‘John’, ‘H’, ‘Smith’, ‘404-555-1212’, ‘404-555-2020’);

mysql> INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES ('John', 'H', 'Smith', '404-555-1212', '404-555-2020');
Query OK, 1 row affected (0.00 sec)

And, on all three instances – 3310, 3320 and 3330 – I will run this to show that the insert statement was propagated to the other two servers.

select * from employees;

mysql> select * from employees;
+--------+------------+-------------+-----------+--------------+--------------+
| id_emp | name_first | name_middle | name_last | phone_home   | phone_cell   |
+--------+------------+-------------+-----------+--------------+--------------+
|  10006 | John       | H           | Smith     | 404-555-1212 | 404-555-2020 |
+--------+------------+-------------+-----------+--------------+--------------+
1 row in set (0.00 sec)

On another terminal – login as root and let’s take a look at the sandbox files to see what was created.

cd /var/root
pwd
ls -ld mysql-sandboxes/*

~:root # cd /var/root
~:root # pwd
/var/root
~:root # ls -ld mysql-sandboxes/*
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:24 mysql-sandboxes/3310
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 mysql-sandboxes/3320
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 mysql-sandboxes/3330

Now from within shell – connect to the router. You might have to start a new shell – quit the old one if you have it open – otherwise, you might still be connected to 3310.

mysqlsh

~:root # mysqlsh
MySQL Shell 8.0.11

Copyright (c) 2016, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type '\help' or '\?' for help; '\quit' to exit.

shell.connect(“root@localhost:6446”);

 MySQL  JS > shell.connect("root@localhost:6446");
Please provide the password for 'root@localhost:6446': 
Creating a session to 'root@localhost:6446'
Fetching schema names for autocompletion... Press ^C to stop.
Your MySQL connection id is 146
Server version: 8.0.11 MySQL Community Server - GPL
No default schema selected; type \use  to set one.

I am going to kill the first box sandbox (using port 3310) which was the read-write instance.

dba.killSandboxInstance(3310);

 MySQL  localhost:6446 ssl  JS > dba.killSandboxInstance(3310);
The MySQL sandbox instance on this host in 
/var/root/mysql-sandboxes/3310 will be killed


Killing MySQL instance...

Instance localhost:3310 successfully killed.

# switch to sql mode

\sql

 MySQL  localhost:6446 ssl  JS > \sql
Switching to SQL mode... Commands end with ;

Now, I can check to see which port is now being used by the router.

select @@port;

 MySQL  localhost:6446 ssl  SQL > select @@port;
+--------+
| @@port |
+--------+
|   3320 |
+--------+
1 row in set (0.0004 sec)

I will switch to another terminal window and despite killing the sandbox, the sandbox files for 3310 weren’t removed.

ls -ld /var/root/mysql-sandboxes/*

~:root # ls -ld /var/root/mysql-sandboxes/*
drwxr-xr-x  7 _mysql  wheel  238 Apr 24 11:58 /var/root/mysql-sandboxes/3310
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 /var/root/mysql-sandboxes/3320
drwxr-xr-x  8 _mysql  wheel  272 Apr 24 11:25 /var/root/mysql-sandboxes/3330

I will switch back the mysqlsh window, and switch to javascript mode.

\js

 MySQL  localhost:6446 ssl  SQL > \js
Switching to JavaScript mode...

I can now check the status of the cluster.

cluster=dba.getCluster();

 MySQL  localhost:6446 ssl  JS > cluster=dba.getCluster();

cluster.status();

You can see how the instance for 3310 is now labeled as MISSING and how 3320 is now the read-write instance.

 MySQL  localhost:6446 ssl  JS > cluster.status();
{
    "clusterName": "mycluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "localhost:3320", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "(MISSING)"
            }, 
            "localhost:3320": {
                "address": "localhost:3320", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3330": {
                "address": "localhost:3330", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@localhost:6446"
}

Let’s add 3310 back into the cluster – and after you do this, if you quickly do another cluster status, you will see it is in recovery mode.

dba.startSandboxInstance(3310);

 MySQL  localhost:6446 ssl  JS > dba.startSandboxInstance(3310);
The MySQL sandbox instance on this host in 
/var/root/mysql-sandboxes/3310 will be started


Starting MySQL instance...

Instance localhost:3310 successfully started.

# if you do another cluster.status(); very quickly
# you can now see that 3310 is in RECOVERING mode

cluster.status();

 MySQL  localhost:6446 ssl  JS > cluster.status();
{
    "clusterName": "mycluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "localhost:3320", 
        "ssl": "REQUIRED", 
        "status": "OK_NO_TOLERANCE", 
        "statusText": "Cluster is NOT tolerant to any failures. 1 member is not active", 
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "RECOVERING"
            }, 
            "localhost:3320": {
                "address": "localhost:3320", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3330": {
                "address": "localhost:3330", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@localhost:6446"
}

And then do another cluster.status() – and you can see how 3310 has rejoined the cluster, but it is now a read-only node.

cluster.status();

 MySQL  localhost:6446 ssl  JS > cluster.status();
{
    "clusterName": "mycluster", 
    "defaultReplicaSet": {
        "name": "default", 
        "primary": "localhost:3320", 
        "ssl": "REQUIRED", 
        "status": "OK", 
        "statusText": "Cluster is ONLINE and can tolerate up to ONE failure.", 
        "topology": {
            "localhost:3310": {
                "address": "localhost:3310", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3320": {
                "address": "localhost:3320", 
                "mode": "R/W", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }, 
            "localhost:3330": {
                "address": "localhost:3330", 
                "mode": "R/O", 
                "readReplicas": {}, 
                "role": "HA", 
                "status": "ONLINE"
            }
        }
    }, 
    "groupInformationSourceMember": "mysql://root@localhost:6446"
}

Now that 3310 is back online, if I try and do an insert from 3310 – I get an error – because it is a read-only node.

INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES (‘John’, ‘H’, ‘Smith’, ‘404-555-1212’, ‘404-555-2020’);

mysql> INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES ('John', 'H', 'Smith', '404-555-1212', '404-555-2020');
ERROR 1290 (HY000): The MySQL server is running with the --super-read-only option so it cannot execute this statement

I can do an insert on 3320 as it is the read-write node, and check to see if it was replicated to the other two servers.

INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES (‘Susan’, ‘K’, ‘James’, ‘912-555-8565’, ‘912-555-9986’);

mysql> INSERT INTO `test_01`.`employees` (`name_first`, `name_middle`, `name_last`, `phone_home`, `phone_cell`) VALUES ('Susan', 'K', 'James', '912-555-8565', '912-555-9986');
Query OK, 1 row affected (0.09 sec)

And now I can check the employees table to see both rows of data – on all of the nodes.

use test_01; select * from employees;

mysql> select * from employees;                                                              
+--------+------------+-------------+-----------+--------------+--------------+
| id_emp | name_first | name_middle | name_last | phone_home   | phone_cell   |
+--------+------------+-------------+-----------+--------------+--------------+
|  10003 | John       | H           | Smith     | 404-555-1212 | 404-555-2020 |
|  10004 | Susan      | K           | James     | 912-555-8565 | 912-555-9986 |
+--------+------------+-------------+-----------+--------------+--------------+
2 rows in set (0.00 sec)

The InnoDB Cluster (sandbox version) is now up and running!

If you want to start over, kill all three mysqld processes, where pid1 and pi2 are the process ID’s from the ps -ef statement.

ps -ef| grep mysqld
kill -9 pid1 pid2

Double-check to make sure you killed the mysqld processes:

ps -ef| grep mysqld

~:root # ps -ef| grep mysqld
    0  2342     1   0 10:05PM ttys000    2:34.77 /usr/local/mysql/bin/mysqld --defaults-file=/var/root/mysql-sandboxes/3320/my.cnf --user=root
    0  2347     1   0 10:05PM ttys000    2:29.65 /usr/local/mysql/bin/mysqld --defaults-file=/var/root/mysql-sandboxes/3330/my.cnf --user=root
    0  2706     1   0  9:58AM ttys000    0:41.80 /usr/local/mysql/bin/mysqld --defaults-file=/var/root/mysql-sandboxes/3310/my.cnf --user=root
    0  2721  2356   0 11:17AM ttys002    0:00.00 grep mysqld
~:root # kill -9 2342 2347 2706
~:root # ps -ef| grep mysqld
    0  2723  2356   0 11:17AM ttys002    0:00.00 grep mysqld

Remove the sandbox files.

cd /var/root
ls -ld mysql-sandboxes/
rm -r mysql-sandboxes/
ls -ld mysql-sandboxes/

~:root # cd /var/root
~:root # ls -ld mysql-sandboxes/
drwxr-xr-x  5 _mysql  wheel  170 Apr 23 22:05 mysql-sandboxes/
~:root # rm -r mysql-sandboxes/
~:root # ls -ld mysql-sandboxes/
ls: mysql-sandboxes/: No such file or directory

ps -ef|grep router
kill -9 2645 2646 (whatever the PIDs are for router)
ps -ef|grep router

~:root # ps -ef|grep router
    0  2645     1   0  9:27AM ttys000    0:00.01 sudo ROUTER_PID=/usr/local/myrouter/mysqlrouter.pid /usr/local/bin/mysqlrouter -c /usr/local/myrouter/mysqlrouter.conf --user=mysql
   74  2646  2645   0  9:27AM ttys000    0:39.63 /usr/local/bin/mysqlrouter -c /usr/local/myrouter/mysqlrouter.conf --user=mysql
    0  2764  2356   0 11:31AM ttys002    0:00.00 grep router
~:root # kill -9 2646

Remove the /usr/local/myrouter directory.

cd /usr/local/
ls -ld myrouter
rm -r /usr/local/myrouter

/usr/local:root # cd /usr/local/
/usr/local:root # ls -ld myrouter
drwx------  10 _mysql  _mysql  340 Apr 24 11:40 myrouter
/usr/local:root # rm -r /usr/local/myrouter
/usr/local:root # ls -ld myrouter
ls: myrouter: No such file or directory

You can now start the whole process over again.

Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, MySQL Enterprise Edition, MySQL InnoDB Cluster, MySQL Replication, Uncategorized

MySQL Enterprise Edition Database Firewall – Control and Monitor SQL Statement Executions

April 23, 2018 Leave a comment

As of MySQL 5.6.24, MySQL Enterprise Edition includes MySQL Enterprise Firewall, an application-level firewall (it runs within the mysql database process) that enables database administrators to permit or deny SQL statement execution based on matching against whitelists of accepted statement patterns. This helps harden MySQL Server against attacks such as SQL injection or attempts to exploit applications by using them outside of their legitimate query workload characteristics.

Each MySQL account registered with the firewall has its own whitelist of statement patterns (a tokenized representation of a SQL statement), enabling protection to be tailored per account. For a given account, the firewall can operate in recording or protecting mode, for training in the accepted statement patterns or protection against unacceptable statements. The diagram illustrates how the firewall processes incoming statements in each mode.

MySQL Enterprise Firewall Operation

(from https://dev.mysql.com/doc/refman/5.6/en/firewall.html)

If you do not have a MySQL Enterprise Edition license, you may download a trial version of the software via Oracle eDelivery. The MySQL Firewall is included in the MySQL Product Pack, specifically for MySQL Database 5.6.24 or higher.

MySQL Enterprise Firewall has these components:

A server-side plugin named MYSQL_FIREWALL that examines SQL statements before they execute and, based on its in-memory cache, renders a decision whether to execute or reject each statement.
Server-side plugins named MYSQL_FIREWALL_USERS and MYSQL_FIREWALL_WHITELIST implement INFORMATION_SCHEMA tables that provide views into the firewall data cache.
System tables named firewall_users and firewall_whitelist in the mysql database provide persistent storage of firewall data.
A stored procedure named sp_set_firewall_mode() registers MySQL accounts with the firewall, establishes their operational mode, and manages transfer of firewall data between the cache and the underlying system tables.
A set of user-defined functions provides an SQL-level API for synchronizing the cache with the underlying system tables.
System variables enable firewall configuration and status variables provide runtime operational information.

(from https://dev.mysql.com/doc/refman/5.6/en/firewall-components.html)

Installing the Firewall

Installing the firewall is fairly easy. After you install MySQL version 5.6.24 or greater, you simply execute an SQL script that is located in the $MYSQL_HOME/share directory. There are two versions of the script, one for Linux and one for Windows (the firewall isn’t supported on the Mac yet).

The scripts are named win_install_firewall.sql for Windows and linux_install_firewall.sql for linux. You may execute this script from the command line or via MySQL Workbench. For the command line, be sure you are in the directory where the script is located.

shell> mysql -u root -p mysql < win_install_firewall.sql
Enter password: (enter root password here)

The script create the firewall tables, functions, stored procedures and installs the necessary plugins. The script contains the following:

# Copyright (c) 2015 Oracle and/or its affiliates. All rights reserved.
# Install firewall tables
USE mysql;
CREATE TABLE IF NOT EXISTS mysql.firewall_whitelist( USERHOST VARCHAR(80) NOT NULL, RULE text NOT NULL) engine= MyISAM;
CREATE TABLE IF NOT EXISTS mysql.firewall_users( USERHOST VARCHAR(80) PRIMARY KEY, MODE ENUM ('OFF', 'RECORDING', 'PROTECTING', 'RESET') DEFAULT 'OFF') engine= MyISAM;

INSTALL PLUGIN mysql_firewall SONAME 'firewall.dll';
INSTALL PLUGIN mysql_firewall_whitelist SONAME 'firewall.dll';
INSTALL PLUGIN mysql_firewall_users SONAME 'firewall.dll';

CREATE FUNCTION set_firewall_mode RETURNS STRING SONAME 'firewall.dll';
CREATE FUNCTION normalize_statement RETURNS STRING SONAME 'firewall.dll';
CREATE AGGREGATE FUNCTION read_firewall_whitelist RETURNS STRING SONAME 'firewall.dll';
CREATE AGGREGATE FUNCTION read_firewall_users RETURNS STRING SONAME 'firewall.dll';
delimiter //
CREATE PROCEDURE sp_set_firewall_mode (IN arg_userhost VARCHAR(80), IN arg_mode varchar(12))
BEGIN
IF arg_mode = "RECORDING" THEN
  SELECT read_firewall_whitelist(arg_userhost,FW.rule) FROM mysql.firewall_whitelist FW WHERE FW.userhost=arg_userhost;
END IF;
SELECT set_firewall_mode(arg_userhost, arg_mode);
if arg_mode = "RESET" THEN
  SET arg_mode = "OFF";
END IF;
INSERT IGNORE INTO mysql.firewall_users VALUES (arg_userhost, arg_mode);
UPDATE mysql.firewall_users SET mode=arg_mode WHERE userhost = arg_userhost;

IF arg_mode = "PROTECTING" OR arg_mode = "OFF" THEN
  DELETE FROM mysql.firewall_whitelist WHERE USERHOST = arg_userhost;
  INSERT INTO mysql.firewall_whitelist SELECT USERHOST,RULE FROM INFORMATION_SCHEMA.mysql_firewall_whitelist WHERE USERHOST=arg_userhost;
END IF;
END //
delimiter ;

After you run the script, the firewall should be enabled. You may verify it by running this statement:

mysql> SHOW GLOBAL VARIABLES LIKE 'mysql_firewall_mode';
+-------------------------------+-------+
| Variable_name                 | Value |
+-------------------------------+-------+
| mysql_firewall_max_query_size |  4096 |
| mysql_firewall_mode           |    ON |
| mysql_firewall_trace          |   OFF |
+-------------------------------+-------+

Testing the Firewall

To test the firewall, you may use a current mysql user, but we are going to create a test user for this example – webuser@localhost. (The user probably doesn’t need all privileges, but for this example we will grant everything to this user)

CREATE USER 'webuser'@'localhost' IDENTIFIED BY 'Yobuddy!';
'GRANT ALL PRIVILEGES ON *.* TO 'webuser'@'localhost' WITH GRANT OPTION'

OPTIONAL: For our test, we will be using the sakila schema provided by MySQL. You may download the sakila database schema (requires MySQL 5.0 or later) at http://dev.mysql.com/doc/index-other.html. If you don’t want to use the sakila database, you may use your own existing database or create a new database.

After downloading the sakila schema, you will have two files, named sakila-schema.sql and sakila-data.sql. Execute the sakila-schema.sql first, and then sakila-data.sql to populate the database with data. If you are using the command line, simply do the following: (substitute UserName for a mysql user name)

# mysql -uUserName -p < sakila-schema.sql
# mysql -uUserName -p < sakila-data.sql

After creating the sakila schema and importing the data, we now set the firewall to record those queries which we want to allow:

mysql> CALL `mysql`.`sp_set_firewall_mode`("webuser@localhost","RECORDING")
+-----------------------------------------------+
| read_firewall_whitelist(arg_userhost,FW.rule) |
+-----------------------------------------------+
| Imported users: 0  Imported rules: 0          |
+-----------------------------------------------+
1 row in set (0.14 sec)

+-------------------------------------------+
| set_firewall_mode(arg_userhost, arg_mode) |
+-------------------------------------------+
| OK                                        |
+-------------------------------------------+
1 row in set (0.22 sec)
Query OK, 5 rows affected (0.28 sec)

We can check to see the firewall mode via this statement, to be sure we are in the recording mode:

mysql> SELECT * FROM MYSQL.FIREWALL_USERS;
+-------------------+------------+
| USERHOST          | MODE       |
+-------------------+------------+
| webuser@localhost |  RECORDING |
+-------------------+------------+
1 row in set (0.02 sec)

Now that we have recording turned on, let’s run a few queries:

mysql> use sakila
Database changed
mysql> show tables;
+----------------------------+
| Tables_in_sakila           |
+----------------------------+
| actor                      |
| actor_info                 |
| address                    |
| category                   |
| city                       |
| country                    |
| customer                   |
| customer_list              |
| film                       |
| film_actor                 |
| film_category              |
| film_list                  |
| film_text                  |
| inventory                  |
| language                   |
| nicer_but_slower_film_list |
| payment                    |
| rental                     |
| sales_by_film_category     |
| sales_by_store             |
| staff                      |
| staff_list                 |
| store                      |
+----------------------------+
23 rows in set (0.00 sec)

mysql> select * from actor limit 2;
+----------+------------+-----------+---------------------+
| actor_id | first_name | last_name | last_update         |
+----------+------------+-----------+---------------------+
|        1 | PENELOPE   | GUINESS   | 2006-02-15 04:34:33 |
|        2 | NICK       | WAHLBERG  | 2006-02-15 04:34:33 |
+----------+------------+-----------+---------------------+
2 rows in set (0.13 sec)

mysql> select first_name, last_name from actor where first_name like 'T%';
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| TIM        | HACKMAN   |
| TOM        | MCKELLEN  |
| TOM        | MIRANDA   |
| THORA      | TEMPLE    |
+------------+-----------+
4 rows in set (0.00 sec)

We turn off the recording by turning on the protection mode:

mysql> CALL `mysql`.`sp_set_firewall_mode`("webuser@localhost","PROTECTING");
+-------------------------------------------+
| set_firewall_mode(arg_userhost, arg_mode) |
+-------------------------------------------+
| OK                                        |
+-------------------------------------------+
1 row in set (0.00 sec)

We can check to see the firewall mode via this statement:

mysql> SELECT * FROM MYSQL.FIREWALL_USERS;
+-------------------+------------+
| USERHOST          | MODE       |
+-------------------+------------+
| webuser@localhost | PROTECTING |
+-------------------+------------+
1 row in set (0.02 sec)

And we can look at our whitelist of statements:

mysql>  SELECT * FROM MYSQL.FIREWALL_WHITELIST;
+-------------------+-------------------------------------------------------------------+
| USERHOST          | RULE                                                              |
+-------------------+-------------------------------------------------------------------+
| webuser@localhost | SELECT * FROM actor LIMIT ?                                       |
| webuser@localhost | SELECT SCHEMA ( )                                                 |
| webuser@localhost | SELECT first_name , last_name FROM actor WHERE first_name LIKE ?  |
| webuser@localhost | SHOW TABLES                                                       |
+-------------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)

The firewall is now protecting against non-whitelisted queries. We can execute a couple of the queries we previously ran, which should be allowed by the firewall.

mysql> show tables;
+----------------------------+
| Tables_in_sakila           |
+----------------------------+
| actor                      |
| actor_info                 |
| address                    |
| category                   |
| city                       |
| country                    |
| customer                   |
| customer_list              |
| film                       |
| film_actor                 |
| film_category              |
| film_list                  |
| film_text                  |
| inventory                  |
| language                   |
| nicer_but_slower_film_list |
| payment                    |
| rental                     |
| sales_by_film_category     |
| sales_by_store             |
| staff                      |
| staff_list                 |
| store                      |
+----------------------------+
23 rows in set (0.01 sec)

Now we run two new queries, which should be blocked by the firewall.

mysql> select * from rental;
ERROR 1045 (42000): Firewall prevents statement

mysql> select * from staff;
ERROR 1045 (42000): Firewall prevents statement

The server will write an error message to the log for each statement that is rejected. Example:

2015-03-21T22:59:05.371772Z 14 [Note] Plugin MYSQL_FIREWALL reported:
'ACCESS DENIED for webuser@localhost. Reason: No match in whitelist.
Statement: select * from rental '

You can use these log messages in your efforts to identify the source of attacks.

To see how much firewall activity you have, you may look look at the status variables:

mysql> SHOW GLOBAL STATUS LIKE 'Firewall%';
+-------------------------+-------+
| Variable_name           | Value |
+-------------------------+-------+
| Firewall_access_denied  | 42    |
| Firewall_access_granted | 55    |
| Firewall_cached_entries | 78    |
+-------------------------+-------+

The variables indicate the number of statements rejected, accepted, and added to the cache, respectively.

The MySQL Enterprise Firewall Reference is found at https://dev.mysql.com/doc/refman/5.6/en/firewall-reference.html.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, MySQL Enterprise Edition, Scripting, Uncategorized

MySQL Enterprise Audit – parsing audit information from log files, inserting into MySQL table via LOAD DATA INFILE and Perl script

November 6, 2017 Leave a comment

The MySQL Enterprise Audit plug-in is part of the MySQL Enterprise Edition (available through a paid license). Basically, Enterprise Audit tracks everything that is happening on your MySQL server, and can be used to protect/detect the misuse of information, and to meet popular compliance regulations including HIPAA, Sarbanes-Oxley, and the PCI Data Security Standard.

MySQL Enterprise Audit uses the open MySQL Audit API to enable standard, policy-based monitoring and logging of connection and query activity executed on specific MySQL servers. Designed to meet the Oracle audit specification, MySQL Enterprise Audit provides an out of box, easy to use auditing and compliance solution for applications that are governed by both internal and external regulatory guidelines.

When installed, the audit plugin enables MySQL Server to produce a log file containing an audit record of server activity. The log contents include when clients connect and disconnect, and what actions they perform while connected, such as which databases and tables they access.
(from https://dev.mysql.com/doc/refman/5.6/en/audit-log-plugin.html)

When you enable MySQL Enterprise Audit, log files are generated in your MySQL data directory. You can use tools like MySQL Workbench (Enterprise Edition) or Oracle Audit Vault to import the log data, to view the information and to generate reports.

I was talking with a client, and he wanted to know if the audit data could be stored in a table. Currently (as of MySQL 5.6.25), the audit information is stored as XML in the audit log files. There are several ways to do this, and I will cover two methods.

The first is to use the LOAD XML [LOCAL] INFILE command. You will need to create a table to store the audit information:

CREATE TABLE audit_log (
   RECORD_ID varchar(40) NOT NULL,
   NAME varchar(64),
   TIMESTAMP timestamp,
   COMMAND_CLASS varchar(64),
   CONNECTION_ID bigint unsigned,
   DB varchar(64),
   HOST varchar(60),
   IPv4 int unsigned,
   IPv6 varbinary(16),
   MYSQL_VERSION varchar(64),
   OS_LOGIN varchar(64),
   PRIV_USER varchar(16),
   PROXY_USER varchar(16),
   SERVER_ID int unsigned,
   SQLTEXT longtext,
   STARTUP_OPTIONS text,
   STATUS int unsigned,
   STATUS_CODE int unsigned,
   USER varchar(168),
   VERSION int unsigned,
   PRIMARY KEY(RECORD_ID)
) DEFAULT CHARSET utf8mb4;

You can then load the data as:

LOAD XML LOCAL INFILE 'audit.log'
    INTO TABLE audit_log
    CHARACTER SET utf8mb4
    ROWS IDENTIFIED BY ''
         (RECORD_ID, NAME, @TIMESTAMP, COMMAND_CLASS, CONNECTION_ID, DB, HOST, @IP, MYSQL_VERSION, OS_LOGIN, PRIV_USER, PROXY_USER, SERVER_ID, SQLTEXT, STARTUP_OPTIONS, STATUS, STATUS_CODE, USER, VERSION)
     SET TIMESTAMP = CONVERT_TZ(STR_TO_DATE(@TIMESTAMP, '%Y-%m-%dT%H:%i:%s UTC'), 'UTC', 'Australia/Sydney'),
         IPv4 = IF(IS_IPV4(@IP), INET_ATON(@IP), NULL),
         IPv6 = IF(IS_IPV6(@IP), INET6_ATON(@IP), NULL);

Important notes for the above example:

The example converts the IP address to the numeric version and stores it in IPv4 or IPv6 depending on the type of IP address. It is also possible to store the IP address in a shared varchar() column.
Replace the target time zone in CONVERT_TZ() with the time zone of you system.
The use of CONVERT_TZ() requires named time zones to be loaded or that you use a numeric offset such as +10:00.

RECORD_ID is guaranteed unique with the following limitations:

If you change audit_log_format, the counter will reset. However as the timestamp is also included in the RECORD_ID this will in general not be an issue.
It is only unique for a given MySQL instance. If you want to import the audit logs for multiple instances, you can for example add the server_uuid to the table and use as part of the primary key.
server_uuid is available starting from MySQL 5.6.
The matching between the elements in the XML file and the column names is case sensitive. All the elements in the audit log are in upper case.
The ROWS IDENTIFIED BY ” clause is required unless the table is named AUDIT_RECORD as the table name is the default elements to look for as rows.

Here are some links for more reading on using LOAD XML [LOCAL] INFILE:

https://dev.mysql.com/doc/refman/5.6/en/audit-log-plugin-options-variables.html#sysvar_audit_log_format
https://dev.mysql.com/doc/refman/5.6/en/load-xml.html
https://dev.mysql.com/doc/refman/5.6/en/date-and-time-functions.html#function_convert-tz
https://dev.mysql.com/doc/refman/5.6/en/time-zone-support.html
https://dev.mysql.com/doc/refman/5.6/en/replication-options.html#sysvar_server_uuid

For the second option, I wrote a quick Perl script that would parse the XML log files and insert the information into a MySQL database. You will need to set the size of your audit log files in your my.cnf or my.ini configuration file via the audit_log_rotate_on_size variable. You might need to adjust the size of your log files based upon database activity and how well the script parses the log files. If your log files are very large, the Perl script might have issues processing it, and you might want to decrease the size of your log files and run the script more frequently.

CAVEAT
Enterprise Audit does require a license from MySQL. If you are interested in an Enterprise subscription, contact me via the comment section below. If you are an Enterprise customer, you will need to configure Enterprise Audit first. See the Enterprise Audit online documentation page for more information, or contact MySQL Support.

For the data fields, I used the audit log file format information found at The Audit Log File page on MySQL.com.

My MySQL server doesn’t have a lot of activity, so I tried to configure the size of the data fields as best as possible to accommodate the possible size of the data in each field. There may be instances where you will have to increase the size of these fields or change their data types. The largest field is the SQL_TEXT field which will contain your SQL statements. Every table has a max row size of 65,535 bytes. So, the largest possible size of the SQL_TEXT field could be for this example is around 63,200 bytes (65,535 bytes minus the sum of the size of all of the other fields, and minus the 1-byte or 2-byte length prefix used for each varchar field). In this example, the SQL_TEXT field is set to 8,096 bytes, so you may need to increase or decrease this value.

I used varchar data types for each field, excluding the primary key field named ID. I did not spend a lot of time on the database schema, so you might want to modify it a little. I am sure that some of the fields are integers, but I did not have enough data in my log files to positively determine all of the possible values for each field. I did read the online manual, and it stated that CONNECTION_ID, SERVER_ID, STATUS, STATUS_CODE and VERSION were unsigned integers – but I left them as varchar.

NOTICE
This script requires the use of the new format for the audit log files, which is available in MySQL versions 5.6.20 or later.

I created a database along with two tables; one to store the log file information, and a history table to keep track of what files had already been parsed and inserted into MySQL, as well as the number of log file entries. The CREATE DATABASE and CREATE TABLE syntax is as follows:

CREATE DATABASE `audit_information` /*!40100 DEFAULT CHARACTER SET latin1 */

CREATE TABLE `audit_parsed` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `COMMAND_CLASS` varchar(64) DEFAULT NULL,
  `CONNECTIONID` varchar(32) DEFAULT NULL,
  `DB_NAME` varchar(64) DEFAULT NULL,
  `HOST_NAME` varchar(256) DEFAULT NULL,
  `IP_ADDRESS` varchar(16) DEFAULT NULL,
  `MYSQL_VERSION` varchar(64) DEFAULT NULL,
  `COMMAND_NAME` varchar(64) DEFAULT NULL,
  `OS_LOGIN` varchar(64) DEFAULT NULL,
  `OS_VERSION` varchar(256) DEFAULT NULL,
  `PRIV_USER` varchar(16) DEFAULT NULL,
  `PROXY_USER` varchar(16) DEFAULT NULL,
  `RECORD_ID` varchar(64) DEFAULT NULL,
  `SERVER_ID` varchar(32) DEFAULT NULL,
  `SQL_TEXT` varchar(8096) DEFAULT NULL,
  `STARTUP_OPTIONS` varchar(1024) DEFAULT NULL,
  `COMMAND_STATUS` varchar(64) DEFAULT NULL,
  `STATUS_CODE` varchar(11) DEFAULT NULL,
  `DATE_TIMESTAMP` varchar(24) DEFAULT NULL,
  `USER_NAME` varchar(128) DEFAULT NULL,
  `LOG_VERSION` varchar(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

CREATE TABLE `audit_history` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `AUDIT_LOG_NAME` varchar(64) DEFAULT NULL,
  `PARSED_DATE_TIME` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `LOG_ENTRIES` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

The Perl script finds the non-active log files (which end in .xml – example: audit.log.14357895017796690.xml), parses the data, creates an SQL file with INSERT statements, imports the data via the mysql command-line program, and then moves the log file(s) and SQL file(s) to a directory. The history table records what files have been processed, so you don’t accidentally process the same file twice.

In the beginning of the Perl script, there are several values you need to replace to match your system. The values are under the section titled “values needed”. Here is the Perl script (named audit.pl):

#!/usr/bin/perl
# audit.pl

use DBI;
use CGI;
use XML::Simple;

#----------------------------------------------------------
# values needed
$Database = "audit_information";
$MYSQL_DATA_DIR = "/usr/local/mysql/data";
$MySQL_Host_IP_Name = "192.168.1.2";
$mysql_user = "root";
$mysql_password = "password_needed";

# directory to store old audit files after parsing
$audit_directory = "$MYSQL_DATA_DIR/audit_files";

# make an audit_files directory if one does not exist
mkdir($audit_directory) unless(-d $audit_directory);
#----------------------------------------------------------


#----------------------------------------------------------
#for each file do this
@files = @files = ;;
foreach $file_name_to_parse (@files) {

	#----------------------------------------------------------
	# check to see if file has already been parsed
	$dbh1 = ConnectToMySql($Database);
	$query1 = "select AUDIT_LOG_NAME from audit_history where AUDIT_LOG_NAME = '$file_name_to_parse'";
	$sth1 = $dbh1->prepare($query1);
	$sth1->execute();

          while (@data = $sth1->fetchrow_array()) {
            
         	   $audit_log_name = $data[0];

			}

	# if length of audit_log_name is less than 1, process file
	if (length($audit_log_name) $PARSED_FILE") or die print "Couldn't open log_file: $!";
		
		$count = 0;
		
		# XML::Simple variable - SuppressEmpty => 1   ignore empty values
		$xml = XML::Simple->new(SuppressEmpty => 1);
		$data = $xml->XMLin("$file_name_to_parse");
		
		foreach $info (@{$data->{AUDIT_RECORD}})
		{
			# replace tick marks ' with \' in the SQL TEXT
			$info->{"SQLTEXT"} =~ s/'/\\'/g;
		
			print LOGOUT "INSERT INTO audit_information.AUDIT_PARSED (COMMAND_CLASS, CONNECTIONID, DB_NAME, HOST_NAME, IP_ADDRESS, MYSQL_VERSION, COMMAND_NAME, OS_LOGIN, OS_VERSION, PRIV_USER, PROXY_USER, RECORD_ID, SERVER_ID, SQL_TEXT, STARTUP_OPTIONS, COMMAND_STATUS, STATUS_CODE, DATE_TIMESTAMP, USER_NAME, LOG_VERSION) values ('" . $info->{"COMMAND_CLASS"} . "', '" . $info->{"CONNECTION_ID"} . "', '" . $info->{"DB"} . "', '" . $info->{"HOST"} . "', '" . $info->{"IP"} . "', '" . $info->{"MYSQL_VERSION"} . "', '" . $info->{"NAME"} . "', '" . $info->{"OS_LOGIN"} . "', '" . $info->{"OS_VERSION"} . "', '" . $info->{"PRIV_USER"} . "', '" . $info->{"PROXY_USER"} . "', '" . $info->{"RECORD_ID"} . "', '" . $info->{"SERVER_ID"} . "', '" . $info->{"SQLTEXT"} . "', '" . $info->{"STARTUP_OPTIONS"} . "', '" . $info->{"STATUS"} . "', '" . $info->{"STATUS_CODE"} . "', '" . $info->{"TIMESTAMP"} . "', '" . $info->{"USER"} . "', '" . $info->{"VERSION"} . "');\n";
			$count++;
		
		# end foreach $info (@{$data->{AUDIT_RECORD}})
		}
		
		# load parsed file into MySQL - hide warnings
		system("mysql -u$mysql_user -p$mysql_password  /dev/null 2>&1");
										
		$dbh2 = ConnectToMySql($Database);
		$query2 = "insert into audit_information.audit_history (AUDIT_LOG_NAME, LOG_ENTRIES) values ('$file_name_to_parse', '$count')";
		
		# optional print output - uncomment if desired
		# print "$query2\n";
																													
		$sth2 = $dbh2->prepare($query2);
		$sth2->execute();

		# close audit log file
		close(INFILE);

		# optional print output - uncomment if desired
		# print "Moving audit log ($file_name_to_parse) and log file ($PARSED_FILE) to $audit_directory.\n";
		
		# strip directories off $file_name_to_parse
		@file_name_to_move_array = split("\/",$file_name_to_parse);
		$directory_count = $#file_name_to_move_array;
		$file_name_to_move = $file_name_to_move_array[$directory_count];
		
		
		# optional print output - uncomment if desired
		# print "mv $file_name_to_move $file_name_to_parse\n";
		# print "mv $PARSED_FILE $audit_directory\n";

		# move audit log files and parsed log files to $audit_directory
		system("mv $file_name_to_parse $audit_directory");
		system("mv $PARSED_FILE $audit_directory");

	# end - if (length($audit_log_name) < 1)
	}

	else

	{
		# optional print output - uncomment if desired
		# print "$audit_log_name already processed\n";
		system("mv $file_name_to_parse $audit_directory");
	}

# end - foreach $file_name_to_parse (@files) 
}

sub ConnectToMySql {

   $connectionInfo="dbi:mysql:$Database;$MySQL_Host_IP_Name:3306";

   # make connection to database
   $l_dbh = DBI->connect($connectionInfo,$mysql_user,$mysql_password);
   return $l_dbh;

}

It should not matter where you execute audit.pl, as long as you have correctly entered the required values in the script. You might get errors if you try to run this script on a log file that has not been rotated, which is the current log file in your MySQL data directory. The current log file is named audit.log.

# pwd
/usr/local/mysql/data
# ls -l audit.log
-rw-rw----  1 mysql  _mysql  9955118 Jul  2 15:25 audit.log

The script will only work on files ending in .xml. For testing, I used four small (and identical) audit log files:

# pwd
/usr/local/mysql/data
# ls -l *xml
-rw-rw----  1 mysql  wheel   15508 Jul  2 12:20 audit.log.14357895017796690.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796691.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796692.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796693.xml

I have commented-out the print statements in the Perl script, but if you uncomment them, running the script gives you this output for each log file:

# perl audit.pl
Parsing - /usr/local/mysql/data/audit.log.14357895017796690.xml
insert into audit_information.audit_history (AUDIT_LOG_NAME, LOG_ENTRIES) values ('/usr/local/mysql/data/audit.log.14357895017796690.xml', '34')
Moving audit log (/usr/local/mysql/data/audit.log.14357895017796690.xml) and log file (/usr/local/mysql/data/audit.log.14357895017796690_parsed.sql) to /usr/local/mysql/data/audit_files.
mv audit.log.14357895017796690.xml /usr/local/mysql/data/audit.log.14357895017796690.xml
mv /usr/local/mysql/data/audit.log.14357895017796690_parsed.sql /usr/local/mysql/data/audit_files
....

After running my test script, the following data is what is in the audit_history table:

mysql> use audit_information
Database changed
mysql> select * from audit_history;
+----+-------------------------------------------------------+---------------------+-------------+
| ID | AUDIT_LOG_NAME                                        | PARSED_DATE_TIME    | LOG_ENTRIES |
+----+-------------------------------------------------------+---------------------+-------------+
|  1 | /usr/local/mysql/data/audit.log.14357895017796690.xml | 2015-07-02 15:25:07 | 34          |
|  2 | /usr/local/mysql/data/audit.log.14357895017796691.xml | 2015-07-02 15:25:08 | 34          |
|  3 | /usr/local/mysql/data/audit.log.14357895017796692.xml | 2015-07-02 15:25:08 | 34          |
|  4 | /usr/local/mysql/data/audit.log.14357895017796693.xml | 2015-07-02 15:25:09 | 34          |
+----+-------------------------------------------------------+---------------------+-------------+
4 rows in set (0.00 sec)

And here is an example of one line from the audit_parsed table.

mysql> select * from audit_parsed limit 1 \G
*************************** 1. row ***************************
             ID: 1
  COMMAND_CLASS: select
   CONNECTIONID: 10093
        DB_NAME: 
      HOST_NAME: localhost
     IP_ADDRESS: 127.0.0.1
  MYSQL_VERSION: 
   COMMAND_NAME: Query
       OS_LOGIN: 
     OS_VERSION: 
      PRIV_USER: 
     PROXY_USER: 
      RECORD_ID: 1614933_2015-07-01T22:08:58
      SERVER_ID: 
       SQL_TEXT: SELECT (UNIX_TIMESTAMP(now()) - CAST(variable_value AS SIGNED))*1000 as serverStartMillis 
  FROM information_schema.global_status 
 WHERE variable_name='uptime'
STARTUP_OPTIONS: 
 COMMAND_STATUS: 0
    STATUS_CODE: 0
 DATE_TIMESTAMP: 2015-07-01T22:08:58 UTC
      USER_NAME: root[root] @ localhost [127.0.0.1]
    LOG_VERSION: 
1 row in set (0.00 sec)

After parsing the log files, you can then write your own queries for searching through your audit data. You can even include this script in cron, so it runs and parses your files automatically. But as always, test this script and use it with caution before putting it in a production environment. You could also modify the Perl script to filter out values you did not want or need to store.

If you do use this script or if you have any suggestions or other questions, please leave a comment below.

Thanks to Jesper Krogh for providing the information on the LOAD XML [LOCAL] INFILE.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, MySQL Enterprise Audit, MySQL Enterprise Edition, Perl, Scripting, Uncategorized

MySQL 5.7 multi-source replication – automatically combining data from multiple databases into one

July 8, 2017 1 Comment

MySQL’s multi-source replication allows you to replicate data from multiple databases into one database in parallel (at the same time). This post will explain and show you how to set up multi-source replication. (WARNING: This is a very long and detailed post. You might want to grab a sandwich and a drink.)

In most replication environments, you have one master database and one or more slave databases. This topology is used for high-availability scenarios, where the reads and writes are split between multiple servers. Your application sends the writes to the master, and reads data from the slaves. This is one way to scale MySQL horizontally for reads, as you can have more than one slave. Multi-source replication allows you to write to multiple MySQL instances, and then combine the data into one server.

Here is a quick overview of MySQL multi-source replication:

MySQL Multi-Source Replication enables a replication slave to receive transactions from multiple sources simultaneously. Multi-source replication can be used to back up multiple servers to a single server, to merge table shards, and consolidate data from multiple servers to a single server. Multi-source replication does not implement any conflict detection or resolution when applying the transactions, and those tasks are left to the application if required. In a multi-source replication topology, a slave creates a replication channel for each master that it should receive transactions from. (from https://dev.mysql.com/doc/refman/5.7/en/replication-multi-source-overview.html)

In this post, I will demonstrate how to setup multi-source replication with two masters and one slave (as shown in the right side of the above picture). This will involve a new installation of MySQL 5.7.10 for each server.

I am not going to explain how to install MySQL, but you do need to follow the post-installation instructions for your operating system. If you don’t run the mysqld initialize post-installation process for each install, you will run into a lot of problems (I will explain this later). I will start with what you need to do post-installation, after the server is up and running. In this example, I have turned off GTID’s, and I will enable GTID later in the process. I have written several posts on replication, so I am going to assume you have some knowledge on how to setup replication, what GTID’s are, and how replication works. I will also show you some errors you may encounter.

You may visit these posts to learn more about replication:

MySQL Replication with Global Transaction Identifiers – Step-by-Step Install and Addition of Slaves – Part One
MySQL Replication with Global Transaction Identifiers – Step-by-Step Install and Addition of Slaves – Part Two
MySQL Replication – Creating a New Master/Slave Topology with or without Virtual Machines
MySQL Replication – Multi-Threaded Slaves (Parallel Event Execution)
MySQL 5.6 Delayed Replication – Making a Slave Deliberately Lag Behind a Master

Prior to installation, you will need to make sure the repositories on the slave are being stored in a table. I have this enabled on all three servers, but it is only required for the slave. If you don’t have this enabled, you can enable it via your configuration (my.cnf or my.ini) file:

[mysqld]
master-info-repository=TABLE
relay-log-info-repository=TABLE

If you did not enable this earlier, you will want to modify your configuration file and restart MySQL. You can check to see if this is enabled via this command:

mysql> SHOW VARIABLES LIKE '%repository%';
+---------------------------+-------+
| Variable_name             | Value |
+---------------------------+-------+
| master_info_repository    | TABLE |
| relay_log_info_repository | TABLE |
+---------------------------+-------+
2 rows in set (0.00 sec)

You will also need to modify your configuration files (my.cnf or my.ini) to make sure each server has a unique server_id. I use the last three digits of the IP address for each server as my server_id, as in this example:

[mysqld]
server-id=141

To view the server_id for a given server, execute this command:

mysql> SHOW VARIABLES WHERE VARIABLE_NAME = 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 141   |
+---------------+-------+
1 row in set (0.00 sec)

I will be using three servers, and each one has MySQL 5.7.10 installed:

Server #1 – Slave – IP 192.168.1.141
Server #2 – Master #1 – IP 192.168.1.142
Server #3 – Master #2 – IP 192.168.1.143

I will refer to each of these servers as either Slave, Master #1 or Master #2.

NOTE: With MySQL 5.7, if you use a GUI-installation, a password is generated for root during the install process, and it should appear in the installation GUI. If you don’t see the password, it will be in your error log file. Also, when you run the post-installation process, a new root password may be generated again, and this password will also be located in your error log file.

# grep root error.log
2015-12-09T05:34:01.639797Z 1 [Note] A temporary password is generated for root@localhost: T<e-hd0cgI!d

You will need this password to continue using MySQL, and you will need to change it before issuing any other commands. Here is the command to change the password for root:

ALTER USER 'root'@'localhost' IDENTIFIED BY 'password';

The key to making multi-source replication work is to ensure you don’t have the same primary keys on your two masters. This is true especially if you are using AUTO_INCREMENT columns. If both masters have the same primary key for two different records, the data could be corrupted once it reaches the slave. I will show you one way to setup alternating key values using AUTO_INCREMENT. Of course, there are other ways to do this, including having your application generate the value for the keys.

If you don’t turn off GTID’s (via your configuration file) prior to running the post-installation steps, you will encounter a problem in that GTID’s will be created for the mysqld initialize process, and these transactions will be replicated to the slave. Let’s assume you enabled GTID’s from the start, before you ran the post-installation steps, and then you attempted to start replication on the slave. When you run the SHOW MASTER STATUS command, you will see something like this, showing you the 138 transactions which were executed on the master, and would now be replicated to the slave:

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000002
         Position: 1286
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 73fdfd2a-9e36-11e5-8592-00a64151d129:1-138
1 row in set (0.00 sec)

On the slave, you would see an error in the SHOW SLAVE STATUS:

Last_Error: Error 'Can't create database 'mysql'; database exists' on query. Default database: 'mysql'. Query: 'CREATE DATABASE mysql;

And the RETRIEVED GTID SET would look like this, showing you the 138 transactions which have already been copied to the slave.

mysql> SHOW SLAVE STATUS\G
...
Retrieved_Gtid_Set: 73fdfd2a-9e36-11e5-8592-00a64151d129:1-138
...

You can attempt to skip these transactions, but it is much easier to wait and enable GTID’s later.

After the post-installation steps, you will get the same results on all three servers for a SHOW MASTER STATUS command:

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000002
         Position: 398
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

Next, you will need to create the replication user on each of the master servers (where 192.168.1.141 is the IP address of your slave).

mysql> CREATE USER 'replicate'@'192.168.1.141' IDENTIFIED BY 'password';
Query OK, 0 rows affected (0.05 sec)

mysql> GRANT REPLICATION SLAVE ON *.* TO 'replicate'@'192.168.1.141';
Query OK, 0 rows affected (0.01 sec)

After, you can see the additional changes (from creating the user and granting permissions) to the binary log via the SHOW MASTER STATUS command:

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000002
         Position: 873
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set:
1 row in set (0.00 sec)

Now we are ready to create our schemas on the slave and the master servers. For this example, I have created a small table to be used for storing information about a comic book collection. Here are the CREATE DATABASE and CREATE TABLE commands:

Slave

CREATE DATABASE `comicbookdb`;
use comicbookdb;
CREATE TABLE `comics` (
  `comic_id` int(9) NOT NULL AUTO_INCREMENT,
  `comic_title` varchar(60) NOT NULL,
  `issue_number` decimal(9,0) NOT NULL,
  `pub_year` varchar(60) NOT NULL,
  `pub_month` varchar(60) NOT NULL,
  PRIMARY KEY (`comic_id`)
) ENGINE=InnoDB AUTO_INCREMENT=1;

You can use the same SQL to create tables on the slave as you do on the master. Since we will using AUTO_INCREMENT values on the master, you might think you would not want to use AUTO_INCREMENT in the CREATE TABLE statement on the slave. But, since we will not be doing any writes to the slave, you can use the same CREATE TABLE statement as you use on a master. You will only need to modify the CREATE TABLE statements for the masters to create alternate primary keys values. (More on this later)

When the data replicates to the slave from the master, replication will handle the AUTO_INCREMENT columns.

Here is what happens when you create the comics table on the slave without specifying the AUTO_INCREMENT for the comic_id column, and then you start replication. From the SHOW SLAVE STATUS\G command:

mysql> SHOW SLAVE STATUS\G...
Last_SQL_Error: Error 'Field 'comic_id' doesn't have a default value' on query. Default database: 'comicbookdb'. Query: 'INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','5','2014','03')'
...

We now need to find a way to create different and alternating values for our primary key column – comic_id. You could have your application do this, but an easy way is to use the auto_increment_increment variable. In your configuration file (my.cnf or my.ini), you will want to add this for both master databases:

[mysqld]
auto_increment_increment = 2

Adding this variable will require a reboot of MySQL. But, you can set it during the mysql session if you don’t want to reboot. Just make sure to add it to your configuration file (my.cnf or my.ini), or it won’t take effect after the session ends.

mysql> SET @@auto_increment_increment=2;
Query OK, 0 rows affected (0.00 sec)

You can verify to see if this variable is enabled with this command:

mysql> SHOW VARIABLES WHERE VARIABLES_NAME = 'auto_increment_increment';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| auto_increment_increment    | 2     |
+-----------------------------+-------+
1 row in set (0.00 sec)

The auto_increment_increment variable will increment the AUTO_INCREMENT value by two (2) for each new primary key value. We will also need to use different initial primary key values for each master. You can’t simply use 0 (zero) and 1 (one) for the AUTO_INCREMENT value, as when you use the value of 0 (zero), it defaults back to a value of 1 (one). It is easier to set the AUTO_INCREMENT values to a higher number, with the last digits being 0 (zero) and 1 (one) for each master. Here are the CREATE DATABASE and CREATE TABLE commands for each master:

Master #1

CREATE DATABASE `comicbookdb`;
use comicbookdb;
CREATE TABLE `comics` (
  `comic_id` int(9) NOT NULL AUTO_INCREMENT,
  `comic_title` varchar(60) NOT NULL,
  `issue_number` decimal(9,0) NOT NULL,
  `pub_year` varchar(60) NOT NULL,
  `pub_month` varchar(60) NOT NULL,
  PRIMARY KEY (`comic_id`)
) ENGINE=InnoDB AUTO_INCREMENT=100000;

Master #2

CREATE DATABASE `comicbookdb`;
use comicbookdb;
CREATE TABLE `comics` (
  `comic_id` int(9) NOT NULL AUTO_INCREMENT,
  `comic_title` varchar(60) NOT NULL,
  `issue_number` decimal(9,0) NOT NULL,
  `pub_year` varchar(60) NOT NULL,
  `pub_month` varchar(60) NOT NULL,
  PRIMARY KEY (`comic_id`)
) ENGINE=InnoDB AUTO_INCREMENT=100001;

Now that we have all of our tables and users created, we can implement GTID’s on the master servers. I also implemented GTID’s on the slave, in case I wanted to add another slave to this slave. To enable GTID’s, I put the following my the configuration file (my.cnf or my.ini), and restarted MySQL. I added these variable below the auto_increment_increment variable.

[mysqld]
auto_increment_increment = 2
gtid-mode = on
enforce-gtid-consistency = 1

After you have restarted each server, you can take a look at the MASTER STATUS for each server, and the status should be the same:

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000005
         Position: 154
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

You don’t have to do this, but I like to reset the master status on both masters and the slave. Resetting the master deletes all binary log files listed in the index file, resets the binary log index file to be empty, and creates a new binary log file. On each server (both masters and slave servers), I ran this:

mysql> RESET MASTER;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 154
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 
1 row in set (0.00 sec)

You can see the new binary log (mysql-bin.000001), and the beginning position in the binary log (154). Let’s insert some data into one of the master databases, and then check the master’s status again. (And yes, we haven’t turned on replication yet).

Master #1

mysql> INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','1','2014','01');
Query OK, 1 row affected (0.02 sec)

mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 574
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1
1 row in set (0.00 sec)

You can see the GTID created for the INSERT statement – 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1. The first part of the GTID (63a7971c-b48c-11e5-87cf-f7b6a723ba3d) is the UUID of the master. The UUID information can be found in the auto.cnf file, located in the data directory.

Master #1

# cat auto.cnf
[auto]
server-uuid=63a7971c-b48c-11e5-87cf-f7b6a723ba3d

Let’s insert another row of data, check the master status, and then look at the entries of the comics table:

Master #1

mysql> INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','2','2014','02');
Query OK, 1 row affected (0.05 sec)

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 994
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1-2
1 row in set (0.00 sec)

mysql> select * from comics;
+----------+-------------+--------------+----------+-----------+
| comic_id | comic_title | issue_number | pub_year | pub_month |
+----------+-------------+--------------+----------+-----------+
|   100001 | Fly Man     |            1 | 2014     | 01        |
|   100003 | Fly Man     |            2 | 2014     | 02        |
+----------+-------------+--------------+----------+-----------+
2 rows in set (0.00 sec)

You can see how the values for the comic_id table are now incremented by two (2). Now we can insert two lines of data into the second master, look at the master’s status, and look at the entries in the comics database:

Master #2

mysql> INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','3','2014','03');
mysql> INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','4','2014','04');

mysql> show master status\G
*************************** 1. row ***************************
             File: mysql-bin.000005
         Position: 974
     Binlog_Do_DB: 
 Binlog_Ignore_DB: 
Executed_Gtid_Set: 75e2e1dc-b48e-11e5-83bb-1438deb0d51e:1-2
1 row in set (0.00 sec)

mysql> select * from comics;
+----------+-------------+--------------+----------+-----------+
| comic_id | comic_title | issue_number | pub_year | pub_month |
+----------+-------------+--------------+----------+-----------+
|   100002 | Fly Man     |            3 | 2014     | 03        |
|   100004 | Fly Man     |            4 | 2014     | 04        |
+----------+-------------+--------------+----------+-----------+
2 rows in set (0.00 sec)

The second master has a different UUID than the first master, and that is how we can tell what GTID’s belong to which master. We now have two sets of GTID’s to replicate over to the slave. Of course, the slave will have it’s own UUID as well.

Master #1 and Master #2 GTID sets:

63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1-2
75e2e1dc-b48e-11e5-83bb-1438deb0d51e:1-2

I always check to make sure the slave isn’t running before I do anything:

Slave

mysql> show slave status\G
Empty set (0.00 sec)

Unlike regular replication, in multi-source replication, you have to create a CHANNEL specific to each master. You will need to also name this channel, and I simply named the channels “master-142” and “master-143” to match their server_id‘s (as well as their IP addresses). Here is how you start replication for Master #1 (server_id=142).

Slave

mysql> CHANGE MASTER TO MASTER_HOST='192.168.1.142', MASTER_USER='replicate', MASTER_PASSWORD='password', MASTER_AUTO_POSITION = 1 FOR CHANNEL 'master-142';
Query OK, 0 rows affected, 2 warnings (0.23 sec)

This statement produced two warnings, but they can be ignored. I am following the same instructions on the MySQL Manual Page.

Slave

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
  Level: Note
   Code: 1759
Message: Sending passwords in plain text without SSL/TLS is extremely insecure.
*************************** 2. row ***************************
  Level: Note
   Code: 1760
Message: Storing MySQL user name or password information in the master info repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START SLAVE; see the 'START SLAVE Syntax' in the MySQL Manual for more information.
2 rows in set (0.00 sec)

Now we can start the slave for channel ‘master-142‘:

Slave

mysql> START SLAVE FOR CHANNEL 'master-142';
Query OK, 0 rows affected (0.03 sec)

This command is the same as starting the SQL_THREAD and the IO_THREAD at the same time. There may be times when you will want to stop and stop either of these threads, so here is the syntax – as you have to specify which channel you want to modify:

START SLAVE SQL_THREAD FOR CHANNEL 'master-142';
START SLAVE IO_THREAD FOR CHANNEL 'master-142';

You can also issue a simple START SLAVE command, and it will start both threads for all currently configured replication channels. The slave has been started, and we should see the GTID’s from Master #1 already retrieved and applied to the database. (I am not going to display the entire SHOW SLAVE STATUS output, as it very long)

Slave

mysql> SHOW SLAVE STATUS FOR CHANNEL 'master-142'\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.142
...
                  Master_UUID: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d
...
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
...
           Retrieved_Gtid_Set: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1-2
            Executed_Gtid_Set: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1-2
                Auto_Position: 1
...
                 Channel_Name: master-142

We can take a look at the comics table, and see the two entries from the Master #1 database (channel master-142):

Slave

mysql> select * from comics;
+----------+-------------+--------------+----------+-----------+
| comic_id | comic_title | issue_number | pub_year | pub_month |
+----------+-------------+--------------+----------+-----------+
|   100001 | Fly Man     |            1 | 2014     | 01        |
|   100003 | Fly Man     |            2 | 2014     | 02        |
+----------+-------------+--------------+----------+-----------+
2 rows in set (0.00 sec)

Since we have the first master up and running with replication, let’s start replication for the second master:

CHANGE MASTER TO MASTER_HOST='192.168.1.143', MASTER_USER='replicate', MASTER_PASSWORD='password', MASTER_AUTO_POSITION = 1 FOR CHANNEL 'master-143';

And we can check the SLAVE STATUS for this master: (Again, not all of the results are displayed below)

Slave

mysql> SHOW SLAVE STATUS FOR CHANNEL 'master-143'\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.1.143
...
                  Master_UUID: 75e2e1dc-b48e-11e5-83bb-1438deb0d51e
...
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
...
           Retrieved_Gtid_Set: 75e2e1dc-b48e-11e5-83bb-1438deb0d51e:1-2
            Executed_Gtid_Set: 63a7971c-b48c-11e5-87cf-f7b6a723ba3d:1-2,
75e2e1dc-b48e-11e5-83bb-1438deb0d51e:1-2,
                Auto_Position: 1
...
                 Channel_Name: master-143

We can see the slave has retrieved the two GTID’s (75e2e1dc-b48e-11e5-83bb-1438deb0d51e:1-2) and executed them as well. Looking at the comics table, we can see all four comics have been transferred from two different masters:

Slave

mysql> select * from comics;
+----------+-------------+--------------+----------+-----------+
| comic_id | comic_title | issue_number | pub_year | pub_month |
+----------+-------------+--------------+----------+-----------+
|   100001 | Fly Man     |            1 | 2014     | 01        |
|   100002 | Fly Man     |            3 | 2014     | 03        |
|   100003 | Fly Man     |            2 | 2014     | 02        |
|   100004 | Fly Man     |            4 | 2014     | 04        |
+----------+-------------+--------------+----------+-----------+
4 rows in set (0.01 sec)

Replication took care of the AUTO_INCREMENT values. However, if you were able to see the SQL statements which were being replicated, you would have seen the original INSERT statements:

INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','1','2014','01')
INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','2','2014','02');
INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','3','2014','03');
INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','4','2014','04');

The way replication handles the different AUTO_INCREMENT values is by sending over (from the master to the slave via the IO thread), the value for the comic_id column (which uses AUTO_INCREMENT). The value for this column (generated by the master) is transmitted along with the statement. We can take a look at the binary log on the master to see the SET INSERT_ID=100001 information, which is the value for the comic_id column, being transmitted to the slave along with the original SQL statement:

Slave

# mysqlbinlog mysql-bin.000001
...
# at 349
#160106 21:08:01 server id 142  end_log_pos 349 CRC32 0x48fb16a2 	Intvar
SET INSERT_ID=100001/*!*/;
#160106 21:08:01 server id 142  end_log_pos 543 CRC32 0xbaf55210 	Query	thread_id=1exec_time=0	error_code=0
use `comicbookdb`/*!*/;
SET TIMESTAMP=1452132481/*!*/;
INSERT INTO COMICS (comic_title, issue_number, pub_year, pub_month) VALUES('Fly Man','1','2014','01')
/*!*/;
...

You now have two master MySQL databases replicating data to a single MySQL slave database. Let me know if you have any problems following this tutorial. And follow me on Twitter at ScriptingMySQL.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, MySQL Replication, Uncategorized Tagged with mysql, MySQL Replication

Easy-to-use Perl scripts to backup your MySQL database with mysqldump and FTP the files to a remote server

January 6, 2017 1 Comment

Most users of MySQL utilize the mysqldump utility to backup their database. While mysqldump is handy and easy-to-use (and free), if you have data which is important to your business, then you should take a look at the MySQL Enterprise Edition – and use MySQL Enterprise Backup instead. The MySQL Enterprise Backup allows you to backup your database without the table locking you get with mysqldump. And, it is extremely fast – especially when you have to restore your database. Here is a sample speed comparison between MySQL Enterprise Backup and using mysqldump:

49x Better Performance: Backup

80x Better Performance: Backup

From the image, you can see it takes a long time to either dump or restore a fairly large (73 gigabyte) database compared to mysqldump. Even if your database isn’t this large (and most people don’t care how long their backups take to complete), when it comes time to restore your database in a production environment, the quicker you can restore your database, the better.

If your only option is to use mysqldump, here are two Perl scripts to make the task easier. The first script will backup your database(s) and send a copy of your backup to a remote server via FTP. The second script will connect to your FTP server and delete your old backup files – in case you have a storage limit on your FTP server. You can put these scripts in cron or Windows Task Scheduler, or run them manually. You can have the backup script run as often as possible (maybe once an hour) – but keep in mind there will be table-level locking. The script to delete the old backups only needs to be run once a day.

THE BACKUP SCRIPT

For the backup script, you will have to enter a few variables to match your system. You will also need to create a configuration file of all of the databases you want to backup. I could have connected to the MySQL database and ran a query (SHOW DATABASES;) to retrieve all of the databases, but I prefer to manually manage the list of databases to backup. With this method, you can skip an hourly backup of static or read-only databases and only backup the databases which are being changed. This configuration file is a text file with a list of the databases to be backed up, and you can use a # (pound sign) to comment out databases you want to skip.

NOTE:You don’t want to backup the following databases: PERFORMANCE_SCHEMA, INFORMATION_SCHEMA or SYS SCHEMA.

# set the directory where you will keep the backup files
$backup_folder = '/Users/tonydarnell/cron/mysqlbackups';

# the config file is a text file with a list of the databases to backup
# this should be in the same location as this script, but you can modify this
# if you want to put the file somewhere else
my $config_file = dirname($0) . "/mysql_backup.config";

# Here is where you will put your FTP server name (or IP address)
# and your FTP username and password
my $host = "server_name.com";
my $user = "username";
my $password = "password";

You can also modify your mysqldump command which will be use to backup your databases, or use what I have in the script. You will need to add your mysql password where I have the word “PassWord“.

`/usr/local/mysql/bin/mysqldump -R -h192.168.1.2 --events --triggers -u mysqlbackup --password=PassWord --routines --add-drop-database --set-gtid-purged=OFF --add-drop-table $database $table | compress > $folder/$file.Z`;

I created a separate FTP user which has its own home directory, so I don’t have to change the directory once I login via FTP. If you can’t do this, you will need to uncomment the last three lines of this and navigate your FTP user to the correct directory:

# uncomment the last three lines if you can't set the home directory of the FTP user to a specific directory
# the directory on the FTP server where you want to save the backup files
# my $dir = "mysqlbackups";
# print "Accessing FTP - changing to $dir folder\n";
# $f->cwd($dir) or die "Can't cwd to $dir\n";

That is all you need to modify in the first script for it to work. If you run this script on the command line, or if you run it in cron, you should see something like this:

mysqladmin: [Warning] Using a password on the command line interface can be insecure.

It isn’t a good idea to have your password in plain text anywhere, so you should create a user which only has the limited read-only permissions needed to run mysqldump. You will need to change the value of “database_name” in the GRANT statement to match each database you want to backup. You will need to run the GRANT statement for every database you want to backup, or you can use an asterisk “*” in place of the database name.

CREATE USER 'mysqlbackup'@'192.168.1.2' IDENTIFIED WITH sha256_password BY '';
GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `database_name`.* TO 'mysqlbackup'@'192.168.1.2';

Or, to grant permissions to the mysqlbackup user on all of the tables:

GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON *.* TO 'mysqlbackup'@'192.168.1.2';

Here is a sample output from the script:

# perl mysql_backup.pl
Beginning 2017-01-06-16:35:57
Backing up database_01 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_01 - as 2017-01-06-1635/database_01.sql.Z
Backing up database_02 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_02 - as 2017-01-06-1635/database_02.sql.Z
Backing up database_03 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_03 - as 2017-01-06-1635/database_03.sql.Z
....
Backing up database_04 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_04 - as 2017-01-06-1635/database_04.sql.Z
------------------------------------------------------------------
 Compressing backup as:  2017-01-06-1635.tar.gz 
------------------------------------------------------------------
Creating a tar file from the backup...
tar -czf 2017-01-06-1635.tar.gz 2017-01-06-1635
FTP'ing the file - 2017-01-06-1635.tar.gz 
Deleting the original backup directory and files.
Finished 2017-01-06-16:36:49

The backup script creates a new directory with a timestamp as the name of the directory. In this example, the directory was named “2017-01-06-1635“. After the backup is finished, it creates a single tar file and then deletes the backup directory and files. I delete the backup directory and files as the backup is now contained in the single tar file. It is easier to copy and delete a single tar file via FTP than to do the same with a directory containing multiple files.

DELETING OLD BACKUPS

The script to delete the old backups is fairly simple to configure. You will need to enter the host, username and password of your FTP user, and specify how many backups you want to keep on the server. Since I do a backup once an hour, I keep a week’s worth (168 copies) of backups on my server. I could have checked the date/time on the files and deleted the older files which were X number of days old, but I decided to just go with a certain number of files to keep. I also included a “print_output” variable if you want to suppress any output – simply change this value to anything but “yes” and the script won’t print any output.

# Here is where you will put your FTP server name (or IP address)
# and your username and password
my $host = "server_name.com";
my $user = "username";
my $password = "password";

# how many copies of the backup do you want to keep?
$total_files_to_keep = 168;

$print_output = "yes";

Uncomment this line if you want to see a list of the other files which will not be deleted.

# optional output - remove # to have it print remaining files
# if ($print_output eq "yes") { print "| $count of $total_files_available |   Keeping: $filename\n"; }

Here is a sample output from running the script:

root# perl mysql_delete_backup.pl 
--------------------------------------------
           Total files: 194
    Total backup files: 192
   Total files to keep: 168
 Total files to delete: 24
----------------------------------------------
| x of 192 |  Skipping: .
| x of 192 |  Skipping: ..
| 1 of 192 |  Deleting: 2017-12-29-1352.tar.gz
| 2 of 192 |  Deleting: 2017-12-29-1452.tar.gz
| 3 of 192 |  Deleting: 2017-12-29-1552.tar.gz
| 4 of 192 |  Deleting: 2017-12-29-1652.tar.gz
....
| 24 of 192 |  Deleting: 2017-12-30-1252.tar.gz
----------------------------------------------
Finished 2017-01-06-15:21:58

When I run this script on my FTP server, and I do a listing of the backup directory, it shows the single-period (.) or current directory and the double-period (..) or the parent directory. I take this into account by skipping all files which do not have “20” in the name (as in the first two letters of the year).

If you login to your FTP server and you don’t see the “.” and “..“, then you will need to remove the “-2” in this part of the script, and simply have $total_files_available = $total_files:

# subtract two because of the . and ..
$total_files_available = $total_files - 2;

Here is what I see when I FTP to my server: (notice the “.” and “..“)

root# ftp backups@scripts.com
Trying 1234:f1c0:4738:5088:cb9a:dksi:ebfa:3829...
Connected to scripts.com.
220 FTP Server ready.
331 Password required for backups
Password: 
230 User backups logged in
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> dir
229 Entering Extended Passive Mode (|||58906|)
150 Opening ASCII mode data connection for file list
drwx---r-x   2 u63541528-backups ftpusers     4096 Jan  6 16:52 .
drwx---r-x   2 u63541528-backups ftpusers     4096 Jan  6 16:52 ..
-rw----r--   1 u63541528-backups ftpusers 45522630 Jan  5 22:52 2017-01-05-2252.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45539118 Jan  5 23:52 2017-01-05-2352.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45558328 Jan  6 01:52 2017-01-06-0152.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45560794 Jan  6 07:52 2017-01-06-0752.tar.gz
....

The scripts are available on GitHub – https://github.com/ScriptingMySQL/PerlFiles. The scripts are named: mysql_backup.pl and mysql_delete_old_backups.pl.

I am not the best Perl programmer, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, MySQL Enterprise Backup, mysqldump, Perl, Scripting, Uncategorized

Use MySQL to store data from Amazon’s API via Perl scripts

December 22, 2015 Leave a comment

I really like Amazon.com and I have been a Prime member for several years. Along with millions of different items for sale, Amazon has an affiliate program, where you can earn money advertising products on your web site. When a visitor to your site clicks on a link and orders a product from Amazon, you earn a small commission on the total sale. As an affiliate, you have access to Amazon’s Product Advertising API for obtaining product information. But, you can use this information for many other purposes.

The Amazon API is like most other API’s, and their API web site provides you with code examples and explains how it all works. I am going to show you how a Perl program which you can use to access the API data and store it in a MySQL database. This program was modified from one of the examples on the Amazon API web site.

I wrote a book in 2014, and I wanted to be able to track the book’s ranking on Amazon. I have a couple of friends who wrote books as well, so I tracked their ranking at the same time. By using the API, I can get a lot of information about any product – including my book’s ranking. And, I can keep a historical record of the ranks in a MySQL database. You could also use this to track the price of a product, and you could have the script notify you if the price changed or went below a certain threshold. Example – if you want to know when the price of that 55″ 4K television drops below $599 – you can use the API to check the price once an hour – and send you an alert when the price drops. Most prices and ranks only change (at the most) once an hour, so running the script every five minutes isn’t necessary.

To access Amazon’s API, you will need to register as an Amazon affiliate, and obtain your own Access Keys (Access Key ID and Secret Access Key). In the Perl script, this information goes here:

use constant myAWSId	    => 'Access Key ID';
use constant myAWSSecret    => 'Secret Access Key';
use constant myEndPoint	    => 'ecs.amazonaws.com';

The following Perl script is an example of getting the rank for a particular item. To use the script and to access Amazon’s API, you have to supply the product’s ASIN (Amazon Standard Identification Number), or for a book, you can supply the ISBN (International Standard Book Number). The ASIN is usually found in the URL for a product – as in http://www.amazon.com/gp/product/B00R0ZM5W4 – where B00R0ZM5W4 is the ASIN. I use a MySQL table to store the ranking information (you can modify the table to also include other information). Here is the CREATE TABLE statement for this table:

CREATE TABLE `ranks` (
  `idranks` int(11) NOT NULL AUTO_INCREMENT,
  `product_id` varchar(15) DEFAULT NULL,
  `product_title` varchar(100) DEFAULT NULL,
  `product_rank` varchar(15) DEFAULT NULL,
  `rank_datetime` datetime DEFAULT NULL,
  PRIMARY KEY (`idranks`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

The script also has a sub-routine (named sub ConnectToMySql) which uses connection information stored in a text file one directory below the directory where the script is executed. In this example, the file name is accessAMAZON, and the file contains this connection information (in this order):

database_name
IP_address
mysql_user_name
mysql_password

You can hardcode the connection information directly into the script if you prefer.

The script will also output all of the potential variables and values for each ASIN/ISBN, and this information is stored in a text file with the name equal to the ASIN/ISBN, and with a prefix of “.txt”. Example: B00R0ZM5W4.txt There are over a hundred different variables

The script is named amazonrank.pl and may be found on github at: https://github.com/ScriptingMySQL/PerlFiles.

Here is the script: (remember to add your own Amazon access key information into the script)

#!/usr/bin/perl -w

##############################################################################################
# Copyright 2009,2013 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 
# except in compliance with the License. A copy of the License is located at
#
#	   http://aws.amazon.com/apache2.0/
#
# or in the "LICENSE.txt" file accompanying this file. This file is distributed on an "AS IS"
# BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under the License. 
#
#############################################################################################
#
#  Amazon Product Advertising API
#  Signed Requests Sample Code
#
#  API Version: 2009-03-31/2013-08-01
#
#############################################################################################

#use strict;
#use warnings;

#no warnings 'uninitialized';

use Data::Dumper;
use DBD::mysql;
use DBI;

use RequestSignatureHelper;
use LWP::UserAgent;
use XML::Simple;

use constant myAWSId		=> 'Access Key ID';
use constant myAWSSecret	=> 'Secret Access Key';
use constant myEndPoint		=> 'ecs.amazonaws.com';

my $Database = "amazonrank";

# see if user provided ItemId on command-line
# my $itemId = shift @ARGV || '0545010225';

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst $fileout") or die print "Couldn't open log_file: $!";

print "Retrieving rank for $itemId - ";

# Set up the helper
my $helper = new RequestSignatureHelper (
	+RequestSignatureHelper::kAWSAccessKeyId => myAWSId,
	+RequestSignatureHelper::kAWSSecretKey => myAWSSecret,
	+RequestSignatureHelper::kEndPoint => myEndPoint,
);

# A simple ItemLookup request
my $request = {
	Service => 'AWSECommerceService',
	Operation => 'ItemLookup',
	Version => '2013-08-01',
	ItemId => $itemId,
	AssociateTag => 'scmy-20',
	ResponseGroup => 'Large',
};

# Sign the request
my $signedRequest = $helper->sign($request);

# We can use the helper's canonicalize() function to construct the query string too.
my $queryString = $helper->canonicalize($signedRequest);
my $url = "http://" . myEndPoint . "/onca/xml?" . $queryString;
#print "Sending request to URL: $url \n";

my $ua = new LWP::UserAgent();
my $response = $ua->get($url);
my $content = $response->content();
#print "Recieved Response: $content \n";

my $xmlParser = new XML::Simple();
my $xml = $xmlParser->XMLin($content);

# This prints out all of the item information into a text file
print OUT "Parsed XML is: " . Dumper($xml) . "\n";

if ($response->is_success()) {

	# Here is where you extract the information for each item
	my $title = $xml->{Items}->{Item}->{ItemAttributes}->{Title};
	my $rank = $xml->{Items}->{Item}->{SalesRank};
	my $price = $xml->{Items}->{Item}->{Offers}->{Offer}->{OfferListing}->{Price}->{FormattedPrice};

	# option to print to screen - uncomment this next line
	#	print "Item $itemId is titled \"$title\" and ranked $rank\n";

if (length($rank) > 1)

{
	$dbh = ConnectToMySql($Database);	
	$query = "insert into ranks (product_id, product_rank, rank_datetime, product_title) values ('$itemId', '$rank', '$DateTime','$title')";
	$sth = $dbh->prepare($query);
	$sth->execute();
	$dbh->disconnect;

	print "$rank || $title || $itemId || \$$price\n\n";
}

else

{
	print "Rank for: $title is unavailable.\n\n";
}

# optional sleep 
# sleep 5;


} else {
	my $error = findError($xml);
	if (defined $error) {
	print "Error: " . $error->{Code} . ": " . $error->{Message} . "\n";
	} else {
	print "Unknown Error!\n";
	}
}

sub findError {
	my $xml = shift;
	
	return undef unless ref($xml) eq 'HASH';

	if (exists $xml->{Error}) { return $xml->{Error}; };

	for (keys %$xml) {
	my $error = findError($xml->{$_});
	return $error if defined $error;
	}

	return undef;
}


# end - foreach
}

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

  # my ($db) = @_;

   open(PW, "<..\/accessAMAZON") || die "Can't access login credentials";
   my $db= ;
   my $host= ;
   my $userid= ;
   my $passwd= ;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

I am not the best Perl programmer, nor am I an expert at the Amazon API, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL and TonyDarnell.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, Perl, Scripting, Uncategorized Tagged with amazon, api, mysql, Perl, scripting

Add RSS feeds to your Twitter stream using MySQL and Perl

October 27, 2015 2 Comments

Adding good content to Twitter can be a pain. I can’t do it during working hours, and I don’t have much time at night. But, the more content you have, the more followers you can gain, and the more your original tweets can be seen (hopefully). I have written several posts about using the latest Perl-Twitter API – Net::Twitter::Lite::WithAPIv1_1, so you might want to check these out as well.

– Use MySQL and Perl to automatically find, follow and unfollow twitter users

– Using Perl to retrieve direct messages from Twitter, insert messages into a MySQL database and then delete the messages

– Using Perl and MySQL to automatically respond to retweets on twitter

– Using Perl to send tweets stored in a MySQL database to twitter

However, finding good supplemental content is easy. There are plenty of companies and web sites which offer RSS (Rich Site Summary or Really Simple Syndication) feeds that you can use on Twitter. In this post, I will show you how to capture the RSS feed from a web site and tweet the links with your Twitter account.

One warning – Twitter has strict guidelines for using API’s on their site. The rate limits may be found at https://dev.twitter.com/rest/public/rate-limiting. In my previous post, I included a Perl script (rate_limit.pl) that prints out the API rate limits for your application. You can use this script to monitor your usage. The key with using the Twitter API is to not be too aggressive, or your app will be banned by Twitter. For example, Twitter does not allow bulk follows and unfollows – so having patience is important. And if you use these scripts to send SPAM, your app and Twitter account will be banned.

For this post, I created two Perl scripts. The first grabs the RSS information and inserts it into a MySQL database. The second script pulls the information from the MySQL database and sends a tweet one record at a time. You can set up a cron job to do both of these scripts, but remember the RSS script needs to run first.

The key when parsing RSS feeds is to find a unique number/identifier which you can use for each feed item. Some RSS feeds will include a <guid> tag, but often you have to parse the URL link to get a unique number/identifier. The script uses this unique identifier to check and see if this RSS feed item is already in the database, so you don’t have duplicate RSS items. I tested these scripts on a dozen different RSS feeds, and it works really well.

Before you attempt to use these scripts, you will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

We will need to create two databases. One to hold the RSS feed information, and another to hold the tweet history. I use the tweet history to delete older tweets with another Perl script I wrote. Here is the SQL to create the databases:

CREATE TABLE 'rss_oracle' (
  'id' int(9) NOT NULL AUTO_INCREMENT,
  'id_post' bigint(14) NOT NULL DEFAULT '0',
  'post_title' varchar(256) DEFAULT NULL,
  'post_url' varchar(256) DEFAULT NULL,
  'post_author' varchar(48) DEFAULT NULL,
  'post_date' datetime DEFAULT NULL,
  'tweet_sent' varchar(3) DEFAULT NULL,
  'tweet_sent_date' datetime DEFAULT NULL,
  PRIMARY KEY ('id','id_post')
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

CREATE TABLE `history` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tweet` char(140) DEFAULT NULL,
  `tweet_id` varchar(30) DEFAULT NULL,
  `tweet_update` datetime DEFAULT NULL,
  `error` char(3) DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

For our RSS feed, we will be grabbing the RSS from Oracle‘s web site. You will need to change the $source variable to whatever RSS feed you want to use. As you grab the data, you will need to test your script to see what non-ASCII characters are included in the title. This example script should translate most characters to ASCII ones, but you will want to check the output before actually tweeting the posts. When you find strange characters, you can simply truncate the RSS database table and re-run the script, or you can uncomment the print statements to see what is being returned.

In the subroutine ConnectToMySql used in the Perl scripts, I store the MySQL login credentials in a text file one directory below where my Perl script is located. This file – named accessTweets contains this information:

database_name
hostname or IP
MySQL user name
password

The scripts are also available on GitHub – https://github.com/ScriptingMySQL/PerlFiles. (For some reason, WordPress likes to change the code.)

rss_oracle.pl

#!/usr/bin/perl -w

use LWP::Simple 'get';
use Encode;
use utf8;
use Text::Unidecode;
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use XML::FeedPP;

    my $source = 'https://www.oracle.com/corporate/press/rss/rss-pr.xml';
    my $feed = XML::FeedPP->new( $source );

foreach my $item ( $feed->get_item() ) {

	$post_id = $item->guid();
	$title = $item->title();

	# this is where you have to replace non-ASCII characters
	# each RSS feed will use different non-ASCII characters
	$title = decode_utf8( $title );
	$title =~ s/’/\^/g;
	$title =~ s/\&\#8217;/^/g;
	$title =~ s/\&\#8216;/^/g;
	$title =~ s/\&\#8212;/-/g;
	$title =~ s/\&\#8230;/-/g;
	$title =~ s/'/\^/g;
	$title =~ s/‘/\^/g;
	$title =~ s/’/^/g;
	$title =~ s/…/.../g;
	$title =~ s/—/-/g;
	$title =~ s/-/-/g;
	$title =~ s/–/-/g;
	$title =~ s/ 8212 /-/g;
	$title =~ s/ 8230 /-/g;
	$title =~ s/<em>//g;
	$title =~ s/</em>//g;
	$title =~ s/[^a-zA-Z0-9 ~,._*:?\$^-]//g;

	$link = $item->link();

	# uncomment this line to test
	#print "$post_id | $title | $link\n";
	
	# see if we already have this post in the RSS database	
	$dbh = ConnectToMySql($Database);
	$query = "select id_post FROM rss_oracle where id_post = '$post_id' limit 1";	
	$sth = $dbh->prepare($query);
	$sth->execute();

        #print "\n$query\n\n";

	# loop through our results - one user at a time
	while (@data = $sth->fetchrow_array()) {
		$id_post_found = $data[0];
	# end - while
	}

		#print "id_post_found $id_post_found \n";

		if (length($id_post_found) > 1)
			
		{
			#print "Found $id_post_found...\n";
			$id_post_found = "";
		}
			
		else
			
		{
			$dbh2 = ConnectToMySql($Database);
			$query2 = "insert into rss_oracle (id_post, post_title, post_url) values ('$post_id', '$title', '$link')";	
			$sth2 = $dbh2->prepare($query2);
			# during testing, comment this next line to prevent the data from being inserted into the database
			$sth2->execute();
			#print "$query2\n";

			$title = "";
			$link = "";
			$id_post_found = "";
			
		}

	# foreach my $item
    }

exit;

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= ;
   my $host= ;
   my $userid= ;
   my $passwd= ;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

And here is the script to do the tweeting. You will need to add your Twitter Name to the $My_Twitter_User variable.

tweet-oracle.pl

#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;

$My_Twitter_User = "YourTwitterNameHere";

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst prepare($query);
    $sth->execute();
    
    #print "$query \n";
    
			while (@data = $sth->fetchrow_array()) {
            
					$id_post = $data[0];
					$post_title = $data[1];
					$post_url = $data[2];

				$post_title =~s/  / /g;
				$post_title =~s/ ampamp / and /g;

			# while (@data = $sth->fetchrow_array()) {
			}
			
if (length($post_title)  100)

	{
		$title_trimmed = substr($post_title,0,105);
		$title_trimmed = "$title_trimmed...";
		#print "TRIM $title_trimmed\n";

			if ($title_trimmed =~ " Oracle ")

			{
				$title_trimmed =~ s/ Oracle / \#Oracle /;
			}

			else

			{
				$add_Hashtag = " \#Oracle ";
			}		


	}
	
	else
	
	{
		$title_trimmed = "$post_title";
		$title_trimmed =~ s/ Oracle / \#Oracle /;
		#print "x $title_trimmed\n";
	}


$tweet = "$title_trimmed $add_Hashtag \n\n$post_url";

$tweet =~ s/  / /g;
$tweet =~ s/  / /g;

$add_Hashtag = "";

#print "$tweet \n";

#exit;

# ----------------------------------------------------------------------------
# find carats and substitue for single quote
# ----------------------------------------------------------------------------

$tweet =~ s/\^/\'/g;

# ----------------------------------------------------------------------------
# send tweet
# ----------------------------------------------------------------------------


# Credentials for your twitter application
# You will need to substitute your own values for these variables
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);
	# send the tweet
	my $results = eval { $nt->update("$tweet") };

#		    print "---DUMPER START---\n";
#			print Dumper $results;
#		    print "---DUMPER END---\n\n";


# ----------------------------------------------------------------------------
# update mysql with new date for last_tweet date/time
# ----------------------------------------------------------------------------

$dbh = ConnectToMySql($Database);
$query = "UPDATE rss_oracle SET tweet_sent_date = '$DateTime' , tweet_sent = 'yes' where id_post = '$id_post'";
$sth = $dbh->prepare($query);
$sth->execute();

# ----------------------------------------------------------------------------
# get the status id of the last tweet
# ----------------------------------------------------------------------------

my $statuses = $nt->user_timeline({ user => "$My_Twitter_User", count=> 1 });

for my $status ( @$statuses ) {
	$tweet_id = "$status->{id}";
#	print "Tweet ID $tweet_id\n";
}


# ----------------------------------------------------------------------------
# replace special characters
# ----------------------------------------------------------------------------

$tweet =~ s/\\\n/~/g;
$tweet =~ s/\n/~/g;
$tweet =~ s/\'/^/g;

# update mysql with new date for last_tweet date/time

$dbh = ConnectToMySql($Database);	
$query = "insert into history (tweet,tweet_id,tweet_update) values ('$tweet','$tweet_id','$DateTime')";
$sth = $dbh->prepare($query);
$sth->execute();

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= ;
   my $host= ;
   my $userid= ;
   my $passwd= ;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

I am not the best Perl programmer, nor am I an expert at the Twitter API, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL and TonyDarnell.

Tony is the author of Twenty Forty-Four: The League of Patriots

Visit http://2044thebook.com for more information.

Filed under MySQL, Perl, Scripting, Uncategorized Tagged with mysql, Perl, scripting, Twitter

← Older posts

Newer posts →

Follow Tony on Twitter
Follow @scriptingmysql
Find me on LinkedIn:
info [at] ScriptingMySQL.com

Scripting MySQL

Adding a replicated MySQL database instance using a Group Replication server as the source

You say you want a Replication?

What is replication and how does it work?

Global Transaction Identifiers

Group Replication

Replicating from a member in Group Replication

Configuring the replica (slave) server

Configuring the source (master) server

Starting the replica (slave) server

What if my Group Replication (GR) group already has data?

Restoring a slave using mysqldump

Other posts on restoring backups to servers

Using a trigger to parse non-conforming data in MySQL

Manipulating queries with non-conforming data via MySQL Query Rewrite Plugin, triggers and stored procedures

MySQL 8.0 InnoDB Cluster – Creating a sandbox and testing MySQL Shell, Router and Group Replication

MySQL Enterprise Edition Database Firewall – Control and Monitor SQL Statement Executions

MySQL Enterprise Audit – parsing audit information from log files, inserting into MySQL table via LOAD DATA INFILE and Perl script

MySQL 5.7 multi-source replication – automatically combining data from multiple databases into one

Easy-to-use Perl scripts to backup your MySQL database with mysqldump and FTP the files to a remote server

Use MySQL to store data from Amazon’s API via Perl scripts

Add RSS feeds to your Twitter stream using MySQL and Perl

About the author:

Interesting Links

Recent Posts

Archives

Categories

Blog Stats

ScriptingMySQL RSS Feds