Checking on the Progress of Large DML Commands in MySQL Using Perl – Part Two

Part Two of Two: Checking on database activity when running a large DML (Data Manipulation Language) statement – such as INSERT, DELETE, UPDATE or SELECT.

Part Two: Monitoring the activity via Perl and SHOW ENGINE INNODB STATUS (part of the InnoDB Monitors).

In part one, I showed you how to use a Perl script to insert a million rows of dummy data into a table. I needed a large database in order to test a Perl script that I would use to monitor the activity when running a large DML statement.

The original reason for creating both of these scripts was to find a quick way to see if a large DML statement was actually being executed. A customer was performing some modifications on tables with tens of millions of rows, and they wanted to know if they were making any progress. Since the customer was using the InnoDB storage engine, I thought of a way to check on the progress, but it only works if nothing else is happening in the database (more on this later).

With InnoDB, you can issue the SHOW ENGINE INNODB STATUS command and you will get a wealth of information. I am not going to list all of it here as it would consume too much space. However, towards the bottom of the output, you will notice under “ROW OPERATIONS” a line that contains the words “Number of rows inserted…”:

--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
1 read views open inside InnoDB
Main thread id 2960240640, state: waiting for server activity
Number of rows inserted 1202598, updated 97249, deleted 806, read 56448551
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s

The “Number of rows” line above gives you a quick snapshot of how many rows have been inserted, updated, deleted or selected (read) since the server started. So, by invoking the SHOW ENGINE INNODB STATUS command twice and comparing the totals, you can figure out what database activity is occurring. But, if you invoke this command while the database is being used for other purposes, the totals will include that other activity, and it will be difficult to figure out your own progress on any of the four values shown.
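As a minimal sketch of how that line can be pulled apart, the snippet below extracts the four counters with a single regular expression. It assumes the line has exactly the format shown above; the function name parse_row_counters and the sample text are illustrative only and are not part of the monitoring script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse the "Number of rows ..." line from SHOW ENGINE INNODB STATUS output.
# Returns a hash reference keyed by inserted/updated/deleted/read,
# or nothing if the line is not present.
# (parse_row_counters is a hypothetical helper name.)
sub parse_row_counters {
    my ($status_text) = @_;
    if ($status_text =~ /^Number of rows inserted (\d+), updated (\d+), deleted (\d+), read (\d+)/m) {
        return { inserted => $1, updated => $2, deleted => $3, read => $4 };
    }
    return;
}

# Sample text copied from the ROW OPERATIONS section shown above
my $sample = <<'END';
--------------
ROW OPERATIONS
--------------
0 queries inside InnoDB, 0 queries in queue
Number of rows inserted 1202598, updated 97249, deleted 806, read 56448551
0.00 inserts/s, 0.00 updates/s, 0.00 deletes/s, 0.00 reads/s
END

my $counters = parse_row_counters($sample);
printf "%s: %d\n", $_, $counters->{$_} for qw(inserted updated deleted read);
```

Matching on the names rather than on column positions means the lookup does not depend on how the line is split on whitespace.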

So, all I needed to do was to write a quick Perl script to monitor the changes to that particular line of output, and then I could figure out if a certain statement was indeed being executed. My original thinking was that I needed a database with a lot of records in it. So, I decided to first create a dummy table and then shove a million records into it. But I then realized that inserting the million rows was itself a large DML operation, so I could simply monitor that activity instead of issuing a separate command on the database once it had been built.

I created a quick Perl script that issues the SHOW ENGINE INNODB STATUS command every X number of seconds. It then looks for the output total for whatever I want to track – either inserts, updates, deletes or selects (reads). And yes, you could just do this manually, but if you have a command that you want to run overnight, you could execute this script and at least see what happened when you return in the morning.

One caveat: the output of SHOW ENGINE INNODB STATUS does not correspond to a single point in time, so the values may not be consistent with each other. Ensuring consistency would require a global lock, which would consume too many resources. But it is good enough for what we want to do here.

The script will check the status, and then depending upon what variable (inserts, updates, deletes, selects) you want to search for, it will show you the total for that variable as it increases. I didn’t take time to add a lot of information about how long it would take to complete the original command, as this would be just a wild guess, given the fact that other things could be happening with the database. You will need to change the variable that you want to look for, how many rows you are expecting to change, how many seconds you want it to refresh, and your database information:

#!/usr/bin/perl -w

# DBI is the standard database interface for Perl
# DBD is the Perl module that we use to connect to the MySQL database
use DBI;
use DBD::mysql;
# use CGI for forms
#use CGI qw(:standard);

$start = '';
$begin = '';
$now = 1;

# How many rows are being changed?
# (whole numbers only)
$rows = 200000;

# how long do you want it to sleep? (seconds)
# you can modify this to be a longer time interval than 60 seconds
# which is advisable, as the SHOW ENGINE query will consume resources
$sleep_time = 60;

# what action do you want to monitor?
# based upon this output: Number of rows inserted xxxxx, updated xxxxx, deleted xxxxx, read xxxxx
# (use the word to the right of the equal sign)
# inserts = inserted
# updates = updated
# deletes = deleted
# selects = read
$action = "inserted";

$Database = "scripting_mysql";

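     $dbh = ConnectToMySql($Database);
     $query = "SHOW ENGINE INNODB STATUS";
     $sth = $dbh->prepare($query);
     $sth->execute();

# SHOW ENGINE INNODB STATUS returns one row with three columns
# (Type, Name, Status); the report text is in the third column
(undef, undef, $data) = $sth->fetchrow_array();

@all_rows = split(/\n/, $data);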

# viewing this output from SHOW ENGINE INNODB STATUS
# Number of rows inserted 912268, updated 96931, deleted 806, read 52052215
# pick the position of the chosen total once the line is split on whitespace
if ($action eq "inserted") { $column = 4 }
if ($action eq "updated") { $column = 6 }
if ($action eq "deleted") { $column = 8 }
if ($action eq "read") { $column = 10 }

foreach (@all_rows) {

     if ($_ =~ /^Number of rows /)
     {
          #print "Line $_ \n";

          @total = split (" ", $_);

          # strip the trailing comma from the total
          $total[$column] =~ s/,//;
          $begin = $total[$column];
     }
}

print "Execute your SQL statement, and then press enter/return: ";
$start = <>;

print "Beginning with $begin $action:\n";

while ($now < $rows) {

     $print_date_time = &get_date_time;

     # reuse the connection opened above instead of opening
     # a new one (and never closing it) on every pass
     $sth = $dbh->prepare($query);
     $sth->execute();

     # the report text is in the third column (Type, Name, Status)
     (undef, undef, $data) = $sth->fetchrow_array();

     @all_rows = split(/\n/, $data);

     if ($action eq "inserted") { $column = 4 }
     if ($action eq "updated") { $column = 6 }
     if ($action eq "deleted") { $column = 8 }
     if ($action eq "read") { $column = 10 }

     foreach (@all_rows) {

          if ($_ =~ /^Number of rows /)
          {
               @total = split (" ", $_);

               # strip the trailing comma, then subtract the starting total
               $total[$column] =~ s/,//;
               $now = $total[$column] - $begin;
               $progress = ($now / $rows) * 100;
               print "$print_date_time - $progress\% changed - $now out of $rows.\n";
          }
     }

     # you can modify this to be a longer time interval than 60 seconds
     # which is advisable, as the SHOW ENGINE query will consume resources
     sleep $sleep_time;

}

     $sth->finish;
     $dbh->disconnect;

# from Connecting to MySQL with Perl
#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

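     # note: the database name is read from the credentials file below,
     # so the $db argument is accepted but not used
     my ($db) = @_;

     # the credentials file holds four lines: database, host, userid, password
     open(ACCESS_INFO, "<", "../accessBLOG") || die "Can't access login credentials: $!";
     my $database = <ACCESS_INFO>;
     my $host = <ACCESS_INFO>;
     my $userid = <ACCESS_INFO>;
     my $passwd = <ACCESS_INFO>;
     close(ACCESS_INFO);

     # the chomp() function will remove any newline character from the end of a string
     chomp ($database, $host, $userid, $passwd);
     #print "<br>$database $host $userid $passwd <br>";

     my $connectionInfo="dbi:mysql:$database;$host";

     # make connection to database, and stop if the connection fails
     my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd)
          || die "Can't connect to database: $DBI::errstr";
     return $l_dbh;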

}

# ----------------------------------------------------------------------------------
sub get_date_time {
# ----------------------------------------------------------------------------------

     my ($sec,$min,$hour,$mday,$mon,$year) = localtime time;

     $year = $year + 1900;
     $mon = $mon + 1;

     # zero-pad each field to two digits (the year is already four digits)
     my $DateTime = sprintf("%04d-%02d-%02d %02d:%02d:%02d", $year, $mon, $mday, $hour, $min, $sec);
     return $DateTime;

}
#----------------------------------------------------------------------

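Here is the output from the Perl script. For this example, I was assuming that I would be inserting 200,000 rows. As the timestamps show, this run refreshed every 10 seconds rather than the 60 seconds set in the listing above.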

sh-3.2# /usr/bin/perl monitor.pl
Execute your SQL statement, and then press enter/return:
Beginning with 1196128 inserted:
2011-11-01 17:48:06 - 0.029% changed - 58 out of 200000.
2011-11-01 17:48:16 - 0.464% changed - 928 out of 200000.
2011-11-01 17:48:26 - 0.8645% changed - 1729 out of 200000.
2011-11-01 17:48:36 - 1.259% changed - 2518 out of 200000.
2011-11-01 17:48:46 - 1.628% changed - 3256 out of 200000.
2011-11-01 17:48:56 - 2.0045% changed - 4009 out of 200000.
2011-11-01 17:49:06 - 2.3845% changed - 4769 out of 200000.
2011-11-01 17:49:16 - 2.7685% changed - 5537 out of 200000.
2011-11-01 17:49:26 - 3.1415% changed - 6283 out of 200000.

This script was just a quick hack. Another way to potentially see if your statement is still running is to check the “TRANSACTIONS” section of the SHOW ENGINE INNODB STATUS output. In the example below, you can see the insert statement that the Perl script was performing during part one (where I was inserting the 1,000,000 rows). Since the database that I was using doesn’t have a lot of activity on it, I was able to see the insert statement multiple times when I resubmitted the SHOW ENGINE INNODB STATUS command. If you have a database with a lot of transactions running, you might not be able to see your statement in the output.

------------
TRANSACTIONS
------------
Trx id counter 1818DA1
Purge done for trx's n:o < 1817E3F undo n:o < 0
History list length 3830
LIST OF TRANSACTIONS FOR EACH SESSION:
---TRANSACTION 1818DA0, not started, OS thread id 2961469440 flushing log
mysql tables in use 1, locked 1
MySQL thread id 335232, query id 2818136 192.168.1.2 WebUser query end
insert into test_large (name_first, name_last, address_01, address_02, city, state, zip) values ('1BPnJiuWsyajA4b3SH7OjS4BFJgedK','JPuJ4xh_QEbNokZZGlpcHEHKLk2W__','24IT20mW0moAwWmoYTMOwsv44yRL9mAMoo0mLyLoGoTjaW78O6','6zX2j4PLShQ_IfiOzd0LbTVi8ZaYGk3_6LIafpnFfdFL7kRMTa','mbENuc0kHQz9NLGkn5iy','xR','81941')
---TRANSACTION 0, not started, OS thread id 2960855040

MySQL thread id 326621, query id 2818137 localhost root
SHOW ENGINE INNODB STATUS
---TRANSACTION 1815A3B, not started, OS thread id 2961059840
MySQL thread id 322078, query id 2778540 192.168.1.5 WebUser
---TRANSACTION 1815A32, not started, OS thread id 2960650240
MySQL thread id 319917, query id 2778444 192.168.1.5 WebUser
---TRANSACTION 0, not started, OS thread id 2962083840
MySQL thread id 319845, query id 2771552 192.168.1.5 WebUser
---TRANSACTION 0, not started, OS thread id 2963107840
MySQL thread id 43198, query id 2817803 localhost 127.0.0.1 WebUser
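The same check can be scripted. The sketch below isolates the TRANSACTIONS section of the status text and looks for a fragment of your statement in it. It assumes each section header is a line of dashes, the section name in capitals, and another line of dashes, as in the output above. The function names and the trimmed sample text are illustrative only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return only the TRANSACTIONS section of SHOW ENGINE INNODB STATUS output.
# The section ends at the next dashed section header or at the end of the text.
# (transactions_section is a hypothetical helper name.)
sub transactions_section {
    my ($status_text) = @_;
    if ($status_text =~ m{^-+\nTRANSACTIONS\n-+\n(.*?)(?=^-+\n[A-Z][A-Z /]*\n-+\n|\z)}ms) {
        return $1;
    }
    return '';
}

# Return 1 if the fragment appears in the TRANSACTIONS section
# (case-insensitive), otherwise 0.
sub statement_is_visible {
    my ($status_text, $fragment) = @_;
    my $section = lc(transactions_section($status_text));
    return index($section, lc($fragment)) >= 0 ? 1 : 0;
}

# Trimmed, illustrative sample based on the output above
my $sample = <<'END';
------------
TRANSACTIONS
------------
Trx id counter 1818DA1
---TRANSACTION 1818DA0, not started, OS thread id 2961469440 flushing log
MySQL thread id 335232, query id 2818136 192.168.1.2 WebUser query end
insert into test_large (name_first, name_last) values ('a','b')
--------
FILE I/O
--------
I/O thread 0 state: waiting for i/o request (insert buffer thread)
END

print statement_is_visible($sample, 'insert into test_large')
    ? "statement is visible\n"
    : "statement is not visible\n";
```

Restricting the search to one section keeps matches in unrelated parts of the report, such as the FILE I/O line in the sample, from being counted.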

I am sure that there are better ways of monitoring large DML statements, and this will only work for InnoDB tables. If you try the same command for MyISAM, SHOW ENGINE MyISAM STATUS, the output is blank:

mysql> SHOW ENGINE MyISAM STATUS;
Empty set (0.00 sec)

If you know of a better or different way to do this, please add your thoughts in the comment section below.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.