MySQL Enterprise Audit – parsing audit information from log files, inserting into MySQL table via LOAD DATA INFILE and Perl script

The MySQL Enterprise Audit plug-in is part of the MySQL Enterprise Edition (available through a paid license). Basically, Enterprise Audit tracks everything that is happening on your MySQL server, and can be used to protect/detect the misuse of information, and to meet popular compliance regulations including HIPAA, Sarbanes-Oxley, and the PCI Data Security Standard.

MySQL Enterprise Audit uses the open MySQL Audit API to enable standard, policy-based monitoring and logging of connection and query activity executed on specific MySQL servers. Designed to meet the Oracle audit specification, MySQL Enterprise Audit provides an out of box, easy to use auditing and compliance solution for applications that are governed by both internal and external regulatory guidelines.

When installed, the audit plugin enables MySQL Server to produce a log file containing an audit record of server activity. The log contents include when clients connect and disconnect, and what actions they perform while connected, such as which databases and tables they access.
(from https://dev.mysql.com/doc/refman/5.6/en/audit-log-plugin.html)

When you enable MySQL Enterprise Audit, log files are generated in your MySQL data directory. You can use tools like MySQL Workbench (Enterprise Edition) or Oracle Audit Vault to import the log data, to view the information and to generate reports.

I was talking with a client, and he wanted to know if the audit data could be stored in a table. Currently (as of MySQL 5.6.25), the audit information is stored as XML in the audit log files. There are several ways to do this, and I will cover two methods.
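For reference, here is roughly what a single record in a new-format audit log looks like (the values are taken from the sample data shown later in this post; a real record may contain more or fewer elements):

<AUDIT_RECORD>
 <TIMESTAMP>2015-07-01T22:08:58 UTC</TIMESTAMP>
 <RECORD_ID>1614933_2015-07-01T22:08:58</RECORD_ID>
 <NAME>Query</NAME>
 <CONNECTION_ID>10093</CONNECTION_ID>
 <STATUS>0</STATUS>
 <STATUS_CODE>0</STATUS_CODE>
 <USER>root[root] @ localhost [127.0.0.1]</USER>
 <OS_LOGIN/>
 <HOST>localhost</HOST>
 <IP>127.0.0.1</IP>
 <COMMAND_CLASS>select</COMMAND_CLASS>
 <SQLTEXT>SELECT ... FROM information_schema.global_status ...</SQLTEXT>
</AUDIT_RECORD>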

The first is to use the LOAD XML [LOCAL] INFILE command. You will need to create a table to store the audit information:

CREATE TABLE audit_log (
   RECORD_ID varchar(40) NOT NULL,
   NAME varchar(64),
   TIMESTAMP timestamp,
   COMMAND_CLASS varchar(64),
   CONNECTION_ID bigint unsigned,
   DB varchar(64),
   HOST varchar(60),
   IPv4 int unsigned,
   IPv6 varbinary(16),
   MYSQL_VERSION varchar(64),
   OS_LOGIN varchar(64),
   PRIV_USER varchar(16),
   PROXY_USER varchar(16),
   SERVER_ID int unsigned,
   SQLTEXT longtext,
   STARTUP_OPTIONS text,
   STATUS int unsigned,
   STATUS_CODE int unsigned,
   USER varchar(168),
   VERSION int unsigned,
   PRIMARY KEY(RECORD_ID)
) DEFAULT CHARSET utf8mb4;

You can then load the data as:

LOAD XML LOCAL INFILE 'audit.log'
    INTO TABLE audit_log
    CHARACTER SET utf8mb4
    ROWS IDENTIFIED BY '<AUDIT_RECORD>'
         (RECORD_ID, NAME, @TIMESTAMP, COMMAND_CLASS, CONNECTION_ID, DB, HOST, @IP, MYSQL_VERSION, OS_LOGIN, PRIV_USER, PROXY_USER, SERVER_ID, SQLTEXT, STARTUP_OPTIONS, STATUS, STATUS_CODE, USER, VERSION)
     SET TIMESTAMP = CONVERT_TZ(STR_TO_DATE(@TIMESTAMP, '%Y-%m-%dT%H:%i:%s UTC'), 'UTC', 'Australia/Sydney'),
         IPv4 = IF(IS_IPV4(@IP), INET_ATON(@IP), NULL),
         IPv6 = IF(IS_IPV6(@IP), INET6_ATON(@IP), NULL);

Important notes for the above example:

  • The example converts the IP address to the numeric version and stores it in IPv4 or IPv6 depending on the type of IP address. It is also possible to store the IP address in a shared varchar() column.
  • Replace the target time zone in CONVERT_TZ() with the time zone of your system.
  • The use of CONVERT_TZ() requires the named time zone tables to be loaded (see the command below this list) or that you use a numeric offset such as +10:00.
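If the named time zone tables have not been loaded, you can populate them from your system's zoneinfo files with the mysql_tzinfo_to_sql utility (the zoneinfo path below is typical for Linux and OS X):

# mysql_tzinfo_to_sql /usr/share/zoneinfo | mysql -u root mysql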

RECORD_ID is guaranteed unique with the following limitations:

  • If you change audit_log_format, the counter will reset. However, as the timestamp is also included in the RECORD_ID, this will in general not be an issue.
  • It is only unique for a given MySQL instance. If you want to import the audit logs for multiple instances, you can for example add the server_uuid to the table and use it as part of the primary key.
  • server_uuid is available starting from MySQL 5.6.
  • The matching between the elements in the XML file and the column names is case-sensitive. All the elements in the audit log are in upper case.
  • The ROWS IDENTIFIED BY '<AUDIT_RECORD>' clause is required unless the table is named AUDIT_RECORD, as the table name is the default element looked for as rows.

Here are some links for more reading on using LOAD XML [LOCAL] INFILE:

https://dev.mysql.com/doc/refman/5.6/en/audit-log-plugin-options-variables.html#sysvar_audit_log_format
https://dev.mysql.com/doc/refman/5.6/en/load-xml.html
https://dev.mysql.com/doc/refman/5.6/en/date-and-time-functions.html#function_convert-tz
https://dev.mysql.com/doc/refman/5.6/en/time-zone-support.html
https://dev.mysql.com/doc/refman/5.6/en/replication-options.html#sysvar_server_uuid


For the second option, I wrote a quick Perl script that would parse the XML log files and insert the information into a MySQL database. You will need to set the size of your audit log files in your my.cnf or my.ini configuration file via the audit_log_rotate_on_size variable. You might need to adjust the size of your log files based upon database activity and how well the script parses the log files. If your log files are very large, the Perl script might have issues processing them, and you might want to decrease the size of your log files and run the script more frequently.
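For example, on Linux the relevant my.cnf lines might look like this (the 1MB value is only an illustration – the variable takes a size in bytes):

[mysqld]
plugin-load=audit_log.so
audit_log_rotate_on_size=1048576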


CAVEAT
Enterprise Audit does require a license from MySQL. If you are interested in an Enterprise subscription, contact me via the comment section below. If you are an Enterprise customer, you will need to configure Enterprise Audit first. See the Enterprise Audit online documentation page for more information, or contact MySQL Support.


For the data fields, I used the audit log file format information found at The Audit Log File page on MySQL.com.

My MySQL server doesn’t have a lot of activity, so I tried to configure the size of the data fields as best as possible to accommodate the possible size of the data in each field. There may be instances where you will have to increase the size of these fields or change their data types. The largest field is the SQL_TEXT field, which will contain your SQL statements. Every table has a maximum row size of 65,535 bytes, so the largest possible size of the SQL_TEXT field for this example is around 63,200 bytes (65,535 bytes minus the sum of the sizes of all of the other fields, and minus the 1-byte or 2-byte length prefix used for each varchar field). In this example, the SQL_TEXT field is set to 8,096 bytes, so you may need to increase or decrease this value.

I used varchar data types for each field, excluding the primary key field named ID. I did not spend a lot of time on the database schema, so you might want to modify it a little. I am sure that some of the fields are integers, but I did not have enough data in my log files to positively determine all of the possible values for each field. I did read the online manual, and it stated that CONNECTION_ID, SERVER_ID, STATUS, STATUS_CODE and VERSION were unsigned integers – but I left them as varchar.


NOTICE
This script requires the use of the new format for the audit log files, which is available in MySQL versions 5.6.20 or later.


I created a database along with two tables; one to store the log file information, and a history table to keep track of what files had already been parsed and inserted into MySQL, as well as the number of log file entries. The CREATE DATABASE and CREATE TABLE syntax is as follows:

CREATE DATABASE `audit_information` /*!40100 DEFAULT CHARACTER SET latin1 */

CREATE TABLE `audit_parsed` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `COMMAND_CLASS` varchar(64) DEFAULT NULL,
  `CONNECTIONID` varchar(32) DEFAULT NULL,
  `DB_NAME` varchar(64) DEFAULT NULL,
  `HOST_NAME` varchar(256) DEFAULT NULL,
  `IP_ADDRESS` varchar(16) DEFAULT NULL,
  `MYSQL_VERSION` varchar(64) DEFAULT NULL,
  `COMMAND_NAME` varchar(64) DEFAULT NULL,
  `OS_LOGIN` varchar(64) DEFAULT NULL,
  `OS_VERSION` varchar(256) DEFAULT NULL,
  `PRIV_USER` varchar(16) DEFAULT NULL,
  `PROXY_USER` varchar(16) DEFAULT NULL,
  `RECORD_ID` varchar(64) DEFAULT NULL,
  `SERVER_ID` varchar(32) DEFAULT NULL,
  `SQL_TEXT` varchar(8096) DEFAULT NULL,
  `STARTUP_OPTIONS` varchar(1024) DEFAULT NULL,
  `COMMAND_STATUS` varchar(64) DEFAULT NULL,
  `STATUS_CODE` varchar(11) DEFAULT NULL,
  `DATE_TIMESTAMP` varchar(24) DEFAULT NULL,
  `USER_NAME` varchar(128) DEFAULT NULL,
  `LOG_VERSION` varchar(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

CREATE TABLE `audit_history` (
  `ID` int(11) NOT NULL AUTO_INCREMENT,
  `AUDIT_LOG_NAME` varchar(64) DEFAULT NULL,
  `PARSED_DATE_TIME` datetime DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `LOG_ENTRIES` int(11) DEFAULT NULL,
  PRIMARY KEY (`ID`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

The Perl script finds the non-active log files (which end in .xml – example: audit.log.14357895017796690.xml), parses the data, creates an SQL file with INSERT statements, imports the data via the mysql command-line program, and then moves the log file(s) and SQL file(s) to a directory. The history table records what files have been processed, so you don’t accidentally process the same file twice.

In the beginning of the Perl script, there are several values you need to replace to match your system. The values are under the section titled “values needed”. Here is the Perl script (named audit.pl):

#!/usr/bin/perl
# audit.pl

use DBI;
use CGI;
use XML::Simple;

#----------------------------------------------------------
# values needed
$Database = "audit_information";
$MYSQL_DATA_DIR = "/usr/local/mysql/data";
$MySQL_Host_IP_Name = "192.168.1.2";
$mysql_user = "root";
$mysql_password = "password_needed";

# directory to store old audit files after parsing
$audit_directory = "$MYSQL_DATA_DIR/audit_files";

# make an audit_files directory if one does not exist
mkdir($audit_directory) unless(-d $audit_directory);
#----------------------------------------------------------


#----------------------------------------------------------
#for each file do this
# build the list of non-active audit log files to process (the ones ending in .xml)
# (this glob line is a reconstruction - the original was garbled by WordPress)
@files = glob("$MYSQL_DATA_DIR/audit.log.*.xml");
foreach $file_name_to_parse (@files) {

	#----------------------------------------------------------
	# check to see if file has already been parsed
	$dbh1 = ConnectToMySql($Database);
	$query1 = "select AUDIT_LOG_NAME from audit_history where AUDIT_LOG_NAME = '$file_name_to_parse'";
	$sth1 = $dbh1->prepare($query1);
	$sth1->execute();

          while (@data = $sth1->fetchrow_array()) {
            
         	   $audit_log_name = $data[0];

			}

	# if length of audit_log_name is less than 1, process file
	if (length($audit_log_name) < 1) {

		print "Parsing - $file_name_to_parse\n";

		# name of the SQL file which will hold the INSERT statements
		# (audit.log.NNN.xml becomes audit.log.NNN_parsed.sql)
		($PARSED_FILE = $file_name_to_parse) =~ s/\.xml$/_parsed.sql/;

		open(LOGOUT, "> $PARSED_FILE") or die print "Couldn't open log_file: $!";
		
		$count = 0;
		
		# XML::Simple variable - SuppressEmpty => 1   ignore empty values
		$xml = XML::Simple->new(SuppressEmpty => 1);
		$data = $xml->XMLin("$file_name_to_parse");
		
		foreach $info (@{$data->{AUDIT_RECORD}})
		{
			# replace tick marks ' with \' in the SQL TEXT
			$info->{"SQLTEXT"} =~ s/'/\\'/g;
		
			print LOGOUT "INSERT INTO audit_information.AUDIT_PARSED (COMMAND_CLASS, CONNECTIONID, DB_NAME, HOST_NAME, IP_ADDRESS, MYSQL_VERSION, COMMAND_NAME, OS_LOGIN, OS_VERSION, PRIV_USER, PROXY_USER, RECORD_ID, SERVER_ID, SQL_TEXT, STARTUP_OPTIONS, COMMAND_STATUS, STATUS_CODE, DATE_TIMESTAMP, USER_NAME, LOG_VERSION) values ('" . $info->{"COMMAND_CLASS"} . "', '" . $info->{"CONNECTION_ID"} . "', '" . $info->{"DB"} . "', '" . $info->{"HOST"} . "', '" . $info->{"IP"} . "', '" . $info->{"MYSQL_VERSION"} . "', '" . $info->{"NAME"} . "', '" . $info->{"OS_LOGIN"} . "', '" . $info->{"OS_VERSION"} . "', '" . $info->{"PRIV_USER"} . "', '" . $info->{"PROXY_USER"} . "', '" . $info->{"RECORD_ID"} . "', '" . $info->{"SERVER_ID"} . "', '" . $info->{"SQLTEXT"} . "', '" . $info->{"STARTUP_OPTIONS"} . "', '" . $info->{"STATUS"} . "', '" . $info->{"STATUS_CODE"} . "', '" . $info->{"TIMESTAMP"} . "', '" . $info->{"USER"} . "', '" . $info->{"VERSION"} . "');\n";
			$count++;
		
		# end foreach $info (@{$data->{AUDIT_RECORD}})
		}
		
		# close the parsed SQL file so all of the INSERT statements are flushed to disk
		close(LOGOUT);

		# load parsed file into MySQL - hide warnings
		system("mysql -u$mysql_user -p$mysql_password < $PARSED_FILE > /dev/null 2>&1");
										
		$dbh2 = ConnectToMySql($Database);
		$query2 = "insert into audit_information.audit_history (AUDIT_LOG_NAME, LOG_ENTRIES) values ('$file_name_to_parse', '$count')";
		
		# optional print output - uncomment if desired
		# print "$query2\n";
																													
		$sth2 = $dbh2->prepare($query2);
		$sth2->execute();


		# optional print output - uncomment if desired
		# print "Moving audit log ($file_name_to_parse) and log file ($PARSED_FILE) to $audit_directory.\n";
		
		# strip directories off $file_name_to_parse
		@file_name_to_move_array = split("\/",$file_name_to_parse);
		$directory_count = $#file_name_to_move_array;
		$file_name_to_move = $file_name_to_move_array[$directory_count];
		
		
		# optional print output - uncomment if desired
		# print "mv $file_name_to_move $file_name_to_parse\n";
		# print "mv $PARSED_FILE $audit_directory\n";

		# move audit log files and parsed log files to $audit_directory
		system("mv $file_name_to_parse $audit_directory");
		system("mv $PARSED_FILE $audit_directory");

	# end - if (length($audit_log_name) < 1)
	}

	else

	{
		# optional print output - uncomment if desired
		# print "$audit_log_name already processed\n";
		system("mv $file_name_to_parse $audit_directory");
	}

# end - foreach $file_name_to_parse (@files) 
}

sub ConnectToMySql {

   $connectionInfo="dbi:mysql:$Database;$MySQL_Host_IP_Name:3306";

   # make connection to database
   $l_dbh = DBI->connect($connectionInfo,$mysql_user,$mysql_password);
   return $l_dbh;

}

It should not matter where you execute audit.pl, as long as you have correctly entered the required values in the script. You might get errors if you try to run this script on a log file that has not been rotated, which is the current log file in your MySQL data directory. The current log file is named audit.log.

# pwd
/usr/local/mysql/data
# ls -l audit.log
-rw-rw----  1 mysql  _mysql  9955118 Jul  2 15:25 audit.log

The script will only work on files ending in .xml. For testing, I used four small (and identical) audit log files:

# pwd
/usr/local/mysql/data
# ls -l *xml
-rw-rw----  1 mysql  wheel   15508 Jul  2 12:20 audit.log.14357895017796690.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796691.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796692.xml
-rw-r-----  1 mysql  _mysql  15508 Jul  2 13:46 audit.log.14357895017796693.xml

I have commented-out the print statements in the Perl script, but if you uncomment them, running the script gives you this output for each log file:

# perl audit.pl
Parsing - /usr/local/mysql/data/audit.log.14357895017796690.xml
insert into audit_information.audit_history (AUDIT_LOG_NAME, LOG_ENTRIES) values ('/usr/local/mysql/data/audit.log.14357895017796690.xml', '34')
Moving audit log (/usr/local/mysql/data/audit.log.14357895017796690.xml) and log file (/usr/local/mysql/data/audit.log.14357895017796690_parsed.sql) to /usr/local/mysql/data/audit_files.
mv audit.log.14357895017796690.xml /usr/local/mysql/data/audit.log.14357895017796690.xml
mv /usr/local/mysql/data/audit.log.14357895017796690_parsed.sql /usr/local/mysql/data/audit_files
....

After running my test script, the following data is what is in the audit_history table:

mysql> use audit_information
Database changed
mysql> select * from audit_history;
+----+-------------------------------------------------------+---------------------+-------------+
| ID | AUDIT_LOG_NAME                                        | PARSED_DATE_TIME    | LOG_ENTRIES |
+----+-------------------------------------------------------+---------------------+-------------+
|  1 | /usr/local/mysql/data/audit.log.14357895017796690.xml | 2015-07-02 15:25:07 | 34          |
|  2 | /usr/local/mysql/data/audit.log.14357895017796691.xml | 2015-07-02 15:25:08 | 34          |
|  3 | /usr/local/mysql/data/audit.log.14357895017796692.xml | 2015-07-02 15:25:08 | 34          |
|  4 | /usr/local/mysql/data/audit.log.14357895017796693.xml | 2015-07-02 15:25:09 | 34          |
+----+-------------------------------------------------------+---------------------+-------------+
4 rows in set (0.00 sec)

And here is an example of one line from the audit_parsed table.

mysql> select * from audit_parsed limit 1 \G
*************************** 1. row ***************************
             ID: 1
  COMMAND_CLASS: select
   CONNECTIONID: 10093
        DB_NAME: 
      HOST_NAME: localhost
     IP_ADDRESS: 127.0.0.1
  MYSQL_VERSION: 
   COMMAND_NAME: Query
       OS_LOGIN: 
     OS_VERSION: 
      PRIV_USER: 
     PROXY_USER: 
      RECORD_ID: 1614933_2015-07-01T22:08:58
      SERVER_ID: 
       SQL_TEXT: SELECT (UNIX_TIMESTAMP(now()) - CAST(variable_value AS SIGNED))*1000 as serverStartMillis 
  FROM information_schema.global_status 
 WHERE variable_name='uptime'
STARTUP_OPTIONS: 
 COMMAND_STATUS: 0
    STATUS_CODE: 0
 DATE_TIMESTAMP: 2015-07-01T22:08:58 UTC
      USER_NAME: root[root] @ localhost [127.0.0.1]
    LOG_VERSION: 
1 row in set (0.00 sec)

After parsing the log files, you can then write your own queries for searching through your audit data. You can even include this script in cron, so it runs and parses your files automatically. But as always, test this script and use it with caution before putting it in a production environment. You could also modify the Perl script to filter out values you did not want or need to store.
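For example, a crontab entry such as this one would parse any rotated log files once an hour (the perl and script paths are placeholders – use the locations on your system):

# parse rotated audit logs at five minutes past each hour
5 * * * * /usr/bin/perl /path/to/audit.pl >> /path/to/audit_cron.log 2>&1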

If you do use this script or if you have any suggestions or other questions, please leave a comment below.

Thanks to Jesper Krogh for providing the information on the LOAD XML [LOCAL] INFILE.

 



Easy-to-use Perl scripts to backup your MySQL database with mysqldump and FTP the files to a remote server

Most users of MySQL utilize the mysqldump utility to back up their database. While mysqldump is handy and easy to use (and free), if you have data which is important to your business, then you should take a look at the MySQL Enterprise Edition – and use MySQL Enterprise Backup instead. MySQL Enterprise Backup allows you to back up your database without the table locking you get with mysqldump. And, it is extremely fast – especially when you have to restore your database. Here is a sample speed comparison between MySQL Enterprise Backup and mysqldump:

[Image: backup/restore speed comparison between MySQL Enterprise Backup and mysqldump – 49x better performance (backup), 80x better performance (restore)]

From the image, you can see it takes a long time for mysqldump to either dump or restore a fairly large (73 gigabyte) database compared to MySQL Enterprise Backup. Even if your database isn’t this large (and most people don’t care how long their backups take to complete), when it comes time to restore your database in a production environment, the quicker you can restore your database, the better.

If your only option is to use mysqldump, here are two Perl scripts to make the task easier. The first script will back up your database(s) and send a copy of your backup to a remote server via FTP. The second script will connect to your FTP server and delete your old backup files – in case you have a storage limit on your FTP server. You can put these scripts in cron or Windows Task Scheduler, or run them manually. You can have the backup script run as often as you like (maybe once an hour) – but keep in mind there will be table-level locking. The script to delete the old backups only needs to be run once a day.

THE BACKUP SCRIPT

For the backup script, you will have to enter a few variables to match your system. You will also need to create a configuration file of all of the databases you want to back up. I could have connected to the MySQL database and run a query (SHOW DATABASES;) to retrieve all of the databases, but I prefer to manually manage the list of databases to back up. With this method, you can skip an hourly backup of static or read-only databases and only back up the databases which are being changed. This configuration file is a text file with a list of the databases to be backed up, and you can use a # (pound sign) to comment out databases you want to skip.

NOTE: You don’t want to back up the following databases: PERFORMANCE_SCHEMA, INFORMATION_SCHEMA or the SYS schema.
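Here is what a minimal mysql_backup.config might look like (the database names are placeholders):

# databases to back up - one per line; lines starting with # are skipped
database_01
database_02
database_03
#database_04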

# set the directory where you will keep the backup files
$backup_folder = '/Users/tonydarnell/cron/mysqlbackups';

# the config file is a text file with a list of the databases to backup
# this should be in the same location as this script, but you can modify this
# if you want to put the file somewhere else
my $config_file = dirname($0) . "/mysql_backup.config";

# Here is where you will put your FTP server name (or IP address)
# and your FTP username and password
my $host = "server_name.com";
my $user = "username";
my $password = "password";

You can also modify the mysqldump command which will be used to back up your databases, or use what I have in the script. You will need to add your MySQL password where I have the word “PassWord”.

`/usr/local/mysql/bin/mysqldump -R -h192.168.1.2 --events --triggers -u mysqlbackup --password=PassWord --routines --add-drop-database --set-gtid-purged=OFF --add-drop-table $database $table | compress > $folder/$file.Z`;

I created a separate FTP user which has its own home directory, so I don’t have to change the directory once I login via FTP. If you can’t do this, you will need to uncomment the last three lines of this and navigate your FTP user to the correct directory:

# uncomment the last three lines if you can't set the home directory of the FTP user to a specific directory
# the directory on the FTP server where you want to save the backup files
# my $dir = "mysqlbackups";
# print "Accessing FTP - changing to $dir folder\n";
# $f->cwd($dir) or die "Can't cwd to $dir\n";

That is all you need to modify in the first script for it to work. If you run this script on the command line, or if you run it in cron, you should see something like this:

mysqladmin: [Warning] Using a password on the command line interface can be insecure.

It isn’t a good idea to have your password in plain text anywhere, so you should create a user which only has the limited read-only permissions needed to run mysqldump. You will need to change the value of “database_name” in the GRANT statement to match each database you want to back up. You will need to run the GRANT statement for every database you want to back up, or you can use an asterisk “*” in place of the database name.

CREATE USER 'mysqlbackup'@'192.168.1.2' IDENTIFIED WITH sha256_password BY '';
GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON `database_name`.* TO 'mysqlbackup'@'192.168.1.2';

Or, to grant permissions to the mysqlbackup user on all of the tables:

GRANT SELECT, LOCK TABLES, SHOW VIEW, EVENT, TRIGGER ON *.* TO 'mysqlbackup'@'192.168.1.2';

Here is a sample output from the script:

# perl mysql_backup.pl
Beginning 2017-01-06-16:35:57
Backing up database_01 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_01 - as 2017-01-06-1635/database_01.sql.Z
Backing up database_02 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_02 - as 2017-01-06-1635/database_02.sql.Z
Backing up database_03 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_03 - as 2017-01-06-1635/database_03.sql.Z
....
Backing up database_04 - 2017-01-06-1635 ... mysqldump: [Warning] Using a password on the command line interface can be insecure.
Finished exporting database_04 - as 2017-01-06-1635/database_04.sql.Z
------------------------------------------------------------------
 Compressing backup as:  2017-01-06-1635.tar.gz 
------------------------------------------------------------------
Creating a tar file from the backup...
tar -czf 2017-01-06-1635.tar.gz 2017-01-06-1635
FTP'ing the file - 2017-01-06-1635.tar.gz 
Deleting the original backup directory and files.
Finished 2017-01-06-16:36:49

The backup script creates a new directory with a timestamp as the name of the directory. In this example, the directory was named “2017-01-06-1635“. After the backup is finished, it creates a single tar file and then deletes the backup directory and files. I delete the backup directory and files as the backup is now contained in the single tar file. It is easier to copy and delete a single tar file via FTP than to do the same with a directory containing multiple files.

DELETING OLD BACKUPS

The script to delete the old backups is fairly simple to configure. You will need to enter the host, username and password of your FTP user, and specify how many backups you want to keep on the server. Since I do a backup once an hour, I keep a week’s worth (168 copies) of backups on my server. I could have checked the date/time on the files and deleted the older files which were X number of days old, but I decided to just go with a certain number of files to keep. I also included a “print_output” variable if you want to suppress any output – simply change this value to anything but “yes” and the script won’t print any output.

# Here is where you will put your FTP server name (or IP address)
# and your username and password
my $host = "server_name.com";
my $user = "username";
my $password = "password";

# how many copies of the backup do you want to keep?
$total_files_to_keep = 168;

$print_output = "yes";
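With those values set, the heart of the cleanup is just a count-based loop over the sorted file list. Here is a rough sketch of the idea (my own illustration, not the exact code from the script – it assumes $f is an already-connected Net::FTP handle):

# timestamp-based names sort oldest-first; the grep skips "." and ".."
my @backups = sort grep { /20/ } $f->ls();
my $files_to_delete = @backups - $total_files_to_keep;
for my $i (0 .. $files_to_delete - 1) {
   print "Deleting: $backups[$i]\n" if ($print_output eq "yes");
   $f->delete($backups[$i]) or warn "Can't delete $backups[$i]\n";
}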

Uncomment this line if you want to see a list of the other files which will not be deleted.

# optional output - remove # to have it print remaining files
# if ($print_output eq "yes") { print "| $count of $total_files_available |   Keeping: $filename\n"; }

Here is a sample output from running the script:

root# perl mysql_delete_backup.pl 
--------------------------------------------
           Total files: 194
    Total backup files: 192
   Total files to keep: 168
 Total files to delete: 24
----------------------------------------------
| x of 192 |  Skipping: .
| x of 192 |  Skipping: ..
| 1 of 192 |  Deleting: 2017-12-29-1352.tar.gz
| 2 of 192 |  Deleting: 2017-12-29-1452.tar.gz
| 3 of 192 |  Deleting: 2017-12-29-1552.tar.gz
| 4 of 192 |  Deleting: 2017-12-29-1652.tar.gz
....
| 24 of 192 |  Deleting: 2017-12-30-1252.tar.gz
----------------------------------------------
Finished 2017-01-06-15:21:58

When I run this script on my FTP server, and I do a listing of the backup directory, it shows the single-period (.) or current directory and the double-period (..) or the parent directory. I take this into account by skipping all files which do not have “20” in the name (as in the first two letters of the year).

If you login to your FTP server and you don’t see the “.” and “..“, then you will need to remove the “-2” in this part of the script, and simply have $total_files_available = $total_files:

# subtract two because of the . and ..
$total_files_available = $total_files - 2;

Here is what I see when I FTP to my server: (notice the “.” and “..“)

root# ftp backups@scripts.com
Trying 1234:f1c0:4738:5088:cb9a:dksi:ebfa:3829...
Connected to scripts.com.
220 FTP Server ready.
331 Password required for backups
Password: 
230 User backups logged in
Remote system type is UNIX.
Using binary mode to transfer files.
ftp> dir
229 Entering Extended Passive Mode (|||58906|)
150 Opening ASCII mode data connection for file list
drwx---r-x   2 u63541528-backups ftpusers     4096 Jan  6 16:52 .
drwx---r-x   2 u63541528-backups ftpusers     4096 Jan  6 16:52 ..
-rw----r--   1 u63541528-backups ftpusers 45522630 Jan  5 22:52 2017-01-05-2252.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45539118 Jan  5 23:52 2017-01-05-2352.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45558328 Jan  6 01:52 2017-01-06-0152.tar.gz
-rw----r--   1 u63541528-backups ftpusers 45560794 Jan  6 07:52 2017-01-06-0752.tar.gz
....

The scripts are available on GitHub – https://github.com/ScriptingMySQL/PerlFiles. The scripts are named: mysql_backup.pl and mysql_delete_old_backups.pl.

I am not the best Perl programmer, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL.

 



Use MySQL to store data from Amazon’s API via Perl scripts

I really like Amazon.com and I have been a Prime member for several years. Along with millions of different items for sale, Amazon has an affiliate program, where you can earn money advertising products on your web site. When a visitor to your site clicks on a link and orders a product from Amazon, you earn a small commission on the total sale. As an affiliate, you have access to Amazon’s Product Advertising API for obtaining product information. But, you can use this information for many other purposes.

The Amazon API is like most other APIs, and their API web site provides you with code examples and explains how it all works. I am going to show you a Perl program which you can use to access the API data and store it in a MySQL database. This program was modified from one of the examples on the Amazon API web site.

I wrote a book in 2014, and I wanted to be able to track the book’s ranking on Amazon. I have a couple of friends who wrote books as well, so I tracked their rankings at the same time. By using the API, I can get a lot of information about any product – including my book’s ranking. And, I can keep a historical record of the ranks in a MySQL database. You could also use this to track the price of a product, and you could have the script notify you if the price changed or went below a certain threshold. For example – if you want to know when the price of that 55″ 4K television drops below $599 – you can use the API to check the price once an hour, and have it send you an alert when the price drops. Most prices and ranks only change (at most) once an hour, so running the script every five minutes isn’t necessary.

To access Amazon’s API, you will need to register as an Amazon affiliate, and obtain your own Access Keys (Access Key ID and Secret Access Key). In the Perl script, this information goes here:

use constant myAWSId	    => 'Access Key ID';
use constant myAWSSecret    => 'Secret Access Key';
use constant myEndPoint	    => 'ecs.amazonaws.com';

The following Perl script is an example of getting the rank for a particular item. To use the script and to access Amazon’s API, you have to supply the product’s ASIN (Amazon Standard Identification Number), or for a book, you can supply the ISBN (International Standard Book Number). The ASIN is usually found in the URL for a product – as in http://www.amazon.com/gp/product/B00R0ZM5W4 – where B00R0ZM5W4 is the ASIN. I use a MySQL table to store the ranking information (you can modify the table to also include other information). Here is the CREATE TABLE statement for this table:

CREATE TABLE `ranks` (
  `idranks` int(11) NOT NULL AUTO_INCREMENT,
  `product_id` varchar(15) DEFAULT NULL,
  `product_title` varchar(100) DEFAULT NULL,
  `product_rank` varchar(15) DEFAULT NULL,
  `rank_datetime` datetime DEFAULT NULL,
  PRIMARY KEY (`idranks`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

The script also has a sub-routine (named sub ConnectToMySql) which uses connection information stored in a text file one directory above the directory where the script is executed. In this example, the file name is accessAMAZON, and the file contains this connection information (in this order):

database_name
IP_address
mysql_user_name
mysql_password

You can hardcode the connection information directly into the script if you prefer.
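If you would rather hardcode the values, the body of the sub-routine could be reduced to something like this (the user name and password here are placeholders):

   my $db     = "amazonrank";
   my $host   = "192.168.1.2";
   my $userid = "your_mysql_user";
   my $passwd = "your_mysql_password";

   my $connectionInfo="dbi:mysql:$db;$host:3306";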

The script will also output all of the potential variables and values for each ASIN/ISBN, and this information is stored in a text file with a name equal to the ASIN/ISBN and a file extension of “.txt”. Example: B00R0ZM5W4.txt. There are over a hundred different variables for each product.

The script is named amazonrank.pl and may be found on github at: https://github.com/ScriptingMySQL/PerlFiles.

Here is the script: (remember to add your own Amazon access key information into the script)

#!/usr/bin/perl -w

##############################################################################################
# Copyright 2009,2013 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You may not use this file 
# except in compliance with the License. A copy of the License is located at
#
#	   http://aws.amazon.com/apache2.0/
#
# or in the "LICENSE.txt" file accompanying this file. This file is distributed on an "AS IS"
# BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations under the License. 
#
#############################################################################################
#
#  Amazon Product Advertising API
#  Signed Requests Sample Code
#
#  API Version: 2009-03-31/2013-08-01
#
#############################################################################################

#use strict;
#use warnings;

#no warnings 'uninitialized';

use Data::Dumper;
use DBD::mysql;
use DBI;

use RequestSignatureHelper;
use LWP::UserAgent;
use XML::Simple;

use constant myAWSId		=> 'Access Key ID';
use constant myAWSSecret	=> 'Secret Access Key';
use constant myEndPoint		=> 'ecs.amazonaws.com';

my $Database = "amazonrank";

# see if user provided ItemId on command-line
# my $itemId = shift @ARGV || '0545010225';

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst < 10) { $isdst = "0$isdst"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

# ----------------------------------------------------------------------------------

# loop through the ASIN/ISBN values provided on the command line
# (this loop line is a reconstruction - the original was garbled by WordPress)
foreach $itemId (@ARGV) {

# the text file which will receive the full Dumper output for this item
$fileout = "$itemId.txt";

open(OUT, "> $fileout") or die print "Couldn't open log_file: $!";

print "Retrieving rank for $itemId - ";

# Set up the helper
my $helper = new RequestSignatureHelper (
	+RequestSignatureHelper::kAWSAccessKeyId => myAWSId,
	+RequestSignatureHelper::kAWSSecretKey => myAWSSecret,
	+RequestSignatureHelper::kEndPoint => myEndPoint,
);

# A simple ItemLookup request
my $request = {
	Service => 'AWSECommerceService',
	Operation => 'ItemLookup',
	Version => '2013-08-01',
	ItemId => $itemId,
	AssociateTag => 'scmy-20',
	ResponseGroup => 'Large',
};

# Sign the request
my $signedRequest = $helper->sign($request);

# We can use the helper's canonicalize() function to construct the query string too.
my $queryString = $helper->canonicalize($signedRequest);
my $url = "http://" . myEndPoint . "/onca/xml?" . $queryString;
#print "Sending request to URL: $url \n";

my $ua = new LWP::UserAgent();
my $response = $ua->get($url);
my $content = $response->content();
#print "Recieved Response: $content \n";

my $xmlParser = new XML::Simple();
my $xml = $xmlParser->XMLin($content);

# This prints out all of the item information into a text file
print OUT "Parsed XML is: " . Dumper($xml) . "\n";

if ($response->is_success()) {

	# Here is where you extract the information for each item
	my $title = $xml->{Items}->{Item}->{ItemAttributes}->{Title};
	my $rank = $xml->{Items}->{Item}->{SalesRank};
	my $price = $xml->{Items}->{Item}->{Offers}->{Offer}->{OfferListing}->{Price}->{FormattedPrice};

	# option to print to screen - uncomment this next line
	#	print "Item $itemId is titled \"$title\" and ranked $rank\n";

if (length($rank) > 1)

{
	$dbh = ConnectToMySql($Database);	
	$query = "insert into ranks (product_id, product_rank, rank_datetime, product_title) values ('$itemId', '$rank', '$DateTime','$title')";
	$sth = $dbh->prepare($query);
	$sth->execute();
	$dbh->disconnect;

	print "$rank || $title || $itemId || \$$price\n\n";
}

else

{
	print "Rank for: $title is unavailable.\n\n";
}

# optional sleep 
# sleep 5;


} else {
	my $error = findError($xml);
	if (defined $error) {
	print "Error: " . $error->{Code} . ": " . $error->{Message} . "\n";
	} else {
	print "Unknown Error!\n";
	}
}

sub findError {
	my $xml = shift;
	
	return undef unless ref($xml) eq 'HASH';

	if (exists $xml->{Error}) { return $xml->{Error}; };

	for (keys %$xml) {
	my $error = findError($xml->{$_});
	return $error if defined $error;
	}

	return undef;
}


# end - foreach
}

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

  # my ($db) = @_;

   open(PW, "<..\/accessAMAZON") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

I am not the best Perl programmer, nor am I an expert at the Amazon API, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL and TonyDarnell.

 



Add RSS feeds to your Twitter stream using MySQL and Perl

Adding good content to Twitter can be a pain. I can’t do it during working hours, and I don’t have much time at night. But, the more content you have, the more followers you can gain, and the more your original tweets can be seen (hopefully). I have written several posts about using the latest Perl-Twitter API – Net::Twitter::Lite::WithAPIv1_1, so you might want to check these out as well.

Use MySQL and Perl to automatically find, follow and unfollow twitter users

Using Perl to retrieve direct messages from Twitter, insert messages into a MySQL database and then delete the messages

Using Perl and MySQL to automatically respond to retweets on twitter

Using Perl to send tweets stored in a MySQL database to twitter

However, finding good supplemental content is easy. There are plenty of companies and web sites which offer RSS (Rich Site Summary or Really Simple Syndication) feeds that you can use on Twitter. In this post, I will show you how to capture the RSS feed from a web site and tweet the links with your Twitter account.

One warning – Twitter has strict guidelines for using APIs on their site. The rate limits may be found at https://dev.twitter.com/rest/public/rate-limiting.  In my previous post, I included a Perl script (rate_limit.pl) that prints out the API rate limits for your application. You can use this script to monitor your usage. The key with using the Twitter API is to not be too aggressive, or your app will be banned by Twitter. For example, Twitter does not allow bulk follows and unfollows – so having patience is important. And if you use these scripts to send SPAM, your app and Twitter account will be banned.

For this post, I created two Perl scripts. The first grabs the RSS information and inserts it into a MySQL database. The second script pulls the information from the MySQL database and sends a tweet one record at a time. You can set up a cron job to do both of these scripts, but remember the RSS script needs to run first.

The key when parsing RSS feeds is to find a unique number/identifier which you can use for each feed item. Some RSS feeds will include a <guid> tag, but often you have to parse the URL link to get a unique number/identifier. The script uses this unique identifier to check and see if this RSS feed item is already in the database, so you don’t have duplicate RSS items. I tested these scripts on a dozen different RSS feeds, and it works really well.
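For example, with XML::FeedPP you can try the <guid> tag first and fall back to pulling a number out of the link (a sketch – the URL in the comment is just an example):

$post_id = $item->guid();
if (!defined($post_id)) {
	# no <guid> - use the first run of digits in the link as the identifier
	# (e.g. http://example.com/blog/?p=12345 gives 12345)
	($post_id) = $item->link() =~ /(\d+)/;
}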

Before you attempt to use these scripts, you will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

We will need to create two tables. One to hold the RSS feed information, and another to hold the tweet history. I use the tweet history to delete older tweets with another Perl script I wrote. Here is the SQL to create the tables:

CREATE TABLE `rss_oracle` (
  `id` int(9) NOT NULL AUTO_INCREMENT,
  `id_post` bigint(14) NOT NULL DEFAULT '0',
  `post_title` varchar(256) DEFAULT NULL,
  `post_url` varchar(256) DEFAULT NULL,
  `post_author` varchar(48) DEFAULT NULL,
  `post_date` datetime DEFAULT NULL,
  `tweet_sent` varchar(3) DEFAULT NULL,
  `tweet_sent_date` datetime DEFAULT NULL,
  PRIMARY KEY (`id`,`id_post`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

CREATE TABLE `history` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tweet` char(140) DEFAULT NULL,
  `tweet_id` varchar(30) DEFAULT NULL,
  `tweet_update` datetime DEFAULT NULL,
  `error` char(3) DEFAULT '',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

For our RSS feed, we will be grabbing the RSS from Oracle‘s web site. You will need to change the $source variable to whatever RSS feed you want to use. As you grab the data, you will need to test your script to see what non-ASCII characters are included in the title. This example script should translate most characters to ASCII ones, but you will want to check the output before actually tweeting the posts. When you find strange characters, you can simply truncate the RSS database table and re-run the script, or you can uncomment the print statements to see what is being returned.

In the subroutine ConnectToMySql used in the Perl scripts, I store the MySQL login credentials in a text file one directory above where my Perl script is located. This file – named accessTweets – contains this information:

database_name
hostname or IP
MySQL user name
password

The scripts are also available on GitHub – https://github.com/ScriptingMySQL/PerlFiles. (For some reason, WordPress likes to change the code.)

rss_oracle.pl

#!/usr/bin/perl -w

use LWP::Simple 'get';
use Encode;
use utf8;
use Text::Unidecode;
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use XML::FeedPP;

    my $source = 'https://www.oracle.com/corporate/press/rss/rss-pr.xml';
    my $feed = XML::FeedPP->new( $source );

foreach my $item ( $feed->get_item() ) {

	$post_id = $item->guid();
	$title = $item->title();

	# this is where you have to replace non-ASCII characters
	# each RSS feed will use different non-ASCII characters
	$title = decode_utf8( $title );
	$title =~ s/’/\^/g;
	$title =~ s/\&\#8217;/^/g;
	$title =~ s/\&\#8216;/^/g;
	$title =~ s/\&\#8212;/-/g;
	$title =~ s/\&\#8230;/-/g;
	$title =~ s/'/\^/g;
	$title =~ s/‘/\^/g;
	$title =~ s/’/^/g;
	$title =~ s/…/.../g;
	$title =~ s/—/-/g;
	$title =~ s/-/-/g;
	$title =~ s/–/-/g;
	$title =~ s/ 8212 /-/g;
	$title =~ s/ 8230 /-/g;
	$title =~ s/<em>//g;
	$title =~ s/<\/em>//g;
	$title =~ s/[^a-zA-Z0-9 ~,._*:?\$^-]//g;

	$link = $item->link();

	# uncomment this line to test
	#print "$post_id | $title | $link\n";
	
	# see if we already have this post in the RSS database	
	$dbh = ConnectToMySql($Database);
	$query = "select id_post FROM rss_oracle where id_post = '$post_id' limit 1";	
	$sth = $dbh->prepare($query);
	$sth->execute();

        #print "\n$query\n\n";

	# loop through our results - one user at a time
	while (@data = $sth->fetchrow_array()) {
		$id_post_found = $data[0];
	# end - while
	}

		#print "id_post_found $id_post_found \n";

		if (length($id_post_found) > 1)
			
		{
			#print "Found $id_post_found...\n";
			$id_post_found = "";
		}
			
		else
			
		{
			$dbh2 = ConnectToMySql($Database);
			$query2 = "insert into rss_oracle (id_post, post_title, post_url) values ('$post_id', '$title', '$link')";	
			$sth2 = $dbh2->prepare($query2);
			# during testing, comment this next line to prevent the data from being inserted into the database
			$sth2->execute();
			#print "$query2\n";

			$title = "";
			$link = "";
			$id_post_found = "";
			
		}

	# foreach my $item
    }

exit;

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

And here is the script to do the tweeting. You will need to add your Twitter Name to the $My_Twitter_User variable.

tweet-oracle.pl

#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;

$My_Twitter_User = "YourTwitterNameHere";

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst < 10) { $isdst = "0$isdst"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

    # get the oldest RSS entry which has not yet been tweeted
    # (this SELECT is a reconstruction - the original lines were garbled by WordPress)
    $dbh = ConnectToMySql($Database);
    $query = "select id_post, post_title, post_url FROM rss_oracle where tweet_sent IS NULL limit 1";
    $sth = $dbh->prepare($query);
    $sth->execute();
    
    #print "$query \n";
    
			while (@data = $sth->fetchrow_array()) {
            
					$id_post = $data[0];
					$post_title = $data[1];
					$post_url = $data[2];

				$post_title =~s/  / /g;
				$post_title =~s/ ampamp / and /g;

			# while (@data = $sth->fetchrow_array()) {
			}
			
if (length($post_title) > 100)

	{
		$title_trimmed = substr($post_title,0,105);
		$title_trimmed = "$title_trimmed...";
		#print "TRIM $title_trimmed\n";

			if ($title_trimmed =~ " Oracle ")

			{
				$title_trimmed =~ s/ Oracle / \#Oracle /;
			}

			else

			{
				$add_Hashtag = " \#Oracle ";
			}		


	}
	
	else
	
	{
		$title_trimmed = "$post_title";
		$title_trimmed =~ s/ Oracle / \#Oracle /;
		#print "x $title_trimmed\n";
	}


$tweet = "$title_trimmed $add_Hashtag \n\n$post_url";

$tweet =~ s/  / /g;
$tweet =~ s/  / /g;

$add_Hashtag = "";

#print "$tweet \n";

#exit;

# ----------------------------------------------------------------------------
# find carats and substitue for single quote
# ----------------------------------------------------------------------------

$tweet =~ s/\^/\'/g;

# ----------------------------------------------------------------------------
# send tweet
# ----------------------------------------------------------------------------


# Credentials for your twitter application
# You will need to substitute your own values for these variables
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);
	# send the tweet
	my $results = eval { $nt->update("$tweet") };

#		    print "---DUMPER START---\n";
#			print Dumper $results;
#		    print "---DUMPER END---\n\n";


# ----------------------------------------------------------------------------
# update mysql with new date for last_tweet date/time
# ----------------------------------------------------------------------------

$dbh = ConnectToMySql($Database);
$query = "UPDATE rss_oracle SET tweet_sent_date = '$DateTime' , tweet_sent = 'yes' where id_post = '$id_post'";
$sth = $dbh->prepare($query);
$sth->execute();

# ----------------------------------------------------------------------------
# get the status id of the last tweet
# ----------------------------------------------------------------------------

my $statuses = $nt->user_timeline({ user => "$My_Twitter_User", count=> 1 });

for my $status ( @$statuses ) {
	$tweet_id = "$status->{id}";
#	print "Tweet ID $tweet_id\n";
}


# ----------------------------------------------------------------------------
# replace special characters
# ----------------------------------------------------------------------------

$tweet =~ s/\\\n/~/g;
$tweet =~ s/\n/~/g;
$tweet =~ s/\'/^/g;

# update mysql with new date for last_tweet date/time

$dbh = ConnectToMySql($Database);	
$query = "insert into history (tweet,tweet_id,tweet_update) values ('$tweet','$tweet_id','$DateTime')";
$sth = $dbh->prepare($query);
$sth->execute();

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   # my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:$db;$host:3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

I am not the best Perl programmer, nor am I an expert at the Twitter API, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL and TonyDarnell.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots

 

Visit http://2044thebook.com for more information.

Use MySQL and Perl to automatically find, follow and unfollow twitter users

A friend of mine asked me how he could automatically follow and unfollow people on Twitter. But he didn’t want to follow just anyone and everyone. He had a Twitter account which he used for recruiting in a very narrow construction industry. He wanted to find people in the same industry and follow them – hoping they would follow him back and learn about his open jobs. When I joined Twitter back in 2008, I wrote a similar program to automatically follow/unfollow users, but the Twitter API has changed quite a bit since then. So I decided to re-write the program with the latest Perl-Twitter API – Net::Twitter::Lite::WithAPIv1_1.

Before you attempt to use these scripts, you will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

When running these types of scripts on Twitter, you have to be careful to not surpass the rate limits that Twitter has established for using the API.  I have included a script named rate_limit.pl – which can give you the limits you have for each API-call, and how many calls you have remaining before the limits are reset by Twitter.  I have added pauses in the scripts to help prevent you from going over the rate limits (but you will need to check to ensure you don’t surpass them).  You are free to modify these scripts (but I won’t be able to help you figure out how to do this.)  The rate limits may be found at https://dev.twitter.com/rest/public/rate-limiting.  The key with using the Twitter API is to not be too aggressive, or your app will be banned by Twitter. For example, Twitter does not allow bulk follows and unfollows – so having patience is important.
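If you just want to see your current limits, the core of rate_limit.pl boils down to a single API call – something like this sketch (substitute your own credentials; this is my illustration, not necessarily the exact script):

#!/usr/bin/perl

use Net::Twitter::Lite::WithAPIv1_1;
use Data::Dumper;

my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      ssl                 => 1
);

# dump the rate-limit status for every API call
print Dumper $nt->rate_limit_status();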

There are several programs involved, and all of them utilize MySQL databases to store the information.  A summary of the scripts are as follows:

followers_find.pl – To use this script, you choose a Twitter user that has an audience similar to yours, and then grab their followers.  For example, if you want to follow cat lovers, you could grab the followers of the Twitter user named @Cat.  This script will pull the last 5,000 followers of @Cat, and place those user ID’s into a database named follows_other_users.  The script will also save the cursor information, so you can run this script multiple times to obtain a large pool of users, and avoid duplicates.  This script stores the account you followed (@Cat) and the user_id of the follower.

friend_lookup.pl – This script takes 100 of the user ID’s from the follows_other_users database, pulls the user’s details from Twitter and inserts this information into the twitter_users database.  The user information includes the name, the user ID, number of tweets, how many followers, how many people they are following, time zone and description.  You can modify the database and the script to include or omit other pieces of information.

follow_user.pl – This script follows users from the twitter_users database, based upon the percentage of followers/following and the number of tweets (you can change the search criteria).  For example, I didn’t want to follow someone who was following 2,000 people but only had 100 followers.  I wanted the followers/following ratio to be a little more even.  I also wanted to follow people who had posted at least 30 tweets.  Even though the followers_find.pl script downloads the information for 5,000 users (at a time), you might only follow a couple hundred of these users who fit your criteria.

friends_follow_check.pl – This script will check to see if a user you followed has followed you back, and if not, then the script will unfollow that user. You will need to specify how many days to give someone to follow you back before you unfollow them.  For example: You follow a group of users on 10/05/2015 (the database stores what date you follow someone). You decide to wait five days to see if anyone from this group follows you back. Therefore, on 10/10/2015, you can run this script and change the $date_to_delete variable to 2015-10-05, and the script will unfollow anyone you followed on 2015-10-05 (or prior) who is not following you back.

This diagram shows you the steps for each script and to what database they connect:

More details on each script:

followers_find.pl – Use this to grab followers of a related Twitter user by providing a value for $user_to_find_followers. This value should be the Twitter user’s name without the “@” symbol – and not their description name. The script will insert 5,000 followers at a time into follows_other_users and insert cursor information in user_cursors.

This script uses the followers_ids API call, which has a limit of 15 every 15 minutes. When you run the rate_limit.pl script, you can see how many calls are remaining:

'followers' => {
'/followers/ids' => {
'limit' => 15,
'remaining' => 14,
'reset' => 1445026087,
},

friend_lookup.pl – Selects users from follows_other_users, gets the user’s details, and then inserts the information into twitter_users. The script can get information on 100 users at a time. Some users may produce an error, and the script will stop. If this happens, just re-run the script. If the script still has an error, delete that user from the database.

This script uses the lookup_users API call, which has a limit of 180 every 15 minutes. When you run the rate_limit.pl script, you can see how many calls are remaining:

'/users/lookup' => {
'limit' => 180,
'remaining' => 179,
'reset' => 1445028838,
},

For some reason, when I ran this script, I would get an error on a particular user. I spent a lot of time trying to figure out why the error occurred, but there were too many unknowns as to why it existed. So, when the script finds a bad user, it updates the follows_other_users table and marks that user with an error (err). You may have to delete a user from follows_other_users manually if the script doesn’t mark the user with an error and instead exits immediately after you run it.
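
Once you have given up on the problem users, you can clear all of the flagged rows at once with a statement like this (note that it permanently deletes them):

delete from follows_other_users where looked_up_info = 'err';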

follow_user.pl – Selects users from the twitter_users table and follows them if they meet certain criteria. The script also updates the twitter_users table with whether or not each user was followed, and the date/time they were followed.

This script uses the create_friend API call. Twitter’s documentation does not specify a limit for it, and the limit does not appear when you run the rate_limit.pl script. I only follow 10-20 new friends an hour, to avoid Twitter’s ban on automated bulk following and unfollowing.

Before you run this script, you want to be sure that you have enough users in the twitter_users table that fit your search criteria. Use this SQL statement to find the number of users available for you to follow (feel free to modify the criteria):

select user_id FROM twitter_users where sent_follow_request IS NULL and percent_follow > 90 and percent_follow < 140 and statuses_count > 30;

The number of users to follow is set with the $limit variable, and I have it set to 250 – the script will follow 250 users before quitting. I use a sleep command for a random-length pause between six minutes ($minimum = 360;) and twelve minutes ($maximum = 720;) after each follow, so Twitter doesn’t think you are a robot. You may adjust these values as well.

friends_follow_check.pl – Selects the users you followed from the twitter_users table and unfollows anyone who hasn’t followed you back. It updates twitter_users with the unfollow information.

After you run the follow_user.pl script, you will need to wait a few days to give people time to follow you back. Then set the variable $date_to_delete to a date a few days prior to the current date. I usually give people five days to follow me back.
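
To preview which users would be unfollowed for a given cutoff date, you can run a query along these lines (a hypothetical example that mirrors the script’s selection):

select user_id, sent_request_datetime
from twitter_users
where sent_follow_request = 'yes'
and sent_request_datetime < '2015-10-21'
and followed_me IS NULL;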

You can always change your search criteria to be less restrictive, in order to find more followers. But I have found that a strict search criteria removes most of the spammers.

This script uses the lookup_friendships API call, which has a limit of 15 every 15 minutes. When you run the rate_limit.pl script, you can see how many calls are remaining:

'/friendships/lookup' => {
'limit' => 15,
'remaining' => 14,
'reset' => 1445031488,
},

In each of the scripts, you have an option to print out the results from the Twitter API call. You will need to uncomment (remove the #) from these lines:

#    print "---DUMPER START---\n";
#    print Dumper $followers_list;
#    print "---DUMPER END---\n\n";

There are other print statements that have been commented out as well. Uncomment them if you want to see more output.


Here are the CREATE TABLE statements for each table. Some fields are longer than you might expect; I did this to leave room for multi-byte special characters (I had to use decode_utf8 on names and descriptions):

CREATE TABLE `follows_other_users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_to_find_followers` varchar(16) DEFAULT NULL,
  `follower_id` varchar(32) DEFAULT NULL,
  `looked_up_info` char(3) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1
CREATE TABLE `user_cursors` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `user_id` varchar(16) DEFAULT NULL,
  `next_cursor` varchar(48) DEFAULT NULL,
  `previous_cursor` varchar(48) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1
CREATE TABLE `twitter_users` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(56) DEFAULT NULL,
  `screen_name` varchar(32) DEFAULT NULL,
  `user_id` varchar(16) DEFAULT NULL,
  `sent_follow_request` varchar(3) DEFAULT NULL,
  `sent_request_datetime` datetime DEFAULT NULL,
  `followed_me` varchar(3) DEFAULT NULL,
  `unfollowed_them` varchar(3) DEFAULT NULL,
  `statuses_count` int(11) DEFAULT NULL,
  `following_count` int(11) DEFAULT NULL,
  `followers_count` int(11) DEFAULT NULL,
  `percent_follow` int(11) DEFAULT NULL,
  `time_zone` varchar(256) DEFAULT NULL,
  `description` varchar(4096) DEFAULT NULL,
  `creation_datetime` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

And here are the scripts. Remember you need to create your own keys and tokens and insert them into the script for $consumer_key, $consumer_secret, $access_token and $access_token_secret.

In the subroutine ConnectToMySql used in the Perl scripts, I store the MySQL login credentials in a text file one directory above the script (the scripts open it as ../accessTweets). This file – named accessTweets – contains this information:

database_name
hostname or IP
MySQL user name
password
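
For example, the four lines of a hypothetical accessTweets file might look like this – substitute your own values:

my_database
localhost
my_mysql_user
my_password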

The scripts are also available on GitHub – https://github.com/ScriptingMySQL/PerlFiles.

followers_find.pl

#!/usr/bin/perl
# Updated 2015-10-25

use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
use Scalar::Util 'blessed';
# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst < 10) { $isdst = "0$isdst"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

# ----------------------------------------------------------------------------------

# Credentials for your twitter application
# you will need to substitute your own application information for these four variables
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);


$count=1;

# twitter user name without the @
$user_to_find_followers = "Cat";

$dbh = ConnectToMySql($Database);
$query = "select user_id, next_cursor FROM user_cursorswhere user_id = '$user_to_find_followers' order by id desc limit 1";	
$sth = $dbh->prepare($query);
$sth->execute();

print "\n$query\n";

$count_users = 0;

#			print "name | friends_count | followers_count | statuses_count | percent_follow\% |$time_zone | description | creation_datetime\n";
# loop through our results - one tweet at a time
while (@data = $sth->fetchrow_array()) {
$cursor = $data[1];

if ($cursor == 0)

{
	$cursor = "-1";
}

}

print "Starting at cursor: $data[1]\n";
# Use the optional cursor parameter to retrieve IDs in pages of 5000. When the cursor parameter is used, 
# the return value is a reference to a hash with keys previous_cursor, next_cursor, and ids. 
# The value of ids is a reference to an array of IDS of the user's followers. 
# Set the optional cursor parameter to -1 to get the first page of IDs. 
# Set it to the prior return's value of previous_cursor or next_cursor to page forward or backwards. 
# When there are no prior pages, the value of previous_cursor will be 0. 
# When there are no subsequent pages, the value of next_cursor will be 0.

  eval {

		my $followers_list = $nt->followers_ids({
        screen_name => "$user_to_find_followers",
        cursor      => "$cursor",
        });

#		count => 1
#    print "---DUMPER START---\n";
#	print Dumper $followers_list;
#    print "---DUMPER END---\n\n";

			$next_cursor = $followers_list->{next_cursor_str};
			$previous_cursor = $followers_list->{previous_cursor_str};
			
			print "next_cursor $next_cursor - previous_cursor $previous_cursor \n";

		for my $status2 ( @{$followers_list->{ids}} ) {
        # print "$count $status $next_cursor\n";

			$follower_id = $status2;
		
			# uncomment to watch as it prints each user
			#print "$count $user_to_find_followers $follower_id\n";

			$dbh = ConnectToMySql($Database);	
			$query = "insert into follows_other_users(user_to_find_followers, follower_id) values ('$user_to_find_followers','$follower_id')";
			#print "\n $query\n";
			$sth = $dbh->prepare($query);
			$sth->execute();

		#sleep 1;

		$count++;

	# end for my $status
    }

# end eval    
};
			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "HTTP Response Code: ", $err->code, "\n",
          	 "HTTP Message......: ", $err->message, "\n",
         	  "Twitter error.....: ", $err->error, "\n";
			}
		# put this into a database in case you want to search for more of their followers
		print "\n$user_to_find_followers $next_cursor $previous_cursor\n";

		$user_id = $user_to_find_followers;

		$dbh = ConnectToMySql($Database);	
		$query = "insert into user_cursors(user_id, next_cursor, previous_cursor) values ('$user_id','$next_cursor','$previous_cursor')";
		print "\n $query\n";
		$sth = $dbh->prepare($query);
		$sth->execute();

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst < 10) { $isdst = "0$isdst"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

# ----------------------------------------------------------------------------------


print "Finished importing - $DateTime....\n";
print "\n----------------------------------------------------------------------------\n";

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

exit;

friend_lookup.pl

#!/usr/bin/perl
# Updated 2015-10-25

use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
use Scalar::Util 'blessed';
use Encode;
       
# ----------------------------------------------------------------------------
# get the relationship between my user name and another user name to see
# if they are following me or if I am following them
# ----------------------------------------------------------------------------

$number = 1;

# you are allowed 180 of these lookups every 15 minutes
# with a 15 second pause at the end of each one, you won't 
# go over the limit
while ($number < 180) {

# Credentials for your twitter application
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);

open(OUT, ">dumper_out.txt") || die "Can't redirect stdout";
$dbh = ConnectToMySql($Database);
$query = "select follower_id, user_to_find_followers FROM follows_other_users where looked_up_info IS NULL limit 100";	
# run to see if you can debug why some users get an error
#$query = "select follower_id, user_to_find_followers FROM follows_other_users where looked_up_info = 'err' limit 100";	
$sth = $dbh->prepare($query);
$sth->execute();

print "\n$query\n";

$count_users = 0;

#			print "name | friends_count | followers_count | statuses_count | percent_follow\% |$time_zone | description | creation_datetime\n";
# loop through our results - one tweet at a time
while (@data = $sth->fetchrow_array()) {

	print "$data[0] ";

	$user_id = $data[0];
	$user_to_find_followers = $data[1];

if ($count_users > 0)

	{

		$users_to_get = "$users_to_get, $user_id";

		push(@data2, "$user_id");

	}

	else

	{

		$users_to_get = "$user_id";
		push(@data2, "$user_id");

	}

	$count_users++;

# end - while (@data = $sth->fetchrow_array()) {
}

#print "$users_to_get\n";

$count = 1;

#while (@data2) {

			print "--------------------------------------------------------------------------------------------------\n";
	eval {

    			my $user_info = $nt->lookup_users({ 
    				user_id => [ "$users_to_get" ] 
    			});

	print OUT "---DUMPER START---\n";
	print OUT Dumper $user_info;
	print OUT "---DUMPER END---\n\n";

			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "\n - HTTP Response Code: ", $err->code, "\n",
				"\n - HTTP Message......: ", $err->message, "\n",
				"\n - Twitter error.....: ", $err->error, "\n";
			}
		for my $status ( @$user_info ) {

			if (length($status->{name}) < 1) { next; }

			$name = $status->{name};
			$name =~ s/[^[:ascii:]]//g;
			$name =~ s/[^!-~\s]//g;
			$name = decode_utf8( $name );
			$name =~ s/\'/\^/g;
			print "Working on $name - ";

			$user_id = $status->{id};
			$following_count = $status->{friends_count};
			$followers_count = $status->{followers_count};
			$statuses_count = $status->{statuses_count};
			$time_zone = $status->{time_zone};

			$screen_name = $status->{screen_name};
			$screen_name =~ s/[^[:ascii:]]//g;
			$screen_name = decode_utf8( $screen_name );
			$screen_name =~ s/[^a-zA-Z0-9 _^-]//g;
			$screen_name =~ s/[^!-~\s]//g;

			$description = $status->{description};
			
			if (length($description) < 1) { $description = "(none)"; }
			$description =~ s/[^[:ascii:]]//g;
			$description = decode_utf8( $description );
			$description =~ s/\'/\^/g;

			# date format from twitter: 'created_at' => 'Wed Nov 09 19:38:46 +0000 2011',

			$created_at = $status->{created_at};
			@creation_date_array = split(" ",$created_at);
		
			$creation_date_month = $creation_date_array[1];
			
			if ($creation_date_month =~ "Jan") { $creation_date_month = "01"}
			if ($creation_date_month =~ "Feb") { $creation_date_month = "02"}
			if ($creation_date_month =~ "Mar") { $creation_date_month = "03"}
			if ($creation_date_month =~ "Apr") { $creation_date_month = "04"}
			if ($creation_date_month =~ "May") { $creation_date_month = "05"}
			if ($creation_date_month =~ "Jun") { $creation_date_month = "06"}
			if ($creation_date_month =~ "Jul") { $creation_date_month = "07"}
			if ($creation_date_month =~ "Aug") { $creation_date_month = "08"}
			if ($creation_date_month =~ "Sep") { $creation_date_month = "09"}
			if ($creation_date_month =~ "Oct") { $creation_date_month = "10"}
			if ($creation_date_month =~ "Nov") { $creation_date_month = "11"}
			if ($creation_date_month =~ "Dec") { $creation_date_month = "12"}
		
			$creation_date_day_of_month = $creation_date_array[2];
			$creation_date_year = $creation_date_array[5];
			$creation_date_time = $creation_date_array[3];
			$creation_datetime = "$creation_date_year-$creation_date_month-$creation_date_day_of_month $creation_date_time";

			# had to add this as the percentage formula below would fail
			if ($following_count < 1)
			
			{
			
				$following_count = 1;
			
			}
			if ($followers_count < 1)
			{
				$followers_count = 1;
			}

			# percentage of followers to following
			$percent_follow = int(($followers_count / $following_count) * 100);

			$dbh = ConnectToMySql($Database);
			$query = "insert into twitter_users (name, screen_name, user_id, statuses_count, following_count, followers_count, percent_follow, time_zone, description, creation_datetime) values ('$name', '$screen_name', '$user_id', '$statuses_count', '$following_count', '$followers_count', '$percent_follow', '$time_zone', '$description', '$creation_datetime')";
			#print "\n $query\n";
			$sth = $dbh->prepare($query);
			$sth->execute();
			$dbh2 = ConnectToMySql($Database);	
			$query2 = "update follows_other_users set looked_up_info = 'yes' where follower_id = '$user_id'";
			#print " $count ----  $query2\n";
			print "--------------------------------------------------------------------------------------------------\n";
			$sth2 = $dbh2->prepare($query2);
			$sth2->execute();

#sleep 1;

$count++;
		}
# end - eval
};
#599
$number++;

	# if we didn't grab all 100 users, change the last user's status to error
	#print "Count $count\n";
	if ($count < 100) {
		$dbh3 = ConnectToMySql($Database);
		$query3 = "update follows_other_users set looked_up_info = 'err' where follower_id = '$user_id'";
		#print " $query3\n";
		$sth3 = $dbh3->prepare($query3);
		$sth3->execute();
		exit;
	}

# if there aren't any more users, quit
if (length($data[0]) < 1) { exit; }

# end - while ($number < 180)
}
#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}
close(OUT);
exit;

follow_user.pl

#!/usr/bin/perl
# Updated 2015-10-25

use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
use Scalar::Util 'blessed';

# ----------------------------------------------------------------------------
# follow users from database
# ----------------------------------------------------------------------------
# Credentials for your twitter application - blog
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);

# find the stats and info for the following users

$limit = 250;
$percent_follow_minimum = 80;
$percent_follow_maximum = 140;
$statuses_count_minimum = 30;

$dbh = ConnectToMySql($Database);
$query = "select user_id FROM twitter_users where sent_follow_request IS NULL and percent_follow > $percent_follow_minimum and percent_follow  $statuses_count_minimum limit $limit";	
$sth = $dbh->prepare($query);
$sth->execute();

print "\n$query\n\n";

$count_users = 1;

# 107 following 114 followers

# loop through our results - one tweet at a time
while (@data = $sth->fetchrow_array()) {

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime time;

$year = $year + 1900;
$mon = $mon + 1;

# add a zero if the value is less than 10

if ($sec < 10) { $sec = "0$sec"; }
if ($min < 10) { $min = "0$min"; }
if ($hour < 10) { $hour = "0$hour"; }
if ($mday < 10) { $mday = "0$mday"; }
if ($mon < 10) { $mon = "0$mon"; }
if ($year < 10) { $year = "0$year"; }
if ($wday < 10) { $wday = "0$wday"; }
if ($yday < 10) { $yday = "0$yday"; }
if ($isdst < 10) { $isdst = "0$isdst"; }

$DateTime = "$year-$mon-$mday $hour:$min:$sec";

# ----------------------------------------------------------------------------------

	$user_id = $data[0];

	print "$count_users of $limit - following user_id $user_id";

	eval {

		my $friend = $nt->create_friend({ user_id => "$user_id" });
    
#		    print "---DUMPER START---\n";
#			print Dumper $friend;
#		    print "---DUMPER END---\n\n";

			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "\n - HTTP Response Code: ", $err->code, "\n",
				"\n - HTTP Message......: ", $err->message, "\n",
				"\n - Twitter error.....: ", $err->error, "\n";
			
			# end if
			};
			
			
	# end - eval
	};

			$dbh2 = ConnectToMySql($Database);	
			$query2 = "update twitter_users SET sent_follow_request = 'yes', sent_request_datetime  = '$DateTime' where user_id = '$user_id'";
			#print " $query2\n";
			#print " database updated.\n"
			$sth2 = $dbh2->prepare($query2);
			$sth2->execute();

# pause for a random time so twitter doesn't think you are a robot
# minimum and maximum time in seconds to sleep
$minimum = 360; 
$maximum = 720;
$random_sleep = int($minimum + rand($maximum - $minimum));

print " - sleeping for $random_sleep seconds\n";
sleep $random_sleep;

$count_users++;

# end - while (@data = $sth->fetchrow_array())
};

exit;

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

exit;

friends_follow_check.pl

#!/usr/bin/perl
# Updated 2015-10-25

use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
use Scalar::Util 'blessed';

# ----------------------------------------------------------------------------
# see if a user follows me and/or if i follow them
# ----------------------------------------------------------------------------

# Credentials for your twitter application - blog
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);

# limit must be 100 or less
$limit = 100;

# how many days do you want to wait until you unfriend someone?

$date_to_delete = "2015-10-21";

# find the stats and info for the following users

$dbh = ConnectToMySql($Database);
$query = "select user_id, sent_request_datetime FROM twitter_users where sent_follow_request = 'yes' and sent_request_datetime prepare($query);
$sth->execute();

print "\n$query\n\n\n";

$count_users = 0;
$count = 0;

# loop through our results - one tweet at a time
while (@data = $sth->fetchrow_array()) {

	$user_id = $data[0];
	$sent_request_datetime{$user_id} = $data[1];

	if ($count_users > 0)

	{
		$users_to_get = "$users_to_get, $user_id";
		push(@data2, "$user_id");
	}

	else

	{
		$users_to_get = "$user_id";
		push(@data2, "$user_id");
	}

$count_users++;

# end - while
}

print "$users_to_get\n\n";

#exit;

eval {

	my $friend = $nt->lookup_friendships({ user_id => "$users_to_get" });
    
#    print "---DUMPER START---\n";
#	print Dumper $friend;
#    print "---DUMPER END---\n\n";

			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "\n - HTTP Response Code: ", $err->code, "\n",
				"\n - HTTP Message......: ", $err->message, "\n",
				"\n - Twitter error.....: ", $err->error, "\n";
			
			# end if
			};

	for $data_list ( @$friend ) {

	$count++;
   
		for $role ( keys %$data_list ) {
        	

        	if ($role =~ "id_str")
        	
        	{
        			# since the connection info is an array, find the values
        			$user_id_str = $data_list->{$role};
 
 					$user_id = $user_id_str;
 
 			# get user_id
 			
 			#print "$user_id_str - ";
 
 
 
 			# once you have the status of the connection and the user_id
 			# you can check to see if they are following you or not
 
				if ($status_friend =~ "followed_by")
        	
				{
        	
					print "*************\nThis person $user_id_str follows you. - $sent_request_datetime";
					print "Status: $status_connection1 $status_connection2\n*************\n";

					$dbh2 = ConnectToMySql($Database);
					$query2 = "update twitter_users set followed_me = 'yes' where user_id = '$user_id_str'";	
					$sth2 = $dbh2->prepare($query2);
					$sth2->execute();

					print "\n$query2\n";
					print "########\n $count of $limit sleeping....\n########\n";
					sleep 155;

				}
    		
				else
    		
				{
					print "This person $user_id_str DOES NOT follow you. - $sent_request_datetime{$user_id}\n";
					print "Status: $status_connection1 $status_connection2\n";

		eval {

		my $friend = $nt->destroy_friend({ user_id => "$user_id" });
    
#		    print "---DUMPER START---\n";
#			print Dumper $friend;
#		    print "---DUMPER END---\n\n";

			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "\n - HTTP Response Code: ", $err->code, "\n",
				"\n - HTTP Message......: ", $err->message, "\n",
				"\n - Twitter error.....: ", $err->error, "\n";
			
			# end if
			};
			
			
	# end - eval
	};

					$dbh3 = ConnectToMySql($Database);
					$query3 = "update twitter_users set followed_me = 'no' where user_id = '$user_id_str'";	
					$sth3 = $dbh3->prepare($query3);
					$sth3->execute();

					print "\n$query3\n";

					$dbh4 = ConnectToMySql($Database);
					$query4 = "update twitter_users set unfollowed_them = 'yes' where user_id = '$user_id_str'";	
					$sth4 = $dbh4->prepare($query4);
					$sth4->execute();

					print "\n$query4\n";
					print "########\n $count of $limit sleeping ";
					# pause for a random time so twitter doesn't think you are a robot
					# minimum and maximum time in seconds to sleep
					$minimum = 60; 
					$maximum = 120;
					$random_sleep = int($minimum + rand($maximum - $minimum));

					print " for $random_sleep seconds\n";
					sleep $random_sleep;

				# unfollow this user
    		    		
				}

 			# end - if ($role =~ "id_str")
 			}

        	# check the status of the connection        	
        	if ($role =~ "connections")
        	
        	{
        			# since the connection info is an array, find the values
        			$status_connection1 = $data_list->{$role}[0];
        			$status_connection2 = $data_list->{$role}[1];
        			$status_connection3 = $data_list->{$role}[2];
        	
        	$status_friend = "$status_connection1 $status_connection2 $status_connection3";

    		# if ($role =~ "connections")
    		}
    		
    		# for $role ( keys %$data_list ) {
    		}
    		
    		# end - for $data_list ( @$friend ) {
    		}

# end - eval
};

# end - while
#}

print "\n\n";

exit;

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

rate_limit.pl

#!/usr/bin/perl
# Updated 2015-10-25
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
use Scalar::Util 'blessed';
#use Encode;
use JSON;
  
# ----------------------------------------------------------------------------
# get the relationship between my user name and another user name to see
# if they are following me or if I am following them
# ----------------------------------------------------------------------------

# Credentials for your twitter application - blog
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      apiurl => 'http://api.twitter.com/1.1',
      ssl                 => 1
);

	eval {

    			# no "my" here - $user_info is used after the eval block ends
    			$user_info = $nt->rate_limit_status;

#	print "---DUMPER START---\n";
print Dumper $user_info;
#	print "---DUMPER END---\n\n";

			if ( my $err = $@ ) {
				die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

				warn "\n - HTTP Response Code: ", $err->code, "\n",
				"\n - HTTP Message......: ", $err->message, "\n",
				"\n - Twitter error.....: ", $err->error, "\n";
			
			# end if
			};


	# the per-endpoint counts are under the 'resources' key of the returned hash
	$friendships_show_remaining = $user_info->{resources}{friendships}{'/friendships/show'}{remaining};
	print "friendships_show_remaining $friendships_show_remaining\n";

#		print Dumper $friendships_show_remaining;

for my $endpoint ( sort keys %{ $user_info->{resources}{friendships} } ) {
    print "$endpoint remaining: $user_info->{resources}{friendships}{$endpoint}{remaining}\n";
};

exit;


I am not the best Perl programmer, nor am I an expert at the Twitter API, so there may be a better/easier way to do this. Good luck with the scripts and let me know how they work for you. And follow me on Twitter at ScriptingMySQL and TonyDarnell.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots

 

Visit http://2044thebook.com for more information.

Using Perl to retrieve direct messages from Twitter, insert messages into a MySQL database and then delete the messages

In two earlier posts, I gave some examples on how to use Perl to send tweets stored in a MySQL database to Twitter, and then how to automatically reply to your retweets with a “thanks”. In this post, I will show you how to automatically download your direct messages from Twitter, store the messages in a MySQL database, and then delete them.

I don’t like the way Twitter makes me read my direct messages. Granted, the majority of them are not real messages. The message is usually thanking me for following the sender, and then there is a personal website link or a link to a product they are selling. But if I want to delete a direct message, I have to click on the message, click the trash can, and then confirm I want to delete the message. This process is too cumbersome.

I wrote a Perl script that connects to Twitter, downloads your direct messages, inserts them into a MySQL database, and then deletes the direct message. I had a year of direct messages in my Inbox, and in a few minutes, they were gone. But I still had a copy in my MySQL database, in case I wanted to go back and read them or respond.

Just like in the first post, you will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

One caveat: twitter has a rate limit on how often you may connect with your application – depending upon what you are trying to do. See Rate Limiting and Rate Limits for more information. So, check your limits before you try downloading a large number of direct messages.

Just like the other two apps, we will be using the Net::Twitter module, as well as a few more modules which are listed in the beginning of the script.

There are two kinds of direct messages – the messages you sent and the messages you have received. So, we will be looking at two different Perl scripts to retrieve each kind of message. In the Net::Twitter module, there are a lot of different variables you can capture. In these examples, I only grabbed what I thought I needed. If there are other data variables you want, you will have to modify the tables and scripts.

First, you will need to create two tables to store your direct messages – one for the sent messages and one for the received messages. Here are the CREATE TABLE statements for both tables:

CREATE TABLE `messages_sent` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `creation_datetime` datetime DEFAULT NULL,
  `message_id` bigint(20) DEFAULT NULL,
  `sender_screen_name` varchar(16) DEFAULT NULL,
  `recipient_screen_name` varchar(16) DEFAULT NULL,
  `message_text` varchar(140) DEFAULT NULL,
  `sender_friends_count` int(10) DEFAULT NULL,
  `sender_time_zone` varchar(64) DEFAULT NULL,
  `sender_description` varchar(160) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1


CREATE TABLE `messages_received` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `creation_datetime` datetime DEFAULT NULL,
  `message_id` bigint(20) DEFAULT NULL,
  `sender_screen_name` varchar(16) DEFAULT NULL,
  `recipient_screen_name` varchar(16) DEFAULT NULL,
  `message_text` varchar(140) DEFAULT NULL,
  `sender_friends_count` int(10) DEFAULT NULL,
  `sender_time_zone` varchar(64) DEFAULT NULL,
  `sender_description` varchar(160) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=1
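
Once messages start landing in the tables, you can browse the most recent ones with a simple query (a hypothetical example):

select creation_datetime, sender_screen_name, message_text
from messages_received
order by creation_datetime desc
limit 10;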

I am only storing a few of the variables. In each script, if you want to see all of the variables which are available, you will need to uncomment this line:

#          print Dumper $statuses;

When using the Dumper command, be sure that you are only downloading a single message. Otherwise, the script will pull all of the variables and their values for all of the messages. You only need one message to see all of the variables. To only retrieve one message, be sure that the value $number_of_messages is equal to 1:

$number_of_messages = 1;

Afterwards, you can change this value to whatever you want – just be sure to watch your Twitter limits.

The print Dumper $statuses; line will display all of the possible variables for a single message. For the “sent” script, there are about 140 variables. For the “received” script, there are about 67 variables.

Here is the “Get direct messages I sent to other people” script, which uses the messages_sent table:

#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
    
# ----------------------------------------------------------------------------
# get twitter direct messages sent
# ----------------------------------------------------------------------------

# you will need to fill in this information about your application and your twitter account
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      ssl                 => 1
);

$number_of_messages = 1;

  eval {
      my $statuses = $nt->sent_direct_messages({ count => $number_of_messages });

# uncomment this line and set the count above to 1 (one) to see all of the variables that are available
#          print Dumper $statuses;

	for my $status ( @$statuses ) {
		
		$creation_date = $status->{created_at};

		# convert $creation_date to MySQL datetime format
		#  0   1   2  3         4     5
		# Fri Sep 04 07:32:05 +0000 2015
		
		@creation_date_array = split(" ",$creation_date);
		
		$creation_date_month = $creation_date_array[1];
		
		if ($creation_date_month =~ "Jan") { $creation_date_month = "01"}
		if ($creation_date_month =~ "Feb") { $creation_date_month = "02"}
		if ($creation_date_month =~ "Mar") { $creation_date_month = "03"}
		if ($creation_date_month =~ "Apr") { $creation_date_month = "04"}
		if ($creation_date_month =~ "May") { $creation_date_month = "05"}
		if ($creation_date_month =~ "Jun") { $creation_date_month = "06"}
		if ($creation_date_month =~ "Jul") { $creation_date_month = "07"}
		if ($creation_date_month =~ "Aug") { $creation_date_month = "08"}
		if ($creation_date_month =~ "Sep") { $creation_date_month = "09"}
		if ($creation_date_month =~ "Oct") { $creation_date_month = "10"}
		if ($creation_date_month =~ "Nov") { $creation_date_month = "11"}
		if ($creation_date_month =~ "Dec") { $creation_date_month = "12"}
		
		$creation_date_day_of_month = $creation_date_array[2];
		$creation_date_year = $creation_date_array[5];
		$creation_date_time = $creation_date_array[3];
		$creation_datetime = "$creation_date_year-$creation_date_month-$creation_date_day_of_month $creation_date_time";

		$message_id = $status->{id};
		
		$sender_screen_name = $status->{sender}{screen_name};
		$sender_screen_name =~ s/\'/\~/g;

		$recipient_screen_name = $status->{recipient_screen_name};
		$recipient_screen_name =~ s/\'/\~/g;
				
		$message_text = $status->{text};
		# remove carriage returns and single tick marks
		$message_text =~ s/\n/ /g;
		$message_text =~ s/\'/\~/g;

		$sender_friends_count = $status->{recipient}{followers_count};

		$sender_time_zone = $status->{sender}{time_zone};
		
		$sender_description = $status->{sender}{description};
		$sender_description =~ s/\n/ /g;
		$sender_description =~ s/\'/\~/g;

		# uncomment this line if you want to print
		# print "$creation_date_day_of_month - $creation_datetime - $message_id - $sender_screen_name - $recipient_screen_name - $message_text - $sender_friends_count - $sender_time_zone - $sender_description\n";


$dbh = ConnectToMySql($Database);	
$query = "insert into messages_sent (creation_datetime,message_id,sender_screen_name,recipient_screen_name,message_text,sender_friends_count,sender_time_zone,sender_description) values ('$creation_datetime','$message_id','$sender_screen_name','$recipient_screen_name','$message_text','$sender_friends_count','$sender_time_zone','$sender_description')";
#print "\nquery $query\n";
$sth = $dbh->prepare($query);
$sth->execute();


# stop the program if we have an error with the database
if ( $sth->err )
{
	die "ERROR! return code:" . $sth->err . " error msg: " . $sth->errstr . "\n";
}

else

{
	 my $destroy_id = eval { $nt->destroy_direct_message("$message_id") };
}



# you can change this so you don't go over your twitter connection limits
sleep 5;


	# end for my $status
	}

# end eval
  };

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

Here is the “Get direct messages I received from other people” script, which uses the messages_received table:

#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;
use Data::Dumper;
       
# ----------------------------------------------------------------------------
# get twitter direct messages received
# ----------------------------------------------------------------------------

# you will need to fill in this information about your application and your twitter account
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      ssl                 => 1
);

$number_of_messages = 1;

  eval {
      my $statuses = $nt->direct_messages({ count => $number_of_messages });

			# uncomment this line and set the $number_of_messages variable above to 1 (one) to see all of the variables that are available
#			print Dumper $statuses;

	for my $status ( @$statuses ) {
		
		$creation_date = $status->{created_at};

		# convert $creation_date to MySQL datetime format
		#  0   1   2  3         4     5
		# Fri Sep 04 07:32:05 +0000 2015
		
		@creation_date_array = split(" ",$creation_date);
		
		$creation_date_month = $creation_date_array[1];
		
		if ($creation_date_month =~ "Jan") { $creation_date_month = "01"}
		if ($creation_date_month =~ "Feb") { $creation_date_month = "02"}
		if ($creation_date_month =~ "Mar") { $creation_date_month = "03"}
		if ($creation_date_month =~ "Apr") { $creation_date_month = "04"}
		if ($creation_date_month =~ "May") { $creation_date_month = "05"}
		if ($creation_date_month =~ "Jun") { $creation_date_month = "06"}
		if ($creation_date_month =~ "Jul") { $creation_date_month = "07"}
		if ($creation_date_month =~ "Aug") { $creation_date_month = "08"}
		if ($creation_date_month =~ "Sep") { $creation_date_month = "09"}
		if ($creation_date_month =~ "Oct") { $creation_date_month = "10"}
		if ($creation_date_month =~ "Nov") { $creation_date_month = "11"}
		if ($creation_date_month =~ "Dec") { $creation_date_month = "12"}
		
		$creation_date_day_of_month = $creation_date_array[2];
		$creation_date_year = $creation_date_array[5];
		$creation_date_time = $creation_date_array[3];
		$creation_datetime = "$creation_date_year-$creation_date_month-$creation_date_day_of_month $creation_date_time";

		$message_id = $status->{id};
		
		$sender_screen_name = $status->{sender}{screen_name};
		$sender_screen_name =~ s/\'/\~/g;
		
		$message_text = $status->{text};
		# remove carriage returns and single tick marks
		$message_text =~ s/\n/ /g;
		$message_text =~ s/\'/\~/g;

		$sender_friends_count = $status->{sender}{friends_count};

		$sender_time_zone = $status->{sender}{time_zone};
		
		$sender_description = $status->{sender}{description};
		$sender_description =~ s/\n/ /g;
		$sender_description =~ s/\'/\~/g;

		$recipient_screen_name = $status->{recipient_screen_name};
		$recipient_screen_name =~ s/\'/\~/g;
		
		# uncomment this line if you want to see the output
		# print "$creation_datetime - $message_id - $sender_screen_name - $recipient_screen_name - $message_text - $sender_friends_count - $sender_time_zone - $sender_description\n";

$dbh = ConnectToMySql($Database);	
$query = "insert into messages_received (creation_datetime,message_id,sender_screen_name,recipient_screen_name,message_text,sender_friends_count,sender_time_zone,sender_description) values ('$creation_datetime','$message_id','$sender_screen_name','$recipient_screen_name','$message_text','$sender_friends_count','$sender_time_zone','$sender_description')";
#print "\nquery $query\n";
$sth = $dbh->prepare($query);
$sth->execute();

# stop the program if we have an error with the database
if ( $sth->err )
{
	die "ERROR! return code:" . $sth->err . " error msg: " . $sth->errstr . "\n";
}

else

{
	 my $destroy_id = eval { $nt->destroy_direct_message("$message_id") };
}

# you can change this so you don't go over your twitter connection limits
sleep 5;

	# end for my $status
	}

# end eval
  };

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

In the subroutine ConnectToMySql, I store the MySQL login credentials in a text file one directory above the script (it is opened as ../accessTweets). This file – named accessTweets – contains this information:

database_name
hostname or IP
MySQL user name
password

You may hardcode these values into the script if you don’t want to mess with storing them in a file.

I tested these scripts on two twitter accounts, and everything worked for me – but I ran out of messages quickly. Let me know if you have problems. I am not the best Perl programmer, nor am I an expert at the Twitter API, so there may be a better/easier way to do this.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots

 

Visit http://2044thebook.com for more information.

Using Perl and MySQL to automatically respond to retweets on twitter

In an earlier post titled Using Perl to send tweets stored in a MySQL database to twitter, I showed you a way to use MySQL to store tweets, and then use Perl to automatically send your tweets to twitter.

In this post, we will look at automatically sending a “thank you” to people who retweet your tweets – and we will be using Perl and MySQL again.

Just like in the first post, you will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

One caveat: twitter has a rate limit on how often you may connect with your application – depending upon what you are trying to do. See Rate Limiting and Rate Limits for more information. So, if you are going to put this into a cron job, I wouldn’t run it more than once every 15 minutes.

We will also be using the same tables we created in the first post – tweets and history – as well as a new table, named retweets. The retweets table will contain all of the user names and tweet ID’s for those retweets we have discovered and already sent a thank-you tweet response.

The Perl script will connect to your tweet history table, and retrieve a set of your tweet ID’s, with the most recent tweet first. The script will then connect to twitter and check to see if there are any retweets for each ID. If a retweet is found, the script will check your retweets table to see if you have already thanked the tweeter for the retweet. If this is a new retweet, the script will connect to twitter and send a “thank-you” message to that user, and then insert the user name and tweet ID into the retweets table. This will ensure that you do not send a thank-you response more than one time.

Here is a flow chart that will attempt to explain what the script does:

We will be using the API call/method retweets(id) to see if a tweet ID was retweeted, and then we will send the thank-you tweet via the update call. More information about the Perl twitter API may be found at Net::Twitter::Lite::WithAPIv1_1.

First we will need to create the retweets table, where we will store the information about our tweets that were retweeted. Here is the CREATE TABLE statement for the retweets table:

CREATE TABLE `retweets` (
  `id` int(8) NOT NULL AUTO_INCREMENT,
  `tweet_id` bigint(24) DEFAULT NULL,
  `user_name` varchar(24) DEFAULT NULL,
  `retweet_update` varchar(36) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=latin1

All you need to do is edit this script with your own consumer_key, consumer_secret, access_token and access_token_secret for your application, and edit the accessTweets file used by the subroutine ConnectToMySql. You may also comment-out the “print” commands.

#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;

my $Database = "tweets";

# Credentials for your twitter application
# you will need to subsitute your own application information for these four variables
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      traits              => [qw/API::RESTv1_1/],
      consumer_key        => "$consumer_key",
      consumer_secret     => "$consumer_secret",
      access_token        => "$access_token",
      access_token_secret => "$access_token_secret",
      ssl                 => 1
);

# Grab the last X number of tweets to check for retweets
# - determined by the number after "limit"

$dbh = ConnectToMySql($Database);
$query = "select tweet_id, tweet_update FROM history order by tweet_update desc, id limit 10";	
$sth = $dbh->prepare($query);
$sth->execute();

# loop through our results - one tweet at a time
while (@data = $sth->fetchrow_array()) {

	$tweet_id = $data[0];
	$tweet_update = $data[1];

	print "----------------------------------------------------------------------\n";
	print "Checking:  $tweet_id $tweet_update\n";
	print "----------------------------------------------------------------------\n";

		# Connect to twitter and see if anyone retweeted this tweet
		my $results = eval { $nt->retweets($tweet_id)};

		for my $status ( @$results ) {
       
			$user_name = "$status->{user}{screen_name}";
			$retweet_update = "$status->{created_at}";

			# see if this person has retweeted this before, and we already
			# have a record of the retweet in our database
					
			$dbh2 = ConnectToMySql($Database);
			$query2 = "select tweet_id, user_name FROM retweets where tweet_id = '$tweet_id' and user_name = '$user_name' limit 1";	
			$sth2 = $dbh2->prepare($query2);
			$sth2->execute();
    
			@data2 = $sth2->fetchrow_array();
    
			# Uncomment if you want to see it in action
			# print "Query: $query\n";
		
			# Check to see if we had any results, and if not, then insert
			# tweet into database and send them a "thank you" tweet
			if (length($data2[0]) < 1)
			{
				print "New retweet found: $tweet_id $user_name $retweet_update\n";

				$dbh3 = ConnectToMySql($Database);
				$query3 = "insert into retweets (tweet_id, user_name, retweet_update) values ('$tweet_id','$user_name','$retweet_update')";
				$sth3 = $dbh3->prepare($query3);
				$sth3->execute();

				# Uncomment if you want to see it in action
				# print "Query2: $query2\n";


				# ----------------------------------------------------------------------------
				# send tweet
				# ----------------------------------------------------------------------------

				# This pause is just to slow down the action - you can remove this line if you want
				sleep 5;
				
				my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
					traits              => [qw/API::RESTv1_1/],
					consumer_key        => "$consumer_key",
					consumer_secret     => "$consumer_secret",
					access_token        => "$access_token",
					access_token_secret => "$access_token_secret",
					ssl                 => 1
					);
					
					# Here is the message you want to send - 
					# the thank-you to the user who sent the retweet
					$tweet = "\@$user_name thanks for the retweet!";

					# send thank-you tweet
					my $results = eval { $nt->update("$tweet") };

						undef @data2;
						undef @data3;
					}
		
					else
				
					{
  						# we have already thanked this user - as their name and this tweet-id was found in the database
    					print "----- Found tweet: $tweet_id\n";
    
						while (@data2) {

							print "----------------------------------------------------------------------\n";
							print "Checking retweet by $user_name for $tweet_id\n";
							print "Found retweet:  $tweet_id $user_name $retweet_update \n";

							$tweet_id = $data2[0];
							$user_name = $data2[1];
					
							print "***** Retweet by $user_name already in database \n";
							print "----------------------------------------------------------------------\n";
					
							#exit;
							undef @data2;
							undef @data3;

							# This pause is just to slow down the action - you can remove this line if you want
							sleep 5;

							# end while
							}
						
					# end else
					}

		# end for my $status ( @$results ) {
		}

# end while
}

exit;

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   my ($db) = @_;

   open(PW, "<..\/accessTweets") || die "Can't access login credentials";
   my $db= <PW>;
   my $host= <PW>;
   my $userid= <PW>;
   my $passwd= <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);
   
   my $connectionInfo="dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}

In the subroutine ConnectToMySql, I store the MySQL login credentials in a text file one directory above the script (it is opened as ../accessTweets). This file – named accessTweets – contains this information:

database_name
hostname or IP
MySQL user name
password

I tested this on two twitter accounts, and everything worked for me – but let me know if you have problems. I am not the best Perl programmer, nor am I an expert at the twitter API, so there is probably a better/easier way to do this.

 


Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots

 

Visit http://2044thebook.com for more information.

Using Perl to send tweets stored in a MySQL database to twitter

Twitter is not my favorite social media site. Using twitter is like driving downtown, screaming what you want to say out the window, and hoping someone hears you. There might be tens of thousands of people downtown, but your message will only be heard by a few. Your best bet is to repeat your message as often as possible.

However, twitter is free and if you want to reach as many people (theoretically) as possible, you might as well use it. But sending tweets on a scheduled basis can be a pain. There are client programs available which allow you to schedule your tweets (Hootsuite is one I have used in the past). You can load your tweets in the morning, and have the application tweet for you all day long. But you still have to load the application with your tweets – one by one.

A friend of mine asked me if there was a way to send the same 200 tweets over and over again, spaced out every 20 minutes or so. He has a consulting business, and just wants to build up a list of twitter followers by tweeting inspirational quotes. If he tweets for twenty hours a day, and sends three quotes an hour, it will take him a little more than three days to burn through his 200 quotes. And he can always add more quotes or space out the tweets as necessary. I decided to write a Perl script to do this for him.

To start, we will need a MySQL database to store the tweets. I use MySQL’s Workbench product as my client application for connecting to MySQL. From within Workbench, I can create my tweet database:

CREATE DATABASE `tweets` /*!40100 DEFAULT CHARACTER SET latin1 */

I will then need a table inside my database to store my tweets.

CREATE TABLE `tweets` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tweet` char(140) DEFAULT NULL,
  `last_tweet_date` datetime NOT NULL DEFAULT '2015-01-01 00:00:00',
  `tweet_count` int(5) DEFAULT NULL,
  `tweet_length` int(3) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=100001 DEFAULT CHARSET=latin1

The tweet messages will be stored in the tweet column, and the last date the tweet was sent will have a time stamp in the last_tweet_date column. When we perform a search to find the next tweet, we will simply sort our search by the last_tweet_date and then id, and limit our output to one tweet. After we send the tweet, we will update the last_tweet_date column and send that tweet to the end of the line. The script will also incrementally change the tweet_count value (number of times the tweet has been sent), and record the length of the tweet in tweet_length. I do not do any error checking in the script to make sure the tweet was sent, but errors are printed.
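
In SQL terms, each run of the script boils down to a pair of statements like these (a sketch with hypothetical id values; the Perl script below builds the real ones with the current timestamp and the fetched id):

-- fetch the tweet that has waited the longest
select id, tweet, last_tweet_date, tweet_count from tweets order by last_tweet_date, id limit 1;

-- after sending, push it to the back of the line
update tweets set last_tweet_date = now(), tweet_count = tweet_count + 1 where id = 100001;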

We now need to insert some tweets into our table. Since my friend is going to be sending inspirational quotes, I found a few I can import. In order to make importing easier, all single quote marks (') will be replaced by the caret symbol (^). I can then swap these symbols back inside the Perl script. You could use a backslash (\) before the single quote, but I prefer a single-character substitution so I know how long the tweet will be.

I will also use the tilde (~) as a way to designate a carriage return in my tweet. The Perl script will replace the tilde with a carriage return (\n). Two tildes give me two carriage returns and a blank line.

insert into tweets (tweet) VALUES('I^m not afraid. -Luke~~You will be. -Yoda~~http://SomeWebSiteHere.com');
insert into tweets (tweet) VALUES('Do or do not.  There is no try.~~-Yoda~~http://SomeWebSiteHere.com');
insert into tweets (tweet) VALUES('No, I am your father.~~-Darth~~http://SomeWebSiteHere.com');

I also create a history table to store the tweet identification numbers. Each tweet is assigned a unique number by twitter, and this is how you can access this tweet. I save this information so I can delete the tweets later using this number. I have included a short script for deleting tweets near the end of this post.

CREATE TABLE `history` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `tweet` char(140) DEFAULT NULL,
  `tweet_id` varchar(30) DEFAULT NULL,
  `tweet_update` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=1000032 DEFAULT CHARSET=latin1
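
As a preview of that delete script, removing a tweet comes down to a single API call once you have its tweet_id from the history table. Here is a minimal sketch, assuming $nt is the authenticated Net::Twitter::Lite::WithAPIv1_1 object used in the script below and the id value is hypothetical:

# a tweet_id value read from the history table (hypothetical)
my $tweet_id = "654321098765432109";
my $result = eval { $nt->destroy_status($tweet_id) };
print "could not delete tweet $tweet_id: $@\n" if $@;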

You will need to register your application with twitter via apps.twitter.com, and obtain the following:

consumer_key
consumer_secret
access_token
access_token_secret

You will also need to register your mobile phone in order to link your twitter account to your application. I have not figured out how to use this script with someone else’s account, as the instructions for scripting Perl for use with twitter are not very thorough. I will try to add this at a later date.

Now that you have your application information, and all of your tables are created with data inserted, here is the Perl script for sending tweets. You will first need to install the Perl modules that the script uses, as shown below.
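If you don't already have the modules, they may be installed from CPAN. This assumes you have a configured cpan client; your system may use cpanm or packaged modules instead:

cpan Net::Twitter::Lite::WithAPIv1_1 DBI DBD::mysql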


#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use Scalar::Util qw(blessed);
use DBI;
use DBD::mysql;

my $Database = "tweets";

# your application credentials from apps.twitter.com
# (replace these placeholder values with your own)
my $consumer_key = "your_consumer_key";
my $consumer_secret = "your_consumer_secret";
my $access_token = "your_access_token";
my $access_token_secret = "your_access_token_secret";

# ----------------------------------------------------------------------------------
# this has to be near the top - as other parts of the script rely on these figures
# ----------------------------------------------------------------------------------

my ($sec,$min,$hour,$mday,$mon,$year) = localtime time;

# build a MySQL DATETIME string, zero-padding each value
$DateTime = sprintf("%04d-%02d-%02d %02d:%02d:%02d",
                    $year + 1900, $mon + 1, $mday, $hour, $min, $sec);

# ----------------------------------------------------------------------------------
# retrieve tweet from database
# ----------------------------------------------------------------------------------

$dbh = ConnectToMySql($Database);
$query = "select id, tweet, last_tweet_date, tweet_count FROM tweets order by last_tweet_date, id limit 1";
$sth = $dbh->prepare($query);
$sth->execute();

while (@data = $sth->fetchrow_array()) {
	$id = $data[0];
	$tweet = $data[1];
	$last_tweet_date = $data[2];
	$tweet_count = $data[3];
}

$tweet_original = $tweet;

# ----------------------------------------------------------------------------
# find tildes ~ and substitute with newlines
# find carets ^ and substitute with single quotes
# ----------------------------------------------------------------------------

$tweet =~ s/~/\n/g;
$tweet =~ s/\^/\'/g;

# ----------------------------------------------------------------------------------
# check length of tweet
# ----------------------------------------------------------------------------------

$tweet_length = length($tweet);

if (length($tweet) > 140) {
	print "Error - tweet is longer than 140 characters\n";
	exit;
}

# add to the tweet count
$tweet_count++;

# ----------------------------------------------------------------------------
# send tweet
# ----------------------------------------------------------------------------

my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      consumer_key        => $consumer_key,
      consumer_secret     => $consumer_secret,
      access_token        => $access_token,
      access_token_secret => $access_token_secret,
      ssl                 => 1
);

my $results = eval { $nt->update("$tweet") };
 
  if ( my $err = $@ ) {
      die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

      warn "HTTP Response Code: ", $err->code, "\n",
           "HTTP Message......: ", $err->message, "\n",
           "Twitter error.....: ", $err->error, "\n";
  } 

# ----------------------------------------------------------------------------
# update mysql with new date for last_tweet date/time
# ----------------------------------------------------------------------------

$dbh = ConnectToMySql($Database);
$query = "UPDATE tweets SET last_tweet_date = ?, tweet_count = ?, tweet_length = ? where id = ?";
$sth = $dbh->prepare($query);
$sth->execute($DateTime, $tweet_count, $tweet_length, $id);

# ----------------------------------------------------------------------------
# get the status id of the last tweet
# ----------------------------------------------------------------------------

# replace the twitter user name below with your own account name
my $statuses = eval { $nt->user_timeline({ user => "2044_The_Book", count => 1 }) };

for my $status ( @$statuses ) {
	$tweet_id = $status->{id};
}

if ( my $err = $@ ) {
      die $@ unless blessed $err && $err->isa('Net::Twitter::Lite::Error');

      warn "HTTP Response Code: ", $err->code, "\n",
           "HTTP Message......: ", $err->message, "\n",
           "Twitter error.....: ", $err->error, "\n";
} 

# ----------------------------------------------------------------------------
# put the special characters back before storing the tweet
# ----------------------------------------------------------------------------

$tweet =~ s/\n/~/g;
$tweet =~ s/'/^/g;

# save the tweet and its twitter id in the history table

$dbh = ConnectToMySql($Database);
$query = "insert into history (tweet, tweet_id, tweet_update) values (?, ?, ?)";
$sth = $dbh->prepare($query);
$sth->execute($tweet_original, $tweet_id, $DateTime);

#----------------------------------------------------------------------
sub ConnectToMySql {
#----------------------------------------------------------------------

   # the database name passed in is overridden by the first line
   # of the credentials file
   my ($db) = @_;

   open(PW, "<../accessTweets") or die "Can't access login credentials";
   $db = <PW>;
   my $host = <PW>;
   my $userid = <PW>;
   my $passwd = <PW>;

   chomp($db);
   chomp($host);
   chomp($userid);
   chomp($passwd);

   my $connectionInfo = "dbi:mysql:database=$db;host=$host;port=3306";
   close(PW);

   # make connection to database
   my $l_dbh = DBI->connect($connectionInfo,$userid,$passwd);
   return $l_dbh;

}



In the subroutine ConnectToMySql, I store the MySQL login credentials in a text file named accessTweets, located one directory above where my Perl script is located (note the ../ in the open statement). This file contains this information:

database_name
hostname or IP
MySQL user name
password

You can hard-code this information inside the script instead, if you prefer.

If you want to delete your tweets, you can create a script to access the tweets in your history table, and then delete them one at a time. Here is an example without the database connections:


#!/usr/bin/perl
 
use Net::Twitter::Lite::WithAPIv1_1;
use DBI;
use DBD::mysql;

# ----------------------------------------------------------------------------
# delete tweet
# ----------------------------------------------------------------------------

# replace these placeholder values with your own application credentials
my $consumer_key = "your_consumer_key";
my $consumer_secret = "your_consumer_secret";
my $access_token = "your_access_token";
my $access_token_secret = "your_access_token_secret";

# the twitter id of the tweet you want to delete (from the history table)
my $tweet_id = "replace_with_a_tweet_id";

my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
      consumer_key        => $consumer_key,
      consumer_secret     => $consumer_secret,
      access_token        => $access_token,
      access_token_secret => $access_token_secret,
      ssl                 => 1,
);

my $statuses = $nt->destroy_status({ id => "$tweet_id" });

exit;

Be sure to replace the value of $tweet_id with the value from the tweet you want to delete.
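If you want to remove everything you have sent, a loop along these lines would walk the history table and delete each tweet. This is only a sketch: it assumes the $nt object created in the delete script above, plus the ConnectToMySql subroutine and $Database variable from the main script, and it does not remove the rows from the history table:

# fetch every stored twitter id and delete the matching tweets
my $dbh = ConnectToMySql($Database);
my $sth = $dbh->prepare("select tweet_id from history");
$sth->execute();

while (my ($tweet_id) = $sth->fetchrow_array()) {
	print "deleting tweet $tweet_id\n";
	eval { $nt->destroy_status({ id => $tweet_id }) };
	warn "could not delete tweet $tweet_id: $@" if $@;
}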



Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots


Visit http://2044thebook.com for more information.

MySQL Enterprise Monitor – send advisor events to your chat client with Perl and Jabber

MySQL Enterprise Monitor (MEM) is part of the MySQL Enterprise Edition, and it provides real-time visibility into the performance and availability of all your MySQL databases. MEM and the MySQL Query Analyzer continuously monitor your databases and alert you to potential problems before they impact your system. It’s like having a “Virtual DBA Assistant” at your side to recommend best practices to eliminate security vulnerabilities, improve replication, optimize performance and more. As a result, the productivity of your developers, DBAs and System Administrators is improved significantly.

With MEM, you have a couple of notification options for receiving information when MEM raises an event alert. An event alert is a “significant deviation from the baseline performance trends, and best-practice Advisors recommended changes to configuration and variable settings to improve performance”. From: http://www.mysql.com/products/enterprise/monitor.html

You may choose to receive these alerts via email or SNMP. I am not going to go over how to set up these alerts, but instructions may be found here.

I have MEM installed on my home server, and since I have been using it for a while, I have been able to tune my instances and databases so that I rarely receive any alerts. I normally receive these alerts via email, but I don’t always check email for this particular account during the day. Since most of my colleagues are in other parts of the world, we use a Jabber chat client for quick communications during the day, so I wanted to figure out a way to receive a chat message whenever MEM had an alert. For my chat client, I use Adium – an open-source multi-protocol instant messaging client for Mac OS X that supports MSN, Jabber, Yahoo! and other networks. But this should work with any XMPP-based chat client application.

You will probably want to create a new POP3 email address for these alerts, as this script will delete the email messages from the server. If you want to keep an email copy of these alerts, you may add an additional email address to the alert notification group. If you use a free email service (like gmail), remember that it has a daily limit of the number of messages that you can send. I ran into this problem when testing the scripts, so I created my own email under one of my domains. I found two Perl scripts; one that acts as a POP3 client, and another that sends the Jabber message – and I combined them into one Perl script. You might need to do a little tweaking to get it to work with your service.

This is a sample email message from MEM. The first “Subject” line is the actual email subject. In MEM, you can customize what information you want in your email subject line via Email Notification Groups. I combined several parts of the email message into one message to be sent via chat.


(Email Subject Line)
(macserver01, MacServer01) - MEM WARNING Alert: User Has Rights To Database That Does Not Exist

(Email Body)

Subject: macserver01, MacServer01
Time: 2013-12-30 17:01:44 UTC (2013-12-30 12:01:44 EST)
Category: Security
Event: User Has Rights To Database That Does Not Exist
Problem Description
When a database is dropped, user privileges on the database are not automatically dropped. This has security implications as that user will regain privileges if a database with the same name is created in the future, which may not be the intended result.
Advice
Revoke privileges for all users on any databases that do not exist. The following users have privileges on databases that do not exist.

    ''@'%' on database test_%

Recommended Action
REVOKE ALL PRIVILEGES ON db_name.* FROM 'user_name'@'host_name';
Links and Further Reading
MySQL Manual: DROP DATABASE Syntax
MySQL Manual: GRANT Syntax
MySQL Manual: REVOKE Syntax
MySQL Manual: Privileges Provided by MySQL
MySQL Manual: How the Privilege System Works
Securing Your MySQL Installation
Securing a MySQL Server on Windows
Expression
%user% != THRESHOLD
Evaluated Expression

    ''@'%' on database test_% != ''

Copyright © 2005, 2013, Oracle and/or its affiliates. All rights reserved.

And here is the script. You may run it as a cron job, where it will check your email every few minutes and then send you a chat message when an email with an alert has arrived. You will need to modify the script to match your email and chat settings, and I have placed notes in the script to help guide you:

#!/usr/bin/perl -w

# POP3 client script source:
# http://forums.devshed.com/perl-programming-6/how-to-get-gmail-mail-by-mail-pop3client-in-perl-555889.html
# Author: a user named keath

# Jabber message script source: 
# http://dipinkrishna.com/blog/2010/12/perl-send-chat-message-gmail-buddy-jabber/
# Author: Dipin Krishna

use strict;
# for the email
use Mail::POP3Client;
# for the chat
use Net::Jabber;

# I was having a problem with the Debug module, so I just
# commented line 154 in /Library/Perl/5.16/Net/XMPP/Debug.pm

# this is the email address that you want to use to receive 
# the alert messages from MySQL Enterprise Monitor
my $user = 'MEMalerts@scriptingmysql.com';
my $pass = 'mypassword';

# you will need to use your POP3 server name
my $host = "pop.emailserver.com";

my $pop = new Mail::POP3Client(
	USER     => $user,
	PASSWORD => $pass,
	HOST     => $host,
	PORT     => 995,
	USESSL   => 'true',
);

# I have commented most of the print messages
# - you may uncomment them as you wish
my $count = $pop->Count();
if ($count < 0) {
	print $pop->Message();
} elsif ($count == 0) {
	print "no messages\n";
} else {
	
	#print "$count messages\n\n";
	
	for my $i (1 .. $count) {
	
	my $subject = "";
	my $message = "";
	
		# if you want to extract data from the head of the email
		# I am extracting data from the body
		#foreach ($pop->Head($i)) {
		foreach ($pop->Body($i)) {

			#print "$_\n" if /^(From|Subject|Date):/i;
			#print "$_\n" if /^(Subject|Date):/i;
			# my $message = "$_\n" if /^(Subject):/i;

			# I am building my message so that it contains the information in this order:
			# Category, Subject, Event

			if ($_ =~ "^Subject")
			{
				#print "$_\n";
				chomp $_;
				$subject = $_;
				$subject =~ s/Subject: //;
 			}

			if ($_ =~ "^Category")
			{
				#print "$_\n";
				chomp $_;
				$message = "$_ || $subject";
				$message =~ s/Category: //;
 			}

			if ($_ =~ "^Event")
			{
				#print "$_\n";
				chomp $_;
				
				$message = "$message || $_";
				$message =~ s/Event: //;
  						
					my $sttime=time;
						
					#print "Message: $_\n";
						
					# this is my Google Talk chat user name and password
					my $username = 'mem.scripting.mysql';
					my $password = 'mypassword';
					my $to = 'my_email_address';
					my $msg = "$message";
					#print "$to: $msg\n";
						 
					my $resource = "dipin";
					my $hostname = 'talk.google.com';
					my $port = 5222;
					my $componentname = 'gmail.com';
					my $Contype = 'tcpip';
					my $tls = 1;
						 
					my $Con = new Net::Jabber::Client();
					$Con->SetCallBacks(presence=>\&presence,
					message=>\&message );
						 
					my $status = $Con->Connect(
					hostname => $hostname, port => $port,
					componentname => $componentname,
					connectiontype => $Contype, tls => $tls);
						 
					if (!(defined($status))) {
						print "ERROR:  XMPP connection failed.\n";
						print "        ($!)\n";
						exit(0);
					} 
						 
					# Change hostname
					my $sid = $Con->{SESSION}->{id};
					$Con->{STREAM}->{SIDS}->{$sid}->{hostname} = $componentname;
						 
					# Authenticate
						
					#my @result = $Con->AuthSend(
					my @result = $Con->AuthIQAuth(
					username => $username, password => $password,
					resource => $resource);
						
					#print "Result:  $result[0] $result[1]\n";
						 
					if ($result[0] ne "ok") {
						print "ERROR: Authorization failed: $result[0] - $result[1]\n";
					}
					else
					{
						#print "Logged in successfully!\n";
						$Con->PresenceSend(show=>"Available");
						#print "Sending Message!\n";
						$Con->MessageSend(to=>"$to",
						subject=>"Test",
						body=>"$msg\n",
						priority=>10);
					}
						
						# END send Jabber message
 						# # # # # # # # # # # # # # # # # # # # # # # # # # #

				# this deletes the message from the server
				$pop->Delete( $i );
				
				# this exits after processing the first alert -
				# comment out this line if you want to process
				# all of the waiting messages in one run
				exit;
			}
		}
		print "\n";
	}
}

$pop->Close();

exit;
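To run the script as a cron job, a crontab entry along these lines will check the mailbox every five minutes (the script path is hypothetical; use your own):

*/5 * * * * /usr/bin/perl /path/to/mem_alert_to_jabber.pl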



Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots


Visit http://2044thebook.com for more information.

Creating and restoring database backups with mysqldump and MySQL Enterprise Backup – Part 1 of 2

Be sure to check out my other posts on mysqldump:
Scripting Backups of MySQL with Perl via mysqldump
Splitting a MySQL Dump File Into Smaller Files Via Perl

Part 1 of 2: (part two)
If you have used MySQL for a while, you have probably used mysqldump to back up your database. In part one of this blog, I am going to show you how to create a simple full and partial backup using mysqldump. In part two, I will show you how to use MySQL Enterprise Backup (which is the successor to the InnoDB Hot Backup product). MySQL Enterprise Backup allows you to back up your database while it is online, and it keeps the database available to users during backup operations (you don’t have to take the database offline or lock any databases/tables – but to do this, you need to use the --no-locking option).

This post will deal with mysqldump. For those of you that aren’t familiar with mysqldump:

The mysqldump client is a utility that performs logical backups, producing a set of SQL statements that can be run to reproduce the original schema objects, table data, or both. It dumps one or more MySQL databases for backup or transfer to another SQL server. The mysqldump command can also generate output in CSV, other delimited text, or XML format.

The best feature about mysqldump is that it is easy to use. The main problem with using mysqldump occurs when you need to restore a database. When you execute mysqldump, the database backup (output) is an SQL file that contains all of the necessary SQL statements to restore the database – but restoring requires that you execute these SQL statements to essentially rebuild the database. Since you are recreating your database, the tables and all of your data from this file, the restoration procedure can take a long time to execute if you have a very large database.

There are a lot of features and options with mysqldump – (a complete list is here). I won’t review all of the features, but I will explain some of the ones that I use.

Here is the command to use mysqldump to simply backup all of your databases (assuming you have InnoDB tables). This command will create a dump (backup) file named all_databases.sql.

mysqldump --all-databases --single-transaction --user=root --password > all_databases.sql

After you hit return, you will have to enter your password. You can include the password after the --password option (example: --password=my_password), but this is less secure, and you will get the following warning:

Warning: Using a password on the command line interface can be insecure.

Here is some information about the options that were used:

--all-databases - this dumps all of the tables in all of the databases
--user - The MySQL user name you want to use for the backup
--password - The password for this user.  You can leave this blank or include the password value (which is less secure)
--single-transaction - for InnoDB tables

If you are using Global Transaction Identifiers (GTIDs) with InnoDB (GTIDs aren’t available with MyISAM), you will want to use the --set-gtid-purged=OFF option. Then you would issue this command:

mysqldump --all-databases --single-transaction --set-gtid-purged=OFF --user=root --password > all_databases.sql

Otherwise you will see this warning:

Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass --set-gtid-purged=OFF. To make a complete dump, pass --all-databases --triggers --routines --events.

You can also execute a partial backup of all of your databases. This example will be a partial backup because I am not going to back up the default databases for MySQL (which are created during installation): mysql, test, performance_schema and information_schema.

Note: mysqldump does not dump the INFORMATION_SCHEMA database by default. To dump INFORMATION_SCHEMA, name it explicitly on the command line and also use the --skip-lock-tables option.

mysqldump never dumps the performance_schema database.

mysqldump also does not dump the MySQL Cluster ndbinfo information database.

Before MySQL 5.6.6, mysqldump does not dump the general_log or slow_query_log tables for dumps of the mysql database. As of 5.6.6, the dump includes statements to recreate those tables so that they are not missing after reloading the dump file. Log table contents are not dumped.

If you encounter problems backing up views due to insufficient privileges, see Section E.5, “Restrictions on Views” for a workaround.
(from: http://dev.mysql.com/doc/refman/5.6/en/mysqldump.html)
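For example, based on the note above, dumping INFORMATION_SCHEMA would look something like this:

mysqldump --skip-lock-tables --user=root --password information_schema > information_schema.sql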

To do a partial backup, you will need a list of the databases that you want to backup. You may retrieve a list of all of the databases by simply executing the SHOW DATABASES command from a mysql prompt:

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| comicbookdb        |
| coupons            |
| mysql              |
| performance_schema |
| scripts            |
| test               |
| watchdb            |
+--------------------+
8 rows in set (0.00 sec)

In this example, since I don’t want to backup the default mysql databases, I am only going to backup the comicbookdb, coupons, scripts and watchdb databases. I am going to use the following options:

--databases - This allows you to specify the databases that you want to backup.  You can also specify certain tables that you want to backup.  If you want to do a full backup of all of the databases, then leave out this option
--add-drop-database - This will insert a DROP DATABASE statement before each CREATE DATABASE statement.  This is useful if you need to import the data to an existing MySQL instance where you want to overwrite the existing data.  You can also use this to import your backup onto a new MySQL instance, and it will create the databases and tables for you.
--triggers - this will include the triggers for each dumped table
--routines - this will include the stored routines (procedures and functions) from the dumped databases
--events - this will include any events from the dumped databases
--set-gtid-purged=OFF - since I am using replication on this database (it is the master), I like to include this in case I want to create a new slave using the data that I have dumped.  This option enables control over global transaction identifiers (GTID) information written to the dump file, by indicating whether to add a SET @@global.gtid_purged statement to the output.
--user - The MySQL user name you want to use
--password - Again, you can add the actual value of the password (ex. --password=mypassword), but it is less secure than typing in the password manually.  This is useful for when you want to put the backup in a script, in cron or in Windows Task Scheduler.
--single-transaction - Since I am using InnoDB tables, I will want to use this option.

Here is the command that I will run from a prompt:

mysqldump --databases comicbookdb coupons scripts watchdb --single-transaction --set-gtid-purged=OFF --add-drop-database --triggers --routines --events --user=root --password > partial_database_backup.sql

I will need to enter my password on the command line. After the backup has completed, if your backup file isn’t too large, you can open it and see the actual SQL statements that will be used if you decide that you need to recreate the database(s). If you accidentally dump all of the databases into one file, and you want to separate the dump file into smaller files, see my post on using Perl to split the dump file.

For example, here is the section of the dump file (partial_database_backup.sql) for the comicbookdb database, with the table definitions truncated. (I omitted the headers from the dump file.)

--
-- Current Database: `comicbookdb`
--

/*!40000 DROP DATABASE IF EXISTS `comicbookdb`*/;

CREATE DATABASE /*!32312 IF NOT EXISTS*/ `comicbookdb` /*!40100 DEFAULT CHARACTER SET latin1 */;

USE `comicbookdb`;

--
-- Table structure for table `comics`
--

DROP TABLE IF EXISTS `comics`;
/*!40101 SET @saved_cs_client     = @@character_set_client */;
/*!40101 SET character_set_client = utf8 */;
CREATE TABLE `comics` (
  `serial_id` int(7) NOT NULL AUTO_INCREMENT,
  `date_time_added` datetime NOT NULL,
  `publisher_id` int(6) NOT NULL,
....

If you are using the dump file to create a slave server, you can use the --master-data option, which includes the CHANGE MASTER information, which looks like this:

--
-- Position to start replication or point-in-time recovery from
--

CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000013', MASTER_LOG_POS=79338;
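For example, a command along these lines would include that statement (a sketch; --master-data=2 writes the CHANGE MASTER statement as a SQL comment, while the default --master-data=1 writes it so that it executes when the dump is reloaded):

mysqldump --all-databases --single-transaction --master-data=2 --user=root --password > all_databases_master_data.sql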

If you used the --set-gtid-purged=ON option, you would see the value of the Global Transaction Identifiers (GTIDs):

--
-- GTID state at the beginning of the backup
--

SET @@GLOBAL.GTID_PURGED='82F20158-5A16-11E2-88F9-C4A801092ABB:1-168523';

You may also test your backup without exporting any data by using the --no-data option. This will show you all of the information for creating the databases and tables, but it will not export any data. This is also useful for recreating a blank database on the same or on another server.
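For example, a schema-only dump of the same four databases would look something like this:

mysqldump --databases comicbookdb coupons scripts watchdb --no-data --triggers --routines --events --set-gtid-purged=OFF --user=root --password > schema_only.sql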

When you export your data, mysqldump will create INSERT INTO statements to import the data into the tables. However, the default is for the INSERT INTO statements to contain multiple-row INSERT syntax that includes several VALUES lists. This allows for a quicker import of the data. But, if you think that your data might be corrupt, and you want to be able to isolate a given row of data – or if you simply want to have one INSERT INTO statement per row of data – then you can use the --skip-extended-insert option. If you use the --skip-extended-insert option, importing the data will take much longer to complete, and the backup file size will be larger.
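To illustrate the difference, the two output styles are shaped like this (the values are elided here; this is not actual dump output):

-- default multiple-row (extended) INSERT syntax:
INSERT INTO `comics` VALUES (1,...),(2,...),(3,...);

-- with --skip-extended-insert, one statement per row:
INSERT INTO `comics` VALUES (1,...);
INSERT INTO `comics` VALUES (2,...);
INSERT INTO `comics` VALUES (3,...);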

Importing and restoring the data is easy. To import the backup file into a new, blank instance of MySQL, you can simply use the mysql command to import the data:

mysql -uroot -p < partial_database_backup.sql

Again, you will need to enter your password or you can include the value after the -p option (less secure).

There are many more options that you can use with mysqldump (a complete list is in the reference manual at http://dev.mysql.com/doc/refman/5.6/en/mysqldump.html). The main thing to remember is that you should back up your data on a regular basis, and move a copy of the backup file off the MySQL server.

Finally, here is a Perl script that I use in cron to back up my databases. This script allows you to specify which databases you want to back up via the mysql_bak.config file. This config file is simply a list of the databases that you want to back up, with an option to ignore any databases that are commented out with a #. This isn’t a secure script, as you have to embed the MySQL user password in the script.

#!/usr/bin/perl
# Perform a mysqldump on all the databases specified in the mysql_bak.config file

use warnings;
use File::Basename;

# set the directory where you will keep the backup files
$backup_folder = '/Users/tonydarnell/mysqlbak';

# the config file is a text file with a list of the databases to backup
# this should be in the same location as this script, but you can modify this
# if you want to put the file somewhere else
my $config_file = dirname($0) . "/mysql_bak.config";

# example config file
# You may use a comment to bypass any database that you don't want to backup
# # Unwanted_DB    (commented - will not be backed up)
# twtr
# cbgc

# retrieve a list of the databases from the config file
my @databases = removeComments(getFileContents($config_file));

# change to the directory of the backup files.
chdir($backup_folder) or die("Cannot go to folder '$backup_folder'");

# grab the local time variables
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
$year += 1900;
$mon++;
#Zero padding
$mday = '0'.$mday if ($mday<10);
$mon = '0'.$mon if ($mon<10);

$hour = "0$hour" if $hour < 10;
$min = "0$min" if $min < 10;

# date/time stamp to append to each backup file name
my $date_stamp = "$year-$mon-${mday}_$hour.$min";

# dump each database to its own date-stamped file, then compress it
# (parts of this loop were reconstructed - replace the mysqldump path,
# user and password with your own values)
foreach my $database (@databases) {
	chomp $database;
	my $file = "${database}_$date_stamp.sql";
	print "Backing up $database to $file ... ";
	`/usr/local/mysql/bin/mysqldump --single-transaction --user=root --password=mypassword $database > $backup_folder/$file`;
	`compress $backup_folder/$file`;
	print "Done\n";
}
print "Done\n\n";

# this subroutine simply creates an array of the list of the databases

sub getFileContents {
	my $file = shift;
	open (FILE,$file) || die("Can't open '$file': $!");
	my @lines=<FILE>;
	close(FILE);

	return @lines;
}

# remove any commented tables from the @lines array

sub removeComments {
	my @lines = @_;

	my @cleaned = grep(!/^\s*#/, @lines); #Remove Comments
	@cleaned = grep(!/^\s*$/, @cleaned); #Remove Empty lines

	return @cleaned;
}
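Since the script compresses each dump, remember to uncompress a backup before reloading it. You can do this on the fly and pipe the output straight into mysql (the file name here is illustrative, following the naming pattern the script uses):

uncompress -c comicbookdb_2015-07-01_09.30.sql.Z | mysql -uroot -p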



Tony Darnell is a Principal Sales Consultant for MySQL, a division of Oracle, Inc. MySQL is the world’s most popular open-source database program. Tony may be reached at info [at] ScriptingMySQL.com and on LinkedIn.
Tony is the author of Twenty Forty-Four: The League of Patriots


Visit http://2044thebook.com for more information.