Archive for the ‘MySQL’ Category

MySQL: Automate your Merge Engines

|

A while ago, I’d written a bash script to handle the automated control of a table using the MERGE engine.  The table originally held the log data from a mobile messaging service, and very quickly became far  too big to be easily manageable.  Since the application was provided by a third party, and I had no way of changing how it logged it’s data, I devised this bash script to automatically reorganise a merge table to ensure the data was in manageable chunks.

This predates partitioning within MySQL, which would have been an ideal solution at the time.  However, there are still practical uses for the merge engine, so I’ve dug out the script for your enjoyment.  I’m sure it worked, as is, back in 2005, but I can’t recall if this was the totally polished and final version, and I’ve also sanitised it to remove references to the company and the third party app. So, if it doesn’t work perfectly, don’t be surprised, but it is certainly a good starting point if you want to do something similar.

This script also handles the creation of the new tables, so you don’t need to predefine the tables X years in advance.

#/bin/bash

################################################################################
# Merge engine rotate and clean script.                                        #
# What do I do?                                                                #
# Take latest week live  entries entries (till time on day script was run)     #
# and copy them into a new table.                                              #
# Modifies MASTER table (using “merge” storage engine) so it                   #
# displays data for only last x weeks. All other weekly tables remain          #
# individually available.                                                      #
#                                                                              #
# Executes as user: [whatever]                                                 #
# Execution frequency: [stick it in your crontab. Weekly would be good].       #
#                                                                              #
# Changelog:                                                                   #
# Initial state                                                                #
# Version 1.0                                                                  #
################################################################################

################################################################################
#Okay, before I begin, I need to know some stuff….
# How do I get into the database, and where do I go when I get there?
user=”xxxx”
passwd=”xxxx”

echo “setup login”

database=”dbname”  #the database i’m working with
tableprefix=”tablename” #the table that the live data is in
mergetable=”toptable” #the “merge” engine table
keycolumn=”id”  #the unique identifier in the live log table
logrotateday=”Monday” #the day I should rotate the logs
logretain=2 #how many weeks do we want under the merge table?

#Who do I send the “oh dear I’m not working” email to?
alert_email=”email@null.com”
echo “setup vars”

# Where do you want me to put the file with the table listing?
listpath=”/path/to/writeable/tempfile/” #path to mergetable listing file
listfile=”merge_tables.txt” #filename of mergetable listing

listfqp=”${listpath}/${listfile}” #file storing the merge tables

mysql=”/path/to/mysql/bin/mysql -u$user -p$passwd $database” # execute MySQL

# Where do you want me tell you about all the bad shit?
qrylogdate=`date +%Y-%m-%d` #date for use in queries.
errlog=”/path/tp/error_log_”$qrylogdate”.err”
echo “setup paths and files”
# If bad stuff happens, I’m going to do this…
function report_error(){
echo “Log rotation script exited” >> $errlog
tail -n 10 $errlog | mail -s “log rotation script has failed” “$alert_email”

}
echo “running….”
################################################################################
# Now I start doing stuff…

daytoday=`date +%A` #what day of the week is it?
#before any new tables are added, get the latest one.
lasttable=`grep “^$tableprefix” $listfqp | tail -n 1 | sed ’s/’${tableprefix}’_//g’`
newtable=$lasttable
oldtable=$lasttable
echo “${lasttable} , ${newtable},${oldtable}”
if [[ $logrotateday == $daytoday ]]
then
echo “It’s ${daytoday}! Lets create a new table.”

#Create new table. with _<var> suffix.
newtable=`expr $lasttable + 1`
echo “$newtable”
$mysql -e”CREATE TABLE ${tableprefix}_${newtable} like ${tableprefix}_${oldtable}” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at table creation” >> $errlog
report_error
exit
fi

# get a list of all the archive log tables, and insert into the
# merge_log_tables.txt file
echo “Writing merge table listing to file”
$mysql -e”show tables like ‘${tableprefix}_%’” > $listfqp 2>> $errlog
# parse file to take 6 most recent tables and create CSV list.
mergelist=`grep “^${tableprefix}_” $listfqp | tail -n 6 | awk ‘{ ORS=”,” ; print $0 }’|sed -e ’s#,$#\n#g’`
echo $mergelist
# modify the master log archive table to use only the latest 6 log tables.
echo “altering $mergetable with new union list”
$mysql -e”alter table $mergetable union($mergelist)” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at master archive update” >> $errlog
report_error
exit
fi
fi
# use a “select into” to copy all records inserted between the last entry in the
# archive, and midnight last night.
echo “copying data into ${tableprefix}_${newtable}”
$mysql -e”insert into ${tableprefix}_${newtable} select * from $tableprefix where id < (select max(${keycolumn}) from ${tableprefix})”  2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at ’select * into…..’” >> $errlog
report_error
exit
fi

# and delete this, and all preceding records from the live log.
echo “deleting copied data from $tableprefix”
$mysql -e”delete from $tableprefix where $keycolumn<=(select max($keycolumn) from ${tableprefix}_${newtable})” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at ‘deletion from live logs…..’” >> $errlog
report_error
exit
fi
echo “All done”
# All Done…  Bye bye
################################################################################

  • Share/Bookmark

A cheeky MySQL replication log flusher

|

Rather than relying on the expire_log_days feature to ensure your replicating master doesn’t fill up with obsolete binary logfiles, you can deploy this little SQL script.  It’s a bit cheeky and dirty, but it works.  You could use this to just keep your logs in check, or also for a system where multiple slaves are connecting to a master for periodic updates, and you can’t be sure which databases are up to date or not.

Pre-requisites are that the databases, the slave at least, supports federated tables, and that both support scheduled events and stored procedures.  other requirements are that replication is statement based, and when the events are created, they have the status “slave side disabled” on the slave servers.

A summary of the functionality is as follows:
On running the SQL on the master, a “replication check” database, two tables, a stored procedure and 2 events are created.
The tables hold the “heartbeat” information comprising the slave hostname and the timestamp of when the table was last updated.
The hostcheck table is federated from the slave database onto the slavecheck table on the master, so the master’s slavecheck table provides a reliable view of the last time the slave was updated.
One ‘heartbeat’ event runs every 10 minutes, where any slave running the replicated query would update it’s local hostcheck table with it’s hostname and the current timestamp.  This, via the federated table, updates the master with the same data.
The “replpurge” procedure runs daily, and purges all master logs before the lowest timestamp value (i.e. oldest) entry in the slavecheck table.
It should be noted that the “slavecheck” table requires to be populated with the hostnames of all slave servers that are required to operate within this system, and a corresponding date value which is early enough to ensure the master logs do not get prematurely purged.

The following SQL can be executed on any Master database, and when all slaves are “synced” it will automatically start working.

– Start SQL

DROP DATABASE IF EXISTS repl_check;

CREATE DATABASE repl_check;

USE repl_check;

GRANT INSERT,UPDATE,DELETE on repl_check.* to feduser@’%’ identified by ‘fedpass’;

DROP TABLE IF EXISTS `repl_check`.`hostcheck`;

DROP TABLE IF EXISTS `repl_check`.`slavecheck`;

CREATE TABLE  `repl_check`.`hostcheck` (

`hostname` varchar(255) NOT NULL,

`time_stamp` datetime NOT NULL,

PRIMARY KEY (`hostname`)

) ENGINE=FEDERATED CONNECTION=’mysql://feduser:fedpass@master.server.host:3306/repl_check/slavecheck’;

CREATE TABLE  `repl_check`.`slavecheck` (

`hostname` varchar(255) NOT NULL,

`time_stamp` datetime NOT NULL,

PRIMARY KEY (`hostname`)

) ENGINE=MyISAM ;

SET GLOBAL EVENT_SCHEDULER=’on’;

DROP EVENT IF EXISTS repl_check.replupdate;

CREATE EVENT repl_check.replupdate ON SCHEDULE EVERY 10 MINUTE DO REPLACE INTO repl_check.hostcheck (SELECT variable_value,NOW() FROM information_schema.global_variables WHERE variable_name=’hostname’);

DROP PROCEDURE IF EXISTS repl_check.binlog_cleanup;

DELIMITER $$

CREATE PROCEDURE  repl_check.binlog_cleanup()

BEGIN

DECLARE tstamp timestamp;

SELECT min(time_stamp) into tstamp from repl_check.slavecheck;

PURGE MASTER LOGS BEFORE tstamp;

END $$

DELIMITER ;

DROP EVENT IF EXISTS repl_check.replpurge;

CREATE EVENT repl_check.replpurge ON SCHEDULE EVERY 1 DAY DO CALL repl_check.binlog_cleanup();

  • Share/Bookmark

MySQL Partitioning: Automated Management

|

MySQL introduced partitioning with version 5.1, with alpha releases of 5.1.3 offering it available from December 2005. Since then, it’s  been developed further and with the relatively recent 5.2.41 release, many of the bugs and issues surrounding partitioning have been ironed out.

One issue I’ve had some users come across relates to the partition management on a live database. How big do partitions need to be? how often do we need to repartition? What do we do with the “old” data? How can we make the management of partitions easier in general?

I’ve addressed these questions with a stored procedure and by taking advantage of MySQL’s recent Event Scheduler.

My example uses a MyISAM table, partitioned on a bigint column which contains a unix timestamp and an ‘archive’ table that contains the same structure but using a different storage engine, for long-term storage and archiving.  My Sample also automates the creation of the log table, and can be run as a “deployment script” to set up a logging database ready for use.

First: create the log table:
DROP TABLE IF EXISTS table_log;
set @qry := concat(‘CREATE TABLE table_log (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
unix_time bigint(20) unsigned NOT NULL,
logging_stuff varchar(512) DEFAULT NULL,
PRIMARY KEY (id,unix_time)
) ENGINE=MyISAM
PARTITION BY RANGE ( unix_time) (
PARTITION  p0 VALUES LESS THAN (‘,unix_timestamp(now())*1000,’) ENGINE = MyISAM)’);
prepare stmnt from @qry;
execute stmnt;
Then, create the archive table
DROP TABLE IF EXISTS table_log_archive;
CREATE TABLE table_log_archive (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
unix_time bigint(20) unsigned NOT NULL,
logging_stuff varchar(512) DEFAULT NULL,
PRIMARY KEY (id,unix_time)
) ENGINE=MyISAM
And now the stored Proc:
DROP PROCEDURE IF EXISTS partition_logs;
DELIMITER $$
CREATE  PROCEDURE  partition_logs(in tblname varchar(100),in partsize int,in lwater int,in hwater int)
BEGIN
DECLARE p_start_time bigint;
DECLARE p_end_time bigint;
DECLARE p_start_name bigint;
DECLARE p_end_name bigint;
DECLARE p_low_water bigint;
DECLARE p_high_water bigint;
DECLARE p_new_name bigint;
DECLARE p_new_time bigint;
DECLARE cnt int;
DECLARE deld int;
DECLARE partition_column varchar(100);
set partsize = partsize * 3600; — turn the passed partition size in hours, into the unixtimestamp.
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_start_name FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_end_name  FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position desc limit 1;
SELECT partition_description  into p_start_time FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT partition_description  into p_end_time  FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position desc limit 1;
SELECT partition_expression into partition_column from information_schema.PARTITIONS where table_name = tblname limit 1;
select unix_timestamp(date_sub(date(now()), interval lwater day)) into p_low_water;
select unix_timestamp(date_add(date(now()), interval hwater day)) into p_high_water;
set @plw := p_low_water;
set @tbl := tblname;
set @col := partition_column;
set @ins := concat(‘replace into ‘,@tbl,’_archive select * from ‘,@tbl,’ where ‘,@col,’ < ‘,@plw);
prepare instmnt from @ins;
execute instmnt;
select p_start_time,p_low_water;
set deld = 0;
while (p_start_time < p_low_water)
do
select p_start_time,p_low_water;
set @pstart := p_start_name;
set @droppart := concat(‘ALTER TABLE ‘,@tbl,’ drop partition p’,@pstart);
prepare dropstate from @droppart;
execute dropstate;
SELECT partition_description  into p_start_time FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_start_name FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
set deld = deld+1;
END WHILE;
set cnt=0;
while (p_end_time < p_high_water)
do
set p_end_name = p_end_name + 1;
IF (p_end_time < p_low_water)
THEN set p_end_time = p_low_water;
ELSE set p_end_time = p_end_time + partsize;
END IF;
set @pendname := p_end_name;
set @phighwater := p_end_time;
set @alter_log :=  concat(‘ALTER TABLE ‘,@tbl,’ ADD PARTITION (PARTITION p’,@pendname,’ VALUES LESS THAN(‘,@phighwater,’))’);
prepare stmnt from @alter_log;
execute stmnt;
set cnt = cnt+1;
END WHILE;
select partsize as ’seconds’,p_low_water as ’start at’,p_high_water as ’stop at’,cnt as ‘partitions created’,deld as ‘partitions dropped’;
END$$
This procedure takes advantage of the partition information being held in the information_schema, and uses it to calculate the current, obsolete and new partitions.  it takes any “obsolete” data dumps it into the _archive table before dropping the partitions. In addition, it’s flexible enough that it can be called against any table that’s partitioned in the same way thus:
call partition_logs(‘table_name’,[hours per partition],[number of days to retain],[number of days in advance])
e.g. call partition_logs(‘table_log’,2,7,7);
…will repartition the table_log table, dumping any data older than 7 days old into the archive table and removing those partitions, and will create enough partitions for the next 7 days where each partition is 2hrs in size.
When called, the proc will also output some basic information on the specified partition size as well as how many partitions were dropped and created.  The only caveat with this procedure as it is, is that the “top” partition doesn’t use “less than max value” and has a hard limit.  This means that, if the procedure isn’t called when it should be, you may hit a ‘hard’ limit on the partitioned table where it won’t allow inserts due to the time integer being higher than the maximum partition value.
With a scheduled event calling this procedure on a regular basis, you have a completely automatic partition management system.
  • Share/Bookmark

Oracle to buy Sun. What impact on MySQL?

|

Sun.com press release: Oracle to buy Sun

The Register: Oracle reels in Sun Microsystems with $7.4bn buy

NYTimes: Oracle Agrees to Acquire Sun Microsystems

Today, Oracle and Sun have announced that they have entered into an agreement such that Oracle now owns Sun in a deal that’s worth  approximately $7.4 billion, or $5.6 billion net of Sun’s cash and debt.

So what impact will this have on Sun’s Java and MySQL products?  Currently, Oracle are planning on keeping MySQL within it’s RDBMS offerings [ http://www.oracle.com/sun/sun-faq.pdf ], but one can’t help wondering about the longer term view.

Oracle has a long history of not being very “opensource”, but I don’t think this will mean the end of the road for neither MySQL nor Java.  Oracle aren’t so short sighted as to kill off the world’s most used database in one fell swoop, especially one that’s not (yet) in direct competition to Oracle for the enterprise market it has made it’s home.

It will, however, certainly be intersting times for MySQL from both an operational and developmental perspective, already having dealt with one such acquisition when they were sold to Sun last year, and subsequently lost some of their key staff.

We need to remember, MySQL is opensource, and as such, cannot be “killed”.  Even if Oracle did decide to cease development of the lightweight, powerful and fast RDBMS that MySQL has become, they cannot stop the community that’s supported MySQL for so long from continuing development with the existing source, and indeed, with so many key MySQL figures now independant of Sun and MySQL, it could be argued that the potential is there for a 2nd generation MySQL to spin off within the opensource community from those who dislike Oracle and “big business”.  Perhaps the biggest risk from Oracle is not that they stop or kill off MySQL, but perhaps lessen the priority of enterprise features to lessen competition with their own system, as MySQL’s popularity grows.  Again, however, as MySQL is opensource, it would be nigh on impossible to completely prevent the development of such features.

If the worst happens, and Oracle kills MySQL, certainly, development and, to some extent support, would falter; but it certainly wouldn’t cease.  Indeed, it might give Drizzle the opportunity to further develop their offering, but being aimed at a slightly different market, and it’s slimmed down features, I don’t see it being a direct replacement for MySQL.

Still, with my livliehood riding on my experience of MySQL, the coming months will be interesting times and I’ll be watching developments closely.

  • Share/Bookmark

Monty says: Time to move on

|

Monty in Prague (from Wikipedia)

Monty says: Time to move on.

It’s been a long time coming, and the expected resignation of Monty from MySQL has been the source of much rumour and speculation ever since the news of the deal with Sun broke.

It’s been no secret that there has been disagreement between Monty and Sun/MySQL on recent database releases, and he was quite vocal about his issues with  MySQL 5.1 going GA.

That being said, Monty does recognise that Sun was the “best possible buyer” for MySQL, and is dissapointed that things haven’t worked out better.  He’s planning on forming his own company, dedicated to Open Source, and will continue to work on the Maria project, with some of the Maria team.  Nothing will change in that respect, and Maria will still be released (in binary form) soon.

Monty’s departure from Sun follows that of the co-founder of MySQL, David Axmark, who left to pursue his own ventures and ideas, promoting open source software.  David is also known for his dislike of the rules and structure that working with a large company like Sun will bring.

  • Share/Bookmark
Get Adobe Flash playerPlugin by wpburn.com wordpress themes