Archive for the ‘technology’ Category

MySQL: Automate your Merge Engines

|

A while ago, I’d written a bash script to handle the automated control of a table using the MERGE engine.  The table originally held the log data from a mobile messaging service, and very quickly became far  too big to be easily manageable.  Since the application was provided by a third party, and I had no way of changing how it logged it’s data, I devised this bash script to automatically reorganise a merge table to ensure the data was in manageable chunks.

This predates partitioning within MySQL, which would have been an ideal solution at the time.  However, there are still practical uses for the merge engine, so I’ve dug out the script for your enjoyment.  I’m sure it worked, as is, back in 2005, but I can’t recall if this was the totally polished and final version, and I’ve also sanitised it to remove references to the company and the third party app. So, if it doesn’t work perfectly, don’t be surprised, but it is certainly a good starting point if you want to do something similar.

This script also handles the creation of the new tables, so you don’t need to predefine the tables X years in advance.

#/bin/bash

################################################################################
# Merge engine rotate and clean script.                                        #
# What do I do?                                                                #
# Take latest week live  entries entries (till time on day script was run)     #
# and copy them into a new table.                                              #
# Modifies MASTER table (using “merge” storage engine) so it                   #
# displays data for only last x weeks. All other weekly tables remain          #
# individually available.                                                      #
#                                                                              #
# Executes as user: [whatever]                                                 #
# Execution frequency: [stick it in your crontab. Weekly would be good].       #
#                                                                              #
# Changelog:                                                                   #
# Initial state                                                                #
# Version 1.0                                                                  #
################################################################################

################################################################################
#Okay, before I begin, I need to know some stuff….
# How do I get into the database, and where do I go when I get there?
user=”xxxx”
passwd=”xxxx”

echo “setup login”

database=”dbname”  #the database i’m working with
tableprefix=”tablename” #the table that the live data is in
mergetable=”toptable” #the “merge” engine table
keycolumn=”id”  #the unique identifier in the live log table
logrotateday=”Monday” #the day I should rotate the logs
logretain=2 #how many weeks do we want under the merge table?

#Who do I send the “oh dear I’m not working” email to?
alert_email=”email@null.com”
echo “setup vars”

# Where do you want me to put the file with the table listing?
listpath=”/path/to/writeable/tempfile/” #path to mergetable listing file
listfile=”merge_tables.txt” #filename of mergetable listing

listfqp=”${listpath}/${listfile}” #file storing the merge tables

mysql=”/path/to/mysql/bin/mysql -u$user -p$passwd $database” # execute MySQL

# Where do you want me tell you about all the bad shit?
qrylogdate=`date +%Y-%m-%d` #date for use in queries.
errlog=”/path/tp/error_log_”$qrylogdate”.err”
echo “setup paths and files”
# If bad stuff happens, I’m going to do this…
function report_error(){
echo “Log rotation script exited” >> $errlog
tail -n 10 $errlog | mail -s “log rotation script has failed” “$alert_email”

}
echo “running….”
################################################################################
# Now I start doing stuff…

daytoday=`date +%A` #what day of the week is it?
#before any new tables are added, get the latest one.
lasttable=`grep “^$tableprefix” $listfqp | tail -n 1 | sed ’s/’${tableprefix}’_//g’`
newtable=$lasttable
oldtable=$lasttable
echo “${lasttable} , ${newtable},${oldtable}”
if [[ $logrotateday == $daytoday ]]
then
echo “It’s ${daytoday}! Lets create a new table.”

#Create new table. with _<var> suffix.
newtable=`expr $lasttable + 1`
echo “$newtable”
$mysql -e”CREATE TABLE ${tableprefix}_${newtable} like ${tableprefix}_${oldtable}” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at table creation” >> $errlog
report_error
exit
fi

# get a list of all the archive log tables, and insert into the
# merge_log_tables.txt file
echo “Writing merge table listing to file”
$mysql -e”show tables like ‘${tableprefix}_%’” > $listfqp 2>> $errlog
# parse file to take 6 most recent tables and create CSV list.
mergelist=`grep “^${tableprefix}_” $listfqp | tail -n 6 | awk ‘{ ORS=”,” ; print $0 }’|sed -e ’s#,$#\n#g’`
echo $mergelist
# modify the master log archive table to use only the latest 6 log tables.
echo “altering $mergetable with new union list”
$mysql -e”alter table $mergetable union($mergelist)” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at master archive update” >> $errlog
report_error
exit
fi
fi
# use a “select into” to copy all records inserted between the last entry in the
# archive, and midnight last night.
echo “copying data into ${tableprefix}_${newtable}”
$mysql -e”insert into ${tableprefix}_${newtable} select * from $tableprefix where id < (select max(${keycolumn}) from ${tableprefix})”  2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at ’select * into…..’” >> $errlog
report_error
exit
fi

# and delete this, and all preceding records from the live log.
echo “deleting copied data from $tableprefix”
$mysql -e”delete from $tableprefix where $keycolumn<=(select max($keycolumn) from ${tableprefix}_${newtable})” 2>> $errlog
if [ "$?" != 0 ]
then
echo “failed at ‘deletion from live logs…..’” >> $errlog
report_error
exit
fi
echo “All done”
# All Done…  Bye bye
################################################################################

  • Share/Bookmark

A cheeky MySQL replication log flusher

|

Rather than relying on the expire_log_days feature to ensure your replicating master doesn’t fill up with obsolete binary logfiles, you can deploy this little SQL script.  It’s a bit cheeky and dirty, but it works.  You could use this to just keep your logs in check, or also for a system where multiple slaves are connecting to a master for periodic updates, and you can’t be sure which databases are up to date or not.

Pre-requisites are that the databases, the slave at least, supports federated tables, and that both support scheduled events and stored procedures.  other requirements are that replication is statement based, and when the events are created, they have the status “slave side disabled” on the slave servers.

A summary of the functionality is as follows:
On running the SQL on the master, a “replication check” database, two tables, a stored procedure and 2 events are created.
The tables hold the “heartbeat” information comprising the slave hostname and the timestamp of when the table was last updated.
The hostcheck table is federated from the slave database onto the slavecheck table on the master, so the master’s slavecheck table provides a reliable view of the last time the slave was updated.
One ‘heartbeat’ event runs every 10 minutes, where any slave running the replicated query would update it’s local hostcheck table with it’s hostname and the current timestamp.  This, via the federated table, updates the master with the same data.
The “replpurge” procedure runs daily, and purges all master logs before the lowest timestamp value (i.e. oldest) entry in the slavecheck table.
It should be noted that the “slavecheck” table requires to be populated with the hostnames of all slave servers that are required to operate within this system, and a corresponding date value which is early enough to ensure the master logs do not get prematurely purged.

The following SQL can be executed on any Master database, and when all slaves are “synced” it will automatically start working.

– Start SQL

DROP DATABASE IF EXISTS repl_check;

CREATE DATABASE repl_check;

USE repl_check;

GRANT INSERT,UPDATE,DELETE on repl_check.* to feduser@’%’ identified by ‘fedpass’;

DROP TABLE IF EXISTS `repl_check`.`hostcheck`;

DROP TABLE IF EXISTS `repl_check`.`slavecheck`;

CREATE TABLE  `repl_check`.`hostcheck` (

`hostname` varchar(255) NOT NULL,

`time_stamp` datetime NOT NULL,

PRIMARY KEY (`hostname`)

) ENGINE=FEDERATED CONNECTION=’mysql://feduser:fedpass@master.server.host:3306/repl_check/slavecheck’;

CREATE TABLE  `repl_check`.`slavecheck` (

`hostname` varchar(255) NOT NULL,

`time_stamp` datetime NOT NULL,

PRIMARY KEY (`hostname`)

) ENGINE=MyISAM ;

SET GLOBAL EVENT_SCHEDULER=’on’;

DROP EVENT IF EXISTS repl_check.replupdate;

CREATE EVENT repl_check.replupdate ON SCHEDULE EVERY 10 MINUTE DO REPLACE INTO repl_check.hostcheck (SELECT variable_value,NOW() FROM information_schema.global_variables WHERE variable_name=’hostname’);

DROP PROCEDURE IF EXISTS repl_check.binlog_cleanup;

DELIMITER $$

CREATE PROCEDURE  repl_check.binlog_cleanup()

BEGIN

DECLARE tstamp timestamp;

SELECT min(time_stamp) into tstamp from repl_check.slavecheck;

PURGE MASTER LOGS BEFORE tstamp;

END $$

DELIMITER ;

DROP EVENT IF EXISTS repl_check.replpurge;

CREATE EVENT repl_check.replpurge ON SCHEDULE EVERY 1 DAY DO CALL repl_check.binlog_cleanup();

  • Share/Bookmark

MySQL Partitioning: Automated Management

|

MySQL introduced partitioning with version 5.1, with alpha releases of 5.1.3 offering it available from December 2005. Since then, it’s  been developed further and with the relatively recent 5.2.41 release, many of the bugs and issues surrounding partitioning have been ironed out.

One issue I’ve had some users come across relates to the partition management on a live database. How big do partitions need to be? how often do we need to repartition? What do we do with the “old” data? How can we make the management of partitions easier in general?

I’ve addressed these questions with a stored procedure and by taking advantage of MySQL’s recent Event Scheduler.

My example uses a MyISAM table, partitioned on a bigint column which contains a unix timestamp and an ‘archive’ table that contains the same structure but using a different storage engine, for long-term storage and archiving.  My Sample also automates the creation of the log table, and can be run as a “deployment script” to set up a logging database ready for use.

First: create the log table:
DROP TABLE IF EXISTS table_log;
set @qry := concat(‘CREATE TABLE table_log (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
unix_time bigint(20) unsigned NOT NULL,
logging_stuff varchar(512) DEFAULT NULL,
PRIMARY KEY (id,unix_time)
) ENGINE=MyISAM
PARTITION BY RANGE ( unix_time) (
PARTITION  p0 VALUES LESS THAN (‘,unix_timestamp(now())*1000,’) ENGINE = MyISAM)’);
prepare stmnt from @qry;
execute stmnt;
Then, create the archive table
DROP TABLE IF EXISTS table_log_archive;
CREATE TABLE table_log_archive (
id int(10) unsigned NOT NULL AUTO_INCREMENT,
unix_time bigint(20) unsigned NOT NULL,
logging_stuff varchar(512) DEFAULT NULL,
PRIMARY KEY (id,unix_time)
) ENGINE=MyISAM
And now the stored Proc:
DROP PROCEDURE IF EXISTS partition_logs;
DELIMITER $$
CREATE  PROCEDURE  partition_logs(in tblname varchar(100),in partsize int,in lwater int,in hwater int)
BEGIN
DECLARE p_start_time bigint;
DECLARE p_end_time bigint;
DECLARE p_start_name bigint;
DECLARE p_end_name bigint;
DECLARE p_low_water bigint;
DECLARE p_high_water bigint;
DECLARE p_new_name bigint;
DECLARE p_new_time bigint;
DECLARE cnt int;
DECLARE deld int;
DECLARE partition_column varchar(100);
set partsize = partsize * 3600; — turn the passed partition size in hours, into the unixtimestamp.
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_start_name FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_end_name  FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position desc limit 1;
SELECT partition_description  into p_start_time FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT partition_description  into p_end_time  FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position desc limit 1;
SELECT partition_expression into partition_column from information_schema.PARTITIONS where table_name = tblname limit 1;
select unix_timestamp(date_sub(date(now()), interval lwater day)) into p_low_water;
select unix_timestamp(date_add(date(now()), interval hwater day)) into p_high_water;
set @plw := p_low_water;
set @tbl := tblname;
set @col := partition_column;
set @ins := concat(‘replace into ‘,@tbl,’_archive select * from ‘,@tbl,’ where ‘,@col,’ < ‘,@plw);
prepare instmnt from @ins;
execute instmnt;
select p_start_time,p_low_water;
set deld = 0;
while (p_start_time < p_low_water)
do
select p_start_time,p_low_water;
set @pstart := p_start_name;
set @droppart := concat(‘ALTER TABLE ‘,@tbl,’ drop partition p’,@pstart);
prepare dropstate from @droppart;
execute dropstate;
SELECT partition_description  into p_start_time FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
SELECT cast((substr(partition_name from 2)) as SIGNED ) into p_start_name FROM information_schema.PARTITIONS where table_name = tblname order by partition_ordinal_position limit 1;
set deld = deld+1;
END WHILE;
set cnt=0;
while (p_end_time < p_high_water)
do
set p_end_name = p_end_name + 1;
IF (p_end_time < p_low_water)
THEN set p_end_time = p_low_water;
ELSE set p_end_time = p_end_time + partsize;
END IF;
set @pendname := p_end_name;
set @phighwater := p_end_time;
set @alter_log :=  concat(‘ALTER TABLE ‘,@tbl,’ ADD PARTITION (PARTITION p’,@pendname,’ VALUES LESS THAN(‘,@phighwater,’))’);
prepare stmnt from @alter_log;
execute stmnt;
set cnt = cnt+1;
END WHILE;
select partsize as ’seconds’,p_low_water as ’start at’,p_high_water as ’stop at’,cnt as ‘partitions created’,deld as ‘partitions dropped’;
END$$
This procedure takes advantage of the partition information being held in the information_schema, and uses it to calculate the current, obsolete and new partitions.  it takes any “obsolete” data dumps it into the _archive table before dropping the partitions. In addition, it’s flexible enough that it can be called against any table that’s partitioned in the same way thus:
call partition_logs(‘table_name’,[hours per partition],[number of days to retain],[number of days in advance])
e.g. call partition_logs(‘table_log’,2,7,7);
…will repartition the table_log table, dumping any data older than 7 days old into the archive table and removing those partitions, and will create enough partitions for the next 7 days where each partition is 2hrs in size.
When called, the proc will also output some basic information on the specified partition size as well as how many partitions were dropped and created.  The only caveat with this procedure as it is, is that the “top” partition doesn’t use “less than max value” and has a hard limit.  This means that, if the procedure isn’t called when it should be, you may hit a ‘hard’ limit on the partitioned table where it won’t allow inserts due to the time integer being higher than the maximum partition value.
With a scheduled event calling this procedure on a regular basis, you have a completely automatic partition management system.
  • Share/Bookmark

A Hybrid that I’d Happily drive?

|

CMT-380Capstone Turbine are débuting  their CMT-380 hybrid car at the Los Angeles Auto Show this week.  It uses their 30 Kw C30 Microturbine as a generator to recharge Lithium ioin batteries when they run out of power, and houses the whole lot in the not-unattractive GTM Supercar from FactoryFive, which is normally sold as a kit.  The combination of electric motor, batteries and turbine gives the car a total range of 500 miles (between battery and bio-diesel), and some pretty respectable performance figures to boot: 0-60 mph in 3.9 seconds, 150 mph top speed not to mention the incredibly low emissions.

Unfortunately, while the CMT-380 is an attractive package, Capstone have no plans to build it.   If they did, however, I think it would be a far more attractive proposition than a Tesla Roadster or a Toyota Prius.

Capstone CMT-380: http://www.capstoneturbine.com/news/story.asp?id=536
FactoryFive GTM kit: http://www.factoryfive.com/gtmhome.html

  • Share/Bookmark

Progress: Workshop, bike and home.

|

Amazing what you can get done when you get up early at the weekend.

Saturday, we made a fair bit of progress on getting Gayle’s workroom sorted, with a new cutting table fitting over the sofabed.  Nice large panel of Spruce plywood should give ample space, and easily be strong enough.  Still have to work out a way of attaching it to the wall in a way that can allow it to be easily removed, but it’s getting there.  Also sorted out some shelves and a bit more hanging space.  I still need to clear out the “office” a bit more, as it’s more like a large cupboard right now, and it should allow storage for far more boxes of fabric and materials than it currently allows, and I might even manage to get a chair in at the desk.

Spent Sunday getting the lathe sorted and generally organising the workshop a bit more.  The lathe wasn’t running, and I couldn’t work out why, but tracked it down to a microswitch that was working opposite to how I was expecting.  Tracking this down wasn’t helped by a multimeter that had a near-flat battery, so when the good connection was sensed, it wasn’t going “beep”.  DOH!  Final task is to drill a hole in the coolant tray and fit the coolant drain pipe, then it’ll be good to go.  Other workshop related tasks were two sets of new strip-lights so I can see what I’m doing, and installing the new TIG welder in it’s home.  Just have to find a decent, local, Argon supplier now.

Modified the exhaust hanger to accept the slightly higher positioning of the can, due to the new exhaust, and the existing, complete bike is good to go again (and sounds awesome).  However, the end-can I’ve got on there sounds a little too loud.  I think it needs repacking.  Just as well I’ve recently added a compressor driven riveter to the tool box.

Shuffled the other bikes around on the drive to give me access to the “spare” bike on which much of the work is going to be done.  Frustratingly I found I had two chains on it, not just the one I knew about, and I can’t find the key for it anywhere.  Not to worry, the chain goes through the backwheel and over the seat, and since one of the first tasks is to remove the subframe, that’s not a big deal.

Got to work on the spare bike to get as much done as possible in the fading light, and removed the subframe, disconnected much of the wiring and coolant system.  Disconnected and removed the footpegs and gear linkage, and started on the engine mountings.  The plan is to get this down to a rolling frame and move it into the workshop where I can get to work on it, rather than relying on a few snatched minutes of fading light on the driveway.  Looks like the carbs are a bit knackered, since there was absolutely no movement on them, but I’m not too worried as I have a spare set on the shelf in good working order, lacking only a small spring that goes between bodies, and I’m intending on changing the carbs in the future anyway.

  • Share/Bookmark
Get Adobe Flash playerPlugin by wpburn.com wordpress themes