Idiot's Guide to Backing Up MySQL on OS X | March 3, 2004

At work we use an old G4 as our file and development server. At the end of last week, while synchronising one of our local MySQL databases with the live database, I accidentally overwrote a local table with old data. Luckily we back up, so this wasn't a problem. However, I thought some of you might find it useful to know how we do this, just in case you ever end up in a similar predicament.

Backing up a local MySQL database on OS X is pretty simple. To do a backup, you use the mysqldump command, which dumps the contents of the specified databases into a text file as a series of SQL statements. The example below dumps the contents of my clowns database to the current directory.

(I'm using the ¬ symbol to indicate where a line wraps)

mysqldump --user=andy --password=secret¬
clowns > clowns.sql

If you want, you can add a number of parameters to change the way mysqldump behaves. In this example, I'm locking the tables before running the dump, adding DROP TABLE statements, and naming the file based on the day of the week. If I run this command each day, I'll end up storing 7 days of backups locally.

mysqldump --user=andy --password=secret clowns¬
--lock-tables --add-drop-table > clowns`date +%u`.sql

If I wanted, I could name the file based on the actual date and store one file for each day. This would probably make more sense, but storage space is pretty tight, so I'm happy with just the last 7 days.
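If you did want date-stamped files, only the format string passed to date needs to change. Something like this would do it (the +%Y-%m-%d format is just one choice):

mysqldump --user=andy --password=secret clowns¬
--lock-tables --add-drop-table > clowns`date +%Y-%m-%d`.sql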

The next thing to do is create a shell script. Open up a blank document, name it mysqlbackup.sh and then add the following lines.

#!/bin/sh
mysqldump --user=andy --password=secret clowns¬
--lock-tables --add-drop-table > clowns`date +%u`.sql;

In my backup script, I've added a bit of logic and error reporting so I know what's going on. If you wanted, you could also turn things like the username and password into variables to make editing the script easier.
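As a rough sketch of what that might look like (the variable names here are just my own choice, not part of the script below):

#!/bin/sh
# keep the connection details in one place so they're easy to change
DB_USER=andy
DB_PASS=secret
DB_NAME=clowns

mysqldump --user=$DB_USER --password=$DB_PASS $DB_NAME¬
--lock-tables --add-drop-table > $DB_NAME`date +%u`.sql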

Here is an example of the script I'm using.

#!/bin/sh

# echo start message
echo "Backup Script Processing"

# navigate to backup dir
if
	cd /Users/andy/clownbackups/
then
	echo "...successfully navigated to backup dir"
else
	echo "could not locate backup directory"
	exit 1
fi

# echo message
echo "exporting SQL dump"

if
	#dump the db into a .sql file
	mysqldump --user=andy --password=secret clowns¬
--lock-tables --add-drop-table > clowns`date +%u`.sql;
then
	#echo success message
	echo "SQL dump successful"
else
	#echo error message
	echo "mysqldump error"
	exit 1
fi

You can test your script by running it from the Terminal using the sh command. You'll see the output echoed to the screen and, if it's been successful, your MySQL dump will appear in the directory you specified.
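For example, assuming you saved the script in your home directory, a successful run looks something like this:

sh mysqlbackup.sh
Backup Script Processing
...successfully navigated to backup dir
exporting SQL dump
SQL dump successful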

Now this is pretty cool, but nobody wants to open up the Terminal and run this script every day before backing up to disk (you do back up every day, don't you?).

To get round this, you need to set up a cron job. When I first heard about cron jobs, they filled me with fear (as you can tell, I'm not hugely server savvy). However, cron jobs are actually very simple. A cron job is simply a task that you want to automate so that it runs at a set time. OS X runs a number of cron jobs by default, mostly involving cleaning up logs and temporary files. The file that manages the system cron jobs is called crontab (short for cron table) and on OS X is located at /private/etc.

This file is probably owned by root, so you'll need to sudo pico into it to make any changes. Once open, you'll see that it's a simple table of jobs (hence the name). The columns on the left set when each job runs, and the right-hand column is the command to perform. Here is an example of my default crontab file.

# /etc/crontab
SHELL=/bin/sh
PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin
HOME=/var/log
#
#minute hour    mday    month   wday    who     command
#
15      3       *       *       *       root    daily
30      4       *       *       6       root    weekly
30      5       1       *       *       root    monthly
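To open the file for editing with the necessary privileges, something like this will do (you'll be asked for your password):

sudo pico /private/etc/crontab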

What we need to do is add a new line to this file that tells OS X to run the backup script at a set time. We usually back up at 5:30pm, so I'm going to run my MySQL backup script at 5:15pm and then save the file. If you're not comfortable doing this manually, CronniX is a useful OS X GUI for editing your cron jobs.

 15     17      *       *       *       root    ¬
sh /private/etc/mysqlbackup.sh¬
>> /private/etc/cron_error.log 2>&1

In the above example I'm also sending the output to an error log so I can make sure the script is executing correctly. When cron runs the script, the backup file appears in my backup folder. However, if I open it, it's empty. Looking at the error log tells me that the mysqldump command couldn't be found. This may seem strange, as the same shell script works fine when run from the command line. The reason lies in environment variables.

When you log in to the Terminal, one of the first things that happens is OS X goes off and looks for a file called .tcshrc (if you're using tcsh) which contains customisation information about the user. One of these bits of information is the PATH, which tells OS X where certain programs are stored. If you run a cron job, you need to add this path information to your crontab. If you look at the above example of my crontab file, you'll see the following line.

PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin

What you need to do is add the path to mysql to this line. When I do this, my PATH becomes:

PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin¬
:/usr/local/mysql/bin

Depending on how you installed MySQL, your path information may be different. If you're not exactly sure where MySQL is stored, you could either do a quick locate or have a look in your .tcshrc file (it's quite interesting to see what's in there anyway). With this set, your specified databases should get backed up every day and you'll no longer have to worry about overwriting or losing data. For more fun, you can change your shell script to back up and download from a remote server.
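If you do need to track mysqldump down, either of these should do the job from the Terminal:

# if mysql is already on your interactive shell's PATH
which mysqldump
# otherwise, search the locate database
locate mysqldump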

A word of warning though. I'm no *NIX expert, so all the usual disclaimers apply. Don't talk to strangers, always wear clean underwear if you're going out, and make sure that you create copies of any system files you're editing. There are probably better ways of accomplishing this, but the above method is pretty simple and quite a good example of what can be done with a little command line tinkering on OS X.

Posted at March 3, 2004 9:48 AM

Comments

hudson said on March 4, 2004 4:58 AM

for even more security, you could use this script to back up the databases and whisk them offsite via email:

http://www.dunc-it.com/carrot.php

works great on OSX. it saves a database backup locally and can be configured to mail to a secure email drop box offsite.

Brandon said on March 4, 2004 3:43 PM

This helps me out hugely. I haven’t needed to rely on a backup yet, but I know I will eventually. I’ve dabbled in Unix now for several years, but never got much beyond the basics. I’m more interested in shell scripting, but don’t know enough to sit down and work something out. (quite frankly, I haven’t thought of a good thing to use a shell script for, though I know it can be incredibly useful). This gives me a practical use that I can actually put to work now. Thanks for the great explanation!

DarkBlue said on March 6, 2004 1:07 AM

I feel I should add a couple of points that could be incorporated into the backup script:

mysqldump reads the entire contents of a table into memory before writing it out. Obviously this can cause problems with large tables. mysqldump has a “--quick” option that speeds up the process, and minimises the memory impact, by writing each row as soon as it has been read (compare a 16Kb row with a table of 10000 * 16Kb rows).

mysqldump has another useful option, “--opt”, which speeds up the dump considerably (it incorporates “--quick”, so you don’t need to explicitly request “--quick” if you use “--opt”). “--opt” also locks all the tables for the duration of the dump to maintain integrity.

If your database is large, then obviously the resulting dump will be of a significant size too as it contains SQL defining the table structures as well as the data (unless you have explicitly excluded one or the other). So it is a good idea to compress the dumps on-the-fly as they are created:

mysqldump clowns | gzip > clowns`date +%u`.gz

You don’t need to “cd” in your script, just direct mysqldump to an absolute path:

mysqldump clowns | gzip > /Users/andy/clownbackups/clowns`date +%u`.gz

Although mysqldump is perfectly adequate for backing up databases, it is (relatively) inefficient. It hammers the database and the server (in terms of processor load and memory consumption).

A far more efficient way to backup the database is to copy from the file system:

cp -r /path/to/mysql/data /Users/andy/clownbackups

Of course, if you want to ensure integrity (and why wouldn’t you?) you will have to ensure that nothing gets written to the database during the copy.

You can either shutdown the MySQL server prior to the copy:

mysqladmin shutdown --user=blah --password=whatever

then bring it back up afterwards:

cd /path/to/mysql/bin;./safe_mysqld &

or you can lock the tables before the copy:

lock tables my_table1 WRITE, my_table2 WRITE…

and release the locks afterwards:

unlock tables

The very best solution, if you have the means, is to back up directly onto a MySQL server on a separate machine. This would obviously provide you with a “hot-swappable” database for recovery from any kind of failure.

Let’s assume you have MySQL databases running on a production server and a standby server with hostnames “production.andybudd.com” and “standby.andybudd.com” respectively. We want to automagically maintain a mirror database of the production database on the standby database:

mysqldump --add-drop-table clowns | mysql -h standby.andybudd.com clowns

What does this do?

“mysqldump --add-drop-table clowns” creates the dump of the “clowns” database, the output of which is “piped” through to the MySQL server running on host (-h) “standby.andybudd.com”, where it is imported into the “clowns” database. The “--add-drop-table” simply stops the standby server from rejecting the import due to duplicate table names.

Still with me?

Now we have a nice mirror that can be scheduled to run periodically (but not too frequently - remember the load and memory issues?). Let’s assume we’re following Andy’s schedule and backing up once per day, at 17:30.

We still don’t have a true mirror - since any updates that occur between backups will obviously be lost should we have to restore.

However, we can ensure that we are able to recover as much data as possible if we use MySQL’s “update log”. This keeps a record of all statements that modify the database (update, delete, insert, replace, create table, drop table, grant, revoke, etc) in SQL form - which can be used as input into MySQL. This means that, should disaster strike, we are in the enviable position of having a fully functioning server with a database up-to-date to the last backup (the previous 17:30) AND we have a record of every transaction that was enacted since that backup! Therefore, we can completely recover our database.

Let’s look at how we’d do this.

First we need to start the database server with logging enabled:

cd /path/to/mysql/bin;./safe_mysqld --log-update

Once per day our “copy to another server” takes place:

mysqldump --add-drop-table --flush-logs --opt clowns | mysql -h standby.andybudd.com clowns

Notice the addition of the “--flush-logs”? This tells MySQL to start with a fresh update log after the backup has completed successfully - thus, the updates SINCE the backup will be recorded.

Let’s pretend that a disaster occurs and the database on the production server goes down. We then bring “standby.andybudd.com” online and import our update log into the database:

mysql clowns < /path/to/mysql/data/dir/update_log --user=blah --password=noway

The “standby” database is then updated with every change that was made on production after standby’s database was last updated - and we have a fully restored database server. The whole restore process takes just minutes!

You just gotta love *nix and MySQL (oh yeah - and cron).

Jeez - I’ve been typing for ages, I hope this information is useful to someone! :-)

DarkBlue said on March 6, 2004 1:48 AM

Oh yeah, one more thing I should mention - If you were to use the mirror-database as suggested above and you suffered a catastrophic database failure, the act of importing the updates onto the standby will result in an entry in that server’s update log to reflect the change.

However, this new entry is not a complete record of the updates in the update log. Rather, it is a record of the import transaction.

Therefore, in order to be able to perform a subsequent recovery, you would need to retain the update log from the production server until a full backup, or mirror, has been performed on the standby server (which, by this time, has been promoted to a production server).

I hope that makes sense! :-O

david said on March 6, 2004 5:39 AM

timing is everything. today I downloaded the mysqladmin alpha from mysql.com. binaries are available for linux, mac os x and windows. it has an attractive gui that allows me to backup my databases with a few clicks. dumps it all into one tidy .sql file.

DarkBlue said on March 6, 2004 12:06 PM

Nice on David. MySQL-Front is another alternative that I often use (http://www.mysqlfront.de/).

DarkBlue said on March 6, 2004 12:07 PM

Obviously that should have read, “Nice one David”! 8-)

DarkBlue said on March 6, 2004 5:55 PM

I’ve been thinking a little more about this… if you have replicated databases across two or more servers, then you can run mysqldump without worrying about the impact on performance if you dump from one of the slaves rather than the master.
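For example, something along these lines would do it (slave1.andybudd.com being a made-up hostname for one of the slaves):

mysqldump -h slave1.andybudd.com --user=blah --password=whatever --opt --add-drop-table clowns | gzip > clowns`date +%u`.gz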

Andy Budd said on March 8, 2004 5:12 PM

Thanks for the great feedback DarkBlue. Some really useful additions you’ve mentioned there.

Andy Budd said on March 8, 2004 7:38 PM

For more info on cron, have a look at this

http://www.unixgeeks.org/security/newbie/unix/cron-1.html

DarkBlue said on March 8, 2004 10:03 PM

No, thank you - for posting an interesting, well-written article that inspired me to reappraise my own database backups and replication.

Because of your post I finally got off my backside and set up the slave servers I had been planning (oh, for about two years now). I now have replication and load balancing of my databases across four servers and can revert to any one of 42 snapshots (taken every 4 hours with 7 days’ worth of archiving).

And none of these backups and snapshots have any impact on performance whatsoever, since they are all taken, in rotation, from the slave servers.

I’m really proud of myself now! :-)

Thanks again Andy.

Jon said on March 13, 2004 2:24 PM

Help, Really want to understand this stuff as I develop with mysql/php on my Mac then have to transfer it all to the hosted server. Looks like there is a lot of scope in the info on this page to automate that quite a bit but……….
Tried to test the first part of the shell and got:

Jon-Reids-Computer:~ jonreid$ #!/bin/sh
-bash: !/bin/sh: event not found

Jon