
Scaling PHP Applications (2014)

Case Study: Hot MySQL Backups w/ Percona

Being able to automatically back up MySQL to an off-site location is an important and often neglected part of building a redundant, failure-proof setup. The reason most people neglect backups is that once a database grows large, the sheer size of the files makes it difficult to work with. It's easy to back up 1GB of data, but it becomes much harder when we're talking about hundreds of gigabytes or even terabytes of data.

Backups are important for two reasons:

1. Allows you to restore your database in case of corruption or accidental loss of data (DELETE FROM foobar, DROP TABLE foobar, etc.)

2. Allows you to spin up a new slave server quickly. Instead of having to remove an existing slave in order to clone it, you can just restore from a backup and have the new slave catch up to the master via replication.

Furthermore, most of the built-in backup solutions for MySQL suck! Let’s talk about some of the not-so-good options.

mysqldump

mysqldump is a tool that comes with MySQL for backing up the database. Running mysqldump produces a text file backup of your database in the form of CREATE TABLE and INSERT statements. It’s the most basic way to take a backup and works well for users with small amounts of data.

The downsides: it's painfully slow, the backups take up much more space than the underlying data, and restores are extremely slow because every SQL statement has to be parsed and re-executed to recreate the underlying data files. mysqldump also requires a write lock to create consistent backups. A 100GB backup with mysqldump can take several hours, if not days.
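For reference, a basic run looks something like this (the user and output path are placeholders); restoring means feeding every one of those statements back through the server, which is exactly why it's so slow:

$ mysqldump -u root -p --all-databases > /backup/all-databases.sql
$ mysql -u root -p < /backup/all-databases.sql   # restore by replaying the whole dump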

mysqlhotcopy

mysqlhotcopy is another tool that comes out of the box with MySQL. It can make a hot copy of the raw data files while the server is running. It doesn’t support InnoDB, though, so it’s pretty much unusable for most people.

MySQL Enterprise InnoDB Hot Backup

Oracle provides an InnoDB backup solution that works similarly to mysqlhotcopy, but it costs $5000 per server. Unfortunately, that’s too expensive for most startups.

Backup Slave

Some people believe that having MySQL Slaves is “good enough” as a backup solution. Relying on MySQL Replication is a bad idea, though, because although it protects you from data corruption, it doesn’t protect you from rogue queries. For example, if you relied solely on replication as your backup, a DROP TABLE foobar would not only destroy your working dataset, but also your backup, since all statements are replicated.

That being said, it’s a great idea to have a dedicated slave for taking backups. When you combine having a backup slave with a correct backup solution, taking backups won’t impact the I/O performance of your production systems.

Copying the MySQL/InnoDB Data Files

One low-tech solution is to just make a copy of your InnoDB data files inside of your MySQL data directory. It’s able to back up the data as fast as the disk drive can spin and allows for quick restores.

The downsides: You need to shut down your MySQL database or grab a write lock on the database while the copy is being made. Failure to do so will result in a corrupted backup.
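As a rough sketch, assuming the default data directory and an init script named mysql, the low-tech approach looks like this:

$ service mysql stop
$ cp -a /var/lib/mysql /backup/mysql-$(date +%F)
$ service mysql start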

Taking it one step further with LVM

You can take this technique one step further (it isn't terribly bad, just risky) by using LVM to create snapshots of your data. LVM snapshots add a copy-on-write layer beneath the filesystem, so taking the snapshot is almost instantaneous. Again, the downside is that you need to be absolutely sure MySQL is shut down or holding a write lock, because failing to do so will produce corrupt backups.
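If you do go the LVM route, the dance looks roughly like this (the volume group and logical volume names are made up for the example, and the read lock has to be held in an open MySQL session while the snapshot is created):

mysql> FLUSH TABLES WITH READ LOCK;   -- keep this session open
$ lvcreate --snapshot --size 10G --name mysql-snap /dev/vg0/mysql
mysql> UNLOCK TABLES;
$ mount /dev/vg0/mysql-snap /mnt/mysql-snap
$ rsync -a /mnt/mysql-snap/ /backup/mysql-$(date +%F)/
$ umount /mnt/mysql-snap && lvremove -f /dev/vg0/mysql-snap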

More on LVM Backups here. (but don’t bother, the best solution is next)

The BEST (and free) Solution: Percona XtraBackup

Percona XtraBackup is a free and open-source hot-backup solution that works with both InnoDB and MyISAM databases. You don't even need to be using Percona Server to use it! Percona XtraBackup is able to make backups of your database while it's running, without having to lock tables or block writes.

On top of that, it's extremely fast (runs as fast as your disks), can back up multiple tables in parallel, and is easy to use.

$ apt-get install xtrabackup
$ innobackupex /backup/                # writes into a timestamped directory under /backup/
$ innobackupex --apply-log /backup/<timestamp>
$ tar -zcf backup.tar.gz -C /backup/<timestamp> .

What sort of magic does XtraBackup use? It's actually pretty clever. When you first run innobackupex, it makes a copy of the raw InnoDB files. This has the advantage of being extremely fast, since it's just a file system copy, mostly sequential reads and writes.

But, you're thinking, "Won't the data be corrupted? Don't you need to shut down MySQL or get a lock?" You're right, the copied data files will be inconsistent, and that's okay. The magic is in the next command.

The second command, innobackupex --apply-log, must be run to make the backup consistent. Basically, this command runs InnoDB crash recovery against the backup data: pointing an embedded InnoDB at the copied files, it sees that the data files are inconsistent and uses the InnoDB write-ahead log to bring them into a consistent state. Voilà! Your backup is done. Lastly (if you want), you can tar + gzip/lzop it to save space; compression roughly halves the size of our MySQL backups (more on lzop later in this case study).
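Restoring works in reverse. A minimal sketch, assuming the default data directory and a backup that has already been through --apply-log, sitting in /backup/<timestamp>:

$ service mysql stop
$ rm -rf /var/lib/mysql/*
$ innobackupex --copy-back /backup/<timestamp>
$ chown -R mysql:mysql /var/lib/mysql
$ service mysql start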

Moving from mysqldump to Percona XtraBackup

Before we discovered Percona and their awesome XtraBackup software, we used to use mysqldump to back up our database. When I say that it’s slow, know that I really mean it. Backing up terabytes of MySQL data with mysqldump is downright painful.

Before we had a dedicated slave server for making our backups, we would use one of our production slaves. It got to the point where making a backup would take over 24 hours and would severely impact the performance of the slave taking it. Furthermore, mysqldump requires a write lock to make consistent backups, which is unacceptable on a production system, so our backups came out inconsistent and not truly "point-in-time". It was an all-around bad situation.

Taking over 24 hours to perform a backup is nasty, and it was at this point that I decided we needed to move to something more robust.

Going into our new backup system, I had a couple of goals in mind for a “perfect” solution.

1. It should not impact the speed of the production systems.

2. Backups should be fast, ideally take less than 1 hour, so we can back up multiple times per day.

3. Recent backups should be stored on-site. All backups should be retained off-site.

4. The system should be very simple. Less complexity means it’s easier to verify our backup process is working.

Dedicated Backup Server

The first step was to get a dedicated backup server. The hardware doesn't need to be anything special: a large RAID-1 for storing local backups and a fast drive for running the MySQL instance. It doesn't even need much memory, since a write-only server doesn't need a big InnoDB Buffer Pool.

Percona XtraBackup

I ended up using a highly modified version of this script by Owen Carter, running via a Cron Job, to make the database backups. The backups are consistent, point-in-time, and don’t slow down or block replication. Multiple backups can even be taken at the same time.

I highly recommend adding the --slave-info option to the USEROPTIONS variable in the script, which includes the necessary Master_Log_File and Master_Log_Pos values in the backup so you can use it to bootstrap new slave servers.
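I won't reproduce the whole script here, but the relevant tweak is a one-liner (the user and password are placeholders). After the backup completes, XtraBackup writes the replication coordinates to a file called xtrabackup_slave_info inside the backup directory, ready to paste into a CHANGE MASTER TO statement:

USEROPTIONS="--user=backup --password=secret --slave-info"
$ cat /backup/<timestamp>/xtrabackup_slave_info
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000123', MASTER_LOG_POS=456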

More XtraBackup configuration options can be found here.

Data retention and long-term storage

I decided that I wanted to keep 7 days’ worth of backups on our local system and an indefinite amount of backups on Amazon S3 for long-term archival. The 7 days’ worth of backups are useful for spinning up new slave servers.
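Pruning the local copies is a one-liner in the backup cron. A sketch, assuming each backup lives in its own directory directly under /backup:

$ find /backup -mindepth 1 -maxdepth 1 -type d -mtime +7 -exec rm -rf {} +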

Since I don't want to deal with retaining terabytes worth of MySQL backups locally, we use Amazon S3 to keep a subset of our database backups. Using boto, the process is pretty easy to script inside of our backup cron. Gotcha here! You need to use the multi-part upload API when sending files bigger than 5GB to Amazon S3.
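If you'd rather stay in shell instead of boto, the AWS CLI's aws s3 cp handles multipart uploads for large files automatically. A sketch, reusing the $THISBACKUP variable from the backup script (the bucket name is a placeholder):

$ aws s3 cp $THISBACKUP.tar.gz s3://my-mysql-backups/$(date +%F)/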

We store the data compressed, which roughly halves the size of our MySQL backups. You can use gzip to gain the most space savings, but it's very slow. Instead, for large backups, I recommend lzop, a compressor that runs significantly faster at the cost of a slightly worse compression ratio. Additionally, you can run multiple lzop processes in parallel to speed things up even further.
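For comparison, piping the tar stream through each compressor looks like this (the directory name is a placeholder):

$ tar -cf - -C /backup/<timestamp> . | gzip > backup.tar.gz    # best ratio, slowest
$ tar -cf - -C /backup/<timestamp> . | lzop > backup.tar.lzo   # much faster, slightly larger output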

Notifications

Lastly, I wanted the backup system to be very visible. Since we use Campfire for our team communication, I was able to use the API to send a message in Campfire every time the backup script ran. It’s reassuring to wake up every morning and see “MySQL Backup Finished” posted in Campfire.

It’s as easy as adding something like this inside of the backup script:

SIZE=`ls -sh $THISBACKUP.tar.gz | cut -f 1 -d " "`
curl -i -u $KEY:X -H 'Content-Type: application/json' \
  -d "{\"message\":{\"body\":\"MySQL Backup Finished (size: $SIZE)\"}}" $CAMPFIRE_URL