The inner workings of a hard disk drive

Backing Up a WordPress Site

🌳 Evergreen
In

Since I last updated this article, I have moved even further away from fully-self hosting my web sites. I now use a managed server (and therefore still managed databases). This necessitated further adjusting my scripts to suit some differences in the way my host is now run. I will leave the article unaltered in this respect, as I think my latest customisations are really only valid for my host, and the description below is much more widely applicable. Some minor edits have, however, been made to the text.

I run multiple — too many — WordPress sites and these need to be regularly backed up. I have a whole-of-server process which runs courtesy of my hosting provider, but I like to have more granular backup on my own computer for each web site I host. Historically, I have created a script for each site and had cron run them on the server. Then my computer runs its own cron (or launchd) task to download them with rsync. It is a process that has worked flawlessly for years.

When I opted to set up a new server, I tried to implement best practices and be consistent with the setup of each site. This is partly for my own sanity, but also as I began to set the new server up, I realised the previous one had grown organically and my scattered documentation didn’t always apply to every site. The new server was documented to a level where I felt I could build it from scratch again, just by following the steps.

Once I had some WordPress sites migrated to the new server, I wanted to migrate my backup scripts. In doing so I realised there was scope for using a single script to back up any site, and that’s what this post details.

The basic structure of my original scripts was pretty straightforward: dump the WordPress database to a file, then tar and gzip the entire document directory, including the dump, followed by some cleanup.

!#/bin/bash
BACKUPPATH=/var/www/mysite.com/public_html
mysqldump -u mydbuser --password='#not-my-password#' mydb > $BACKUPPATH/mysql_mysite.com.sql
rm -f backup_mysite.com.tar.gz
tar -czf $BACKUPPATH/backup_mysite.com.tar.gz $BACKUPPATH/*
rm -f $BACKUPPATH/mysql_mysite.com.sql

Aside from using a single, generic script, I also wanted to move the backups out of the document directories to a more central location. You will also notice the script includes the password. I started looking around for smarter ways to store that until I realised something really basic — the database details are already stored in the WordPress configuration file.

With much Googling for correct bash syntax, I figured out how to extract values from the wp-config.php file. The thing that took me the longest was figuring out that a) the result of the grep | sed pair included an end of line marker (which MySQL does not like) and b) that the marker is not the \n I would expect of the Unix world, but in fact a \r!!

Given the wp-config.php line:

define('DB_NAME', 'mydb');

Then the value mydb can be extracted with this bash assignment:

DB_NAME=$(grep 'DB_NAME' $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_NAME'\'', ?'\''(.*)'\'' ?\);\r/\1/')

Be sure to scroll the box above to see the complete line.

grep is used to locate the line and sed isolates the value in a match group, enabling the whole to be replaced with just the value we need. You can see the aforementioned \r at the end of the regular expression, and also note the crazy approach to matching the single quotes in a single quoted string. To match one quote, you first close the quote, then place an escaped quote, then re-open the quote to continue. The result is '\'' matches a single quote in the source value.

A twist I decided to include in my new script was to cope with the few non-WordPress sites I have. This was pretty simply done by testing for that wp-config.php file and only dumping the database if it is found.

The final script ended up looking like this.

#!/bin/bash
BACKUP_PATH=/var/www/$1
if [ -f $BACKUP_PATH/wp-config.php ]; then
  DB_PASS=$(grep 'DB_PASSWORD' $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_PASSWORD'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  DB_NAME=$(grep 'DB_NAME'     $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_NAME'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  DB_USER=$(grep 'DB_USER'     $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_USER'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  mysqldump -u $DB_USER --password=$DB_PASS $DB_NAME > $BACKUP_PATH/mysql_$1.sql
fi
rm -f ~/backups/backup_full_$1.tar.gz
cd $BACKUP_PATH
tar -czf ~/backups/backup_full_$1.tar.gz .
if [ -f $BACKUP_PATH/wp-config.php ]; then
  rm -f $BACKUP_PATH/mysql_$1.sql
fi

You will see that the sole argument passed to the script is the site name, or more specifically, the directory name within /var/www which is also used for the MySQL dump and archive names. The . as the content to archive is important as * does not match hidden files, so you’d lose your .htaccess and similar.

The script also removes the existing (now old) archive just prior to creating it anew, which means you can leave a backup lying around on the server until it is replaced.

The cron entries then look like this:

0 15 * * 6 ~/backup_site.sh mysite.com > /dev/null 2>&1

Saturdays at 15:00 UTC (Sundays at 03:00 for me) it backs up mysite.com to the ~/backups directory. I have multiple such entries, one for each site.

I then run rsync on my local computer later in the day to fetch all the latest backups.

rsync -rt me@myserver.com:~/backups/ ~/backups

This command relies on my private key to make an ssh connection to the server. The -rt options recurse — grabbing all files (backups) in the directory rather than just the directory itself — and maintain the timestamps of the files. By keeping the timestamps, rsync can optimise the transfer, skipping any files that haven’t changed since last fetched. Plus it can be useful to see, locally, when the backup was actually made.

So there you have it — backups of WordPress and static sites with easy expansion to as many sites as needed. Which, as I mentioned at the top, is too many in my case.

Restoring from one of these backups is pretty simple — a reversal of the basic steps taken to create them. The following commands were used for real after a recent WordPress update went wrong somehow and the entire site was reduced to “An error has occurred.”

sudo su -
cd /var/www/mysite.com
rm -rf *
tar -xvf /home/myuser/backups/backup_full_mysite.com.tar.gz
mysql -u root -p mysite_db < mysql_mysite.com.sql

Hopefully everything should be reasonably self explanatory, but line by line, here’s what they do.

I elected to switch to the root user for the restore, as most of the files are owned by the Apache user.

Next I changed to the site’s documents directory (the root of the site’s content).

I removed everything there. While it’s possible to restore ‘over the top’ it’s possible any problems might be caused by files added since the backup.

The tar command does most of the heavy lifting by unzipping and unpacking the contents of the archive back into the directory. The v flag means you will see a huge scree of file names whizz by showing what it is unpacking.

Finally, the mysql command executes the SQL dump which, in a single pass, removes and recreates each table with all of the contents.

Upon completing these steps, I refreshed my browser page and the site had sprung back to life.

Remote MySQL Servers

Since the original writing of this article, I switched to a managed database server. This meant that MySQL commands needed to take into account a host other than (the default) localhost.

I confess to not even considering the backups when I made the change, but the updated entries in my wp-config.php files, to allow WordPress to work, were enough for the backup scripts, too. What I only discovered in a moment of need, was that the restore process needed a significant change, albeit only to the final mysql command.

mysql -u mydb_user --password=mydb_password --database=mydb --host=remote.server.com -P 1234 --wait < mysql_mysite.com.sql

The managed database server did not allow root access, so I had to use the actual assigned user for the database in question, and therefore also supply the password for that. I elected to place the password in the command; I do not know if the remote server would successfully prompt for the password.

The biggest change is the specification of the database host address using the --host and -P (port) options. This will be the same as you have in your wp-config.php file, but with the port split out. (It does not work to include the port on the host name.)

Finally, I added the --wait flag to tell MySQL to wait and try to reconnect if the connection is lost. I did this because I did actually lose the connection when I ran the command for the first time. Luckily, the mysqldump output is such that any failure part way through can be addressed by running the whole thing again.