Backing Up a WordPress Site

Published
The inner workings of a hard disk drive

I run multiple — too many — WordPress sites and these need to be regularly backed up. I have a whole-of-server process which runs courtesy of my hosting provider, but I like to have more granular backup on my own computer for each web site I host. Historically, I have created a script for each site and had cron run them on the server. Then my computer runs its own cron (or launchd) task to download them with rsync. It is a process that has worked flawlessly for years.

Recently I opted (for reasons that will be a topic in a later post), to set up a new server and I have been trying to implement best practices and be consistent with the setup of each site. This is partly for my own sanity, but also as I began to set the new server up, I realised the previous one had grown organically and my scattered documentation didn’t always apply to every site. The new server has been documented to a level where I feel I could build it again just by following the steps.

At this time I have only two WordPress sites migrated to the new server, neither critical in nature (you are looking at one of them now), and I wanted to migrate my backup scripts. In doing so I realised there was scope for using a single script to back up any one site, and that’s what this post details.

The basic structure of the original scripts was pretty straightforward: dump the WordPress database to a file, then tar and gzip the entire document directory, including the dump, along with some cleanup.

!#/bin/bash
BACKUPPATH=/var/www/mysite.com/public_html
mysqldump -u mydbuser --password='#not-my-password#' mydb > $BACKUPPATH/mysql_mysite.com.sql
rm -f backup_mysite.com.tar.gz
tar -czf $BACKUPPATH/backup_mysite.com.tar.gz $BACKUPPATH/*
rm -f $BACKUPPATH/mysql_mysite.com.sql

Aside from using a single, generic script, I also wanted to move the backups out of the document directories to a more central location. You will also notice the script includes the password. I started looking around for smarter ways to store that until I realised something really basic — the database details are already stored in the WordPress configuration file.

With much Googling for correct bash syntax, I figured out how to extract values from the wp-config.php file. The thing that took me the longest was figuring out that a) the result of the grep | sed pair included an end of line marker (which MySQL does not like) and b) that the marker is not the \n I would expect of the Unix world, but in fact a \r!!

Given the wp-config.php line:

define('DB_NAME', 'mydb');

Then the value mydb can be extracted with this bash assignment:

DB_NAME=$(grep 'DB_NAME' $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_NAME'\'', ?'\''(.*)'\'' ?\);\r/\1/')

grep is used to locate the line and sed isolates the value in a match group, enabling the whole to be replaced with just the value we need. You can see the aforementioned \r at the end of the regular expression, and also note the crazy approach to matching the single quotes in a single quoted string. To match one quote, you first close the quote, then place an escaped quote, then re-open the quote to continue. The result is '\'' matches a single quote in the source value.

A twist I decided to include in my new script was to cope with the few non-WordPress sites I have. This was pretty simply done by testing for that wp-config.php file and only dumping the database if it is found.

The final script ended up looking like this.

#!/bin/bash
BACKUP_PATH=/var/www/$1
if [ -f $BACKUP_PATH/wp-config.php ]; then
  DB_PASS=$(grep 'DB_PASSWORD' $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_PASSWORD'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  DB_NAME=$(grep 'DB_NAME'     $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_NAME'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  DB_USER=$(grep 'DB_USER'     $BACKUP_PATH/wp-config.php | sed -E 's/define\( ?'\''DB_USER'\'', ?'\''(.*)'\'' ?\);\r/\1/')
  mysqldump -u $DB_USER --password=$DB_PASS $DB_NAME > $BACKUP_PATH/mysql_$1.sql
fi
rm -f ~/backups/backup_full_$1.tar.gz
cd $BACKUP_PATH
tar -czf ~/backups/backup_full_$1.tar.gz .
if [ -f $BACKUP_PATH/wp-config.php ]; then
  rm -f $BACKUP_PATH/mysql_$1.sql
fi

You will see that the sole argument passed to the script is the site name, or more specifically, the directory name within /var/www which is also used for the MySQL dump and archive names. The . as the content to archive is important as * does not match hidden files, so you’d lose your .htaccess and similar.

The script also removes the existing (now old) archive just prior to creating it anew, which means you can leave a backup lying around on the server until it is replaced.

The cron entries then look like this:

0 15 * * 6 ~/backup_site.sh mysite.com > /dev/null 2>&1

Saturdays at 15:00 UTC (Sundays at 03:00 for me) it backs up mysite.com to the ~/backups directory. I have multiple such entries, one for each site.

I then run rsync on my local computer later in the day to fetch all the latest backups.

rsync -rt me@myserver.com:~/backups/ ~/backups

This command relies on my private key to make an ssh connection to the server. The -rt options recurse — grabbing all files (backups) in the directory rather than just the directory itself — and maintain the timestamps of the files. By keeping the timestamps, rsync can optimise the transfer, skipping any files that haven’t changed since last fetched. Plus it can be useful to see, locally, when the backup was actually made.

So there you have it — backups of WordPress and static sites with easy expansion to as many sites as needed. Which, as I mentioned at the top, is too many in my case.


Heading photo by Shubham Bombarde on Unsplash.