Automating backup tasks with cron

To make this process more useful, we can automate the backups by creating a cron job that runs a shell script on a regular schedule.

It is also best practice to create a separate backup user account in your cloud project that only has rights to access object storage. The main reason is that scripts which run unattended need the account's password stored in plaintext, so restricting the account to a single role limits what could be done with that password if it were ever exposed.

Creating the backup user

To create a new user account, go to Management -> Project Users in the left-hand menu of the dashboard, then click the +Invite User button.

Fill in the Invite User form as shown, making sure the only Role selected is Object Storage.

../_images/invite_object_user.png

Once you receive the invite, complete the sign-in process as the new user. There should now be a new user with Object Storage as their only available role.

../_images/object_user.png

You can then download a copy of the backup user’s OpenStack RC file (see Source an openstack RC file), which provides the credential information for the following section.

Creating the backup scripts

Now we can create our backup process. This will consist of:

  • the backup script itself

  • the variables file to control the backup script and provide authentication information

  • the cron job to run the backup task

Here is a basic script to manage the running of a Duplicity backup. Typically, this would be placed somewhere like /usr/local/bin; the cron job at the end of this section calls it as /usr/local/bin/duplicity-backup.sh.

#!/bin/bash

# Source SWIFT access variables required by duplicity
source /etc/duplicity/duplicity-vars.sh
BACKUP_DEFINITIONS_DIR="/etc/duplicity/backup_sources.d"
BACKUP_CONFIG="${1}"

if [ -z "${BACKUP_CONFIG}" ]; then
   BACKUP_CONFIG='*'
fi

# Run backups defined in BACKUP_DEFINITIONS_DIR or only the one specified as $1
# The BACKUP_* variables must NOT be double-quoted, otherwise the shell will not expand the wildcard
for BACKUP_DEFINITION_FILE in ${BACKUP_DEFINITIONS_DIR}/${BACKUP_CONFIG}.conf
do
   # Make sure we don't have any leftover variables set before next loop run
   unset SRC
   unset DEST
   unset PRE_BACKUP_CMD
   unset POST_BACKUP_CMD
   unset DUPLICITY_BACKUP_RETENTION
   unset DUPLICITY_BACKUP_CYCLE
   unset DUPLICITY_VOLSIZE
   unset DUPLICITY_NUM_RETRIES

   # Source variables used on each loop run
   if [ ! -f "${BACKUP_DEFINITION_FILE}" ]; then
      INFO="No backups defined in ${BACKUP_DEFINITIONS_DIR}/ or ${BACKUP_DEFINITION_FILE} is not a file"
      echo $INFO
      continue
   fi
   # Source the main config file again as we overwrite some variables in backup definitions
   source /etc/duplicity/duplicity-vars.sh
   source "${BACKUP_DEFINITION_FILE}"

   # Check if the src and dest backup vars are not empty
   if [ ! -z "${SRC}" ] && [ ! -z "${DEST}" ]; then

      # Run defined tasks before doing the backup
      if [ ! -z "${PRE_BACKUP_CMD}" ]; then
         eval "${PRE_BACKUP_CMD}"
         rc=$?
         if [ ${rc} -gt 0 ]
         then
            # Error handling
            INFO="Pre backup command failed with rc = ${rc}"
            echo $INFO
            continue
         fi
      fi

      # Run backup
      duplicity --verbosity Notice \
                --full-if-older-than ${DUPLICITY_BACKUP_CYCLE} \
                --num-retries ${DUPLICITY_NUM_RETRIES} \
                --asynchronous-upload \
                --no-encryption \
                --volsize ${DUPLICITY_VOLSIZE} \
                "${SRC}" "${DEST}"
      rc=$?
      if [ ${rc} -gt 0 ]
      then
         # Error handling
         INFO="Backup failed with rc = ${rc}"
         echo $INFO
         continue
      fi

      # Duplicity cleanups
      duplicity remove-older-than ${DUPLICITY_BACKUP_RETENTION} --verbosity notice --force "${DEST}"
      rc=$?
      if [ ${rc} -gt 0 ]
      then
         # Error handling
         INFO="Deleting old backups failed with rc = ${rc}"
         echo $INFO
         continue
      fi

      # Duplicity collection status summary
      duplicity collection-status "${DEST}"
      rc=$?
      if [ ${rc} -gt 0 ]
      then
         # Error handling
         INFO="Collection status failed with rc = ${rc}"
         echo $INFO
         continue
      fi

      # Run a command after doing the backup
      if [ ! -z "${POST_BACKUP_CMD}" ]; then
         eval "${POST_BACKUP_CMD}"
         rc=$?
         if [ ${rc} -gt 0 ]
         then
            # Error handling
            INFO="Post backup command failed with rc = ${rc}"
            echo $INFO
            continue
         fi
      fi

   else
      INFO="No backup source or destination defined in ${BACKUP_DEFINITION_FILE}"
      echo $INFO
      continue
   fi

   # If the script managed to reach this point, all backup steps succeeded, so we can report success
   INFO="Backup succeeded"
   echo $INFO

done
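The way the loop chooses which definition files to process can be tried in isolation. The following is a minimal sketch under stated assumptions: `select_definitions` is an illustrative helper that is not part of the script above, and `home.conf` and `gitlab.conf` are hypothetical definition names created in a throwaway directory.

```shell
#!/bin/bash
# Illustrative helper (not part of the backup script): print the names of the
# definition files that the loop would visit for a given first argument.
select_definitions() {
   local dir="$1"
   local config="${2:-*}"   # default to every definition, as the script does
   local file
   # ${config} is deliberately left unquoted so the shell expands the wildcard
   for file in "${dir}"/${config}.conf; do
      # An unmatched pattern is left as literal text, so check it is a real file
      [ -f "${file}" ] && basename "${file}"
   done
   return 0
}

# Throwaway directory holding two hypothetical definitions
DEMO_DIR="$(mktemp -d)"
touch "${DEMO_DIR}/home.conf" "${DEMO_DIR}/gitlab.conf"

ALL_DEFS="$(select_definitions "${DEMO_DIR}")"          # no argument: every .conf file
ONE_DEF="$(select_definitions "${DEMO_DIR}" gitlab)"    # argument: only gitlab.conf
echo "${ALL_DEFS}"
echo "${ONE_DEF}"
```

Applied to the real script, this means that running it with no argument processes every definition, while passing a name such as `gitlab` (hypothetical) limits the run to `gitlab.conf`.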

The variables file below defines the control parameters for the backup tasks, such as retention and frequency, and provides the authentication information for object storage. The backup script expects to find it at /etc/duplicity/duplicity-vars.sh.

#!/bin/bash

# Variables used by the backup script

# Duplicity specific variables
export DUPLICITY_BACKUP_CYCLE='7D' #7 days
export DUPLICITY_BACKUP_RETENTION='14D' #14 days
export DUPLICITY_VOLSIZE='512' #volume (object chunk) size in megabytes
export DUPLICITY_NUM_RETRIES='3'

# Catalyst Cloud object storage credential information
export SWIFT_USERNAME='<your-backup-user>@<your-project-name>'
export SWIFT_REGIONNAME='nz_wlg_2'
export SWIFT_TENANTNAME='<your-project-name>'
export SWIFT_PASSWORD='<your-openrc-password>'
export SWIFT_AUTHURL='https://api.cloud.catalyst.net.nz:5000'
export SWIFT_AUTHVERSION='3'
export SWIFT_USER_DOMAIN_NAME="default"
export SWIFT_PROJECT_DOMAIN_NAME="default"
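Because this file holds a plaintext password, it is sensible to make it readable only by the account that runs the backup. The following is a minimal sketch, not a required step from the original procedure: `restrict_credentials` is a hypothetical helper, and the demonstration runs against a temporary file rather than the real path.

```shell
#!/bin/bash
# Hypothetical helper: make a credentials file readable and writable
# by its owner only (mode 600).
restrict_credentials() {
   chmod 600 "$1"
}

# Demonstrate on a temporary stand-in for /etc/duplicity/duplicity-vars.sh
VARS_FILE="$(mktemp)"
chmod 644 "${VARS_FILE}"
restrict_credentials "${VARS_FILE}"

# Show the resulting permission string (first ten characters of ls -l)
PERMS="$(ls -l "${VARS_FILE}" | cut -c1-10)"
echo "${PERMS}"
```

On a real host the equivalent would be to run `chmod 600 /etc/duplicity/duplicity-vars.sh` as root, with the file owned by the user that the cron job runs as.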

Then we need to create the backup definitions. Create a file in /etc/duplicity/backup_sources.d with a name relevant to the backup task. The name must end in .conf, because the backup script only picks up files matching *.conf. Add at least the following two entries:

SRC="/path/to/files/"
DEST="swift://<container-name>"

Depending on what you are backing up, you may also need to include a pre-backup command such as the one shown below. This ensures that the data you wish to capture, in this case the contents of a GitLab repository, has been written to disk before the backup task runs.

PRE_BACKUP_CMD="CRON=1 /opt/gitlab/bin/gitlab-rake gitlab:backup:create"
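Putting these entries together, a complete definition file might look like the sketch below. The file name, source path, container name and retention value are all illustrative assumptions; only the variable names come from the backup script. Since the script sources the definition file after the main variables file, a definition can also override a global default for a single backup.

```shell
# Hypothetical /etc/duplicity/backup_sources.d/gitlab.conf
# Paths, container name and retention value are illustrative assumptions.

# Directory to back up and the object storage container to send it to
SRC="/var/opt/gitlab/backups/"
DEST="swift://gitlab-backups"

# Ask GitLab to write a fresh backup archive before Duplicity runs
PRE_BACKUP_CMD="CRON=1 /opt/gitlab/bin/gitlab-rake gitlab:backup:create"

# Override the global default from duplicity-vars.sh for this backup only
DUPLICITY_BACKUP_RETENTION='30D'
```

Any variable that is not set here, such as `DUPLICITY_VOLSIZE`, keeps the value from /etc/duplicity/duplicity-vars.sh.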

Finally, create a new file called duplicity-backup-cron in /etc/cron.d/. This is the cron job that is responsible for running the backups. See cron for more information on this.

# Run the Duplicity backup script every day at 02:35 as root
35 2 * * * root /usr/local/bin/duplicity-backup.sh >> /var/log/backup/duplicity.log 2>&1
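The cron entry appends its output to /var/log/backup/duplicity.log. The shell cannot open that file if the directory is missing, so it is worth creating it before the first run. The following is a minimal sketch: `prepare_log` is a hypothetical helper, demonstrated under a throwaway directory so that it can be tried without root access.

```shell
#!/bin/bash
# Hypothetical helper: create a log directory and an empty log file inside it.
prepare_log() {
   local log_dir="$1"
   mkdir -p "${log_dir}" && touch "${log_dir}/duplicity.log"
}

# Demonstrate under a throwaway directory; on a real host, as root:
#    prepare_log /var/log/backup
DEMO_ROOT="$(mktemp -d)"
prepare_log "${DEMO_ROOT}/var/log/backup"
ls "${DEMO_ROOT}/var/log/backup"
```

The backup script itself also needs to be executable (for example `chmod +x /usr/local/bin/duplicity-backup.sh`) before cron can run it. Running the script once by hand is a simple way to confirm that the credentials and definitions work before relying on the schedule.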