rsync backup tool

Introduction

rsync is a powerful command-line tool for backup and synchronization of files between computers.

The generic format for an rsync command is
rsync <options> <source> <dest>

It is often a good idea to redirect the output of the command to a file, in which case the above becomes
rsync <options> <source> <dest> >rsync.log

If both source and destination are on the local computer, the host doesn't need to be specified.
A typical command in this case might be
rsync <options> /source/dir/ /destination/dir/ >rsync.log
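To make the local case concrete, here is a small sketch that can be run safely, since it only touches a throwaway directory. The directory and file names are made up for the illustration, and the '-a' (archive) and '-v' (verbose) options are my own choice rather than anything required by the text above.

```shell
#!/bin/sh
# Throwaway directories standing in for /source/dir/ and /destination/dir/
# (hypothetical names, created under a temporary directory).
work=$(mktemp -d)
mkdir -p "$work/source/dir" "$work/destination/dir"
echo "hello" > "$work/source/dir/notes.txt"

# -a (archive) recurses and preserves times and permissions; -v lists each file.
# The trailing slash on the source means "the contents of this directory".
# 2>&1 sends error messages to the log as well.
rsync -av "$work/source/dir/" "$work/destination/dir/" > "$work/rsync.log" 2>&1

# The log names the file that was copied.
cat "$work/rsync.log"

# When finished, the scratch area can be removed with: rm -rf "$work"
```

Without the trailing slash on the source, rsync would create a 'dir' directory inside the destination instead of copying the contents straight into it.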

If the source is remote and the destination local, try
rsync <options> <host>:/remote/source/dir/ /local/destination/dir/ >rsync.log
The host can be specified as an IP address or as a domain name - just don't forget the colon before the remote directory. A remote path that does not start with a slash is taken relative to your home directory on the host.

Options

Perhaps the most important option of all is the -n switch. This puts rsync in 'dry run' mode - i.e. it won't actually make any changes; it just shows you what would happen. On its own -n prints little or nothing, so combine it with -v (verbose) to get a list of the files involved. For example,
rsync -nv <source> <dest> >rsync.log
will simulate the process of synchronising the destination with the source specified. The output file 'rsync.log' can be examined to see exactly what changes would have been made. If everything looks OK, the 'n' can be removed in order to actually perform the changes.
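The two-step routine described above (dry run first, then the real thing) might look like the following sketch. It works on temporary directories with made-up names, and it adds '-a' so that the directory contents are included; treat it as one possible way of doing it rather than the only one.

```shell
#!/bin/sh
# Scratch source and destination (hypothetical names).
work=$(mktemp -d)
mkdir -p "$work/src" "$work/dest"
echo "draft" > "$work/src/report.txt"

# Step 1: dry run. -n changes nothing; -a recurses; -v lists what would be sent.
rsync -anv "$work/src/" "$work/dest/" > "$work/dry-run.log"

# Nothing has been copied yet: the destination is still empty.
before=$(ls -A "$work/dest")
echo "files in dest after dry run: '$before'"

# Step 2: the log looked fine, so repeat the same command without -n.
rsync -av "$work/src/" "$work/dest/" > "$work/rsync.log"
ls -A "$work/dest"
```

Keeping the dry-run log and the real log in separate files makes it easy to compare what was predicted with what actually happened.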

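If you want to use SSH to connect to the remote host, you can specify '-e ssh'. (rsync has used ssh as its default remote shell since version 2.6.0, so on current systems the option is usually redundant, but it does no harm and is still needed with older versions.) E.g.
rsync -e ssh myhost.net:public_html/file.txt .
(The single dot at the end of this command just specifies the current directory as the destination.)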

A very useful option is '--max-size' - which specifies an upper limit on the size of files that rsync will transfer or synchronise. E.g., '--max-size=25k' will instruct rsync to ignore any files larger than 25 kilobytes. The command
rsync -e ssh --max-size=25k 'myhost.net:code/*cpp' .
will synchronise all files whose filename ends in 'cpp' in the directory 'code' on the host 'myhost.net' to the current directory - but only if their size is not larger than 25 kilobytes. The quotes stop the local shell from trying to expand the wildcard itself, so that it is expanded on the remote host instead.
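The effect of '--max-size' can be checked locally without a remote host. The sketch below uses two made-up files, one tiny and one of 30 KiB, and expects only the small one to arrive. As far as I know rsync counts 'k' in units of 1024 bytes, but the example is sized so that it does not matter either way.

```shell
#!/bin/sh
# Scratch directories and file names are hypothetical.
work=$(mktemp -d)
mkdir -p "$work/code" "$work/dest"

# small.cpp is a few bytes; big.cpp is 30 KiB (30 blocks of 1024 bytes).
echo "int main() { return 0; }" > "$work/code/small.cpp"
dd if=/dev/zero of="$work/code/big.cpp" bs=1024 count=30 2>/dev/null

# The source is local here, so the local shell expands the wildcard.
# Files above the 25k limit are skipped without an error.
rsync --max-size=25k "$work/code/"*cpp "$work/dest/"

# Only small.cpp should be listed.
ls "$work/dest"
```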

Data backup in general

I don't think there are hard and fast rules for data backup (except perhaps to do it regularly and be careful when you're doing it), and the process will naturally depend on the nature of the data you are trying to preserve. Nonetheless, I tend to bear the following in mind.

  • Save the recipe - not the cake.
  • Inclusive vs. exclusive approach.
  • Don't drink and rsync: rsync is very powerful. If you don't understand what a command does, don't type it - when you're using this command-line tool at least.