Backing up data on a regular basis is useful, even necessary, but setting it up is a pain. Mainly that's because getting it to work requires some smart choices, and the existing solutions are all unsatisfactory. Simple Backup Solution creates a single big tarball for every backup and doesn't tell me when it is making a backup. Flyback can only write to locally mounted ext3 file systems. Timevault works constantly in the background. All of them only back up at specific times, or manually.
Big tarballs are risky: if one gets damaged or only partially copied because, say, I'm shutting down my computer, I lose the entire backup. In such a scenario I am likely to go weeks in a row without a usable backup. Also, most of my files are in compressed formats already, so compression won't do much good anyway. The data I want to back up is on computers that are not always on, so I can't use scheduled backups. I don't want to have to remember to make a backup, so I can't use manual backups either. The disk I want to store the backups on is not locally mounted. Making a backup takes time and resources, and I don't want it to get in my way while I'm doing something else. All this is why I have put it off for too long, and I imagine so have many people.
I have one computer that is in fact always on: my home server. For that one I have already set up a simple rsync-based backup, scheduled as a cron job, which works fine. I also have a desktop box and two laptops. Those three are the ones that need a tailored backup mechanism.
All my machines are Ubuntu machines. The home server runs Ubuntu 8.04 LTS Server, the other three run Ubuntu 9.04. Adjusting this howto for other Linux distros should be relatively straightforward though. I have written this howto partly to remember later on how I've done this, should the need arise, and partly because I hope someone else will find it useful. Any feedback can be directed to this post on my blog.
These demands lead me to the following solution:
Visual clarification attempt:
This forces me to take care of something I should have done a long time ago. I always use ssh to access remote machines, and I use passwords for authentication, which is inconvenient and not very secure. It is better to use key pairs. I have already set up an account with a home directory for each of my users on the server machine. To make a key pair to access my server box from my desktop box, I do the following on my desktop box:
user@desktopbox:~$ ssh-keygen
I answer the questions by accepting the defaults (press enter).
This generates a private and a public key. I copy over the public key to the server machine:
user@desktopbox:~$ scp .ssh/id_rsa.pub serverbox:~/
user@serverbox's password:
I make the private key readable for me only:
user@desktopbox:~$ chmod 600 .ssh/id_rsa
Then I ssh over to the server machine and log in with my password.
user@desktopbox:~$ ssh serverbox
user@serverbox's password:
If my home directory on the serverbox does not have an .ssh directory, I create it:
user@serverbox:~$ mkdir .ssh
I add the key to the list of authorized keys, and remove the file:
user@serverbox:~$ cat id_rsa.pub >> .ssh/authorized_keys
user@serverbox:~$ rm id_rsa.pub
I repeat these steps for all my computers and users on those computers.
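Since these steps get repeated for every user and machine, the server-side part can be wrapped in a small function. This is a minimal sketch, not part of my original setup: install_pubkey is a hypothetical helper name, and the chmod calls are an addition, on the assumption that sshd runs with its default StrictModes setting and may therefore ignore keys in a directory that others can write to. On systems that ship ssh-copy-id, running "ssh-copy-id serverbox" from the client should achieve roughly the same.

```shell
# Hypothetical helper: append a public key to authorized_keys exactly once
# and tighten the permissions on the .ssh directory and the key list.
install_pubkey() {
    pubkey_file=$1             # e.g. ~/id_rsa.pub, as copied over with scp
    ssh_dir=${2:-$HOME/.ssh}   # optional second argument, handy for a trial run elsewhere

    mkdir -p "$ssh_dir"
    chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    chmod 600 "$ssh_dir/authorized_keys"

    # -x whole-line match, -F fixed string, -f take the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$ssh_dir/authorized_keys"; then
        cat "$pubkey_file" >> "$ssh_dir/authorized_keys"
    fi
}
```

On the server I would then call "install_pubkey ~/id_rsa.pub" and remove the copied file as before. Running it twice does no harm, because the key is only appended when it is not in the list yet.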
Configuring the backup server
I create a backup directory to store the backups in and make it writable for my users, which are all members of the group users:
user@serverbox:~$ mkdir <root dir for all backups>
user@serverbox:~$ sudo chown root:users <root dir for all backups>
user@serverbox:~$ sudo chmod 770 <root dir for all backups>
Then I exit back to the desktop box:
user@serverbox:~$ exit
Configuring the client machines
I have now fully configured the backup server. Next I configure each of the machines that I want to back up. First I create a directory in the home directory for settings and stuff:
user@desktopbox:~$ mkdir .backup
Then I create a file with the names of everything I don't want to back up, and put it there (I use vim, use your favorite editor, like nano, gedit, kedit):
user@desktopbox:~$ vim .backup/exclude
I enter these lines (adjust to your needs, but keep the last line):
.gvfs
.cache
.thumbnails
.mozilla/firefox/*/Cache
.viminfo
.xsession-errors
.gnome2/gnome-power-manager
backup.log
The last file is the log that the backup script below will generate. I will transfer it to the backup machine, but only after rsync has completed.
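To see what an exclude file actually filters out before trusting it, rsync can be asked for a dry run. The following is a minimal sketch under a few assumptions: preview_backup is a hypothetical helper name, and an empty scratch directory stands in for the server so that nothing goes over the network.

```shell
# Hypothetical helper: list what rsync would transfer from a directory,
# honouring an exclude file, without copying anything.
preview_backup() {
    src=$1        # directory to back up, e.g. /home/user
    excludes=$2   # e.g. /home/user/.backup/exclude
    scratch=$(mktemp -d)
    # -n (--dry-run) only reports; --out-format='%n' prints one file name per line
    rsync -an --out-format='%n' --exclude-from "$excludes" "$src" "$scratch/"
    rmdir "$scratch"
}
```

Calling "preview_backup /home/user /home/user/.backup/exclude | less" gives a list to scroll through. Anything that shows up there and should not be backed up is a candidate for another line in the exclude file.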
To perform the backup, I have a script with these contents:
#!/bin/bash
user=<my username>
localhost=`/bin/hostname`
localdir=/home/$user
remotehost=<server box hostname>
remotedir=<root dir for all backups>/
mydir=$localdir/.backup
excludefile=$mydir/exclude
keyfile=$localdir/.ssh/id_rsa
logfile=$mydir/backup.log
mysqluser=<a mysql user that can read all my databases>
mysqlpassword=<mysqluser's password>
mysqlfile=$mydir/mysqldbs-`date +%Y%m%d-%H:%M`.sql
echo "*** Backup " `date` >> "$logfile"
/bin/rm -f "$mydir"/mysql*
/usr/bin/mysqldump --all-databases -u"$mysqluser" -p"$mysqlpassword" > "$mysqlfile"
/bin/chown "$user" "$mysqlfile"
/usr/bin/rsync -avz --delete --update \
--exclude-from "$excludefile" \
-e "ssh -i $keyfile" "$localdir" "$user@$remotehost:$remotedir/$localhost/" \
>> "$logfile"
/bin/chown "$user" "$logfile"
/usr/bin/scp -i "$keyfile" "$logfile" "$user@$remotehost:$remotedir/$localhost/"
I create this file in ~/bin/ :
user@desktopbox:~$ mkdir bin
user@desktopbox:~$ vim bin/backup
Paste the text and adjust it to your situation. At <server box hostname> you need to enter the IP address of the server machine if you have neither set up DNS for your network nor put the hostname with its IP address in /etc/hosts. The first line states that the script must be executed by bash. The next twelve lines (excluding white lines) assign values to variables used in the rest of the script. To adjust the script to your machine(s), adjust these.
After the line that writes a timestamp to the log, the next three lines are only needed if you want to back up MySQL databases. If you don't, comment them out or delete them. If you have other files outside your home directory that you want to back up, you could copy them to ~/.backup/ here. The first of these lines removes any old dumps, the second one makes a new dump, and the third one makes me its owner. Note that in the second line I have to put my MySQL password in plain text, which is totally insecure. For me personally that's not a problem, but for you it may be. Maybe I'll fix this someday.
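One possible way around the plain text password on the command line, which I have not used in the setup described here, is a MySQL option file. mysqldump reads the [client] group from ~/.my.cnf, so the user and password can live in a file that only its owner can read. A minimal sketch, where write_mysql_optfile is a hypothetical helper name and the user name and password in the usage line are placeholders:

```shell
# Hypothetical helper: write a MySQL option file that only its owner can read.
write_mysql_optfile() {
    optfile=$1   # e.g. ~/.my.cnf
    dbuser=$2
    dbpass=$3
    (
        umask 077   # a newly created file gets mode 600
        printf '[client]\nuser=%s\npassword=%s\n' "$dbuser" "$dbpass" > "$optfile"
    )
    chmod 600 "$optfile"   # also covers a file that already existed
}
```

After something like "write_mysql_optfile ~/.my.cnf backupuser secret", the dump line in the script could drop its -u and -p arguments. The password is still stored unencrypted, but it no longer sits in the script or in the process list. A password containing characters such as # probably needs quoting inside the option file.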
The next four lines do the actual work. You can find out what the parameters passed to rsync mean by typing "man rsync" in a terminal.
The \ (backslash) at the end of a line tells bash that the command continues on the next line. This way I can split the command over several lines for readability. The >> (double greater than) redirects the output to the logfile so that it is added to the end of it. A single > would have replaced the file.
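The difference between the two redirections is easy to check on a scratch file. A small sketch, with variable names chosen only for this illustration:

```shell
scratch=$(mktemp)

echo "first run"  >> "$scratch"    # appends: the file now has one line
echo "second run" >> "$scratch"    # appends: the file now has two lines
lines_after_append=$(wc -l < "$scratch")

echo "third run"  >  "$scratch"    # truncates first: only this line remains
lines_after_overwrite=$(wc -l < "$scratch")

echo "after >>: $lines_after_append lines, after >: $lines_after_overwrite line"
rm -f "$scratch"
```

The two appends leave two lines in the file, and the single > leaves just one. That is why the backup script uses >> everywhere it writes to the log.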
Rotating the log
The output of the script is added to the log file, which is kept in ~/.backup/ and copied to the server after the backup. To prevent it from growing too big I rotate it weekly. As I'm using Ubuntu, I use logrotate for this and create a file in /etc/logrotate.d/ with these contents:
/home/<user name>/.backup/backup.log {
rotate 0
weekly
missingok
}
user@desktopbox:~$ sudo vim /etc/logrotate.d/backup
user@desktopbox's password:
Starting the script automatically at logoff
The last thing I do is specify when the script is executed. I want it to run when I log off the machine, after I saved my work and before the network connection is shut down. Ubuntu uses network-manager for wireless connections, so I create a file named .xsession in my user's home directory, and make it executable:
user@desktopbox:~$ vim .xsession
user@desktopbox:~$ chmod ug+x .xsession
The file has these contents:
gnome-session
/home/user/bin/backup
When I log in, .xsession is executed by the login manager (gdm). First a gnome session is started, and when that shuts down, the backup script is executed.
I configure all the machines that I want to back up like this, adjusting each to its specifics.
Testing
To test it, I run the script manually the first time, while simultaneously keeping an eye on the log file. For the laptops I use cable connections this time. Since it's the first backup it will take a long time, and wireless connections break more easily. I open two terminals. In the first one I type:
user@desktopbox:~$ touch .backup/backup.log
user@desktopbox:~$ tail -f .backup/backup.log
In the other one I start the initial backup:
user@desktopbox:~$ ./bin/backup
If everything works out as it should, the next time I power down the computer it will back up any changes. And that is what I wanted.
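Because the script runs unattended at logoff, nobody is there to type a password. If a backup seems to hang or silently does nothing, a quick check is whether key-based login works without any prompt. A minimal sketch, where can_login_unattended is a hypothetical helper name and the ten second timeout is an arbitrary choice:

```shell
# Hypothetical helper: succeed only if ssh can log in without asking anything.
can_login_unattended() {
    host=$1
    # BatchMode=yes makes ssh fail instead of prompting for a password or passphrase;
    # ConnectTimeout keeps an unreachable server from stalling the check for long.
    ssh -o BatchMode=yes -o ConnectTimeout=10 "$host" true
}
```

Typing "can_login_unattended serverbox && echo ok" on a client should print ok. If it doesn't, the key setup for that user and machine is worth revisiting before blaming rsync.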
Last text update: August 12th, 2009