My workshop runs on four Linux workstations, one MacBook and one Windows box. We also manage some VPS servers. Here’s our backup solution in a nutshell:
A separate backup server (Linux) runs Rsnapshot every hour, taking a snapshot of the local workstations and the VPS servers.
Each night the latest backup is uploaded to Amazon S3. Besides that, everybody archives their projects on DVDs.
If you’re not familiar with Rsnapshot: it makes copies of selected directories. Each snapshot is a full, regular copy of all protected data - no big compressed archives, no incremental copies etc.
What’s cool is that files that do not change between snapshots become hard links. This means that an unchanged file is present in both snapshots, but it’s stored only once on the hard drive. No magic - it’s a feature of Unix file systems. You eat the cookie (saving space) and have the cookie (each snapshot is a full backup).
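To make the hard link idea concrete, here is a minimal Python sketch. It is not Rsnapshot code; the directory names hourly.0 and hourly.1 and the file notes.txt are only illustrative. It links one file into a second "snapshot" directory and checks that both names point at the same inode, so the data exists once on disk.

```python
import os
import tempfile


def same_file_on_disk(path_a, path_b):
    """Return True if both paths are hard links to the same inode on one device."""
    a, b = os.stat(path_a), os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)


with tempfile.TemporaryDirectory() as root:
    # Two illustrative snapshot directories (names are assumptions).
    older = os.path.join(root, "hourly.1")
    newer = os.path.join(root, "hourly.0")
    os.makedirs(older)
    os.makedirs(newer)

    original = os.path.join(older, "notes.txt")
    with open(original, "w") as f:
        f.write("unchanged between snapshots\n")

    # The unchanged file is hard-linked into the newer snapshot, not copied.
    linked = os.path.join(newer, "notes.txt")
    os.link(original, linked)

    shared = same_file_on_disk(original, linked)
    link_count = os.stat(original).st_nlink

print(shared, link_count)
```

Both directories list the file, yet the file system keeps a single copy of its contents and only counts how many names refer to it.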
The best part is that the backups are plain directories. I can open my file browser of choice (mine is Midnight Commander), enter a snapshot from two hours ago, find the working directory of my current project and browse it. In the other Midnight Commander panel I can navigate to the same project’s working directory in a week-old snapshot, and see the two versions side by side. I can run diff to compare two versions of a file, or run “du -csh *” to see how big my project was yesterday.
In other words, I can use any regular tools on any snapshot. No need to “revert” or “restore” files from backup, or to run any special software. This is mighty convenient for me and my co-workers.
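Because snapshots are ordinary directories, a few lines of script can do the same comparison. The sketch below is only an illustration: the snapshot directories, the project folder and the file main.txt are made up, and the helper diff_between_snapshots is a hypothetical name, not part of any tool.

```python
import difflib
import os
import tempfile


def diff_between_snapshots(snapshot_old, snapshot_new, relative_path):
    """Return a unified diff of one file as it appears in two snapshot directories."""
    old_path = os.path.join(snapshot_old, relative_path)
    new_path = os.path.join(snapshot_new, relative_path)
    with open(old_path) as f:
        old_lines = f.readlines()
    with open(new_path) as f:
        new_lines = f.readlines()
    return list(difflib.unified_diff(old_lines, new_lines,
                                     fromfile=old_path, tofile=new_path))


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for a week-old snapshot and a recent one (paths are assumptions).
    week_old = os.path.join(root, "weekly.0", "project")
    recent = os.path.join(root, "hourly.1", "project")
    os.makedirs(week_old)
    os.makedirs(recent)

    with open(os.path.join(week_old, "main.txt"), "w") as f:
        f.write("header\nversion from last week\n")
    with open(os.path.join(recent, "main.txt"), "w") as f:
        f.write("header\nversion from two hours ago\n")

    diff = diff_between_snapshots(week_old, recent, "main.txt")

print("".join(diff))
```

Nothing here knows about backups at all; it just reads two files from two directories, which is exactly the point.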
1. Quite a bit of work to set up
You need to:
- set up each workstation to make its data available to the backup server (using NFS, Samba, SSH or Windows File Sharing - depending on the situation)
- set up the server to pull backups
- set up some kind of monitoring on the backup server
- handle security (you don’t want one user to see another user’s backups)
2. Backups are not encrypted or compressed
3. Pull architecture - backup stored in the same location
This is Rsnapshot’s architecture: the backup server pulls data from the workstations and stores it locally. To set up a backup server in a remote location you’d need to make your workstations serve their data over the Internet - not the best idea.
An advantage is that you manage one central place for all backups. If you need to add a workstation, you only need to share its data (set up an SSH, Samba or NFS server) and add a line in the Rsnapshot config to back up this workstation.
I also thought of keeping the backup server local and storing only the data remotely, but Rsnapshot needs to be able to create hard links. NFS supports this, and it worked fine when I mounted an NFS share on the backup server and told Rsnapshot to store backups there. I guess you could tunnel NFS through an SSH connection - but that’s overkill.
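As a rough illustration of "add a line in the Rsnapshot config", the Python sketch below prints backup lines for two workstations. The host name ws-linux1, the user backup and the mount point /mnt/winbox are made-up examples, and conf_line is a hypothetical helper. The one detail worth knowing is that rsnapshot.conf expects fields to be separated by tabs, not spaces, which is why the lines are generated rather than typed.

```python
def conf_line(*fields):
    """Join config fields with tabs, the separator rsnapshot.conf expects."""
    return "\t".join(fields)


def backup_lines(sources):
    """Build one 'backup' line per (source, destination directory) pair."""
    return [conf_line("backup", source, dest) for source, dest in sources]


# Hypothetical sources: one workstation reached over SSH, and one Windows
# share that is assumed to be mounted locally on the backup server.
sources = [
    ("backup@ws-linux1:/home/", "ws-linux1/"),
    ("/mnt/winbox/projects/", "winbox/"),
]

lines = backup_lines(sources)
for line in lines:
    print(line)
```

Each destination is a directory inside every snapshot, so the data of each workstation ends up in its own subtree.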
Bottom line: the backup is in the same location, so it gives no protection against disasters like a flood or fire that affect the whole office/workshop/LAN. My solution: each night duplicity is launched and backs up the latest local backup to Amazon S3.
4. Rotates data and doesn’t replace archiving
I use three cycles in Rsnapshot: hourly, daily and weekly. Hourly snapshots are kept for 8 hours, daily snapshots for 9 days and weekly snapshots for 4 weeks. This means that if I delete a file locally, after a month it disappears from the backups.
Every developer is still obliged to archive his projects when they are completed, paused or reach a milestone, then burn them to a DVD and label it appropriately. Backup and archiving are different things, see: know difference - backup vs archive
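The arithmetic behind that "after a month" can be sketched in a few lines of Python. This assumes each cycle keeps its snapshots at regular spacing (one hour, one day, one week), which is a simplification of real scheduling; the numbers 8, 9 and 4 come from the setup described above.

```python
from datetime import timedelta

# (cycle name, snapshots kept, time between snapshots)
RETENTION = [
    ("hourly", 8, timedelta(hours=1)),
    ("daily", 9, timedelta(days=1)),
    ("weekly", 4, timedelta(weeks=1)),
]


def oldest_snapshot_age(retention):
    """Approximate age of the oldest snapshot kept in each cycle."""
    return {name: count * step for name, count, step in retention}


ages = oldest_snapshot_age(RETENTION)
total_snapshots = sum(count for _name, count, _step in RETENTION)
horizon = max(ages.values())

for name, age in ages.items():
    print(f"{name}: oldest snapshot is about {age} old")
print(f"{total_snapshots} snapshots kept, history reaches back about {horizon.days} days")
```

Under these assumptions there are 21 snapshots on disk and the oldest one is roughly 28 days old, so a file deleted today is gone from every snapshot about four weeks later.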
5. Small change in a file results in a new copy
Let’s say you have a 20G VirtualBox image. You launch the virtual system, do some work and shut it down. There are always some minor changes in the image (e.g. in the Windows virtual memory file, registry, log files). Rsnapshot would store a completely new copy of the image. If the virtual image is used often, this means 20G of backup every hour - not good.
Possible workarounds:
- Do not change the image, or reset it after each use (using VirtualBox snapshots). You’d need to store documents on a separate virtual disk, or on the host system through the network or shared folders.
- Make the image fixed and store changes as differential files (using VirtualBox immutable disks)
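To see why a tiny change costs a full copy, here is a simplified Python sketch of the link-or-copy idea. It is not how Rsnapshot is implemented (it relies on rsync, which decides differently whether a file changed); take_snapshot and the file names are hypothetical, and a 1 MiB file stands in for the 20G image.

```python
import filecmp
import os
import shutil
import tempfile


def take_snapshot(source_dir, previous_snapshot, new_snapshot):
    """Copy source_dir into new_snapshot, hard-linking files that are identical
    to the ones in previous_snapshot. Returns the number of bytes newly written."""
    new_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(source_dir):
        rel = os.path.relpath(dirpath, source_dir)
        target_dir = os.path.normpath(os.path.join(new_snapshot, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in filenames:
            src = os.path.join(dirpath, name)
            dst = os.path.join(target_dir, name)
            prev = None
            if previous_snapshot is not None:
                prev = os.path.normpath(os.path.join(previous_snapshot, rel, name))
            if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
                os.link(prev, dst)        # unchanged: share the existing copy
            else:
                shutil.copy2(src, dst)    # new or changed: store the whole file again
                new_bytes += os.path.getsize(dst)
    return new_bytes


IMAGE_SIZE = 1024 * 1024  # 1 MiB stand-in for the 20G image

with tempfile.TemporaryDirectory() as root:
    source = os.path.join(root, "workstation")
    os.makedirs(source)
    with open(os.path.join(source, "notes.txt"), "w") as f:
        f.write("small text file\n")
    with open(os.path.join(source, "image.bin"), "wb") as f:
        f.write(bytes(IMAGE_SIZE))

    snap0 = os.path.join(root, "hourly.1")
    snap1 = os.path.join(root, "hourly.0")
    first_bytes = take_snapshot(source, None, snap0)

    # Change a single byte inside the big file, as a VM run would.
    with open(os.path.join(source, "image.bin"), "r+b") as f:
        f.seek(100)
        f.write(b"\x01")

    second_bytes = take_snapshot(source, snap0, snap1)
    notes_links = os.stat(os.path.join(snap1, "notes.txt")).st_nlink

print(first_bytes, second_bytes, notes_links)
```

The second snapshot hard-links the untouched text file but writes the whole image again, even though only one byte differs. Whole files are the unit of sharing, which is why big, frequently modified files are a poor fit for this kind of backup.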
That’s roughly it. And… where are my manners… welcome to my new shiny blog!
I don’t expect to write often - but I’ll certainly post answers to all questions I’m frequently asked.