
Build a Home Terabyte Backup System Using Linux (Linux Journal)

Linux Journal builds a terabyte-sized backup server. "High-capacity disk drives are now widely available at prices that are incredibly cheap compared to those of only a few years ago. In addition, with so many Linux users now ripping CDs to disk, saving images from their digital cameras and recording video using digital camcorders and DVRs, such as MythTV, the need for backing up and archiving large amounts of data is becoming critical. Losing pictures and videos of your kids--or your audio music library--because of a disk crash would be a catastrophe. Fortunately, a high-capacity, Linux-based backup server can be built easily and cheaply using inexpensive disk drives and free software."

Should I wait for the Holodisk

Posted Nov 29, 2005 20:44 UTC (Tue) by chel (guest, #11544) [Link]

New holographic disk technology promises up to 1TB on a single disk. That could change backup systems again:
http://www.techtree.com/techtree/jsp/article.jsp?article_...

rdiff-backup a better choice than rsync

Posted Nov 29, 2005 21:31 UTC (Tue) by nigelm (subscriber, #622) [Link]

Personally I would suggest the use of rdiff-backup rather than rsync for the backup part of this.

rdiff-backup has the same bandwidth-efficient properties as rsync, but it keeps both a current backup (an exact copy of the source files, just like rsync) and a number of deltas to older backups. This means you can recover the file you accidentally deleted last week but didn't notice until several backups had been done.

The downsides are that it uses a bit more disk space (although remarkably little) and can be rather heavy on CPU. Just like rsync it can run over ssh.
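
The "current copy plus deltas to older backups" idea described above can be sketched in a few lines. This is only a toy model working on lists of text lines with Python's difflib; rdiff-backup itself uses librsync binary deltas, and the names `make_delta`, `apply_delta` and `ReverseDeltaStore` are made up for illustration.

```python
import difflib


def make_delta(base, target):
    """Describe `target` in terms of `base`: copied ranges plus literal lines."""
    ops = []
    matcher = difflib.SequenceMatcher(a=base, b=target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse lines already in base
        elif j2 > j1:
            ops.append(("data", target[j1:j2]))   # lines only the target has
        # a pure "delete" contributes nothing to the target
    return ops


def apply_delta(base, delta):
    """Rebuild the target from `base` and a delta made by make_delta."""
    out = []
    for op in delta:
        if op[0] == "copy":
            out.extend(base[op[1]:op[2]])
        else:
            out.extend(op[1])
    return out


class ReverseDeltaStore:
    """Keeps the newest version in full and older versions as reverse deltas."""

    def __init__(self, initial_lines):
        self.current = list(initial_lines)
        self.reverse_deltas = []  # newest delta last

    def backup(self, new_lines):
        new_lines = list(new_lines)
        # The delta goes backwards: from the new version to the one it replaces.
        self.reverse_deltas.append(make_delta(new_lines, self.current))
        self.current = new_lines

    def restore(self, steps_back):
        lines = self.current
        for delta in reversed(self.reverse_deltas[len(self.reverse_deltas) - steps_back:]):
            lines = apply_delta(lines, delta)
        return lines
```

Restoring zero steps back just returns the current mirror, which is why the latest backup is as cheap to read as an rsync copy; older versions cost one delta application per step.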

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 29, 2005 22:16 UTC (Tue) by HappyCamp (subscriber, #29230) [Link]

I like BackupPC myself. http://backuppc.sf.net/

I had previously used rdiff-backup but I like BackupPC a lot more. It is really nice for backing up multiple computers.

It does pooling: when backing up multiple systems it keeps only one copy of each identical file and hardlinks the other occurrences to that copy. For example, /bin/ls is most likely the same on multiple systems if they are running the exact same OS with updates. So it will store one copy of /bin/ls and then have hardlinks to that file.
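
A rough sketch of the pooling idea, for anyone curious how little it takes. It assumes the pool and the backup tree live on the same filesystem (hard links require that); the SHA-256 naming and the function name `pool_file` are illustrative choices here, not BackupPC's actual pool layout.

```python
import hashlib
import os
import shutil


def pool_file(src_path, pool_dir, dest_path):
    """Place src_path in the backup tree at dest_path, sharing identical
    content through a single pooled copy and hard links."""
    # Reads the whole file into memory; fine for a sketch, not for huge files.
    with open(src_path, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    pooled = os.path.join(pool_dir, digest)
    if not os.path.exists(pooled):
        # First time this content is seen: copy it into the pool once.
        shutil.copyfile(src_path, pooled)

    os.makedirs(os.path.dirname(dest_path), exist_ok=True)
    os.link(pooled, dest_path)  # extra name for the same data, no extra space
    return digest
```

Two hosts with an identical /bin/ls end up as two directory entries pointing at one pooled file, so the second host costs almost nothing.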

It also does compression, so it will compress all the files to save space also.

So on my backup system, I currently have:

181 full backups of total size 916.43GB (prior to pooling and compression),
995 incr backups of total size 204.16GB (prior to pooling and compression).

But the disk partition where all this is stored has only used 170GB of space.
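
Working the figures above through, treating the 170GB as a rounded number (the helper name `reduction_ratio` is just for illustration):

```python
def reduction_ratio(logical_sizes_gb, used_gb):
    """Return (total logical size, how many times smaller the on-disk size is)."""
    logical = sum(logical_sizes_gb)
    return logical, logical / used_gb


logical_gb, ratio = reduction_ratio([916.43, 204.16], 170.0)
saved_fraction = 1 - 170.0 / logical_gb

print(f"{logical_gb:.2f}GB of backups in 170GB: "
      f"about {ratio:.1f}x smaller, {saved_fraction:.0%} saved")
```

So roughly 1120GB of backups fit in about a sixth or a seventh of that space after pooling and compression.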

I can currently recover data that was backed up within the last 90 days on a day by day basis. After that I can recover data on a monthly basis.
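
A retention schedule like that (daily for 90 days, monthly after that) can be expressed as a simple pruning rule. The sketch below is one possible reading, keeping the earliest backup of each older month; it is not how BackupPC itself decides what to expire, and `backups_to_keep` is a hypothetical name.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily_window_days=90):
    """Keep every backup inside the daily window, and only the earliest
    backup of each calendar month for anything older."""
    keep = []
    months_seen = set()
    for d in sorted(backup_dates):
        age_days = (today - d).days
        if age_days <= daily_window_days:
            keep.append(d)
        else:
            month = (d.year, d.month)
            if month not in months_seen:
                months_seen.add(month)
                keep.append(d)
    return keep


if __name__ == "__main__":
    today = date(2005, 11, 29)
    nightly = [today - timedelta(days=n) for n in range(200)]
    kept = backups_to_keep(nightly, today)
    print(len(nightly), "nightly backups,", len(kept), "kept")
```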

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 29, 2005 22:57 UTC (Tue) by jwb (subscriber, #15467) [Link]

Why would you back up /bin?

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 29, 2005 23:21 UTC (Tue) by arcticwolf (guest, #8341) [Link]

Why not? If you want to restore your box to the exact state it was in before things broke, it's easier to back up /bin and stuff as well than it is to go through the hassles of installing your favourite distro again.

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 30, 2005 19:37 UTC (Wed) by kokopelli (guest, #11341) [Link]

That's the point. You could be reinstalling compromised binaries.

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Dec 1, 2005 7:51 UTC (Thu) by MortFurd (guest, #9389) [Link]

Unless you are restoring after a hardware failure.

I've had to do this a time or two. A small company had a fileserver/mailserver with no raid and the data and OS on one drive. The IDE controller on the motherboard died, and hosed up the HD.

No problem. Install a new motherboard and HD, boot from a live CD and restore the last full backup (with the entire OS) and all the incrementals. Once the last backup is restored, you reboot from the HD and go on about your business, with no more lost than what was done between the last incremental and the actual time of the crash - in this case zero, because the crash happened overnight after the latest incremental.

No reinstalling the OS and all the additional software that was on it, no fiddling with settings or scrambling to find the username and password to set up the internet access.

Quick, straightforward, and about as painless as recovering from disaster can be.

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 29, 2005 23:44 UTC (Tue) by fjf33 (subscriber, #5768) [Link]

I personally like AMANDA although it takes a little bit of effort to get it up and running. It does compressed tars and can do differential and incremental backups too.

Backup windows systems

Posted Nov 30, 2005 1:14 UTC (Wed) by rvfh (subscriber, #31018) [Link]

Bacula, Amanda, Mondo rescue, Simple backup solution, BackupPC, rdiff-backup, MirDir... what else am I missing? So many.

Can somebody tell me which one to use to back up 4-5 computers onto a 200GB hard disk, knowing that some computers may be off, or even switched off during the backup?

For the moment I am using my own script!

Thanks in advance.

Backup windows systems

Posted Nov 30, 2005 5:36 UTC (Wed) by fjf33 (subscriber, #5768) [Link]

I am using Amanda for exactly that (laptop, 2 PCs and a NAS). The great thing about Amanda is that it is opportunistic in its backups and will bump differentials to full if it has room. The problem is (as you state) when computers are off, but even then it will warn you when it is about to overwrite the last available full backup. I use a script (that comes with it) that lets you use the HD as a tape library, so you break it into, say, ten 20GB tapes, and as long as your partitions are smaller than that you should be OK (I actually split a 300GB disk into 8 tapes).
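
The sizing rule above is easy to check before committing to a layout. A small sketch follows; the helper names and the example partition sizes are made up for illustration, and it ignores compression and filesystem overhead.

```python
def virtual_tape_size_gb(disk_gb, tape_count):
    """Size of each virtual tape when a disk is split evenly."""
    return disk_gb / tape_count


def oversized_partitions(partition_sizes_gb, tape_gb):
    """Names of partitions whose full dump would not fit on one tape."""
    return [name for name, size in partition_sizes_gb.items() if size > tape_gb]


if __name__ == "__main__":
    tape_gb = virtual_tape_size_gb(300, 8)
    # Hypothetical partition sizes in GB, for illustration only.
    partitions = {"laptop:/home": 30, "pc1:/home": 36, "nas:/export": 60}
    print(f"{tape_gb}GB per tape; too big:", oversized_partitions(partitions, tape_gb))
```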

Backup windows systems

Posted Nov 30, 2005 5:58 UTC (Wed) by HappyCamp (subscriber, #29230) [Link]

I would recommend BackupPC ( http://backuppc.sf.net/ )

It has been designed from the ground up to be a disk based backup system.

Nice thing is that it will save space if you have identical files on multiple systems.

If the systems are turned off it will send you an email warning and will try again at another time.

Backup windows systems

Posted Nov 30, 2005 15:26 UTC (Wed) by treed (subscriber, #11432) [Link]

You are missing Bacula which is currently my preferred backup solution.

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 30, 2005 14:30 UTC (Wed) by drag (subscriber, #31333) [Link]

Though not really a backup tool, I've just started running 'unison'...

http://www.cis.upenn.edu/~bcpierce/unison/

I use it to sync my /home/username/ directories (as well as a couple others) between my laptop and my desktop.

Before I just used rsync...

The thing that convinced me to go to unison is the handy unison-gtk (called that in Debian at least) GUI front end for it. I am dealing with two different architectures, PowerPC and x86, and since I am using Debian Sid on both of them, 95% of everything is compatible as far as .filename/.directorynames go. All I have is a startup script to run update-menu for my user when I log in, since I don't have all the same software installed on both machines.

The thing is that some programs, like Straw or some games, use binary databases to keep track of information, and those don't store it in an endian-neutral way. The unison GUI lets me quickly select which files and directories to ignore when syncing, which is useful since I have such a hard time thinking ahead and figuring out what parts to block out beforehand.

It's also nice that it keeps track of deleted files: if I delete a file on one machine it'll delete the file on the other (which is pretty safe, since it lists the actions it will take before doing the actual sync).

I know it's not a backup mechanism, but it offers some form of data protection in mirroring my home directory and also provides a limited 'undelete' type option.

Another similar thing along the lines of increased-data-protection-but-not-as-ideal-as-real-backups are things like 'Log' based filesystems.

Not logging in the sense of journalling for ext3, but logging like database or log files, where you only append to files and never overwrite anything (unless you run out of space, in which case a 'garbage collection' style mechanism frees some up). Forgive me if either of these were covered in a previous LWN article.

Two that I know of that are under active development are NILFS and LFS:
http://www.nilfs.org/
http://logfs.sourceforge.net/

NILFS is brought to you by the Japanese company NTT (Nippon Telegraph and Telephone).

What it does is a 'rolling' write of your files to the filesystem from beginning to end, in a similar fashion to a tape or whatnot. This is optimized for high write speeds and that sort of thing.

What it also allows you to do is a read-only mount of the filesystem, not as it is now but as it existed at some time in the past. And you can do that while you're running the filesystem at a different mount point in read-write mode. It also makes doing snapshots easier and virtually guarantees data integrity for anything short of a format, hard drive destruction or a similarly horrible event. If a file gets gibberish written into it, you just undo the changes until you get back to clean data.
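
To make the "mount it as it existed in the past" idea concrete, here is a toy in-memory model of an append-only store. It says nothing about the real on-disk format of NILFS or any other filesystem, and the class and method names are invented for the example.

```python
class AppendOnlyStore:
    """Every write appends a record; nothing is ever overwritten in place."""

    def __init__(self):
        self._log = []  # list of (name, data) records in write order

    def write(self, name, data):
        self._log.append((name, data))
        return len(self._log)  # checkpoint number right after this write

    def read(self, name, checkpoint=None):
        """Read a file as of now, or as of an earlier checkpoint."""
        end = len(self._log) if checkpoint is None else checkpoint
        for rec_name, data in reversed(self._log[:end]):
            if rec_name == name:
                return data
        raise KeyError(name)


if __name__ == "__main__":
    store = AppendOnlyStore()
    good = store.write("notes.txt", "clean data")
    store.write("notes.txt", "gibberish")
    print(store.read("notes.txt"))                   # current view
    print(store.read("notes.txt", checkpoint=good))  # view from the past
```

The old data is still there after the bad write, which is the whole point; a real filesystem also needs the garbage collector mentioned above so the log doesn't grow forever.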

I could imagine it would be useful for recording data at a high rate or under harsh conditions (like maybe a scientific recording device in a robot), or for keeping track of streaming media (such as video or voice recording) and such things.

Not that I really understand most of it..

Build a Home Terabyte Backup System Using Linux (Linux Journal)

Posted Nov 30, 2005 15:31 UTC (Wed) by vondo (guest, #256) [Link]

I use unison as well. It is good for keeping a bi-directional mirror.

As far as a backup server goes, protecting against disk crashes and accidental deletion is good, but I have my backup server at work so my digital files (lots of photos) can survive a household/natural disaster too.

For people who can't use work, maybe there should be a "backup buddy system".

I hate to burst everyone's bubble...

Posted Nov 30, 2005 19:23 UTC (Wed) by Baylink (subscriber, #755) [Link]

But if you can't a) make 3 of them and b) take two of them off site, then they're not backups; they're near-line storage.

"How to build your own NetApp Filer" is all well and good, but let us not deceive ourselves by calling it "backup". It may be that from the limited perspective of a given workstation... but it isn't out here in the Real World.

Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds