How The Backup Process Has Changed

Posted Nov 30, 2007 3:59 UTC (Fri) by sitaram (subscriber, #5959)
In reply to: How The Backup Process Has Changed by Los__D
Parent article: How The Backup Process Has Changed

I have this bad habit of doing all my research in one shot, as exhaustively as I can,
summarising it for quick reference, and then discarding the raw data :-)

With that caveat, here is a dump of the entry titled "Choosing between LZMA, BZIP2, and GZIP"
in my personal quickref wiki:

---------------------------

LZMA is the new kid on the block: it produces smaller output and decompresses
faster than BZIP2, at the cost of much, *much* slower compression.

(Default compression levels are GZIP: 6, BZIP2: 9, and LZMA: 7)

Summary
-------

    Use none when
      - almost all files in the dataset are already compressed (DUH!)

    Use GZIP when
      - time is more important than space, or
      - system memory is very limited, or
      - many of the files in the dataset, but nowhere near all of them, are
        already compressed

    Use LZMA when
      - space is more important than time, or
      - space is important AND the file will be decompressed many times

    Benchmark BZIP2, LZMA at level 1 and perhaps LZMA at level 2 when
      - both space AND (compression) time are important, and
      - you're going to be compressing this same dataset frequently (like a
        daily backup script for your email folders)

    Otherwise just use BZIP2
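
For the "benchmark" case above, a minimal Python sketch of such a comparison might look like the following. It is an illustration under stated assumptions, not the method behind the summary: it uses the standard library's zlib, bz2 and lzma modules rather than the command-line tools, zlib writes a zlib container rather than a .gz file (same deflate algorithm), and the lzma module's presets are xz presets, which may not correspond exactly to the LZMA levels mentioned above. The names bench, candidates and run_benchmark are made up for this example.

```python
import bz2
import lzma
import sys
import time
import zlib


def bench(name, compress, decompress, data):
    """Time one compress/decompress round trip and report the size ratio."""
    if not data:
        raise ValueError("need a non-empty dataset to benchmark")
    t0 = time.perf_counter()
    packed = compress(data)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != data:
        raise ValueError(f"{name}: round trip did not reproduce the input")
    return {
        "name": name,
        "ratio": len(packed) / len(data),  # smaller is better
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


def candidates():
    """Settings to compare; the levels are assumptions taken from the summary."""
    return [
        ("gzip -6", lambda d: zlib.compress(d, 6), zlib.decompress),
        ("bzip2 -9", lambda d: bz2.compress(d, 9), bz2.decompress),
        ("lzma preset 1", lambda d: lzma.compress(d, preset=1), lzma.decompress),
        ("lzma preset 2", lambda d: lzma.compress(d, preset=2), lzma.decompress),
    ]


def run_benchmark(data):
    """Run every candidate on the same data and return one result per candidate."""
    return [bench(name, c, d, data) for name, c, d in candidates()]


if __name__ == "__main__":
    # Usage: python bench.py /path/to/representative/file
    with open(sys.argv[1], "rb") as handle:
        payload = handle.read()
    for row in run_benchmark(payload):
        print(f"{row['name']:14s} ratio={row['ratio']:.3f} "
              f"compress={row['compress_s']:.2f}s "
              f"decompress={row['decompress_s']:.2f}s")
```

Run it against a representative slice of the real dataset (a tarball of the email folders, say), since the results depend heavily on the data and the machine.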


How The Backup Process Has Changed

Posted Dec 8, 2007 1:52 UTC (Sat) by roelofs (subscriber, #2599)

> Use none when
> - almost all files in the dataset are already compressed (DUH!)

Major omission, both here and in the main article: also use none when your backup medium (and/or the path to it, including RAM) may have errors. Both compression and encryption largely destroy any ability to recover data past the error location. (I discovered two bad bits in 1 GB of memory while verifying a backup to DVD+R.)
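
The effect of a single bad bit can be illustrated with a small Python sketch. This is only a demonstration under assumptions: it uses zlib (deflate, the algorithm behind gzip) on made-up sample text, and the helper names make_sample, flip_bit, salvage and matching_prefix are invented for the example. It flips one bit in the raw data and one bit in the compressed data, then measures how much of the original survives in each case.

```python
import random
import zlib


def make_sample(n_lines=5000, seed=0):
    """Build compressible, non-periodic text (made-up data for illustration)."""
    rng = random.Random(seed)
    words = ["backup", "restore", "tape", "dvd", "error", "block", "inode", "mail"]
    lines = (" ".join(rng.choice(words) for _ in range(8)) for _ in range(n_lines))
    return "\n".join(lines).encode("ascii")


def flip_bit(blob, byte_index, bit=0):
    """Return a copy of blob with a single bit inverted."""
    damaged = bytearray(blob)
    damaged[byte_index] ^= 1 << bit
    return bytes(damaged)


def salvage(compressed, chunk_size=64):
    """Decompress in small chunks, keeping whatever came out before a failure."""
    decomp = zlib.decompressobj()
    out = bytearray()
    try:
        for start in range(0, len(compressed), chunk_size):
            out += decomp.decompress(compressed[start:start + chunk_size])
        out += decomp.flush()
    except zlib.error:
        pass  # the stream is unusable from here on; keep what we already have
    return bytes(out)


def matching_prefix(a, b):
    """Number of leading bytes that a and b have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


if __name__ == "__main__":
    original = make_sample()

    # One bad bit in uncompressed data damages exactly one byte.
    raw_damaged = flip_bit(original, len(original) // 2)
    raw_bad = sum(x != y for x, y in zip(original, raw_damaged))
    print(f"uncompressed: {raw_bad} of {len(original)} bytes damaged")

    # One bad bit in compressed data typically costs everything after it.
    packed = zlib.compress(original, 6)
    recovered = salvage(flip_bit(packed, len(packed) // 2))
    intact = matching_prefix(original, recovered)
    print(f"compressed:   {len(original) - intact} of {len(original)} bytes lost")
```

Exactly how much is lost depends on where the bad bit lands and on the compressor. Block-oriented formats can sometimes be salvaged block by block, but this sketch does not attempt that.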

> Otherwise just use BZIP2

bzip2 is much, much slower than gzip on decompression, too. If it's read-once (or read-none), then that may not matter. But for read-many it's pretty bad. (I have no data on LZMA or other alternatives. Capacity is cheaper than CPU, however.)

Greg

Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.