Tux3: the other next-generation filesystem
Posted Dec 5, 2008 23:58 UTC (Fri) by njs (subscriber, #40338)
In reply to: Tux3: the other next-generation filesystem by ncm
Parent article: Tux3: the other next-generation filesystem
Well, maybe...
Within reason, my goal is to have as much confidence as possible in my data's safety, with as little investment of my time and attention as possible. Leaving safety up to individual apps is a pretty wretched system for achieving this -- it defaults to "unsafe", then I have to manually figure out which stuff needs more guarantees, which I'll screw up, plus I have to worry about all the bugs that may exist in the eleventeen different checksumming systems being used in different codebases... This is the same reason I do whole disk backups instead of trying to pick and choose which files to save, or leaving backup functionality up to each individual app. (Not as crazy an idea as it sounds -- that DVCS basically has its own backup system, for instance; but I'm not going around adding that functionality to my photo editor and word processor too.)
Obviously if checksumming ends up causing unacceptable slowdowns, then compromises have to be made. But I'm pretty skeptical; it's not like CRC (or even SHA-1) is expensive compared to disk access latency, and the Btrfs and ZFS folks seem to think usable full disk checksumming is possible.
If it's possible I want it.
Posted Dec 6, 2008 8:26 UTC (Sat)
by ncm (guest, #165)
Similarly, if your application is seek-bound, it's in trouble anyway. If performance matters, it should be limited by the sustained streaming capacity of the file system, and then delays from redundant checksum operations really do hurt.
Hence the argument for reliable metadata, anyway: the application can't do that for itself, and it had better not depend on metadata operations being especially fast. Traditionally, serious databases used raw block devices to avoid depending on file system metadata.
Posted Dec 6, 2008 8:55 UTC (Sat)
by njs (subscriber, #40338)
Backups are also great, but there are cases (slow quiet unreported corruption that can easily persist undetected for weeks+, see upthread) where they do not protect you.
(In some cases you can actually increase integrity too -- if your app checks its checksum when loading a file and it fails, then the data is lost but at least you know it; if btrfs checks a checksum while loading a block and it fails, then it can go pull an uncorrupted copy from the RAID mirror and prevent the data from being lost at all.)
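A minimal sketch of that difference, assuming CRC32 as the checksum and a plain list of byte strings standing in for the mirrored copies. The names `read_block`, `good`, and `corrupted` are hypothetical and only illustrate the idea; this is not how btrfs itself is implemented.

```python
import zlib


def read_block(copies, expected_crc):
    """Return the first copy whose CRC32 matches the stored checksum.

    `copies` stands in for the same block read from each mirror
    (hypothetical model, not a real filesystem API).
    """
    for data in copies:
        if zlib.crc32(data) == expected_crc:
            return data
    # Every copy failed verification: the loss is detected but not repaired.
    raise IOError("all copies failed checksum verification")


good = b"important data" * 100
expected = zlib.crc32(good)          # checksum stored when the block was written
corrupted = b"X" + good[1:]          # simulate silent corruption of one copy

# Application-level check with a single copy: corruption is detected, data is lost.
try:
    read_block([corrupted], expected)
    single_copy_ok = True
except IOError:
    single_copy_ok = False

# Filesystem-level check with a mirror: the bad copy is skipped, data survives.
recovered = read_block([corrupted, good], expected)
```

With one copy the check can only report the damage; with a second copy behind the same check, the read still succeeds.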
>If performance matters, it should be limited by the sustained streaming capacity of the file system, and then delays from redundant checksum operations really do hurt.
Again, I'm not convinced. My year-old laptop does SHA-1 at 200 MB/s (using one core only); the fastest hard drive in the world (according to storagereview.com) streams at 135 MB/s. Not that you want to devote a CPU to this sort of thing, and RAID arrays can stream faster than a single disk, but CRC32 goes *way* faster than SHA-1 too, and my laptop has neither RAID nor a fancy 15k RPM server drive anyway.
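Numbers like these are easy to check on any machine. A rough sketch, assuming Python's standard `hashlib` and `zlib` modules and an in-memory buffer (so disk speed does not enter into it); the helper `throughput_mb_s` is a hypothetical name, and results will vary with the CPU and the library build.

```python
import hashlib
import time
import zlib


def throughput_mb_s(fn, data, rounds=3):
    """Run fn(data) `rounds` times and return the rate in MB/s (10^6 bytes)."""
    start = time.perf_counter()
    for _ in range(rounds):
        fn(data)
    elapsed = time.perf_counter() - start
    return len(data) * rounds / elapsed / 1e6


buf = bytes(8 * 1024 * 1024)  # 8 MiB of zeros, held in memory

sha1_rate = throughput_mb_s(lambda b: hashlib.sha1(b).digest(), buf)
crc_rate = throughput_mb_s(zlib.crc32, buf)

print(f"SHA-1: {sha1_rate:.0f} MB/s")
print(f"CRC32: {crc_rate:.0f} MB/s")
```

Comparing the printed rates with the drive's sustained streaming rate shows whether the checksum or the disk would be the bottleneck on that particular machine.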
And anyway my desktop is often seek-bound, alas, and yours is too; it does make things slow, but I don't see why it should make me care less about my data.
Posted Dec 7, 2008 21:33 UTC (Sun)
by ncm (guest, #165)
