File checksums needed?

Posted Dec 6, 2008 18:57 UTC (Sat) by giraffedata (subscriber, #1954)
In reply to: Correctness by ncm
Parent article: Tux3: the other next-generation filesystem

A disk will happily write half a sector and scribble trash. Most times reading that sector will report a failure, but you only get reasonable odds.

Actually, I think the probability of reading such a sector without error indication is negligible. There are much more likely failure modes for which file checksums are needed. One is where the disk writes the data to the wrong track. Another is where it doesn't write anything but reports that it did. Another is that the power left the client slightly before the disk drive and the client sent garbage to the drive, which then correctly wrote it.

I've seen a handful of studies that showed these failure modes, and I'm pretty sure none of them showed simple sector CRC failure.

If sector CRC failure were the problem, adding a file checksum would probably be no better than just using a stronger sector CRC.
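To illustrate why a checksum kept apart from the data catches the failure modes above when a per-sector CRC cannot, here is a toy sketch. The `CheckedStore` class and its `lands_at` and `lost` parameters are invented for illustration and do not model any real filesystem or drive; the dictionary `disk` stands in for the drive and `sums` for checksums stored elsewhere.

```python
import zlib


class CheckedStore:
    """Toy model: `disk` is the drive, `sums` holds checksums kept elsewhere."""

    def __init__(self):
        self.disk = {}   # block number -> data actually on the platter
        self.sums = {}   # block number -> checksum of the data we meant to write

    def write(self, block_no, data, lands_at=None, lost=False):
        # The checksum records what was intended, independent of the drive.
        self.sums[block_no] = zlib.crc32(data)
        if lost:
            return  # drive reports success but writes nothing
        # A misdirected write lands on a different block than requested.
        target = block_no if lands_at is None else lands_at
        self.disk[target] = data

    def read_ok(self, block_no):
        # Every block here would pass a sector CRC; only the separately
        # stored checksum reveals that the contents are not what we wrote.
        data = self.disk.get(block_no, b"")
        return zlib.crc32(data) == self.sums.get(block_no)


store = CheckedStore()
store.write(1, b"old contents")
store.write(1, b"new contents", lost=True)       # lost write
store.write(2, b"block two")
store.write(3, b"block three")
store.write(2, b"updated two", lands_at=3)        # misdirected write
```

In this sketch blocks 1, 2 and 3 all fail `read_ok`, even though each holds data that was written intact and would read back without a sector error.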



File checksums needed?

Posted Dec 16, 2008 1:57 UTC (Tue) by daniel (subscriber, #3181)

There are much more likely failure modes for which file checksums are needed. One is where the disk writes the data to the wrong track. Another is where it doesn't write anything but reports that it did. Another is that the power left the client slightly before the disk drive and the client sent garbage to the drive, which then correctly wrote it.

Scribble on final write is something we plan to detect, by checksumming the commit block. I seem to recall reading that SGI ran into hardware that would lose power to the memory before the drive controller lost its power-good, and had to do something special in XFS to survive it. It would be better if hardware were engineered not to do that.
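As a rough sketch of what checksumming a commit block can look like (this is not Tux3's actual on-disk format; the `MAGIC` marker, field layout, and function names are assumptions made up for illustration), the writer appends a CRC32 over the block and the reader refuses to honor a commit whose checksum does not match:

```python
import struct
import zlib

MAGIC = b"CMIT"       # hypothetical marker, not a real on-disk format
HEADER_FMT = "<QI"    # sequence number (u64), payload length (u32)
HEADER_LEN = len(MAGIC) + struct.calcsize(HEADER_FMT)


def make_commit_block(sequence: int, payload: bytes) -> bytes:
    """Build magic + sequence + length + payload, followed by a CRC32."""
    body = MAGIC + struct.pack(HEADER_FMT, sequence, len(payload)) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def commit_is_valid(block: bytes) -> bool:
    """Return True only if the block is complete and its checksum matches."""
    if len(block) < HEADER_LEN + 4 or block[:len(MAGIC)] != MAGIC:
        return False
    _, length = struct.unpack(HEADER_FMT, block[len(MAGIC):HEADER_LEN])
    if len(block) != HEADER_LEN + length + 4:
        return False
    body = block[:-4]
    (stored,) = struct.unpack("<I", block[-4:])
    return zlib.crc32(body) == stored


good = make_commit_block(7, b"transaction metadata")
# Simulate a scribbled final write: the tail of the block is trash.
scribbled = good[:20] + b"\xff" * (len(good) - 20)
```

On replay, a commit block that fails `commit_is_valid` would be treated as if the transaction never committed, so the filesystem falls back to the previous consistent state.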

Please, stop...

Posted Dec 20, 2008 3:31 UTC (Sat) by sandeen (guest, #42852)

Can we just drop the whole "XFS expects and/or works around special hardware" meme? This has been kicked around for years without a shred of evidence. I may as well assert that XFS requires death-rays from Mars for proper functionality.

XFS, like any journaling filesystem, expects that when the storage says data is safe on disk, it is safe on disk and the filesystem can proceed with whatever comes next. That's it; no special capacitors, no power-fail interrupts, no death-rays from Mars. There is no specialness required (unless you consider barriers to prevent re-ordering to be special, and XFS is not unique in that respect either).

Please, stop...

Posted Dec 20, 2008 3:55 UTC (Sat) by giraffedata (subscriber, #1954)

You must have seriously misread the post to which you responded. It doesn't mention special features of hardware. It does mention special flaws in hardware and how XFS works in spite of them.

I too remember reports that in testing, systems running early versions of XFS didn't work because XFS assumed, like pretty much everyone else, that the hardware would not write garbage to the disk and subsequently read it back with no error indication. The testing showed that real-world hardware does in fact do that and, supposedly, XFS developers improved XFS so it could maintain data integrity in spite of it.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds