
Fighting massive data loss bugs


Posted Jul 23, 2009 21:42 UTC (Thu) by tialaramex (subscriber, #21167)
In reply to: Fighting massive data loss bugs by Cato
Parent article: Fighting small bugs

My #1 guess would be failing RAM. DIMMs don't often go bad, but it does happen, and there is nothing in most PCs that will detect it. You just start to see the wrong bits, and since most of those bits are either coming from or going to files, it's easy to blame the filesystem.

Filesystem bugs are like any other bugs: they tend to be repeatable, and they do something stupid and wrong but not entirely ridiculous (e.g. they don't flip a few bits in the middle of a file, but they might overwrite an entire block with something else). If you see weird problems, and especially if you see problems that don't have any clear pattern, that's _much_ more likely to be bad RAM.
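
If you happen to have a known-good copy of a damaged file, the distinction above can be checked roughly. The following is only a sketch under assumptions of mine: the function name classify_corruption, the 4096-byte block size, and the cutoff of 8 flipped bits per block are all hypothetical choices for illustration, not anything a real tool uses. It compares the two copies block by block and counts how many bits differ in each damaged block.

```python
BLOCK_SIZE = 4096  # assumed block size; adjust to match the filesystem
MAX_FLIPS = 8      # arbitrary cutoff: at most this many bits counts as "a few"


def classify_corruption(good, bad, block_size=BLOCK_SIZE, max_flips=MAX_FLIPS):
    """Compare a known-good copy with a suspect copy, block by block.

    Returns a dict with two lists of (block_index, differing_bits) tuples:
    blocks with only a few flipped bits, and blocks that differ heavily.
    """
    if len(good) != len(bad):
        raise ValueError("copies differ in length; compare equal-sized data")

    report = {"bit_flip_blocks": [], "rewritten_blocks": []}
    for offset in range(0, len(good), block_size):
        a = good[offset:offset + block_size]
        b = bad[offset:offset + block_size]
        if a == b:
            continue  # this block is undamaged
        # XOR each byte pair and count the set bits to get the differing bits.
        differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
        key = "bit_flip_blocks" if differing <= max_flips else "rewritten_blocks"
        report[key].append((offset // block_size, differing))
    return report
```

Call it with the contents of both files read in binary mode. Entries under "bit_flip_blocks" are the pattern I'd associate with bad RAM, while entries under "rewritten_blocks" look more like software writing the wrong thing to the wrong place. Neither result is proof, so treat it as a hint about where to look next.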



Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds