ReiserFS's stability is not actually quite good

Posted Apr 28, 2006 4:46 UTC (Fri) by zooko (subscriber, #2589)
In reply to: ReiserFS's stability is not actually quite good by nix
Parent article: Filesystems (ext3, reiser, xfs, jfs) comparison on Debian Etch (Debian Administration)

You would prefer that the filesystem optimistically ignore errors when it can, and that the admin restore from backups if that goes bad. This is a reasonable strategy for some workloads, where availability is more important than correctness in the short term, and the admin can manually restore correctness in the long term. If you have such a workload, you should probably not use ReiserFS. If on the other hand you have a workload where correctness is more important than availability, then ReiserFS would be a good tool for that use.

Regards,

Zooko


ReiserFS's stability is not actually quite good

Posted Apr 29, 2006 11:02 UTC (Sat) by nix (subscriber, #2304)

No, of course not. If an error is detected, the FS should go read-only as far as is possible so that data recovery can continue and the system can at least stay mostly running.

This is trivially possible: ext2fs and ext3fs both do it. That reiserfs does not is not to its credit.

ReiserFS's stability is not actually quite good

Posted May 8, 2008 15:57 UTC (Thu) by zooko (subscriber, #2589)

Hello.  The last comment in this thread was two years ago, and here I am to add another one.
:-)

My comment is that this fault-injection analysis:

https://www.cs.wisc.edu/wind/Publications/iron-sosp05.pdf

says that what ext2fs and ext3fs do for many of the errors that they measured is nothing, i.e.
carry on as if nothing happened.  What reiserfs does for those same errors is stop.  You could
argue that switching to read-only mode would be better than stopping.  I don't know about
that.  Perhaps the "propagate" option in iron-sosp05.pdf would be better than the "stop"
option because then outer layer code (i.e. the kernel or even userland) can detect the error
and remount read-only.
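
To make the difference between these options concrete, here is a minimal Python sketch. It is a toy model, not how any real filesystem is implemented: the dict-backed "device", the `Policy` names, the two exception classes, and the `Mount` wrapper are all hypothetical names made up for illustration. It only shows why "propagate" leaves the outer layer free to choose read-only mode, while "stop" takes that choice away.

```python
from enum import Enum


class Policy(Enum):
    IGNORE = "ignore"        # carry on as if nothing happened
    STOP = "stop"            # refuse to continue at all
    PROPAGATE = "propagate"  # report the error to the caller


class FilesystemStopped(Exception):
    """Hypothetical: the filesystem refuses all further work."""


class BlockReadError(Exception):
    """Hypothetical: a failed read reported to the outer layer."""


def read_block(device, block_no, policy):
    """Read one block from a dict-backed toy device under a given policy."""
    data = device.get(block_no)
    if data is not None:
        return data
    if policy is Policy.IGNORE:
        return b""  # pretend the read succeeded; risks silent corruption
    if policy is Policy.STOP:
        raise FilesystemStopped(f"block {block_no} unreadable")
    raise BlockReadError(f"block {block_no} unreadable")


class Mount:
    """Toy outer layer that reacts to a propagated error by going read-only."""

    def __init__(self, device):
        self.device = device
        self.read_only = False

    def read(self, block_no):
        try:
            return read_block(self.device, block_no, Policy.PROPAGATE)
        except BlockReadError:
            self.read_only = True  # stay mounted, but accept no more writes
            return None

    def write(self, block_no, data):
        if self.read_only:
            raise PermissionError("filesystem is read-only")
        self.device[block_no] = data
```

With `Policy.PROPAGATE` the caller still gets to decide what happens next; with `Policy.STOP` the decision has already been made for it, and with `Policy.IGNORE` nobody ever finds out there was a decision to make.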

But, as far as comparing the safety of ext2fs and ext3fs vs. reiserfs, the iron-sosp05.pdf
document seems to make it clear that ext2fs and ext3fs err on the side of increased
availability at the cost of higher risk of corruption, where reiserfs errs on the side of
increased correctness at the cost of higher risk of unavailability.

ReiserFS's stability is not actually quite good

Posted May 8, 2008 22:18 UTC (Thu) by nix (subscriber, #2304)

I'm still here :)

That increased availability is important.

As I see it, there are two types of file storage one might be interested 
in. There's files for which availability is more important than 
correctness-of-content-under-errors, and there are files where integrity 
is all.

The former case should be handled by detecting errors and going read-only 
(i.e. what ext2+ does, only perhaps with added integrity hashes so you can 
spot more failure cases). The latter case should be handled by making *the 
filesystem objects that are corrupted* unreadable (not the whole disk 
unless there's so much corruption that you can't be sure of anything).
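
A rough sketch of those two cases, in Python. Everything here is an assumption for illustration: the `ToyFS` class, its `integrity_first` flag, and the per-object SHA-256 hash are hypothetical and do not describe ext2, ext3, or reiserfs. In the availability case the sketch also returns the suspect data on a best-effort basis, which is one possible reading of "going read-only", not the only one.

```python
import hashlib


class ToyFS:
    """Hypothetical in-memory store keeping one SHA-256 hash per object."""

    def __init__(self, integrity_first):
        self.integrity_first = integrity_first
        self.objects = {}         # name -> (data, hex digest at write time)
        self.read_only = False    # whole-filesystem state
        self.quarantined = set()  # individual objects marked unreadable

    def write(self, name, data):
        if self.read_only:
            raise PermissionError("filesystem is read-only")
        self.objects[name] = (data, hashlib.sha256(data).hexdigest())

    def read(self, name):
        if name in self.quarantined:
            raise OSError(f"{name} is corrupt and unreadable")
        data, digest = self.objects[name]
        if hashlib.sha256(data).hexdigest() == digest:
            return data
        if self.integrity_first:
            # Integrity is all: fence off only the damaged object.
            self.quarantined.add(name)
            raise OSError(f"{name} is corrupt and unreadable")
        # Availability first: stop accepting writes, keep serving reads.
        self.read_only = True
        return data
```

In the integrity-first mode a damaged object becomes unreadable while the rest of the store stays fully usable; in the availability-first mode the whole store goes read-only but everything can still be copied off.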

Files that satisfy the former constraint are far more common than those 
that satisfy the latter, because you almost always want the fs to be 
mountable so you can recover as much as possible before hitting the 
backups. Files that satisfy the latter constraint... well, I'm trying to 
think of any and I'm coming up with cryptographic keys. (Definitely not 
financial data: I work in that industry, and what matters there is 
availability above all. If one bit is flipped you cope and carry on, you 
don't go unavailable: after all, your competitors haven't stopped just 
because you're having system problems...)


Of course neither fs satisfies these constraints: reiserfs stops too hard 
(and panics the whole machine!), ext* doesn't spot enough failures before 
they cascade into something horrid.


Recently I had a failure mode where the heads didn't bother to move after 
a journal write, leading to rubbish dumped into the journal. The drive 
went more demented shortly after that, mashed another fs, and the machine 
got rebooted... and ext3 thought ooh, we can just roll the journal 
forward! Oops, wrong, that scattered more corruption around, and because 
no fsck had been done the first we knew of it was when a dozen NFS mounts 
from that filesystem suddenly went read-only. e2fsck did a sterling job 
and got essentially everything back, even though we had part of an inode 
table claimed simultaneously by the journal inode and by a bunch of 
logfiles, with a bunch of logfile output written into it in place of 
inodes. (e2fsck wasn't stupid and realised that this was in an inode table 
so the other two files must be liars. Obviously we lost the metadata for 
most of those files, but all the important stuff came back OK. Amazing.)

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds