Maybe btrfs has no fsck,

Posted Jun 17, 2009 17:54 UTC (Wed) by drag (subscriber, #31333)
In reply to: Maybe btrfs has no fsck, by anselm
Parent article: What ever happened to chunkfs?

Yes. Reiserfs needs a fsck... whether or not it provides one is kinda irrelevant. I think that a fsck for reiserfs is provided, but I am not sure. Maybe only for later versions. Never used it much myself.

One of the strengths of Ext3 over XFS and Reiserfs is its fsck. The journalling features of XFS and Reiserfs only protect the filesystem (aka metadata) from corruption; they do not help protect your actual data or detect problems with your data. For that you need to run a fsck, as you can with Ext3.


Maybe btrfs has no fsck,

Posted Jun 17, 2009 21:36 UTC (Wed) by anselm (subscriber, #2796)

What I've heard about the Reiserfs fsck is that it will, among other issues, mistake a file system superblock in the middle of a partition for the start of a whole new file system and get thoroughly confused. With virtualisation in widespread use and people keeping file system images in files this is more of a problem than it used to be when Reiserfs was new.

If this is actually true, it is one more reason to bury Reiserfs deeply, with a wooden stake driven through its heart.

Maybe btrfs has no fsck,

Posted Jun 18, 2009 8:46 UTC (Thu) by pcampe (guest, #28223)

I vaguely remember the problem you have described, something like a report from a key kernel developer; I am not sure whether ReiserFS was the file system involved, nor whether a workaround has been implemented.

Maybe btrfs has no fsck,

Posted Jun 18, 2009 11:33 UTC (Thu) by nye (guest, #51576)

That was basically FUD. In reiserfsck there's an option to rebuild the filesystem entirely - you're essentially telling it 'this filesystem is trashed; just look through the disk for anything that might be a valid filesystem structure and cobble it together if you can'.

The 'problem' of mistaking a superblock in some image you have somewhere for the start of a new fs is *exactly what you asked it to do*, so those who complained so loudly about it really have nobody to blame but themselves.
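To illustrate why a "scan the whole disk" rebuild cannot tell a stored image apart from the host filesystem, here is a minimal sketch. The signature `MAGIC`, the block size `BLOCK` and both helper functions are hypothetical; this is not the reiserfs on-disk format nor what reiserfsck actually does, only the general idea of a raw signature scan.

```python
MAGIC = b"FSMAGIC!"   # hypothetical signature, not a real on-disk format
BLOCK = 4096          # hypothetical block size


def make_fs(blocks: int) -> bytearray:
    """Build a toy filesystem: zeroed blocks with MAGIC at the very start."""
    fs = bytearray(blocks * BLOCK)
    fs[:len(MAGIC)] = MAGIC
    return fs


def find_signatures(disk: bytes) -> list[int]:
    """Return the byte offset of every block that starts with MAGIC."""
    return [off for off in range(0, len(disk), BLOCK)
            if disk[off:off + len(MAGIC)] == MAGIC]


guest = make_fs(4)                    # a VM image kept as an ordinary file
host = make_fs(16)
host[8 * BLOCK:12 * BLOCK] = guest    # the image's data happens to sit at block 8

hits = find_signatures(bytes(host))
print(hits)  # [0, 32768]: the host's own start and the embedded image
```

A scanner that only looks at raw blocks sees two equally plausible candidates; without intact metadata saying that block 8 is file data, it has nothing to choose between them.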

It'll be funny if it was not so sad

Posted Jun 18, 2009 12:16 UTC (Thu) by khim (subscriber, #9252)

> The 'problem' of mistaking a superblock in some image you have somewhere for the start of a new fs is *exactly what you asked it to do*, so those who complained so loudly about it really have nobody to blame but themselves.

Yup. They made one mistake, but that mistake was grave: they assumed they could trust reiserfs. I've seen a few cases where a tiny number of bad blocks killed reiserfs completely: nothing except this "cobble it together if you can" option worked, and the "cobbled together" filesystem was a mess (because there were some virtual images on that filesystem). Note: this is exactly the type of corruption SSDs show in the real world. If reiserfs's "gentle" fsck does not work and the "last resort" approach is unusable, then the only solution is to switch to another filesystem...

It'll be funny if it was not so sad

Posted Jun 18, 2009 16:29 UTC (Thu) by nye (guest, #51576)

Meh, the only FS I've ever lost data to - aside from understandably unrecoverable data on damaged disk blocks - was XFS[0], and I used reiser3 extensively until a couple of years ago when it was obvious that ship had sailed.

Perhaps I merely got lucky with my disks not failing in exactly the wrong way, but there's always going to be an anecdote for everything.

[0](it's certainly the only FS that I actively despise, and would unreservedly recommend against in all circumstances)

The problem is: disk DO fail and reiserfs is TOTALLY unready

Posted Jun 19, 2009 10:58 UTC (Fri) by khim (subscriber, #9252)

> Meh, the only FS I've ever lost data to - aside from understandably unrecoverable data on damaged disk blocks

But the bad blocks DO exist in real life - you can not just ignore them! The Reiserfs design NEVER considered this fact of life: if you have ONE bad block in the wrong place, you are 100% screwed. With a sector size of 512 bytes and an HDD size of 2TiB, the loss of about 0.00000002% of your data means 100% of your stuff is lost. This is not even funny.

XFS is also not a good idea (I was bitten by it too) - but there we have a bad implementation, not a bad design. An implementation can be fixed; a design mistake is unfixable.
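For anyone who wants to check the proportion of one 512-byte sector to a 2TiB disk, the arithmetic can be sketched as follows. The variable names are only illustrative.

```python
SECTOR_BYTES = 512
DISK_BYTES = 2 * 2**40          # 2 TiB

# One sector as a fraction of the whole disk: 2**9 / 2**41 = 2**-32.
fraction = SECTOR_BYTES / DISK_BYTES
percent = fraction * 100

print(f"fraction = {fraction:.3e}")
print(f"percent  = {percent:.10f}%")
```

That works out to roughly two hundred-millionths of a percent of the disk.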

The problem is: disk DO fail and reiserfs is TOTALLY unready

Posted Jun 22, 2009 16:22 UTC (Mon) by nye (guest, #51576)

>But the bad blocks DO exist in real life - you can not just ignore them!

I certainly don't disagree; I was referring to the data actually on the damaged part of the disk, which I wouldn't reasonably expect to be able to recover without great expense.

It'll be funny if it was not so sad

Posted Jun 21, 2009 6:47 UTC (Sun) by cventers (guest, #31465)

I was a reiser3 user for quite a while and often sang its praises. A few
years into that stretch, my PC started having stability problems which I
tracked to bad RAM.

I caught the bad RAM pretty quickly, and considered myself lucky that I
hadn't obviously lost any big chunks of data... I had seen the ReiserFS
journal check making some noise in dmesg but everything seemed to work.

However, replacing the RAM didn't solve the stability problem. The nature
of the problem changed... it became a random system freeze. At the time, I
didn't realize that I had a new problem - hidden filesystem corruption.

After a couple of big scares with "md" after the system had randomly
frozen, I made a full backup of the filesystem. I continued using the
computer, but the stability problem seemed to be getting worse. I
installed a brand new monster power supply and over the course of the next
month or two I burned a lot of money replacing the rest of the system,
thoroughly confused that I hadn't nailed the problem. (Mockingly, it often
seemed that replacing a part would make the problem go away for a day or
two, leading me to believe I'd fixed it until it slapped me in the face in
the middle of my work yet again.)

My full filesystem backup became handy after I was unable to bring the
filesystem online one time. reiserfsck made lots of noise about problems
with my data and was unable to repair it. I was frustrated to have lost a
month's worth of data, but thrilled that I had a backup at all.

Sadly, I lost the filesystem a few more times and burned even more time
and money on the computer before I realized that with all the hardware
having been replaced, I needed to consider what I had considered to be the
unlikely cause: the software. I became suspicious of reiserfs. This time,
rather than restoring again from my old reiserfs image, I made an ext3
partition, mounted the reiserfs image read-only and migrated.

My system never froze again.

I don't know enough about the reiserfs design to know how plausible my
hypothesis is, but it seems that the bad RAM I dealt with a long time ago
had led to a reiserfs filesystem which was "doomed". I assume the bad RAM
provided the initial corruption, some sort of corruption that made the
reiserfs kernel code fall on its face. Sometimes, the system accessed the
"wrong" bit of corrupted data and the kernel would panic or hang somewhere
inside reiserfs, spreading the corruption in the process.

There's a shocking bit of irony in this particular failure mode. Because
the backup I always restored from was a reiserfs image taken with dd, the
only way I was ever going to escape the crashes and repeated loss of my
data was to abandon reiserfs.

Maybe btrfs has no fsck,

Posted Jun 18, 2009 11:37 UTC (Thu) by viiru (subscriber, #53129)

> One of the strengths of Ext3 over XFS and Reiserfs is its fsck. The
> journalling features of XFS and Reiserfs only protect the filesystem (aka
> metadata) from corruption; they do not help protect your actual data or
> detect problems with your data. For that you need to run a fsck, as you
> can with Ext3.

Well. Actually XFS has both the ability to recover from an unclean shutdown without fsck, and a full-featured repair tool. Don't be confused by the fact that the fsck.xfs tool is essentially /bin/true. The repair tool exists, but it goes by the name of xfs_repair.

I've been using XFS on Linux in production on most of my machines for the past six years or so, and have needed to run xfs_repair twice. Haven't lost any files, either.

The "it eats your filez"-reputation of XFS has been greatly exaggerated.

If only

Posted Jun 18, 2009 12:20 UTC (Thu) by khim (subscriber, #9252)

> The "it eats your filez"-reputation of XFS has been greatly exaggerated.

I had a rock-solid way to reproduce this effect: run a bittorrent client on a 100% full filesystem. Sure, this is not a nice thing to do to a filesystem (and currently btrfs does not handle this case all that well), but stuff happens. If I can not trust my filesystem in such conditions, how can I trust it at all?

It looks like the XFS problems are in the past, but trust is easy to lose and hard to resurrect - now I'm firmly in the ext3 camp.

Maybe btrfs has no fsck,

Posted Jun 23, 2009 6:18 UTC (Tue) by nix (subscriber, #2304)

Journals do not protect the filesystem metadata from corruption. They
protect it from being in an inconsistent state after a crash (i.e. part of
the metadata written, other parts not written).

If a kernel bug, cosmic-ray-induced bitflip, or transient drive bug
corrupts the filesystem you still need a fsck. And in the end, that *will*
happen, even with PCIe and ECCRAM: after all, the *CPU* doesn't checksum
everything inside itself, and ECCRAM can't detect all possible failure
modes.

Maybe btrfs has no fsck,

Posted Jun 24, 2009 19:19 UTC (Wed) by salimma (subscriber, #34460)

That's why Btrfs, ZFS and (I think) Dragonfly BSD's HammerFS have checksums for each block.
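As a rough illustration of what a per-block checksum buys you, here is a minimal sketch. CRC-32 from Python's zlib stands in for whatever checksum a real filesystem uses, and `write_block` / `verify_block` are hypothetical helpers, not any filesystem's API.

```python
import zlib


def write_block(data: bytes) -> tuple[bytes, int]:
    """Pretend to write a block: return the data plus the checksum stored with it."""
    return data, zlib.crc32(data)


def verify_block(data: bytes, stored_csum: int) -> bool:
    """On read, recompute the checksum and compare it with the stored one."""
    return zlib.crc32(data) == stored_csum


block, csum = write_block(b"important metadata" * 10)

damaged = bytearray(block)
damaged[5] ^= 0x01                     # simulate a single flipped bit

ok_clean = verify_block(block, csum)
ok_damaged = verify_block(bytes(damaged), csum)
print(ok_clean, ok_damaged)            # True False
```

The checksum only detects the damage; repairing it still needs a good copy from somewhere else.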

Maybe btrfs has no fsck,

Posted Jun 26, 2009 11:16 UTC (Fri) by mangoo (guest, #32602)

How would that help if I, for example, copy one block and its checksum into another area of the disk? Essentially, the block will be valid (checksum matches), but the filesystem will be corrupted.

In a virtual environment, it's not so hard to make such a mistake: just accidentally mount the filesystem twice (i.e. from a guest and a host), and two different kernels will write correct blocks all over, each one corrupting the filesystem.
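The first scenario (a block and its checksum copied to the wrong place) can be sketched like this. Folding the block number into the checksum is an assumption made for illustration, not something claimed in this thread about any particular filesystem, and it does nothing for the double-mount case, where every block written is individually valid. `plain_csum` and `located_csum` are hypothetical helpers.

```python
import zlib


def plain_csum(data: bytes) -> int:
    """Checksum over the block contents only."""
    return zlib.crc32(data)


def located_csum(data: bytes, block_no: int) -> int:
    """Checksum over the block number followed by the contents."""
    return zlib.crc32(block_no.to_bytes(8, "little") + data)


data = b"x" * 64
disk_plain = {3: (data, plain_csum(data))}
disk_located = {3: (data, located_csum(data, 3))}

# Simulate the misplaced copy: block 3 and its checksum land on block 7.
disk_plain[7] = disk_plain[3]
disk_located[7] = disk_located[3]

# Read block 7 back and verify it under each scheme.
plain_ok = plain_csum(disk_plain[7][0]) == disk_plain[7][1]
located_ok = located_csum(disk_located[7][0], 7) == disk_located[7][1]
print(plain_ok, located_ok)   # True False
```

With a contents-only checksum the misplaced block verifies fine, exactly as described above; once the location is part of what gets checksummed, the same copy fails verification at its new address.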


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds