Whither btrfsck?

Posted Oct 11, 2011 21:48 UTC (Tue) by jmorris42 (subscriber, #2203)
Parent article: Whither btrfsck?

Sorry, I'm not understanding this argument. The code should be out there. If it isn't believed to be safe the distributions shouldn't package it, but people who want to try and understand it should be able to try to contribute. It couldn't hurt and might help. Not seeing the downside.

And perhaps, had the fsck tool been developed in the open alongside the filesystem itself, people developing the filesystem might have been motivated to add to the checking tool when adding a filesystem feature. And if they couldn't figure out how to fsck that feature, they might have rethought its implementation to include sufficient information to allow recovery in as many situations as they could plan for. Filesystem checking isn't something to worry about after the filesystem's on-disk format is set in stone.

If someone is desperate enough to go pull a git repo, build a tool with UNSTABLE written all over it and run it, they probably understand that they get to keep the pieces. It would be a desperation move anyway; things would already be terribly wrong, so why not let them give it a go? At least you might get a bug report out of it. What is the answer right now? Too bad, so sad, time to reformat? Even a buggy fsck tool beats that answer.


Whither btrfsck?

Posted Oct 11, 2011 22:04 UTC (Tue) by dlang (✭ supporter ✭, #313)

I also don't buy this argument.

How is it better to leave people with a corrupted filesystem and no tool that can recover it than a corrupted filesystem with a tool that may fix the problem or may corrupt it more?

In the first case they will always have a corrupted filesystem; in the second case they may have a corrupted filesystem, or they may be able to recover it.

It's not as if people who find their system is corrupted are going to keep the corrupted image around for years until a fsck program is available (and for that matter, without something to check the filesystem, how will they even know that it's corrupted?).

Whither btrfsck?

Posted Oct 12, 2011 0:32 UTC (Wed) by lordsutch (guest, #53)

"for that matter, without something to check the filesystem, how will they even know that it's corrupted?"

More than likely the same way as with any other filesystem: kernel messages when file accesses are attempted. I don't know the default settings in mke2fs these days, but even if they're relatively conservative (on the order of every 10-20 mounts/30 days) a full filesystem check is a rare event on most Linux boxes.

Not to say that btrfsck isn't needed, but most of the time you're relying on what the kernel filesystem code is doing to maintain integrity no matter whether fsck is available or not.

Whither btrfsck?

Posted Oct 12, 2011 4:04 UTC (Wed) by dlang (✭ supporter ✭, #313)

Some forms of corruption will cause kernel messages to pop up; other forms will be silent and can gradually destroy other files the longer you use the filesystem.

What does the ck mean?

Posted Oct 12, 2011 2:42 UTC (Wed) by ncm (subscriber, #165)

This last line seems to me the key to the whole problem.

You can't (further) corrupt a file system if you don't
write to it. There's a great deal of checking that you can
only afford to do when some program isn't waiting for read()
to come back. On btrfs, if it's designed right, you should
be able to run a consistency checker in background on live
file systems.

It's no fun to be informed that your file system is corrupt
and, further, that it can't be fixed, but that's much better
than *not* being informed that your file system is corrupt,
when it is. The sooner you find out, the fewer backups will
also be corrupt. A tool that can only constrain the locus of
the corruption would still be helpful; only the faulty part
needs to be reloaded from backups.

A widely used checker would result in better bug reports for the
file system proper, as corruption is found early. How many bugs
are still waiting to be found just because nothing is looking?

The way forward, then, is to release a pure checker, first,
and then begin to release repair capabilities one at a time as
they become ready. If the repair tool generated a journal of
changes without writing them to the file system proper, then
you could run a full check on the sum of fs+journal, and only
commit the changes if the result is clearly better than before.
Ideally the repair machinery would actually be the same well
tested code that, in production, integrates more usual changes
into the file system.
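
A minimal sketch of the journal idea above, assuming a toy "file system" that is just a dict mapping block ids to the block each one points at (or None). The names count_errors, propose_repairs and repair are hypothetical and have nothing to do with the real btrfs tools; the point is only that the checker runs on fs+journal before anything is written.

```python
from collections import ChainMap


def count_errors(blocks):
    """Toy consistency check: count pointers that reference a missing block."""
    return sum(
        1
        for target in blocks.values()
        if target is not None and target not in blocks
    )


def propose_repairs(blocks):
    """Build a journal (block -> new pointer) without modifying `blocks`."""
    return {
        block: None  # toy repair: clear the dangling pointer
        for block, target in blocks.items()
        if target is not None and target not in blocks
    }


def repair(blocks):
    """Commit the journal only if fs+journal checks strictly better than fs."""
    before = count_errors(blocks)
    journal = propose_repairs(blocks)
    overlay = ChainMap(journal, blocks)  # reads see journal entries first
    after = count_errors(overlay)
    if after < before:
        blocks.update(journal)  # the only write to the "file system"
        return True
    return False
```

With fs = {1: 2, 2: 99, 3: None}, the check finds one dangling pointer, the journal proposes clearing it, and the commit happens only because the overlay checks cleaner than the original. A second call to repair(fs) writes nothing.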

What does the ck mean?

Posted Oct 13, 2011 23:42 UTC (Thu) by NRArnot (subscriber, #3033)

Exactly what I was thinking. A slightly corrupt filesystem is likely to become a seriously corrupt filesystem and then a heap of completely useless bytes, if no-one gets to know about the corruption while it is too minor to be screamingly obvious.

Especially with a new filesystem like btrfs it would make sense to combine backup with fsck. Do the backup, run btrfsck, and only if the filesystem structure checks out AOK allow it to continue being used and recycle the oldest of your backups.
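
A rough sketch of that backup-then-check rotation, assuming the check is passed in as a callable that returns True for a clean filesystem. The function name, the keep parameter and the snapshot labels are all made up for illustration; a real setup would wrap whatever checker is actually available.

```python
from collections import deque


def backup_then_check(snapshot, backups, check, keep=3):
    """Store a new backup, run the check, recycle old backups only if clean.

    `backups` is a deque ordered oldest-first. `check` is a hypothetical
    callable standing in for a filesystem checker.
    """
    backups.append(snapshot)  # back up first, before trusting anything
    if not check():
        return False  # corruption suspected: keep every backup we have
    while len(backups) > keep:
        backups.popleft()  # recycle the oldest backup
    return True
```

If the check fails, nothing is recycled, so the older backups that may predate the corruption are still there when you need them.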

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds