ext3 metaclustering

Posted Jan 17, 2008 17:30 UTC (Thu) by bronson (subscriber, #4806)
In reply to: ext3 metaclustering by Velmont
Parent article: ext3 metaclustering

... except when a tiny inconsistency spreads and ends up corrupting half of your partition
(the half with the presentation on it of course).

Oh, if only you'd run periodic fscks!  The corruption would have been caught early and fixed
without you ever knowing about it.   :-P

ext3 metaclustering

Posted Jan 17, 2008 18:05 UTC (Thu) by sbergman27 (guest, #10767) [Link]

I'd like to see some proof that this really happens in the real world.  That such spreading of
a tiny corruption to destroy one's whole file system is a real concern for real people.  Sure,
ext3 can, rarely, experience serious corruption.  But how do we know whether or not it started
out as a tiny problem?  Until it's proven, it sounds like hearsay to me.

ext3 metaclustering

Posted Jan 17, 2008 22:43 UTC (Thu) by rfunk (subscriber, #4054) [Link]

Disks deteriorate underneath the filesystem.  Usually it's age causing 
increasingly large bad spots.  The filesystem may have no bugs, but things 
can happen to the disk to corrupt the filesystem.  I'd rather know about 

ext3 metaclustering

Posted Jan 17, 2008 23:15 UTC (Thu) by sbergman27 (guest, #10767) [Link]

Then have "badblocks" run independently of e2fsck at 3AM once a week or so?  Instead of
forcing the user and conference attendees to sit through a mandatory hour-long e2fsck at the
whim of the machine, hoping that the e2fsck happens to catch what badblocks is far better
designed to catch?

ext3 metaclustering

Posted Jan 17, 2008 23:31 UTC (Thu) by rfunk (subscriber, #4054) [Link]

Hour-long e2fsck?

1.  Partition your disk -- root, /var, /usr, /home on different filesystems.
2.  If you ever do enable the automatic checks, set the mounts-before-check count on 
each filesystem to be a different prime number.  That way multiple filesystems almost 
never get checked at the same time.
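
The prime-number suggestion can be checked with a little arithmetic. Two filesystems with maximum mount counts a and b both come due on the same boot roughly every lcm(a, b) mounts, assuming both are mounted on every boot and their counters start together. A rough sketch in shell; the counts 23 and 29 and the device names in the comment are made-up examples, and `tune2fs -c` is the e2fsprogs option that sets the maximum mount count:

```shell
# Greatest common divisor via Euclid's algorithm.
gcd() (
    a=$1 b=$2
    while [ "$b" -ne 0 ]; do
        t=$((a % b))
        a=$b
        b=$t
    done
    echo "$a"
)

# Number of boots between simultaneous checks of two filesystems whose
# maximum mount counts are $1 and $2 (assumes both are mounted on every
# boot and both counters start at zero).
boots_between_double_checks() (
    g=$(gcd "$1" "$2")
    echo $(( $1 / g * $2 ))
)

# Hypothetical example: the counts would be set with something like
#   tune2fs -c 23 /dev/sda1
#   tune2fs -c 29 /dev/sda2
boots_between_double_checks 23 29   # prints 667
boots_between_double_checks 20 30   # prints 60
```

With coprime counts the double check is pushed out to the product of the two numbers; with counts that share a factor, such as 20 and 30, it comes around much sooner.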

I've never had an fsck on a non-server system (which seems to be the topic here) go 
anywhere near an hour.  Maybe five minutes at most.

In my experience, badblocks is far far slower than e2fsck.

And running anything automatically at 3am generally isn't an option on 
conference-presentation laptops.

ext3 metaclustering

Posted Jan 18, 2008 14:56 UTC (Fri) by fatrat (subscriber, #1518) [Link]

Not sure that partitions help here. If we are talking about a personal box/laptop, /home is the only
thing I care about, and it'll have all the disk space as well.

ext3 metaclustering

Posted Jan 18, 2008 15:06 UTC (Fri) by rfunk (subscriber, #4054) [Link]

OK, then you won't mind if I rm -rf /usr on your machine.  :-)

Try:  du -shc /var /usr /home
(There's also the root stuff not in those, but it's harder to measure that.)
You may be surprised at how much is in /var and /usr.

ext3 metaclustering

Posted Jan 18, 2008 15:19 UTC (Fri) by fatrat (subscriber, #1518) [Link]

My home dir contains ~82 GB. Compared to that, /usr and /var don't contain a lot (under 10 GB).
I'm sure most people are similar, hence my comment.

ext3 metaclustering

Posted Jan 18, 2008 15:45 UTC (Fri) by rfunk (subscriber, #4054) [Link]

10GB is still a big important chunk of disk, whether the rest is 20GB or 82GB.  Checking 
it separately *will* speed up each check, and separating it into a separate filesystem will 
make sure that errors on one part won't mess up the other part.

(Come to think of it, I suspect that the fsck speed is more dependent on number of files 
than data size, though I don't know for sure.)

ext3 metaclustering

Posted Jan 19, 2008 22:22 UTC (Sat) by Frej (subscriber, #4165) [Link]

Partitioning is fixing the symptoms, not the problem. 

Multiple fscks

Posted Jan 30, 2008 3:28 UTC (Wed) by Max.Hyre (guest, #1054) [Link]

> [S]et the mounts-before-check count on each filesystem to be a different prime number.
> That way multiple filesystems almost never get checked at the same time.

Even better is setting the maximum mount count to the same number on all filesystems, then using tune2fs to set the starting count to a different value on each.

Voila! Never a multiple fsck.
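
A minimal sketch of that staggering, assuming tune2fs from e2fsprogs, where `-c` sets the maximum mount count and `-C` sets the current mount count. The helper name `stagger_cmds`, the count of 30 and the device names are all made up for illustration, and the function only prints the commands so they can be reviewed before anything is run as root:

```shell
# Print (do not run) tune2fs commands that give every filesystem the same
# maximum mount count but a different current count, spaced evenly.
# Usage: stagger_cmds MAX_COUNT DEVICE...
stagger_cmds() (
    max=$1
    shift
    n=$#
    i=0
    for dev in "$@"; do
        start=$((i * max / n))
        echo "tune2fs -c $max -C $start $dev"
        i=$((i + 1))
    done
)

stagger_cmds 30 /dev/sda1 /dev/sda2 /dev/sda3
# prints:
#   tune2fs -c 30 -C 0 /dev/sda1
#   tune2fs -c 30 -C 10 /dev/sda2
#   tune2fs -c 30 -C 20 /dev/sda3
```

The offsets only stay apart as long as every filesystem is mounted on every boot; a manual fsck or a skipped mount on one of them shifts its counter relative to the others.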

ext3 metaclustering

Posted Jan 17, 2008 23:19 UTC (Thu) by magila (subscriber, #49627) [Link]

Disks these days are pretty good at hiding bad sectors from the host. If it gets bad enough
that the OS starts seeing bad data then the drive is probably on its last legs and will soon
fail completely. In any case monitoring the SMART logs will usually catch a drive that is
gradually degrading without the frustrating fsck delays.

ext3 metaclustering

Posted Jan 17, 2008 23:33 UTC (Thu) by rfunk (subscriber, #4054) [Link]

True, but how many people monitor SMART logs on a laptop, or even a desktop?
More to the point, how many of the people disabling the auto-fsck monitor their SMART 

ext3 metaclustering

Posted Jan 18, 2008 22:00 UTC (Fri) by nix (subscriber, #2304) [Link]

smartd can send you emails when things go suspiciously wrong.

ext3 metaclustering

Posted Jan 18, 2008 22:05 UTC (Fri) by rfunk (subscriber, #4054) [Link]

True.  How many people have system-level email working properly on their laptops, and 
are able to get such emails?

ext3 metaclustering

Posted Jan 18, 2008 22:14 UTC (Fri) by nix (subscriber, #2304) [Link]

Um, anyone competent? All sorts of other email, some security-important, 
gets sent by various daemons and shouldn't just be binned or ignored... of 
course a lot of people aren't competent :/

ext3 metaclustering

Posted Jan 18, 2008 22:20 UTC (Fri) by rfunk (subscriber, #4054) [Link]

My programmer coworkers have enough trouble with the task, and they're techies.  
Forget about the non-techie user that is adopting Linux more and more.

Everyone sets up their GUI mail program, and totally ignores the system-level MTA 
(sendmail/postfix/exim).  They just never get those emails.

(Sysadmin types being the exception, of course, but they're few and far between these 

ext3 metaclustering

Posted Jan 19, 2008 18:35 UTC (Sat) by raxyx (subscriber, #50026) [Link]

So THAT's what these MTAs are for. Cool. 

> Everyone sets up their GUI mail program, and totally ignores the system-level MTA 
> (sendmail/postfix/exim).  They just never get those emails.

Full ack on that. On some of my Debian machines, during the boot sequence, the thing that
takes the most time to get loaded is exim4, so one day I got fed up with it and removed it,
didn't notice any difference afterwards. I guess I'm going to rethink that move :-)

lightweight MTAs for outgoing mail only

Posted Jan 19, 2008 20:08 UTC (Sat) by liamh (subscriber, #4872) [Link]

I have taken to removing exim4 and installing either ssmtp or nullmailer
 aptitude install ssmtp exim4- exim4-base- exim4-config- exim4-daemon-light-
Just enough MTA to get the word out.  Since few people want/need a full MTA, this seems like
it should be the default.  But I don't do SMART disk monitoring; a few years back I tried it and
it led to some unreliable system behavior.

ext3 metaclustering

Posted Jan 19, 2008 1:56 UTC (Sat) by cortana (subscriber, #24596) [Link]

Well, Debian configures smartd to both mail root and display a notification on the desktops of
currently-logged-in users. :)

ext3 metaclustering

Posted Jan 17, 2008 23:56 UTC (Thu) by dberkholz (guest, #23346) [Link]

Google published a paper fairly recently on a large study of disk failures. As I recall, they
found that SMART logs were not reliable indicators.

ext3 metaclustering

Posted Jan 18, 2008 4:12 UTC (Fri) by magila (subscriber, #49627) [Link]

Notice I said gradually degrading. SMART won't help in the event of a catastrophic mechanical
failure, which is what most of the unanticipated failures in the Google study probably were.
Fsck doesn't help in that case either though. It's only the kinds of failures that cause a
slow accumulation of bad sectors that fsck would matter for, and those are the kinds of
failures that SMART is piratically guaranteed to catch.

ext3 metaclustering

Posted Jan 18, 2008 8:51 UTC (Fri) by njs (guest, #40338) [Link]

piratically... guaranteed...?

ext3 metaclustering

Posted Jan 18, 2008 22:03 UTC (Fri) by nix (subscriber, #2304) [Link]

That's SMArrrT for you.

Using fsck to defend against disk failures?

Posted Jan 27, 2008 15:45 UTC (Sun) by anton (subscriber, #25547) [Link]

That and the "spreading inconsistency" theory and some other things I have read by people writing about fsck are failure types that I have never seen or read a first-hand report of, so I guess they are just myths or a perverted form of wishful thinking.

The kinds of disk failures I have seen have always been different. In particular, even if a drive developed a bad block, it recognized that itself (very slowly) and returned an error rather than wrong data. I'm not sure if fsck programs are up to dealing with a bad block of this kind in the metadata, but if a drive has a bad block, that's certainly a good time to replace the drive and restore the data from backup. Or, if you run RAID 1 or RAID 5, you just need to replace the drive (and make it known to the RAID driver).

Moreover, even if a disk drive deteriorates over time, that's more likely to hit the data first rather than the meta-data. But fsck checks only some kinds of errors in the meta-data, so if fsck is your defense against bad blocks, you don't value your data at all. Making a backup is more likely to unveil bad blocks than fsck (also in data), and has obvious additional benefits.

Finally, a good way (much better than fsck) to test the drive for bad blocks is "smartctl -t long", even though I am sceptical about the predictive capabilities of SMART.
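
For reference, a small sketch of how that self-test is typically driven with smartctl from smartmontools: `-t long` starts the extended self-test and `-l selftest` prints the self-test log. The wrapper function names and the device name /dev/sda are illustrative only, and the commands normally need root:

```shell
# Start the drive's extended (long) self-test. The test runs inside the
# drive in the background, so the command itself returns quickly.
start_long_test() {
    smartctl -t long "$1"
}

# Show the self-test log, where the result appears once the test is done.
show_selftest_log() {
    smartctl -l selftest "$1"
}

# Typical use, as root, with a made-up device name:
#   start_long_test /dev/sda
#   ... wait for the estimated duration that smartctl prints ...
#   show_selftest_log /dev/sda
```

Because the drive reads its whole surface during the long test, this exercises data blocks as well as metadata blocks, which is the advantage over fsck claimed above.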

Overall, I am very sceptical about the value of fsck for dealing with hardware failures, and a little bit less sceptical about its value when dealing with software failures (but I think I have not been bitten by a file system bug yet); in many cases (especially the hardware ones) we have to restore from backup anyway.

Using fsck to defend against disk failures?

Posted Jan 27, 2008 16:32 UTC (Sun) by nix (subscriber, #2304) [Link]

My mum's ancient 486 laptop had a really strange disk failure this 
Christmas. It started with a single bad sector, but then within about 
fifteen minutes one third of the sectors on the disk (in contiguous runs 
of varying length) were returning, not bad sectors, but `sector not 
found', i.e. the drive couldn't even find the sector address markers.

What I suspect may have happened, based on my extensive lack of experience 
in hard drive design, is that all the G forces the head assembly is 
exposed to whenever a seek happens had over time twisted the head reading 
the farthest side of whichever platter didn't contain the servo track out 
of true, so that when the servo track said it was over track X, the 
topmost heads were actually midway between tracks or something like that. 
In that position they couldn't read the sector addresses, couldn't find 
any data, and whoompfh, goodbye data.

(I've never heard of this failure mode anywhere else, and perhaps it was 
something different, but still, it was very strange. Disks *can* go mostly 
bad all at once. It's just rare.)

Disk failures

Posted Jan 27, 2008 21:58 UTC (Sun) by anton (subscriber, #25547) [Link]

Disk drives have not used servo tracks for a long time, because one could no longer align all the heads precisely enough (e.g., because of thermal expansion). Instead, servo information exists on each platter, interspersed in some way with the data. I don't know when this change happened; a 15+-year-old disk (486 generation) might still have a servo track. But couldn't the symptoms also be explained by the failure of just one of the heads?

Disk failures

Posted Jan 27, 2008 22:55 UTC (Sun) by nix (subscriber, #2304) [Link]

I said it was a prehistoric system, and indeed anything more modern than 
about, what, 1991 won't have this problem.

I'm not sure if a head failure could cause a failure to find sector 
address markers: I'm not sure if you could even distinguish the two cases 
without digging into the drive. (As I said, my expertise in hard drive 
engineering is notable mainly by its absence.)

It's just that heads are solid-state, and solid-state stuff doesn't die 
all that often, while the head assembly itself is being wrenched all over 
the place: simple bending could explain this, I think.

ext3 metaclustering

Posted Jan 18, 2008 0:24 UTC (Fri) by iabervon (subscriber, #722) [Link]

These days, it doesn't make much sense to use -c periodic checking, since disk data errors are
unlikely to be associated with mounting. It makes a lot more sense to use -i periodic
checking, which you can schedule for some time when you're not giving a presentation.

Actually, it would make most sense to do it at shutdown, at some time when the system is plugged in and
you're going to bed, controlled with cron/anacron for noticing the need to check it and
shutdown scripts to identify that it's appropriate. Obviously, there's practically no chance
that the periodic check would actually happen to trigger on the first mount after disk
corruption occurs, and it's more likely that corruption would happen during a write (and thus,
while it's mounted) anyway.
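
The "noticing the need to check" half of that idea could be sketched as below. This is only an illustration: the function name `check_due` is made up, the timestamp of the last check is assumed to be available as a Unix epoch (for instance from a file written after each fsck), and how the check is then triggered at shutdown is left out because it is distribution-specific. The time-based limit itself is what `tune2fs -i` sets (for example `tune2fs -i 1m` for one month).

```shell
# Succeeds (exit status 0) when at least INTERVAL_DAYS have passed since
# the last check. All three arguments are plain integers; times are Unix
# epochs in seconds.
# Usage: check_due LAST_CHECK_EPOCH INTERVAL_DAYS NOW_EPOCH
check_due() {
    [ $(( $3 - $1 )) -ge $(( $2 * 86400 )) ]
}

# Hypothetical use from a cron/anacron job or shutdown script:
#   if check_due "$last_check" 30 "$(date +%s)"; then
#       ... arrange for the filesystem check here ...
#   fi
```

Passing the current time as an argument keeps the function easy to test and leaves the choice of clock to the caller.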
