Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 10:54 UTC (Fri) by dlang (✭ supporter ✭, #313)
In reply to: Ext4 to be standard for Fedora 11, Btrfs also included (heise online) by Seegras
Parent article: Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

given that modern drives will remap bad blocks on the disk to hide this from the OS, how much use does such a feature get nowadays?


Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 17:53 UTC (Fri) by martinfick (subscriber, #4455) [Link]

Only until the HD runs out of spare blocks, then the FS sees it.

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 18:37 UTC (Fri) by tialaramex (subscriber, #21167) [Link]

But the disks are designed to have enough such blocks to last the lifetime of the disk. It will show up in SMART under that "old age" category, like the slow loss of motor performance caused by wearing of the tiny moving parts.

If you're interested because you run disks way beyond their design lifespan then you're in a niche, like the Bad RAM patch that lets Linux map out ranges of DRAM known to be damaged or faulty so that you can use DIMMs which failed QA.

Everybody else ought to be replacing disks that approach the limit on remapped blocks. That's a pretty clear sign something's wrong. Maybe it's not fatal, but it's definitely not good, so replace the disk.

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 19:19 UTC (Fri) by martinfick (subscriber, #4455) [Link]

You are assuming modern disks that even have SMART. I still have disks that don't. The disk can be fine until suddenly one day: bad block!

But, even with modern disks, do you really think the average consumer has even heard of SMART? Do you think they have any clue that their disk needs to be replaced? Do any distros even enable SMART monitoring* by default? Do you think bad blocks only ever appear gradually?

Unless you have a specific good reason to not have bad block relocation, why would you not want it? You shouldn't need checksums either, but they are a good idea. A modern FS should certainly be able to handle bad blocks; I would find it hard to call it modern if it can't.

* I am kind of curious when distros will become proactive and install all sorts of monitoring/smtp stuff by default?
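As a rough illustration of what such default monitoring would have to look at, here is a minimal sketch that reads the raw counters out of an attribute table in the layout `smartctl -A` prints. The sample text and its values are made up, and the function names and the threshold are assumptions for illustration, not anything a distro actually ships.

```python
# Sample text in the column layout printed by `smartctl -A`.
# The numbers are invented for this example.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       3
"""


def raw_values(report):
    """Map attribute name -> raw value for every attribute row in the report."""
    values = {}
    for line in report.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                values[fields[1]] = int(fields[9])
            except ValueError:
                continue  # some attributes carry non-numeric raw values
    return values


def needs_attention(report, max_reallocated=0):
    """Hypothetical policy: flag the disk once any sector has been remapped
    (beyond max_reallocated) or is waiting to be remapped."""
    vals = raw_values(report)
    return (vals.get("Reallocated_Sector_Ct", 0) > max_reallocated
            or vals.get("Current_Pending_Sector", 0) > 0)
```

A desktop notifier would still need to run the tool periodically and surface the result somewhere a GUI-only user would see it, which is the part this sketch leaves out.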

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 20:31 UTC (Fri) by dlang (✭ supporter ✭, #313) [Link]

when did you last purchase a drive that doesn't do bad-block remapping?

The users don't need to know about SMART; they need to know that if the OS ever sees a bad block on the drive, the drive should be replaced. It hasn't been the case for _many_ years that it was normal to have some number of user-visible bad blocks on a drive. You almost have to go back to the days of MFM/RLL/ESDI drives (I think this was 12+ years ago).

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 21:31 UTC (Fri) by martinfick (subscriber, #4455) [Link]

I think my whole point must have gone right over your head? :)

Even my old drives do bad block remapping, but there is no way to know about it until they run out since they do not have SMART!! Given enough time, they will run out or the drive will crash entirely. I have older drives that do remap blocks, but have run out of spare blocks. I no longer use them since they are obviously risky to use, but in the meantime I am happy to have not lost any data until I realized the problem.

Your OS point is idealistic, as I tried to explain previously. I don't know of any Linux distro that enables smartmon by default, so, no, the average user will not know when the drive is going bad even though the tools exist to find out. Even if they did have smartmon enabled, do you think the average user looks through logs to see if there are any problems? How would a GUI-only Linux desktop user ever know?

Not to mention that you still have not provided a logical reason to NOT perform bad block relocation, except for your stance that it is not really needed.

Just to add to this, my personal opinion is that the HD is the wrong place for this feature anyway. It requires you to waste an arbitrary number of blocks as spares. There is no way to get that number right, either you will have allocated too many or too few. Neither case is ideal. Since this really is a software feature (remapping), let the OS know and don't pretend it's a hardware feature. To be honest, it probably belongs at the block layer since it will not make sense to do it at the FS layer if using LVM.

Perhaps LVM should take care of this? If I use LVM, I should be able to have my spare blocks on a separate disk. In fact LVM should probably have thresholds that make it remap the entire disk when it starts to see too many bad blocks! I guess ideally all the layers should be able to handle badblocks somehow. :)
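To make the idea concrete, here is a toy sketch of what a block-layer remap table with an external spare pool and a migration threshold might look like. This is not how LVM or any real block layer works; the class name, the method names, and the 50% threshold are all assumptions for illustration.

```python
class RemapTable:
    """Toy block-layer remapper: bad logical blocks are redirected to spares."""

    def __init__(self, spare_blocks, warn_fraction=0.5):
        # The spares are just block numbers here; in the scenario above
        # they could live on a separate disk.
        self.free_spares = list(spare_blocks)
        self.total_spares = len(self.free_spares)
        self.remapped = {}  # logical block -> spare block currently backing it
        self.warn_fraction = warn_fraction

    def resolve(self, block):
        """Return the block that should actually be read or written."""
        return self.remapped.get(block, block)

    def mark_bad(self, block):
        """Redirect a failing block to the next free spare."""
        if not self.free_spares:
            raise RuntimeError("out of spare blocks")
        spare = self.free_spares.pop(0)
        self.remapped[block] = spare
        return spare

    def should_migrate(self):
        """True once the share of spares used reaches the threshold,
        i.e. the point where the whole disk should be moved elsewhere."""
        used = self.total_spares - len(self.free_spares)
        return self.total_spares > 0 and used / self.total_spares >= self.warn_fraction
```

Because the pool is only a list of block numbers, its size can be changed after the fact, which is the flexibility a fixed in-drive spare area does not offer.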

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 22:03 UTC (Fri) by dlang (✭ supporter ✭, #313) [Link]

one point: when a bad block gets remapped, you stand a very good chance of losing the data on that block.

the problem with having the OS do bad-block mapping is that people use drives with different OS's, and they would all have to know about all the bad blocks, what they were remapped to, etc.

drives used to work this way. the reliability of having the OS do this (even in cases where you weren't dual-booting) was bad enough that the job migrated into the drives so that the OS could forget about it, and at the time everyone considered this a step forward.

it's far more complicated for the drive to do the remapping than to let the OS do it, so it wasn't just a whim that changed things so that every drive manufactured does it internally.

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 24, 2009 12:43 UTC (Sat) by Los__D (guest, #15263) [Link]

When the spares run out, the drive should be replaced immediately, if you care about your data.

No OS bad block remapping scheme will change that.

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 24, 2009 13:36 UTC (Sat) by nix (subscriber, #2304) [Link]

Yes: but you can see bad blocks appear before any spares have been used at
all, if you only saw them on read, rather than after a rewrite.

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 24, 2009 13:51 UTC (Sat) by dlang (✭ supporter ✭, #313) [Link]

but in that case the correct action isn't to mark the sector as bad, but to try to write to it (so that the sector will be remapped)

however, if the drive really is going bad, doing so could further damage your data

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 23, 2009 23:58 UTC (Fri) by nix (subscriber, #2304) [Link]

If the OS sees a bad block on a drive *after a write* the drive should be
replaced. If it's just a read, then the drive hasn't had a chance to do
any bad block remapping yet, so nothing unexpected is going on.

(Using RAID will let you do that rewrite without losing the data that was
in the sector that went bad.)

Ext4 to be standard for Fedora 11, Btrfs also included (heise online)

Posted Jan 24, 2009 1:08 UTC (Sat) by EzDi (guest, #56297) [Link]

Exactly, I've had this problem with several drives where I get a bad sector somewhere, often underneath swap, a journal, or some other file system bit, because that's what's being dealt with before the incident that caused it.

Drives often don't remap UNTIL YOU DO A WRITE. So I keep getting errors while trying to mount, fsck, or when doing something to an adjacent sector, until I manually zero out that sector. Then my hard drive is perfect again from the OS's viewpoint.
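For anyone wondering what "manually zero out that sector" amounts to, here is a minimal sketch. It assumes 512-byte logical sectors and a Unix-like system; `zero_sector` is a name made up for this example. Pointed at a real device node it destroys whatever that sector held, and whether the drive then remaps the sector is up to its firmware.

```python
import os

SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def zero_sector(path, lba, sector_size=SECTOR_SIZE):
    """Overwrite one sector with zeros. Whatever it held is lost."""
    fd = os.open(path, os.O_WRONLY)
    try:
        # Write exactly one sector at the byte offset of the given LBA.
        os.pwrite(fd, bytes(sector_size), lba * sector_size)
        os.fsync(fd)  # ask the kernel to push the write out to the device
    finally:
        os.close(fd)
```

The sector number has to come from the error the kernel reported, and it is worth trying this on an image file before going anywhere near a disk.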

Of course using bad block lists can often not do the right thing, because they only do a read test...

SMART is not that useful

Posted Jan 24, 2009 13:51 UTC (Sat) by zuki (subscriber, #41808) [Link]

Google did a large survey (100,000s of disks) of disk failures,
(http://research.google.com/archive/disk_failures.pdf, sec. 3.5.6)
and about 1/3 report no SMART errors at all before failing, and only a
minority report failures that would allow for early replacement.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds