I haven't crashed many disks but my limited experience is similar. If you get to the point where data loss is caused by the hardware, it is likely to be trashing a whole lot more than the contents of the cache at shutdown. The solution to this problem is RAID; journals solve a different problem altogether.
Posted Apr 3, 2009 0:02 UTC (Fri) by nix (subscriber, #2304)
RAID doesn't really solve sudden-power-loss situations: in fact RAID-5 in
particular can make it much worse (turning small-range corruption into
apparent scattershot corruption).
A UPS, or battery backing, is the answer (well, it moves the failure point:
if it's a UPS, the UPS must fail before you lose data; if it's
battery-backed, you often have to lose the battery first, then power, which
is likely to happen because you often have no idea the battery has failed
until it's too late).
In conclusion: we all suck, our data is doomed, the Second Law shall
triumph and Sod and Murphy shall dance above our mangled filesystems.
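One reading of the RAID-5 point above is the so-called write hole: if power fails after a data block is written but before its parity is updated, a later rebuild can corrupt a block that was never being written. Here is a minimal sketch of that, assuming a toy three-disk stripe with XOR parity. The function names and block contents are made up for illustration and are not taken from any real RAID implementation.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte (RAID-5 style parity)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_after_torn_write() -> tuple[bytes, bytes]:
    """Return (what d1 really held, what a rebuild produces for d1)."""
    # Hypothetical stripe: two data blocks plus one parity block.
    d0 = b"AAAA"
    d1 = b"BBBB"
    parity = xor_blocks(d0, d1)

    # Healthy stripe: losing d1 is harmless, parity reconstructs it.
    assert xor_blocks(d0, parity) == d1

    # Power fails mid-update: the new d0 reaches the disk,
    # but the matching parity update never does.
    d0 = b"CCCC"  # parity is now stale

    # Later the disk holding d1 dies and is rebuilt from d0 and stale parity.
    rebuilt_d1 = xor_blocks(d0, parity)
    return d1, rebuilt_d1


if __name__ == "__main__":
    real, rebuilt = rebuild_after_torn_write()
    print("d1 on disk before failure:", real)
    print("d1 after rebuild:         ", rebuilt)
```

The interrupted write only touched d0, yet the damage shows up in d1, which is why the corruption can look scattered rather than confined to the range being written.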
fsync() and disk flushes
Posted Apr 4, 2009 0:01 UTC (Sat) by giraffedata (subscriber, #1954)
The answer is RAID and UPS, but not that way. The RAID goes over the UPS; e.g. a mirror of two disk drives, each with its own UPS.
Such redundancy also makes it possible to test the UPS regularly and avoid the problem of two dead batteries when the external power fails.
The UPS doesn't count if you don't test, measure, and/or replace its battery regularly.
fsync() and disk flushes
Posted Apr 3, 2009 23:49 UTC (Fri) by giraffedata (subscriber, #1954)
It's hard to believe there are disk drives out there (not counting an occasional broken one) that write trash over random areas as they power down. Disk drives I have seen have a special circuit to disconnect and park the head the moment voltage begins to drop. It has to park the head because you can't let the head land on good recording surface, and it has to cut off the write current because otherwise it's dragging a writing head all the way across the disk, pretty much guaranteeing the disk will never come back. I believe it's a simple circuit that doesn't involve any controller intelligence.
There is a related failure mode where the drive's client loses power and in its death throes ends up instructing the drive to trash itself while the drive still has enough power to operate normally. I've heard that's not unusual, and it's the best argument I know for a UPS that powers a system long enough for it to shut down cleanly.
fsync() and disk flushes
Posted Apr 27, 2009 6:24 UTC (Mon) by bersl2 (subscriber, #34928)
> It's hard to believe there are disk drives out there (not counting an occasional broken one) that write trash over random areas as they power down. Disk drives I have seen have a special circuit to disconnect and park the head the moment voltage begins to drop. It has to park the head because you can't let the head land on good recording surface, and it has to cut off the write current because otherwise it's dragging a writing head all the way across the disk, pretty much guaranteeing the disk will never come back. I believe it's a simple circuit that doesn't involve any controller intelligence.
> There is a related failure mode where the drive's client loses power and in its death throes ends up instructing the drive to trash itself while the drive still has enough power to operate normally. I've heard that's not unusual, and it's the best argument I know for a UPS that powers a system long enough for it to shut down cleanly.
One of these happened to me. $DEITY as my witness, I will never run an important system without a UPS again.
Bonus: The drive was a Maxtor. Serves me right.
Double bonus: That still wasn't traumatic enough to compel me to make backups.
fsync() and disk flushes
Posted Apr 27, 2009 10:43 UTC (Mon) by nix (subscriber, #2304)
You don't need a UPS. A battery-backed disk controller is just as good
(and perhaps better because the battery failing doesn't take your machine
down if the power is otherwise OK, while the UPS failing *does*).