fsync() and disk flushes
Posted Apr 2, 2009 19:15 UTC (Thu) by iabervon (subscriber, #722)
Parent article: That massive filesystem thread
Posted Apr 2, 2009 23:28 UTC (Thu)
by anton (subscriber, #25547)
[Link]
But I have tested two drives with a test program for
out-of-order writing, and found that they both wrote data several
seconds out of order with a certain access sequence. If we don't see
more frequent problems from this, that's probably because the disks don't
optimize accesses as aggressively as some people imagine.
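For readers who want to try this kind of experiment, here is a minimal sketch of the general idea. It is not the test program mentioned above. The block size, the `OOOW` marker, the front/back access pattern in `slot_for`, and all function names are assumptions made up for illustration. The writer stamps each block with a sequence number. After a power cut, the checker reports any older sequence number that is missing even though a newer one survived, which would indicate out-of-order persistence. A real test would probably need to target a raw device, since a filesystem and the page cache add reordering of their own, and `fsync()` may or may not flush the drive's write cache depending on the setup.

```python
import os
import struct

BLOCK = 512                      # assumed record size (one traditional sector)
MAGIC = b"OOOW"                  # hypothetical marker identifying our records
HEADER = struct.Struct("<4sQ")   # marker followed by a 64-bit sequence number


def make_block(seq):
    """Build one BLOCK-sized record stamped with its sequence number."""
    return HEADER.pack(MAGIC, seq).ljust(BLOCK, b"\0")


def slot_for(seq, count):
    """Map a sequence number to a slot, alternating between the front and
    the back of the file (an arbitrary pattern chosen for this sketch)."""
    return seq // 2 if seq % 2 == 0 else count - 1 - seq // 2


def write_sequence(path, count, sync=True):
    """Write `count` numbered records in sequence order, one per slot."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, count * BLOCK)   # start from an all-zero file
        os.fsync(fd)
        for seq in range(count):
            os.pwrite(fd, make_block(seq), slot_for(seq, count) * BLOCK)
            if sync:
                os.fsync(fd)              # ask the OS to push the write out
    finally:
        os.close(fd)


def find_missing(path, count):
    """Return the sequence numbers that are absent although a newer one is
    present. A non-empty result means writes reached the media out of order."""
    present = set()
    with open(path, "rb") as f:
        for _ in range(count):
            block = f.read(BLOCK)
            if len(block) < HEADER.size:
                break
            magic, seq = HEADER.unpack_from(block)
            if magic == MAGIC:
                present.add(seq)
    if not present:
        return []
    newest = max(present)
    return [s for s in range(newest) if s not in present]
```

The intended use is to run `write_sequence()` on the machine under test, cut the power partway through, and run `find_missing()` after the reboot. An empty list means every record older than the newest survivor also survived.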
Posted Apr 2, 2009 23:31 UTC (Thu)
by xoddam (guest, #2322)
[Link] (5 responses)
Posted Apr 3, 2009 0:02 UTC (Fri)
by nix (subscriber, #2304)
[Link] (1 responses)
A UPS, or battery-backing, is the answer (well, moves the failure point:
In conclusion: we all suck, our data is doomed, the Second Law shall
Posted Apr 4, 2009 0:01 UTC (Sat)
by giraffedata (guest, #1954)
[Link]
Such redundancy also makes it possible to test the UPS regularly and avoid the problem of two dead batteries when the external power fails.
The UPS doesn't count if you don't test, measure, and/or replace its battery regularly.
Posted Apr 3, 2009 23:49 UTC (Fri)
by giraffedata (guest, #1954)
[Link] (2 responses)
It's hard to believe there are disk drives out there (not counting an occasional broken one) that write trash over random areas as they power down. Disk drives I have seen have a special circuit to disconnect and park the head the moment voltage begins to drop. It has to park the head because you can't let the head land on good recording surface, and it has to cut off the write current because otherwise it's dragging a writing head all the way across the disk, pretty much guaranteeing the disk will never come back. I believe it's a simple circuit that doesn't involve any controller intelligence.
There is a related failure mode where the drive's client loses power and in its death throes ends up instructing the drive to trash itself while the drive still has enough power to operate normally. I've heard that's not unusual, and it's the best argument I know for a UPS that powers a system long enough for it to shut down cleanly.
Posted Apr 27, 2009 6:24 UTC (Mon)
by bersl2 (guest, #34928)
[Link] (1 responses)
It's hard to believe there are disk drives out there (not counting an occasional broken one) that write trash over random areas as they power down. Disk drives I have seen have a special circuit to disconnect and park the head the moment voltage begins to drop. It has to park the head because you can't let the head land on good recording surface, and it has to cut off the write current because otherwise it's dragging a writing head all the way across the disk, pretty much guaranteeing the disk will never come back. I believe it's a simple circuit that doesn't involve any controller intelligence.
There is a related failure mode where the drive's client loses power and in its death throes ends up instructing the drive to trash itself while the drive still has enough power to operate normally. I've heard that's not unusual, and it's the best argument I know for a UPS that powers a system long enough for it to shut down cleanly.
One of these happened to me. $DEITY as my witness, I will never run an important system without an UPS again.
Bonus: The drive was a Maxtor. Serves me right.
Posted Apr 27, 2009 10:43 UTC (Mon)
by nix (subscriber, #2304)
[Link]
And if you suddenly lose power, in his experience, the
drive is actually much more likely to wipe out some arbitrary track of
data from the disk than it is to have anything in the write cache and
lose it.
While I have experienced drives that damage sectors or tracks on power
loss, I consider these drives faulty; and with such drives the problem
does not seem to be limited to drives that are trying to write
something at the time. However, most drives don't wipe out arbitrary
data in my experience.
particular can make it much worse (turning small-range corruption into
apparent scattershot corruption).
if it's a UPS, the UPS must fail before you lose: if it's battery-backed,
you often have to lose the battery first, then power, which is likely to
happen because you often have no idea the battery has failed until it's
too late).
triumph and Sod and Murphy shall dance above our mangled filesystems.
The answer is RAID and UPS, but not that way. The RAID goes over the UPS; e.g. a mirror of two disk drives, each with its own UPS.
Double bonus: That still wasn't traumatic enough to compel me to make backups.
(and perhaps better because the battery failing doesn't take your machine
down if the power is otherwise OK, while the UPS failing *does*).