Journaling no protection against power drop
Posted Sep 8, 2009 20:54 UTC (Tue) by anton (subscriber, #25547)
In reply to: Journaling no protection against power drop by ncm
Parent article: Ext3 and RAID: silent data killers?
[Engineers at drive manufacturers] say they happily stop
writing halfway in the middle of a sector, and respond to power drop
only by parking the head.
The results from my experiments on cutting power on disk drives are
consistent with the theory that the drives I tested complete the
sector they are writing when the power goes away. However, I have
seen drives that corrupt sectors under unusual power conditions; the
manufacturers of those drives (IBM, Maxtor) and their successors
(Hitachi) went onto my don't-buy list and are still there.
Some drives only report blocks written to the platter
after they really have been, but that's bad for benchmarks, so most
drives fake it, particularly when they detect benchmark-like
behavior.
Write-back caching (reporting completion before the data hits the
platter) is normally enabled in PATA and also SATA drives (whether
running benchmarks or not), because without tagged commands (mostly
absent in PATA, and not universally supported for SATA) performance
is very bad when it is disabled. You can disable write-back caching
with hdparm -W0. Or you can ask for barriers (e.g., as an ext3 mount
option), which should give the same consistency guarantees at lower
cost if the file system is implemented properly; however, my trust in
a proper implementation in Linux is severely undermined by the
statements that some prominent kernel developers have made about file
systems in recent months.
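To make the application side of this concrete, here is a minimal
sketch in Python of how a program might ask for its data to be on
stable storage. It is an illustration, not something from my
experiments: the function name durable_write is made up, and the code
assumes a Linux-like system where a directory can be opened and
fsync'ed. Note the caveat from above: fsync only gets the data to the
platter if the drive's write-back cache is disabled or the file system
actually flushes it (barriers); otherwise the drive may report
completion early.

```python
import os
import tempfile


def durable_write(path, data):
    """Replace the file at `path` with `data` (bytes), aiming for either the
    old or the new content to survive a crash.

    Hypothetical helper for illustration. It assumes the kernel and the drive
    honour fsync; with an unflushed write-back cache this guarantee is void.
    """
    directory = os.path.dirname(os.path.abspath(path))

    # Write the new content to a temporary file in the same directory,
    # so that the later rename stays within one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()             # push Python's buffer to the kernel
            os.fsync(f.fileno())  # ask the kernel to push it to the device
        # Atomically switch the name over to the new content.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise

    # Sync the directory so that the rename itself is on stable storage.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

A call such as durable_write("state.bin", b"new contents") would then
leave either the previous or the new state.bin after a power drop,
provided the layers below keep their promises, which is exactly what
is in doubt here.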
Everyone serious about reliability uses battery
backup
Do you mean a UPS? So how does that help when the UPS fails? Yes, we
have had that happen (while grid power was still up), and we concluded
that our power grid is just as reliable as a UPS. One could protect
against a failing UPS with dual (redundant) power supplies and dual
UPSs, but that would probably double the cost of our servers. A better
option would be an OS that sets up the hardware for good reliability
(i.e., disables write caching if necessary) and works hard to ensure
data and metadata consistency. Unfortunately, it seems that that OS is
not Linux.