Posted Jul 14, 2006 9:47 UTC (Fri) by ncm
Parent article: Crash-only software: More than meets the eye
First, this (otherwise excellent) article repeats a common misconception: in fact, journaling file systems, like journaling databases, are not generally safe against power drops. The underlying reason is that, for evidently unavoidable marketing reasons, most drives *lie* about whether blocks sent to the drive really are physically on the disk; it may take a few seconds for blocks actually to get written after the drive has sworn up-and-down that it's already been done. Drives that never lie lose performance benchmarks.
Furthermore, there is an urban myth that many/most drives will use the motor as a generator to provide power to finish writing the current block and park the head. It is generally false -- all drives you're likely to encounter happily write random stuff if voltage drops while they're writing, even if they do park the head afterward.
The implication is that a high-reliability system really must either have some form of battery-backed disk drive power (e.g. UPS), or must somehow ensure its drives really don't lie (good luck verifying that!), or must give its journal blocks 64-bit-or-better checksums independent of whatever the drive uses, and be prepared for late journal blocks to be unreadable or just wrong. (Available drives have an appallingly high specified bit-error rate, so using 64-bit checksums everywhere reliability matters is a good idea anyhow.) Finally, when testing failure recovery, don't confuse hardware or software reset with power failure.
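To make the checksum option concrete, here is a minimal sketch of a self-checking journal record, assuming an in-memory byte vector in place of a real journal. The record layout, the names `append_record` and `read_valid_records`, and the use of 64-bit FNV-1a are all my own assumptions for illustration; FNV-1a is only a stand-in, and a real journal would want a proper CRC-64 or stronger.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// 64-bit FNV-1a: a simple stand-in for a real 64-bit checksum (e.g. CRC-64).
std::uint64_t fnv1a64(const unsigned char* data, std::size_t len) {
    std::uint64_t h = 0xcbf29ce484222325ULL;   // FNV offset basis
    for (std::size_t i = 0; i < len; ++i) {
        h ^= data[i];
        h *= 0x100000001b3ULL;                 // FNV prime
    }
    return h;
}

// Hypothetical record layout:
//   [u32 payload length][payload bytes][u64 checksum over length + payload]
// Integers are stored in host byte order to keep the sketch short.
void append_record(std::vector<unsigned char>& log, const std::string& payload) {
    assert(payload.size() <= UINT32_MAX);
    const std::uint32_t len = static_cast<std::uint32_t>(payload.size());
    const std::size_t start = log.size();

    unsigned char header[4];
    std::memcpy(header, &len, sizeof header);
    log.insert(log.end(), header, header + 4);
    log.insert(log.end(), payload.begin(), payload.end());

    const std::uint64_t sum = fnv1a64(log.data() + start, log.size() - start);
    unsigned char trailer[8];
    std::memcpy(trailer, &sum, sizeof trailer);
    log.insert(log.end(), trailer, trailer + 8);
}

// Recovery: accept records until the first one that is truncated or whose
// checksum does not verify, and ignore everything after it.
std::vector<std::string> read_valid_records(const std::vector<unsigned char>& log) {
    std::vector<std::string> out;
    std::size_t pos = 0;
    while (log.size() - pos >= 12) {           // room for header + trailer
        std::uint32_t len;
        std::memcpy(&len, log.data() + pos, 4);
        if (len > log.size() - pos - 12) break;              // truncated or garbage length
        std::uint64_t stored;
        std::memcpy(&stored, log.data() + pos + 4 + len, 8);
        if (stored != fnv1a64(log.data() + pos, 4 + len)) break;  // corrupt record
        out.emplace_back(reinterpret_cast<const char*>(log.data() + pos + 4), len);
        pos += 12 + len;
    }
    return out;
}
```

Stopping at the first bad record is the "be prepared for late journal blocks to be unreadable or just wrong" part: whatever the drive claimed, recovery trusts only the prefix that verifies.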
Second, the principles behind crash-only design are embodied, thus far uniquely, in the C++ exception mechanism. A well-designed C++ program will have only a very few places that catch and process exceptions, and almost all the code that is executed during an exception is also run frequently during normal operation, in destructors. This differs fundamentally from languages with superficially similar exception features that depend on the "try-finally" construct. There, exception-handling code is scattered pervasively throughout the system, and much of it cannot be executed in any practical test process.
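A small sketch of what I mean, with hypothetical names throughout (`ScopedStep`, `do_work`, `run`); the trace vector exists only so the order of events can be inspected. The destructor is the only cleanup code, and it runs on both the normal return and the exceptional path, so every ordinary run tests it.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical guard object: the constructor "acquires", the destructor
// "releases". Both are logged to a trace so the ordering is visible.
class ScopedStep {
public:
    ScopedStep(std::vector<std::string>& trace, std::string name)
        : trace_(trace), name_(std::move(name)) {
        assert(!name_.empty());
        trace_.push_back("begin " + name_);
    }
    // Runs on normal scope exit AND during stack unwinding.
    ~ScopedStep() { trace_.push_back("end " + name_); }

    ScopedStep(const ScopedStep&) = delete;
    ScopedStep& operator=(const ScopedStep&) = delete;

private:
    std::vector<std::string>& trace_;
    std::string name_;
};

// No try/catch here: cleanup is entirely in the destructors.
void do_work(std::vector<std::string>& trace, bool fail) {
    ScopedStep outer(trace, "outer");
    ScopedStep inner(trace, "inner");
    if (fail) throw std::runtime_error("simulated failure");
    trace.push_back("work done");
}

// One of the "very few places" that catches anything.
bool run(std::vector<std::string>& trace, bool fail) {
    try {
        do_work(trace, fail);
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

Calling `run(trace, false)` and `run(trace, true)` leaves the same "end inner", "end outer" tail in the trace; the failing call just lacks the "work done" entry in between.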
Third, crash-only design is tied very closely to logging and log-replaying. Often log-replaying code can be recycled for user-level undo/redo, and thus exercised in normal operation. Note that if you're not worried about power failure (because of your UPS), there's no need for the program to flush (i.e. fsync) the log file frequently. The kernel will do that if the program crashes, and mmapping a big data-structure image (e.g. to support undoing a deletion) to the end of the log file is very cheap. "Auto-save" is a very poor substitute for good logging.
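As a rough illustration of recycling the replay code for undo/redo, here is a sketch with an in-memory log and a toy key-value "document". The names `Op`, `replay`, and `Document` are hypothetical, nothing here touches the disk, and replaying the whole log on every edit is a simplification that a real program would replace with checkpoints or inverse operations.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// One logged operation: either put(key, value) or erase(key).
struct Op {
    bool is_put;
    std::string key;
    std::string value;
};

using State = std::map<std::string, std::string>;

// Rebuild state from the first n log entries. Crash recovery would call this
// with n == log.size(); undo and redo call it with a smaller or larger n.
State replay(const std::vector<Op>& log, std::size_t n) {
    assert(n <= log.size());
    State s;
    for (std::size_t i = 0; i < n; ++i) {
        const Op& op = log[i];
        if (op.is_put) s[op.key] = op.value;
        else s.erase(op.key);
    }
    return s;
}

class Document {
public:
    void put(const std::string& k, const std::string& v) { record({true, k, v}); }
    void erase(const std::string& k) { record({false, k, ""}); }

    bool undo() {
        if (applied_ == 0) return false;
        --applied_;
        state_ = replay(log_, applied_);
        return true;
    }
    bool redo() {
        if (applied_ == log_.size()) return false;
        ++applied_;
        state_ = replay(log_, applied_);
        return true;
    }
    const State& state() const { return state_; }

private:
    void record(Op op) {
        log_.resize(applied_);           // a new edit discards any undone tail
        log_.push_back(std::move(op));
        applied_ = log_.size();
        state_ = replay(log_, applied_); // same path recovery would take
    }

    std::vector<Op> log_;
    std::size_t applied_ = 0;
    State state_;
};
```

The point is that `replay` gets exercised every time the user edits, undoes, or redoes, so the code that would run after a crash is not some rarely visited corner.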