The Journal - a proposed syslog replacement

Posted Nov 21, 2011 7:08 UTC (Mon) by yoe (subscriber, #25743)
Parent article: The Journal - a proposed syslog replacement

Binary logging is a very bad idea.

"oh, crap, it seems like my hard disk broke down. Let's quickly check the logs to see what happened."

"error: could not read log file: log file corrupt."

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 10:25 UTC (Mon) by mpr22 (subscriber, #60784)

That's not an argument against binary logging. That's an argument in favour of having multiple copies of the log on different spindles (and possibly even different computers). If the disk has thrown a wobbly, you may well have garbage instead of metadata, garbage instead of data in whatever blocks were most recently written to, etc.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 13:08 UTC (Mon) by SEJeff (subscriber, #51588)

Hopefully next-gen filesystems such as btrfs will partially solve this problem by checksumming more of the data, until true ECC hardware and memory are the norm.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 19:06 UTC (Mon) by mpr22 (subscriber, #60784)

ECC seems vanishingly unlikely to be the norm until companies are forced to do it (as in, a sufficiently-large jurisdiction's laws make it either criminal or sufficiently expensively tortious to sell end-product devices that don't use some kind of ECC technology on their memory), because it increases the design and manufacturing costs of the end product without increasing sales of the end product.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 19:22 UTC (Mon) by SEJeff (subscriber, #51588)

Agreed. That is why filesystems which checksum and keep multiple copies of data/metadata will help paper over the problem of unreliable hardware. Things like ZFS and Btrfs truly are the future.

The Journal - a proposed syslog replacement

Posted Nov 24, 2011 8:06 UTC (Thu) by cas (subscriber, #52554)

"because it increases the design and manufacturing costs of the end product without increasing sales of the end product."

and the really annoying thing is that it would just be a short-term, one-off design cost until the chipsets for various device types all had ECC support (all AMD CPU chipset motherboards have ECC support and have had for several years - the catch is that ECC RAM is much more expensive). And the economies of scale for producing just one kind of RAM, instead of "server RAM" and "desktop RAM", would quickly offset even those costs within a HW design generation.

which is, of course, the reason why it hasn't happened - artificial market segmentation is extremely profitable. You can only charge more for "server-class" hardware if they have a few things which don't exist or are uncommon on "consumer" motherboards - e.g. ECC being uncommon outside of AMD chipsets, consumer motherboards having SATA rather than SAS (which should just replace SATA entirely), and consumer drives being SATA interface rather than SAS.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 22:29 UTC (Mon) by jrn (subscriber, #64214)

Or maybe an argument in favor of using error correcting codes in that binary format. :)

Though it's hard to beat the robustness against corruption of an uncompressed text file. Entire sectors can be missing, and the rest is still easily readable without requiring specialized skills.

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 5:45 UTC (Tue) by slashdot (guest, #22014)

This is because text has a record delimiter that is not used within the records (the newline character), making synchronization trivial.
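To make the resynchronization point concrete, here is a minimal sketch in Python. It assumes an ASCII log with one record per line; the function name readable_lines is made up for this illustration and is not part of any syslog tool. Any chunk between two newlines that contains garbage is dropped, and everything else is kept.

```python
def readable_lines(raw):
    """Return the lines of a damaged ASCII log that still look intact.

    Assumption for this sketch: valid records are printable ASCII and
    end with a newline. Anything else is treated as corruption.
    """
    lines = []
    for chunk in raw.split(b"\n"):
        try:
            text = chunk.decode("ascii")
        except UnicodeDecodeError:
            # Non-ASCII bytes: this chunk overlaps the damaged region.
            continue
        if text and text.isprintable():
            lines.append(text)
    return lines


if __name__ == "__main__":
    log = bytearray(b"a ok\nb ok\nc ok\nd ok\n")
    log[6:12] = b"\xff" * 6   # simulate a damaged region spanning two lines
    print(readable_lines(bytes(log)))
```

The damaged region takes out the two records it touches, including the newline between them, but the reader picks up again at the next surviving newline. Note that isprintable() also rejects tabs, so a real log with tab characters would need a looser check.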

A binary format with record hashes is also similarly recoverable, since you can just try all record start positions until one hashes properly (much more expensive computationally, but on current CPU it won't be noticeable if records aren't huge).
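A rough sketch of that recovery scan, in Python. The record layout here (4-byte length, truncated SHA-256 over length plus payload, then the payload) is invented for this example and is not the Journal's format; the names pack_record, try_record and recover are likewise hypothetical. The size cap is an assumption that keeps a garbage length field from making each attempt expensive.

```python
import hashlib
import struct

HEADER = struct.Struct(">I")   # 4-byte big-endian payload length
HASH_LEN = 8                   # truncated SHA-256; arbitrary choice for this sketch
MAX_PAYLOAD = 1 << 16          # assumed upper bound on record size


def pack_record(payload):
    length = HEADER.pack(len(payload))
    digest = hashlib.sha256(length + payload).digest()[:HASH_LEN]
    return length + digest + payload


def try_record(buf, pos):
    """Return (payload, next_pos) if a valid record starts at pos, else None."""
    body_start = pos + HEADER.size + HASH_LEN
    if body_start > len(buf):
        return None
    (length,) = HEADER.unpack_from(buf, pos)
    end = body_start + length
    if length > MAX_PAYLOAD or end > len(buf):
        return None
    length_bytes = bytes(buf[pos:pos + HEADER.size])
    stored = bytes(buf[pos + HEADER.size:body_start])
    payload = bytes(buf[body_start:end])
    if hashlib.sha256(length_bytes + payload).digest()[:HASH_LEN] != stored:
        return None
    return payload, end


def recover(buf):
    """Scan every offset; keep each record whose hash verifies."""
    records, pos = [], 0
    while pos < len(buf):
        hit = try_record(buf, pos)
        if hit is None:
            pos += 1          # not a record start; try the next byte
        else:
            payload, pos = hit
            records.append(payload)
    return records
```

Calling recover() on a buffer with one flipped byte in the middle record returns the records on either side of it. The hash covers the length field as well, so a run of zero bytes does not parse as a stream of empty records.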

Add sync markers and tags to binary files.

Posted Nov 22, 2011 6:31 UTC (Tue) by eru (subscriber, #2753)

"This is because text has a record delimiter that is not used within the records (the newline character), making synchronization trivial."

Binary format can easily have the same property: A synchronization marker (not necessarily a single byte) that is guaranteed to not appear in the data. This means the actual data needs some processing to avoid the marker, but this can be cheaper than a conversion to text. Eg. if your sync marker is 0x55, double it if it appears in the payload data. Some other byte combinations starting with 0x55 could tag the type of following data (date, numbers of different sizes, string etc), which also helps parse possibly corrupted files.
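The byte-stuffing idea can be sketched in a few lines of Python. The 0x55 escape byte comes from the comment above; the 0x01 "record start" tag and the encode/decode names are assumptions made up for this illustration. A real format would define more tags (for dates, integers, strings and so on).

```python
SYNC = 0x55         # escape/sync byte, as suggested above
TAG_RECORD = 0x01   # hypothetical tag meaning "a new record starts here"


def encode(records):
    out = bytearray()
    for payload in records:
        out += bytes([SYNC, TAG_RECORD])
        # Double every 0x55 in the payload so a lone 0x55 never appears in data.
        out += payload.replace(bytes([SYNC]), bytes([SYNC, SYNC]))
    return bytes(out)


def decode(stream):
    records = []
    current = None   # stays None until the first record marker is seen
    i = 0
    while i < len(stream):
        b = stream[i]
        if b == SYNC and i + 1 < len(stream):
            nxt = stream[i + 1]
            if nxt == SYNC:              # stuffed byte: a literal 0x55
                if current is not None:
                    current.append(SYNC)
                i += 2
                continue
            if nxt == TAG_RECORD:        # marker: close the old record, open a new one
                if current is not None:
                    records.append(bytes(current))
                current = bytearray()
                i += 2
                continue
        if current is not None:
            current.append(b)
        i += 1
    if current is not None:
        records.append(bytes(current))
    return records
```

If the first marker is overwritten, decode() skips bytes until the next 0x55 0x01 pair and returns the remaining records. One caveat of this simple scheme: if corruption lands inside a run of doubled 0x55 bytes, the decoder can lose track of which 0x55 is the escape, so a payload such as 0x55 0x55 0x01 may be misread as a marker. Picking a longer marker sequence would reduce that risk.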

The Journal - a proposed syslog replacement

Posted Dec 20, 2011 8:03 UTC (Tue) by topher (guest, #2223)

"That's not an argument against binary logging."

Actually, it is. And it's a valid one. Take the case of a corrupted block on your filesystem. We're not caring how it got there (physical problem, filesystem problem, whatever), but it's there. If you've got a text file with a corrupted chunk in it, you can generally recover everything except for the corrupted part with no special tools.

Now imagine the same scenario, but with a binary file. You better hope and pray that whoever wrote the tools for that (currently undocumented) binary format has specialized tools for analyzing and recovering from file corruption. Otherwise, there's an excellent chance you just lost that entire file. Depending on the internal format of the file, you might be screwed no matter what (especially if the file uses internal compression, which could mean your corruption just "infected" your entire file).
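A small Python experiment illustrates the compression point. It uses zlib only as a stand-in for "a file with internal compression" and says nothing about how the Journal actually stores data; the helper names and the sample log lines are made up for this sketch.

```python
import zlib


def flip_byte(data, index):
    """Return a copy of data with one byte inverted, simulating corruption."""
    damaged = bytearray(data)
    damaged[index] ^= 0xFF
    return bytes(damaged)


def intact_lines(original, damaged):
    """Count lines of the original text that survive unchanged in the damaged copy."""
    surviving = set(damaged.split(b"\n"))
    return sum(1 for line in original.split(b"\n") if line in surviving)


def try_decompress(blob):
    """Return the decompressed bytes, or None if zlib rejects the stream."""
    try:
        return zlib.decompress(blob)
    except zlib.error:
        return None


# Sixty made-up log lines, stored once as plain text and once compressed.
log = b"\n".join(b"Nov 21 07:08:%02d host kernel: event %d" % (i, i) for i in range(60))
packed = zlib.compress(log)

plain_damaged = flip_byte(log, len(log) // 2)
packed_damaged = flip_byte(packed, len(packed) // 2)

if __name__ == "__main__":
    print("plain text lines still intact:", intact_lines(log, plain_damaged))
    print("compressed copy readable:", try_decompress(packed_damaged) == log)
```

With the plain text, a single bad byte costs one line. With the compressed copy, the same single bad byte means a plain zlib.decompress() call no longer gives back the original log at all, and getting anything out would take a purpose-built recovery tool.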

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds