>PS. you forgot from your list the fact that most this kind of binary formats take more space than text formats.
You might want to read the docs again:
"The fields, an entry consists off, are stored as individual objects in the journal file, which are then referenced by all entries, which need them. This saves substantial disk space since journal entries are usually highly repetitive (think: every local message will include the same _HOSTNAME= and _MACHINE_ID= field). Data fields are compressed in order to save disk space. The net effect is that even though substantially more meta data is logged by the journal than by classic syslog the disk footprint does not immediately reflect that."
Posted Nov 18, 2011 23:53 UTC (Fri) by jubal (subscriber, #67202)
That's grand. Now could you please tell me how would you recover that information in case of disk failure?
The Journal - a proposed syslog replacement
Posted Nov 19, 2011 1:40 UTC (Sat) by HelloWorld (guest, #56129)
> That's grand. Now could you please tell me how would you recover that information in case of disk failure?
Using the "cp" command and the backup, obviously.
The Journal - a proposed syslog replacement
Posted Nov 19, 2011 2:02 UTC (Sat) by jubal (subscriber, #67202)
I think that my sarcasm detector failed here. Surely you're not seriously telling me that in the hypothetical case of a disk failure, the way to recover the most recently written log entries is to combine recovered, compressed and possibly damaged binary blobs with random compressed binary blobs?
See, no matter if it was done intentionally or not, the redundancy of the syslog entries is a feature, not a bug.
The Journal - a proposed syslog replacement
Posted Nov 19, 2011 2:11 UTC (Sat) by HelloWorld (guest, #56129)
You can't trust data from a potentially damaged disk anyway. The only sane thing to do is make sure that your backup infrastructure is in place, so you don't need the data from the damaged disk. Or, if you can't afford to lose the data between two backups, use a RAID array.
The Journal - a proposed syslog replacement
Posted Nov 19, 2011 2:24 UTC (Sat) by jubal (subscriber, #67202)
You do realise that you might be looking for the data written directly before the crash? (Also: no, RAID is not a silver bullet, and please don't tell me that a SAN or NAS storage appliance will always prevent data loss: it should, but sometimes it can't.) And a remote log daemon won't help much if it's the remote log daemon's storage that just evaporated.
The Journal - a proposed syslog replacement
Posted Nov 19, 2011 3:16 UTC (Sat) by cwillu (subscriber, #67268)
"Trusted" is not a binary distinction.
Data off a broken disk isn't as reliable as data from a functional disk. However, data off a broken disk isn't completely unreliable, and data from a functional disk isn't completely reliable.
What matters is whether the data is useful, and in my experience, that data sitting on a broken disk is frequently useful.
The Journal - a proposed syslog replacement
Posted Nov 21, 2011 14:22 UTC (Mon) by nix (subscriber, #2304)
Quite. I have repeatedly in the past been able to get a lot of useful info from logs damaged by disk or fs damage. Sure, a few blocks were full of \0s or just plain garbage --- but the rest was readable. Who cares if the file as a whole no longer conforms to any formal grammar? Human beings don't need one!
But computers do. A binary->text tool would probably have given up in the face of such damage. At best it would go down a rarely-used hence buggy parser-error-recovery code path.
This depends on the design, actually
Posted Nov 21, 2011 18:27 UTC (Mon) by khim (subscriber, #9252)
> But computers do. A binary->text tool would probably have given up in the face of such damage. At best it would go down a rarely-used hence buggy parser-error-recovery code path.
Computers do what they are programmed to do. If you write a program that tolerates corrupt files, then it will do just that. It's not as hard to do as you think: you don't need a cryptographic hash for that, CRC32 is enough and it's extremely fast nowadays. But sure, you must plan for that in advance.
This is what Google does with its petabytes of logs, and it works fine from what I've heard.
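To make the CRC32 idea concrete, here is a minimal sketch of a corruption-tolerant record format. It is not the journal's actual on-disk layout, nor anything Google is known to use. The sync marker `MAGIC`, the header layout, and the names `encode_record` and `recover_records` are all assumptions made up for illustration. Each record carries its length and a CRC32 of its payload. The reader skips anything that fails the check and resynchronises on the next marker.

```python
import struct
import zlib

# Hypothetical framing: 4-byte sync marker, payload length, CRC32 of payload.
MAGIC = b"\xf0LOG"
HEADER = struct.Struct("<4sII")


def encode_record(payload: bytes) -> bytes:
    """Frame one log record so that damage to it can be detected later."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def recover_records(blob: bytes) -> list:
    """Return every payload whose checksum still matches, skipping garbage."""
    records = []
    pos = 0
    while True:
        pos = blob.find(MAGIC, pos)
        if pos < 0:
            break  # no further sync markers in the file
        body_start = pos + HEADER.size
        if body_start <= len(blob):
            _, length, crc = HEADER.unpack_from(blob, pos)
            payload = blob[body_start:body_start + length]
            if len(payload) == length and zlib.crc32(payload) == crc:
                records.append(payload)
                pos = body_start + length
                continue
        # Damaged or truncated record: step forward and look for the next marker.
        pos += 1
    return records
```

With three records written back to back and a few bytes of the middle one overwritten with zeros, `recover_records` returns the first and third payloads and drops the damaged one. The point is only that the recovery path is the normal read path, so it is exercised on every read rather than being a rarely-used error branch.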