
The Journal - a proposed syslog replacement

Lennart Poettering and Kay Sievers discussed their concept of the "journal" at the 2011 Kernel Summit; now they have posted a detailed document describing how they think their syslog replacement should work. "Break-ins on high-profile web sites have become very common, including the recent widely reported kernel.org break-in. After a successful break-in the attacker usually attempts to hide his traces by editing the log files. Such manipulations are hard to detect with classic syslog: since the files are plain text files no cryptographic authentication is done, and changes are not tracked. Inspired by git, in the journal all entries are cryptographically hashed along with the hash of the previous entry in the file. This results in a chain of entries, where each entry authenticates all previous ones. If the top-most hash is regularly saved to a secure write-only location, the full chain is authenticated by it. Manipulations by the attacker can hence easily be detected." The plan is to get an initial implementation into the Fedora 17 release.


The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:13 UTC (Fri) by tshow (subscriber, #6411) [Link]

Sounds like a decent idea, but what it probably means is that we'll keep seeing news stories about high-profile break-ins, with a footnote that the victim site kept their log authentication keys and hash in /root...

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:17 UTC (Fri) by willnewton (guest, #68395) [Link] (69 responses)

So it replaces text files that can be read and processed with the standard UNIX tools with an undocumented binary format that can only be read by a single tool?

Think I'll pass.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:26 UTC (Fri) by JoeBuck (guest, #2330) [Link] (13 responses)

Why must the format be undocumented? It could probably be done in a simple enough way that a few lines of Perl or python would dump it in a human-readable form, since all you need is a clean record definition. Each record would have a timestamp, hash, and message.
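
A minimal sketch of what such a dump script might look like, in Python. The record layout used here (an 8-byte timestamp, a 32-byte SHA-256 digest, a 4-byte length and then the message) is purely an assumption made up for illustration; it is not the journal's actual on-disk format, and the names `pack_record` and `dump_records` are hypothetical.

```python
import hashlib
import struct
from datetime import datetime, timezone

# Hypothetical record layout (NOT the journal's real format):
#   8-byte big-endian timestamp, microseconds since the epoch
#   32-byte SHA-256 digest
#   4-byte big-endian message length, followed by the UTF-8 message
HEADER = struct.Struct(">Q32sI")


def pack_record(timestamp_us, digest, message):
    """Serialize one record in the assumed layout."""
    body = message.encode("utf-8")
    return HEADER.pack(timestamp_us, digest, len(body)) + body


def dump_records(blob):
    """Yield one human-readable line per record found in 'blob'."""
    offset = 0
    while offset < len(blob):
        timestamp_us, digest, length = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        message = blob[offset:offset + length].decode("utf-8")
        offset += length
        when = datetime.fromtimestamp(timestamp_us / 1_000_000, tz=timezone.utc)
        # Show only a short prefix of the digest to keep lines readable.
        yield f"{when.isoformat()} {digest.hex()[:12]} {message}"
```

With a clean record definition like this, the output of `dump_records` is ordinary text that can be piped into grep or any other standard tool.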

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:51 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (12 responses)

From the FAQ:
> Will the journal file format be standardized? Where can I find an explanation of the on-disk data structures?
>
> At this point we have no intention to standardize the format and we take the liberty to alter it as we see fit. We might document the on-disk format eventually, but at this point we don’t want any other software to read, write or manipulate our journal files directly. The access is granted by a shared library and a command line tool. (But then again, it’s Free Software, so you can always read the source code!)

No, thanks. Text is universally forward- and backward-compatible. This monstrosity won't be.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:17 UTC (Fri) by JoeBuck (guest, #2330) [Link] (1 responses)

This seems ill-advised, because logs might need to be analyzed at some point in the future. They are critical data, and you never want to have a problem reading your critical data.

I don't care if they leave things loose during the initial design and evaluation, but you don't want to deploy this thing on a server you care about unless they tighten it up.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:14 UTC (Fri) by niner (guest, #26151) [Link]

I also keep critical data in a PostgreSQL database which can only be read by a single program - PostgreSQL. It would probably be nice to be able to read it with a text editor, but then I'd probably lose all the benefits PostgreSQL gives me, so I can live with it. As I will with having to use some command like logread on my wireless router to get at the logs.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:29 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

Well, it makes sense to use an undocumented format at first, so others won't treat it as ABI. But it's not going to be complex, so it's definitely going to be replicated in other tools.

As for backwards compatibility, I'm personally going to write a FUSE file system which exposes journal records as regular log files. After all, I already have one that does that for Windows...

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 19:59 UTC (Fri) by drag (guest, #31333) [Link] (4 responses)

I don't mind piping out binary files to grep for things.

I do it for syslogs at work anyways. All the files are stored in gzip format and I have to use 'gunzip -c' to read them.

However, log files are definitely a weak point for Linux. A huge pain in the ass. If it's a mostly-text file with something like null-terminated fields, then it would make things a lot easier and more efficient to parse.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:18 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)

Ever heard of zgrep?

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 7:49 UTC (Sun) by drag (guest, #31333) [Link] (2 responses)

... sure. What about it?

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 23:56 UTC (Mon) by kmself (guest, #11565) [Link] (1 responses)

I believe the point is that you can access / search / manage / view compressed files with readily available shell tools.

zgrep will search them.
zcat or zless will dump them to stdout.

lesspipe + less will allow you to view them with less, as you would any other file (lesspipe will also render numerous other file formats as straight text, which is particularly useful).

Woe unto you if you're on a system without these tools, but that's another story. Writing your own shell wrappers (scripts/functions) is trivial.
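
For a system that lacks the z* tools but does have Python, a wrapper along these lines might do. This is only a sketch: the function name `zgrep` is borrowed from the shell tool, it handles a single file and a single pattern, and it assumes the log is UTF-8 text.

```python
import gzip
import re


def zgrep(pattern, path):
    """Yield lines matching 'pattern' from a gzip-compressed text file."""
    regex = re.compile(pattern)
    # "rt" decompresses and decodes on the fly; undecodable bytes are replaced
    # rather than raising, since old logs are not always clean.
    with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if regex.search(line):
                yield line.rstrip("\n")
```

Something like `for hit in zgrep("sshd", "/var/log/messages.1.gz"): print(hit)` would then behave roughly like the shell command; the path is just an example.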

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 0:55 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I was more pointing out that tools exist that do the gunzip -c internally. I almost never use gunzip except via the z* tools and tar xf. I always appreciate shortcuts for things like that (at least those commonly available on random machines, e.g. my vim bindings don't modify core behavior that is easy to get a bad habit with when sshing around, like nnoremap jj <Esc>).

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 2:54 UTC (Sun) by THe_ZiPMaN (guest, #27460) [Link] (3 responses)

Who says you cannot use a text format?
You could even use standard syslog with only one minor change: it's enough to append to each line another piece of text containing the hash computed by concatenating the syslog line with the hash of the previous line.

line 1: Timestamp1 Message1 Hash(line1 + salt)
line 2: Timestamp2 Message2 Hash(line2 + hash1)
line 3: Timestamp3 Message3 Hash(line3 + hash2)
line 4: Timestamp4 Message4 Hash(line4 + hash3)
...

Three years ago, in Italy, a proposed law would have required guaranteeing the integrity of logs, and at that time I started writing a patch for rsyslog to implement exactly the above scheme. The law wasn't approved in that form, so I abandoned development, but it's not too complicated to write.
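
The scheme above can be sketched in a few lines of Python. This is an illustration only, not the abandoned rsyslog patch: SHA-256 is an assumed choice of hash, and `chain_hash`, `append_line` and `verify` are hypothetical helper names.

```python
import hashlib


def chain_hash(line, previous):
    """Hash a log line together with the previous hash (or the salt)."""
    return hashlib.sha256((line + previous).encode("utf-8")).hexdigest()


def append_line(log, line, salt):
    """Append 'line' plus its chained hash to the list 'log'."""
    # The hash is the last space-separated token of the previous entry.
    previous = log[-1].rsplit(" ", 1)[1] if log else salt
    log.append(f"{line} {chain_hash(line, previous)}")


def verify(log, salt):
    """Return the index of the first entry that fails to verify, or None."""
    previous = salt
    for index, entry in enumerate(log):
        line, recorded = entry.rsplit(" ", 1)
        if chain_hash(line, previous) != recorded:
            return index
        previous = recorded
    return None
```

Each entry stays an ordinary text line with one extra trailing field, so grep, awk and friends keep working on the file.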

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 6:51 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

one interesting issue with using a hash like this (both your proposal and the journal proposal) is that it will cause problems when logs get sent over the network.

the logs may end up arriving in a different order than you sent them for a number of reasons (UDP messages can pass each other on a network, you could send some logs to a relay box that fails before it re-sends them and the backup relay box sends newer logs on first, etc)

also if you are combining logs from many systems on one box, then you now have many different hash chains to track (and things get even more fun if you have more than one server sending with the same ID, something that happens in real life, even though it's a really bad idea)

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 14:04 UTC (Sun) by THe_ZiPMaN (guest, #27460) [Link]

No, that's not the case because you are mixing 2 different problems.

Syslog already allows for secure, authenticated and reliable network communications, particularly using the RELP protocol (rfc 3195), so there's no need to add an overhead to that phase.

The hash is then added only when the message is saved on the disk, not before sending it over the network. Its scope is only to guarantee that the log has not been modified after being written, not to guarantee it's immutable during network paths.
A "continuous hashing" calculated on a per file basis (or per DB if you prefer to save data in a database) is a possible cheap solution to this problem.

The solution is not as complete as the journal, but it's really simple to add to existing daemons (rsyslog and syslog-ng), does not require any invasive operation on the syslog infrastructure, allows complete backward compatibility, and addresses a big part of the problem the journal tackles in a simple way.
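
To illustrate the point about hashing at write time, here is a small Python sketch. The `ChainedWriter` class and the `intact` helper are hypothetical names, and SHA-256 is an assumed hash; the idea shown is only that the chain follows the order in which the receiving daemon writes messages, so the order in which they were sent does not matter.

```python
import hashlib


def _link(message, previous):
    """Hash one message together with the previous link in the chain."""
    return hashlib.sha256((message + previous).encode("utf-8")).hexdigest()


class ChainedWriter:
    """Chains messages in the order they are written, not the order sent."""

    def __init__(self, salt):
        self.head = salt
        self.lines = []

    def write(self, message):
        self.head = _link(message, self.head)
        self.lines.append((message, self.head))


def intact(lines, salt):
    """Return True if every stored hash matches a fresh recomputation."""
    previous = salt
    for message, recorded in lines:
        if _link(message, previous) != recorded:
            return False
        previous = recorded
    return True
```

A writer that receives "seq=2" before "seq=1" from a flaky relay still produces a file that verifies, because nothing in the chain depends on the senders' ordering.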

The Journal - a proposed syslog dropout?

Posted Dec 1, 2011 17:06 UTC (Thu) by gvy (guest, #11981) [Link]

It's weird that neither msyslogd nor the preceding ssyslogd is even mentioned -- it's much better to reinvent the wheel at least knowing of prior art, /and/ to criticize the process thereof either. :)

Thanks Solar Designer, here's the link (in Russian).

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 19:39 UTC (Fri) by jmorris42 (guest, #2203) [Link] (47 responses)

More importantly it is of a piece with the rest of this cast of fools' past work. They despise UNIX, the philosophy behind it and the horse it rode in on. Each project tends to be designed to attack a core UNIX idea. This is a direct attack on Everything is a File. They prefer the Windows philosophy of Everything is an API.

These incidents point to a common problem in Linux land. We have allowed immigrants commit access to key parts of our cultural heritage before we assimilated them to our ways. Unless we find a way to reverse this trend we are soon going to find ourselves strangers in our own land, doing things the Windows Way because they fled Windows and remade Linux in its image due to their superior numbers.

Not trying to get political, but this notion does have a real world analogy. Look at population and voting patterns in the Western US. People are fleeing the failed Blue coastal states like CA yet bringing the same failed Blue state ideas with them, turning the inner band of states Blue as they migrate in and repeating the cycle as they flee in ever increasing numbers. Bringing it back, these Windows refugees are doing the same thing. Windows doesn't suck because Microsoft is incompetent (they have really bright folks); the ideas that underlie it suck. They are now bringing the suck into Linux because they won't face the fact that the ideas THEY hold suck. That THEY are the problem.

Credit where it is due, systemd does actually work and is documented, unlike, example at random..., pulseaudio. But this syslog replacement is a defective solution in search of a problem that doesn't even exist. Syslog supports logging to the network, that is the only truly secure solution to any security problems.

That Fedora is picking this turkey up just confirms for me, as a current Fedora user, that it is time to GO.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:14 UTC (Fri) by raven667 (subscriber, #5198) [Link] (1 responses)

> We have allowed immigrants commit access to key parts of our cultural heritage...[snip]

Really? We can't discuss ideas based on their technical merit because that would be offensive to our "UNIX cultural heritage"? Really?! That's the most asinine thing I have read all day.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 3:52 UTC (Sat) by wmf (guest, #33791) [Link]

I wouldn't use the word "heritage" because that sounds like a fallacious appeal to tradition, but Unix is definitely a culture and it can be useful to think in those terms. (Check out http://www.faqs.org/docs/artu/philosophychapter.html if you haven't already.)

There are now several projects that are using Linux but aren't really part of Unix culture (OLPC, Android, Ubuntu, to a lesser extent Fedora with all its Lennartisms). I think it's arrogant to assume that Unix has already gotten everything right so I support all this exploration, but perhaps it deserves a warning label like "Fedora's not Linux as you know it; Slackware is that way".

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:17 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (4 responses)

> They prefer the Windows philosophy of Everything is an API.

That's odd. I found systemd to be more EIAF than SysV init. To change the default TTYs from 1-6 to just 1 and 2 with SysV init, I had to sed a sysconfig file. For F15, I deleted the tty3-6 files from systemd's /etc directory. Of course, now it defaults to just 1 and auto spawns, which I like much better. You can enable services by symlinking files /lib into /etc, not by the chkconfig API.

> Credit where it is due, systemd does actually work and is documented, unlike, example at random..., pulseaudio.

Pulseaudio works for me (though the first audio-using app may not work due to a race condition, it's my fault for wanting to do it closer to on-demand than running things I don't need...should probably file a bug). True, the documentation could be better, but what project doesn't have that problem?

I am a little disappointed that it's aiming for F17 instead of F18, but I'll see how it works out on my Rawhide box at home first. Hopefully the binary format is dropped first...

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 8:37 UTC (Tue) by Seegras (guest, #20463) [Link] (2 responses)

> For F15, I deleted the tty3-6 files from systemd's /etc directory.

That sounds more like D.J.Bernsteinish...

And...

Posted Nov 22, 2011 12:58 UTC (Tue) by khim (subscriber, #9252) [Link]

You said it like it's a bad thing.

I actually respect D.J.Bernstein very much. I refuse to use his creations not because they are bad, but because he insists that he, alone, knows the truth. He's right in about 95% or maybe even 99% of cases, but because sometimes he is wrong and he does not tolerate deviations at all... well, it's safer to not even try.

But I fail to see why these same approaches, used by people who don't share DJB's "you have no right to touch my baby" attitude, should be bad...

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 16:38 UTC (Tue) by mathstuf (subscriber, #69389) [Link]

I was pointing out that deleting files to disable things is more EIAF than running sed on a sysconfig file.

Out of curiosity, what makes it djb-ish?

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 1:52 UTC (Wed) by Baylink (guest, #755) [Link]

> You can enable services by symlinking files /lib into /etc, not by the chkconfig API.

You *do* know what chkconfig (and insserv under it) actually does, right?

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:59 UTC (Fri) by AndreE (guest, #60148) [Link] (1 responses)

Immigrants who try to improve things are far more valuable than self-proclaimed cultural guardians who rely on slinging mud and are far removed from rational discourse.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 0:09 UTC (Mon) by Tara_Li (guest, #26706) [Link]

Here's the thing - it's not really an improvement.

It's like a friend of mine who had a problem with her Kindle. She could not get it to connect to her wireless network. She couldn't, however, manage it - and the Kindle's only advice was to check the Amazon website. *ALL* of the Kindle's help files are stored on the Amazon servers - but if you need help because you can't access those servers - you are out of luck. Here lies the problem with much of the current path of "progress". Can't find the file you had stored in the cloud? Access the cloud help information - stored in the cloud! Can't get online? The help you need is stored - online. Can't get your computer to boot? The help you need - is on that CD you need the *BLEEEEEEEEEEEEEEEEEEEEPING* computer to access!

Exploring new paths is a good thing - but this isn't a new path. This is a path that's already been explored - and found wanting.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:41 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (3 responses)

> People are fleeing the failed Blue coastal states like CA yet bringing the same failed Blue state ideas with them, turning the inner band of states Blue as the migrate in and repeating the cycle as they flee in ever increasing numbers.

Let's keep politics out of this discussion. Not everyone agrees with your assessment of our recent history, and in any case, the analogy adds more heat than light.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:22 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (1 responses)

> the analogy adds more heat than light

I like this analogy. Mind if I later use this quotemstr? ;)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 3:43 UTC (Sat) by k8to (guest, #15413) [Link]

That's a classic, though well applied.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 1:54 UTC (Wed) by Baylink (guest, #755) [Link]

Yes, we have enough political and religious wars in computing about topics having nothing to do with politics and religion; I don't see that we need to discuss the actual items, either.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 0:04 UTC (Sat) by cmccabe (guest, #60281) [Link] (5 responses)

I'm going to have to disagree. Security is a pretty major issue, and anything that improves system security on desktop and server Linux is probably a win.

It's a shame that we will lose the simplicity of the plain-text syslog format. But syslogs are usually compressed using gzip anyway. So essentially for me, all this means is that I use <magic-lennart-tool> instead of gzcat as the first part of my shell command.

The big issue that I see is that a lot of system administrators will treat this as magic security dust, and not realize that they need to periodically save those hashes to a remote (and secure!) system in order to get any security benefit.

I also hope Lennart and co. realize the absolute necessity of backwards compatibility for the on-disk format. It would really embitter a lot of system administrators if their old logs became unreadable after upgrading to the shiniest new version. But assuming this is managed well, I don't see any reason why this couldn't be a good idea.

P.S. Providing a FUSE filesystem that can read these files is a good idea!
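
Why the off-host copy matters can be shown with a short Python sketch. Everything here is an assumption made for illustration (SHA-256, the helper names, and a checkpoint consisting of an entry count plus the newest hash); it is not how the journal itself is implemented. An attacker who rewrites the log can recompute a perfectly consistent chain, and only the saved checkpoint exposes the rewrite.

```python
import hashlib


def build_chain(messages, salt):
    """Return (message, hash) pairs, each hash covering the previous one."""
    entries, previous = [], salt
    for message in messages:
        previous = hashlib.sha256((message + previous).encode("utf-8")).hexdigest()
        entries.append((message, previous))
    return entries


def take_checkpoint(entries):
    """The pair (entry count, newest hash) to be shipped to a remote host."""
    return len(entries), entries[-1][1]


def matches_checkpoint(entries, salt, checkpoint):
    """Check internal consistency AND agreement with the saved checkpoint."""
    count, saved_hash = checkpoint
    recomputed = build_chain([message for message, _ in entries], salt)
    return (
        recomputed == entries
        and len(entries) >= count
        and entries[count - 1][1] == saved_hash
    )
```

A log that merely grew since the checkpoint still matches, while a log with an entry removed and every hash recomputed does not, even though it is internally consistent.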

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 16:52 UTC (Sat) by jmorris42 (guest, #2203) [Link] (4 responses)

> I'm going to have to disagree. Security is a pretty major issue, and
> anything that improves system security on desktop and server Linux is
> probably a win.

Except it isn't. It greatly complicates the system for little or no net gain in security. First rule: if the host might be compromised you must not trust it. They admit the existence of this rule and admit the only solution is to put some of the checkpoints on a different system. They fail to see the obvious, put them ALL on a different system. The people who designed syslog understood security better than Lennart does.

That is The UNIX Way. We learn what works and keep it.

"The access is granted by a shared library and a command line tool."

Haven't we seen this story enough times to see the pattern? The tool will be minimal, just for quick use by UNIX diehards to shut them up, while all 'serious' use will be expected to be through the API/library.

For a 'real programmer' it doesn't matter, but UNIX philosophy envisions a spectrum of skills, not a binary user/developer divide. Windows philosophy on the other hand is different. Remember that a key Microsoft design goal for the registry was to fit into their overarching project to end 'power users.' Professional developers write code, users click on widgets and nothing in the middle.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 23:07 UTC (Sat) by cmccabe (guest, #60281) [Link] (3 responses)

The problem with putting all your syslogs on a remote system is that it can eat quite a bit of bandwidth. I honestly haven't seen any systems actually configured for remote syslog in my time in the industry. In contrast, sending a single hash to a central single-purpose server every hour doesn't raise any bandwidth issues.

This whole project was motivated by the administrators' inability to trust the kernel.org syslogs after the breakin. It's always a good sign when a project solves a real problem that some smart people encountered.

> Haven't we seen this story enough times to see the pattern? The tool will
> be minimal, just for quick use by UNIX diehards to shut them up, while all
> 'serious' use will be expected to be through the API/library.

I agree that there is a lot of DBUS-itis and shared-library-itis out there, especially from the GNOME devs. (The GNOME devs have done a lot of good work as well, but I have to call it like I see it.) But I don't think this falls into that category.

As long as 'logger' continues to work, and there is some way to cat the syslog to a file, it will be as easy to use syslog on the command line as it ever was. I mean, syslogd has been a daemon and syslog(3) has been an API since the seventies; you can hardly complain that Lennart is adding yet more daemons and APIs to the system. If you add a FUSE interface that exposes a writable file named /var/log/syslog that has all the log entries, it's as scriptable as it ever was.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 21:14 UTC (Wed) by jeremiah (subscriber, #1221) [Link] (2 responses)

> I honestly haven't seen any systems actually configured for remote syslog in my time in the industry.

um... We do this on all of our servers. I love being able to tail one single file and get everything going on. Now, we don't do this outside of our local network. But a gig network seems to be able to handle the bandwidth just fine. I'd love a nice tool that would allow us to securely archive log files. And to me it seems that Journal is trying to do both, archive and transport. I don't think these two processes should affect each other. We also find that Splunk works pretty well for the archiving and searching, as well as for getting hold of log files that don't output to syslog, i.e. request.log.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 23:36 UTC (Wed) by dlang (guest, #313) [Link] (1 responses)

splunk is great for searching logs (I have a large cluster of machines for doing exactly this), but in terms of gathering and transporting logs, it's far from the best.

take a look at syslog-ng and rsyslog and the options they have to gather data from log files written by other apps.

The Journal - a proposed syslog replacement

Posted Nov 25, 2011 15:02 UTC (Fri) by jeremiah (subscriber, #1221) [Link]

those are on my list. It's mostly an issue of upgrading servers at this point to versions that support them.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:01 UTC (Sat) by HelloWorld (guest, #56129) [Link] (12 responses)

You know what? Nobody in the real world cares about the so-called "Unix philosophy". Rob Pike said it best: "Not only is UNIX dead, it's starting to smell really bad."
What people do care about is problems being solved, like the problems that Lennart documented in his proposal. And the thing that the self-proclaimed guardians of the "Unix philosophy" all have in common is that they don't solve problems but deny them, point to 20-year-old would-be solutions that are known to suck and blather about things such as "cultural heritage". That's why Lennart's projects keep being adopted by pretty much all major distros, despite "Unix philosophers" whining and bitching about it.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:16 UTC (Sat) by jubal (subscriber, #67202) [Link] (5 responses)

Mr World, excuse me, are you a sysadmin?

The quoted journald document promises:
– breaking compatibility with existing standard and protocols, used not only by Linux systems, but by almost everything else,
– coupling journald with systemd, thus making it non-portable, thus locking out non-Linux systems, all Linux systems not using systemd as an init replacement and most appliances,
– using *undocumented* binary format for storing compressed and possibly encrypted and/or signed data.

Frankly, these three issues nullify all other described advantages.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:55 UTC (Sat) by HelloWorld (guest, #56129) [Link] (3 responses)

> – breaking compatibility with existing standard and protocols, used not only by Linux systems, but by almost everything else,
Yeah, except that it doesn't. From said document: "Data can be generated from a variety of sources: [...], userspace messages generated with syslog(3)"

> – coupling journald with systemd, thus making it non-portable, thus locking out non-Linux systems, all Linux systems not using systemd as an init replacement and most appliances,
If this is a problem for you, then use something else. Just don't expect Lennart to make his life harder by supporting non-Linux and non-systemd boxes.

> – using *undocumented* binary format for storing compressed and possibly encrypted and/or signed data.
Why would one document a file format that isn't stable and may be changed in the future? Especially when you can just use the library to read it...

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 3:40 UTC (Sat) by dskoll (subscriber, #1630) [Link] (2 responses)

> Why would one document a file format that isn't stable and may be changed in the future? Especially when you can just use the library to read it...

We're talking about logs here. They could contain important information needed a long time after it is generated, possibly even for legal reasons. If you are given a court order to produce two-year-old logs and can no longer read your old logs because the file format has changed in the meantime... you will not be very happy (and neither will the court.)

Indexing log files for performance is not a bad idea. Making them tamper-evident is a good idea too. But both of those can be done by adding to existing plain-text log file formats. You don't need to throw that out in favour of a binary format.
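
As a rough Python sketch of indexing layered on top of a plain-text log: the index below is just a list of byte offsets kept alongside the file, which itself stays untouched. The assumptions are loud ones: each line starts with a sortable ISO 8601 timestamp (classic syslog timestamps are not sortable this way), lines are in time order, and `build_index` and `lines_since` are hypothetical names.

```python
import bisect


def build_index(path):
    """Return parallel lists of timestamps and byte offsets, one per line."""
    stamps, offsets = [], []
    with open(path, "rb") as handle:
        while True:
            offset = handle.tell()
            line = handle.readline()
            if not line:
                break
            # Assumed format: the timestamp is the first space-separated field.
            stamps.append(line.split(b" ", 1)[0])
            offsets.append(offset)
    return stamps, offsets


def lines_since(path, index, stamp):
    """Yield lines whose timestamp is >= stamp, seeking past earlier ones."""
    stamps, offsets = index
    start = bisect.bisect_left(stamps, stamp)
    if start == len(offsets):
        return
    with open(path, "rb") as handle:
        handle.seek(offsets[start])
        for line in handle:
            yield line.decode("utf-8").rstrip("\n")
```

If the index is lost or stale it can simply be rebuilt from the text, which is the attraction of adding to the existing format rather than replacing it.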

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 20:59 UTC (Sat) by HelloWorld (guest, #56129) [Link] (1 responses)

> We're talking about logs here. They could contain important information needed a long time after it is generated, possibly even for legal reasons. If you are given a court order to produce two-year-old logs and can no longer read your old logs because the file format has changed in the meantime...
I trust Lennart to develop a solution to this, should this case actually arise. There are two obvious solutions. One is to implement a conversion tool after a format change that will be maintained indefinitely (a tool that simply reads a log file and outputs it again in a new format is unlikely to require a lot of maintenance). Another one is to simply keep the code for reading pre-format-change log files in the library, so that everything just works even with newer tools as long as they use the library.
Also, one should keep in mind that journald is a very young project. I'd guess they'll document the file format eventually when they're sure it does what they need it to do.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 14:40 UTC (Mon) by regala (guest, #15745) [Link]

> journald is a very young project. I'd guess they'll document the file format eventually when they're sure it does what they need it to do

given we're talking about Lennart and Kay, it will never do what they need it to do long enough to call it "stable format". ;)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 7:43 UTC (Sat) by slashdot (guest, #22014) [Link]

Being open source makes these problems all easily fixable:
- Anyone can add support for whatever "standard and protocol" he wants
- Anyone can decouple it from systemd
- Anyone can read the source and document the format

But Red Hat ought to hire a PR spokesman for Poettering, to stop him from making these unnecessarily divisive statements, since he could just say that they "haven't yet had the time" to do those things, rather than outright saying "HAHA NO!" to reasonable points like those.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:38 UTC (Sat) by nybble41 (subscriber, #55106) [Link]

> What people do care about is problems being solved, like the problems that Lennart documented in his proposal.

We also care that these solutions don't reintroduce other problems which we've spent quite a bit of effort over the years resolving. That's what the "UNIX philosophy" is really about--learning from our past mistakes.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:58 UTC (Sat) by cmccabe (guest, #60281) [Link]

You are really taking that quote out of context.

Rob Pike created Plan9, which tried to take UNIX to its logical conclusion. Plan9 tries to really make everything a file, including sockets. It uses the filesystem to control access to resources. And so on. The reason why UNIX "smell[ed] really bad" to Rob Pike is because Plan9 never caught on, because people didn't want to break compatibility. It really is a shame because Plan9 was the superior OS.

That being said, I am in favor of the proposal here-- at least in the context of server systems.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:25 UTC (Sat) by rilder (guest, #59804) [Link] (3 responses)

It is a philosophy, not an implementation. Read that again. It is like a specification. You can write a new tool tomorrow conforming to the Unix philosophy (and many have), and FYI Plan9 has adopted the Unix philosophy (actually it goes both ways), and Rob Pike, whom you are so fancifully quoting, is one of the main developers of Plan9. The Go language from Google is inspired by Plan9 semantics.

"That's why Lennart's projects keep being adopted by pretty much all major distros, despite "Unix philosophers" whining and bitching about it."

Yes, he is doing good work but don't jump the gun. Fedora (the home distro of systemd) is pretty shaky about fully adopting it. RHEL (where the money lies) is not even close. SUSE has delayed it. Ubuntu is going with upstart right ?

If you want to do PR work, don't use LWN, use digg -- you will find plenty there who will buy your stuff.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 14:37 UTC (Sat) by vonbrand (subscriber, #4458) [Link]

I see no "shakiness" in Fedora's uptake of systemd; quite the contrary, the conversion of legacy sysvinit scripts is coming along nicely. That a very conservative distribution like RHEL hasn't yet adopted systemd is no surprise; AFAIU RHEL 6 was forked off around Fedora 14, way before systemd was solid enough. Debian prefers playing with non-Linux kernels, so systemd is out for them. Ubuntu is home to upstart, so it is little surprise it will take time to win them over.

Again: you are trying too hard...

Posted Nov 19, 2011 14:49 UTC (Sat) by khim (subscriber, #9252) [Link] (1 responses)

"That's why Lennart's projects keep being adopted by pretty much all major distros, despite "Unix philosophers" whining and bitching about it."

Yes, he is doing good work but don't jump the gun. Fedora (the home distro of systemd) is pretty shaky about fully adopting it. RHEL (where the money lies) is not even close. SUSE has delayed it. Ubuntu is going with upstart, right?

This is funny. Systemd is not Lennart's first large project. More like the third (not counting minor ones). The first two (avahi and pulseaudio) were met with comments similar to those about systemd, yet now it's hard to find a distribution which has not adopted them.

And systemd adoption is in pretty good shape if you recall that it's still a pretty young project.

Again: you are trying too hard...

Posted Nov 21, 2011 7:01 UTC (Mon) by yoe (guest, #25743) [Link]

Pulseaudio is crap. It hides the devices that ALSA exports for little purpose, and IME introduces more problems than it tries to solve.

Avahi is interesting in theory; but when I try to say "ping celtic.local" on my home network, there's about a 50-50 chance that it'll find what the right IP address is.

I'm not using systemd, and won't change to it, either. Enabling or disabling a sysv init script requires moving about a few symlinks. What more does one need?

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 6:51 UTC (Sun) by skissane (subscriber, #38675) [Link] (10 responses)

I disagree completely. I think the UNIX emphasis on plain text, while it had its value in its original historical context of trying to get something working quickly, holds us back if we treat it like some kind of unquestionable religious dogma.

What is bad about binary formats? Nothing inherently wrong with them. Sure, if they are poorly designed, poorly documented, lack good tools, etc., then no doubt they can cause a lot of grief, but those are problems with poorly designed and poorly implemented binary formats, not with binary formats per se.

What I would like to see, is a simple, flexible, self-describing, binary format. ASN.1 or binary XML are both good places to start, although I think both suffer from useless extra complexity. Maybe something like Google protocol buffers would be a good choice?

If you really want text, you can easily write a tool to dump the binary out to text. Hey, you could have a format with two official serialisations, a text-based and a binary one.

But the problem I see with most Unix-style plain text formats, is every one is different. There is a lack of consistency/standardisation, especially when it comes to how to escape special characters, etc.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 7:11 UTC (Sun) by dlang (guest, #313) [Link] (5 responses)

What makes you think that binary formats would be any more standardized than the existing text formats?

There are ways to do self-describing text formats, but developers don't do it.

With text formats it's a lot easier to examine the file and reverse engineer the format than it is from a binary format.

Damaged/lost files are also a place where text files are easier to recover than binary files.

In theory none of this should ever be needed and binary files are just fine. But this is where the quote "in theory, theory and practice are the same, but in practice they are not" applies

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 7:55 UTC (Sun) by skissane (subscriber, #38675) [Link] (2 responses)

I think, if you want to stick to text, it would be much better if tools output in some standardised text format, e.g. XML, JSON, YAML, etc.

But then, once you have a standardised text format, why not save some space and processing time with an efficient binary serialization of XML/JSON/YAML/what-have-you?

And then you can have a tool, e.g. bin2text, which reads the binary format on standard input and writes the text format on standard output, and vice versa. With such a tool, reverse-engineering/examination should be no harder than with a plain text format.

I think this would be better on two counts: (1) it replaces the rather poorly-defined text formats used at present by many tools, and (2) binary is more efficient than text.

The point you make about trying to recover from corrupted files being easier when they are in text is true, but how often do you have to deal with that? If there were provided some good quality libraries (say C with bindings to other common languages such as C++, Java, Perl, Python, etc.), the odds of a corrupt file due to programmer error should be low, outside of some mid-transaction failure scenario. And if we had transaction support in the library or the underlying filesystem, we could avoid that problem too.
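To make the bin2text idea a bit more concrete, here is a minimal Python sketch. Everything about the encoding is an assumption made up for the example (a field count, then length-prefixed keys and values), and the text side is simply one flat JSON object of strings per line; it is not any existing format.

```python
import json
import struct


def text2bin(text: str) -> bytes:
    """Encode JSON-lines text (one flat object of strings per line) as binary."""
    out = bytearray()
    for line in text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        out += struct.pack(">H", len(record))          # number of fields
        for key, value in record.items():
            k, v = key.encode("utf-8"), str(value).encode("utf-8")
            out += struct.pack(">HI", len(k), len(v)) + k + v
    return bytes(out)


def bin2text(data: bytes) -> str:
    """Decode the hypothetical binary form back into JSON-lines text."""
    lines = []
    pos = 0
    while pos < len(data):
        (count,) = struct.unpack_from(">H", data, pos)
        pos += 2
        record = {}
        for _ in range(count):
            klen, vlen = struct.unpack_from(">HI", data, pos)
            pos += 6
            key = data[pos:pos + klen].decode("utf-8")
            pos += klen
            record[key] = data[pos:pos + vlen].decode("utf-8")
            pos += vlen
        lines.append(json.dumps(record))
    return "\n".join(lines) + ("\n" if lines else "")
```

Wrapped in a small script that reads standard input and writes standard output, the text form can be piped into grep or awk as usual, while the stored form stays binary.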

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 22:12 UTC (Wed) by cas (guest, #52554) [Link] (1 responses)

> But then, once you have a standardised text format, why not save some space and processing time with an efficient binary serialization of XML/JSON/YAML/what-have-you?

  • space is irrelevant these days. multi-terabyte disks are cheap, readily available consumer products
  • in my experience, XML etc *greatly* complicates most jobs, increasing processing time, difficulty of programming, difficulty of understanding WTF is going on. it turns what should be a quick and simple one liner to extract information into a multi-hour programming effort reading API docs, parsing the data in whatever obscured format it's in (and possibly parsing other things like the DTD).
  • it's completely missing the point of XML, JSON, YAML etc - they're data *transfer* protocols, not data *storage* methods. their purpose is to unambiguously transfer data from one system to another, not to store data in yet another obscure special purpose file format
  • it violates the KISS principle. but, then, everything Lennart is involved in does that.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 23:38 UTC (Wed) by dlang (guest, #313) [Link]

even multi-terabyte disks are expensive if you need a lot of them.

I store my logs at 10:1 compression (or better) and I still have 10's of TB of logs to deal with.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 8:12 UTC (Sun) by drag (guest, #31333) [Link] (1 responses)

Well they tried. It ended up being XML. :(

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 19:27 UTC (Sun) by skissane (subscriber, #38675) [Link]

The problem with XML is:
1) a syntax originally designed for marking up documents got reused for
data, with the result that XML provides distinctions which are
unnecessary for data purposes (e.g. element vs. attribute distinction)
2) historical baggage, e.g. DTDs
Certainly you can define new syntaxes which avoid those two problems that
XML has. On the other hand, whatever its warts, XML is an industry standard,
and practical considerations often imply choosing the imperfect industry
standard over some technically superior but rarely used alternative.

But, JSON is quite common now, and addresses some of the issues above. (But
I think it has its own deficiencies too)

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 14:50 UTC (Mon) by sorpigal (subscriber, #36106) [Link] (1 responses)

> I think the UNIX emphasis on plain text, while it had its value in its original historical context of trying to get something working quickly, it holds us back to keep it like it was some kind of unquestionable religious dogma.

It seems logical to say that a standard binary format is just as good as a standard text format, which is why this is carefully documented as one bullet point in the Unix philosophy: use text. The extra overhead was a lot worse back when this idea was developed, yet they stuck with it anyway. If you don't meditate on "Why" then you will invent a non-text system that's "as good or better" than text and suffer as a result. You can either accept received wisdom and "just do it," ignore this sage advice at your own peril or embrace the idea wholeheartedly.

It's not true...

Posted Nov 21, 2011 18:41 UTC (Mon) by khim (subscriber, #9252) [Link]

> It seems logical to say that a standard binary format is just as good as a standard text format

This is not true. To pull useful data from a corrupted text file you need a human being. To pull it from a binary format with embedded CRC checks you only need to rigorously use one very fast function. Sure, this only protects the data from accidental changes (zeroed-out pages in the middle, bitflips, that kind of thing) but the funny thing is that when people describe how they heroically recover data from a corrupt disk or filesystem it's almost always from accidental corruption.

> The extra overhead was a lot worse back when this idea was developed, yet they stuck with it anyway.

Actually it's much worse today. When you had hundreds of kilobytes or maybe a few megabytes of logs, a human as "recovery system" works. When there are gigabytes, terabytes and petabytes of logs, it's hopeless.
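As a rough illustration of the CRC point, here is a minimal Python sketch of automatic recovery from a damaged file. The record layout (a two-byte marker, a length, a CRC32, then the payload) is invented for the example and is not the journal's format; only `zlib.crc32` and `struct` from the standard library are used.

```python
import struct
import zlib

# Hypothetical record layout (an assumption, not any real log format):
# 2-byte marker, 4-byte big-endian payload length, 4-byte CRC32, payload.
MAGIC = b"\xabR"
HEADER = struct.Struct(">2sII")


def pack_record(payload: bytes) -> bytes:
    """Frame one payload with marker, length and checksum."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def recover_records(blob: bytes) -> list[bytes]:
    """Scan a possibly damaged blob and return every payload whose CRC matches."""
    found = []
    pos = 0
    while pos + HEADER.size <= len(blob):
        magic, length, crc = HEADER.unpack_from(blob, pos)
        start = pos + HEADER.size
        payload = blob[start:start + length]
        if magic == MAGIC and len(payload) == length and zlib.crc32(payload) == crc:
            found.append(payload)
            pos = start + length  # jump to the next record
        else:
            pos += 1  # damaged or misaligned: resynchronise byte by byte
    return found
```

If one byte in the middle record of three is zeroed, `recover_records` skips that record and still returns the other two, with no human in the loop. As said above, this only helps against accidental damage; an attacker can recompute a CRC.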

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 22:03 UTC (Wed) by cas (guest, #52554) [Link] (1 responses)

what is wrong with binary formats is that I have to use a *different* tool for every different binary format. i'll never use any of them often enough to truly master them and, worse, anything i learn about their usage is trapped within that single usage context.

with plain text formats I can use the *same* collection of tools for everything, and every new thing i learn about sh or sed or grep or awk or perl or whatever is automatically useful in hundreds of other contexts, not just the context in which i originally learnt it.

binary formats suck.

The Journal - a proposed syslog replacement

Posted Nov 27, 2011 2:23 UTC (Sun) by HelloWorld (guest, #56129) [Link]

> with plain text formats I can use the *same* collection of tools for everything, and every new thing i learn about sh or sed or grep or awk or perl or whatever is automatically useful in hundreds of other contexts, not just the context in which i originally learnt it.
I found that in most cases, it is easier to learn a new tool that is specialized for the job at hand than to try to get "standard unix" tools to do what you want, especially if you want to do it in a robust and maintainable way. For example, people keep asking again and again how to handle some XML format with things like sed or awk, which is just a bad idea given the existence of specialized tools like xmlstarlet.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 14:41 UTC (Sun) by jamesh (guest, #1159) [Link] (2 responses)

If you're talking about the traditional method of sending log messages as simple UDP packets, it might stop an attacker from altering historic logs on the system that generated the logs, but it has its own problems.

Log messages can get lost and since there is no authentication, log messages can be forged. And if the attacker manages to break into the log aggregation server, then you've got the same problems as before.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 20:48 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

as soon as you allow machines to send messages over the network it is going to be possible for messages to be forged. the receiving machine has no way of knowing what is happening inside the sending machine and if the data it is getting is correct or not.

This is not entirely true...

Posted Nov 20, 2011 20:52 UTC (Sun) by khim (subscriber, #9252) [Link]

Actually it's exactly the same as with local logging: if a correct authentication scheme is used (i.e. not syslog's UDP) then they can only be forged after takeover. The messages right before takeover are the most valuable. Sure, you must understand that some messages are probably forged and some are not, but this is always the case when forensic analysis is done.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 5:12 UTC (Wed) by chuckles (guest, #41964) [Link]

Please keep your politics out of this.

Not interested in hearing them.

thank you.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:47 UTC (Sat) by robinst (guest, #61173) [Link] (5 responses)

Do you also object to git's repository format then?

If this manages to take off, there will be other tools that can read the data, just as other tools can read/write git's repository format.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 17:05 UTC (Sat) by jmorris42 (guest, #2203) [Link] (1 responses)

> Do you also object to git's repository format then?

Nope. But note that a git repo that isn't cloned anywhere would be vulnerable to an attacker simply rewriting the hashes and thus being able to alter a repo. On a modern CPU hashing is fairly fast so any project that isn't as huge as the kernel could be compromised. That isn't a problem because anything important is replicated. That is the key to security, the hashing just improves it.

For syslog, replication alone is enough to solve the problem. Adding crypto foolishness and a bunch of binary fluff just makes it more complicated and reduces security. Put a log server somewhere on your network with only the syslog port open. If you are really paranoid you could store sha256 sums of each log as you rotate and pack it away on yet another machine or better on paper. Or just log the important entries on a line printer in real time as others have already suggested. Use a line printer without reverse paper feed and it is physically impossible to change the permanent record.
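The "store sha256 sums of each log as you rotate" step could look something like the following Python sketch. The function names are made up for the example; the output line uses the same "digest, two spaces, file name" layout that `sha256sum` prints, so the manifest kept on the other machine (or on paper) can be checked with standard tools.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a rotated log in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def manifest_line(path: Path) -> str:
    """Format one manifest line in the layout that `sha256sum` prints."""
    return f"{sha256_of_file(path)}  {path.name}"
```

The digest is only useful if it is stored somewhere the attacker cannot reach, which is the replication point again.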

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 22:32 UTC (Sat) by robinst (guest, #61173) [Link]

I wasn't talking about security, just replying to this comment:

> So it replaces text files that can be read and processed with the standard UNIX tools with an undocumented binary format that can only be read by a single tool?
>
> Think I'll pass.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 1:56 UTC (Sun) by dlang (guest, #313) [Link] (2 responses)

the git repository format is well documented

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 13:19 UTC (Sun) by robinst (guest, #61173) [Link] (1 responses)

But was it well-documented from the very beginning?

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 20:43 UTC (Sun) by dlang (guest, #313) [Link]

yes, the git on-disk format was well documented from the very first posts.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 17:24 UTC (Mon) by lally (guest, #71211) [Link]

Agreed, but this is fixable. Add two hash fields (in ascii, git-style) to your normal log header. The only issue is how many bits you can assign the hash before it just makes each field too wide to be tolerable :-)

Would two 16-char fields be enough?

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:33 UTC (Fri) by zoonoo (guest, #80519) [Link] (8 responses)

The systemd crew rocks on. Onto the next target.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:43 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (7 responses)

The systemd people are trying to fix something that's not broken. Binary logging is unnecessary and harmful. Windows uses binary logging for everything and is worse off for it. The beauty of plain-text logs is the ability to use normal tools and easily compose pipelines. When you switch to a binary log format, you either lose these benefits, or have to convert to text before manipulating logs anyway, in which case you lose all the purported benefits of binary logging.

If you want cryptographic signatures in your log file, you can include a signed hash field _in the text format_ without having to discard 30 years of refinement.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:48 UTC (Fri) by zoonoo (guest, #80519) [Link] (2 responses)

Hey, I completely agree with you.
My statement was meant to be ironic.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:44 UTC (Fri) by mgedmin (guest, #34497) [Link] (1 responses)

Didn't you know irony and sarcasm do not work on the Internet?

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 4:24 UTC (Sat) by ewan (guest, #5533) [Link]

No way! Sarcasm is totally awesome on the internet.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:53 UTC (Fri) by cmorgan (guest, #71980) [Link] (2 responses)

This is a simplistic way to think about it.

Binary logs have several advantages. Data is split into fields, making it easier to analyze programmatically. Data that isn't easily represented by text or that would span several lines of text could be included. Searching is much easier.

As long as there are easy to use command line tools that will generate text output (maybe call the tool dmesg or whatever) the benefits of using a binary only format would be mostly transparent to end users.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:00 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (1 responses)

> Data is split into fields, making it easier to analyze programmatically.

syslog lines are broken into fields too. How often do you need to break down these logs _further_? The theoretical benefit of programmatic analysis doesn't seem worth the trouble of changing everything and the persistent trouble of a binary log format.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:05 UTC (Fri) by cmorgan (guest, #71980) [Link]

> syslog lines are broken into fields too. How often do you need to break down these logs _further_? The theoretical benefit of programmatic analysis doesn't seem worth the trouble of changing everything and the persistent trouble of a binary log format.

Some people do have a use for breaking the data down further. Being able to organize log entries by nested modules/submodules.

Try to have an open mind. systemd was a great idea for several reasons. So much so that most distributions have adopted or plan to adopt it. If this idea solves issues it too will be adopted. We can't forget that sometimes things evolve. In this case, would you care as much if a tool would provide you with the same looking output that you see today? What difference would there be in that case? Everyone can win (or at least break even).

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 10:26 UTC (Sun) by misc (subscriber, #73730) [Link]

The FAQ clearly says that Journal and regular syslog are not mutually exclusive (3rd question). So I fail to see why people find a dichotomy when there is clearly nothing.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:37 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (4 responses)

Why not just send syslog entries to another machine where they can't be changed? Or use the classic approach of sending syslog entries to a line printer, where an attacker clearly can't get at the old entries.

Let's favor low-tech solutions over high-tech crypto-heavy ones here.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 1:03 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

The problem with a line printer is simple, and it's the same as a simple text logfile.

If your system is spewing log entries, the "signal" - warning signs of a hack - get lost in the noise.

At least with a logfile you can grep for trouble (although really you want to do the opposite, anti-grep for stuff you know about).

But whatever you do it's a tricky problem, although I would tend to agree with another poster - just add a signed hash field to the current text format.

Cheers,
Wol

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:35 UTC (Sat) by PO8 (guest, #41661) [Link]

The Journal seems to require magic storage HW for the current hash. Why not just write the whole log file there? It really isn't hard in 2011 to hook a microcontroller with an SD card slot and a USB port to the host. Add some software and you've got a cheap secure append-only store that can hold 16GB. You could put a reset switch on the package so that if you were to fill it up (hint: you won't) you could clear it and start over.

You get to keep your logs as textfiles, you can search the secure copy, almost no software has to change. Seems like The Journal done right to me.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 21:50 UTC (Wed) by jd (guest, #26381) [Link] (1 responses)

If the syslog files are in a fully logging filesystem, every version is retained, allowing you to recover the missing data without needing any specialist anythings.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 23:34 UTC (Wed) by dlang (guest, #313) [Link]

even with today's disk sizes, nobody runs a fully logging filesystem. The inability to overwrite data will fill any disk up very quickly.

a 'logging filesystem' will give you a few older versions, depending on settings, but hardly every older version.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:48 UTC (Fri) by dlang (guest, #313) [Link] (3 responses)

this will not scale well to large numbers of events. It's far better to send the logs elsewhere.

and in any case, crypto signing the logs does nothing to prevent the attacker from erasing the files. If each log entry is signed individually then you don't even prevent the attacker from erasing individual log entries. you would have to sign each log entry on top of the prior one to detect gaps.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:21 UTC (Fri) by nybble41 (subscriber, #55106) [Link] (2 responses)

> If each log entry is signed individually then you don't even prevent the attacker from erasing individual log entries. you would have to sign each log entry on top of the prior one to detect gaps.

Which is exactly what is being suggested. The top-level hash is based on the previous hash plus the new log entry. Yes, an attacker could still delete the logs, but the idea is to make them tamper-*evident*, not tamper-*proof*.

Still, as you stated, you can tamper-proof the logs by sending them to a dedicated, "bullet-proof" logging server, or some form of write-once local storage. Remote logging, at least, is essentially a solved problem.
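For anyone who wants to see the chaining spelled out, here is a minimal Python sketch. The choice of SHA-256, the all-zero starting value and the function names are assumptions for illustration; the proposal does not specify them here.

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed starting value for an empty log


def chain_hash(previous: bytes, entry: bytes) -> bytes:
    """New top-level hash = hash of (previous top-level hash + new entry)."""
    return hashlib.sha256(previous + entry).digest()


def top_hash(entries: list[bytes]) -> bytes:
    """Fold the whole log into a single hash, one entry at a time."""
    current = GENESIS
    for entry in entries:
        current = chain_hash(current, entry)
    return current
```

If the value of `top_hash` is saved somewhere safe, then deleting or editing any earlier entry changes the recomputed value, so the gap is detected even though nothing stops the deletion itself. That is the tamper-evident part; it does nothing if the attacker can also rewrite the saved value.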

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:51 UTC (Fri) by NAR (subscriber, #1313) [Link] (1 responses)

> but the idea is to make them tamper-*evident*, not tamper-*proof*.

Somehow I doubt this will achieve this. I bet most system administrators will use the local tools that check the signature - and the rootkits will just replace these tools to not warn the administrator...

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:16 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Given the following scenario, how do we ensure detection?

- No network logging
- Verification binary replaced
- rpm changed so that a local rpm -V hides detection
- Files changed such that an external rpm's -V hides detection
- Scrubbing logged events of the above (so that logging in and yum.log are silent (yum detects when the rpmdb changed without yum and yum logs any changes it makes))

With local root, this is all possible.

The first step that I can see is to add expected log messages. Every X minutes a new log message is made with a specific message. The attacker can no longer just nuke the end of the log because then expected messages are missing.

Now the attacker must rewrite the logs. I don't know how to prevent this and it is probably impossible (as root can write whatever they want). It's a higher barrier to go undetected. Given that there are those who will go to varying lengths to attack your systems, how many does the higher barrier deter that weren't before? Obviously, there are those that don't care and will go to *any* lengths, so we can't win them all.
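The expected-message idea can be checked mechanically. Below is a minimal Python sketch that takes the timestamps of the periodic messages and reports where they are missing; the function name, the tolerance parameter and the optional `now` argument are all made up for the example.

```python
from datetime import datetime, timedelta
from typing import Optional


def find_heartbeat_gaps(
    heartbeats: list[datetime],
    interval: timedelta,
    tolerance: timedelta = timedelta(seconds=30),
    now: Optional[datetime] = None,
) -> list[tuple[datetime, datetime]]:
    """Return (earlier, later) pairs between which expected messages are missing."""
    ordered = sorted(heartbeats)
    limit = interval + tolerance
    gaps = [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > limit]
    # If the current time is given, also flag a log whose tail was cut off.
    if now is not None and ordered and now - ordered[-1] > limit:
        gaps.append((ordered[-1], now))
    return gaps
```

An attacker who rewrites the log can of course forge the periodic messages too, so this only raises the bar in the way described above.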

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:48 UTC (Fri) by dlazaro (guest, #38702) [Link] (12 responses)

Please, Lennart, stop messing with the system. You are over-thinking and over-complicating everything. And your unneeded changes are arriving at my systems undocumented. Enough is enough.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:52 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (6 responses)

Undocumented? If you're talking about systemd, it is one of the better documented projects. If your distro isn't announcing the change, that's not something you can blame on Lennart.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:03 UTC (Fri) by jubal (subscriber, #67202) [Link] (5 responses)

No, the binary format that the journald will use won't be documented. Read the google doc that describes the design and FAQ. Also, journald. It simply had to be a generic name, hadn't it?

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:25 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (4 responses)

Ah, I got mixed up by the "are arriving" tense. I don't expect journald is anywhere other than their machines yet.

The undocumented format is annoying.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:38 UTC (Fri) by jake (editor, #205) [Link] (3 responses)

> The undocumented format is annoying.

My sense is that they are trying not to get trapped into backward compatibility games by having other programs that read/write the data without using the supplied library. I don't think they plan to deliberately obfuscate the format (and they will be providing code to both read and write it), but my guess is that they don't want to get stuck into a particular format forevermore because someone wrote a program that grabbed the 12th byte of every record and decided that some data would always appear there.

It's essentially the ABI problem that the kernel runs into, and that sometimes makes it difficult to change things in the kernel (like tracepoints for example).

jake

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:47 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (2 responses)

There still needs to be some backwards compatibility mechanism because the tool will need to read logs in the older formats (at least, I sincerely hope so!). In that case, not documenting the format is silly. If the format isn't versioned, then the format is silly.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:29 UTC (Fri) by elanthis (guest, #6227) [Link] (1 responses)

The document states this is temporary. The feature isn't even in the main systemd code yet, but is on a separate branch. They're avoiding specifying a format until they're sure that it's nailed down, does what they need, doesn't have some unforeseen problem that could only be found through widespread testing, etc.

Once it's a stable feature, expect the format to be documented. Until then, the documentation is available in the .c files for anyone who has some need to avoid the provided library interface.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:57 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

> They're avoiding specifying a format until they're sure that it's nailed down, does what they need, doesn't have some unforseen problem that could only be found through widespread testing, etc.

This is understandable, even appreciated. Having multiple minor versions of varying usefulness indefinitely supported just because they existed for a commit or two during the initial format designs would be crazy to expect.

> Once it's a stable feature, expect the format to be documented.

FTA:
> At this point we have no intention to standardize the format…. …We might document the on-disk format eventually…

This is not as strong a guarantee as I'd like, and certainly not something that would raise a high level of expectancy from me. Before it becomes default in a major distribution, I'd like to see a format specification.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:32 UTC (Fri) by dcg (subscriber, #9198) [Link] (2 responses)

Well, speak for yourself. I, for one, am happy to see innovation happen in the Linux land. Unix and Linux always were about inventing new things, not pretending that all the problems that old designs have can be completely dismissed forever.

People who hate change and want to keep things like they were several decades ago can always use the BSDs...

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 11:28 UTC (Wed) by nye (subscriber, #51576) [Link] (1 responses)

>People who hate change and want to keep things like they were several decades ago can always use the BSDs...

Almost everyone dislikes user-visible change, unless there's a very slow migration path one step at a time (see KDE4, Vista, Gnome 3, etc.). Yes, this community has a larger proportion of heatseekers than the world at large, but even the FOSS world tends not to like huge revolutionary changes in one go.

More transparent changes - by which I mean things that don't change the way the user interacts with the machine on a daily basis - tend to be more welcome as there's less of a downside, so if it improves some functionality in some obvious way then it's an easier sell. For example, systemd has been far more positively received than PulseAudio since it solves real known problems without much of an effect on the end user.

Unfortunately, Lennart has a fairly bad track record. His projects tend to involve a grandiose scheme to replace some way of doing things entirely, which he then gets bored of once they reach the 90% stage.

Presumably some up-and-coming new star will come along again in a few years and decide to rewrite sound systems or init systems or syslog (or display servers, or desktop environments, or...), get them 90% done, and then get bored of them, and the cycle will begin anew.

Unfortunately, the options are either to stick with systems that are permanently 90% done, or be dismissed as a greybearded old has-been who 'hates change'.

The Journal - a proposed syslog replacement

Posted Nov 25, 2011 16:26 UTC (Fri) by HelloWorld (guest, #56129) [Link]

> Unfortunately, Lennart has a fairly bad track record.
Well, Lennart always cared about having a working migration path. PulseAudio supports the old ESD protocol and features an ALSA plugin. Systemd supports traditional init scripts just fine, and the journal can be used via the traditional syslog(3) api.
Also, I'd say that having someone like Lennart who has the guts to try to get rid of some established but broken/limited practise is an asset, not a liability.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:35 UTC (Sat) by aleXXX (subscriber, #2742) [Link] (1 responses)

I fully agree.

It seems now that once per 3 months there is an announcement "Lennart replaces another core UNIX utility", so in the not too distant future we would have Linux boxes where I don't know a thing about how they work, all the stuff we learnt over years thrown away for Linux, while it still works on other systems (...FreeBSD would be the choice then I guess).

Alex

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 11:14 UTC (Wed) by nye (subscriber, #51576) [Link]

>It seems now that once per 3 months there is an announcement "Lennart replaces another core UNIX utility", so in the not too distant future we would have Linux boxes where I don't know a thing

ITYM 'Lennux boxes'.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:54 UTC (Fri) by jubal (subscriber, #67202) [Link] (11 responses)

Ah, so basically:
– the logs won't be readable with regular unix tools
– the on-disk format will be binary and won't be documented
– there won't be any third-party tools to process the data
– the daemon won't be portable

OBVIOUSLY, no sane sysadmin could even imagine a situation where this would cause any problems. Especially when trying to analyse logs from, lemme think, a minimal rescue boot image on a host where the filesystem has been damaged in some way.

Brilliant.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:07 UTC (Fri) by oak (guest, #2786) [Link] (10 responses)

Having done that kind of analysis twice for the same old computer (but for different disk) this fall, I can only concur.

A tool for which you could give a disk device and it would find all the syslog entries on that disk and order them correctly based on timestamps would be nice though. If they can provide that for "journald", then it might have some merit.

PS. You left off your list the fact that most binary formats of this kind take more space than text formats. When some program suddenly starts logging so much data so fast[1] that your disk fills, it's nice if the overhead of the syslog format itself is minimal; it gives you more time to act.

[1] Syslog notices only repeats of previous message, when something logs two different lines constantly (from same or different, but related processes), that feature doesn't help.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:11 UTC (Fri) by raven667 (subscriber, #5198) [Link]

There is one area where this is not always true, storing pcap files of dropped packets will often take up less space than the iptables text logging. The raw pcaps also have more info than what is logged.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 23:14 UTC (Fri) by b0ti (guest, #81465) [Link] (8 responses)

> PS. You left off your list the fact that most binary formats of this kind take more space than text formats.
You might want to read the docs again:
"The fields, an entry consists off, are stored as individual objects in the journal file, which are then referenced by all entries, which need them. This saves substantial disk space since journal entries are usually highly repetitive (think: every local message will include the same _HOSTNAME= and _MACHINE_ID= field). Data fields are compressed in order to save disk space. The net effect is that even though substantially more meta data is logged by the journal than by classic syslog the disk footprint does not immediately reflect that."

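To make the quoted description a bit more concrete, here is a rough Python sketch of the deduplication idea: each distinct field is stored once, and entries only hold references to it. This is not the journal's actual on-disk layout, and the `FieldStore` class and its method names are invented for the illustration.

```python
class FieldStore:
    """Stores each distinct 'KEY=value' field once; entries hold indexes."""

    def __init__(self):
        self.fields = []   # unique field strings, in first-seen order
        self.index = {}    # field string -> position in self.fields
        self.entries = []  # each entry is a tuple of field indexes

    def _intern(self, field):
        # Store the field only the first time it is seen.
        if field not in self.index:
            self.index[field] = len(self.fields)
            self.fields.append(field)
        return self.index[field]

    def append(self, **fields):
        refs = tuple(self._intern(f"{k}={v}") for k, v in fields.items())
        self.entries.append(refs)

    def read(self, n):
        # Resolve the references of entry n back into field strings.
        return [self.fields[i] for i in self.entries[n]]


store = FieldStore()
store.append(_HOSTNAME="box", _MACHINE_ID="abc123", MESSAGE="disk full")
store.append(_HOSTNAME="box", _MACHINE_ID="abc123", MESSAGE="disk ok")
```

Two entries with six fields in total end up as four stored field objects, because the hostname and machine ID are shared.
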
The Journal - a proposed syslog replacement

Posted Nov 18, 2011 23:53 UTC (Fri) by jubal (subscriber, #67202) [Link] (7 responses)

That's grand. Now could you please tell me how would you recover that information in case of disk failure?

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 1:40 UTC (Sat) by HelloWorld (guest, #56129) [Link] (6 responses)

> That's grand. Now could you please tell me how would you recover that information in case of disk failure?
Using the "cp" command and the backup, obviously.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:02 UTC (Sat) by jubal (subscriber, #67202) [Link] (5 responses)

I think that my sarcasm detector failed here. Surely you're not seriously telling me that – in the hypothetical case of a disk failure – the way to recover currently written log entries is to combine recovered, compressed and possibly damaged binary blobs with random compressed binary blobs? See, no matter if it was done intentionally or not, the redundancy of the syslog entries is a feature, not a bug.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:11 UTC (Sat) by HelloWorld (guest, #56129) [Link] (2 responses)

You can't trust data from a potentially damaged disk anyway. The only sane thing to do is make sure that your backup infrastructure is in place, so you don't need the data from the damaged disk. Or, if you can't afford to lose the data between two backups, use a RAID array.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:24 UTC (Sat) by jubal (subscriber, #67202) [Link]

You do realise that you might be looking for the data written directly before the crash? (Also: no, RAID is not a silver bullet, and please don't tell me that a SAN or NAS storage appliance will always prevent data loss – it should, but sometimes it can't). And a remote log daemon won't help much if it's the remote log daemon's storage that just evaporated.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 3:16 UTC (Sat) by cwillu (guest, #67268) [Link]

"Trusted" is not a binary distinction.

Data off a broken disk isn't as reliable as data from a functional disk. However, data off a broken disk isn't completely unreliable, and data from a functional disk isn't completely reliable.

What matters is whether the data is useful, and in my experience, that data sitting on a broken disk is frequently useful.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 14:22 UTC (Mon) by nix (subscriber, #2304) [Link] (1 responses)

Quite. I have repeatedly in the past been able to get a lot of useful info from logs damaged by disk or fs damage. Sure, a few blocks were full of \0s or just plain garbage --- but the rest was readable. Who cares if the file as a whole no longer conforms to any formal grammar? Human beings don't need one!

But computers do. A binary->text tool would probably have given up in the face of such damage. At best it would go down a rarely-used hence buggy parser-error-recovery code path.

This depends on the design, actually

Posted Nov 21, 2011 18:27 UTC (Mon) by khim (subscriber, #9252) [Link]

> But computers do. A binary->text tool would probably have given up in the face of such damage. At best it would go down a rarely-used hence buggy parser-error-recovery code path.

Computers do what they are programmed to do. If you write a program which tolerates corrupt files then it'll do just that. It's not as hard to do as you think: you don't need a cryptohash for that, CRC32 is enough and it's extremely fast nowadays. But sure, you must plan for that in advance.

This is what Google does with its petabytes of logs - works fine from what I've heard.
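
A minimal sketch of that kind of damage-tolerant reader, in Python. The record layout (a marker, a length, a CRC32, then the payload) and the names `MAGIC`, `pack_record` and `read_records` are assumptions made up for this example; nothing here describes the journal's format or any particular production system.

```python
import struct
import zlib

MAGIC = b"\xfeLOG"               # arbitrary record marker chosen for this sketch
HEADER = struct.Struct(">4sII")  # marker, payload length, CRC32 of payload


def pack_record(payload: bytes) -> bytes:
    """Frame one payload with a header carrying its length and CRC32."""
    return HEADER.pack(MAGIC, len(payload), zlib.crc32(payload)) + payload


def read_records(blob: bytes):
    """Yield every payload whose CRC32 checks out, skipping damaged bytes."""
    pos = 0
    while pos + HEADER.size <= len(blob):
        marker, length, crc = HEADER.unpack_from(blob, pos)
        end = pos + HEADER.size + length
        if marker == MAGIC and end <= len(blob):
            payload = blob[pos + HEADER.size:end]
            if zlib.crc32(payload) == crc:
                yield payload
                pos = end
                continue
        pos += 1  # damaged or misaligned: try to resynchronise one byte later
```

A reader built this way loses the damaged record but picks up again at the next intact one, which is roughly what a human does when skimming past a block of garbage in a text log.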

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:56 UTC (Fri) by quotemstr (subscriber, #45331) [Link] (2 responses)

What the authors of this proposal miss is that *simplicity is a feature*, and it's more important than the minor features listed in the FAQ, the lack of which clearly can't be that important because people get along with syslog just fine today.

It's easy to add complexity to software. What's hard, and what really requires discipline and experience, is creating a simple, yet sufficiently powerful system. Simplicity is its own reward because it makes modification, composition, and troubleshooting easier. When you add complexity, you need to justify it by citing the benefits it will bring, and the benefits of this journal thing clearly aren't that important.

We already have a simple solution. Let's not ruin it.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:07 UTC (Fri) by k8to (guest, #15413) [Link]

Well they've missed that fact several times already.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:39 UTC (Sat) by aleXXX (subscriber, #2742) [Link]

Yes, KISS is a good thing.
Maybe at least Slackware stays with it.

Alex

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 17:56 UTC (Fri) by jubal (subscriber, #67202) [Link] (1 responses)

(let me also congratulate you on reinventing the Windows event log subsystem)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:06 UTC (Sat) by dlazaro (guest, #38702) [Link]

Or the Apple System Log facility (ASL for friends). They seem to be copying Apple in the most obtuse way possible. First it was systemd, a GPL licensed interpretation of launchd. Now this. Lennart and Kay even dare to mess with the network protocol in a pie in the sky way.

Maybe you think that you are great system level designers but I work my ass off as a sysadmin every day. No single network connected switch or server understands your network protocol based on sharing journal files across the network. If you dare to mess with syslog, at least write down a formal RFC like everybody else. Even if you do this it is not guaranteed that your protocol will reach Internet Standard status down the pipe. In the best scenario, it will be years before it is supported by other network gear.

At least Apple's ASL is only supposed to be used with Mac-only software and it is well integrated with the standard syslog network protocol. Even Apple AirPort network gear logs via standard syslog events.

So much hubris makes me shiver.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:14 UTC (Fri) by jimreynold2nd (guest, #75341) [Link] (15 responses)

First off, I like the idea of a hash chain, so that all log entries can be checked for authenticity. The issue with the first hash can be solved easily by, for example, printing it out and taping it to the machine (maybe with a seal and some holographic prints, so that even a physical attacker cannot change it).

But then I don't get why the logs need to be in an undocumented binary format, inaccessible to anything else. We all know that security through obscurity is a bad thing, and a text format can accommodate anything. I don't buy the "easier to analyze programmatically" argument: we can all use XML or JSON or some other organized text format if that is one of the needs. It also facilitates ease of recovery in case of, say, a disk failure.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:38 UTC (Fri) by gmaxwell (guest, #30048) [Link] (10 responses)

You've got it backwards.

It's the most recent entry that confirms the ones before it. Taping something to the monitor won't help because I can just author a plausible history of what comes next.

Instead this makes every new entry confirm the history (like how bitcoin works), but in the case of journald there is nothing preventing you from rewriting the history after the most recent snapshot of it, and nothing to prove a particular snapshot is the good one except sending it off to a secure location or using an external secure timestamping service.

And of course the attacker can still delete the logs unless you send all of them off to a secure location. ... and if you're doing that you really don't need any of this.

I fully agree with your point on undocumented binary formats. That's about as anti-forensic as you can get. Though it's not all bad: for example, varnish uses binary logs but provides a cat tool that converts them into a normal text stream, so your ability to grep them is not diminished. If handled well they could make the binary part only problematic for archival but not operations.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:40 UTC (Fri) by jubal (subscriber, #67202) [Link]

…then there's the question of recovering data from partially damaged files. This will *obviously* break the signature chain.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 7:05 UTC (Sat) by alankila (guest, #47141) [Link] (4 responses)

On the contrary, I think there's a significant degree of thought spent on especially forensic issues. Reading the blog post indicates that today, any tool can fake any PID for syslog, apparently, because syslog spends no effort validating the client-given PID value. There's apparently a Linux-specific way to find out the true PID of the process connecting to the syslog facility, and systemd is using it.

Undocumented binary data doesn't mean it's somehow fundamentally unreadable. You just compile the library and use it to read the crap. And it's open source. Sheesh.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 7:27 UTC (Sat) by gmaxwell (guest, #30048) [Link] (3 responses)

It makes me sad that you appear to have not completely read my message.

I explicitly point out that you can use tools to read the logs, and that this works pretty well e.g. for varnish.

But your life will be very painful if you are trying to piece together data from hundreds of machines, and backups across long spans of time, with different and incompatible versions of the file format.

If the developers are not very careful about versioning you may find yourself unable to read data from backups, or worse getting silently corrupted or truncated results. This is a risk which is heightened by using binary logs. It's orthogonal to the PID smarts— which seems like a great idea even without the replace everything proposed.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 9:38 UTC (Sat) by alankila (guest, #47141) [Link] (2 responses)

Well, it is a well-understood worry at least.

Log files have a long life, potentially on the order of decades, so that sets the level of backwards compatibility required. It is huge, and indicates that whatever the merits of not documenting the format, it will become set in stone anyway unless log conversion tools are provided which can perform the conversion and afterwards validate that every bit of the information in the old version was preserved and correctly converted (which might be the same as checking the hash value of the log entries).

Nevertheless, even if archived logs become unreadable, old versions of this software do not just vanish into the ether but remain runnable, at the limit through emulation of x86 instruction set and old linux kernel versions. So some solution will always exist.

Regardless, I'd say that the reasonable requirement is that every generated journald log file must remain readable forever, or a chain of provably non-lossy converters must be provided that can upgrade from the earliest version.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 16:59 UTC (Sat) by backslash (guest, #32022) [Link] (1 responses)

> Nevertheless, even if archived logs become unreadable, old versions of this software do not just vanish into the ether but remain runnable, at the limit through emulation of x86 instruction set and old linux kernel versions. So some solution will always exist.

This is all open source and not binary only apple or windows.... Just recompile!!

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 18:28 UTC (Sat) by alankila (guest, #47141) [Link]

Obviously you have not tried to recompile old software. There tends to be a significant porting effort because changes in build system (autotools, I hate you) and compiler code purity requirements may cause code to not compile anymore, or might segfault despite compiling. Additionally, any dependencies to libraries make things that much worse, because not only must that software compile but the old versions of the libraries must compile also.

Emulation at binary level through technique such as virtualization may therefore be far easier to achieve.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 14:28 UTC (Mon) by nix (subscriber, #2304) [Link] (3 responses)

> It's the most recent entry that confirms the ones before it.
That's pretty much useless. Given that POSIX doesn't provide an API for inserting text in the middle of files, someone buggering the logs has to read() and re-write() all the data from the buggered point onwards (and is more likely to just copy-and-rewrite the whole file, for simplicity: it's not like the log buggerer is likely to care much about performance). At best you'll get a read() of the end of the log followed by a truncate() and re-write().

But if you do that, you're rewriting the end of the log anyway, so you can update all the hashes at the same time. The only way this will ever be secure is if the hashes are stored separately from the logs, streamed immediately over the network and stored on a non-connected box running a daemon which can answer the question 'what is the hash of message N' and 'what is the hash of the message immediately preceding message N'.

But there is no sign of such a scheme in journald: its design appears to militate against it much more than a straight-text logfile does, since you can rely on offsets in the latter remaining unchanged (so that an external file can point into them).
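
The argument above can be sketched in a few lines of Python. The hash function (SHA-256), the seed value and the sample log lines are all assumptions picked for the illustration, and `chain_hashes` and `verify` are hypothetical helper names, not anything from journald.

```python
import hashlib

SEED = "0" * 64  # stand-in "previous hash" for the first entry


def chain_hashes(payloads):
    """Return the running hash after each payload; the last one is the head."""
    hashes, prev = [], SEED
    for payload in payloads:
        prev = hashlib.sha256((prev + "\n" + payload).encode("utf-8")).hexdigest()
        hashes.append(prev)
    return hashes


def verify(entries):
    """Check that every stored hash follows from its payload and predecessor."""
    return [h for _, h in entries] == chain_hashes([p for p, _ in entries])


original = ["sshd: login from 10.0.0.5", "su: root shell opened", "cron: job ran"]
saved_head = chain_hashes(original)[-1]  # pretend this was copied off the box

# The attacker drops the incriminating entry and recomputes every later hash.
forged = [line for line in original if not line.startswith("su:")]
forged_entries = list(zip(forged, chain_hashes(forged)))

looks_intact = verify(forged_entries)                     # the file checks out
matches_saved_head = forged_entries[-1][1] == saved_head  # the off-box copy disagrees
```

The forged file is internally consistent, so only the comparison against a hash held somewhere the attacker cannot reach reveals the edit.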

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 15:47 UTC (Mon) by johill (subscriber, #25196) [Link] (2 responses)

This is an important observation -- I thought about this too (but never posted), especially wrt. the comparison they make to git. The thing is this though: in git, the HEAD is essentially recorded at many places around the world -- rewriting the tree will be detected by everybody. In a journal, such a forward-running checksum scheme is completely useless as you point out since nobody has a copy of the HEAD sha1sum.

Looks like either we're not being told the full picture or somebody got confused about why exactly this is useful in git.

(To make it secure though you don't need to store *all* hashes elsewhere, you just need to send off the most current HEAD hash to secure storage, still the same problem though.)

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 15:50 UTC (Mon) by johill (subscriber, #25196) [Link] (1 responses)

I note that they do say this though in their document, albeit a bit veiled (and the comparison to git was only made at KS I guess): "If the top-most hash is regularly saved to a secure write-only location, the full chain is authenticated by it."

It doesn't seem likely that anyone will ever have as easy ways to do that as with git.

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 4:26 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Actually, you're on the right track!

Make a central PUBLIC server that simply accepts and stores triples of form: <host_id, timestamp, hash> (host_id is UUID).

That's it. You can use this public server to periodically send your hashes. You lose (almost) no privacy, since log messages themselves need not to be replicated.
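
As a toy model of such a server, assuming nothing beyond the triple described above: the `HashNotary` class and its methods are hypothetical names, storage is in memory only, and a real service would need durable, append-only storage plus some way to authenticate the submitting host.

```python
import time
import uuid
from collections import defaultdict


class HashNotary:
    """Append-only store of (host_id, timestamp, hash) triples."""

    def __init__(self):
        self._by_host = defaultdict(list)

    def submit(self, host_id: uuid.UUID, log_hash: str, timestamp=None):
        # Records are only ever appended, never replaced or deleted.
        ts = time.time() if timestamp is None else timestamp
        self._by_host[host_id].append((ts, log_hash))

    def history(self, host_id: uuid.UUID):
        return list(self._by_host.get(host_id, []))

    def was_recorded(self, host_id: uuid.UUID, log_hash: str) -> bool:
        return any(h == log_hash for _, h in self._by_host.get(host_id, []))


notary = HashNotary()
host = uuid.uuid4()
notary.submit(host, "ab12cd", timestamp=1321935960.0)
```

An auditor later recomputes the chain from the log files and asks the notary whether the hashes it saw at those times match.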

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:37 UTC (Fri) by b0ti (guest, #81465) [Link] (3 responses)

> But then I don't get why the logs need to be in an undocumented binary format, inaccessible to anything else.
Because with this you can achieve a lot higher compression ratio if you store metadata separately. Even higher than using bzip2 on the xml or json. Not to mention lookup and search performance on very large datasets.

It would be nice to have different back-ends for this though. Since you only get an API to access it anyway, the storage format could be made transparent.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:55 UTC (Fri) by rfunk (subscriber, #4054) [Link] (1 responses)

Of all the limited resources involved in computing, disk space is by far the cheapest these days.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 23:06 UTC (Fri) by b0ti (guest, #81465) [Link]

With compression you not only benefit in the disk space required but also it will speed up I/O by letting the CPU work a little.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:42 UTC (Sat) by aleXXX (subscriber, #2742) [Link]

Are you sure it is worth it to spend time on working to reduce the size of log files on disk ???
I don't think so.

Keep it simple.
Text files rule. Or XML or JSON. But keep it text.

Alex

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:29 UTC (Fri) by mathstuf (subscriber, #69389) [Link] (3 responses)

I wonder why JSON couldn't be used instead of a binary format. It's structured, easily understandable, and can still be processed with tools (sure, maybe simple grep, sed, and awk are out, but I can't imagine there aren't already perl tools for doing XPath/XQuery-like on JSON files yet).

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 18:58 UTC (Fri) by juliank (guest, #45896) [Link]

Well, Lennart probably thinks it's too slow to parse...

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:26 UTC (Fri) by elanthis (guest, #6227) [Link] (1 responses)

Binary data, which is one of the needs for logging certain system events.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:46 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

It can't be escaped in some way? Sure, it bloats the logs, but is it so common as to make *everything* binary? I like keeping exceptional cases exceptional, not reworking things to accommodate all possibilities as if they had equal chances of occurring.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 19:56 UTC (Fri) by slashdot (guest, #22014) [Link] (4 responses)

Binary formats are a non-issue, since you can just use a tool that writes it out in text format, and pipe that to whatever.

However, overall, their hash chain idea is cute, but just sending everything to another machine seems far more effective, since instead of just detecting that someone erased the logs, you can prevent them from doing so (as long as they don't compromise the other machine too, that is).

Most of the points they raise are very valid.

Instead of using UUIDs, however, I would suggest using (a hash of?) the format string; if it needs to be changed, a special API can be used which takes both the original format string for ID purposes and the new one for formatting.
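
A small Python sketch of that suggestion, purely as an illustration: `message_id`, `log` and the `id_fmt` parameter are hypothetical names, and the choice of a truncated SHA-1 is an assumption, not part of the journal proposal.

```python
import hashlib


def message_id(format_string: str) -> str:
    """Derive a stable message ID from the format string itself."""
    return hashlib.sha1(format_string.encode("utf-8")).hexdigest()[:16]


def log(fmt, *args, id_fmt=None):
    """Return (message ID, formatted text).

    `id_fmt` lets a reworded message keep the ID of its original wording.
    """
    return message_id(id_fmt or fmt), fmt % args


old = log("disk %s full", "sda")
new = log("disk %s is full", "sdb", id_fmt="disk %s full")
```

Both calls yield the same ID even though the wording changed, so tools matching on the ID keep working.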

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 20:58 UTC (Fri) by zander76 (guest, #6889) [Link]

Hey
Wouldn't it be possible to sha1 the text file plus some extra information like the time and previous key? That should make it fairly easy to check whether it's been modified without using some binary file, or am I missing something?

Ben

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:35 UTC (Fri) by elanthis (guest, #6227) [Link] (2 responses)

The hash chain works really well, actually, even for local storage.

Attackers aren't going to just wipe out logs because that can make it painfully obvious that there's been a breakin. Removing or changing individual log entries is much more subtle, but with the hash chain this becomes very difficult; the attacker would have to rewrite the entire chain from the first message he wanted to modify.

This can clearly be combined with remote logging facilities. Better yet, it removes the need for infrastructure explicitly targeted at remote logging and instead just lets you use an existing backup service, and some tools to see if the "branches" of the logs match up between previous backups or not.

This is _exactly_ what git does.

In the end, while never losing data is important, knowing that you have modified data is significantly more important. People who can do so should have remote logging and very strict security rules, but the rest of us who have a single Linux box and an SFTP backup service (if we're lucky) shouldn't be screwed out of _all_ security features.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:18 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Additionally, on btrfs (or other COW systems) such manipulation can be nigh impossible to hide as it would require to rewrite quite a lot of stuff.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 14:31 UTC (Mon) by nix (subscriber, #2304) [Link]

> the attacker would have to rewrite the entire chain from the first message he wanted to modify.
Well, yes. The attacker has to read() and write() the whole thing from that point on anyway, since POSIX provides no write_and_shift_up() nor delete_and_shift_down() functions. Do you really think that rehashing will be hard on top of that? You could only detect that if you mirrored the logs onto some other system... in which case you might as well only mirror the hashes. Actually you might as well simply store the hashes on their own on a different machine, and use a conventional syslog, without the myriad instantly obvious downsides of this harebrained scheme.

The Journal - a proposed syslog replacement (is it really secure?)

Posted Nov 18, 2011 21:11 UTC (Fri) by nmav (guest, #34036) [Link] (12 responses)

I see the discussion that the method might be cumbersome to use, but I don't see any discussion on why this method is secure. My reading of it is that it doesn't add any security at all. If the only requirement is to keep the top hash in a read only location, then I just keep the top entry and recalculate the hashes of the next messages excluding the ones I want to delete.

Note that cryptographic hashes are _not_ digital signatures.

making the logs tamper-evident through git-like hash chains

Posted Nov 18, 2011 21:37 UTC (Fri) by scottt (subscriber, #5028) [Link] (11 responses)

> If the only requirement is to keep the top hash in a read only location, then I just keep the top entry and recalculate the hashes of the next messages excluding the ones I want to delete.

Each hash is calculated from the "payload" of the log entry and the hash of the previous entry. If you try to change a hash in the chain the hash of its descendant would no longer match.

This is how the Monotone, Mercurial and Git revision control systems work.
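
A minimal Python sketch of such a chain, under stated assumptions: SHA-256 as the hash, an all-zero string as the "previous hash" of the first entry, and the helper names `entry_hash`, `build_chain` and `first_bad_entry`, none of which come from the journal document.

```python
import hashlib

GENESIS = "0" * 64  # stand-in "previous hash" for the first entry


def entry_hash(prev_hash: str, payload: str) -> str:
    """Hash the payload together with the hash of the previous entry."""
    return hashlib.sha256((prev_hash + "\n" + payload).encode("utf-8")).hexdigest()


def build_chain(payloads):
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = entry_hash(prev, payload)
        chain.append((payload, prev))
    return chain


def first_bad_entry(chain):
    """Return the index of the first entry whose hash does not match, or None."""
    prev = GENESIS
    for i, (payload, stored) in enumerate(chain):
        if entry_hash(prev, payload) != stored:
            return i
        prev = stored
    return None
```

Editing or removing an entry without touching the stored hashes is caught at that position. Whether an attacker who also recomputes the hashes can be caught is a separate question, and depends on having a trusted copy of the latest hash.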

making the logs tamper-evident through git-like hash chains

Posted Nov 18, 2011 22:37 UTC (Fri) by nmav (guest, #34036) [Link] (5 responses)

Monotone, Mercurial and Git are not designed to protect against malicious attacks and as I said their method is not secure (that's why e.g. git allows digital signatures on tags).

making the logs tamper-evident through git-like hash chains

Posted Nov 19, 2011 0:27 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

Wrong. Git has been specifically designed to be secure from the start. That was one of its original design goals.

And it IS secure, signatures are used not to authenticate integrity, but to authenticate the author of changes.

making the logs tamper-evident through git-like hash chains

Posted Nov 19, 2011 4:40 UTC (Sat) by nevyn (guest, #33129) [Link] (3 responses)

From: blog.valerieaurora.org talking about CAS and compare by hash...
> Finally, in a vain attempt to forestall the inevitable flame wars, I will point out that my objections do not apply to systems in which the hash address space is shared only with trusted users. In other words, hash-based source control is for the most part fine sticking with SHA-1 and could indeed use a cheaper hash like MD5 without any practical trouble
From: kernel trap git archive on the first discussion about git only using sha1, Linus explains:
> As I explained early on [...], the _security_ of git actually depends on not cryptographic hashes, but simply on everybody being able to secure their own _private_ repository.
Then there was another discussion, where other people said the same things. Git's use of hashes as a CAS doesn't make it secure, doing the same thing for log file lines will not make them secure/trustable/whatever either.

making the logs tamper-evident through git-like hash chains

Posted Nov 20, 2011 3:12 UTC (Sun) by cmccabe (guest, #60281) [Link] (2 responses)

SHA1 has been weakened, but many other hash functions have not. Given that security is the whole point, I'm sure that Lennart will use a newer hash.

making the logs tamper-evident through git-like hash chains

Posted Nov 20, 2011 19:19 UTC (Sun) by nevyn (guest, #33129) [Link] (1 responses)

I think you missed the point ... git and journald can happily use SHA-1 because it adds no security at all. git gets a bunch of other useful features out of using hashes, AFAICS it's just a waste for journald.

making the logs tamper-evident through git-like hash chains

Posted Nov 21, 2011 23:52 UTC (Mon) by cmccabe (guest, #60281) [Link]

> I think you missed the point ... git and journald can happily use SHA-1
> because it adds no security at all

Er, I think perhaps it is you who is missing the point. TFA says:

> Each entry authenticates all previous ones. If the top-most hash is
> regularly saved to a secure write-only location, the full chain is
> authenticated by it. Manipulations by the attacker can hence easily be
> detected.

The point is to get security, not to "happily use SHA-1."

making the logs tamper-evident through git-like hash chains

Posted Nov 19, 2011 0:32 UTC (Sat) by dlang (guest, #313) [Link] (4 responses)

this means that you have to check the hash of every single line in the file to find a problem.

that takes a significant amount of time with a large logfile.

If you don't check every single hash, then the attacker deletes one entry and then two entries later the hash will compute

there's also nothing preventing the attacker from re-writing the entire file to have consistent hashes, but with missing entries (git allows this as well, I believe it's the filter-branch option)

If you have the ability to send stuff elsewhere to a secure location then you don't need this. If you don't have this ability, then this new stuff doesn't do you any good.

tripwire, ossec and equivalent already have the ability to learn that a file is a logfile and complain if an existing part of the file is modified between scans. There is a window of vulnerability in that they don't check after each line is written, but if you run them frequently you get something that's at least 90% as good, without having to throw out all the existing logging related tools in the process.
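
The general idea of checking that a log has only grown between scans can be sketched as follows. This is not how tripwire or OSSEC actually implement it; `snapshot` and `only_appended` are hypothetical helpers, and the snapshot itself would have to be kept somewhere the attacker cannot rewrite.

```python
import hashlib


def snapshot(path):
    """Record the current length of the file and a digest of its contents."""
    with open(path, "rb") as f:
        data = f.read()
    return len(data), hashlib.sha256(data).hexdigest()


def only_appended(path, snap):
    """True if the first `length` bytes of the file still match the snapshot."""
    length, digest = snap
    with open(path, "rb") as f:
        prefix = f.read(length)
    return len(prefix) == length and hashlib.sha256(prefix).hexdigest() == digest
```

Run `snapshot` at each scan and `only_appended` at the next one; anything written and then altered between two scans goes unnoticed, which is the window of vulnerability mentioned above.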

making the logs tamper-evident through git-like hash chains

Posted Nov 19, 2011 5:28 UTC (Sat) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

> there's also nothing preventing the attacker from re-writing the entire file to have consistent hashes, but with missing entries (git allows this as well, I believe it's the filter-branch option)
That's true. The biggest thing it does is to increase the sophistication an attacker requires to cover his tracks thoroughly. Instead of editing a log file with his favorite text editor, an attacker will need a track erasing program that rewrites all the log files with suspicious entries removed and hashes recomputed. Of course once a program like that gets out, all the script kiddies will be able to use it and we'll be hardly any better off than today, security-wise at least.

making the logs tamper-evident through git-like hash chains

Posted Nov 20, 2011 8:16 UTC (Sun) by drag (guest, #31333) [Link] (2 responses)

Well that's the obvious first thing one should think of. I thought of it too.

Then I read in the article about a mythical 'Write-once storage'. If you can write out the hash to a write-only interface then that would solve that problem.

Unfortunately I don't know of a good write-once media. Maybe cdroms, but I don't know about that.

Maybe special flash media with the 'erase block' part of the hardware disabled and a logging FS. I don't know.

It is a solvable problem, but not one that is as easy as it looks at first glance.

making the logs tamper-evident through git-like hash chains

Posted Nov 27, 2011 2:18 UTC (Sun) by rgmoore (✭ supporter ✭, #75) [Link] (1 responses)

But the big security benefit in that case is from the existence of the WORM memory, since any data written to it is inherently tamper-proof. You could stick to an un-hashed text log and still have confidence that it hadn't been rewritten by an intruder. The benefit of the hash chain is that you can provide tamper-evident recording by keeping only a fraction of the hashes, which is most important if the WORM storage is expensive or difficult to deal with. Of course keeping only a fraction of the hashes leaves open a potential window if the attacker can break in and alter the records between writes to WORM.

making the logs tamper-evident through git-like hash chains

Posted Nov 27, 2011 5:47 UTC (Sun) by dlang (guest, #313) [Link]

however, since systems don't actually include WORM memory, and are very unlikely to (except for very specialized systems), how does that actually help?

remember that WORM memory needs to be a replaceable thing since by default you can't erase it to make room for new data.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:29 UTC (Fri) by b0ti (guest, #81465) [Link]

I, for one, welcome this. This Journal is finally something that will address all the problems that have long been overdue for a solution.

For example take a look at what's under /var/log/. It has become the trashcan of applications where they can dump their crap. Each and every file there has its own format, they couldn't even use a single time/date format. Anyone who has worked with logs knows that syslog has too many shortcomings, even the new ietf standard does. Converting logs to syslog is a simple but inefficient solution, still how many people are out there converting eventlog and other formats to syslog? Trying to extend syslog and syslogd implementations is not easy when you need to squeeze a much broader concept into an existing format and functionality.
Rather than going this way, I said no thanks. This is why I designed nxlog to support different log formats, transports, etc. I'll likely create nxlog modules to support reading and writing the Journal.

Here are some things which I missed or didn't see in the documentation:
  • CEE has put a lot of effort into log standardization. Can't some of this be used in the Journal implementation instead of reinventing the wheel?

  • What's the encoding used for messages stored in the Journal? UTF-8?

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:29 UTC (Fri) by SEJeff (guest, #51588) [Link]

Perhaps syslog should just be replaced with git. systemd could git commit /var/log/messages after every log line. What could go wrong?

That sounds about as good of an idea as the journal does from a high level in every environment _sans_ high integrity / audit environments such as those that need heavy Information Assurance and SELinux lockdown.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 21:33 UTC (Fri) by bferrell (subscriber, #624) [Link] (10 responses)

<sarcastic mode = ON>
Would someone PLEASE shoot Lennart?
<sarcastic mode = OFF>

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:54 UTC (Fri) by seneca6 (guest, #63916) [Link]

Sarcastic or not, such a comment is just moronic.

Please

Posted Nov 18, 2011 22:57 UTC (Fri) by corbet (editor, #1) [Link]

"Sarcastic mode" or not, that's not the sort of comment I'd like to see on LWN. You may well disagree with Lennart's approach to things - just get into that long queue over there - but he is putting a lot of effort into trying to solve some real problems. Advocating violence against him, even in "sarcastic mode", is not an appropriate response. Could we not do that again, please?

Thanks.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 1:36 UTC (Sat) by bferrell (subscriber, #624) [Link] (7 responses)

OK,

I'm chastised. I'll go sit in the corner with the other rowdy kids and let the grown-ups talk among themselves uninterrupted... Except I'm probably older than most of you, but that's beside the point.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 17:54 UTC (Sun) by ovitters (guest, #27950) [Link] (6 responses)

You made a threat against someone's life. I fail to see how you think that is acceptable. I think a police report should be filed. FWIW, in the Netherlands the police follow sites such as Twitter and they'll take action (meaning: get your details from the host, ISP, then knock on your door). It does not matter if it was meant as a joke or not, btw.

For just one of the various actions taken by the police, see e.g. http://www.112twente.com/832/Man-aangehouden-voor-verstur... (the person was arrested on the same day the threat was made).

I don't get why your post is still visible. I would've reported you to the police and/or banned you. And yeah, free speech, but the police also have certain freedoms, such as putting you in jail.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 18:51 UTC (Sun) by Tet (subscriber, #5433) [Link] (3 responses)

You made a threat against someone's life

No. He really didn't. Whatever side of the fence you happen to sit on in this particular debate, no one in their right mind can seriously claim that what he wrote constituted a threat against Lennart's life.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 19:05 UTC (Sun) by rahulsundaram (subscriber, #21946) [Link] (2 responses)

The problem is that it often provokes *other people* into acting in a bad way when you even joke about such matters. It is, at best, in very poor taste and, at worst, a disaster, and yes, it should be considered a threat and taken seriously. There hasn't even been a proper apology yet.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 23:08 UTC (Sun) by Julie (guest, #66693) [Link]


--it often provokes *other people* into acting in a bad way when you even joke about such matters

OK, so bferrell's comment made me angry because it was extremely childish and contemptuous towards Lennart, but yours just makes me mad.

Are you seriously suggesting that, if you ask me to kill Lennart and I go and do it I can then absolve myself of all responsibility by pointing a finger at you and saying 'he made me do it'??

Don't be ridiculous. It's this misanthropic thinking - that people are mindless vessels floating around just waiting to be filled with evil ideas from outside so we can go out and commit crimes - that calls down all sorts of censorious restrictions on people's heads in the name of 'protecting us all from ourselves'.

Let me make it perfectly clear.

If you see a death threat on someone's post and go and kill someone (whether you think they are serious or not) it is your fault.
If you see a scantily-clad woman walking past and you go and rape her it is your fault.
If you overhear a pensioner talking to a friend about how she has a lot of money in her purse and you go and rob her it is your fault.
If you have a colleague at work that makes racist comments and you go out and beat up an ethnic minority person it is your fault.

Adults are not children. In a democracy the full force of the responsibility for their own actions falls on their own shoulders. The only defense a grown adult can have to this is if they are mentally ill or insane.

(And yes, I know about 'incitement to commit...' laws, we have them in the UK. These also make me angry - they are anti-democratic because the ideas behind them strike right at the heart of the idea of personal accountability and responsibility. Of course the 'inciter's' behaviour is nasty and vicious - but this is an orthogonal issue to the behaviour of someone they 'influence', who is still totally accountable for their own conduct.)

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 7:00 UTC (Mon) by k8to (guest, #15413) [Link]

It's not a threat, it's in bad taste.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 19:22 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

> but the police also have certain freedoms, such as putting you in jail

I'd consider that a power, not a freedom.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 11:59 UTC (Wed) by nye (subscriber, #51576) [Link]

>You made a threat against someone's life. I fail to see how you think that is acceptable. I think a police report should be filed.

There are actually laws against willfully wasting police time.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:51 UTC (Fri) by nwmcsween (guest, #62367) [Link]

Why not just use a log filesystem and only allow history changes from someone with the right keys? Maybe I'm simplifying things too much.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 22:55 UTC (Fri) by magfr (subscriber, #16052) [Link]

One thing I ask myself when I see Lennart's work is whether he has ever used AIX. I see clear parallels between systemd and the system resource controller and also between the journal and the error logging service.

For the record I would also say that I like the basic idea of systemd, I just think it is sad that all of this ends up so tightly intertwined.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 23:03 UTC (Fri) by endecotp (guest, #36428) [Link] (8 responses)

> posted a detailed document

This seems to need a Google account to read :-(

> the top-most hash is regularly saved to a secure write-only location

Does anyone know what that "secure write-only location" is, in practice? Does my existing computer have one, or is this a new piece of hardware?

I'm also not certain how it avoids this:

blah blah [hash=1]
foo foo [hash=2]
root login from evil.com [hash=3]

Now the "secure write only location" stores 3. The attacker truncates the last entry from the log, and writes 2 to the "secure write-only location".

What have I missed?
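To make the scenario above concrete, here is a minimal Python sketch. It assumes each entry's hash is SHA-256 over the previous hash plus the line, which is only a guess at what the proposal intends, and the names `chain` and `verify` are made up for illustration. It shows that a single stored head hash detects truncation only as long as the attacker cannot overwrite that stored value.

```python
import hashlib

def chain(entries):
    """Return the chained hash for every line: H(previous hash + line)."""
    hashes = []
    prev = b""
    for line in entries:
        digest = hashlib.sha256(prev + line.encode("utf-8")).digest()
        hashes.append(digest)
        prev = digest
    return hashes

def verify(entries, stored_head):
    """Check a log against the single head hash kept in the 'secure' location."""
    hashes = chain(entries)
    return bool(hashes) and hashes[-1] == stored_head

log = ["blah blah", "foo foo", "root login from evil.com"]
stored_head = chain(log)[-1]

# The attacker drops the last entry and recomputes the head for what is left.
tampered = log[:-1]
forged_head = chain(tampered)[-1]

# The truncation is caught only if the old head survived untouched.
print(verify(tampered, stored_head))
# If the stored head can be overwritten, the truncated log verifies cleanly.
print(verify(tampered, forged_head))
```

So the hash chain by itself buys nothing here; everything rests on the stored head being impossible to rewrite.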

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:36 UTC (Sat) by nybble41 (subscriber, #55106) [Link] (7 responses)

> Does anyone know what that "secure write-only location" is, in practice? Does my existing computer have one, or is this a new piece of hardware?

Most modern computers have one. On UNIX systems it's generally known as /dev/null. :)

More seriously, I suspect the intent was "write-once", a.k.a. "append-only". A CD-R or DVD+/-R in packet mode would probably qualify. For that matter, an old line printer (or a receipt printer) could probably be rigged to do the job.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:54 UTC (Sat) by PO8 (guest, #41661) [Link]

Yeah, please excuse my skepticism of this proposal, coming from a group of people advocating a major security change that can't even get basic terminology right. Back in the day, BYTE Magazine used to run an annual April Fools ad for various kinds of write-only memory chips; they reportedly got lots of serious responses from folks interested in purchasing.

Note that just storing the recent hash isn't good enough: the attacker could simply write over it with the last hash on their reconstructed chain, using whatever mechanism the OS was using to write the store. The whole hash chain needs to be kept on the secure store, and append-only is the obvious way to do this. For an append-only store I'd suggest flash and a dedicated microcontroller; see my post above.

Note also that even in this scheme the defender has the sometimes-difficult burden of figuring out at what timestamp the attacker compromised the system, so that the defender can tell which log messages to ignore.
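A rough Python sketch of the append-only variant follows. `AppendOnlyStore` is a hypothetical stand-in for the flash-plus-microcontroller device (real hardware would enforce the no-rewrite rule itself), and SHA-256 chaining is an assumption rather than anything taken from the proposal.

```python
import hashlib

class AppendOnlyStore:
    """Stand-in for hardware that accepts appends but never rewrites."""
    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)

    def records(self):
        return tuple(self._records)

def next_hash(prev, line):
    return hashlib.sha256(prev + line.encode("utf-8")).digest()

def write_entry(log, store, line):
    """Append a line to the log and its chained hash to the secure store."""
    recorded = store.records()
    prev = recorded[-1] if recorded else b""
    log.append(line)
    store.append(next_hash(prev, line))

def verify(log, store):
    """The log is intact only if its recomputed chain equals the whole store."""
    prev = b""
    recomputed = []
    for line in log:
        prev = next_hash(prev, line)
        recomputed.append(prev)
    return tuple(recomputed) == store.records()

log, store = [], AppendOnlyStore()
for line in ("blah blah", "foo foo", "root login from evil.com"):
    write_entry(log, store, line)

truncated = log[:-1]
# Appending a forged head cannot hide the truncation: the old record remains.
store.append(next_hash(store.records()[0], "forged entry"))
print(verify(truncated, store))
```

Because old records cannot be removed, any hash the attacker appends only makes the store longer than the rewritten log, and the mismatch stays visible.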

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 14:39 UTC (Sat) by endecotp (guest, #36428) [Link] (5 responses)

> I suspect the intent was "write-once", a.k.a. "append-only"

Ah OK. But:

- I don't have any append-only hardware on any of my systems (apart from optical drives and printers, but they're not serious options in most cases). So this is still going to require new hardware, in other words it's a non-starter.

- If I did have append-only hardware, I could just save the log in it. Yes it's bigger than the hashes, but not dramatically bigger, and it has the advantage of actually being tamper-proof rather than just tamper-evident.

(Again, have I missed something?)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 17:19 UTC (Sat) by PO8 (guest, #41661) [Link] (4 responses)

I certainly agree with your second statement; see my comments above.

For the first, I'm not sure requiring HW is a non-starter. It would have to be cheap---say $20 or less---to start with. Eventually, motherboard manufacturers would just start throwing it on the board, increasing the price of the board by $.50 given reduced costs and increased volume. We've seen that pattern over and over with dedicated hardware. Another possible path in this case is to make an append-only store controller part of the TPM spec (if it isn't already---I haven't looked).

There was a time (and I lived through it) where it was believed that memory protection hardware for microprocessors was a non-starter. Eventually people decided to pay: partly for security reasons. So there's some hope.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 17:57 UTC (Sat) by endecotp (guest, #36428) [Link] (2 responses)

> I'm not sure requiring HW is a non-starter. It would
> have to be cheap---say $20 or less---to start with.

Here's an idea that I had a while ago: make a USB dongle that appears to be a USB-to-serial converter. Data that is sent to it is recorded in its flash; you configure your system to send log messages to it like a serial console.

My aim was to have a "dying breath" log for machines where networked logging wasn't feasible, e.g. a single co-located server. The idea is that if there is a kernel panic and the critical last log messages don't make it to the disk, they might still make it to this device. You could then re-mount it, potentially on another machine, where it would appear as a storage device containing the log files.

You could make this relatively secure by not allowing a transition from logging mode to read mode without re-plugging.

This could be implemented by a simplish microcontroller with USB device functionality. Maybe a bit more than a PIC, but not much more.

One thing that I was unsure of was how likely it would be that the host's USB system would keep running for long enough after a kernel crash. Perhaps a real serial device, or a console video recorder, would work in more cases. This device would be rather pointless if it were no more reliable than the disk.

Anyway, just a tangential thought for you all...

There are already such devices on the market...

Posted Nov 20, 2011 8:04 UTC (Sun) by khim (subscriber, #9252) [Link]

What you are describing looks awfully close to the "PS3 JailBreak dongle".

The PS3 Jailbreak works by overflowing a receive buffer in the PS3, so it needs to emulate four or five (depending on the Jailbreak payload) USB devices, return bogus information with the jailbreak code, etc. That is significantly more than what your "logging device" would have to do. Prices for these devices start at about $10 retail, and that is with tiny production runs, so without much economy of scale. More expensive ones (in the aforementioned range of about $20) may even emulate a USB stick to make use after a successful jailbreak more pleasant!

In short: what you are describing looks perfectly doable.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 22:42 UTC (Wed) by cas (guest, #52554) [Link]

Here's an idea that I had a while ago: make a USB dongle that appears to be a USB-to-serial converter. Data that is sent to it is recorded in its flash; you configure your system to send log messages to it like a serial console.

This bit of the idea is good

You could make this relatively secure by not allowing a transition from logging mode to read mode without re-plugging.

but this bit isn't. It would make more sense and be far more usable if the USB dongle presented two devices.

The first device being a (perhaps serial) output device for writing log entries with maybe a control option for rotating log files by YYYYMMDD or whatever. each line sent to the device should have a "filename" (or syslog facility, or some other identifier) as the first word/field, with the remainder of the line being the log entry

The second a *read-only* USB storage device for reading the logs whenever you like.

so, the one device would provide write-once/append-only logging, and random read access to those logs

such a device could be made dirt cheap, too. it's just a USB flash disk with a slightly more capable processor & USB interface
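As a rough illustration of the line format described above, here is a Python sketch of what the dongle's firmware logic might look like. Everything in it is an assumption made for the example: the `NAME message` layout, the `NAME-YYYYMMDD.log` naming, and the `route_line` and `AppendOnlyLogs` names. Real firmware would be C on a microcontroller; this only shows the routing idea.

```python
from collections import defaultdict
from datetime import date

def route_line(line, today):
    """Split 'NAME message...' into (target file name, log entry)."""
    name, _, entry = line.rstrip("\n").partition(" ")
    # Refuse names that could escape the log area or hide files.
    if not name or "/" in name or name.startswith("."):
        raise ValueError("bad log name: %r" % name)
    return "%s-%s.log" % (name, today.strftime("%Y%m%d")), entry

class AppendOnlyLogs:
    """Write side appends only; read side hands out immutable copies."""
    def __init__(self):
        self._files = defaultdict(list)

    def write(self, line, today):
        filename, entry = route_line(line, today)
        self._files[filename].append(entry)

    def read(self, filename):
        return tuple(self._files.get(filename, ()))

logs = AppendOnlyLogs()
logs.write("auth sshd: accepted key for root\n", date(2011, 11, 23))
logs.write("auth sshd: session closed\n", date(2011, 11, 23))
print(logs.read("auth-20111123.log"))
```

The write path has no way to name an existing entry, so there is simply no operation for modifying or deleting one, while the read path can be used at any time.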

The Journal - a proposed syslog replacement

Posted Dec 20, 2011 7:46 UTC (Tue) by topher (guest, #2223) [Link]

For the first, I'm not sure requiring HW is a non-starter. It would have to be cheap---say $20 or less---to start with.

Yes, it is a non-starter. There is no computer (or parts) manufacturer that is going to start including specialized hardware, even if it only cost $0.01us, for a system that doesn't exist yet, and that hasn't been adopted.

Especially when a lot of people, including some of us who have spent years dealing with logging retention, access, security, processing, alerting, etc, look at this and think it's a bad idea.

The Journal - a proposed syslog replacement

Posted Nov 18, 2011 23:04 UTC (Fri) by rfunk (subscriber, #4054) [Link]

I'd like to propose that Lennart get together with Daniel J. Bernstein to do their own distribution of Linux that works completely differently from the way the rest of us want.

It's really too bad that the OpenBSD folks have such a bad attitude toward Linux; when they set about making a more secure replacement for some bit of Unix infrastructure, they're quite good at doing it in a way that respects the way people expect (and want) Unix to work.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 1:59 UTC (Sat) by lindahl (guest, #15266) [Link]

The proposed use-case is silly. Everyone uses remote logs after a break-in.

Lots of Unix flavors have had binary syslogs. Fortunately, they are mostly dead or dying.

Mixed feelings

Posted Nov 19, 2011 2:20 UTC (Sat) by tau (subscriber, #79651) [Link] (2 responses)

Alright, since this is an RFC of sorts I'll add my 2c

Firstly, I'm not sure I share the author's conviction that syslog is profoundly broken. Drastically changing core tools in a system administrator's toolkit is something that needs to happen from time to time, but the very change itself will have a negative impact, as well as numerous unforeseen consequences, and the justification will have to be a compelling one. The article mentions message source validation, log rotation, and uneven disk usage due to chronological log rotation as drawbacks of the current rsyslogd system. None of these things require a wholesale replacement of the tool or even its output format. The cryptographic hash chain also seems to be a solution in search of a problem; the primary problem with the kernel.org compromise AFAIK was overly permissive access policies, and log files are only a small part of the post-mortem forensics. I'm far too young to be against all changes to the world for any reason, but the issues presented in the proposal seem to be tangential to the changes in journald itself.

Secondly, the rigid bidirectional coupling between systemd and journald smells very bad to me. I like systemd because it is actually good at decoupling things and being filesystem-centric, two traits that are held up as integral parts of good UNIX design. Systemd's site-specific overrides that can exist independently of unit files in software distributions, configuration of system policy by symlink manipulation, and a restricted declarative unit language with easily-modelled semantics are points in its favour. However, journald is a step backwards; this bidirectional coupling just creates a monolithic entity with questionable design, and this decision does not come with any justification. Conceptually a logging service ought to exist independently of a service manager, though the service manager could be highly dependent on the unique feature set of a logging system. Similarly, the telescreen-like inability to turn it off instead of merely turning it down is just downright bizarre, and I can't think of any good reason for that. Monitoring systems should be passive; I should be able to remove journald from my system entirely and suffer no loss of functionality. This is a key design principle of just about every other logging system out there.

Unique and globally-namespaced identifiers for different classes of events together with machine-readable key-value tags are an excellent idea, and definitely serve as a good justification for making this change, but I'm not sure about using UUIDs. The use of Java-style reversed domain name prefixes has compelling advantages over UUIDs, because 8fa69db4-0a89-4c7b-b715-7e7ea93233c7 doesn't exactly roll off the tongue, nor is it at all memorable. On the other hand, I could potentially search for "journald bad sector event" and discover that the tag for this message is "org.kernel.blockdev.media.bad_sector". I can then search for that string online and find far more posts, log fragments, cross-references, and diagnostic scripts referencing that particular kind of event than I would find just searching for a UUID, since people are far more likely to remember the text string and cite it in discussion groups than they are to post a UUID. The only disadvantage to this style of identifier given in the article is that it is inefficient and that namespaces would have to be explicitly managed. This focus on execution efficiency and namespace collisions comes from a programmer's perspective, not from a system administrator's perspective, and far more system administrators will have to deal with this system than programmers will have to maintain journald's implementation. That, and, there is of course the old quote about sins being committed in the name of efficiency.

Finally, I wish the authors would choose a different name than "The journal". That word is already taken in the sysadmin's lexicon: a journal is a filesystem data structure, for heaven's sake. Then again, databases refer to journals as logs, so I suppose I can't get too upset about systemd referring to logs as journals :)

(FWIW I'm a programmer, not an administrator as such)

Mixed feelings

Posted Nov 20, 2011 1:07 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

> Unique and globally-namespaced identifiers for different classes of events together with machine-readable key-value tags are an excellent idea, and definitely serve as a good justification for making this change

Have you read the most recent RFC for syslog?

It defines key-value tags for syslog

Globally assigned namespace identifiers only work if you have global coordination of all development. Since you don't, there will be collisions with different programs picking the same identifier.

Mixed feelings

Posted Nov 20, 2011 2:38 UTC (Sun) by nevyn (guest, #33129) [Link]

Also, this addition to syslog isn't _new_ by any stretch ... the current RFC is http://tools.ietf.org/html/rfc5424 from 2009 but the first draft version which specified structured-data was http://tools.ietf.org/html/draft-ietf-syslog-protocol-01 ... which is from January 19, 2004.
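For anyone who hasn't looked at it, here is a small Python sketch of what an RFC 5424 message with structured data looks like on the wire. The helper names are made up, the host, app and field values are invented, and `exampleSDID@32473` is the placeholder SD-ID style the RFC uses in its own examples. This is only a formatter sketch, not a full implementation of the RFC (it does no length or character-set checks).

```python
def escape_param(value):
    # RFC 5424 requires '"', '\' and ']' to be backslash-escaped in PARAM-VALUE.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")

def format_rfc5424(facility, severity, timestamp, hostname, app,
                   procid, msgid, sd_id, params, msg):
    """Build '<PRI>1 TIMESTAMP HOST APP PROCID MSGID [SD-ID k="v" ...] MSG'."""
    pri = facility * 8 + severity
    sd_params = "".join(' %s="%s"' % (k, escape_param(v)) for k, v in params)
    sd = "[%s%s]" % (sd_id, sd_params)
    return "<%d>1 %s %s %s %s %s %s %s" % (
        pri, timestamp, hostname, app, procid, msgid, sd, msg)

line = format_rfc5424(
    facility=4, severity=6,            # auth facility, informational severity
    timestamp="2011-11-20T02:38:00Z",
    hostname="host1", app="sshd", procid="1234", msgid="LOGIN",
    sd_id="exampleSDID@32473",
    params=[("user", "root"), ("ip", "192.0.2.1")],
    msg="accepted login",
)
print(line)
```

The key-value pairs sit in the bracketed element ahead of the free-text message, so a parser can pick them out without guessing at the message wording.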

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:33 UTC (Sat) by gerdesj (subscriber, #5446) [Link]

Most of the 14 points made in the link are covered in some way by rsyslog.

Rather than reinvent the wheel, why not reuse what is already there?

I've even persuaded it to create MySQL based logs of emails through Exim + Spam Assassin + SA-Exim + ClamAV.

Good luck making that easier via Journal.

At least Rsyslog has been around for ages and developed by someone who is _somewhat_ focused on one thing.

As MS et al have found to their cost as well, logging is not easy and does not lend itself to many standards.

Facilitating parsing and passing is useful however - and for that RSyslog is a bit good.

Cheers
Jon

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 2:46 UTC (Sat) by RogerOdle (guest, #60791) [Link] (1 responses)

Why do you think that it is necessary to modify the established log file format? The features described here can be kept in their own tracking file maintained by the syslog system without a single modification to any of the existing files or formats. If people do not want this performance-wasting feature then they can do without. I haven't seen a widespread problem with these log files. Hardened systems track entry in multiple ways using redundancy and isolation in security mechanisms. They try to avoid single points of failure whenever possible. It has been suggested that log files were modified to hide the tracks of hackers, but how often have they succeeded? They didn't in the case mentioned, since they got caught.

Security is a constantly changing field so vulnerabilities like these have to be examined. But we have to consider the impact on performance. Generating hash codes is not a simple or quick operation when you are considering it in relation to a heavily used transaction processor. Hash codes on the syslog may not noticeably impact a low-demand server, but they will never be used on a high-demand server.

In any case, I am against use of binary files for system administration of UNIX/Linux systems. Human-readable ASCII files preserve compatibility, make it easier for system operators to tell what is going on, and make it easier for new users to learn how the systems work. What do you do when your binary log reader is hacked? How can you tell whether it is telling you the truth or lying to you?

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:16 UTC (Sat) by intgr (subscriber, #39733) [Link]

> Generating hash codes is not a simple or quick operation when you are
> considering it in relation of a heavily used transaction processor.

Nope, that's premature optimization.

I'm sure the Journal system will introduce new performance bottlenecks, but hashing is not one of them. Even my 2-year-old CPU (Phenom II) can hash ~300 MB of data per second with SHA-512, on a single CPU core. SHA-1 and MD5 are even faster.
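If you want to check the figure on your own hardware, a quick Python sketch like the following gives a ballpark number. The function name and the buffer sizes are arbitrary choices, and results will vary with the CPU and with how hashlib was built, so treat the output as a rough estimate only.

```python
import hashlib
import time

def hash_throughput(name, total_mb=32, chunk_kb=64):
    """Return approximate single-core throughput in MB/s for one algorithm."""
    chunk = b"\0" * (chunk_kb * 1024)
    rounds = (total_mb * 1024) // chunk_kb
    h = hashlib.new(name)
    start = time.perf_counter()
    for _ in range(rounds):
        h.update(chunk)
    h.digest()
    elapsed = time.perf_counter() - start
    return total_mb / elapsed

for name in ("sha1", "sha256", "sha512"):
    print("%-7s %8.1f MB/s" % (name, hash_throughput(name)))
```

Compare the result with how many bytes per second your syslog actually writes; on most machines the two numbers are orders of magnitude apart.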

Against the tide of negativity

Posted Nov 19, 2011 5:21 UTC (Sat) by ringerc (subscriber, #3071) [Link]

... as a sysadmin I'm DELIGHTED to see someone looking at fixing the painfully unstructured, horrid and braindead logging infrastructure used on Linux and UNIXes. Thanks for having the guts to tackle this task - despite surely knowing the hostility and change-resistance you'd face. The state of *nix logging is miserable and I'm delighted to see some progress in this area after endless years of logging so crap it makes the Java logging fiasco look good in comparison.

I'm not completely comfortable having the file format undocumented, though I understand why you want to do so _initially_ for agility and the ability to make quick fixes/improvements.

At the very least a format version header is needed so that tools written to read the format (which there *will* be, documented or no and whether or not there's a good reason) can at least detect newer and unrecognised versions of the format.

FWIW, I like the direction systemd is going in as well, so thanks for the good work so far. Linux needs a real core overhaul if it's going to be maintainable for future workloads as they move away from the desktop and to multiple interconnected sometimes-on battery-powered devices, and systemd provides a useful foundation for some of that. Thanks for enduring the hate mail from change resistant types.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 5:39 UTC (Sat) by jcm (subscriber, #18262) [Link] (1 responses)

Just to comment on the specific quote in the original post. There is absolutely no reason syslog entries we have today could not include a cryptographic hash. Therefore, calling that out as a "feature" needs to be countered promptly lest some become confused that only by replacing another fundamental piece of Unix technology can we be "secure" from kernel.org break-ins in the future.

As to the notion of a binary only, undocumented, non-standardized logging format...I don't know where to begin so in the interest of my blood pressure, I won't comment further on that.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 17:28 UTC (Sat) by jcm (subscriber, #18262) [Link]

Secondarily, the only way to do secure logging properly is to send logs to another machine, preferably at some point involving physical hard copy. I see all of these reactionary claims post the kernel.org break-in, some bordering on total nonsense (like the systems now configured to blow away any ssh keys found stored therein - hint, there are many other things someone could do to compromise global security, this is caring way too much about one thing). But anyway, to get back to the point, the *only* *ONLY* way to do security when it comes to logging is airgap-separated external one-way logging to another machine. ONLY.

A few thoughts.

Posted Nov 19, 2011 5:50 UTC (Sat) by ebiederm (subscriber, #35028) [Link] (5 responses)

- Forensics. The biggest problem I have had recently is the default logrotate configuration that deletes everything after a month. You don't have to be a hacker to delete the logs; logrotate is configured to delete the logs for you.

- Structured and more parseable syslog. That is known as RFC5424, and is already supported by rsyslog. It might help to get applications to take better advantage of the new format but a better backwards compatible syslog has already been done and is standardized.

Can we please just take advantage of the tools we have rather than embarking on yet another NIH adventure?

A few thoughts.

Posted Nov 19, 2011 7:47 UTC (Sat) by alankila (guest, #47141) [Link] (4 responses)

It seems to me that journald aims higher than this RFC, especially because it understands two important properties of logging:

1) messages in a log tend to be highly repetitive, so they are amenable to significant reuse of the internally used field values. You don't want to repeat the same data for the umpteenth time, especially if the value is long; you would rather write the key-value pair once into a common area, and from that point onwards reference only the ID of that key-value pair.

2) log messages are useful information to end users and administrators alike, but it tends to require a pair of eyeballs and experience to find the right log file and to know what entries to look for. This is an unnecessary chore. As a hypothetical example, you launched "foobard" and it says "failed", and you are wondering if it wrote anything in any log. I'm hoping you could say "logcat -d /usr/sbin/foobard" to find the latest messages from journal generated by that program.

I suppose 1) means that logs can be substantially more comprehensive than the RFC without requiring that much more disk space, and 2) is something that ought to make linux far more pleasant to use in general.
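To illustrate both points, here is a toy Python sketch of the idea as I understand it. It is not the journal's actual on-disk format (which isn't documented), and the class names and the `exe` and `message` field names are invented for the example. Each distinct key-value pair is stored once, entries hold only IDs, and a lookup by field becomes a simple ID match.

```python
class FieldTable:
    """Stores each distinct (key, value) pair once and hands out small IDs."""
    def __init__(self):
        self._ids = {}
        self._pairs = []

    def intern(self, key, value):
        pair = (key, value)
        if pair not in self._ids:
            self._ids[pair] = len(self._pairs)
            self._pairs.append(pair)
        return self._ids[pair]

    def find(self, key, value):
        return self._ids.get((key, value))

    def lookup(self, field_id):
        return self._pairs[field_id]

    def __len__(self):
        return len(self._pairs)

class ToyJournal:
    def __init__(self):
        self.fields = FieldTable()
        self.entries = []   # each entry is a tuple of field IDs

    def append(self, **fields):
        ids = tuple(self.fields.intern(k, v) for k, v in sorted(fields.items()))
        self.entries.append(ids)

    def match(self, key, value):
        """Return all entries (decoded to dicts) that carry key=value."""
        wanted = self.fields.find(key, value)
        if wanted is None:
            return []
        return [dict(self.fields.lookup(i) for i in entry)
                for entry in self.entries if wanted in entry]

journal = ToyJournal()
journal.append(exe="/usr/sbin/foobard", message="failed")
journal.append(exe="/usr/sbin/foobard", message="retrying")
journal.append(exe="/usr/sbin/sshd", message="failed")

for entry in journal.match("exe", "/usr/sbin/foobard"):
    print(entry["message"])
```

Three entries carry six field references here but only four distinct pairs get stored, and the saving grows with long, frequently repeated values.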

What follows next is my impression: I think Linux has a fairly soft underbelly, and it appears to have become the target of increasingly embarrassing attacks, and these attacks will continue if we don't get our act together. Journald is trying to harden the logging facility against things such as attacker process flooding the log and thus causing the system to run out of space on the /var filesystem, a problem that seems to be ignored today. My feeling is that a lot of our existing design choices at the bottom of the application stack are basically inadequate to meet the actual needs of well-designed robust systems, and we seem to believe that inadequacy at the bottom level of the stack can be repaired by pouring a layer of code on top of that insufficient foundation. However, the insufficient foundation remains insufficient due to the law of leaky abstractions.

The consequence from the above argument is that it is perfectly OK to design a monolithic piece of software as long as your monolithic design is _good_. If it is not good enough, then no amount of crap poured on top of it will fix it either. I'd be fully on board of this journald if it means that I don't have to go "log fishing" through zgrep foo /var/log/* anymore, but can rather say something like "just give me all entries from last 30 seconds" or "it was written by the process image /usr/sbin/x".

A few thoughts.

Posted Nov 19, 2011 15:48 UTC (Sat) by NAR (subscriber, #1313) [Link] (1 responses)

Journald is trying to harden the logging facility against things such as attacker process flooding the log and thus causing the system to run out of space on the /var filesystem, a problem that seems to be ignored today.

And exactly how would journald solve this? All the attacker has to do is to flood the logs with unique messages...

A few thoughts.

Posted Nov 19, 2011 18:40 UTC (Sat) by alankila (guest, #47141) [Link]

It is said to ratelimit the logging speed if it starts to run out of disk space (but ratelimit based on what criteria?), and to rotate and purge the logs when the alternative is running out of space (but what entries will be purged?). These changes ought to solve the problem to a degree, and I am hoping sufficient intelligence is applied to make the best possible effort in these adversarial conditions.
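I don't know what criteria the journal will actually use, so the following Python sketch is purely one possible answer: a token bucket per message source, so that a flooding process exhausts its own budget without silencing everyone else. The class names, the rate and the burst size are all made up for the example.

```python
class TokenBucket:
    """Allows 'burst' messages at once, then 'rate' messages per second."""
    def __init__(self, rate, burst, now):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = now

    def allow(self, now):
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

class PerSourceLimiter:
    """Keeps an independent bucket for every source (e.g. a service name)."""
    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.buckets = {}

    def allow(self, source, now):
        bucket = self.buckets.get(source)
        if bucket is None:
            bucket = self.buckets[source] = TokenBucket(self.rate, self.burst, now)
        return bucket.allow(now)

limiter = PerSourceLimiter(rate=1.0, burst=3)
flood = [limiter.allow("attacker", now=0.0) for _ in range(10)]
print(flood.count(True), "of", len(flood), "flood messages accepted")
print("sshd still logged:", limiter.allow("sshd", now=0.0))
```

Since the limit is keyed on the source rather than on message content, making every message unique doesn't help the flooder. It only works, of course, if the source identity is something the logging daemon can establish by itself rather than a value the sender supplies.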

A few thoughts.

Posted Nov 21, 2011 15:23 UTC (Mon) by sorpigal (subscriber, #36106) [Link] (1 responses)

Your 2) is possible with syslog, but it's spelled "tail /var/log/foobard" - just standardizing on the naming of these files would be enough to satisfy this.

I'm actually a fan of improving logging, because it could be so much better, but this proposal has so many things that make me go "Huh?" that I'd rather change nothing than adopt it as is.

> The consequence from the above argument is that it is perfectly OK to design a monolithic piece of software as long as your monolithic design is _good_.

I would say, rather, that the consequence of the above argument is that some of the simple systems we have long used are becoming no longer "just enough" for some uses and that some of them should be replaced with a new level of "just enough" that satisfies today's requirements... and this needs to happen before they are each replaced by ill-conceived, grandiose, kitchen-sink solutions which throw the baby out with the bathwater.

A few thoughts.

Posted Nov 21, 2011 15:43 UTC (Mon) by mpr22 (subscriber, #60784) [Link]

> I would say, rather, that the consequence of the above argument is that some of the simple systems we have long used are becoming no longer "just enough" for some uses and that some of them should be replaced with a new level of "just enough" that satisfies today's requirements... and this needs to happen before they are each replaced by ill-conceived, grandiose, kitchen-sink solutions which throw the baby out with the bathwater.

It really would be nice if more of the visible pro-Unix people in discussions of Lennartware would say "there may be a problem but Lennart's solution is wrong" instead of "Lennart is wrong to think there is a problem". It would be even nicer if they'd put their code where their prose is, of course :)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 8:57 UTC (Sat) by aleXXX (subscriber, #2742) [Link] (7 responses)

Another practical concern: Lennart is developing a lot of new stuff, supposed to sit at the core of a Linux system.
PA, systemd, bluez, now this; was there more?

Every piece you write you have to maintain in the future. Even more so for such core tools. This adds up.
Will Lennart be able to support and maintain all the things he has developed (and will develop if it continues like that) in the next 10 or 20 years?

Alex

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 9:52 UTC (Sat) by rahulsundaram (subscriber, #21946) [Link] (6 responses)

You are trying too hard. Lennart has already handed over development and maintenance of PulseAudio, Bluez to other people and Avahi doesn't require much ongoing work at all. systemd also has other maintainers.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 10:31 UTC (Sat) by aleXXX (subscriber, #2742) [Link] (5 responses)

No, I'm not trying anything.

This is just not how I understand software development.

Implement something new and cool and then hand it over to others because you don't care anymore.

I am still maintaining now in my spare time in the evenings the stuff I added to CMake in 2007 (more than 4 years ago) when I was working at Kitware as a paid job. I still feel responsible for it.

Alex

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:17 UTC (Sat) by rahulsundaram (subscriber, #21946) [Link] (4 responses)

What makes your method better? First you were worried that he isn't going to be able to maintain all of it but when I point out he isn't doing that, you complain about that too. I wonder what outcome will be considered satisfactory according to you.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:51 UTC (Sat) by aleXXX (subscriber, #2742) [Link] (3 responses)

That he maintains his stuff.
Writing something, and then leaving the rest to others is easy.

Other well known developers ?
Linus is still caring about the kernel.
Theo is still caring about OpenBSD.
David Faure is still in KDE.
Bill Hoffman is still maintaining CMake.
The list goes on.

I mean, I'm not aware of any other high-profile developer who started so many things in such a short time and doesn't maintain them anymore.

Alex

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 13:22 UTC (Sat) by rahulsundaram (subscriber, #21946) [Link]

Now, this is exactly what I meant by you trying too hard. Lennart is still involved in everything he has written. Doesn't mean, he has to be the primary maintainer. I see, your list includes KDE but doesn't mention GNOME. That's very convenient. Firefox? Apache? PHP? Exim? GTK? The list of projects that have new maintainers now is far longer than yours.

David Faure might be involved in KDE but what about all the different KDE modules? There is nothing to indicate that developers have to hang on to every project they start forever. Quite the opposite. Transitioning maintenance is a *very healthy* thing to do. Nobody has to worry about the bus factor much if every project had new maintainers from time to time.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 14:16 UTC (Sat) by Wol (subscriber, #4433) [Link]

But what if Lennart's strength is in design, not maintenance? As I describe myself, people come to me if they need a problem solved. Once I've found the solution, looking after it gets taken away from me.

Why do you want to ignore Lennart's strengths, and harp on his weaknesses?

After all, take Linus for example, how much kernel code does he write nowadays? I understand the answer is "nothing".

Why SHOULD Lennart have to look after his stuff that he's designed, once others see its value and take it over? Or do you think that Linus should be using a large chunk of his time looking after git?

Cheers,
Wol

That's funny...

Posted Nov 19, 2011 14:37 UTC (Sat) by khim (subscriber, #9252) [Link]

> I mean, I'm not aware of any other high-profile developer who started so many things in such a short time and doesn't maintain them anymore.

Funny that you started with Linus. Who quite explicitly developed "new thing" (git) and quickly shuffled maintenance to others.

The fact is: creation of new software and maintenance are both important but these are different skillsets. If Lennart can find good maintainers who will support his creations - then all is well. Linus is good at both creation and maintenance roles - but even he does not support everything he wrote.

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 9:49 UTC (Sat) by kerolasa (guest, #56089) [Link]

I don't understand why there is no way to allow {,u}mount only if the answer to a private key challenge is known. The block device layer could be modified so that only file read, create and append operations are passed to the file system. Additionally allowing rename might be sane, but not across file systems. Basically that would make the block device something in between traditional read-write and read-only.

This might enable extra security with existing file systems and utilities, and would perhaps be suitable for an FTP file archive. Logs are more difficult: they are often compressed, which does not work without being able to delete the uncompressed file. I guess to overcome this, choosing a file system that supports transparent compression is the way to go.

The Journal proposal sounds interesting. I understand the proposal breaks old ways of working, and I am not worried about that; sometimes old bad habits have to go to make room for a new generation of solutions. That is in line with Hume's (is-ought) Law. More importantly, it is easy to defend what is, and to say that the old must not break, without really contributing to how things ought to be. In case it is unclear, I am on the side of studying how things ought to be, including intuitively wrong and crazy ideas. People who abandon alternatives without careful thinking are not thinking, and therefore their opinion is less important.

p.s. For people who do not read old philosophers: Hume's Law can be expressed in many ways, and here are two: 'you cannot determine how things ought to be from how they are', or 'how things are is not necessarily how they ought to be'.

NIH: The Journal - a proposed syslog replacement

Posted Nov 19, 2011 12:12 UTC (Sat) by rilder (guest, #59804) [Link]

Is it not possible to achieve consistency of log files without making the log files binary? They say it may increase complexity, but won't this increase complexity too? People will be really unforgiving if high CPU usage etc. on heavily loaded servers is due to logging.

Syslog already supports different sinks like databases, network. Is it hard to add authentication to those ? I have seen 0MQ with rsyslog being implemented -- so basically you have all the freedom to add all the boilerplate auth you need.

Lastly, I don't see this being used on any desktops, only on servers perhaps.

Also, one of the comments states -- instead of gzcat i will use <tool> .. But for uncompressed text based logs, picking one tool from a plethora is what makes it best.

Git - a proposed syslog storage

Posted Nov 19, 2011 12:43 UTC (Sat) by kragilkragil2 (guest, #76172) [Link]

Wouldn't it be a lot simpler to use git to store today's logfiles? AFAICS it could provide the same level of security and every Unix person needs to learn git anyway (IMO).

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 17:13 UTC (Sat) by Tr0n (guest, #42662) [Link] (5 responses)

Has Lennart gone mad!?
You secure logs by sending them to a remote host (or two) and making sure they can only be administered by a handful of people.

The first rule about security is ACCESS.. If they have access to delete the logs or destroy the system, that's all they need.

Heh, a write-only location for the initial seed of the logs is silly... How is it going to be read in order to verify the first entry? Why wouldn't root just write a new value?
Again, I re-iterate what has been known for YEARS: to secure logs of events, things should be sent to a remote syslog (/rsyslog) server.

This custom binary rubbish is just plain madness.

(BTW, I can see a great amount of sense and reasoning behind systemd/pulseaudio - which is why I'm so surprised)

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 18:51 UTC (Sat) by alankila (guest, #47141) [Link] (4 responses)

The idea is, as far as I understand it, to periodically log the latest log entry's hash (and id) to a location which does not permit updates afterwards. I'm going to guess we are talking about logging it once per minute or something like that.

From there on, you can use this sequence of hash values to significantly improve the chances of detecting any tampering, because on average the attacker has only half of the hash storage window to take over the logging facility before tampering with the logs becomes provable.
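The chaining idea can be sketched in a few lines of C. This is only an illustration under loud assumptions: FNV-1a is used as a stand-in so the example needs no external library, and it is NOT a cryptographic hash (a real implementation would need something like SHA-256). The names `chain_next`, `chain_top` and `chain_verify` are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* FNV-1a, 64-bit. NOT cryptographic: a stand-in for a real hash such as
 * SHA-256, chosen only to keep the sketch self-contained. */
static uint64_t fnv1a(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 1099511628211ULL;
    }
    return h;
}

/* Hash of one entry, computed over the previous entry's hash plus the
 * message, so that each entry authenticates everything before it.
 * (Hashing the raw bytes of prev is byte-order dependent; fine for a sketch.) */
uint64_t chain_next(uint64_t prev, const char *msg)
{
    uint64_t h = fnv1a(14695981039346656037ULL, &prev, sizeof prev);
    return fnv1a(h, msg, strlen(msg));
}

/* Top-most hash of a log of n entries, starting from a fixed seed of 0. */
uint64_t chain_top(const char *const *msgs, size_t n)
{
    uint64_t h = 0;
    for (size_t i = 0; i < n; i++)
        h = chain_next(h, msgs[i]);
    return h;
}

/* Returns 1 if the first n entries still match a top hash that was saved
 * earlier to the secure location, 0 if any of them was changed or removed. */
int chain_verify(const char *const *msgs, size_t n, uint64_t saved_top)
{
    return chain_top(msgs, n) == saved_top;
}
```

Only the saved top hash needs to live somewhere the attacker cannot rewrite. Entries appended after the last save are not covered, which is the window discussed above.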

Binary log is entirely reasonable given the additional goals being sought: ability to log binary data using a format that reuses values from previous log entries and stores them in compressed form (permitting machine-readable self-describing log entries with far more information than otherwise is possible without significantly increasing entry size).

The Journal - a proposed syslog replacement

Posted Nov 19, 2011 18:59 UTC (Sat) by alankila (guest, #47141) [Link]

I wish to add additional detail to the middle paragraph. The idea is that once attacker enters a machine, there may be a log entry in syslog that shows evidence for it happening, some characteristic error message or whatever.

If the attacker wishes to hide this entry, he must almost immediately take over the logging system before it manages to save the top hash to secure location, because afterwards you can't unnoticeably remove those log entries.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 0:23 UTC (Sun) by endecotp (guest, #36428) [Link] (2 responses)

> periodically log the latest log entry's hash (and id) to a
> location which does not permit updates afterwards.

Such locations don't exist, apart from printers and optical drives.

> I'm going to guess we are talking about logging it once per
> minute or something like that.

That's nothing like fast enough if the attacker has some sort of automatic tool (and I don't believe I've ever seen non-automated attacks). To be useful it must save every single new hash value.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 4:39 UTC (Sun) by alankila (guest, #47141) [Link]

Such locations can be constructed. Many people here seem to think that a dedicated syslog server is secure, and that would have no other function and no other visible ports except one which accepts data in syslog protocol. Logging every hash sounds like a solution whose overhead is comparable to just doing remote logging directly. There might be value in having some kind of middle ground.

Not every attack succeeds immediately, and it may take several tries to successfully exploit some race condition in a daemon. Once attacker breaks in through some local daemon, it still takes some time to download or build the relevant exploit utility, and to launch the secondary attack which finally gives root compromise.

The Journal - a proposed syslog replacement

Posted Nov 20, 2011 20:19 UTC (Sun) by jackb (guest, #41909) [Link]

> Such locations don't exist, apart from printers and optical drives.
So they don't exist except when they do exist?

rate limiting does not defend against DOS, it accentuates the DOS

Posted Nov 20, 2011 6:57 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

The idea that rate limiting is a defense against a DOS is invalid: if the logging infrastructure can't keep up and therefore slows down the rate at which other systems can generate logs, it doesn't prevent a DOS, it creates one by making the other systems stop responding while they wait for the logs to be written.

In some cases your logs are important enough to do this, but even in many places where security is very important, availability is still more important than guaranteeing that every log message gets saved.

rate limiting does not defend against DOS, it accentuates the DOS

Posted Nov 20, 2011 17:10 UTC (Sun) by alankila (guest, #47141) [Link]

It might be possible to ratelimit only the daemon that is logging too much, and let others continue logging without delay. In any case, without more details about what exactly will be done it's hard to say whether a good idea is being proposed. Running out of space on /var is pretty bad outcome too, so avoiding that one way or other seems worthwhile.
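One way to limit only the noisy daemon is a token bucket per logging source. The sketch below is an assumption about how that could look, not a description of what journald does; `log_bucket`, `bucket_init` and `bucket_allow` are hypothetical names, and the caller supplies the current time in seconds.

```c
/* One bucket per logging source (hypothetically keyed by daemon). */
struct log_bucket {
    double tokens;        /* messages the source may still log right now */
    double capacity;      /* burst size */
    double refill_per_s;  /* sustained messages per second */
    double last_time;     /* time of the last refill, in seconds */
};

void bucket_init(struct log_bucket *b, double capacity,
                 double refill_per_s, double now)
{
    b->tokens = capacity;
    b->capacity = capacity;
    b->refill_per_s = refill_per_s;
    b->last_time = now;
}

/* Returns 1 if the source may log one more message at time now,
 * 0 if the message should be dropped or delayed. */
int bucket_allow(struct log_bucket *b, double now)
{
    double elapsed = now - b->last_time;
    if (elapsed > 0) {
        /* Refill in proportion to elapsed time, capped at the burst size. */
        b->tokens += elapsed * b->refill_per_s;
        if (b->tokens > b->capacity)
            b->tokens = b->capacity;
        b->last_time = now;
    }
    if (b->tokens < 1.0)
        return 0;
    b->tokens -= 1.0;
    return 1;
}
```

Because every source has its own bucket, a flooding daemon exhausts only its own tokens and the others keep logging without delay. Whether rejected messages are dropped or the sender is blocked is exactly the policy question raised in the parent comment.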

log compression

Posted Nov 20, 2011 7:02 UTC (Sun) by dlang (guest, #313) [Link] (9 responses)

the compression that they talk about here sounds very impressive and sophisticated, but looking for identical strings of text and replacing them with shorter placeholders (i.e. pointers) is exactly what gzip, bzip2, etc do today. I would be surprised if the journal software was really able to do much better.

with many TB of real-world logs, I'm getting the following results

gzip -9 gives me 10:1 compression

bzip2 -9 gives me 20:1 compression but is significantly slower

doing a zgrep or zcat from a compressed file is actually faster than grep or cat from the uncompressed file (on a fairly sophisticated disk array).

I haven't yet done tests on lzma compression

log compression

Posted Nov 20, 2011 10:28 UTC (Sun) by misc (subscriber, #73730) [Link] (2 responses)

I guess the compression will be done on the fly, and not once per day by cron.

log compression

Posted Nov 20, 2011 11:30 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

if you have so few logs that you only rotate them once a day, do you really need the compression to happen any sooner?

I've got systems where I rotate the logs FAR more frequently (down to single digit minutes on some systems). This is a pretty large installation (architected to handle > 100K logs/sec), but when people are claiming the need to optimize things for performance/space reasons, they need to work better than the existing solutions.

log compression

Posted Nov 20, 2011 17:41 UTC (Sun) by ovitters (guest, #27950) [Link]

I assume it'll be properly researched. It seems you're very happy with syslog. Journal will still allow you to use syslog. I don't see any big issue, except stop energy. After having pulseaudio and systemd, better to contribute than to try and avoid it :P

log compression

Posted Nov 20, 2011 19:19 UTC (Sun) by slashdot (guest, #22014) [Link] (5 responses)

It seems journald will support random access and indexing log entries, which are both hard or impossible with the naive application of compression you are citing.

log compression

Posted Nov 20, 2011 19:47 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (3 responses)

So build a separate auxiliary file that sits *alongside* conventional log files. This auxiliary file can contain all the journal metadata - PIDs, precise timestamps, message GUIDs - and index some of them. The conventional syslog file would happily exist alongside the auxiliary file, unless disabled by an administrator.

Programs that want to log to the journal could use a library that looks like this:

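#include <stdarg.h>
#include <syslog.h>

/* Hypothetical attribute payloads; their real layout is left open here. */
typedef struct { unsigned char bytes[16]; } journal_attribute_guid;
typedef struct { const char *key; const char *value; } journal_attribute_keyval;
typedef struct { const char *name; } journal_attribute_module;
enum journal_attribute_type { JOURNAL_ATTR_GUID, JOURNAL_ATTR_KEYVAL, JOURNAL_ATTR_MODULE /* etc */ };

struct journal_log_attribute
{
    enum journal_attribute_type type;
    union {
        journal_attribute_guid guid;
        journal_attribute_keyval keyval;
        journal_attribute_module module;
        /* etc */
    };
};

/* Provided by the (hypothetical) journal library. */
int journald_is_active(void);
void journal_internal_vsyslog(int priority, const char *msg, va_list args,
                              struct journal_log_attribute *attributes[]);

void
journal_vsyslog(
    int priority,
    const char* msg,
    va_list args,
    struct journal_log_attribute* attributes[] /* NULL-terminated */)
{
    if (journald_is_active()) {
        /* A va_list may only be traversed once, so give the journal a copy. */
        va_list journal_args;
        va_copy(journal_args, args);
        journal_internal_vsyslog(priority, msg, journal_args, attributes);
        va_end(journal_args);
    }

    vsyslog(priority, msg, args); /* Ignore attributes */
}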

Have you read the document?

Posted Nov 20, 2011 19:58 UTC (Sun) by khim (subscriber, #9252) [Link] (2 responses)

> So build a separate auxiliary file that sits *alongside* conventional log files.

Have you read the design doc? This is exactly what journald does.

And if your idea is not to keep a separate log with indexing and all the additional goodies but to try to attach it to the existing textual file then this is stupidity beyond comprehension: referring to this scheme with "duct tape", "baling wire", or "chewing gum" does a disservice to all three of those fine building materials.

Have you read the document?

Posted Nov 20, 2011 20:02 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (1 responses)

> Have you read the design doc? This is exactly what journald does.

Except that, AIUI, the goal is to eventually have journal-only logging for some facilities. I don't want that to ever come to pass.

> attach it to the existing textual file

Why not? The index could point into a particular offset in a textual log, or just duplicate the contents of the textual log. The point is to ensure that all logs can be queried with plain-text tools and to _optionally_ provide richer information for these logs. The nightmare scenario is for some messages to appear only in the journal and for other messages to appear only in syslog files.
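As a rough illustration of an index that points into a textual log, the sketch below records the byte offset of every line and then fetches a line by number. It works on an in-memory buffer to stay self-contained; with a real file the same offsets would be passed to a seek call. `index_lines` and `fetch_line` are hypothetical names, not part of any existing tool.

```c
#include <stddef.h>
#include <string.h>

/* Records the byte offset at which each newline-terminated line of a text
 * log starts. Stores at most max_lines offsets and returns how many were
 * stored. */
size_t index_lines(const char *text, size_t len,
                   size_t *offsets, size_t max_lines)
{
    size_t n = 0, start = 0;
    for (size_t i = 0; i < len && n < max_lines; i++) {
        if (text[i] == '\n') {
            offsets[n++] = start;
            start = i + 1;
        }
    }
    return n;
}

/* Copies line number `line` (0-based) into out without its newline,
 * truncating to out_size - 1 bytes. Returns the copied length, or -1 if
 * the line is not in the index. */
long fetch_line(const char *text, size_t len,
                const size_t *offsets, size_t n, size_t line,
                char *out, size_t out_size)
{
    if (line >= n || out_size == 0)
        return -1;
    const char *start = text + offsets[line];
    size_t remaining = len - offsets[line];
    const char *end = memchr(start, '\n', remaining);
    size_t line_len = end ? (size_t)(end - start) : remaining;
    if (line_len >= out_size)
        line_len = out_size - 1;
    memcpy(out, start, line_len);
    out[line_len] = '\0';
    return (long)line_len;
}
```

The text file stays fully usable with grep and friends. The weakness is the one raised in the reply below: anything that rewrites or rotates the text file behind the index's back invalidates the stored offsets.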

> stupidity beyond comprehension

Can we try to maintain *some* level of decorum here?

Have you read the document?

Posted Nov 20, 2011 20:13 UTC (Sun) by khim (subscriber, #9252) [Link]

> attach it to the existing textual file

Why not?

Because these files are already processed by quite a few different programs stitched together in non-obvious ways. To hope that you can keep all that synchronized... I'll wish you luck.

> The nightmare scenario is for some messages to appear only in the journal and for other messages to appear only in syslog files.

If I understand correctly, all messages pass through journald but plain old syslogd messages go to syslogd too. This means journald keeps everything no matter what.

log compression

Posted Nov 20, 2011 21:45 UTC (Sun) by dlang (guest, #313) [Link]

That is a useful option to have for log messages, but it's also available today. you can store your log messages in a database (of many different kinds, including nosql variations) and have the data indexed umpteen ways.

At the risk of being dismissed as an old fogy, the power of unix is based on the tradition of having many simple tools work together rather than having one tool that tries to do everything for everyone.

Nowhere is this more the case than in logging.

How important the logs are to you will vary drastically (do you want the application to stall if the log can't be written, do you want to spool to disk and run the risk of filling your disk, or do you want to throw away the log message)

how you store the logs will vary drastically, and in many cases you may want to store them in multiple ways.

how you examine the logs for 'interesting' things will vary.

At my office we have all of the following in place

recording to flat files combining all the logs together

recording to flat files of specific types of log messages

recording to a nosql database cluster across a large farm of machines with everything indexed

open source tools to watch for 'interesting' events and notify us when they happen

custom tools to watch for 'interesting' events and notify us when they happen

commercial closed source tools to watch for 'interesting' events and notify us when they happen

with existing syslog, all of these things can work together and I can add other log processing as well

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 7:08 UTC (Mon) by yoe (guest, #25743) [Link] (9 responses)

Binary logging is a very bad idea.

"oh, crap, it seems like my hard disk broke down. Let's quickly check the logs to see what happened."

"error: could not read log file: log file corrupt."

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 10:25 UTC (Mon) by mpr22 (subscriber, #60784) [Link] (8 responses)

That's not an argument against binary logging. That's an argument in favour of having multiple copies of the log on different spindles (and possibly even different computers). If the disk has thrown a wobbly, you may well have garbage instead of metadata, garbage instead of data in whatever blocks were most recently written to, etc.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 13:08 UTC (Mon) by SEJeff (guest, #51588) [Link] (3 responses)

Hopefully next gen filesystems such as btrfs will partially solve this problem via checksumming data more until true ECC hardware and memory is the norm.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 19:06 UTC (Mon) by mpr22 (subscriber, #60784) [Link] (2 responses)

ECC seems vanishingly unlikely to be the norm until companies are forced to do it (as in, a sufficiently-large jurisdiction's laws make it either criminal or sufficiently expensively tortious to sell end-product devices that don't use some kind of ECC technology on their memory), because it increases the design and manufacturing costs of the end product without increasing sales of the end product.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 19:22 UTC (Mon) by SEJeff (guest, #51588) [Link]

Agreed. That is why filesystems which checksum and keep multiple copies of data/metadata will help paper over the problem of unreliable hardware. Things like ZFS and Btrfs truly are the future.

The Journal - a proposed syslog replacement

Posted Nov 24, 2011 8:06 UTC (Thu) by cas (guest, #52554) [Link]

> because it increases the design and manufacturing costs of the end product without increasing sales of the end product.

and the really annoying thing is that it would just be a short-term once-off design cost until the chipsets for various device types all had ECC support (all AMD CPU chipset motherboards have ECC support and have had for several years - the catch is that ECC RAM is much more expensive). And the economies of scale for producing just one kind of RAM instead of "server RAM" and "desktop RAM", would quickly offset even those costs within a HW design generation

which is, of course, the reason why it hasn't happened - artificial market segmentation is extremely profitable. You can only charge more for "server-class" hardware if they have a few things which don't exist or are uncommon on "consumer" motherboards - e.g. ECC being uncommon outside of AMD chipsets, consumer motherboards having SATA rather than SAS (which should just replace SATA entirely), and consumer drives being SATA interface rather than SAS.

The Journal - a proposed syslog replacement

Posted Nov 21, 2011 22:29 UTC (Mon) by jrn (subscriber, #64214) [Link] (2 responses)

Or maybe an argument in favor of using error correcting codes in that binary format. :)

Though it's hard to beat the robustness against corruption of an uncompressed text file. Entire sectors can be missing, and the rest is still easily readable without requiring specialized skills.

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 5:45 UTC (Tue) by slashdot (guest, #22014) [Link] (1 responses)

This is because text has a record delimiter that is not used within the records (the newline character), making synchronization trivial.

A binary format with record hashes is also similarly recoverable, since you can just try all record start positions until one hashes properly (much more expensive computationally, but on current CPU it won't be noticeable if records aren't huge).
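A small sketch of that recovery scan, under stated assumptions: the record layout (one length byte, the payload, then a 4-byte hash) is invented for this example, and 32-bit FNV-1a stands in for whatever hash a real format would use. The function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit FNV-1a; a stand-in hash, not what any real format is known to use. */
static uint32_t fnv1a32(const unsigned char *p, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= 16777619u;
    }
    return h;
}

/* Appends one record at pos: [len byte][payload][4-byte big-endian hash].
 * msg must be at most 255 bytes and buf must have room. Returns the offset
 * just past the new record. */
size_t record_write(unsigned char *buf, size_t pos, const char *msg)
{
    size_t len = strlen(msg);
    buf[pos] = (unsigned char)len;
    memcpy(buf + pos + 1, msg, len);
    uint32_t h = fnv1a32(buf + pos, len + 1);
    for (int i = 0; i < 4; i++)
        buf[pos + 1 + len + i] = (unsigned char)(h >> (24 - 8 * i));
    return pos + 1 + len + 4;
}

/* Returns 1 if a complete record with a matching hash starts at pos. */
int record_valid_at(const unsigned char *buf, size_t size, size_t pos)
{
    if (pos >= size)
        return 0;
    size_t len = buf[pos];
    if (pos + 1 + len + 4 > size)
        return 0;
    uint32_t stored = 0;
    for (int i = 0; i < 4; i++)
        stored = (stored << 8) | buf[pos + 1 + len + i];
    return fnv1a32(buf + pos, len + 1) == stored;
}

/* Tries every start position from pos onward and returns the first one at
 * which a record hashes properly, or size if there is none. */
size_t record_resync(const unsigned char *buf, size_t size, size_t pos)
{
    while (pos < size && !record_valid_at(buf, size, pos))
        pos++;
    return pos;
}
```

After a damaged region the reader calls `record_resync` and carries on from the returned offset. A short hash can match by accident at a wrong position, so a real format would want a longer hash or a cross-check against the following record.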

Add sync markers and tags to binary files.

Posted Nov 22, 2011 6:31 UTC (Tue) by eru (subscriber, #2753) [Link]

> This is because text has a record delimiter that is not used within the records (the newline character), making synchronization trivial.

Binary format can easily have the same property: A synchronization marker (not necessarily a single byte) that is guaranteed to not appear in the data. This means the actual data needs some processing to avoid the marker, but this can be cheaper than a conversion to text. E.g. if your sync marker is 0x55, double it if it appears in the payload data. Some other byte combinations starting with 0x55 could tag the type of following data (date, numbers of different sizes, string etc), which also helps parse possibly corrupted files.
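The doubling scheme can be sketched as follows. The framing chosen here (a 0x55 marker, one tag byte that is never 0x55, then the stuffed payload) is one possible reading of the description above, and `frame_encode` and `frame_decode` are hypothetical names.

```c
#include <stddef.h>

#define SYNC 0x55

/* Writes one frame: SYNC, a tag byte (which must not be SYNC), then the
 * payload with every SYNC byte doubled. Returns the number of bytes
 * written; out needs room for 2 + 2 * len bytes in the worst case. */
size_t frame_encode(unsigned char tag, const unsigned char *payload,
                    size_t len, unsigned char *out)
{
    size_t n = 0;
    out[n++] = SYNC;
    out[n++] = tag;
    for (size_t i = 0; i < len; i++) {
        out[n++] = payload[i];
        if (payload[i] == SYNC)
            out[n++] = SYNC;            /* escape by doubling */
    }
    return n;
}

/* Decodes the frame that starts at in[0]. Stops at the end of the input or
 * at a lone SYNC, which marks the start of the next frame. Stores the tag,
 * returns the payload length and sets *consumed to the input bytes used.
 * Returns 0 with *consumed == 0 if in does not start with a frame. */
size_t frame_decode(const unsigned char *in, size_t in_len,
                    unsigned char *tag, unsigned char *out, size_t *consumed)
{
    size_t i = 2, n = 0;
    *consumed = 0;
    if (in_len < 2 || in[0] != SYNC || in[1] == SYNC)
        return 0;
    *tag = in[1];
    while (i < in_len) {
        if (in[i] == SYNC) {
            if (i + 1 < in_len && in[i + 1] == SYNC) {
                out[n++] = SYNC;        /* doubled marker is payload data */
                i += 2;
                continue;
            }
            break;                      /* lone marker: next frame begins */
        }
        out[n++] = in[i++];
    }
    *consumed = i;
    return n;
}
```

When reading sequentially the decoder always knows whether a 0x55 is data or a frame start. Resynchronizing from an arbitrary offset inside a run of 0x55 bytes is ambiguous with a single-byte marker, which is one argument for the longer markers mentioned above.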

The Journal - a proposed syslog replacement

Posted Dec 20, 2011 8:03 UTC (Tue) by topher (guest, #2223) [Link]

> That's not an argument against binary logging.

Actually, it is. And it's a valid one. Take the case of a corrupted block on your filesystem. We're not caring how it got there (physical problem, filesystem problem, whatever), but it's there. If you've got a text file with a corrupted chunk in it, you can generally recover everything except for the corrupted part with no special tools.

Now imagine the same scenario, but with a binary file. You better hope and pray that whoever wrote the tools for that (currently undocumented) binary format has specialized tools for analyzing and recovering from file corruption. Otherwise, there's an excellent chance you just lost that entire file. Depending on the internal format of the file, you might be screwed no matter what (especially if the file is doing internal compression functionality, which could mean your corruption just "infected" your entire file).

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 19:28 UTC (Tue) by JesseDyer (guest, #81543) [Link] (2 responses)

Wow, if only all of this energy were spent solving the real problem (how the attacker gets in) rather than the tangential problem (how do we know he got there), perhaps all of this would be rendered... moot?

I keep hearing the same pattern in this thread over and over again, which is something like this:

"Once we get the binary file, we'll use tool X to convert it into a text file so that we can actually make use of the thing.."

Am I the only one that appreciates the irony here?

The Journal - a proposed syslog replacement

Posted Nov 22, 2011 20:27 UTC (Tue) by raven667 (subscriber, #5198) [Link]

> Wow, if only all of this energy were spent solving the real problem (how the attacker gets in) rather than the tangential problem (how do we know he got there), perhaps all of this would be rendered... moot?

If only the world were that simple. You are never going to be able to prevent a motivated attacker from getting into your system. That's not really practically possible, new vulnerabilities will be discovered at the same rate or faster than you can plug them. What you can do, and what is a better use of resources, is a strong audit capability so that you can throw the bums out as soon as they get in, hopefully before they can cause serious damage. Not that you shouldn't patch as well but all the patching in the world will never be enough.

The world is a harsh place, you can't prevent all bad things from happening all the time but you can respond quickly with strength and resilience when they do.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 9:06 UTC (Wed) by anselm (subscriber, #2796) [Link]

There are various interesting things one could straightforwardly do with a well-designed binary log file format that are difficult to do with the current free-for-all text files. Efficient search according to (a combination of) different criteria is perhaps the most obvious candidate.

I agree that it is important to have a tool that will dump the binary file to a readable text file if required, but it seems to me that much of the »binary format sucks, over my dead body« we hear here is due to a Pavlov reflex.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 4:54 UTC (Wed) by ziggyfish (guest, #81547) [Link]

One of the big problems with implementing a binary-based log file system is, as mentioned already, that only one tool can read it correctly. And you lose the security if someone finds out how the data is stored. It's the exact same "problem" as with a text-based log file. The other problem is that a lot has to change in order for programs to work with the new system, i.e. dmesg needs to be changed, and boot loaders (including GRUB) have to add tools to allow people to read logs in case something goes wrong with the system and it can't boot.

Also I cannot see how you can use the same methods as before to clean the logs and to back up these logs. For example, what happens if you want to view that log file in Windows? How would you do this? From a tool point of view you need the correct security tokens (which can be easily bypassed), etc.

You also have to remember that routers use the syslog protocol to send messages to a unix system in case of DDOS, DOS, port scans etc. How will this be handled?

I don't like the move; it defeats the whole point of UNIX. Everything in UNIX is a text file. Anyone can add to it and anyone can remove lines from it. It is up to the kernel and the logging program to control who can do what. The point about the syslog files being text only is moot, as normally only root can write to the file and a binary file has the same "problem".

Why not use XML or something like that, so that tools can still read and parse it and the log files can still be read even if the system cannot boot, and provide a tool that can control access to the log file?

Let's not follow some stupid Windows way of doing it when we have a tried and tested way we have used for years.

On a side note, should we even be looking at the security of the log file when only root can change it? If the hacker has root access, they can do a lot more dangerous things than change a log file.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 5:01 UTC (Wed) by ziggyfish (guest, #81547) [Link] (3 responses)

One last thing: you're already told who logged the message, as the first part of the message that gets logged is the process that logged it and the process's ID.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 10:09 UTC (Wed) by mpr22 (subscriber, #60784) [Link] (1 responses)

A quick look at /var/log/syslog on my desktop Ubuntu box suggests that this turns out to only partly be the case. There are several programs (at a minimum: acpid, NetworkManager, modem-manager, dhcpd), which do not have a PID in their syslog messages.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 21:40 UTC (Wed) by ziggyfish (guest, #81547) [Link]

Not by default; however, it is still possible to add this information into the log.

The Journal - a proposed syslog replacement

Posted Nov 23, 2011 23:31 UTC (Wed) by dlang (guest, #313) [Link]

that program name and pid is data provided as part of the log written by the application, so the application can lie about both.

However, fixing this doesn't require making changes that are nearly this drastic.

systemd is already planning to create a new container for each application (with cgroups, etc.). Have it create a new filesystem namespace as well, with a different /dev/log, and the existing modern syslog daemons (rsyslog and syslog-ng) can record which container the log came from.
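
The point that the program name and PID are supplied by the application itself can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the line layout is a simplified "<PRI>TAG[PID]: text" form (timestamp and hostname omitted), and build_line and parse_line are hypothetical helper names, not part of any syslog daemon.

```python
import re

# Simplified layout: <PRI>TAG[PID]: text  (the [PID] part is optional).
LINE_RE = re.compile(
    r"^<(?P<pri>\d+)>(?P<tag>[^\[\]:\s]+)(?:\[(?P<pid>\d+)\])?: (?P<text>.*)$"
)

def build_line(tag, text, pid=None, pri=13):
    """Format a line the way a sending application would (hypothetical helper)."""
    ident = tag if pid is None else "%s[%d]" % (tag, pid)
    return "<%d>%s: %s" % (pri, ident, text)

def parse_line(line):
    """Extract tag, PID and text; returns None if the line does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    pid = m.group("pid")
    return {
        "tag": m.group("tag"),
        "pid": int(pid) if pid is not None else None,
        "text": m.group("text"),
    }

# Any process can claim to be "sshd" with PID 1: the receiver only sees text.
forged = build_line("sshd", "Accepted password for root", pid=1)

# Some programs send no PID at all.
no_pid = build_line("NetworkManager", "link up")

print(parse_line(forged))
print(parse_line(no_pid))
```

Nothing in the parsed result distinguishes a genuine sender from a forged one, which is why the metadata would have to come from the receiving side (the kernel or the logging daemon) rather than from the message text.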

Not very thought through proposal

Posted Nov 23, 2011 12:07 UTC (Wed) by job (guest, #670) [Link] (4 responses)

Each log entry authenticates the previous ones? So in order to remove a few lines from the log, you have to rewrite the log since that point in time?

Somehow I doubt this will pose a problem to an attacker who is so stealthy he/she manipulates logs. Most attackers just wipe them. That's why remote logging was invented.

After all, you will need a toolset to handle these logs for reading, searching and writing. I'm sure there will be a tool in this toolset which rewrites logs (just like there is for git).

The part where you store the root seed which authenticates the whole logs is also quite opaque. I guess you have to store a new seed every time you rotate logs, otherwise you'll be completely unable to authenticate anything when (parts of) an older log go missing. So you'd have to have a remote logging protocol which logs these seeds continuously, at which point you could just let it log everything and be done with it.
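
The chaining argument above can be made concrete with a minimal sketch. It assumes SHA-256 and a trivial in-memory entry layout; it is not the journal's actual on-disk format, and chain and verify are hypothetical names chosen for illustration.

```python
import hashlib

def chain(messages, seed=b""):
    """Return (message, digest) pairs; each digest covers the previous digest."""
    entries = []
    prev = hashlib.sha256(seed).digest()
    for msg in messages:
        prev = hashlib.sha256(prev + msg.encode("utf-8")).digest()
        entries.append((msg, prev))
    return entries

def verify(entries, seed=b""):
    """Recompute the chain and check every stored digest."""
    prev = hashlib.sha256(seed).digest()
    for msg, digest in entries:
        prev = hashlib.sha256(prev + msg.encode("utf-8")).digest()
        if prev != digest:
            return False
    return True

log = chain(["boot", "sshd: login root from 10.0.0.5", "cron: job ran"])
saved_head = log[-1][1]  # the value that would be copied to a remote location

# Deleting a line in place (the 'sed' approach) breaks the chain.
tampered = [log[0], log[2]]
print(verify(tampered))

# Rewriting everything from the deletion point gives an internally valid chain...
rewritten = chain([msg for msg, _ in tampered])
print(verify(rewritten))

# ...which is only detectable by comparing against the externally saved head.
print(rewritten[-1][1] == saved_head)
```

In this sketch a rewritten log verifies on its own, so the detection rests entirely on the head hash that was saved somewhere the attacker cannot modify.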

I'm all for enforcing stricter inter-application log formats to make them easier to parse, but this is just solving made up problems instead of the real ones just because it's easier.

Not very thought through proposal

Posted Nov 25, 2011 9:57 UTC (Fri) by intgr (subscriber, #39733) [Link] (3 responses)

> Somehow I doubt this will pose a problem to an attacker who is so stealthy
> he/she manipulates logs. Most attackers just wipe them. That's why remote
> logging was invented.

Nope. I've seen intruders simply using 'sed' to delete the lines they want to hide. I do think that Journal will defeat log manipulation by many simpler attackers, simply because there are no distro-bundled tools to manipulate them.

Not very thought through proposal

Posted Nov 25, 2011 10:11 UTC (Fri) by dlang (guest, #313) [Link] (2 responses)

there will need to be distro bundled tools to manipulate the logs.

there are just too many cases where you need to deal with parts of the logs and you don't want to have to deal with all of the logs, so you need to do the equivalent of grep or grep -v

the only way it will slow the attacker down is as it is first introduced, via 'security by obscurity'. If it ever does become the standard, the attackers will just use the appropriate tools.

Not very thought through proposal

Posted Nov 25, 2011 22:27 UTC (Fri) by job (guest, #670) [Link] (1 responses)

Indeed there will be such tools. It is required as the format is undocumented binary.

The reason some attackers manipulate logs with sed today is simply because it's possible. They will use another tool when it's not.

I hope some of the proponents of this proposal would answer the details about remote seed logging and what problems this would solve as opposed to simply remote logging. This part is completely left out, and I still think the misunderstanding is on my part as it is simply a much too ill thought out proposal otherwise.

Not very thought through proposal

Posted Nov 25, 2011 23:31 UTC (Fri) by dlang (guest, #313) [Link]

one argument could be that remote seed logging is low bandwidth compared to full remote logging.

I don't think it's a compelling argument (especially in light of all the complexity involved here, etc), but it's an argument.

by the way, it turns out that there is an RFC on how to properly secure logs, RFC 5848, that has been through the mill of analysis, both from a crypto point of view and for its limitations (http://www.gerhards.net/download/log_hash_chaining.pdf)

The Journal - a proposed syslog replacement

Posted Nov 24, 2011 22:16 UTC (Thu) by jacob22 (guest, #81577) [Link] (1 responses)

Binary formats require tools to read them. Usually you have a single library for a specific format. This creates a single point of failure. It becomes easy for the bad guy to target the tool rather than the data.

A very big strength of syslog is that it can be read by a large number of tools, from cat to LibreOffice. The multitude of tools has saved me on several occasions when I have been rootkitted.

The Journal - a proposed syslog replacement

Posted Nov 25, 2011 13:51 UTC (Fri) by lindi (subscriber, #53135) [Link]

Doesn't that mean that the attacker can choose which tool to attack and always pick the weakest? For example if you use "cat" then you can be subject to issues described in the 2003 paper titled "TERMINAL EMULATOR SECURITY ISSUES".

The Journal - a proposed syslog replacement

Posted Dec 5, 2011 14:06 UTC (Mon) by ndecker (guest, #81690) [Link]

The property that a hash validates the whole logging chain of all entries before could be used for some interesting things.

Somebody could set up a public logging server that accepts a new hash every minute for a registered system and creates a hash chain of all entries it receives. The storage needed would be pretty small. This is the "write once" medium that is practical because it needs to store only one hash per minute.

No information is disclosed to this service other than "I am logging".

Then there could be a recovery CD-ROM from the distribution which reads the journal of the public service and automatically validates all logs on the system.
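
As a rough sketch of the idea above, the storage cost and the server-side chaining can be worked through in Python. Everything here is an assumption made for illustration: SHA-256 digests (32 bytes), one submission per minute, and a hypothetical in-memory AnchorService class standing in for the public server.

```python
import hashlib

DIGEST_BYTES = hashlib.sha256().digest_size   # 32 bytes for SHA-256
MINUTES_PER_YEAR = 60 * 24 * 365              # 525600

def storage_per_year(digest_bytes=DIGEST_BYTES):
    """Bytes needed to keep one digest per minute for one system for a year."""
    return digest_bytes * MINUTES_PER_YEAR

class AnchorService:
    """Hypothetical stand-in for the public server; keeps everything in memory."""

    def __init__(self):
        self.head = hashlib.sha256(b"anchor-genesis").digest()
        self.received = []

    def submit(self, client_head):
        """Record a client's current head hash and fold it into the server chain."""
        self.received.append(client_head)
        self.head = hashlib.sha256(self.head + client_head).digest()
        return self.head

    def contains(self, client_head):
        """Check whether a given head hash was ever submitted."""
        return client_head in self.received

service = AnchorService()
minute_1 = hashlib.sha256(b"local journal state at minute 1").digest()
minute_2 = hashlib.sha256(b"local journal state at minute 2").digest()
service.submit(minute_1)
service.submit(minute_2)

print(storage_per_year())          # 16819200 bytes, roughly 16.8 MB per system
print(service.contains(minute_1))
```

A recovery tool would recompute the local chain up to each recorded minute and look the resulting head up on the service; the service itself only ever sees opaque digests.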


Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds