
The journald design is horrible to the point of useless

Posted Dec 1, 2011 8:26 UTC (Thu) by anselm (subscriber, #2796)
In reply to: The journald design is horrible to the point of useless by dlang
Parent article: That newfangled Journal thing

rsyslogd is a great piece of software, no doubt. However the fact that there is one outstanding implementation of a syslog receiver (or two, if you want to count syslog-ng, too) doesn't really save syslog as such. What would be needed is a thorough refurbishment of the client-side syslog API, and then to get more of what rsyslogd does into other syslogd implementations (server-side and client-side).

If nothing else, the journald proposal helps us identify what a next-generation syslog API should look like. It seems that Rainer Gerhards has already adopted some journald ideas for rsyslogd, and if journald does indeed lead to improvements in syslog output storage, too (preferably standardised and well-documented ones), then it has already been useful.

Like dlang, I can only recommend Rainer's piece to anybody who is interested in an informed discussion. As he says himself, he is biased in one direction while the journald proposal is biased in another, but he is clearly very well-informed about the issues at hand and it shows. It would be interesting to hear Lennart and Kay comment on his article.


The journald design is horrible to the point of useless

Posted Dec 1, 2011 12:04 UTC (Thu) by dlang (guest, #313) [Link] (13 responses)

If you exclude rsyslog and syslog-ng, how many syslog implementations are in use nowadays?

On Linux, sysklogd has been poorly maintained for a long time, and has been replaced by rsyslog on just about every distro. Older distro releases won't have it, but they won't have systemd or the journal either.

In terms of a log store, what is it that you are looking for? As I see it, the needs of the log store vary drastically, not just from person to person, but also from use to use.

Yes, sometimes you want to do a search for a particular type of log event, but a lot of the time you want to see what happened around a particular time.

syslog does not dictate how the data is stored, and you can even store the same data in many different ways (on different machines, and even shard it across farms of machines)

Yes, the syslog() call could use improving, but since the journal proposal says that it's going to remain compatible with it, it doesn't fix it.

Also, I see it less as Rainer adopting some journald ideas than as the journald people not being aware of what was already available (the trusted properties being one exception)

The journald design is horrible to the point of useless

Posted Dec 1, 2011 13:25 UTC (Thu) by anselm (subscriber, #2796) [Link] (12 responses)

> If you exclude rsyslog and syslog-ng, how many syslog implementations are in use nowadays?

The BSDs apparently have their own implementation. MacOS X probably has, too. There are likely syslogd implementations in all sorts of appliances.

This doesn't matter as much for the purposes of journald, which tries to reboot the franchise to a large extent, but as far as tweaking the existing syslog mechanisms is concerned, it is just as well to be aware that relying on one single implementation for all the improvements isn't all that different from coming up with something completely new and different.

> Yes, sometimes you want to do a search for a particular type of log event, but a lot of the time you want to see what happened around a particular time.

I maintain mail servers (among other things). I would be pleased to see a way of grouping all log entries that pertain to one particular mail message, or one particular sender or recipient. Right now this means grepping /var/log/mail.log and friends, which sucks, particularly when log files get rotated.
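To make the grepping problem concrete, here is a minimal Python sketch of grouping mail log lines by queue ID. The line layout and the queue ID pattern are assumptions modelled on typical Postfix output, and the sample lines are invented; real logs differ between versions and configurations, and rotated files would still have to be fed in one by one.

```python
import re
from collections import defaultdict

# Assumed Postfix-style shape: "... postfix/<daemon>[pid]: <QUEUEID>: <rest>".
# Real queue IDs and line layouts vary by version and configuration.
QUEUE_ID = re.compile(r"postfix/\w+\[\d+\]: ([0-9A-F]{6,}): (.*)")

def group_by_queue_id(lines):
    """Collect the message text of each line under its queue ID, in input order."""
    groups = defaultdict(list)
    for line in lines:
        match = QUEUE_ID.search(line)
        if match:
            groups[match.group(1)].append(match.group(2))
    return dict(groups)

# Invented sample lines, for illustration only.
SAMPLE = [
    "Dec  1 13:25:01 mx postfix/smtpd[411]: 3F2A1C0042: client=host.example[192.0.2.7]",
    "Dec  1 13:25:01 mx postfix/cleanup[415]: 3F2A1C0042: message-id=<abc@example.org>",
    "Dec  1 13:25:02 mx postfix/qmgr[102]: 9B7E0D0013: removed",
    "Dec  1 13:25:03 mx postfix/smtp[420]: 3F2A1C0042: to=<bob@example.net>, status=sent",
]

if __name__ == "__main__":
    for queue_id, events in group_by_queue_id(SAMPLE).items():
        print(queue_id, len(events))
```

This only works because one program's format was hard-coded into the regular expression; every other package in the mail chain would need its own pattern.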

Improving this would mean getting the developers of a diverse bunch of software packages (Postfix, Amavisd, policyd-weight, …) to adopt a common style of structuring log messages (RFC 5424 is a definite start but really what we need to standardise here is semantics). It would also mean using a log store that actually allows querying for fields such as »message ID«, »queue ID«, or »sender address« in a reasonably efficient fashion, which the current simple-minded method of dumping everything into a text file certainly doesn't. Even dumping everything into a database doesn't unless you actually normalise the individual fields (or an interesting subset of them). I'm not holding my breath.
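As a rough sketch of what "normalising the individual fields" could look like, the following Python uses the standard sqlite3 module as a stand-in for a real log store. The table layout, the column names and the sample rows are all hypothetical; nothing here is what rsyslogd or journald actually does.

```python
import sqlite3

def build_store(rows):
    """Create an in-memory store with one column per normalised field."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE mail_log ("
        " ts TEXT, program TEXT, queue_id TEXT, sender TEXT, message TEXT)"
    )
    # Indexes on the normalised fields are what make field lookups cheap,
    # as opposed to scanning one big text column.
    db.execute("CREATE INDEX idx_queue ON mail_log (queue_id)")
    db.execute("CREATE INDEX idx_sender ON mail_log (sender)")
    db.executemany("INSERT INTO mail_log VALUES (?, ?, ?, ?, ?)", rows)
    return db

def events_for_queue_id(db, queue_id):
    """Return (ts, program, message) for one queue ID, oldest first."""
    cur = db.execute(
        "SELECT ts, program, message FROM mail_log"
        " WHERE queue_id = ? ORDER BY ts",
        (queue_id,),
    )
    return cur.fetchall()

# Invented rows, for illustration only.
SAMPLE_ROWS = [
    ("2011-12-01T13:25:01Z", "postfix/smtpd", "3F2A1C0042", None, "client connected"),
    ("2011-12-01T13:25:02Z", "postfix/qmgr", "9B7E0D0013", "eve@example.org", "removed"),
    ("2011-12-01T13:25:03Z", "postfix/qmgr", "3F2A1C0042", "alice@example.org", "queue active"),
]

if __name__ == "__main__":
    store = build_store(SAMPLE_ROWS)
    for event in events_for_queue_id(store, "3F2A1C0042"):
        print(event)
```

The hard part is not this schema but getting every program to emit the fields consistently in the first place.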

Given that it involves applications rather than infrastructure, the first goal is of course independent of the »rsyslogd vs. journald« issue but it must be said that journald's binary-format normalised log store, in principle, goes a way towards realising the second goal that, so far, rsyslogd doesn't. Of course nobody says that the current journald proposal is the be-all and end-all of logging, so if all that results from it is that Rainer gets prodded into implementing a more useful log store for structured messages then something important will be gained.

> Yes, the syslog() call could use improving, but since the journal proposal says that it's going to remain compatible with it, it doesn't fix it.

The journald proposal doesn't actually say that. It says:

> [T]he syslog API syslog(3) is supported as first-class interface to write log messages, and continues to be the primary API for all simple text logging. However, as soon as meta data (especially binary meta data) shall be attached to an entry the native journal API should be used instead.

From Lennart's and Kay's point of view this makes perfect sense, because it allows them to put improvements into the native API, considering that current syslog() users are unlikely to change their code (if they did that just for journald, they could just as well upgrade it to use the native API).

If we want to hang on to the current syslog scheme, the first thing that it would make sense to do would be to implement a client-side API for RFC 5424 structured-data logging that would guarantee well-formed log messages, and to add this to glibc. The next thing would be to agree on a scheme for identifying facilities that is more flexible and fine-grained than the 24 arbitrary and hard-coded facility codes, many of which do not apply to modern systems. Whether this happens by allowing more numbers or by standardising a structured-data element would be subject to debate.
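To illustrate what such a client-side helper might guarantee, here is a small Python sketch that builds one RFC 5424 line with a single structured-data element. The function names are hypothetical, the SD-ID `mail@32473` borrows the example enterprise number used in the RFC's own examples, and the header fields are passed through unchecked, so this is far from a complete implementation.

```python
def escape_param_value(value):
    """Escape the three characters RFC 5424 requires escaping in PARAM-VALUE."""
    # Backslash first, so the escapes added afterwards are not doubled.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("]", "\\]")

def valid_sd_name(name):
    """SD-NAME: 1 to 32 printable ASCII characters, excluding '=', space, ']' and '"'."""
    return 0 < len(name) <= 32 and all(
        33 <= ord(c) <= 126 and c not in '="]' for c in name
    )

def format_rfc5424(facility, severity, timestamp, host, app, procid, msgid,
                   sd_id, params, msg):
    """Build one RFC 5424 line with a single structured-data element."""
    if not (0 <= facility <= 23 and 0 <= severity <= 7):
        raise ValueError("facility must be 0..23 and severity 0..7")
    for name in [sd_id, *params]:
        if not valid_sd_name(name):
            raise ValueError(f"invalid SD name: {name!r}")
    pri = facility * 8 + severity  # the hard-coded facility codes live in here
    sd = "[" + sd_id + "".join(
        f' {key}="{escape_param_value(value)}"' for key, value in params.items()
    ) + "]"
    return f"<{pri}>1 {timestamp} {host} {app} {procid} {msgid} {sd} {msg}"

if __name__ == "__main__":
    print(format_rfc5424(2, 6, "2011-12-01T13:25:00Z", "mx", "postfix", "411", "-",
                         "mail@32473", {"queueId": "3F2A1C0042"}, "queued"))
```

Note how the facility is folded into the PRI value as `facility * 8 + severity`, which is why extending the set of facilities is not a trivial change.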

> Also, I see it less as Rainer adopting some journald ideas than as the journald people not being aware of what was already available (the trusted properties being one exception)

Trusted properties, an efficient log store, generality, performance, … there are various things in the journald proposal that rsyslogd has yet to address. As I said, if all that comes out of the journald idea is a better syslogd then we have gained something already. Finally, it would still be fair to allow Lennart and Kay some credit for the cojones to tackle yet another difficult problem, and for getting things moving. The way I see this discussion going, many people (Rainer notably not included) seem to call them stupid for even trying, with no other justification than »We don't like PulseAudio and systemd, and rsyslogd does everything already, nyah, nyah, nyah«, which is (a) untrue and (b) disingenuous.

The journald design is horrible to the point of useless

Posted Dec 1, 2011 13:47 UTC (Thu) by dlang (guest, #313) [Link] (11 responses)

>> If you exclude rsyslog and syslog-ng, how many syslog implementations are in use nowadays?

> The BSDs apparently have their own implementation. MacOS X probably has, too. There are likely syslogd implementations in all sorts of appliances.

> This doesn't matter as much for the purposes of journald, which tries to reboot the franchise to a large extent, but as far as tweaking the existing syslog mechanisms is concerned, it is just as well to be aware that relying on one single implementation for all the improvements isn't all that different from coming up with something completely new and different.

syslog-ng and rsyslog both work across different flavors of Unix. Since LP says that we should only develop for Linux and all other flavors should go hang, this is definitely not an argument in favor of the journal.

there are two independent implementations (not providing the exact same features, but competing and adding features)

but both of these implementations are making incremental improvements, not "throw out everything that's worked up until now because I know better"

getting applications to structure their logs is completely independent of what logging mechanism they use. Look at the work that CEE is doing to try and define standards (and then watch to see if anyone bothers to use these standards when writing logs). I don't believe that a binary format is going to help this any.

that being said, rsyslog does include the ability to define log parsers that create fields that can then be manipulated by rsyslog (either for decision-making by rsyslog, or for writing to a log store). See http://www.rsyslog.com/doc/mmnormalize.html for a hint on how to invoke this in rsyslog (it then creates $!<name> properties that can be used)

I would strongly suggest that if the default log store you are dealing with does not meet your needs, that you should look at reconfiguring the systems that you manage to use a log store that better fits your needs.

then you say

> Trusted properties, an efficient log store, generality, performance, … there are various things in the journald proposal that rsyslogd has yet to address.

addressing your wish list

rsyslog now has trusted properties (it already had trusted PID)

it supports many different log stores, you need to define what you mean by 'an efficient log store' more precisely before anyone can tell you which one is best for you.

generality seems to be a clear win for rsyslog compared to the journal proposal and its lock-in to systemd and Linux

as far as performance goes, rsyslog can process a million logs a second (of the right log type, on the right hardware). I seriously doubt that the journal will do nearly as well (especially in its first implementation), so you must mean something different in terms of performance. What is it that you are looking for in the logging daemon that makes you think that the journal (which doesn't exist yet) will outperform existing, tuned and tested code?

The journald design is horrible to the point of useless

Posted Dec 1, 2011 14:51 UTC (Thu) by anselm (subscriber, #2796) [Link] (10 responses)

> syslog-ng and rsyslog both work across different flavors of Unix. Since LP says that we should only develop for Linux and all other flavors should go hang, this is definitely not an argument in favor of the journal.

One difference between you and me is that you seem to be out to prove Lennart Poettering is an idiot while I'm interested in better logging in general, so I'm willing in principle to see some good in the journald proposal. I don't believe that journald as per the proposal is 100% pure gold, but that some of its properties deserve being looked at rather than thrown out outright. (I'm pleased to note that Rainer's approach is apparently fairly similar to mine.)

In particular, I don't think Lennart and Kay are at their most brilliant when considering only Linux-based logging, where logging (to a much larger extent than, say, service management à la systemd) often goes across platform boundaries, and so I'm personally in favour of approaches that do work on various systems. This is one issue that I have with journald – not an insurmountable one but an important one.

On the other hand, while it is obviously possible to install rsyslogd on other flavours of Unix, I personally, before doing anything like that, would want to convince myself that those other flavours of Unix have not made their own tweaks to their own syslogds that rsyslogd doesn't support. With a platform that is as ill-defined and fragmented as syslog this may be an issue. Right now we have one standard for syslog that is wildly obsolete and a newer, better one that people do not seem to be exactly falling over themselves to implement, which is not a good situation either.

> addressing your wish list

First of all, this isn't »my wish list«, just a few points I quoted from the proposal.

The performance issue, as far as I'm concerned, pertains to dealing with the actual logs rather than accepting messages. It is all well and good to say that I could in theory implement a log store for rsyslogd that does what I want. Journald apparently has implemented such a log store already, and IMHO actual code beats code that might eventually be written at some point. (And incidentally I did outline what I'd like to see in a log store for e-mail logs; if you'd care to point me at existing rsyslogd-based code that implements something along these lines – preferably as a package for Debian Squeeze –, then I'd be very happy.)

Right now the answer to various points in Lennart's and Kay's proposal is »yes, we could do that too, if we wanted«. Not all of these points appear to be desperately required, but once again I'm not here to prove that journald is better than rsyslogd, I'm interested in what we can learn from the journald proposal in order to improve logging in general. If the journald proposal leads to a better rsyslogd (which at least concerning trusted properties it already has) and nothing else then we have already gained something.

The journald design is horrible to the point of useless

Posted Dec 2, 2011 4:36 UTC (Fri) by dlang (guest, #313) [Link] (9 responses)

I have never said that LP is an idiot. If he were, I wouldn't care what he said, as such people don't influence Linux much.

I do believe that he is wrong, and I give him the benefit of the doubt and assume that he failed to research the current state of the art around syslog (and the fact that Rainer says that LP never talked to him about the journal reinforces this), because otherwise he is being deliberately misleading in his statements.

The fact that anyone would propose such a drastic change without learning the state of the art significantly undermines the credibility of their paper as far as I am concerned.

syslog-ng and rsyslog are both commonly used on *BSD and solaris, so while you may want to do research to see if there are missing tweaks, others have done so already.

there are many different ways of storing logs, different ways have different benefits and drawbacks when accessing the logs

the traditional text files rolled periodically are great for looking at what happened around a particular time (including especially, what is happening now, or has happened just recently)

However, they are horrible at doing searches for previously unexpected information over a large volume of logs.

If you know what you are expecting to be looking for, you can tailor your log store to make it easy to find that, but assuming that you don't want to do so, a couple 'off the shelf' approaches to rapidly finding things in large log volumes are:

1. log to a PostgreSQL database and enable the full-text indexing (I believe it's called GiST indexing), this will let you search rapidly for any text

2. log to Hadoop and use its search capabilities (including clustering) to search the logs.

in both cases this is a little more than just 'apt-get magic-tool', because you need to configure the database and then configure rsyslog to write to the database, but all the pieces are there, it really is just a matter of configuring them.

this is more than just "yes we could do that if we wanted", it's "we already created the tools to do that, and people are using the tools to do that, you can do the same"

The journald design is horrible to the point of useless

Posted Dec 2, 2011 9:04 UTC (Fri) by anselm (subscriber, #2796) [Link] (8 responses)

> syslog-ng and rsyslog are both commonly used on *BSD and solaris, so while you may want to do research to see if there are missing tweaks, others have done so already.

Yes, but these are the open-source Unices. I'd be more worried about the likes of MacOS X, AIX, or HP-UX.

> the traditional text files rolled periodically are great for looking at what happened around a particular time (including especially, what is happening now, or has happened just recently)

Great in that the messages in question do end up near one another. Not so great in that you need to scan a file (or a whole bunch of files) sequentially in order to find where that »particular time« is in the log. (One could probably do per-file binary search based on the log time stamps but if there is a tool to do that it is definitely not in widespread use.) Also it turns out that, like in the e-mail case, related log messages do not always happen to occur around a particular time.
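For what it's worth, a per-file binary search on time stamps is not much code. The Python sketch below assumes every line starts with a fixed-width, sortable ISO 8601 UTC timestamp and that lines are in time order; traditional syslog timestamps (no year, local time) would need real parsing first. The function names and sample data are made up for illustration.

```python
import io

def timestamp_of(line):
    # Assumes each line starts with a 20-byte ISO 8601 UTC timestamp.
    return line[:20]

def line_start_at_or_after(f, pos):
    """Position f at the first line start at or after byte offset pos; return it."""
    if pos == 0:
        f.seek(0)
        return 0
    f.seek(pos - 1)
    f.readline()  # consume the rest of the line containing byte pos - 1
    return f.tell()

def seek_to_time(f, target):
    """Return the offset of the first line with timestamp >= target (file size if none)."""
    f.seek(0, io.SEEK_END)
    lo, hi = 0, f.tell()
    while lo < hi:
        mid = (lo + hi) // 2
        line_start_at_or_after(f, mid)
        line = f.readline()
        if not line or timestamp_of(line) >= target:
            hi = mid        # the wanted line starts at or before this probe's line
        else:
            lo = mid + 1    # the wanted line starts after this probe
    return line_start_at_or_after(f, lo)

# Invented sample data; each line is 23 bytes long.
SAMPLE = (
    b"2011-12-02T04:36:00Z a\n"
    b"2011-12-02T09:04:00Z b\n"
    b"2011-12-02T14:18:00Z c\n"
    b"2011-12-02T17:07:00Z d\n"
)

if __name__ == "__main__":
    log = io.BytesIO(SAMPLE)
    offset = seek_to_time(log, b"2011-12-02T09:00:00Z")
    log.seek(offset)
    print(log.readline())
```

This needs a number of seeks logarithmic in the file size, but it does nothing about rotated or compressed files, where one still has to pick the right file first.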

I still like the systemd/journald idea of including the last few lines of log output produced by a system service in a »service status« output. This is wildly impractical under the current scheme since there is no way to tell in which file(s) a service's output even ends up (short of parsing the syslogd configuration), and even then you need to scan the file(s) sequentially to find what you're interested in. You end up having to tweak all your init scripts and/or reinvent large parts of your logging infrastructure. Systemd and journald suddenly start looking pretty sweet by comparison. (And yes, the stock answer to that is »We didn't need this for the last 30 years, it is unnecessary, nobody should bother, systemd sucks.«)

> in both cases this is a little more than just 'apt-get magic-tool', because you need to configure the database and then configure rsyslog to write to the database, but all the pieces are there, it really is just a matter of configuring them.

And if the pieces weren't there, it would really just be a matter of putting them in? Seriously, I think you're making it a bit too easy on yourself here. Especially since the real problem is not logging stuff to a database (which rsyslogd can do if you ask it nicely), it is getting at the stuff afterwards – which again is something that Lennart and Kay do acknowledge in their proposal. Getting rid of old stuff would also be something one would have to consider at some point. Full-text search in databases is nice but it would be even better to be able to exploit the structure that some types of log do have, that RFC 5424 suggests, and that journald pretty much enforces. Right now this requires considerable manual work, and »we created the tools to do that« is over-optimistic – »we created the tools to enable people to build the tools to do that on their own« would be closer to the truth (and is already not a bad thing, but a lot less than what you claim).

The journald design is horrible to the point of useless

Posted Dec 2, 2011 14:18 UTC (Fri) by dlang (guest, #313) [Link] (7 responses)

When I talk about rsyslog running on Solaris, I'm not talking about OpenSolaris. As for the other flavors of Unix, if any users of them care enough about rsyslog or syslog-ng running on that platform it will happen (either from contributed code or from paying for support). But in any case the existing syslog daemons do just fine receiving messages from those systems (and various other proprietary systems like Cisco) and then handling those logs just like all the other logs. This means that you can combine them in whatever log store you choose to use. You won't get things like trusted attributes from such systems, but that's because those systems don't offer that capability, and journald sure isn't going to solve this since it won't run on those systems (and can't make such systems provide info that isn't there in the first place)

finding which file in a series of time-rotated files a particular time is in is pretty trivial, as is finding the time in the log file once you have opened it (at least if you use a normal text editor). You can either do the binary search yourself by jumping to various line numbers, or just search for the timestamp you are looking for. You are really overstating the difficulty here. Yes, this can be done wrong (rolling the logs daily even if you have gigs of logs in a day is usually not a wise thing to do, for example)

the idea that you can get the log messages when you ask for the status of a program only works in the trivial case where all the logs are written by the one pid that was started by the program you are asking about the status. If that pid started other processes that then wrote logs, systemd (or equivalent) isn't going to have a way of knowing for sure which 'service' those log messages are for.

I don't believe that the Journal is going to enforce programs writing well structured logs any more than the Windows event log does (see Rainer's comment that in spite of the 'structure' being enforced by the Windows event log, the problem of analyzing logs on Windows is the same mess that it is on *nix)

It sounds as if you have decided that anything that LP writes is the Pony that you want and any criticism of it just means the person doing the criticizing is against all progress.

The journald design is horrible to the point of useless

Posted Dec 2, 2011 17:07 UTC (Fri) by raven667 (subscriber, #5198) [Link] (1 responses)

> If that pid started other processes that then wrote logs, systemd (or equivalent) isn't going to have a way of knowing for sure which 'service' those log messages are for.

I think this particular statement is probably not true. systemd uses the cgroups feature to associate all processes as part of a generic "service" or "session" which is not fooled by daemonization so it could reliably associate all logs across multiple PIDs with a particular "service".

The journald design is horrible to the point of useless

Posted Dec 2, 2011 19:05 UTC (Fri) by dlang (guest, #313) [Link]

this is a point I didn't consider, but identifying what cgroup something is in is going to be another racy lookup isn't it?

in any case, I've suggested to Rainer that he add this lookup to the trusted properties that rsyslog provides. Once that is there then it will be trivial to split the logs by service (and therefore trivial to do a tail of the most recent logs from any service)

The journald design is horrible to the point of useless

Posted Dec 2, 2011 17:16 UTC (Fri) by anselm (subscriber, #2796) [Link]

> the idea that you can get the log messages when you ask for the status of a program only works in the trivial case where all the logs are written by the one pid that was started by the program you are asking about the status.

AFAIK this is wrong. Systemd puts each service into its own control group. Processes that the service generates stay in the same control group. The system status command can then consider all messages from the same control group, or the same systemd service, in order to catch everything from the service process and its children. This is actually in the proposal.
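As an illustration of the control-group lookup, the Python sketch below parses the text of /proc/<pid>/cgroup, whose lines have the form hierarchy-ID:controller-list:path. The path layouts in the examples (/system/postfix.service and /system.slice/cron.service) are assumptions about how systemd names its groups, and the helper names are hypothetical; this is not how journald itself obtains the information.

```python
def parse_cgroup(text):
    """Map each controller list to its cgroup path, one entry per input line."""
    result = {}
    for line in text.splitlines():
        parts = line.split(":", 2)  # hierarchy-ID : controller-list : path
        if len(parts) == 3:
            _hierarchy, controllers, path = parts
            result[controllers] = path
    return result

def service_of(text):
    """Return the first '*.service' path component found, or None."""
    for path in parse_cgroup(text).values():
        for component in path.split("/"):
            if component.endswith(".service"):
                return component
    return None

def service_of_pid(pid):
    # Linux-only: reads the kernel's current view of a live process.
    with open(f"/proc/{pid}/cgroup") as f:
        return service_of(f.read())

if __name__ == "__main__":
    print(service_of("3:cpu:/\n1:name=systemd:/system/postfix.service\n"))
```

Child processes inherit their parent's control group, so the same lookup works for anything the service forks. Doing it through /proc after the fact does fail if the process has already exited, which is the race raised elsewhere in this thread.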

That this is a lot more difficult to do with the current infrastructure (SysV init and (r)syslogd) doesn't mean that it is, in fact, impossible in general. Systemd is not a complete exercise in futility.

> It sounds as if you have decided that anything that LP writes is the Pony that you want and any criticism of it just means the person doing the criticizing is against all progress.

No. As far as I am concerned it is absolutely OK to criticise Lennart's and Kay's proposal, just as it is absolutely OK to criticise the existing syslog infrastructure and protocol. Progress results not from hanging on to the existing stuff at all costs, but from critical evaluation of both the old and any newly proposed stuff, and from putting the best ideas of both together to create something that is better than what we had before. Whether that is, in the end, a beefed-up syslogd or something entirely different is irrelevant as long as it solves the problems at hand and there is a reasonable compatibility path to the existing infrastructure. You will note that this also seems to be Rainer's attitude, although he is (for understandable reasons) somewhat biased towards the »beefed-up syslogd« end of the spectrum of possible solutions. (Which is of course OK.)

I've been using various syslogd implementations for a very long time and have usually been able to get them to do what I need, but that doesn't mean I'm blind to possible improvements that may come in from elsewhere. This IMHO is a better approach than dismissing all of a number of possible improvements outright just because one does not like the people who came up with them. No one (not even Lennart or Kay) believes that the journald proposal solves all possible problems with logging, but pretending that there are no problems with logging at all, or no serious problems, or that if there are in fact problems then the journald proposal doesn't solve them either, doesn't actually lead to progress.

The journald design is horrible to the point of useless

Posted Dec 2, 2011 17:28 UTC (Fri) by raven667 (subscriber, #5198) [Link] (3 responses)

> It sounds as if you have decided that anything that LP writes is the Pony that you want and any criticism of it just means the person doing the criticizing is against all progress.

I can't speak for the OP, but from my perspective, as someone who is not personally invested in the outcome, about 80% of the comments seem to be baseless negative personal animosity against LP or an appeal to tradition against progress. Half of the other 20% are making actual technical arguments pointing out flaws in the proposal and the other half are defending LP, pointing out positives about the proposal or just advocating keeping an open mind. Those that are defending tend to be responding more to the 80% than to the actual constructive criticism of the 10%.

Some of the constructive criticism is very persuasive and I'm not at all sure that the journal is the right way to go, unlike with systemd which was so obviously the right, UNIX, way to do things that I wish it was written decades ago. I can't help but think about daemontools supervise and multilog, which I used very successfully for many years, and see LP as this generation's DJB.

The journald design is horrible to the point of useless

Posted Dec 2, 2011 19:06 UTC (Fri) by dlang (guest, #313) [Link] (2 responses)

> and see LP as this generation's DJB

I think that many people would agree with you, including a LOT of the people who are being critical of LP and this proposal.

there are good reasons (and not just personal animosity or his license choice) that DJB's software did not take over the world.

LP: the new DJB?

Posted Dec 2, 2011 22:50 UTC (Fri) by anselm (subscriber, #2796) [Link] (1 responses)

> there are good reasons (and not just personal animosity or his license choice) that DJB's software did not take over the world.

However, some of DJB's ideas did become popular on technical merit. Maildir comes to mind.

I don't think Lennart and Kay are doing badly in comparison at all. Systemd in particular is likely to go a lot farther than Qmail or DJBDNS ever did, simply because, whatever its detractors may say, it does have all kinds of advantages compared to SysV init (the situation is a lot more obvious than with, say, Qmail vs. other MTAs) and the two don't go for DJB-style rest-of-the-world-be-damned my-way-or-the-highway backwards incompatibility in quite the same way. After all, systemd still interfaces with SysV init scripts and traditional syslogd but in a manner that introduces interesting and useful new features.

As far as journald is concerned, we'll have to see; maybe the future will just be a closer association between systemd and rsyslogd, which wouldn't be a bad thing either.

LP: the new DJB?

Posted Dec 3, 2011 3:39 UTC (Sat) by raven667 (subscriber, #5198) [Link]

Actually when I read some of the first design docs for systemd I was greatly reminded of daemontools, which I have used in all my big, 24/7 systems over the last decade. I chose daemontools because of its technical merits, because of obvious deficiencies in sysv init and despite the fact the software comes from Mars with its own filesystem layout and other needless complications. We even tried using svscan for pid 1. Systemd has all the technical merits and none of the deficiencies.

I hope this journal thing leads to some improvements but I'm less certain that the journal as described is the be all and end all of logging. Let's see where it goes.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds