LWN.net Logo

Advertisement

f-irc: away with the old irc-client paradigms! The graphical-interface feeling in a terminal. Give it a try!

Advertise here

The structured logging challenge

By Jonathan Corbet
July 12, 2011
The debate over the concept of "user-friendly disk names" was rekindled this week with the posting of a new version of Nao Nishijima's persistent device names patch. The disagreements over this particular feature remain; it is possible that the change will be merged regardless. At the core of this discussion, though, is a concept which goes beyond adding a user-specified name to specific devices; it's the bigger problem of getting structured data out of the kernel.

After all these years, the main mechanism by which the kernel passes information to user space remains the lowly printk() function. It is, needless to say, a useful and flexible way of getting messages out, but it imposes almost nothing on the structure of those messages. That leads to all kinds of output like (from drivers/net/de620.c):

    printk(KERN_WARNING "%s: Thanks, I feel much better now!\n", dev->name);

or the famous message from drivers/char/lp.c:

    printk(KERN_INFO "lp%d on fire\n", minor);

System administrators should not be faulted for wondering what they should do in response to messages like these.

There have been some changes to impose structure on printk() output, starting with the addition of a marker for the severity level of each message. It still is not hard to find printk() calls without severity levels, though; actually enforcing the use of these markers has proved hard to do. A bit more structure is added by dev_printk() (and variants like dev_err()), but the use of these functions is even less universal.

The lack of structure means that there is little consistency between messages; any two network drivers will almost certainly print different things to indicate the same situation. Kernel messages can also vary over time; messages emitted by printk() are also not normally seen as part of the kernel ABI, despite the fact that changing them can break scripts that try to extract useful information from the system logs. So it's not surprising that, about one year ago, Andrew Morton said:

The kernel's whole approach to messaging is pretty haphazard and lame and sad. There have been various proposals to improve the usefulness and to rationally categorise things in way which are more useful to operators, but nothing seems to ever get over the line.

Various people have tried to improve the situation in spots; the user-friendly disk names, by trying to attach a consistent name to devices, is one such attempt. The netoops patch from Google is another; it helps Google figure out why machines are crashing without requiring operators to actually dig through the logs. But these changes are far from an overall framework for structured data from the kernel.

There have been a few attempts to make such frameworks over the years; all have fallen far short of making it into the kernel. It's not hard to come up with plausible reasons for this failure. The amount of work required is huge, especially if one wants to add structure to the bulk of interesting communications from kernel space. Developers like printk(); they are less likely to be enamored of some other interface which requires more work to use, is less flexible (by design) in its output, and which may well have to sit alongside the existing printk() logging. Coming up with a structured format which meets everybody's needs - and which will not have to be supplemented with a "version 2" format in the coming years - presents its own challenges.

It must also be said that kernel developers, as a whole, see little value in standardized, structured kernel logging information. It will not help them to debug their kernels. The fact that a lot of users want this sort of feature is far from irrelevant to the development community, but experience has shown that a lack of developer interest can make it much harder to get changes merged - especially if those changes are wide-ranging and disruptive.

If this problem is ever to be solved, it would seem that two things need to be found: a mechanism which looks like it could work and a motivation for kernel developers to accept it. The motivation can probably found in a combination of (1) their paychecks as customers continue to push for this capability and (2) the prospect of a continuing stream of ad hoc patches adding structure to various corners without solving the real problem. But that leaves open the problem of finding a workable solution.

Your editor has a half-baked thought on this matter based on the realization that the kernel already has a nice mechanism for passing structured data to user space. On almost any contemporary system, the /dev directory is managed by the udev daemon; udev works by receiving highly-structured messages from the kernel describing the coming and going of devices, changes in their configuration, firmware load requests, and more. It is an established protocol which enables sophisticated user-space responses to kernel events. Udev and the associated "uevent" mechanism had some early growing pains, but this code is now stable, functional, and almost universally used. Perhaps it's time for this mechanism to take on some new duties.

Uevents work because the format is simultaneously structured and flexible; it can be extended when the need arises. The generation of events is almost entirely done automatically by the driver core; most driver authors need not do anything to cause them to happen and, indeed, may not even know that this mechanism is operating underneath the hood. Driver authors don't have to make their own events; they would have to go out of their way to prevent them from happening.

Logging of other types of events will probably require explicit support in the relevant kernel code; that is the part needing some extra thought. The creation of uevents by hand is a bit of a labor-intensive business; the relevant code tends to look like:

	retval = add_uevent_var(env, "ACTION=%s", action_string);
	if (retval)
		goto exit;
	retval = add_uevent_var(env, "DEVPATH=%s", devpath);
	if (retval)
		goto exit;
	retval = add_uevent_var(env, "SUBSYSTEM=%s", subsystem);
	if (retval)
		goto exit;

Clearly, any attempt to place this kind of code in every logging location is not going to get very far. What is needed is a useful set of helper functions. These functions, for maximum utility, would probably be tied fairly tightly to the underlying subsystems. Storage drivers could have functions to report block errors, device changes, and multipath connectivity changes. Network drivers would need to report events like carrier loss, excessive checksum errors, or duplicate MAC addresses. All kernel code could benefit from helpers to log allocation failures or failed assertions. In each case, the helper would standardize the format of the reported information while allowing the addition of information specific to the call site.

The addition of a new set of logging functions would necessarily require changes to drivers to use those functions. So it would take time to achieve anything close to comprehensive coverage, and 100% coverage would never happen. But, then, we still don't have 100% coverage for the KERN_* severity markers. If this interface proved useful, one could imagine that the code paths of interest to enterprise distribution customers would be covered in relatively short order.

But, then, there are probably several things fatally wrong with this idea; the structured logging problem will likely remain unsolved for some time yet. But the problem will not go away; if anything, the need to recognize and automatically respond to system events will only increase. Someday somebody will come up with a solution that works and that can be adopted with minimal pain; until then, printk() remains the only show in town.


(Log in to post comments)

The structured logging challenge

Posted Jul 14, 2011 16:55 UTC (Thu) by dankamongmen (subscriber, #35141) [Link]

"Network drivers would need to report events like carrier loss, excessive checksum errors, or duplicate MAC addresses. "

these kinds of things are actually available as highly structured (if somewhat underdocumented) rtnetlink(7) messages. i would argue that netlink is a more natural fit for this kind of data.

system logs structured or otherwise need both a degree of persistence (for review and forensics) and immediacy (for booting, lockups, and issues to which one would like to immediately respond). the latter suggests a requirement of display; any infrastructure would thus be at best a strict superset of current logging, and thus add complexity to an area where robustness is a if not the primary concern. of course, it also implies that old techniques can be used as fallback, or prior to initialization...

The structured logging challenge

Posted Sep 5, 2011 17:24 UTC (Mon) by jcbrowne (guest, #79912) [Link]

There is quite a bit of work on generating structured logs to enable more effective diagnosis of system states, particularly, faults and failures. Readers of this article may be interested in the rationalization of kernel log messages reported in

J. Hammond, T. Minyard and J. Browne “End-to-end framework for fault management for open source clusters: Ranger” Proceedings of the 2010 TeraGrid Conference

The remaining major obstacle to effective diagnostic use of system logs, once the kernel logging is rationalized as in the article cited above, is the fact that most software subsystem log are ad hoc and incomplete.

Jim Browne

Copyright © 2011, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds