They should be paying attention to the lumberjack project

Posted Apr 14, 2012 2:46 UTC (Sat) by slashdot (guest, #22014)
In reply to: They should be paying attention to the lumberjack project by dlang
Parent article: Toward more reliable logging

Are hierarchical structures really needed in logging? Why?

How do you plan to index and search a set of records consisting of hierarchical structures?

A list of key/value pairs seems much better, and can support indexing and SQL queries trivially.
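As a rough illustration of the flat approach, here is a minimal sketch that stores each key/value pair as one row and queries it with SQL. The table name `log_fields`, the column names, and the sample records are all made up for the example; nothing here comes from an actual logging project.

```python
import sqlite3

# In-memory database; one row per (record, key, value) triple.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE log_fields (record_id INTEGER, key TEXT, value TEXT)")
db.execute("CREATE INDEX idx_key_value ON log_fields (key, value)")

# Hypothetical flat log records.
records = [
    {"severity": "fatal", "filename": "foo"},
    {"severity": "info", "filename": "bar"},
]
for record_id, record in enumerate(records):
    db.executemany(
        "INSERT INTO log_fields VALUES (?, ?, ?)",
        [(record_id, k, v) for k, v in record.items()],
    )

# Find every record whose severity is "fatal".
rows = db.execute(
    "SELECT record_id FROM log_fields WHERE key = ? AND value = ?",
    ("severity", "fatal"),
).fetchall()
```

The index on `(key, value)` is what makes the lookup cheap; the cost is that a record is now spread over several rows.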


They should be paying attention to the lumberjack project

Posted Apr 14, 2012 23:21 UTC (Sat) by dlang (✭ supporter ✭, #313)

> Are hierarchical structures really needed in logging? Why?

I was not involved with the discussions, but a few things come to mind

1. the data you want to log may include structures

2. you may need to log multiple data items of the same type (like a list of filenames)

you can always choose to flatten a hierarchical structure if you need to, but it's much harder to re-create the structure after you have flattened it
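A small sketch of the flattening direction, assuming dotted key names as the convention (the `flatten` helper and the sample event are invented for illustration):

```python
def flatten(record, prefix=""):
    """Flatten nested dicts and lists into one dict with dotted keys."""
    if isinstance(record, dict):
        items = record.items()
    elif isinstance(record, list):
        items = enumerate(record)
    else:
        # A scalar: emit it under the name built so far.
        return {prefix: record}
    flat = {}
    for key, value in items:
        name = f"{prefix}.{key}" if prefix else str(key)
        flat.update(flatten(value, name))
    return flat


# Hypothetical structured event.
event = {
    "severity": "fatal",
    "files": [{"name": "foo", "inode": 3}, {"name": "bar", "inode": 6}],
}
flat = flatten(event)
```

Going the other way is where the trouble starts: given only `files.0.name`, a reader cannot tell whether `0` was a list index or a dictionary key, and empty lists or dictionaries leave no trace at all in the flat form.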

not all storage is SQL.

They should be paying attention to the lumberjack project

Posted Apr 14, 2012 23:35 UTC (Sat) by man_ls (subscriber, #15091)

Nice! Now all we need is a MongoDB database inside the kernel. And it could be used for all kinds of things: device trees, file structures, even memory management come to mind.

Just joking. I think I get the least-common-denominator motivation. But isn't getting JSON into any kind of logging facility going to trigger immediate "designed-by-committee" fears in kernel developers, and lead them to ignore the lumberjack project?

They should be paying attention to the lumberjack project

Posted Apr 15, 2012 0:01 UTC (Sun) by dlang (✭ supporter ✭, #313)

remember that what we are talking about is not anything consumed inside the kernel, just the format of a one-way feed to output from the kernel to userspace

They should be paying attention to the lumberjack project

Posted Apr 15, 2012 1:01 UTC (Sun) by man_ls (subscriber, #15091)

OK, that makes sense. JSON is trivial to generate.

They should be paying attention to the lumberjack project

Posted Apr 17, 2012 16:38 UTC (Tue) by k8to (subscriber, #15413)

To be fair, a list of say filenames isn't hard to do in kv pairs.

filename:foo filename:bar filename:baz

Sure, that's more trouble to parse than assuming you can't have repeats, but it's not much work.
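For what it's worth, a parser that tolerates repeats can be sketched in a few lines. This assumes whitespace-separated `key:value` tokens with no spaces inside values; `parse_pairs` is a name made up for the example.

```python
from collections import defaultdict


def parse_pairs(line):
    """Parse 'key:value' tokens, collecting repeated keys into lists."""
    fields = defaultdict(list)
    for token in line.split():
        # Split on the first colon only, so values may contain colons.
        key, _, value = token.partition(":")
        fields[key].append(value)
    return dict(fields)


parsed = parse_pairs("filename:foo filename:bar filename:baz")
```

Every value ends up in a list, even for keys that appear once, which keeps the consumer's code uniform.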

However, it starts getting tedious if you need to say something like:

severity:fatal message:"corruption in filesystem regarding following items" filename:foo inode:3 filename:bar inode:6 filename:baz inode:8

At this point you really want more structure, or else the consuming end has to intuit how to group things.
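To make the guessing concrete, here is a sketch of one heuristic a consumer might use: treat every `filename` as the start of a new group. The helper names are hypothetical, every token is assumed to contain a colon, and the quoted message field is left out because it would need a real tokenizer.

```python
def parse_ordered(line):
    """Return (key, value) tuples in the order they appear."""
    return [tuple(token.split(":", 1)) for token in line.split()]


def group_on(pairs, leader):
    """Heuristic: each occurrence of `leader` starts a new group."""
    common, groups = {}, []
    for key, value in pairs:
        if key == leader:
            groups.append({key: value})
        elif groups:
            # Anything after a leader is assumed to belong to it.
            groups[-1][key] = value
        else:
            common[key] = value
    return common, groups


line = ("severity:fatal filename:foo inode:3 "
        "filename:bar inode:6 filename:baz inode:8")
common, groups = group_on(parse_ordered(line), "filename")
```

The heuristic only works because the consumer already knows that `filename` leads each group, and any unrelated field that happens to follow the last inode would be silently absorbed into the final group.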

They should be paying attention to the lumberjack project

Posted Apr 21, 2012 8:19 UTC (Sat) by neilbrown (subscriber, #359)

.... filename-1:foo inode-1:3 filename-2:bar inode-2:6 filename-3:baz inode-3:8

Explicit structure embedded in the names of name:value pairs.

Yes, it's ugly. But it's simple and if most cases don't need any real structure, then the ugliness will hardly be noticed.
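A consumer could rebuild the groups from the numbered names roughly as follows. This is only a sketch: it assumes a trailing `-<digits>` always marks a group index, and `regroup` is an invented name.

```python
import re

# A key such as "filename-2" splits into name "filename" and index 2.
SUFFIXED = re.compile(r"^(?P<name>.+)-(?P<index>\d+)$")


def regroup(line):
    """Split pairs into ungrouped fields and groups keyed by numeric suffix."""
    common, groups = {}, {}
    for token in line.split():
        key, _, value = token.partition(":")
        match = SUFFIXED.match(key)
        if match:
            index = int(match["index"])
            groups.setdefault(index, {})[match["name"]] = value
        else:
            common[key] = value
    return common, [groups[i] for i in sorted(groups)]


common, items = regroup(
    "severity:fatal filename-1:foo inode-1:3 "
    "filename-2:bar inode-2:6 filename-3:baz inode-3:8"
)
```

The weak spot is the naming convention itself: a key that legitimately ends in a dash and digits would be mistaken for a group member.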

They should be paying attention to the lumberjack project

Posted Apr 15, 2012 1:56 UTC (Sun) by dlang (✭ supporter ✭, #313)

to some extent, hierarchical structures can be simulated by item names

for example, a firewall log message needs to have the following items in it

the device generating the log message
the device that is the source of the traffic
the device that is the destination of the traffic

each of these device definitions may include more than one piece of information (hostname, FQDN, IP address, port number)

you could have

loghost: hostname, logip: 1.1.1.1, sourcehost: hostname2, sourceIP: 2.2.2.2, sourceport: 1234, destinationhost: hostname3, destinationIP: 3.3.3.3, destinationport: 1234

or you could have

logsource { name: hostname, ip: 1.1.1.1 }, source { name: hostname2, ip: 2.2.2.2, port: 1234 }, destination { name: hostname3, ip: 3.3.3.3, port: 1234 }

personally, I find the second arrangement better and less likely to get confused by people adding extra information to a particular component
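To show what the second arrangement buys the consumer, here is a sketch that writes the same firewall event as JSON and reads it back. The use of JSON and the `describe` helper are assumptions for the example, not something the thread settled on.

```python
import json

# The nested form of the firewall event, using the values from above.
event = {
    "logsource": {"name": "hostname", "ip": "1.1.1.1"},
    "source": {"name": "hostname2", "ip": "2.2.2.2", "port": 1234},
    "destination": {"name": "hostname3", "ip": "3.3.3.3", "port": 1234},
}


def describe(device):
    """Format one device; the same field names work for every role."""
    port = f":{device['port']}" if "port" in device else ""
    return f"{device['name']} ({device['ip']}{port})"


line = json.dumps(event, sort_keys=True)   # what the producer would emit
decoded = json.loads(line)                 # what the consumer would read
summary = {role: describe(device) for role, device in decoded.items()}
```

Because `name`, `ip`, and `port` mean the same thing under every role, one helper covers all three devices, and adding a field to one device does not require inventing another prefixed top-level name.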

as another example, think of all the contexts that a userid can appear in, including the user that the application writing the log message is running as. should all these different contexts use a different tag? or are we better off using the same tag everywhere and using the hierarchy information to determine the context?

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds