Interesting idea, but...
Posted Nov 20, 2011 19:44 UTC (Sun) by vrfy (guest, #13362)
In reply to: Interesting idea, but... by tshow
Parent article: That newfangled Journal thing
For now, the journal files are just an indexed ring buffer that follows store/forward logic. Anybody who depends on data/log archival or on classic tools should run the syslog daemon at the same time; it gets exactly the same data and writes the same files as before.
The journal data is a zero-maintenance buffer that just takes the maximum configured disk space and no other resources; it does not change any aspect of classic syslog. Local tools like systemd or other agents will rely on the indexed data from the journal during runtime, and cannot work with syslog files.
For some time to come, journal files will not be a log archive in the same sense as syslog files are. That's just impossible to promise from where we are right now, hence the warning about it.
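To make the "indexed ring buffer with store/forward logic" idea concrete, here is a minimal in-memory sketch. It is not the journal's actual format or code: the names BoundedLog, Entry and Forwarder are hypothetical, the byte budget stands in for the configured maximum disk space, and the forwarder callback stands in for handing each entry on to a syslog daemon.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical illustration only; not the journal's real on-disk format.
struct Entry {
    std::uint64_t seq;    // monotonically increasing sequence number
    std::string message;  // payload
};

class BoundedLog {
public:
    using Forwarder = std::function<void(const Entry&)>;

    BoundedLog(std::size_t max_bytes, Forwarder forward)
        : max_bytes_(max_bytes), forward_(std::move(forward)) {
        assert(max_bytes_ > 0);
    }

    // Forward the entry (e.g. to a syslog daemon), store it, then evict the
    // oldest entries until the byte budget is respected again. A message
    // larger than the whole budget is forwarded but not retained.
    void append(const std::string& message) {
        Entry e{next_seq_++, message};
        if (forward_) forward_(e);
        used_bytes_ += e.message.size();
        entries_.push_back(std::move(e));
        while (used_bytes_ > max_bytes_ && !entries_.empty()) {
            used_bytes_ -= entries_.front().message.size();
            entries_.pop_front();
        }
    }

    // Indexed lookup by sequence number. Retained entries are contiguous,
    // so the index is just an offset from the oldest retained entry.
    // Returns nullptr if the entry was evicted or never written.
    const Entry* find(std::uint64_t seq) const {
        if (entries_.empty()) return nullptr;
        const std::uint64_t first = entries_.front().seq;
        if (seq < first || seq >= first + entries_.size()) return nullptr;
        return &entries_[static_cast<std::size_t>(seq - first)];
    }

    std::size_t size() const { return entries_.size(); }

private:
    std::size_t max_bytes_;
    Forwarder forward_;
    std::size_t used_bytes_ = 0;
    std::uint64_t next_seq_ = 0;
    std::deque<Entry> entries_;
};
```

The point of the sketch is the division of labour: the buffer never grows past its budget and needs no maintenance, while anything that must be kept long term is whatever the forwarder's consumer chooses to write.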
Posted Nov 20, 2011 19:59 UTC (Sun)
by quotemstr (subscriber, #45331)
Having a fluid format during development is acceptable. Shipping a critical component that writes with an unstable format is not.
> the journal files are just an indexed ring buffer that follow store/forward logic
Have you guys looked at ETW, the Windows event logging facility? The designers encountered and solved many of the issues that journal will run into. I still prefer plain-text logs to binary ring-buffer abominations, but if you're going to go down the latter route, you might as well learn from the past.
One convenient feature of ETW is a configurable and flexible approach to log management. ETW lets you decide on a per-log basis whether you want ring or serial logging, whether you want new log files to be created as old ones fill up, how often you want logs flushed to disk, and so on. ETW also has a "real-time consumer" option that basically pipes log messages to a process instead of writing them to disk, allowing programs to perform actions when certain system events happen. (You can even start a service when a certain log message appears.) The kernel also uses ETW to log its messages, allowing you to configure all the logging in one place.
Now, the details of ETW are disgusting: there are too many overlapping log options (you can do log rotation two different ways), too many log-message formats, too much metadata for each message (do we really need separate "keywords" and "categories" and such?), and too baroque an API. It's too damned hard to get simple plain-text output from the Windows event-logging system.
But beneath all the grime, there are some interesting ideas.
Posted Nov 23, 2011 14:41 UTC (Wed)
by skorgu (subscriber, #39558)
The idea of highly structured output appeals to me but it seems more like tracing than logging. I'd rather have a kickass perf and a kickass syslog than try to combine the two.
Posted Nov 20, 2011 20:03 UTC (Sun)
by tshow (subscriber, #6411)
At any rate, if the plan is to have a properly documented format for the log data when the project hits 1.0, I withdraw my objection; my concern was that leaving the format undocumented appeared to be the overall plan, not simply a consequence of the early development phase.
Posted Nov 20, 2011 20:19 UTC (Sun)
by vrfy (guest, #13362)
The classic syslog model is not touched at all. If anything relies on any of the syslog features, the format, the remote logging, the files, it should run syslog like it always did.
This is something that runs on the local machine, and serves as the base for tools that need to make decisions or provide the 'history' of services. If syslog is the model to look at, journald is just a proxy.
The journald design is network-aware, but in no way network-transparent. All that can be done pretty efficiently by additional tools, but these tools will probably not be part of the core installation.
Posted Nov 20, 2011 22:19 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
Posted Nov 20, 2011 23:43 UTC (Sun)
by khim (subscriber, #9252)
I think Google does it right. They have two separate types of logs:
1) Binary structured logs which are moved to centralized storage, kept for a long time and are used for a lot of things.
2) Debug free-text logs which are only kept around till the server is restarted.
Sometimes messages generated in logs of the second type are deemed valuable enough and are "promoted": at this point a unique message type should be added, the message should be documented, etc. Adding a unique UUID is trivial in comparison :-)
I doubt it makes much sense to force structure on all log messages that a program can ever generate. If a log record is supposed to be analyzed exclusively by humans then it's OK to use free-form text for it. If it can be used by other software then it's time to organize and document it.
Posted Nov 21, 2011 1:22 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
I don't really see how reverse domain names are worse, apart from the handwavy 'it's faster to use UUIDs'.
Posted Nov 21, 2011 1:31 UTC (Mon)
by dlang (guest, #313)
Posted Nov 21, 2011 1:49 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
Posted Nov 21, 2011 10:31 UTC (Mon)
by dlang (guest, #313)
But back to my point: you can send structured logs via syslog today; there is even a standard for doing so. People choose not to do this today, but that's not the fault of the syslog mechanism; that's the fault of the programmers.
As has been noted elsewhere, you can still create unstructured logs through this new mechanism, so the new mechanism doesn't give you structured logs any more than syslog does.
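For readers who have not seen structured syslog, here is a rough sketch of what such a message looks like on the wire, assuming the standard meant here is RFC 5424 and its STRUCTURED-DATA field. The helper names (escape_sd_value, format_sd_element, format_rfc5424) and the "example" SD-ID are made up for illustration; 32473 is the enterprise number the RFC uses in its own examples.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using SdParams = std::vector<std::pair<std::string, std::string>>;

// Escape the three characters RFC 5424 requires to be escaped inside a
// structured-data parameter value: double quote, backslash and ']'.
std::string escape_sd_value(const std::string& value) {
    std::string out;
    for (char c : value) {
        if (c == '"' || c == '\\' || c == ']') out += '\\';
        out += c;
    }
    return out;
}

// Build one SD-ELEMENT, e.g. [example@32473 file="data.bin"].
std::string format_sd_element(const std::string& sd_id, const SdParams& params) {
    std::string out = "[" + sd_id;
    for (const auto& p : params) {
        out += " " + p.first + "=\"" + escape_sd_value(p.second) + "\"";
    }
    out += "]";
    return out;
}

// Assemble a full message line. Header fields that are not known are
// passed as "-" (the RFC's NILVALUE); version is fixed at 1.
std::string format_rfc5424(int facility, int severity,
                           const std::string& timestamp,
                           const std::string& hostname,
                           const std::string& app_name,
                           const std::string& proc_id,
                           const std::string& msg_id,
                           const std::string& structured_data,
                           const std::string& text) {
    assert(facility >= 0 && facility <= 23);
    assert(severity >= 0 && severity <= 7);
    const int pri = facility * 8 + severity;  // PRI = facility * 8 + severity
    return "<" + std::to_string(pri) + ">1 " + timestamp + " " + hostname +
           " " + app_name + " " + proc_id + " " + msg_id + " " +
           structured_data + " " + text;
}
```

A consumer can pick out the key/value pairs between the brackets without parsing the free-form text, while a human still gets a readable message at the end of the line.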
Posted Nov 23, 2011 17:01 UTC (Wed)
by sam-williams (guest, #57470)
Structure would improve things a bit, but no self-respecting systems administrator would suggest they can't do their job without a bit of binary hand-holding. The binary file format could cause more problems than it cures. Care should be taken to provide the ability to access this information with simple tools.
Posted Nov 21, 2011 8:08 UTC (Mon)
by khim (subscriber, #9252)
The only way known to humanity is to put a fixed value in all fields except the textual "details" field and then write a free-form description there. A lot of people have tried to make "structured logging everywhere" work, yet none succeeded. This means it's time to stop trying to push the "structured logging everywhere" idea and think about a different question, "do we really need structured logging?", and the answer is "probably not". A lot of logs only make sense to someone who has detailed knowledge of the program. If you don't think long and hard about what your log is trying to convey and to whom, then no amount of structure applied will help. And not all log messages deserve such attention. At least this is what developers usually think, and if your system is not accepted by developers then it may as well not exist.
Have you considered using reverse domain names for message ids?
UUIDs are just too unwieldy. Imagine that I'm a developer writing:
class SomeClass
{
    void someMethod()
    {
        while(true)
        {
            FILE *fl = fopen(...);
            if (!fl)
                log(DEBUG, "File with data is not found, trying to create it");
            ...
At that moment you realize that you need a UUID for this message. Ok, easy enough:
class SomeClass
{
    void someMethod()
    {
        while(true)
        {
            FILE *fl = fopen(...);
            if (!fl)
                log(DEBUG, "d92bc8ba-7a98-4e49-8384-8ee013e2f773", "File with data is not found, trying to create it");
            ...
But damn! Now my line wraps past the 80-character limit! And if I'm unlucky it can wrap inside the UUID itself, so I need to reformat my code.
Alternatively, I need to create a file with #defines of UUIDs and write things like this: "log(DEBUG, FILE_DATA_NOT_FOUND_UUID, "File with data is not found, trying to create it");" which leads to duplication. So most lazy programmers (such as myself) would just define a couple of UUIDs and use them with free-form text fields for verbose messages.
So please, consider making a good API for developers your first priority. And using UUIDs in the API is definitely doubleplus ungood.
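One way to soften the duplication problem with a shared definitions file is to keep the ID and the text together in a single entry, so the call site names the message only once. This is only a sketch of that idea and not any real logging API: MessageDef, Level and format_log are hypothetical names, and the UUID is the one from the example above.

```cpp
#include <cassert>
#include <string>

// Hypothetical message catalog entry: pairs an ID with its text so that
// neither has to be repeated at the call site.
struct MessageDef {
    const char* id;    // UUID from the example above
    const char* text;  // human-readable message
};

constexpr MessageDef FILE_DATA_NOT_FOUND{
    "d92bc8ba-7a98-4e49-8384-8ee013e2f773",
    "File with data is not found, trying to create it"};

enum class Level { Debug, Info, Error };

// Render a log line. A real logger would hand the fields to its backend
// instead of building a string.
std::string format_log(Level level, const MessageDef& msg) {
    assert(msg.id != nullptr && msg.text != nullptr);
    const char* name = level == Level::Debug  ? "DEBUG"
                       : level == Level::Info ? "INFO"
                                              : "ERROR";
    return std::string(name) + " [" + msg.id + "] " + msg.text;
}
```

The call site then shrinks to something like format_log(Level::Debug, FILE_DATA_NOT_FOUND), which fits in 80 columns, at the cost of moving the message text away from the code that emits it.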
Actually I think it's a good idea to have two types of logs
1) Binary structured logs which are moved to centralized storage, kept for a long time and are used for a lot of things.
2) Debug free-text logs which are only kept around till the server is restarted.
Any conclusion made from a false premise will be true, so your rant is certainly valid.
As I've already said...
Well, if it can be made easy to use structured logging everywhere, then why not do it?