
Debian TC vote on init system coupling


Posted Feb 24, 2014 20:31 UTC (Mon) by fandingo (guest, #67019)
In reply to: Debian TC vote on init system coupling by jude-
Parent article: Debian TC vote on init system coupling

> I have one: journald does not gracefully handle log file corruption it causes due to improper shutdown. What I would like it to do is write log records transactionally, so I can tell the difference between someone tampering with my log, and my log being left in an inconsistent state because my laptop crashed (i.e. uncommitted transactions). This will require changing the write protocol, which is non-trivial. Again, I'm willing to do this myself. But, a non-stable journald/systemd API (not to be confused with the external D-Bus API) would mean that I not only have to make the implementation, but also re-write parts of it each time I upgrade systemd. That makes it too cost-ineffective for me.

You're in luck, and it doesn't require re-implementing journald. I'm currently working on a feature for journald that will let you do this much more easily than you imagine.

I'm unsatisfied with the journald-->remote journald logging, which requires a syslog intermediary. Rather than building remote logging directly into the journal, which is the obvious solution, I figured that there are other applications that would be interested in this data. (In particular, I was imagining more network-based uses like native loggers for Splunk and LogStash.) The implementation that I'm currently working on exposes log data over KDBus (so practically zero performance impact) to subscribers. Anything can subscribe and get log messages in their entirety. You would be able to set Storage=none in journald.conf and have your own persistent writer connect to the system bus and write out your preferred format.

===

> If the journald-to-systemd-PID-1 API was stable

What gives you the impression that it is unstable? The systemd developers include it in the stability promise.



Debian TC vote on init system coupling

Posted Feb 24, 2014 21:59 UTC (Mon) by jude- (guest, #95678)

I guess I need clarification on what is meant by "The D-Bus interfaces of the main service daemon" and "The internal protocols used on the various sockets such as the sockets /run/systemd/shutdown, /run/systemd/private", as discussed in [1]. I interpreted this to include the interfaces by which systemd-as-PID-1 and journald communicate with one another. They're internal interfaces that applications don't use (as far as I can tell), but the various systemd daemons use.

Is this the same as the Service bus API (listed in [2])?

[1] http://www.freedesktop.org/wiki/Software/systemd/Interfac...
[2] http://www.freedesktop.org/wiki/Software/systemd/Interfac...

Debian TC vote on init system coupling

Posted Feb 24, 2014 22:04 UTC (Mon) by jude- (guest, #95678)

> You would be able to set Storage=none in journald.conf and have your own persistent writer connect to the system bus and write out your preferred format.

This sounds pretty cool, and is definitely something I'd be interested in using :)

My main concern with regards to transaction-like logging semantics is that the processes of updating the log indexes, updating the rolling log hash, and writing the log message do not occur as a single atomic operation. A crash in the middle of logging a message would leave the logs, indexes, and hashes in an inconsistent (but potentially recoverable) state. Does your implementation address this? As in, once you expose a log record over kdbus, are its associated hashes and indexes guaranteed to be consistent?
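To make the "uncommitted transaction" idea concrete, here is a minimal sketch of a commit-marker record format. It is not journald's on-disk format; the layout (`HEADER`, `COMMIT`) and the function names are assumptions made up for illustration. A record that is cut off at the end of the file is reported as a torn tail (plausibly a crash), while a complete record with a bad checksum or a missing commit byte is reported as corrupt.

```python
import os
import struct
import zlib

# Hypothetical record layout: 4-byte length, 4-byte CRC32, payload, commit byte.
HEADER = struct.Struct("<II")
COMMIT = b"\xC0"


def append_record(path, payload):
    """Append one record; the commit byte is written only after the payload is durable."""
    with open(path, "ab") as f:
        f.write(HEADER.pack(len(payload), zlib.crc32(payload)))
        f.write(payload)
        f.flush()
        os.fsync(f.fileno())
        f.write(COMMIT)
        f.flush()
        os.fsync(f.fileno())


def scan(path):
    """Return (records, status); status is 'clean', 'torn-tail', or 'corrupt'."""
    with open(path, "rb") as f:
        data = f.read()
    records = []
    pos = 0
    while pos < len(data):
        end_header = pos + HEADER.size
        if end_header > len(data):
            return records, "torn-tail"  # header itself was cut short
        length, crc = HEADER.unpack_from(data, pos)
        end_payload = end_header + length
        if end_payload + 1 > len(data):
            return records, "torn-tail"  # payload or commit byte never reached disk
        payload = data[end_header:end_payload]
        if zlib.crc32(payload) != crc or data[end_payload:end_payload + 1] != COMMIT:
            return records, "corrupt"  # complete record that does not check out
        records.append(payload)
        pos = end_payload + 1
    return records, "clean"
```

A CRC only separates a crash from accidental damage; it is not tamper-evident, since anyone who can rewrite the file can recompute it. A corrupted length field can also masquerade as a torn tail, so a real format would need more than this.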

Debian TC vote on init system coupling

Posted Feb 24, 2014 22:47 UTC (Mon) by fandingo (guest, #67019)

I appreciate the interest. I'm just getting started on the implementation, so I can't speak with any authority on how the final product will turn out. Plus, who knows what modifications the core developers would like to see before merging.

> My main concern with regards to transaction-like logging semantics is that the processes of updating the log indexes, updating the rolling log hash, and writing the log message do not occur as a single atomic operation. A crash in the middle of logging a message would leave the logs, indexes, and hashes in an inconsistent (but potentially recoverable) state. Does your implementation address this? As in, once you expose a log record over kdbus, are its associated hashes and indexes guaranteed to be consistent?

This requires a little more explanation of what I'm actually trying to do. There are two separate pieces:

1) Modifications to systemd that expose log messages as signals. (Currently, there are methods to query log messages, but that won't be sufficient for constant forwarding.)

2) A service that listens for these signals and does *something* with them. (My something is sending them across the network to a listening service.)

#1 is the only part that necessarily lives within the systemd tree. Anyone is free to make their own #2, and they can even do wildly different things with those messages (like write them out locally in their own format). Furthermore, since #2 is subscribing to a set of signals, multiple services can simultaneously perform #2, and journald has no clue (except that subscribers >0).
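The split between #1 and #2 can be illustrated with a small in-process stand-in. This is not D-Bus or KDBus code and does not reflect any actual systemd interface; `LogSignalHub` and its methods are hypothetical names. It only shows the shape of the idea: the emitter knows whether anyone is subscribed, but not what each subscriber does with an entry.

```python
from typing import Callable, Dict, List

LogEntry = Dict[str, str]
Subscriber = Callable[[LogEntry], None]


class LogSignalHub:
    """Stand-in for piece #1: broadcasts each log entry to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self._subscribers.append(callback)

    @property
    def has_subscribers(self) -> bool:
        return bool(self._subscribers)

    def emit(self, entry: LogEntry) -> None:
        for callback in self._subscribers:
            # Each subscriber gets its own copy, so one cannot alter another's view.
            callback(dict(entry))


# Two independent "#2" services: one forwards, one writes locally.
forwarded: List[LogEntry] = []
written: List[str] = []

hub = LogSignalHub()
hub.subscribe(forwarded.append)
hub.subscribe(lambda e: written.append(f"{e['PRIORITY']} {e['MESSAGE']}"))
hub.emit({"MESSAGE": "disk almost full", "PRIORITY": "4"})
```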

Back to atomic operation, consistency, and tampering.

I suppose that it would be possible for journald to make some sort of log_entry_ack() method available that a #2 service could send back to the journal confirming receipt. The question is what's the purpose? Obviously, that ack cannot be forwarded back to #2 or else there is infinite amplification of acks. The only use would be for journald to use its internal storage mechanism, but then you're dependent on local logs (which I'm trying to avoid and you don't like due to corruption and other concerns).

A #2 service is free to implement any sort of fancy syncing/checkpointing that it wants. The journal messages already contain useful monotonic timestamps that could be immediately written to disk in a durable manner and with the desired anti-tampering protection. That's certainly going to have performance implications, but you should be able to add as much safety as is desired.

KDBus is still in active development, so we'll have to wait to see what guarantees it provides for messages. Nonetheless, it's running as part of the kernel, so there should at least be highly reliable signal delivery, possibly even guaranteed. DBus currently guarantees message ordering (and it is safe to assume that KDBus will continue to), so that also helps with consistency.

Lastly, journald already supports forward secure sealing (FSS). I'm not a cryptographer, but my understanding is that FSS is a quality anti-tampering implementation. It would not appear to be that much of a challenge to send the sealing key over the system bus, allowing a #2 service to verify the integrity based on the verification key.
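As a rough illustration of the forward-security idea, the sketch below uses a simple evolving-key HMAC chain. This is a textbook-style construction chosen for brevity, not journald's actual FSS scheme, and the names `evolve`, `seal`, and `verify` are made up here. The point it shows: each entry is tagged with the current key, the key is then replaced by a one-way hash of itself, so someone who steals the current key cannot forge tags for earlier entries, while a verifier holding the initial key can recompute every tag.

```python
import hashlib
import hmac


def evolve(key: bytes) -> bytes:
    """One-way key update; the old key cannot be recovered from the new one."""
    return hashlib.sha256(b"evolve" + key).digest()


def seal(entries, initial_key: bytes):
    """Tag each entry with the current key, then evolve the key."""
    key = initial_key
    sealed = []
    for entry in entries:
        tag = hmac.new(key, entry, hashlib.sha256).digest()
        sealed.append((entry, tag))
        key = evolve(key)
    return sealed


def verify(sealed, initial_key: bytes) -> bool:
    """Recompute the key sequence from the initial key and check every tag."""
    key = initial_key
    for entry, tag in sealed:
        expected = hmac.new(key, entry, hashlib.sha256).digest()
        if not hmac.compare_digest(tag, expected):
            return False
        key = evolve(key)
    return True
```

In this sketch the initial key plays the role of the verification secret and has to be kept off the logging machine.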

Let me know if you have additional questions or comments. As I said, I'm just getting started on implementation, and I don't want to pigeon-hole this idea to my use case, general network log aggregation.

Debian TC vote on init system coupling

Posted Feb 25, 2014 3:06 UTC (Tue) by jude- (guest, #95678)

It sounds like we're both going to implement part of the same wheel in #1. I sense an opportunity to collaborate :)

I think the biggest challenge to #1 is getting messages out of systemd without causing systemd to block, without filling up too much buffer space, and without losing messages. While signaling message consumers would get them to wake up and grab the next message, you'd have to be very careful to get the next message quickly (even under load), so systemd can free it. Even with (reliable) real-time signals, you'll still need to ensure that systemd holds onto your message long enough for you to get it (without blocking systemd). This is on top of contending with signal handler races, whereby you might consume messages concurrently and out-of-order. I think you might have better luck with a socket or a message queue.

I took a look at the kdbus GitHub repository. It looks to be designed specifically to address this challenge, so that might be a better long-term option (maybe use a compatibility library to access kdbus, if you need portability).

> A #2 service is free to implement any sort of fancy syncing/checkpointing that it wants. The journal messages already contain useful monotonic timestamps that could be immediately written to disk in a durable manner and with the desired anti-tampering protection. That's certainly going to have performance implications, but you should be able to add as much safety as is desired.

This is quite helpful :) They're already putting the messages in order for us. It looks like you would be able to get away with sending messages out as soon as they're ready (i.e. once you have the hash for the message).

Debian TC vote on init system coupling

Posted Feb 25, 2014 3:35 UTC (Tue) by fandingo (guest, #67019)

> I think the biggest challenge to #1 is getting messages out of systemd without causing systemd to block

DBus can take care of this.

> without filling up too much buffer space, and without losing messages

I think that this will be the biggest concern. User-space DBus has always been pretty slow, so it hasn't been used for large data transfers. I've inquired with the systemd developers to see if there is (or will be) any method of "expiring" signals (there is a timeout feature for regular method calls) and how that appears to the subscribers. If that is possible, we should be able to query journald for message IDs, although there are certainly lots of considerations to handle if the listener is overwhelmed by messages and cannot query messages from the journal (especially if the journal is only storing to memory).
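The buffer-space trade-off can be sketched as a bounded queue that never blocks the producer and drops the oldest entry when full. This is an assumption-laden toy, not how journald or D-Bus behave; `BoundedLogBuffer` and `missing_sequences` are hypothetical names. Sequence numbers let a slow listener notice exactly which entries it missed, which is the information it would need in order to go back and query the journal for them.

```python
from collections import deque


class BoundedLogBuffer:
    """Producer never blocks; when full, the oldest entry is dropped and counted."""

    def __init__(self, capacity: int) -> None:
        self._entries = deque()
        self._capacity = capacity
        self._next_seq = 0
        self.dropped = 0

    def push(self, message: str) -> int:
        seq = self._next_seq
        self._next_seq += 1
        if len(self._entries) >= self._capacity:
            self._entries.popleft()
            self.dropped += 1
        self._entries.append((seq, message))
        return seq

    def drain(self):
        """Hand everything currently buffered to the consumer."""
        items = list(self._entries)
        self._entries.clear()
        return items


def missing_sequences(last_seen: int, batch):
    """Sequence numbers the consumer expected after last_seen but did not receive."""
    missing = []
    expected = last_seen + 1
    for seq, _ in batch:
        missing.extend(range(expected, seq))
        expected = seq + 1
    return missing
```

If the journal is storing only to memory, the missed sequence numbers may no longer be retrievable, which is the case the paragraph above flags as hardest.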

I'd like to have someone to collaborate with. If you're interested, join the systemd mailing list. I'm going to work on a very rough POC and post it to the list in the next couple of weeks to get some feedback.

