Debian TC vote on init system coupling
Posted Feb 24, 2014 20:31 UTC (Mon) by fandingo (guest, #67019)
In reply to: Debian TC vote on init system coupling by jude-
Parent article: Debian TC vote on init system coupling
You're in luck, and it doesn't require re-implementing journald. I'm currently working on a feature for journald that will let you do this much more easily than you imagine.
I'm unsatisfied with journald-->remote journald logging, which requires a syslog intermediary. Rather than building remote logging directly into the journal, which is the obvious solution, I figured that other applications would be interested in this data. (In particular, I was imagining more network-based uses like native loggers for Splunk and LogStash.) The implementation that I'm currently working on exposes log data over KDBus (so practically zero performance impact) to subscribers. Anything can subscribe and get log messages in their entirety. You would be able to set Storage=none in journald.conf and have your own persistent writer connect to the system bus and write out your preferred format.
===
> If the journald-to-systemd-PID-1 API was stable
What gives you the impression that it is unstable? The systemd developers include it in the stability promise.
Posted Feb 24, 2014 21:59 UTC (Mon) by jude- (guest, #95678)
Is this the same as the Service bus API (listed in [2])?
[1] http://www.freedesktop.org/wiki/Software/systemd/Interfac...
Posted Feb 24, 2014 22:04 UTC (Mon) by jude- (guest, #95678)
This sounds pretty cool, and is definitely something I'd be interested in using :)
My main concern with regards to transaction-like logging semantics is that the processes of updating the log indexes, updating the rolling log hash, and writing the log message do not occur as a single atomic operation. A crash in the middle of logging a message would leave the logs, indexes, and hashes in an inconsistent (but potentially recoverable) state. Does your implementation address this? As in, once you expose a log record over kdbus, are its associated hashes and indexes guaranteed to be consistent?
Posted Feb 24, 2014 22:47 UTC (Mon) by fandingo (guest, #67019)
> My main concern with regards to transaction-like logging semantics is that the processes of updating the log indexes, updating the rolling log hash, and writing the log message do not occur as a single atomic operation. A crash in the middle of logging a message would leave the logs, indexes, and hashes in an inconsistent (but potentially recoverable) state. Does your implementation address this? As in, once you expose a log record over kdbus, are its associated hashes and indexes guaranteed to be consistent?
This requires a little more explanation of what I'm actually trying to do. There are two separate pieces:
1) Modifications to systemd that expose log messages as signals. (Currently, there are methods to query log messages, but that won't be sufficient for constant forwarding.)
2) A service that listens for these signals and does *something* with them. (My something is sending them across the network to a listening service.)
#1 is the only part that necessarily lives within the systemd tree. Anyone is free to make their own #2, and they can even do wildly different things with those messages (like write them out locally in their own format). Furthermore, since #2 is subscribing to a set of signals, multiple services can simultaneously perform #2, and journald has no clue (except that subscribers >0).
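To make the split between #1 and #2 concrete, here is a minimal sketch of the fan-out model in plain Python. It does not use DBus or KDBus at all; `LogSignalBus` is a hypothetical stand-in for the bus, and the entry fields are only illustrative. It shows two independent #2 services receiving the same entry while the emitter knows nothing beyond whether anyone is subscribed.

```python
from typing import Callable, Dict, List

LogEntry = Dict[str, str]
Subscriber = Callable[[LogEntry], None]


class LogSignalBus:
    """Hypothetical stand-in for the bus: fans each entry out to every subscriber."""

    def __init__(self) -> None:
        self._subscribers: List[Subscriber] = []

    def subscribe(self, callback: Subscriber) -> None:
        self._subscribers.append(callback)

    def has_subscribers(self) -> bool:
        # The only thing the emitter (#1) learns: is anyone listening at all?
        return len(self._subscribers) > 0

    def emit(self, entry: LogEntry) -> None:
        for callback in self._subscribers:
            callback(dict(entry))  # each subscriber gets its own copy


def run_demo():
    bus = LogSignalBus()
    forwarded = []    # a #2 service that would ship entries over the network
    local_lines = []  # another #2 service that writes its own local format
    bus.subscribe(forwarded.append)
    bus.subscribe(lambda e: local_lines.append(f"{e['PRIORITY']} {e['MESSAGE']}"))
    bus.emit({"MESSAGE": "service started", "PRIORITY": "6"})
    return forwarded, local_lines
```

Calling `run_demo()` returns what each subscriber collected, so the two consumers can be checked independently of each other.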
Back to atomic operation, consistency, and tampering.
I suppose that it would be possible for journald to make some sort of log_entry_ack() method available that a #2 service could send back to the journal confirming receipt. The question is what's the purpose? Obviously, that ack cannot be forwarded back to #2 or else there is infinite amplification of acks. The only use would be for journald to use its internal storage mechanism, but then you're dependent on local logs (which I'm trying to avoid and you don't like due to corruption and other concerns).
A #2 service is free to implement any sort of fancy syncing/checkpointing that it wants. The journal messages already contain useful monotonic timestamps that could be immediately written to disk in a durable manner and with the desired anti-tampering protection. That's certainly going to have performance implications, but you should be able to add as much safety as is desired.
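As a rough illustration of what such a #2 service might do, the sketch below writes each entry to disk with an fsync and links the records with a plain SHA-256 hash chain. This is an assumption-laden example, not journald's storage format and not FSS: an unkeyed chain only detects edits by someone who does not recompute the later hashes. The class and function names are hypothetical, and the entries are assumed to arrive as dictionaries carrying the journal's `__MONOTONIC_TIMESTAMP` field.

```python
import hashlib
import json
import os
import tempfile

GENESIS = "0" * 64  # arbitrary starting value for the chain


def chain_hash(prev_hash, entry):
    # Hash the previous record's hash together with this entry's canonical JSON.
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


class DurableChainWriter:
    def __init__(self, path):
        # "xb" refuses to reuse an existing file, so the chain always starts at GENESIS.
        self._file = open(path, "xb")
        self._prev_hash = GENESIS

    def append(self, entry):
        digest = chain_hash(self._prev_hash, entry)
        record = json.dumps({"entry": entry, "hash": digest}, sort_keys=True)
        self._file.write(record.encode("utf-8") + b"\n")
        self._file.flush()
        os.fsync(self._file.fileno())  # pay the performance cost for durability
        self._prev_hash = digest
        return digest

    def close(self):
        self._file.close()


def verify_chain(path):
    prev_hash = GENESIS
    with open(path, "rb") as log:
        for line in log:
            record = json.loads(line)
            if record["hash"] != chain_hash(prev_hash, record["entry"]):
                return False
            prev_hash = record["hash"]
    return True


def demo():
    path = os.path.join(tempfile.mkdtemp(), "entries.jsonl")
    writer = DurableChainWriter(path)
    writer.append({"__MONOTONIC_TIMESTAMP": "100", "MESSAGE": "first"})
    writer.append({"__MONOTONIC_TIMESTAMP": "200", "MESSAGE": "second"})
    writer.close()
    intact = verify_chain(path)
    # Simulate tampering with the first record without fixing up the hashes.
    with open(path, "rb") as log:
        data = log.read()
    with open(path, "wb") as log:
        log.write(data.replace(b"first", b"FIRST"))
    return intact, verify_chain(path)
```

The fsync per entry is the "performance implications" part: batching several entries per fsync would be faster, at the price of a larger window of entries that can be lost in a crash.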
KDBus is still in active development, so we'll have to wait to see what guarantees it provides for messages. Nonetheless, it runs as part of the kernel, so there should at least be highly reliable signal delivery, possibly even guaranteed delivery. DBus currently guarantees message ordering (and it is safe to assume that KDBus will continue to), so that also helps with consistency.
Lastly, journald supports forward secure sealing (FSS). I'm not a cryptographer, but my understanding is that FSS is a quality anti-tampering implementation. It would not appear to be that much of a challenge to send the sealing key over the system bus, allowing a #2 service to verify the integrity based on the verification key.
Let me know if you have additional questions or comments. As I said, I'm just getting started on implementation, and I don't want to pigeon-hole this idea to my use case, general network log aggregation.
Posted Feb 25, 2014 3:06 UTC (Tue) by jude- (guest, #95678)
I think the biggest challenge to #1 is getting messages out of systemd without causing systemd to block, without filling up too much buffer space, and without losing messages. While signaling message consumers would get them to wake up and grab the next message, you'd have to be very careful to get the next message quickly (even under load), so systemd can free it. Even with (reliable) real-time signals, you'll still need to ensure that systemd holds onto your message long enough for you to get it (without blocking systemd). This is on top of contending with signal handler races, whereby you might consume messages concurrently and out of order. I think you might have better luck with a socket or a message queue.
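The tension between "never block", "bounded buffer space", and "never lose messages" can be sketched with a small queue. This is only an illustration under my own assumptions, not how systemd or kdbus handles it: `BoundedLogBuffer` is a hypothetical name, and the policy chosen here (evict the oldest entry when full, and count the loss) is one option among several.

```python
from collections import deque


class BoundedLogBuffer:
    """Fixed-capacity queue whose producer never blocks."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._entries = deque()
        self._capacity = capacity
        self.dropped = 0  # how many entries were evicted before being consumed

    def push(self, entry):
        # Never blocks: when full, evict the oldest entry and record the loss.
        if len(self._entries) >= self._capacity:
            self._entries.popleft()
            self.dropped += 1
        self._entries.append(entry)

    def pop(self):
        # Return the oldest buffered entry, or None if the buffer is empty.
        return self._entries.popleft() if self._entries else None


def run_demo():
    buf = BoundedLogBuffer(capacity=3)
    for seq in range(5):  # a slow consumer: five pushes before any pop
        buf.push(seq)
    drained = [buf.pop() for _ in range(4)]
    return buf.dropped, drained
```

With a capacity of 3 and five pushes, the two oldest entries are gone by the time the consumer wakes up. The producer stayed unblocked and memory stayed bounded, so the third goal (no lost messages) is the one that gave way.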
I took a look at the kdbus GitHub. It looks to be designed specifically to address this challenge, so that might be a better long-term option (maybe use a compatibility library to access kdbus, if you need portability).
> A #2 service is free to implement any sort of fancy syncing/checkpointing that it wants. The journal messages already contain useful monotonic timestamps that could be immediately written to disk in a durable manner and with the desired anti-tampering protection. That's certainly going to have performance implications, but you should be able to add as much safety as is desired.
This is quite helpful :) They're already putting the messages in order for us. It looks like you would be able to get away with sending messages out as soon as they're ready (i.e. once you have the hash for the message).
Posted Feb 25, 2014 3:35 UTC (Tue) by fandingo (guest, #67019)
DBus can take care of this.
> without filling up too much buffer space, and without losing messages
I think that this will be the biggest concern. User-space DBus has always been pretty slow, so it hasn't been used for large data transfers. I've inquired with the systemd developers to see if there is (or will be) any method of "expiring" signals (there is a timeout feature for regular method calls) and how that appears to the subscribers. If that is possible, we should be able to query journald for message IDs, although there are certainly lots of considerations to handle if the listener is overwhelmed by messages and cannot query messages from the journal (especially if the journal is only storing to memory).
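One way the "query journald for missed messages" idea could look from the listener's side is sketched below. Everything here is hypothetical: it assumes each signal carries a consecutive sequence number and that the listener is given some `fetch_missing` callable that asks the journal for an entry by that number, returning None if the journal no longer has it. Neither exists in the current interfaces as far as I know.

```python
class GapTrackingListener:
    """Detects skipped sequence numbers and tries to backfill them."""

    def __init__(self, fetch_missing):
        self._fetch_missing = fetch_missing  # hypothetical: callable(seq) -> entry or None
        self._next_seq = None                # unknown until the first signal arrives
        self.received = []                   # (seq, entry) pairs in sequence order
        self.lost = []                       # sequence numbers that could not be recovered

    def on_signal(self, seq, entry):
        if self._next_seq is not None and seq < self._next_seq:
            return  # duplicate or stale signal: already handled
        if self._next_seq is not None:
            # Anything between the expected number and this one was missed.
            for missing in range(self._next_seq, seq):
                recovered = self._fetch_missing(missing)
                if recovered is None:
                    self.lost.append(missing)
                else:
                    self.received.append((missing, recovered))
        self.received.append((seq, entry))
        self._next_seq = seq + 1


def run_demo():
    journal_store = {2: "two"}  # the journal still holds entry 2, but not entry 3
    listener = GapTrackingListener(journal_store.get)
    listener.on_signal(1, "one")
    listener.on_signal(4, "four")  # signals 2 and 3 never arrived
    return listener.received, listener.lost
```

The `lost` list is where the memory-only journal case shows up: if the journal has already discarded an entry, the listener can at least record that a gap exists rather than silently continuing.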
I'd like to have someone to collaborate with. If you're interested, join the systemd mailing list. I'm going to work on a very rough POC and post it to the list in the next couple of weeks to get some feedback.
[2] http://www.freedesktop.org/wiki/Software/systemd/Interfac...
