Version 2 of the kdbus patch set
No more device files
One of the biggest complaints about the first version was its use of device files to manage interaction with the system. Devices need to be named; that forced a hierarchical global naming system on kdbus domains — which were otherwise not inherently hierarchical. The global namespace imposed a privilege requirement, making it harder for unprivileged users to create kdbus domains; it also added complications for users wanting to checkpoint and restore containers.
The second version does away with the device abstraction, replacing it with a virtual filesystem called "kdbusfs." This filesystem will normally be mounted under /sys/fs/kdbus. Creating a new kdbus domain (a container that holds a namespace for one or more buses) is simply a matter of mounting an instance of this filesystem; the domain will persist until the filesystem is unmounted. Kdbus itself demands no special privileges to create a new domain, though the act of mounting a filesystem still requires privileges of its own.
A newly created domain will contain no buses at the outset. What it does have is a file called control; a bus can be created by opening that file and issuing a KDBUS_CMD_BUS_MAKE ioctl() command. That bus will remain in existence as long as the file descriptor for the control file is held open. Only one bus may be created on any given control file descriptor, but the control file can be opened multiple times to create multiple buses. The control file can also be used to create custom endpoints for well-known services.
Each bus is represented by its own directory underneath the domain directory; endpoints are represented as files within the bus directory. Connecting to a bus is a matter of opening the kdbusfs file corresponding to the desired endpoint; for most clients, that will be the file simply called bus. Messages can then be sent and received with ioctl() commands on the resulting file descriptor.
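The layout described above can be sketched in a few lines of Python. This is only an illustration of the path structure; the helper names and the "example-bus" directory name are made up for the example, and the mount point is simply the conventional location mentioned in the text.

```python
import posixpath

# Assumption: the conventional mount point named in the article.
KDBUSFS_ROOT = "/sys/fs/kdbus"


def control_path(domain_root=KDBUSFS_ROOT):
    """Path of the per-domain control file (hypothetical helper)."""
    return posixpath.join(domain_root, "control")


def endpoint_path(bus_dir, endpoint="bus", domain_root=KDBUSFS_ROOT):
    """Path of an endpoint file inside a bus directory (hypothetical helper).

    bus_dir is whatever directory name the bus has under the domain;
    the default endpoint is the file simply called "bus".
    """
    return posixpath.join(domain_root, bus_dir, endpoint)


# "example-bus" is an invented name, used only to show the shape of the path.
print(control_path())
print(endpoint_path("example-bus"))
```

A client would open the endpoint path and then issue ioctl() commands on the resulting file descriptor; those commands are not modeled here.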
As can be seen, the device abstraction is gone, but the interface is still somewhat device-like in that it is heavily based on ioctl() calls. There has been a small amount of discussion on whether it might make more sense to just use operations like read() and write() to interact with kdbus, but there appears to be little interest in making (or asking for) that sort of change.
Metadata issues
Another significant change is in the area of security. In version 1, the recipient of a message could specify a set of credential information that must accompany the message. That information can include anything from the process ID through to capabilities, command-line information, audit information, security IDs, and more. Some reviewers (Andy Lutomirski in particular) complained that this approach could lead to information leaks and, perhaps, worse security problems; instead, they said, the sender of a message should control the metadata that goes along with it.
The updated patch set responds to that request by changing the protocol. When a client connects to the bus, it runs the KDBUS_CMD_HELLO ioctl() command to set up a number of parameters for the connection; one of those parameters is now a bitmask describing which metadata can be sent with messages. The creator of the bus can still specify a minimum set of metadata that must go with messages, though; in that case, a client refusing to send that metadata will not be allowed to connect to the bus.
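The negotiation can be modeled as a simple bitmask check. The sketch below is a toy model in Python, not the kdbus interface: the flag names, the Bus class, and the hello() method are all invented for illustration, and the real constants and structures in the patch set may look quite different.

```python
from enum import IntFlag


class Meta(IntFlag):
    """Hypothetical metadata flags; NOT the real kdbus constants."""
    PID = 1
    CREDS = 2
    CMDLINE = 4
    CAPS = 8
    AUDIT = 16


class Bus:
    """Toy model of a bus whose creator may demand a minimum metadata set."""

    def __init__(self, required=Meta(0)):
        self.required = required

    def hello(self, offered):
        """Model of the connection handshake.

        The client states which metadata it is willing to send; the
        connection is refused if any required item is withheld.
        """
        missing = self.required & ~offered
        if missing:
            raise PermissionError("client withholds required metadata")
        # On success, the offered mask bounds what may accompany messages.
        return offered


bus = Bus(required=Meta.PID | Meta.CREDS)
print(bus.hello(Meta.PID | Meta.CREDS | Meta.AUDIT) == (Meta.PID | Meta.CREDS | Meta.AUDIT))
```

The point of the design is that the sender, not the receiver, decides what is disclosed; the bus creator's minimum set is the only thing a client cannot opt out of short of not connecting at all.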
There is still some disagreement over which metadata should be sent at all, whether optionally or not. Andy disagrees with providing command-line (and related) information, on the basis that it can be set by the process involved and thus carries no trustworthy information. This metadata is evidently used mostly for debugging purposes; Andy suggests that it should just be grabbed out of /proc instead. He is also opposed to the sending of capability information, noting that capabilities are generally problematic in Linux and their use should not be encouraged.
One other interesting bit of metadata that can be attached to messages is
the time that the sending process started executing. It is there to prevent
race conditions associated with the reuse of process IDs, which can happen
quickly on a busy system. Andy dislikes that approach, noting that it will
not work well with either namespaces or checkpointing. He prefers instead
his own "highpid" solution. This patch adds a second, 64-bit, unique number
associated with each process; interested programs can then detect process
ID reuse by seeing if that number changes. Eric Biederman disagreed with
that approach, saying "What we need are not race free pids, but a file
descriptor based process management api." Andy was not opposed to that
idea, but he would like to see something simple that can be of use to kdbus
now.
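The idea of pairing a process ID with a start time can be tried from user space today, since Linux exposes the start time as field 22 of /proc/<pid>/stat. The Python sketch below is only a user-space analogue of what the kdbus metadata is meant to provide; the function names are invented for the example, and it does nothing to address the namespace and checkpointing objections raised above.

```python
def parse_start_time(stat_line):
    """Return field 22 (starttime, in clock ticks since boot) of a
    /proc/<pid>/stat line."""
    # The command name (field 2) is parenthesized and may itself contain
    # spaces or ')' characters, so split after the last ')'.
    _, _, rest = stat_line.rpartition(")")
    fields = rest.split()
    # fields[0] is field 3 (state), so field 22 sits at index 19.
    return int(fields[19])


def process_identity(pid):
    """Return (pid, start_time), or None if the process is already gone."""
    try:
        with open(f"/proc/{pid}/stat") as f:
            return pid, parse_start_time(f.read())
    except (FileNotFoundError, ProcessLookupError):
        return None


def same_process(identity):
    """True if the PID still names the process recorded in identity.

    A recycled PID will show a different start time, so the pair no
    longer matches.
    """
    return identity is not None and process_identity(identity[0]) == identity
```

A caller would record process_identity(pid) when it first learns of a peer and call same_process() before acting on that PID later; a None result means the process had already exited by the time it was examined.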
Andy had a number of other comments, including pointing out a couple of
places where, he contended, he could use kdbus to gain root access on any
system where it was installed. Even so, he seems happy with the direction
the code is going, saying "And thanks for addressing most of the issues.
The code is starting to look much better to me."
Toward the mainline
In theory, resolving the remaining issues should be relatively
straightforward, though it is not hard to see the "highpid" idea running
into resistance at some point. But the number of reviewers for the second
kdbus posting has been relatively small, perhaps as a result of the
holidays in the US. The addition of a significant core API of this type
requires more attention than kdbus has gotten so far. That suggests that
there may still be significant issues that have not yet been raised by
reviewers. Kdbus is getting closer to mainline inclusion, but it may well
take a few more development cycles to get to a point where most developers
are happy with it.
Posted Dec 4, 2014 12:02 UTC (Thu)
by tomegun (guest, #56697)
> Andy disagrees with providing command-line (and related) information, on the basis that it can be set by the process involved and thus carries no trustworthy information. This metadata is evidently used mostly for debugging purposes; Andy suggests that it should just be grabbed out of /proc instead.
The main reason we can not simply pull useful, yet not trustworthy, information out of /proc is that in the case of short-lived processes the entries in /proc are likely gone by the time the receiver tries to collect them.