
Missing the AF_BUS

By Jonathan Corbet
July 3, 2012
The D-Bus interprocess communication mechanism has, over the years, become a standard component of the Linux desktop. For almost as long, developers have been trying to find ways to make D-Bus faster. The latest attempt comes in the form of a kernel patch set adding a new socket address family (called AF_BUS) to the networking layer. Significant performance improvements are claimed, but, like previous attempts, this one may have a hard time getting into the mainline kernel.

D-Bus implements a mechanism by which processes can send messages to each other. Multicast functionality is inherently a part of the protocol; one message can be sent to multiple recipients. D-Bus promises reliable delivery, where "reliable" means that messages arrive in the order in which they were sent and multicast messages will either be delivered to all recipients or, if that is not possible, to none. There is a security model built into the protocol whereby messages can be limited to specific recipients. All of these features are used by contemporary systems, which expect messaging to be robust and secure, with as little latency and overhead as possible.

The current D-Bus implementation uses Unix-domain sockets and a central routing daemon. It works, but the routing daemon adds context switches, overhead, and latency to each message it handles. The kernel is unable to help get high-priority messages delivered first, so all messages cause wakeups that slow down the processing of the most important ones; see this message for a description of how these problems can affect a running system. It has been evident for some time to the developers involved that a better solution must be found.

There have been a number of attempts in that direction. The previous time this topic came up, it was around a set of patches adding multicast capabilities to Unix-domain sockets. This idea was rejected with the claim that the Unix-domain socket code is already too complicated and there was not enough justification to make things worse by adding multicast capabilities. The D-Bus developers were told to simply use IPv4 sockets, which already have multicast support, instead.

What those developers actually did was to implement AF_BUS, a new address family designed to meet the needs of D-Bus. It provides the reliable delivery that D-Bus requires; it also has the ability to pass file descriptors and credentials from one process to another. The security mechanism is built in, with the netfilter code (augmented with a new D-Bus message parser) used to control which messages can actually be delivered to any specific process. The end result, it is claimed, is a significant reduction in D-Bus overhead due to reduced system calls; submitter Vincent Sanders claims "a doubling in throughput and better than halving of latency." See the associated documentation for details on how this address family works.
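For illustration, here is a minimal sketch of how a client might join a bus under this scheme. AF_BUS was never merged, so the address-family constant and the sockaddr_bus layout below are placeholders inferred from the patch set's documentation, not a stable kernel ABI.

#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define AF_BUS		38	/* placeholder; the patch assigns the real value */
#define BUS_PATH_MAX	108	/* assumed, mirroring Unix-domain sockets */

struct sockaddr_bus {		/* assumed layout, per the patch documentation */
	sa_family_t	sbus_family;		/* AF_BUS */
	uint64_t	sbus_addr;		/* peer address on the bus */
	char		sbus_path[BUS_PATH_MAX]; /* filesystem rendezvous point */
};

/* Join the bus at the given path; multicast and unicast traffic then
 * flow over this single descriptor via sendmsg()/recvmsg(). */
int join_bus(const char *path)
{
	struct sockaddr_bus addr;
	int fd = socket(AF_BUS, SOCK_SEQPACKET, 0);

	if (fd < 0)
		return -1;

	memset(&addr, 0, sizeof(addr));
	addr.sbus_family = AF_BUS;
	strncpy(addr.sbus_path, path, sizeof(addr.sbus_path) - 1);

	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}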

A factor-of-two improvement in a component that is widely used in Linux systems would certainly be welcome. The patch set, however, was not; networking maintainer David Miller immediately stated his intention to simply ignore the patch set entirely. His objections seem to be that IPv4 sockets are sufficient for the task and that reliable delivery of multicast messages cannot be done, even in the limited manner needed by D-Bus. He expressed doubts that the IPv4 approach had even been tried, and decreed: "We are not creating a full address family in the kernel which exists for one, and only one, specific and difficult user."

Vincent responded that a number of approaches have been tried and found wanting. IPv4 sockets cannot provide the needed delivery guarantees and do not allow for the passing of file descriptors and credentials. It is also important, he said, for D-Bus to be up and running before the networking subsystem has been configured; setting up IP interfaces on a contemporary system often requires communication over D-Bus. There really is no better solution, he said.

He found support from a few other developers, including Alan Cox, who pointed out that there is no shortage of interprocess communication systems out there with requirements similar to D-Bus:

In fact if you look up the stack you'll find a large number of multicast messaging systems which do reliable transport built on top of IP. In fact Red Hat provides a high level messaging cluster service that does exactly this. (as well as dbus which does it on the desktop level) plus a ton of stuff on top of that (JGroups etc)

Everybody at the application level has been using these 'receiver reliable' multicast services for years (Websphere MQ, TIBCO, RTPGM, OpenPGM, MS-PGM, you name it). There are even accelerators for PGM based protocols in things like Cisco routers and Solarflare can do much of it on the card for 10Gbit.

He added that latency concerns are paramount on contemporary systems and that one of the best ways of reducing latency is to cut back on context switches and middleman processes. Chris Friesen added that his company uses "an out-of-tree datagram multicast messaging protocol family based on AF_UNIX" that could almost certainly be replaced by something like AF_BUS, were AF_BUS to be added to the mainline kernel.

There have been various other local messaging patch sets posted over the years. So it seems clear that there is a significant level of interest in having this sort of capability built into the Linux kernel. But interest alone is not sufficient justification for the merging of a large patch set; there must also be agreement from the developers who are charged with ensuring that Linux has a top-quality networking stack in the long term. That agreement is not yet there, so there may be a significant amount of multicast interpersonal messaging required before we have multicast interprocess messaging in the kernel.




Missing the AF_BUS

Posted Jul 5, 2012 4:39 UTC (Thu) by hp (guest, #5220) [Link] (14 responses)

Do keep in mind that there are a number of avenues to improve D-Bus in userspace as well; I'm mystified why so much effort has gone into kernel changes and near zero into userspace:
http://lists.freedesktop.org/archives/dbus/2012-March/015...

but in any case the opportunity is there. My belief is that nobody is chasing this because for the vast majority of people D-Bus performance is not an actual problem. When I've asked for concrete examples of when it was a problem, things like Nokia N900 (iPhone 3G era hardware right?) come up, and poorly coded applications aren't ruled out and seem likely to be involved even in that case.

basically there is just no need to performance-tune the kind of stuff dbus is normally used for on a stock Linux desktop... if something is only 1% of user-visible speed, making it twice as fast isn't perceptible.

people do show up on the mailing list using dbus on low resource embedded systems and needing ultra low latency or something, but in those cases dbus was pretty clearly a poor choice of hammer for the nail at hand.

I don't think Alan is wrong though. the dbus semantics and guarantees that make it slow are also what make it convenient, and app developers generally want those guarantees and apps are less buggy if they have them. So it might be nice to make this genre of thing fast: even if the simple notifications etc. used by the desktop aren't performance critical, there are other domains that might benefit from ordered, reliable delivery, lifecycle tracking, etc. there's no question a faster implementation of dbus would be more broadly useful beyond just the desktop.

Missing the AF_BUS

Posted Jul 5, 2012 5:32 UTC (Thu) by hp (guest, #5220) [Link]

Also it's worth noting again that the design tradeoffs in dbus are almost all copied from X11. Messages between two processes are guaranteed to stay in the order sent, network errors are handled by just disconnecting and starting over, there's a central server that all the apps use to talk to each other, it's a binary protocol, there's a "selection" (bus name) concept used to acquire/release/locate resources, similar authentication mechanisms, server is located via environment variable, blocking round trips are discouraged (mechanisms are provided to avoid them), people use both almost exclusively on same-machine or at least a LAN, etc.

So most problems and solutions that apply to X11 will also apply to dbus.

I think the tradeoffs and guarantees made here are a pretty good guide to what desktop/mobile app developers want when they're writing a UI that's implemented as a "swarm of processes" (as all the X desktops are). Framed another way, this is what a local IPC system has to provide in order to support relatively reliable application code in this context. However, these tradeoffs are probably inappropriate for systems distributed over the internet or even over a cluster.

Based on dbus list traffic there seem to be development situations where similar tradeoffs make sense but the inherent slowdown of the central dispatch daemon is a problem. That's where kernel-accelerated dbus-like-thing would make sense maybe.

Missing the AF_BUS

Posted Jul 5, 2012 9:17 UTC (Thu) by kyllikki (guest, #4370) [Link]

Concerning your comments on userspace D-Bus improvements. Collabora employ one of the upstream D-Bus maintainers and pay for some of their time to work on D-Bus.

We most definitely are committed to improving the userspace side of D-Bus in addition to the kernel work (which was a project for the GENIVI alliance).

Our eventual aim, using all of these solutions together, is a tripling in throughput and a significant reduction in latency for the general case.

Missing the AF_BUS

Posted Jul 5, 2012 10:33 UTC (Thu) by smcv (subscriber, #53363) [Link] (1 responses)

> poorly coded applications aren't ruled out

On the system bus, which is a trust boundary, poorly- or even maliciously-coded applications can never be ruled out, unfortunately.

> in those cases dbus was pretty clearly a poor choice of hammer for the nail at hand

People consider D-Bus to be a suitable transport for all sorts of things, desktop or not. The first sentence of the specification describes it as "a system for low-latency, low-overhead, easy to use interprocess communication", which probably contributes to the view that it's the right hammer for every nail - in practice, its current design tradeoffs tend to prioritize "easy to use" over low-latency.

Improving its latency, and avoiding priority inversion between the dbus-daemon and its clients, certainly increases the number of situations where D-Bus can be used. They might not be necessary for "the desktop bus", but that's by no means the only thing people use D-Bus for.

Improving the kernel-level transport is orthogonal to improving the user-space part (of which message (de)serialization is indeed likely to be the lowest-hanging fruit), and there's no reason they can't both happen.

> the dbus semantics and guarantees that make it slow are also what make it convenient

I absolutely agree that the convenient semantics - multicast signals, total ordering, conventions for lifecycle tracking and so on - are what make D-Bus valuable, and if you're willing to sacrifice those convenient semantics for performance, that's a sign that D-Bus is not right for you. Having said that, given the constraints of those semantics, the more efficient the better, and AF_BUS proves that there is room for improvement.

Missing the AF_BUS

Posted Jul 5, 2012 13:01 UTC (Thu) by hp (guest, #5220) [Link]

> poorly- or even maliciously-coded applications can never be ruled out,
> unfortunately

What I meant here was, an app with lots of round trips in its protocol design or that shovels loads of data over the bus is going to be a perf problem. As a practical matter if you have user-visible situation xyz that appears slow, fixing dorky app behavior can be the fastest way to fix xyz.

> there's no reason they can't both happen

That's why I keep lobbying for the userspace changes to happen - a couple of them looked like they'd only take a few days of work. Hey, for all I know, someone did them already over the last few months. Anyway it's just bugging me (as you've no doubt gathered) that the kernel stuff is a kind of multi-year undertaking due to the difficult political issues, while the performance could be greatly improved without blocking on kernel devs...

So I'm just trying to give the potential userspace tasks some PR. Maybe someone reading these comments will want to work on them, we can dream...

(I know I'm telling those close to dbus things they already know. But it may not be apparent to those who aren't close to it that there's stuff they could do today.)

Missing the AF_BUS

Posted Jul 5, 2012 17:15 UTC (Thu) by smurf (subscriber, #17840) [Link] (7 responses)

Apparently people now try to do some things across DBus that require a higher data rate and lower latency than the traditional "huh, somebody plugged in a USB stick / wants root for aptitude/yum / left for lunch" events DBus handled in the past.

Newfangled stuff like multi-touch. Or keyboard input (for Chinese and whatnot).

You don't want that stuff to go through more context switches (and processes) than strictly necessary. So AF_BUS seems to be a Good Thing.

Missing the AF_BUS

Posted Jul 5, 2012 19:26 UTC (Thu) by iabervon (subscriber, #722) [Link] (6 responses)

It seems to me like the right solution for a lot of the low-latency or high-throughput applications is actually to send a socketpair endpoint over DBus, rather than using DBus for the actual data. Provided, of course, that that works. If nothing else, it would be very much worthwhile to save the packet framing when you have a logical data stream, and it's hard to beat a scheme where your Chinese user input can be turned into stdin with nothing more than dup2 and inherited by child processes.
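A minimal sketch of the underlying mechanism: handing one end of a socketpair() to a peer over an already-connected Unix-domain socket with SCM_RIGHTS (D-Bus exposes the same capability through its UNIX_FD message type); error handling is elided.

#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Send one file descriptor (e.g. a socketpair() end) to the peer on
 * an already-connected Unix-domain socket.  The one-byte payload is
 * required because control data cannot be sent on its own. */
int send_fd(int unix_sock, int fd_to_pass)
{
	char dummy = 'x';
	struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
	union {
		struct cmsghdr align;
		char buf[CMSG_SPACE(sizeof(int))];
	} u;
	struct msghdr msg = {
		.msg_iov = &iov, .msg_iovlen = 1,
		.msg_control = u.buf, .msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type  = SCM_RIGHTS;
	cmsg->cmsg_len   = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd_to_pass, sizeof(int));

	return sendmsg(unix_sock, &msg, 0) < 0 ? -1 : 0;
}

The receiver does the corresponding recvmsg() and pulls the new descriptor out of the control message; from then on the two processes have a private channel that bypasses the daemon entirely.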

Missing the AF_BUS

Posted Jul 5, 2012 19:53 UTC (Thu) by hp (guest, #5220) [Link]

Yep. You can even use the dbus protocol over that socketpair. libdbus could probably provide some nice convenience API to facilitate this where you'd basically get a DBusConnection back from a method call to the remote service.

The downside is mostly that it's a fair bit more work for apps to do stuff like this. Services don't necessarily need to track "registered clients" right now but with this kind of setup they have to, in addition to dealing with the raw sockets and other extra work.

A lot of the discussion of speeding up dbus is motivated by trying to make the easy thing work well for apps, instead of requiring app authors to sort out these tradeoffs.

Especially with the higher-level API in say glib's dbus support, though, it might be possible to near-automatically put certain objects on dedicated sockets. Just a matter of programming...

Missing the AF_BUS

Posted Jul 6, 2012 5:51 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

It would be nice to also be able to use DBUS over the network. If you stick to the DBUS-only protocol, it's not a problem.

But the minute you start using socketpairs, it becomes impossible.

Missing the AF_BUS

Posted Jul 6, 2012 6:24 UTC (Fri) by smurf (subscriber, #17840) [Link] (3 responses)

socketpair-ing the sender and recipient would probably work, but it has a few downsides.

* you lose the easy debugging and monitoring you now get with DBus (presumably, with AF_BUS you could use something like wireshark),

* the client now has to juggle multiple file descriptors, which requires an API change

* multiple file descriptors and reliable message ordering don't mix

Too many downsides if you ask me.

Missing the AF_BUS

Posted Jul 6, 2012 6:58 UTC (Fri) by Fowl (subscriber, #65667) [Link] (1 responses)

Surely libdbus could handle this all automatically?

Missing the AF_BUS

Posted Jul 6, 2012 16:18 UTC (Fri) by smurf (subscriber, #17840) [Link]

The multiple-file-descriptors problem is the killer:

dbus_connection_get_unix_fd() returns exactly one file descriptor. If you open a direct connection to some application, you have more than one file descriptor. How do you make your application select()/poll() on both (or more) of these?
Admittedly, on second thought, you could do it with epoll(). But it's still a change in semantics (you can't read from that file descriptor; worse, you can't write to it).

How would you propose to handle the monitoring problem? Let the daemon send a "somebody's listening for messages X, so if you exchange any of those privately, kindly send me a copy" commands to each and every client? Owch.

I'm not saying this cannot be done. I'm saying it's heaps more work, and more fragile, than simply moving the main chunk of this into the kernel, especially since there's already code which does that. And code is more authoritative than English.
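In that spirit, a minimal sketch of the epoll() approach mentioned above; the function and descriptor names are hypothetical and most error handling is elided.

#include <sys/epoll.h>

/* Register both the dbus-daemon connection and a direct peer
 * connection with one epoll instance; the caller then loops on
 * epoll_wait() and dispatches whichever descriptor is ready. */
int make_epoll_for(int dbus_fd, int direct_fd)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int epfd = epoll_create1(0);

	if (epfd < 0)
		return -1;

	ev.data.fd = dbus_fd;
	epoll_ctl(epfd, EPOLL_CTL_ADD, dbus_fd, &ev);

	ev.data.fd = direct_fd;
	epoll_ctl(epfd, EPOLL_CTL_ADD, direct_fd, &ev);

	return epfd;
}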

Missing the AF_BUS

Posted Jul 6, 2012 17:32 UTC (Fri) by iabervon (subscriber, #722) [Link]

I think I was unclear: I'm not suggesting that DBus do some sort of socketpair thing to provide high-speed dbus links; I'm suggesting that use cases that don't really care about DBus, but need to negotiate with another program for their input, could get back to their normal operation (input from a fd) by having the other side send a fd back when it's all set up. That is, user input doesn't normally go over DBus, so applications are already designed around a non-DBus input path; even if you need IPC to set up your input device appropriately, it doesn't make too much sense to get user input over DBus, particularly if you can't get corresponding user input over DBus if the system doesn't have Chinese input or multi-touch.

Of course, it's certainly possible that people will want high-speed IPC with DBus properties also, and it makes sense for DBus to be efficient regardless of whether it's running into performance constraints. But it doesn't make sense to use DBus for all communication, even if its performance could be made good enough.

Missing the AF_BUS

Posted Jul 26, 2012 22:41 UTC (Thu) by oak (guest, #2786) [Link] (1 responses)

> When I've asked for concrete examples of when it was a problem, things like Nokia N900 (iPhone 3G era hardware right?) come up

One of the worst issues is D-BUS message delivery reliability. All it needs is an app that subscribes for some frequent message (like device orientation) and then doesn't read its messages, either because it was suspended or is just buggy. As message delivery needs to be reliable, D-BUS will then just buffer the messages and get slower and slower as it starts to swap.

The second issue is overly complicated D-BUS setups. I think e.g. the N900 call handling goes through half a dozen daemons before the call UI pops up. Each of these steps adds its own socket buffering and process scheduling overhead in addition to other overheads (e.g. paging the processes into RAM if they were swapped out, etc.).

Then there's the D-BUS daemon code itself. Ever wondered why something that's "just" supposed to read and write data from sockets is CPU-bound instead of IO-bound? The D-BUS daemon spends a lot of CPU on message content marshaling.

Missing the AF_BUS

Posted Jul 26, 2012 22:57 UTC (Thu) by hp (guest, #5220) [Link]

the tradeoffs in the first issue can be configured in the config file; it can throw errors when the buffer reaches whatever size you like. there are also some list/bug discussions of other behaviors that could be useful to support.
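For reference, a sketch of the relevant knobs in a dbus-daemon busconfig file; these limit names exist in the daemon's configuration schema, but the values here are illustrative, not recommendations.

<busconfig>
  <!-- Cap per-connection queues so a stuck client gets an error (or
       is disconnected) instead of making the daemon buffer forever;
       values here are illustrative only. -->
  <limit name="max_incoming_bytes">1000000</limit>
  <limit name="max_outgoing_bytes">1000000</limit>
  <limit name="max_message_size">32768</limit>
</busconfig>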

the second issue is not dbus's fault. that kind of thing is often from making a daemon when a library would be better. it's a bug in the app design.

the third issue I've mentioned repeatedly myself including in the threads I linked before.

but none of these three things are concrete examples of user-visible operations. in most real-world cases apps get away with all three of these problems and it isn't perceptible. n900 is the most often mentioned case where they don't, and if you're correct here, N900 has at least one really bad setup with half a dozen daemons.

Missing the AF_BUS

Posted Jul 5, 2012 4:55 UTC (Thu) by alonz (subscriber, #815) [Link] (17 responses)

I can't help but feel David Miller's response is a tad hypocritical :(

After all, he was the one who practically shoved a new address family (AF_ALG) down the throats of the community as a “solution” to connecting kernel crypto with userspace—said solution being such a poor fit that it isn't being used anywhere, but stifling any opportunity to integrate an actually suitable solution.

(Yes, this is my personal hurtful spot, and I am grumpy.)

Missing the AF_BUS

Posted Jul 5, 2012 5:47 UTC (Thu) by daniel (guest, #3181) [Link] (16 responses)

Dave seems to be labouring under the misapprehension that his TCP stack is efficient. It isn't. It is a big, rambling, inefficient pile of spaghetti.

Missing the AF_BUS

Posted Jul 5, 2012 6:00 UTC (Thu) by alonz (subscriber, #815) [Link] (5 responses)

In Dave's defense, I will note that the Linux TCP stack does appear to be extremely efficient compared to other OSes… It's just not always the perfect hammer for the screws you may be using.

Missing the AF_BUS

Posted Jul 5, 2012 6:10 UTC (Thu) by daniel (guest, #3181) [Link] (4 responses)

Then the others must suck even more, but the Linux TCP stack still sucks.

Missing the AF_BUS

Posted Jul 5, 2012 17:10 UTC (Thu) by jond (subscriber, #37669) [Link] (3 responses)

It's the quality of discourse that keeps me coming back to LWN.

Missing the AF_BUS

Posted Jul 7, 2012 1:40 UTC (Sat) by daniel (guest, #3181) [Link] (2 responses)

Good point. But what exactly is the correct technical response to an argument of the form "it's ok if we suck because somebody else sucks even worse"? Never mind that that premise is stated without support, whereas the premise that the Linux TCP stack is a big pile of spaghetti is easily verified.

Missing the AF_BUS

Posted Jul 13, 2012 5:43 UTC (Fri) by Tov (subscriber, #61080) [Link]

Easy! Instead of waving your hands and presenting your unfounded opinion, present some facts...

Missing the AF_BUS

Posted Jul 15, 2012 7:55 UTC (Sun) by philomath (guest, #84172) [Link]

How easy? Can you just give me a starter, please?

Missing the AF_BUS

Posted Jul 5, 2012 18:26 UTC (Thu) by josh (subscriber, #17465) [Link] (9 responses)

Would you mind providing (links to) more information, for people interested in learning about the purported inefficiencies in Linux's TCP stack?

Missing the AF_BUS

Posted Jul 6, 2012 15:03 UTC (Fri) by pspinler (subscriber, #2922) [Link] (1 responses)

I'm not sure whether linux's tcp stack is inefficient compared to other tcp stacks, but the networking stack certainly is complex and multi-layered. Consider all the basic tcp protocol code (reliability, packet fragmentation and reassembly, etc.), then layer netfilter on top and routing logic and the ip stack underneath, and it's easy to construct packets that traverse quite long code paths.

Certainly all that complexity can't be great for performance.

It's the argument I make for fibre channel v. iscsi. It's true that iscsi hardware (being just standard networking stuff) is a lot cheaper and does the job 90-95% of the time. But in the edge case, especially w.r.t. latency, fibre still wins, largely because it's simple in comparison.

-- Pat

Missing the AF_BUS

Posted Jul 9, 2012 2:35 UTC (Mon) by raven667 (subscriber, #5198) [Link]

> Certainly all that complexity can't be great for performance.

That's something worth testing, scientifically.

> It's the argument I make for fibre channel v. iscsi. It's true that iscsi hardware (being just standard networking stuff) is a lot cheaper and does the job 90-95% of the time. But in the edge case, especially w.r.t latency, fibre still wins, largely because it's simple in comparison.

One thing about this example that I would like to point out: FC implements many of the features of Ethernet and TCP/IP ... differently, so in that sense the complexity is at least comparable, though probably not equal. As far as implementation complexity goes, I think FC gets off easier because, as a practical matter, it is used in closed networks, often with all components from the same vendor. Ethernet and TCP/IP have to deal with far more varied equipment and networks and have to be battle-tested against _anything_ happening; all that extra implementation complexity has a real reason for being there.

Missing the AF_BUS

Posted Jul 9, 2012 6:02 UTC (Mon) by daniel (guest, #3181) [Link] (6 responses)

I'll have to tell you about it, because the actual code is buried deep in somebody's trading engine and they would likely take issue with me posting it on the web. Profiling turned up some really bad CPU bumps in places you would not immediately suspect, like UDP send, which was taking nearly a microsecond per packet more than it should. I thought there would actually be some deep reason for that, but when I dug in I found that the reason was just sloppy, rambling code, pure and simple. I straightened it all out and cut the CPU overhead in half, consequently reducing the hop latency by that amount. I went on to analyze the rest of the stack to some extent and found it was all like that. You can too, all you need to do is go look at the code.

Here's a lovely bit:

http://lxr.linux.no/#linux+v3.4.4/net/ipv4/tcp_output.c#L796

This is part of a call chain that goes about 20 levels deep. There is much worse in there. See, that stuff looks plausible and if you listen to the folklore it sounds fast. But it actually isn't, which I know beyond a shadow of a doubt.

Missing the AF_BUS

Posted Jul 9, 2012 6:53 UTC (Mon) by daniel (guest, #3181) [Link] (3 responses)

Here's a better example:

http://lxr.linux.no/#linux+v3.4.4/net/ipv4/ip_output.c#L799

This code just kills efficiency by a thousand cuts. There is no single culprit, it is just that all that twisting and turning, calling lots of little helpers and layering everything through an skb editing API that successfully confuses the optimizer adds up to an embarrassing amount of overhead. First rule to remember? Function calls are not free. Not at the speeds networks operate these days.

Missing the AF_BUS

Posted Jul 9, 2012 8:18 UTC (Mon) by nix (subscriber, #2304) [Link] (1 responses)

Actually, predicted function calls *are* nearly free on modern CPUs. Of course, function calls stuck deep inside conditionals are less likely to be successfully predicted as taken -- and unpredicted/mispredicted function calls (like all other mispredicted, non-speculated branches) are expensive as hell. However, these days I don't believe there is much more reason to be concerned about function calls than there is to be concerned about any other conditional. (Specialists in deep x86 lore, which I am very much not and who I am merely reiterating from dim and vague memory, are welcome to contradict me, and probably will!)

Missing the AF_BUS

Posted Jul 9, 2012 23:06 UTC (Mon) by daglwn (guest, #65432) [Link]

The call is cheap. The saving/restoring of registers and lost optimization opportunities are not.

Missing the AF_BUS

Posted Jul 9, 2012 18:40 UTC (Mon) by butlerm (subscriber, #13312) [Link]

>This code just kills efficiency by a thousand cuts. There is no single culprit, it is just that all that twisting and turning, calling lots of little helpers...

Much of the complexity of that function has to do with kernel support for fragmented skbs, which is required for packets that are larger than the page size. That is the sort of thing that would go away if the kernel adopted a kernel page size larger than the hardware page size in cases where the latter is ridiculously small.

I am not sure what the real benefits of managing everything in terms of 4K pages are on a system with modern memory sizes. Perhaps the idea of managing everything in terms of 64K pages (i.e. in groups of 16 hardware pages) could be revisited. That would dramatically simplify much of the networking code, because support for fragmented skbs could be dropped. No doubt it would have other benefits as well.

Missing the AF_BUS

Posted Jul 9, 2012 9:11 UTC (Mon) by gioele (subscriber, #61675) [Link]

> I straightened it all out and cut the CPU overhead in half, consequently reducing the hop latency by that amount. I went on to analyze the rest of the stack to some extent and found it was all like that. You can too, all you need to do is go look at the code.

> This is part of a call chain that goes about 20 levels deep. There is much worse in there. See, that stuff looks plausible and if you listen to the folklore it sounds fast. But it actually isn't, which I know beyond a shadow of a doubt.

Don't you have some notes, implementation ideas, or performance tests that you want to share with the rest of the kernel community? I'm pretty sure that they would love to hear how to cut the CPU overhead of UDP messages in half without regressions in functionality.

This kind of impact would surely reduce the battery consumption of mobile applications, so even if the mainline developers are not interested, devs of mobile-oriented forks like Android surely will be.

Missing the AF_BUS

Posted Jul 9, 2012 20:26 UTC (Mon) by butlerm (subscriber, #13312) [Link]

I should add that fragmented skbs are used for zero copy support too, so if the idea is to simplify the networking stack by dropping them, zero copy would be out. On the other hand, zero copy seems to be usable for sendfile() and not much else, so that doesn't sound like much of a loss if it improves the much more common case.

Missing the AF_BUS

Posted Jul 5, 2012 5:09 UTC (Thu) by alonz (subscriber, #815) [Link]

I wonder whether this effort can somehow be made useful for a saner version of the Android binder code as well…

Missing the AF_BUS

Posted Jul 5, 2012 10:50 UTC (Thu) by jezuch (subscriber, #52988) [Link] (2 responses)

> and that reliable delivery of multicast messages cannot be done, even in the limited manner needed by D-Bus

I, too, was wondering how they (D-Bus) [expect to] achieve this...

Missing the AF_BUS

Posted Jul 5, 2012 13:11 UTC (Thu) by hp (guest, #5220) [Link]

People probably define some of the words (maybe one or more of "reliable delivery", "limited manner", "multicast") differently, and if they sat down to hash it out this debate would come down to people talking about different things.

Missing the AF_BUS

Posted Jul 5, 2012 23:52 UTC (Thu) by Tester (guest, #40675) [Link]

In kernel dbus, when the sender wants to send a message to N recipients, it first checks that there is space in the queue for each recipient. If there is, it puts the message there. If there isn't, then it returns an error. It's that simple.
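A sketch of that all-or-nothing check; the types and helpers are illustrative, not the actual AF_BUS code, which would also have to hold the appropriate locks while doing this.

#include <errno.h>
#include <stddef.h>

struct recipient {
	size_t queue_free;	/* space left in this receiver's queue */
};

/* All-or-nothing multicast: refuse the whole send unless every
 * recipient queue can take the message. */
int bus_multicast(struct recipient *rcpt[], int n, size_t msg_len)
{
	int i;

	for (i = 0; i < n; i++)
		if (rcpt[i]->queue_free < msg_len)
			return -EAGAIN;		/* deliver to all or to none */

	for (i = 0; i < n; i++)
		rcpt[i]->queue_free -= msg_len;	/* enqueue a copy here */

	return 0;
}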

Missing the AF_BUS

Posted Jul 6, 2012 4:45 UTC (Fri) by mgalgs (guest, #85461) [Link]

Couldn't it still be a kernel module that dbus could use optionally?

Kernel module

Posted Jul 6, 2012 12:42 UTC (Fri) by cesarb (subscriber, #6266) [Link]

> It works, but the routing daemon adds context switches, overhead, and latency to each message it handles.

Why not instead convert the dbus daemon into a kernel module, as was done in the past with the http daemon? It would avoid having to context switch to and from the daemon, and need no changes to the networking subsystem.

Note: I am joking.

Missing the AF_BUS

Posted Jul 10, 2012 9:59 UTC (Tue) by Hugne (guest, #82663) [Link]

"so there may be a significant amount of multicast interpersonal messaging required before we have multicast interprocess messaging in the kernel."

It's there already, with reliable delivery: modprobe tipc.

It does not pass SCM_RIGHTS or FDs, but a patchset that does this for node-local TIPC messaging would probably gain more acceptance than a new address family.

I asked on netdev if they had considered this, but I never saw a reply explaining why they didn't choose it.
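For the curious, a sketch of what reliable datagram multicast looks like with TIPC (in the mainline kernel since 2.6.16); the service type is an arbitrary example and error handling is abbreviated.

#include <string.h>
#include <sys/socket.h>
#include <unistd.h>
#include <linux/tipc.h>

/* Multicast one reliable datagram to every socket bound within a
 * TIPC name range.  The service type (4711) is an arbitrary example. */
int tipc_multicast(const void *payload, size_t len)
{
	struct sockaddr_tipc dst;
	int rc, fd = socket(AF_TIPC, SOCK_RDM, 0);	/* reliable datagram */

	if (fd < 0)
		return -1;

	memset(&dst, 0, sizeof(dst));
	dst.family = AF_TIPC;
	dst.addrtype = TIPC_ADDR_NAMESEQ;
	dst.addr.nameseq.type = 4711;		/* example service type */
	dst.addr.nameseq.lower = 0;
	dst.addr.nameseq.upper = ~0u;		/* the whole instance range */

	rc = sendto(fd, payload, len, 0,
		    (struct sockaddr *)&dst, sizeof(dst));
	close(fd);
	return rc;
}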

Missing the AF_BUS

Posted Jul 10, 2012 10:55 UTC (Tue) by nhippi (subscriber, #34640) [Link]

It is rather depressing that the kernel people refuse to include in-kernel support for either of the extremely widely used Linux IPC systems: binder and d-bus.

Missing the AF_BUS

Posted Jul 12, 2012 5:28 UTC (Thu) by slashdot (guest, #22014) [Link] (4 responses)

Why is a bus even needed?

Make it so that the server creates a UNIX socket with the same name it wants to take, and the client connects to it.

One of those could be an enumeration/activation/etc. server (but not a message router!).

For multicast, do the same and connect to all message broadcasters, using inotify to notice when new ones come up; the publisher just sends to all connected clients.
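A sketch of the discovery side of that scheme, watching a directory of per-publisher sockets with inotify; the directory layout is hypothetical and error handling is elided.

#include <stdio.h>
#include <sys/inotify.h>
#include <unistd.h>

/* Watch a directory of per-publisher sockets and report new ones as
 * they appear; a real subscriber would connect() to each new socket. */
void watch_publishers(const char *dir)
{
	char buf[4096]
		__attribute__((aligned(__alignof__(struct inotify_event))));
	ssize_t n;
	int fd = inotify_init1(IN_CLOEXEC);

	inotify_add_watch(fd, dir, IN_CREATE);

	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		char *p = buf;

		while (p < buf + n) {
			struct inotify_event *ev = (struct inotify_event *)p;

			if (ev->len)
				printf("new publisher socket: %s/%s\n",
				       dir, ev->name);
			p += sizeof(*ev) + ev->len;
		}
	}
}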

ZeroMQ can automate most of this, if desired.

The only kernel support that might be needed is support for having unlimited size Unix socket buffers and charging that memory to the receiver, so that the OOM killer/rlimit/etc. kills a non-responsive multicast receiver rather than the sender.

A more sophisticated solution that doesn't duplicate the socket buffer for each subscriber would be even better, but probably doesn't matter for normal usage cases.

Alternatively, get rid of signals, and instead have a key/value store abstraction where you can subscribe to value updates: this way, if the buffer space is full, you can just send an "overflow" packet and the client manually asks for the values of all its watched keys.

Missing the AF_BUS

Posted Jul 12, 2012 5:57 UTC (Thu) by michelp (guest, #44955) [Link] (3 responses)

There is an effort to put ZeroMQ in the kernel as well.

https://github.com/250bpm/linux/

Missing the AF_BUS

Posted Jul 12, 2012 6:31 UTC (Thu) by michelp (guest, #44955) [Link] (2 responses)

I guess I should point out that it's also going about it by implementing a protocol family similar to AF_BUS. Why can't we have d-bus and more? There are plenty of useful patterns out there and many available protocol family constants. If it's well tested, performant, and cleanly coded, then why reject it? Isn't that the purpose of having extensible protocol families? It's not like it's going to screw up any existing code.

Missing the AF_BUS

Posted Jul 12, 2012 6:57 UTC (Thu) by neilbrown (subscriber, #359) [Link] (1 responses)

Don't under-estimate the maintenance burden of adding more code.

It is true that we seem to add filesystems with gay abandon, so maybe a similar case could be made for address families...

The reason that I would avoid adding multiple address families for IPC is that someone would want a mix of features from one and features from another (e.g. multicast and fd passing from AF_INET and AF_UNIX). So would we add yet another one that does both?

Missing the AF_BUS

Posted Jul 12, 2012 15:02 UTC (Thu) by michelp (guest, #44955) [Link]

> Don't under-estimate the maintenance burden of adding more code.

Can you give me an example of who is burdened by what exactly in this case?

> The reason that I would avoid adding multiple address families for IPC is
> that someone would want a mix of features from one and features from
> another (e.g. multicast and fd passing from AF_INET and AF_UNIX). So
> would we add yet another one that does both?

That seems like a speculative reason to reject existing and well established software patterns like d-bus, that are correctly leveraging a well established extension mechanism for adding new protocol families. Again, if it wasn't meant to be extended, then why have protocol families at all? Why was the 'sk is first member of the struct' pattern so well thought out from the beginning? It was done this way to provide ways for the mechanism to grow and evolve.


Copyright © 2012, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds