The kdbuswreck

The kdbuswreck

Posted Apr 27, 2015 11:24 UTC (Mon) by zyga (subscriber, #81533)
In reply to: The kdbuswreck by bandrami
Parent article: The kdbuswreck

Man, if only human beings would agree on one specific way to use all of those features, so that applications had an interoperable way of talking to each other. If only that specification got widely implemented and saw massive usage in all environments. Then we could see whether we could put some of it into the kernel, so that one process doesn't have to be the central point of contention. If only someone had proposed some kernel patches that implement this.


The kdbuswreck

Posted Apr 27, 2015 20:44 UTC (Mon) by flussence (guest, #85566) [Link]

There's no need for sarcasm; your point is agreeable. X11 *has* improved drastically since most of the X server's functions were moved into the kernel.

The kdbuswreck

Posted Apr 28, 2015 0:04 UTC (Tue) by luto (subscriber, #39314) [Link] (10 responses)

One thing I've learned about software development: never assume that your performance sucks for the reason that you think it sucks. It sure seems obvious that dbus is slow because there's a single process that's a central point of contention. Too bad this doesn't appear to be the case [1] [2].

I can think of a couple reasons that the kernel might be slower than it ought to be for workloads like dbus-daemon. I fixed one of them in 3.16 (it affected me, too). All of this stuff is so far down in the noise, though, that I don't think it's even worth trying to optimize any of the kernel's part yet.

[1] http://lkml.kernel.org/g/CA+55aFxRa3mwL-17hUuUGpjCeGJXseG...
[2] http://lkml.kernel.org/g/CALCETrWLTLqZ0pioOEHakd_S+h=F1X2...

The kdbuswreck

Posted Apr 28, 2015 0:35 UTC (Tue) by dlang (guest, #313) [Link] (1 responses)

The first rule of optimization: measure first and find your bottleneck.

The kdbuswreck

Posted Apr 28, 2015 0:46 UTC (Tue) by jspaleta (subscriber, #50639) [Link]

I thought the first rule of optimization was not to talk about optimization.

The kdbuswreck

Posted Apr 28, 2015 6:42 UTC (Tue) by zyga (subscriber, #81533) [Link] (4 responses)

The one thing that I think you may be missing is that kdbus-based dbus doesn't do much at all in the userspace daemon. It is the current design that puts all of the overhead in the one userspace process. With the kernel-based version half of the overhead is removed outright (A->server->B->server->A becomes A->B->A in the common case).

Secondly, AFAIR, the current dbus daemon gets penalized by fair kernel scheduling. That issue goes away with kdbus. Lastly I think that it's pretty clear that kdbus unlocks a whole new level of performance with code based on memfd that current dbus doesn't use.

Still, the threads you've referenced are interesting and I need to read more into them to understand how kdbus-based changes apply to them.
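
For anyone who wants to put rough numbers on the relay topology discussed above (A->server->B->server->A versus A->B->A), here is a minimal sketch in Python. It is only an illustration under loud assumptions: the peers are threads in one process talking over AF_UNIX socket pairs, the 64-byte message size is arbitrary, and the names (`echo`, `relay`, `mean_round_trip`) are made up for this example. It says nothing about how dbus-daemon or kdbus actually behave; it only shows one way to measure before guessing.

```python
import socket
import threading
import time

MSG_SIZE = 64  # arbitrary fixed message size for this sketch


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the socket")
        buf += chunk
    return bytes(buf)


def echo(sock, count):
    """Stand-in for service B: send every request straight back."""
    for _ in range(count):
        sock.sendall(recv_exact(sock, MSG_SIZE))


def relay(client_side, server_side, count):
    """Stand-in for a central daemon: forward the request, then the reply."""
    for _ in range(count):
        server_side.sendall(recv_exact(client_side, MSG_SIZE))
        client_side.sendall(recv_exact(server_side, MSG_SIZE))


def mean_round_trip(count, relayed):
    """Return mean seconds per round trip, direct or through the relay."""
    client, peer = socket.socketpair()
    socks = [client, peer]
    workers = []
    if relayed:
        relay_out, server = socket.socketpair()
        socks += [relay_out, server]
        workers.append(
            threading.Thread(target=relay, args=(peer, relay_out, count))
        )
    else:
        server = peer
    workers.append(threading.Thread(target=echo, args=(server, count)))
    for w in workers:
        w.start()

    msg = b"x" * MSG_SIZE
    start = time.perf_counter()
    for _ in range(count):
        client.sendall(msg)
        recv_exact(client, MSG_SIZE)
    elapsed = time.perf_counter() - start

    for w in workers:
        w.join()
    for s in socks:
        s.close()
    return elapsed / count


if __name__ == "__main__":
    n = 2000
    print(f"direct : {mean_round_trip(n, False) * 1e6:.1f} us per round trip")
    print(f"relayed: {mean_round_trip(n, True) * 1e6:.1f} us per round trip")
```

Because the peers here are Python threads rather than separate processes, the absolute numbers are not representative; the point is only that the comparison is cheap to run before arguing about where the time goes.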

The kdbuswreck

Posted Apr 28, 2015 7:07 UTC (Tue) by luto (subscriber, #39314) [Link]

I realize that a dbus-like design (central daemon relaying messages) will probably take a performance hit due to context switches and copies. However, there's no reason that a synchronous method call should need 15 context switches, nor is there any reason that dbus couldn't use memfd for large messages.

Regardless, this particular dbus benchmark is so incredibly slow that none of this explains it, and kdbus is apparently only twice as fast. I'm not sure what the problem is, but it's not the scheduler or the fact that there's a central daemon.

IOW, yes, kdbus is in principle twice as fast as a dbus-like design. But dbus is several hundred times slower than it should be. Let's fix that first before quibbling over the other factor of two by moving some or all of it into the kernel.

The kdbuswreck

Posted Apr 28, 2015 14:15 UTC (Tue) by granquet (guest, #60931) [Link]

>Secondly, AFAIR, the current dbus daemon gets penalized by fair kernel scheduling. That issue goes away with kdbus. Lastly I think that it's pretty clear that kdbus unlocks a whole new level of performance with code based on memfd that current dbus doesn't use.

yes, I concur here.
The switch to the CFS broke some use cases at the place I was working at that time.

but probably, those use cases were a bit stupid ;)

The kdbuswreck

Posted Apr 30, 2015 16:01 UTC (Thu) by ksandstr (guest, #60862) [Link] (1 responses)

>Secondly, AFAIR, the current dbus daemon gets penalized by fair kernel scheduling. That issue goes away with kdbus.

The issue should've gone away when priority inheritance was mooted for AF_UNIX to support lower latency in Xorg: the scheduler should've been altered to also select the previous process' IPC peer ("partner") to run until the client's wakeup condition was satisfied, and then return to the client immediately. This would've made a closed wait over AF_UNIX equivalent to a syscall, some thousands of clock cycles notwithstanding.

It's my opinion that a transitive form of partner scheduling and priority inheritance would've made a userspace DBus daemon near-transparent from a performance point-of-view, were a sufficient "counts as partner call" boundary possible to distinguish from the many states and forms of I/O sleep found in Unix. However, today, instead of a relatively simple and well-defined primitive behaviour (and perhaps a tiny control API to manage it), we have 10_000 lines of lennartware being pushed for inclusion -- and not in staging like Android's "binder", either.

And before someone else in our little peanut gallery chimes in about priority inheritance: while that is necessary for a well-performing IPC architecture, it's insufficient a solution to the whole of the latency issue because rather than re-using the abstract scheduling decision that made a client process run in the first place, it only elevates the recipient's priority. A scheduler may well schedule an unrelated process in the server's (elevated) priority band, for example. The inheritance mechanism's interactions with scheduling quantums (the server's? the client's? at what priority? for how long?) and its teardown conditions have also remained poorly defined, which suggests that these issues just cold-up aren't being considered.

Finally, to not call "lennartware" without justification, and based on the considerations above, it's my prediction that if kdbus is merged, there'll be a span of two to six years immediately afterward at the end of which that which remains of kdbus-2015 will not be a net loss to its applications anymore, as with PulseAudio and Avahi before that.

The kdbuswreck

Posted May 4, 2015 7:58 UTC (Mon) by dgm (subscriber, #49227) [Link]

> the scheduler should've been altered to also select the previous process' IPC peer ("partner") to run until the client's wakeup condition was satisfied, and then return to the client immediately. This would've made a closed wait over AF_UNIX equivalent to a syscall, some thousands of clock cycles notwithstanding.

Hear! Hear!

This has the potential to make requesting services from a daemon (any daemon) much more efficient. Everything from web servers to desktop environments could benefit. Just think about how many daemons are constantly running in any typical desktop (answer: dozens!)

One has to wonder why something like this doesn't exist yet?

The kdbuswreck

Posted Apr 28, 2015 8:23 UTC (Tue) by paulj (subscriber, #341) [Link] (2 responses)

I wish LWN had a "+1 Awesome" button.

Kdbus looks like the mother of all premature optimisation from this.

The kdbuswreck

Posted Apr 29, 2015 16:35 UTC (Wed) by Uraeus (guest, #33755) [Link] (1 responses)

If that is your takeaway from this article I think you probably suffer from confirmation bias :)

The kdbuswreck

Posted Apr 29, 2015 16:57 UTC (Wed) by paulj (subscriber, #341) [Link]

Quite possible. ;)

To be honest, I'd prefer if this was done with something more generic, i.e. multi-listener AF_UNIX-like and whatever new SCM_CRED stuff needed to support authentication, so that it could benefit not just DBus but also whichever IPC system ends up replacing DBus.

Attaching and exposing kernel capabilities to sockets by default in a new API definitely sounds scary!
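
For reference, the existing AF_UNIX mechanism that any "new SCM_CRED stuff" would build on looks roughly like this from userspace. This is a sketch assuming Linux: it uses today's `SO_PASSCRED` and `SCM_CREDENTIALS`, not any proposed extension, and the helper names are made up for the example.

```python
import os
import socket
import struct

# struct ucred on Linux: pid_t pid, uid_t uid, gid_t gid
UCRED = struct.Struct("iII")


def recv_with_credentials(sock, bufsize=4096):
    """Receive one message plus the kernel-supplied sender credentials."""
    data, ancdata, _flags, _addr = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(UCRED.size)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.SOL_SOCKET and ctype == socket.SCM_CREDENTIALS:
            return data, UCRED.unpack(cdata[:UCRED.size])
    return data, None  # no credentials were attached


def demo():
    """Send a message to ourselves and read back (pid, uid, gid)."""
    sender, receiver = socket.socketpair()
    try:
        # Ask the kernel to attach sender credentials to incoming messages.
        receiver.setsockopt(socket.SOL_SOCKET, socket.SO_PASSCRED, 1)
        sender.sendall(b"hello")
        return recv_with_credentials(receiver)
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(demo())
```

The credentials are filled in by the kernel rather than by the sender, which is what makes them usable for authentication; this covers identity only, not the capability metadata mentioned above.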

