The Debian init system general resolution returns
Posted Oct 25, 2014 2:31 UTC (Sat) by viro (subscriber, #7872)
In reply to: The Debian init system general resolution returns by zblaxell
Parent article: The Debian init system general resolution returns
More interesting class of bugs stems from the systemd propensity to spew^Wdistribute tons of information to the rest of the system. Extra complexity in a sensitive place is only a part of the problem - much worse is that if dbus-daemon can't keep pace with it, the backlog is stored in memory of PID 1 until it can be sent. Now, recall what makes PID 1 special from the OOM killer point of view...
And no, it's not a pure theory - I've run into one of the bugs in that class; a lot of umount activity going on (e.g. on shutdown with several thousand bindings present in the system) ended up with quadratic amount of dbus traffic. The damn thing kept resending the entire mount table every time it saw a change. Welcome to 8G of dirty memory held by PID 1... Basically, they were too lazy to compare the old and the new tables and send a proper delta. Sure, fixing that one hadn't been hard, but the underlying architectural deficiency is still there.
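To make the "quadratic amount of traffic" point concrete, here is a minimal sketch, not systemd's actual code or its actual fix. It assumes a mount table modelled as a plain dict from a hypothetical mount ID to a mount point, and it counts how many table entries get sent while n bindings are unmounted one at a time, once when the whole table is resent on every change and once when only a delta is sent. The names `mount_table_delta` and `simulate_unmounts` are made up for this illustration.

```python
def mount_table_delta(old, new):
    """Return (added, removed, changed) between two mount tables.

    Both tables are dicts mapping a (hypothetical) mount ID to a mount point.
    """
    added = {mid: new[mid] for mid in new.keys() - old.keys()}
    removed = sorted(old.keys() - new.keys())
    changed = {
        mid: new[mid]
        for mid in new.keys() & old.keys()
        if new[mid] != old[mid]
    }
    return added, removed, changed


def simulate_unmounts(n):
    """Unmount n bindings one by one and count entries sent by each strategy."""
    table = {mid: f"/mnt/bind{mid}" for mid in range(n)}
    full_resend_entries = 0
    delta_entries = 0
    for mid in range(n):
        old = dict(table)
        del table[mid]
        # Strategy 1: resend the entire remaining table on every change.
        full_resend_entries += len(table)
        # Strategy 2: compare old and new, send only what differs.
        added, removed, changed = mount_table_delta(old, table)
        delta_entries += len(added) + len(removed) + len(changed)
    return full_resend_entries, delta_entries


if __name__ == "__main__":
    full, delta = simulate_unmounts(1000)
    print(f"full resend: {full} entries, delta: {delta} entries")
```

With n bindings the full-resend count grows as n(n-1)/2 while the delta count grows as n, which is the difference between a backlog that fits in memory and one that does not.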
_Anything_ that convinces systemd to generate a major spew is likely to take the system down. From what I've heard, they had an earlier bug of the same kind, that one with the spew consisting of network device information.
It's not so much systemd codebase fault as one of the reasons why dbus is not suitable for high-volume traffic, combined with systemd using it for potentially huge amounts of such with PID 1 as originator. And no, bringing that festering shitpile of protocol into the kernel won't make the things any better - dbus is broken by design.
Posted Oct 25, 2014 2:48 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (17 responses)
PS: no, Plumber is not in any way better.
Posted Oct 25, 2014 3:22 UTC (Sat)
by viro (subscriber, #7872)
[Link] (16 responses)
PID 1 is in extremely *bad* position for keeping track of everything and sending reliable notifications of everything. No mechanism would really help with that; it's just that dbus pretends to provide one that will cope with such demands. It can't. Neither can kdbus.
Every time somebody finds a way to trigger a huge amount of traffic originating at PID 1, it's a serious bug. And they insist on keeping track of a lot of system state in PID 1, with all kinds of traffic being sent. Asking for trouble...
Posted Oct 25, 2014 3:59 UTC (Sat)
by raven667 (subscriber, #5198)
[Link] (7 responses)
Posted Oct 25, 2014 4:47 UTC (Sat)
by viro (subscriber, #7872)
[Link] (6 responses)
And yes, _that_ particular bug had been fixed - current systemd (since March or so) doesn't produce quadratic amount of traffic in that situation. My point is that _anything_ that tricks it into hosing the bus with shitloads of traffic will cause the same kind of problem, kdbus or no kdbus.
IOW, it's something they have to watch out for, and sending a lot of stuff (a lot of kinds of stuff, even) as part of normal operation, expected by the rest of the system, is seriously asking for trouble.
Posted Oct 25, 2014 6:15 UTC (Sat)
by viro (subscriber, #7872)
[Link] (5 responses)
The point being, bugs happen; the architectural mistake there is what's making them a lot more severe. Namely, the use of dbus to send notifications of many kinds of system state changes as they are happening, with PID 1 as sender. And one *still* can trigger obscene amount of traffic there - mount tmpfs on /tmp/a, create a bunch of bindings in /tmp/a/* and then keep doing mount --move /tmp/a /tmp/b; mount --move /tmp/b /tmp/a. Nowhere near as bad as "it panics on shutdown when there's a lot of mounts", but the same "let's keep telling dbus-daemon about those changes, no matter what" easily translates into severe slowdowns and OOMs. Single syscall, done in constant time kernel-side, ends up with massive dbus spam, and no throttling. Moreover, the processes receiving those notifications can bloody well get the same information themselves, and do it cheaper...
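As a rough illustration of how a process can "get the same information itself", here is a small pull-style sketch. It relies on the behaviour documented in proc(5): the mounts files under /proc are pollable, and a change to the mount table is reported as a priority/exceptional event on an already-open descriptor. The function name `wait_for_mount_change` is invented for this example, and the whole thing is an assumption-laden sketch rather than anything systemd or its clients actually do.

```python
import select


def wait_for_mount_change(fileobj, timeout_ms):
    """Block until the mount table behind fileobj changes or the timeout expires.

    fileobj must be an open /proc/self/mountinfo (or /proc/self/mounts);
    the kernel tracks "changed since last poll" per open file, so the same
    file object has to be reused between calls.
    Returns True if a change was signalled, False on timeout.
    """
    poller = select.poll()
    # Mount table changes show up as POLLPRI/POLLERR, not as ordinary readability.
    poller.register(fileobj.fileno(), select.POLLPRI | select.POLLERR)
    return bool(poller.poll(timeout_ms))


if __name__ == "__main__":
    with open("/proc/self/mountinfo") as mountinfo:
        while True:
            if wait_for_mount_change(mountinfo, 5000):
                # Re-read at our own pace; a burst of changes costs one re-read.
                mountinfo.seek(0)
                print(f"mount table changed, now {len(mountinfo.readlines())} entries")
```

The reader decides when to re-read, so a storm of mount changes collapses into however many re-reads the reader can afford, and nothing queues up in the sender.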
Posted Oct 25, 2014 14:56 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Nov 2, 2014 15:27 UTC (Sun)
by nix (subscriber, #2304)
[Link] (3 responses)
The same applies to anything at all related to network interfaces.
Presumably this traffic is meant to be consumed by udev, i.e. it's a replacement for the existing in-kernel uevent messages over the netlink socket. Seems like a rather baroque, ludicrous, and bug-prone change to me.
Posted Nov 2, 2014 16:33 UTC (Sun)
by raven667 (subscriber, #5198)
[Link]
Posted Nov 2, 2014 17:08 UTC (Sun)
by viro (subscriber, #7872)
[Link] (1 responses)
There's a very strong smell of PHB all over the design. Worse, a PHB that had been told by some conslutant about Web 2.0 and social media being The Thing for the millennial generation and decided to have a local equivalent of twitter built for communication with the plebes. It doesn't work well? Why, let's move it to the critical servers; those are on beefier intertubes, or something... Still doesn't work well? Too fucking bad for those who maintain those servers - it's their responsibility now (and of course, any questions regarding the basic design of the damn thing are countered with generous loads of "we had it behave that way before, therefore it must behave the same").
And yes, I am talking about dbus and plans of moving it kernel-side ;-/
Posted Nov 2, 2014 21:35 UTC (Sun)
by johannbg (guest, #65743)
[Link]
That nack of his was a bit weird if you ask me but I guess I need to sacrifice a chicken, dance on one foot and drink some of that kernel koolaid to get my mind into the kernel cult and communication.
Posted Oct 25, 2014 6:03 UTC (Sat)
by johannbg (guest, #65743)
[Link] (3 responses)
The underlying problem is basically the same: parallelizing X, where X can be d-bus, sockets, file system jobs, you name it.
"PID 1 is in extremely *bad* position for keeping track of everything and sending reliable notifications of everything. No mechanism would really help with that"
So let's hear it: based on the function of PID 1, how would you do it, if not in PID 1?
What architectural design do you have in mind to solve this?
Posted Oct 25, 2014 6:38 UTC (Sat)
by viro (subscriber, #7872)
[Link] (2 responses)
And if you do insist on IPC, for some reason, you could bloody well start a caching daemon on demand (with e.g. timeout for inactivity). There's no reason whatsoever to keep that in PID 1 or anywhere near it.
We already have mechanisms for parallelizing. Had them for more than four decades. Called "processes"...
Having systemd forwarding a bunch of stuff it gleans from the kernel is asking for bottlenecks, for no good reason. System calls are not going away; not unless you want a truly monumental bottleneck with systemd playing the role of Mach server. Even then read(2) and open(2) wouldn't disappear, including those of /proc/self/mountinfo...
Posted Oct 25, 2014 8:40 UTC (Sat)
by johannbg (guest, #65743)
[Link] (1 responses)
We have to agree to disagree on the push vs pull implementation.
Posted Oct 25, 2014 19:23 UTC (Sat)
by dlang (guest, #313)
[Link]
Posted Oct 25, 2014 15:13 UTC (Sat)
by Cyberax (✭ supporter ✭, #52523)
[Link] (3 responses)
Systemd uses a protected DBUS interface, so if you're able to DDoS it then you already have enough capabilities to wreck the system. It's not a fatal flaw in the design.
Also, it looks like KDBUS can help to throttle the senders - the regular DBUS daemon can't really do it cleanly.
Posted Nov 2, 2014 15:28 UTC (Sun)
by nix (subscriber, #2304)
[Link] (2 responses)
Or they could read /proc/self/mounts or /proc/self/mountinfo and get an always-reliable, namespace-aware view with none of this nonsense.
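For reference, reading the mount table directly takes very little code. The sketch below parses /proc/self/mountinfo using the field layout described in proc(5); the `Mount` tuple and the function names are hypothetical, only a few fields are kept, and octal escapes in paths (such as `\040` for a space) are left undecoded for brevity.

```python
from typing import NamedTuple


class Mount(NamedTuple):
    mount_id: int
    parent_id: int
    mount_point: str
    fs_type: str
    source: str


def parse_mountinfo_line(line):
    """Parse one line of /proc/self/mountinfo into a Mount.

    Layout per proc(5): ID, parent ID, major:minor, root, mount point,
    mount options, zero or more optional fields, a lone "-" separator,
    then filesystem type, mount source and super options.
    """
    fields = line.split()
    # Optional fields start at index 6, so the separator is at index 6 or later.
    sep = fields.index("-", 6)
    return Mount(
        mount_id=int(fields[0]),
        parent_id=int(fields[1]),
        mount_point=fields[4],
        fs_type=fields[sep + 1],
        source=fields[sep + 2],
    )


def read_mounts(path="/proc/self/mountinfo"):
    """Return the mounts visible in the caller's own mount namespace."""
    with open(path) as f:
        return [parse_mountinfo_line(line) for line in f if line.strip()]


if __name__ == "__main__":
    for mount in read_mounts():
        print(mount.mount_point, mount.fs_type, mount.source)
```

Because the file is resolved through /proc/self, each process sees the table for its own namespace without asking anybody else.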
Posted Nov 2, 2014 17:09 UTC (Sun)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
Besides, netlink sockets can block senders just as well as kdbus.
Posted Nov 2, 2014 17:31 UTC (Sun)
by viro (subscriber, #7872)
[Link]
It's not really worth bothering with any IPC, let alone one of the push rather than pull variety.