
How Debian managed the systemd transition

By Nathan Willis
September 16, 2015

Debian's decision to move to systemd as the default init system was a famously contentious (and rather public) debate. Once all the chaos regarding the decision itself had died down, however, it was left to project members to implement the change. At DebConf 2015 in Heidelberg, Martin Pitt and Michael Biebl gave a down-to-earth talk about how that implementation work had gone and what was still ahead.

Pitt and Biebl are the current maintainers of the systemd package in Debian, with Pitt also maintaining the corresponding Ubuntu package. The pair began with a brief recap of the init-replacement story, albeit one that steered mercifully clear of the quarrels and stuck to the technical side. Initial discussions for replacing the System V init system began as far back as 2007, but pressure grew in recent years, including considerable demand from system administrators and upstream projects (typically wanting specific features like support for logind or journald). Once the Technical Committee had made its decision to adopt systemd as the default, Pitt said, "the real work" began.

The jessie release

Only a few months remained between the decision (in February of 2014) and the first freeze for "jessie" that November. Nevertheless, the migration was completed and jessie shipped with systemd 215. In the end, Biebl said, getting systemd into shape turned out not to be as big a deal as had been feared: the systemd package in "wheezy" was well-tested. On the other hand, the requirement that administrators be able to swap in one of the other init-system packages added quite a bit of complexity. First, the old init system had long been marked as "essential," which meant that removing it required the user to fight quite a few warnings and protests from the package manager and other software. And, just as importantly, the idea of easily swapping init systems in and out was a new one for Debian.

Eventually, the systemd team split the existing sysvinit into several packages, which made it easier to cope with dependent packages that assumed some part of sysvinit would be available. The team also created a new init meta-package, which allowed them to ensure that users would not accidentally remove all of the init-system packages from their system.

Supporting init-system swapping is not merely a package-management problem, however: users are likely to expect some system state to be preserved between any two init systems. For example, if a service is started at boot time by one init system, the user likely expects that service to be started in a comparable fashion under the other init system, too. To ensure this preservation of state, the team re-implemented a subset of systemd's systemctl interfaces in a new package called init-system-helpers.

Along the way, team members also created a package called dh-systemd, which lets Debian package maintainers ensure that their package will be properly configured regardless of which init system the user employs. Finally, they created the systemd-shim package, which supports certain downstream projects like GNOME that have hard-coded dependencies on individual systemd components.

Although the team worked to develop these interface and compatibility packages, it took a different approach to sysvinit's collection of configuration files, such as those in /etc/default or /etc/rcS.d. Rather than keep systemd reading those files indefinitely, the team performed a one-time migration of several such settings files. "The major reason was that we would have to carry a patch for that for all eternity," Biebl said.

The integration work consisted of a lot of small changes, Biebl said, "but I think we succeeded. I think that people don't realize that they are using systemd." He noted that people have actually approached him asking how they can start using systemd and been surprised when he told them that they already are.

Along those same lines, he noted that users who upgrade an older Debian system configured with sysvinit to the jessie release are automatically migrated to systemd, and few have noticed. The goal in undertaking this "scary" change, he said, was to have jessie systems be the same regardless of whether they are new installations or updated ones. For upgraded systems, though, there is a fallback sysvinit init binary installed, complete with a bootloader option to boot the system with it if the user encounters a problem with systemd.

There are bugs in every software package, of course. Biebl said that the team braced for a flood of bug reports when jessie was released, but that flood never came. Most of the bugs that have been reported have stemmed from the fact that systemd is stricter in situations where sysvinit tries to hide errors. For example, Pitt said, Debian had a longstanding bug in its ecryptfs-setup-swap package; systems ended up getting configured with unencrypted swap or no swap at all, and thanks to sysvinit, no one noticed for several years. But systemd complained, and now the bug has been fixed.

Looking forward to stretch

Pitt then turned the discussion to the changes that are in store for the next Debian release, "stretch." The first change is to udev, which will begin assigning predictable, stable names for network interfaces (in place of names using the ambiguous "eth0" form). This change has already landed, he said, and should fix problems that many users have encountered in the past. But there are some wrinkles involved: interface names cannot be migrated to the new format automatically when upgrading to stretch, because doing so could break (for example) firewall rules written with the old names in mind.
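As a rough illustration of what "predictable" means here, the sketch below derives a name from a device's PCI address, which is one of the schemes udev can use. It is a simplification and not udev's actual code: the function name pci_interface_name is hypothetical, and the real logic tries other schemes first (firmware-provided onboard indexes and hotplug slot numbers, for example) and handles multi-function devices differently.

```python
def pci_interface_name(pci_address: str, prefix: str = "en") -> str:
    """Build a path-style interface name from a PCI address.

    pci_address has the form "domain:bus:slot.function" in hexadecimal,
    as printed by lspci -D. The "en" prefix denotes Ethernet.
    This helper is a hypothetical simplification for illustration.
    """
    domain_bus, slot_func = pci_address.rsplit(":", 1)
    domain, bus = domain_bus.split(":")
    slot, func = slot_func.split(".")

    name = prefix
    if int(domain, 16) > 0:
        # A non-zero PCI domain is included in the name.
        name += f"P{int(domain, 16)}"
    # Bus and slot appear as decimal numbers in the resulting name.
    name += f"p{int(bus, 16)}s{int(slot, 16)}"
    if int(func, 16) > 0:
        # Simplification: only a non-zero function number is appended.
        name += f"f{int(func, 16)}"
    return name


if __name__ == "__main__":
    print(pci_interface_name("0000:00:19.0"))
```

Because the name depends only on where the device sits on the bus, it stays the same across reboots; that is also why a firewall rule written for "eth0" cannot be translated mechanically during an upgrade.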

Another upcoming change is support for networkd, a lean interface for bringing up and taking down network interfaces. Pitt noted that there have been a lot of questions about whether networkd is meant to replace NetworkManager. It is not, he explained; rather, it is akin to ifupdown in its scope. But Debian users will benefit from it because it better handles virtual interfaces and hot-plugging.

Biebl explained another change that is still in the works: the ongoing process of removing sysvinit's rcS.d scripts. There is a wiki page detailing the current status and the roadmap of the process. Because these scripts are executed early in the boot process, they can be difficult to remove, and the removals can trigger dependency cycles. He noted that Felipe Sateler has started working on this task, and in doing so has located quite a few packages that need attention from their maintainers.

Other systemd-related changes that may or may not land in time for stretch include the addition of kdbus support. The Debian systemd team regards kdbus as a beneficial addition, since it would be available at the earliest stages of the boot process, but whether Debian includes it will depend on its inclusion in the kernel. There is, of course, an out-of-tree kdbus module, and Debian users interested in testing it out can do so using the systemd tools packaged for Debian unstable.

There are plenty of opportunities for new volunteers to help out, Pitt and Biebl said. But, just as importantly, Debian developers and package maintainers could start taking advantage of systemd's "shiny new features," including timers, socket activation, or security confinement. Where things go from here, in other words, depends at least as much on how Debian as a whole chooses to use its init system as it does on the team that maintains the init-system packages.
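For maintainers curious about what socket activation asks of a daemon, the following is a minimal sketch of the receiving side of the protocol documented in sd_listen_fds(3): systemd passes listening sockets as inherited descriptors starting at 3 and announces them through the LISTEN_PID and LISTEN_FDS environment variables. The function name inherited_fds and its optional parameters are assumptions made for illustration; a real daemon would more likely call the library function, which also clears the variables and sets close-on-exec.

```python
import os

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per sd_listen_fds(3)


def inherited_fds(environ=None, pid=None):
    """Return the descriptor numbers passed in by the service manager.

    Returns an empty list when the variables are missing, malformed,
    or addressed to a different process.
    """
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    try:
        if int(environ.get("LISTEN_PID", "")) != pid:
            return []  # the sockets were meant for another process
        count = int(environ.get("LISTEN_FDS", ""))
    except ValueError:
        return []  # not started through socket activation
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


if __name__ == "__main__":
    print(inherited_fds())
```

Each returned number can then be wrapped with socket.socket(fileno=fd) and used like any other listening socket; the daemon never has to bind the address itself.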

[The author would like to thank the Debian project for travel assistance to attend DebConf 2015.]


How Debian managed the systemd transition

Posted Sep 16, 2015 21:48 UTC (Wed) by luto (guest, #39314) [Link] (43 responses)

Userspace dbus can also be available from the earliest stage of boot. Link it statically (preferably against something like musl), stick it in initramfs, and run it from /init.

Sure, you don't easily get to switch it to something running from /usr, but the kernel works like that as well.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:03 UTC (Wed) by airlied (subscriber, #9104) [Link] (42 responses)

how does the init process use it then?

How Debian managed the systemd transition

Posted Sep 16, 2015 22:11 UTC (Wed) by luto (guest, #39314) [Link] (41 responses)

On Fedora, /etc/init is a symlink to systemd, so systemd could fork and exec dbus-daemon (and even remember its PID for tracking purposes). But /etc/init could also be a tiny script like:

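#!/bin/sh
# Start the bus daemon first; --fork makes it detach so the script can continue.
/path/to/dbus-daemon --fork
# Replace this shell with systemd so that systemd, not the script, ends up as PID 1.
exec /usr/lib/systemd/systemd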

It may be traditional for initramfs code to delete itself and make sure that no references to the old initramfs are left before execing the real init, but it's certainly not a requirement.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:19 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (9 responses)

And what if dbus-daemon is killed or fails to start? There are loooots of different and interesting failure modes here.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:28 UTC (Wed) by luto (guest, #39314) [Link] (8 responses)

What if kdbus is killed or fails to start? Then you panic. Surely the failure mode if userspace dbus is killed or fails to start is no worse than that.

Similarly, if systemd is killed or fails to start, you panic.

How Debian managed the systemd transition

Posted Sep 17, 2015 1:33 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

PID1 can not be killed by SIGKILL, a separate DBUS daemon most definitely can be.

How Debian managed the systemd transition

Posted Sep 17, 2015 1:36 UTC (Thu) by luto (guest, #39314) [Link] (5 responses)

So what? If you start SIGKILLing random system daemons, you can't really expect your system to do very well. This is solidly in the "doctor, it hurts when I do that" category.

How Debian managed the systemd transition

Posted Sep 17, 2015 4:25 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (4 responses)

That's a violation of the current API/ABI contract - you can kill all the processes on a normal system and leave it running.

How Debian managed the systemd transition

Posted Sep 17, 2015 5:59 UTC (Thu) by luto (guest, #39314) [Link] (1 responses)

What API/ABI contract?

The kernel promises not to die unless PID 1 dies (or a bug happens or your block device dies or...). But unless your distro promises to survive a SIGKILL to all processes, I think you're on your own if you try that.

How Debian managed the systemd transition

Posted Sep 17, 2015 7:39 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Classic SysV init and systemd survive mass process murder just fine.

How Debian managed the systemd transition

Posted Sep 19, 2015 14:13 UTC (Sat) by cortana (subscriber, #24596) [Link] (1 responses)

You can't send SIGKILL to dbus-daemon on a normal system and expect anything that uses dbus to behave sensibly however...

How Debian managed the systemd transition

Posted Sep 19, 2015 21:00 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

Certainly. But normal systems right now don't have DBUS as their critical component.

How Debian managed the systemd transition

Posted Sep 17, 2015 8:02 UTC (Thu) by fishface60 (subscriber, #88700) [Link]

Interestingly, it can (sort of).

If you do an exec() and it fails because of some corruption of the page tables, or because the executable is changed at the wrong time without going through the file system (since you get -ETXTBSY when trying to do anything with an executable file while a process is mid-exec on it), then the process is killed with an inescapable SIGKILL.

This is relevant for systemd, as there's a few circumstances where systemd will exec.

1. When shutting down it exec's systemd-shutdown, which primarily exists so that PID1 isn't holding files open and preventing a clean unmount.
Though a failure here is unlikely to ruin anyone's day too badly, since they were already shutting the box down, they just get a mystifying kernel stack trace on shutdown in the rare case where exec fails.
2. A suitably privileged user invokes `systemctl daemon-reexec`, which instructs systemd that it should serialise its state and exec itself again.
This is so that after a software update, systemd can use the new versions of dependent libraries, rather than the old ones that may have been removed, but systemd would still be holding open.
3. Privileged users may also request a `systemctl switch-root`, which is primarily intended for the initramfs transition, but could be used to move your rootfs to different storage (so long as you don't mind that most of your processes are killed).
In this case there's few processes hanging around to be able to break the executable mid-exec, but systemd supports the notion of "storage daemons", which are processes explicitly out of systemd's control, and are expected to have been exec'd from the initramfs.
If you are using a storage daemon to run a fuse-based rootfs then it could easily corrupt the executable mid-exec.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:29 UTC (Wed) by josh (subscriber, #17465) [Link] (30 responses)

And then if anything goes wrong with dbus, it isn't being monitored the way every other daemon is.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:33 UTC (Wed) by luto (guest, #39314) [Link] (29 responses)

Systemd could certainly learn how to start a single built-in non-restartable unit (dbus) as its very first act, and it could even serialize it across exec so that the final non-initramfs systemd would remember that dbus-daemon was running.

Also, note that kdbus already requires that systemd, as its very first act (or really quite early, anyway) mounts kdbus and connects to it. This isn't very different. Kdbus would certainly not be monitored the way every other daemon is.

How Debian managed the systemd transition

Posted Sep 16, 2015 22:41 UTC (Wed) by josh (subscriber, #17465) [Link] (28 responses)

systemd already does have some special handling for dbus-daemon, such as its interaction with services that use Type=dbus.

kdbus came about partly because of the need for such special handling, partly because that handling gets more complicated when an initramfs gets involved, and partly because dbus-daemon had issues like "cannot be restarted or everything goes pear-shaped", which a robust system shouldn't have.

What's the *advantage* of trying to launch dbus-daemon from the initramfs?

How Debian managed the systemd transition

Posted Sep 16, 2015 22:54 UTC (Wed) by luto (guest, #39314) [Link] (27 responses)

> What's the *advantage* of trying to launch dbus-daemon from the initramfs?

It has a big advantage over the current scheme: you can start it early and still don't need to worry about restarting it. It has no advantage over kdbus in terms of where the code is loaded from or ease of initialization, but I don't think it has a disadvantage either.

AIUI, the current scheme comes from the idea that all userspace code running after startup must reside on a non-initramfs mount. I've heard people say that it's not even possible to keep an initramfs program running after pivot_root. This is simply incorrect. Back when initramfs was actually ramfs, it wasted unpageable memory (just like kernel code), but initramfs is tmpfs nowadays.

Heck, there's no fundamental need for systemd to re-exec itself after pivot_root either, although, given that daemon-reexec is well-supported, it's probably a good idea from a forced testing and memory conservation perspective.

As a concrete, if dubious, benefit, udevd really could depend on dbus even without kdbus. Just require that dbus-daemon be started before udevd. (If this happened, I would drop udevd as part of the virtme minimal guest and I'd seriously consider busybox's udev as an alternative, but that's a bit off-topic.)

How Debian managed the systemd transition

Posted Sep 16, 2015 23:08 UTC (Wed) by josh (subscriber, #17465) [Link] (26 responses)

> It has a big advantage over the current scheme: you can start it early and still don't need to worry about restarting it. It has no advantage over kdbus in terms of where the code is loaded from or ease of initialization, but I don't think it has a disadvantage either.

It does have at least two disadvantages there. First, getting dbus-daemon and all of its dependencies into the initramfs would prove rather annoying. Statically linking it isn't a solution (distros, dependency management, and static linking don't mix well), and adding a pile of libraries to the initramfs doesn't appeal. But even after doing that, which is certainly doable, "don't need to worry about restarting it" is a bug, not a feature; dbus-daemon is apparently utterly incapable of handling a restart, but it needs to restart on upgrade. kdbus doesn't have that problem, because it doesn't need a userspace daemon. (It needs some initial setup, but systemd does that, and systemd handles upgrades just fine.)

How Debian managed the systemd transition

Posted Sep 16, 2015 23:24 UTC (Wed) by luto (guest, #39314) [Link] (25 responses)

> dbus-daemon is apparently utterly incapable of handling a restart, but it needs to restart on upgrade. kdbus doesn't have that problem, because it doesn't need a userspace daemon. (It needs some initial setup, but systemd does that, and systemd handles upgrades just fine.)

I think your argument here is a bit confused. dbus-daemon is indeed apparently utterly incapable of handling a restart, so you can't upgrade it without rebooting (or blowing up everything that depends on it). But the kernel and, hence, kdbus, is utterly and completely incapable of being upgraded without rebooting, so the behavior is similar.

I would argue that the real issue with current distros is that they might actually try to upgrade dbus-daemon on disk *and restart it without rebooting*, which is doomed unless a *userspace* dbus daemon gets a major rewrite.

So I still don't see how kdbus is any better at all in this regard, aside from the fact that distros have already figured out how to build the kernel as a self-contained thing but might have trouble building a minimal static dbus-daemon. (It would work fine as a dynamic library with eager binding, too, but that's ugly.)

I'll grant that kdbus is probably a much more streamlined, self-contained piece of code than dbus-daemon, but that's more or less irrelevant wrt this issue.

Also, the userspace approach has a huge advantage here: you can run different versions of it in different containers.

How Debian managed the systemd transition

Posted Sep 17, 2015 0:27 UTC (Thu) by einstein (guest, #2052) [Link] (3 responses)

> But the kernel and, hence, kdbus, is utterly and completely incapable of being upgraded without rebooting, so the behavior is similar.

Actually, the kernel can be upgraded without a reboot. I was using ksplice for that back in 2009 or so, and the feature is coming together in mainline.

How Debian managed the systemd transition

Posted Sep 17, 2015 0:32 UTC (Thu) by luto (guest, #39314) [Link] (2 responses)

If someone implements a reboot-less upgrade from x.y to x.(y+1), and it actually works, I will personally buy them the beer or tasty non-alcoholic beverage of their choice*. Snapshotting the world and restoring using CRIU or similar tools doesn't count.

I've gotten emails from the ksplice team asking me how the heck they're supposed to handle a small number of individual entry changes I've made, and those are tiny compared to replacing the whole kernel.

* Within some reasonable limits.

How Debian managed the systemd transition

Posted Sep 17, 2015 0:47 UTC (Thu) by josh (subscriber, #17465) [Link]

> Snapshotting the world and restoring using CRIU or similar tools doesn't count.

I'd argue that if you can successfully save userspace, kexec a new kernel, and seamlessly reload userspace, that's a huge accomplishment that counts as a "live" kernel upgrade.

How Debian managed the systemd transition

Posted Sep 22, 2015 16:20 UTC (Tue) by jejb (subscriber, #6654) [Link]

> If someone implements a reboot-less upgrade from x.y to x.(y+1), and it actually works, I will personally buy them the beer or tasty non-alcoholic beverage of their choice*. Snapshotting the world and restoring using CRIU or similar tools doesn't count.

Hey, that's not fair: to go from n to n+1 you know the only way is to save and restore the kernel state in a version independent manner, so you're trying to define the only possible method out of your challenge. The problem with the method is the time it takes, but there are people working on it:

https://sslab.gtisc.gatech.edu/pages/kup.html

How Debian managed the systemd transition

Posted Sep 17, 2015 1:07 UTC (Thu) by josh (subscriber, #17465) [Link] (20 responses)

From your comment, you seem to think of kdbus as "dbus-daemon in the kernel", which explains why you consider it analogous that dbus-daemon can't handle live upgrades and that the kernel can't. I was commenting from the point of view that kdbus isn't "dbus-daemon moved into the kernel", but rather "DBus without dbus-daemon". The only userspace setup it needs (ignoring temporary compatibility shims for dbus-daemon) is to mount it. By contrast, dbus-daemon 1) offers bus services of its own, which gain new methods over time, 2) has a pile of evolving userspace configuration bits, and most critically 3) doesn't always function properly when new libraries run against old dbus-daemon or vice versa. None of those issues apply to kdbus.

(I'm going to ignore the case of unloading and reloading kdbus.ko, here, because I doubt you can do that without stopping all dbus users, so that doesn't count either. It does mean you could upgrade kdbus without upgrading the kernel, but that won't make sense once kdbus gets merged into the kernel. It also doesn't address your point.)

So, my contention is that if you ran dbus-daemon from the initramfs, then in addition to the pain of building a dbus-daemon that can run from the initramfs, while handling services and configuration files both from the initramfs *and* from the root filesystem, you'd also have cases where you need to reboot to upgrade dbus-daemon, because you want to upgrade the corresponding userspace and your userspace can't cope with an old dbus-daemon. (It *especially* can't cope with the dbus package getting upgraded on the filesystem but the running version being older than the installed package.)

How Debian managed the systemd transition

Posted Sep 17, 2015 1:32 UTC (Thu) by luto (guest, #39314) [Link] (19 responses)

I'm thinking of kdbus as "dbus-daemon in the kernel" where dbus-daemon is a hypothetical non-crufty daemon.

Sure, kdbus doesn't read config files, but there is no reason whatsoever that a userspace dbus daemon should need to read config files, especially if it's aiming for feature parity with kdbus. Similarly, kdbus claims ABI compatibility, but a userspace dbus daemon really ought to do the same.

I get kind of annoyed when kdbus gets compared to dbus-daemon-as-it-exists and the favorable comparisons are used as an argument for why kdbus is a good idea. Dbus-daemon has all kinds of problems, but, after reading far too many emails about it and thinking about it for far too long, I'm having trouble believing that there is a single respect in which kdbus solves a problem that a simple, streamlined userspace daemon can't easily solve.

If current dbus-daemon barfs when its package is upgraded under it, that's *pathetic*, but it's still not a good reason why distros should be excited about kdbus.

(The streamlined userspace daemon would need help from an improved AF_UNIX credential mechanism, but that's easy.)
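For reference, the credential mechanism that AF_UNIX offers today is SO_PEERCRED (Linux-specific), which reports the PID, UID, and GID recorded when the socket was connected or the pair was created. The sketch below only shows that existing baseline; what an "improved" mechanism would look like is not specified here. The helper name peer_credentials is hypothetical.

```python
import os
import socket
import struct

UCRED_FORMAT = "3i"  # struct ucred: pid, uid, gid (three C ints)


def peer_credentials(sock: socket.socket):
    """Return (pid, uid, gid) of the peer of a connected AF_UNIX socket."""
    raw = sock.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize(UCRED_FORMAT)
    )
    return struct.unpack(UCRED_FORMAT, raw)


if __name__ == "__main__":
    # With socketpair(), both ends carry the credentials of the creating process.
    left, right = socket.socketpair()
    print(peer_credentials(left))
    left.close()
    right.close()
```

Note that these values are a snapshot taken at connection time, not per message, which is one reason a bus daemon built on AF_UNIX might want something richer.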

How Debian managed the systemd transition

Posted Sep 17, 2015 2:07 UTC (Thu) by josh (subscriber, #17465) [Link] (15 responses)

> I'm thinking of kdbus as "dbus-daemon in the kernel" where dbus-daemon is a hypothetical non-crufty daemon.

That hypothetical non-crufty daemon would almost never need upgrading, sure. And neither does kdbus, so the comparison works. But the dbus-daemon we have *today* doesn't belong in an initramfs, and that's where this discussion started. And I see a distinct lack of people working on a hypothetical non-crufty dbus-daemon, hence why it remains hypothetical.

Apart from that, I can think of several things kdbus can do that an arbitrarily lightweight dbus-daemon can't, which explains part of why nobody seems to want to work on a hypothetical non-crufty dbus-daemon. Most notably, it eliminates a context switch from every message passed (two from every round-trip). If you had a "non-crufty" dbus-daemon that didn't need to touch the actual messages, what remaining non-cruft purpose does the daemon serve? Even having dbus-daemon involved in setup or broadcasts represents unnecessary overhead.

How Debian managed the systemd transition

Posted Sep 17, 2015 2:32 UTC (Thu) by josh (subscriber, #17465) [Link] (1 responses)

Note, though, that if the overhead could be entirely eliminated (context switches included) there *are* things I'd love to see moved out of the kernel. The vast majority of filesystems, for instance: a giant pile of C code, running at the highest possible security level, used to parse what should be arbitrary untrusted data, that we're increasingly exposing to arbitrary unprivileged userspace. There's no good reason for, for instance, isofs, freevxfs, or hfs to live in the kernel.

How Debian managed the systemd transition

Posted Sep 17, 2015 5:55 UTC (Thu) by luto (guest, #39314) [Link]

FUSE does pretty well for itself despite context switches. I've never profiled it, but I bet that context switches account for very little of its overhead. I would imagine that inefficient use of page cache is the main problem.

Dbus is a nasty model for things like filesystems, though. Some kind of fast capability-based transport would be much better suited, especially since a file descriptor (or directory reference or whatever) maps quite nicely to a capability.

How Debian managed the systemd transition

Posted Sep 17, 2015 5:52 UTC (Thu) by luto (guest, #39314) [Link] (12 responses)

I'm not really convinced by this context switch thing. For a messaging system, users are likely to care about latency and about throughput. Certainly, to send a single message via a central daemon, two context switches are required, whereas sending a message via kdbus or any other direct-through-the-kernel system only needs one context switch.

But context switches should be decently under 2 µs on a modern system. (The atrocious performance of libgdbus + dbus-daemon has *nothing* to do with the extra context switch.) With some optimization, which certainly could be done, I bet we can significantly improve context-switch performance.

In any event, for applications that care about throughput, the extra context switch is a red herring. Under load, a good central daemon will process many messages per time slice, so the throughput bottleneck is much more likely to be message routing and such rather than context switches. Under that type of load, having a central daemon shouldn't be much slower than doing everything in the kernel. Kdbus is IMO unlikely to be particularly fast in terms of CPU time used per message because the per-message processing is rather complex.

With a userspace mechanism built on top of a serious IPC primitive, the extra context switch goes away because the central daemon can easily introduce parties for direct communication. Linux has no such mechanism (other than SCM_RIGHTS). seL4 does, and I suspect (although I don't know for sure) that the other L4 systems do as well. Binder also looks reasonable for such uses, even though it's rather crufty in other respects.
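To make the SCM_RIGHTS option concrete, here is a minimal sketch of a broker that "introduces" two clients by creating a socket pair and passing one end to each over their existing control connections; after that, the clients talk directly and the broker is out of the data path. The function names introduce and accept_introduction are hypothetical, and socket.send_fds/recv_fds require Python 3.9 or later.

```python
import socket


def introduce(control_a: socket.socket, control_b: socket.socket) -> None:
    """Broker side: create a private channel and hand one end to each party."""
    end_a, end_b = socket.socketpair()
    try:
        # The descriptors travel as SCM_RIGHTS ancillary data.
        socket.send_fds(control_a, [b"peer"], [end_a.fileno()])
        socket.send_fds(control_b, [b"peer"], [end_b.fileno()])
    finally:
        # The in-flight copies keep the channel alive; the broker drops its own.
        end_a.close()
        end_b.close()


def accept_introduction(control: socket.socket) -> socket.socket:
    """Client side: receive one descriptor and wrap it as a socket object."""
    _msg, fds, _flags, _addr = socket.recv_fds(control, 16, 1)
    return socket.socket(fileno=fds[0])


if __name__ == "__main__":
    broker_a, client_a = socket.socketpair()
    broker_b, client_b = socket.socketpair()
    introduce(broker_a, broker_b)
    direct_a = accept_introduction(client_a)
    direct_b = accept_introduction(client_b)
    direct_a.sendall(b"hello")
    print(direct_b.recv(16))
```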

For dbus in particular (userspace or kernel), I think that good performance under load will be tough, because dbus has a reliable in-order broadcast model. If everyone can broadcast to everyone in order, then the overall system needs to buffer each message until every receiver has read it. Since the senders and receivers are all asynchronous, that can be a lot of buffering. For kdbus in particular, the fancy "pool" model means (AFAICT) that all of the broadcast messages need to be buffered *separately* for each receiver. IMO this will work considerably worse than just doing it with a lightweight userspace daemon. Realistically, though, the fully-ordered broadcast model seems unlikely to hold up under load with *any* implementation whatsoever.
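The buffering argument can be illustrated with a toy model. The two classes below are hypothetical simplifications, not kdbus or dbus-daemon code: SharedLog keeps one copy of each broadcast until the slowest receiver has read it, while PerReceiverQueues stands in for a design that queues a separate copy for every receiver.

```python
from collections import deque


class SharedLog:
    """One copy per message; dropped once every receiver has read it."""

    def __init__(self, receivers):
        self.messages = []                       # buffered messages, oldest first
        self.base = 0                            # sequence number of messages[0]
        self.cursor = {r: 0 for r in receivers}  # next sequence number per receiver

    def broadcast(self, msg):
        self.messages.append(msg)

    def read(self, receiver):
        seq = self.cursor[receiver]
        if seq - self.base >= len(self.messages):
            return None                          # nothing new for this receiver
        msg = self.messages[seq - self.base]
        self.cursor[receiver] = seq + 1
        # Discard everything that the slowest receiver has already consumed.
        slowest = min(self.cursor.values())
        del self.messages[: slowest - self.base]
        self.base = slowest
        return msg

    def buffered(self):
        return len(self.messages)


class PerReceiverQueues:
    """A separate queued copy of each message for every receiver."""

    def __init__(self, receivers):
        self.queues = {r: deque() for r in receivers}

    def broadcast(self, msg):
        for queue in self.queues.values():
            queue.append(msg)

    def read(self, receiver):
        queue = self.queues[receiver]
        return queue.popleft() if queue else None

    def buffered(self):
        return sum(len(queue) for queue in self.queues.values())
```

In both models a single slow receiver pins every message it has not yet read; the per-receiver variant additionally multiplies the peak buffer size by the number of receivers.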

How Debian managed the systemd transition

Posted Sep 23, 2015 9:33 UTC (Wed) by paulj (subscriber, #341) [Link] (10 responses)

The problem is some people already went and implemented a kernel DBUS, presumably without having thought too deeply about things and not having questioned the notion that the performance problems with dbus-daemon were to do with kernel-userspace transitions. So given it exists and does improve performance over the inefficient user-space implementation, and given those people (like any others) aren't keen to have their work wasted, there will now be pressure to integrate it.

That pressure will be hard to deflect by pointing out the correct solution to an inefficient user-space implementation is not a very $FAVOURED_IPC_OF_THIS_DECADE-specific kernel implementation, but instead to implement an efficient user-space implementation + whatever generalised kernel services are needed for IPC problems in the abstract. To deflect that pressure for good requires coming up with that efficient user-space implementation really.

How Debian managed the systemd transition

Posted Sep 23, 2015 9:47 UTC (Wed) by lgeorget (guest, #99972) [Link] (2 responses)

> The problem is some people already went and implemented a kernel DBUS, presumably without having thought too deeply about things and not having questioned the notion that the performance problems with dbus-daemon were to do with kernel-userspace transitions.

Actually, if I recall correctly the discussions on that matter, the main advantage of the in-kernel implementation of dbus was not that it reduces the number of context switches but that it reduces the number of memory copies because for the kernel, unlike a user-space daemon, copying memory can be as simple as mapping the same pages in two processes.

> those people (like any others) aren't keen to have their work wasted, there will now be pressure to integrate it.

As far as I can tell from reading the mails on the Linux mailing list, Greg Kroah-Hartman has shown himself to be very professional. He would surely be pleased to see his work in the mainline kernel, but not to the point of "pressuring" anyone.

How Debian managed the systemd transition

Posted Sep 23, 2015 15:06 UTC (Wed) by luto (guest, #39314) [Link]

Indeed, kdbus saves a memory copy in the common case if the receiver is able to consume data straight from the "pool" without copying the data itself.

For small messages, this barely matters, and for large messages, both kdbus and AF_UNIX users can use memfds, which does even less copying.

Actually, for small messages, I'll only believe that the kdbus approach is faster if someone benchmarks it cleanly. The saved copy is only possible because the kernel writes to the receiver's pool when the message is sent, and that means that the kernel has to map the receiver's pool, and that's not free. (In fact it can be very slow -- modern CPUs are very good at mapping things, but at least x86 makes *unmapping* extremely expensive.)

How Debian managed the systemd transition

Posted Sep 23, 2015 15:51 UTC (Wed) by dlang (guest, #313) [Link]

Linus has pointed out that the performance wins of kdbus have far more to do with horribly inefficient userspace dbus code than any advantage of being in the kernel (context switches or memory copies)

So the 'official' justification for kdbus is no longer performance, but rather security and/or reliability

How Debian managed the systemd transition

Posted Sep 23, 2015 16:29 UTC (Wed) by raven667 (subscriber, #5198) [Link] (6 responses)

> presumably without having thought too deeply about things

I've been on the sidelines, following development on LWN, but that doesn't seem representative of the people involved or the effort which has gone into this, so I wouldn't presume that at all.

> not having questioned the notion that the performance problems with dbus-daemon were to do with kernel-userspace transitions

I believe there was awareness that the existing dbus-daemon implementation was not performant but also awareness that even a perfectly implemented userspace daemon has an upper limit on what it can do because of serializing, memory copying and context switches. Experience with the X Window protocol is instructive here as it sits in a very similar place in the software stack and there was a desire for dbus to be able to scale to the point of handling graphics data, which has already been demonstrated with X that a userspace daemon cannot do this without kernel support. Less copying and fewer context switches are also a boon for power usage, which is becoming more important every year, both for battery-powered and datacenter devices.

> efficient user-space implementation + whatever generalised kernel services are needed for IPC problems in the abstract.

This was the original goal and implementation many years ago, but it was flatly rejected by the kernel developers who would have needed to approve it, which is why we have the kdbus implementation we have today as opposed to some other design. The original idea was a multicast AF_UNIX-type socket that a userspace daemon could control and that would be capable of zero-copy message delivery, but the network subsystem maintainers refused to entertain the changes required to make something like that work and be supportable, so a different, much more self-contained design is being proposed instead.

How Debian managed the systemd transition

Posted Oct 9, 2015 23:29 UTC (Fri) by nix (subscriber, #2304) [Link] (5 responses)

> Experience with the X Window protocol is instructive here as it sits in a very similar place in the software stack and there was a desire for dbus to be able to scale to the point of handling graphics data, which has already been demonstrated with X that a userspace daemon cannot do this without kernel support.
X was doing just that without kernel support for nearly two decades. The MIT-SHM extension is worth noting.

You don't need to be the kernel to share memory... and with memfds, you don't even need to be the kernel to share memory with untrusted partners.

How Debian managed the systemd transition

Posted Oct 10, 2015 1:24 UTC (Sat) by raven667 (subscriber, #5198) [Link] (4 responses)

Shared memory is a kernel feature that gets you some of the way there but doesn't have the access control interface that these applications require, and the DRI/DRM interfaces in the kernel were created for graphics applications like X, much like memfd, which was created for kdbus, so I don't think it's fair to say that X runs undegraded without special kernel support.

How Debian managed the systemd transition

Posted Oct 13, 2015 13:50 UTC (Tue) by nix (subscriber, #2304) [Link] (3 responses)

What? X ran undegraded without kernel support for literally a decade plus, until hardware 3D stuff started turning up. MIT-SHM provided everything needed.

How Debian managed the systemd transition

Posted Oct 13, 2015 14:45 UTC (Tue) by nybble41 (subscriber, #55106) [Link] (2 responses)

> X ran undegraded without kernel support for literally a decade plus

I think one could argue that being given direct access to the graphics hardware, and thus effectively unlimited access to the entire system, should count as "kernel support". Sure, the driver code was inside the X server rather than compiled into the kernel or a loadable module, but it still required special interfaces used primarily by X, and it wasn't possible to run the X server as an ordinary, non-root user process.

How Debian managed the systemd transition

Posted Oct 13, 2015 15:09 UTC (Tue) by raven667 (subscriber, #5198) [Link] (1 responses)

That's a good point, but even if you don't consider allowing the userspace app to just bang away at /dev/mem "kernel support" because that really isn't a defined API, we can certainly say that being limited to the performance and capabilities of the 1990s X stack would be considered "degraded" by modern standards and applications. Making this behave safely without degraded performance required the addition of dedicated APIs, to talk to the graphics co-processor and to share memory buffers, beyond the standard 1980s UNIX ones.

We've already gone down the route of adding dedicated IPC APIs for SysV, for Netlink, for X/Wayland and now for DBUS, which I see as following the evolution of OS design and the needs of the applications of the era when these interfaces were designed.

How Debian managed the systemd transition

Posted Oct 13, 2015 22:49 UTC (Tue) by nix (subscriber, #2304) [Link]

Oh, I agree it would be bad by modern standards -- however, it was quite clearly capable of scaling to the point of handling graphics data with no more kernel support than that. To get back to the original point: unless you think D-Bus is not just going to be asked to handle graphics data but the full graphics flow of a 3D game I think the volume of data involved in graphics should not serve as an argument for needing kernel support just to handle that.

How Debian managed the systemd transition

Posted Sep 25, 2015 22:24 UTC (Fri) by oak (guest, #2786) [Link]

Yea, the buffering is a much larger performance issue than context switching. All it takes is some message that is generated very frequently, and a client that has subscribed to the message, but isn't reading its messages (e.g. because it's suspended for a few days while in the background).

The result is that the daemon's message buffers grow until they take all your memory, your system message transport goes to swap (with everything else) and things become *really* slow until the problematic client is killed. If the client is woken up, daemon and client can spend many minutes (or hours, depending on how much swap & buffering you have) catching up, during which the bus isn't very responsive. If allocations were mixed well enough, emptying the message buffer on the daemon doesn't actually free its dirtied memory because it's gotten fragmented.

This is D-BUS experience from 5-10 years ago on a semi-embedded device. Even worse, the user-space daemon gets its memory fragmented very easily and doesn't return to the system the memory it has once allocated. So, a local DoS is trivial to do with any client that can connect to the bus.

Some of the things where kernel *might* be able to improve on this are:
* Assigning message buffers memory cost to corresponding client, so that admin can identify who's the culprit
* Better allocator that guarantees that after processing the messages, the emptied buffer can actually be freed for other purposes (i.e. allocation blocks don't mix data with unrelated life-times, e.g. send and receive messages or messages from/to different clients)
* If message is status broadcast, maybe having some mechanism where only last status update is buffered
* Suspending message sending if receiver isn't processing the messages
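A rough sketch of three of those ideas (per-client accounting, keeping only the last status update, and pausing senders when the receiver is not reading) could look like the following. The `Mailbox` class and its method names are hypothetical, made up for illustration; this is not how dbus-daemon or kdbus is implemented.

```python
from collections import deque


class Mailbox:
    """Bounded per-client message buffer (hypothetical illustration)."""

    def __init__(self, limit):
        self.limit = limit      # maximum number of queued ordinary messages
        self.queue = deque()    # ordinary messages, oldest first
        self.status = {}        # latest status broadcast, keyed by topic

    def post(self, message):
        """Queue a message; return False when the sender should be paused."""
        if len(self.queue) >= self.limit:
            return False
        self.queue.append(message)
        return True

    def post_status(self, topic, value):
        """Status broadcasts overwrite each other, so only the last one is kept."""
        self.status[topic] = value

    def pending(self):
        """Buffered items charged to this client, so the culprit can be identified."""
        return len(self.queue) + len(self.status)

    def drain(self):
        """Hand everything to the client and release the buffers."""
        messages, statuses = list(self.queue), dict(self.status)
        self.queue.clear()
        self.status.clear()
        return messages, statuses
```

With a limit of 2, a third `post()` returns False instead of growing the buffer, and any number of status updates for one topic cost a single slot.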

How Debian managed the systemd transition

Posted Sep 17, 2015 6:11 UTC (Thu) by alison (subscriber, #63752) [Link] (2 responses)

> Dbus-daemon has all kinds of problems, but, after reading far too many emails about it and thinking about it for far too long, I'm having trouble believing that there is a single respect in which kdbus solves a problem that a simple, streamlined userspace daemon can't easily solve.

Performance of Dbus-daemon aside, what about the more abstract question of whether a new message-passing API inside the kernel makes sense? From the sheer design point of view, why does the kernel provide 3 notification services for userspace via fanotify, dnotify and inotify? Presumably the rationale for adding fanotify to dnotify and inotify was that fanotify was superior. Why does that rationale not apply to kdbus?

Both kdbus and Dbus-daemon will continue to evolve. The issue of whether the kernel should have a new feature would logically be decided on the basis of what the kernel's rightful role is. Mostly the kernel's job is to abstract away the details of hardware and to provide userspace with services (e.g. boot) that it would have difficulty managing itself. Is IPC like that provided by kdbus such a service, or no? If not, why is it fundamentally different from notification, to which it seems logically related?

How Debian managed the systemd transition

Posted Sep 17, 2015 11:46 UTC (Thu) by lsl (subscriber, #86508) [Link] (1 responses)

> Presumably the rationale for adding fanotify to dnotify and inotify was that fanotify was superior.

Wasn't the "rationale" more like "we hope it makes snake oil vendors stop torturing our enterprise kernels with horrible out-of-tree modules"? At least that's what I remember from it. It wasn't any less drama than kdbus. Also, it didn't get merged until attempts were made to rework it to be more generally useful, for tasks other than implementing snake oil products.

How Debian managed the systemd transition

Posted Sep 23, 2015 20:35 UTC (Wed) by foom (subscriber, #14868) [Link]

... Except, it failed to do so. (Man, what a *waste* of a third attempt to have functional fs watching functionality...)

They did a good job

Posted Sep 17, 2015 2:11 UTC (Thu) by dskoll (subscriber, #1630) [Link] (25 responses)

I've upgraded a number of systems from Wheezy to Jessie and it really was quite painless. The only real difference I've noticed is that the systemd-enabled Jessie machines boot a whole lot faster than they did when they were Wheezy.

They did a good job

Posted Sep 17, 2015 5:48 UTC (Thu) by alison (subscriber, #63752) [Link] (2 responses)

Hear, hear: chapeau to Martin and Michael for a job well done. I run the Testing release, and the transition was both trouble-free and relatively fast. The transition from Fedora 14 to 15, when systemd was introduced, was a lot rockier, and is part of the reason, I suspect, why developers and sysadmins so feared systemd.

They did a good job

Posted Sep 17, 2015 9:12 UTC (Thu) by salimma (subscriber, #34460) [Link] (1 responses)

Fedora serving its purpose as a more leading-edge distro then. Which makes it telling that it has not adopted Btrfs yet, but that's another huge topic on its own.

They did a good job

Posted Sep 17, 2015 22:54 UTC (Thu) by zlynx (guest, #2285) [Link]

Fedora does make it easy to set btrfs as your filesystem even if it is not the default.

I've been using btrfs on my Fedora laptop and a small home file, backup and email server since F20. No problems.

They did a good job

Posted Sep 17, 2015 15:46 UTC (Thu) by deater (subscriber, #11746) [Link] (21 responses)

I've had problems, not with the transition, but with systemd making "interesting" decisions after the upgrade.

For example, on my headless network gateway, the drive with the /home partition failed ( / was fine).

Instead of just booting with the /home drive removed, it instead sat there on the console with a fancy and colorful display letting me know that a partition was missing, counting down for a few minutes before in the end just totally giving up and dropping to single-user mode. Without bothering to start packet forwarding first. This was a headless machine, so my network was down until I could drag a monitor and keyboard over to figure out what was going on.

I've had a few other surprises like this. And I guess that's par for the course when transitioning such a big system component over, but I find myself spending a lot of time fighting against systemd in cases where things used to "just work".

They did a good job

Posted Sep 17, 2015 16:30 UTC (Thu) by MarcB (subscriber, #101804) [Link] (18 responses)

This is more or less a simple change of defaults. And it was documented.

Before systemd, the "nofail" flag was the default (more or less - errors were reported on the console but they did not block anything). Now, users have to explicitly set the flag if they think that it is OK if a specific automatic mount fails.

I *very* strongly prefer the new behaviour.

The old behaviour could lead to undefined system behaviour, especially when combined with the fact that folders do not need to be empty when mounting something into them.

We once had long deprecated software versions running on a system when a network mount failed silently. Fortunately, the input format had changed in the meantime, so that the old software failed completely - and complained loudly - instead of just producing formally correct but logically wrong output. The latter could have gone unnoticed for months and would have cost a lot of money. (Of course, it was a mistake to just mount a network filesystem over the previously local installation in the first place, but stuff like that happens.)

They did a good job

Posted Sep 17, 2015 21:15 UTC (Thu) by deater (subscriber, #11746) [Link]

> This is more or less a simple change of defaults. And it was documented.

Documented in the basement behind a sign saying "beware of leopard"?
Oh, I found it, section 5.6.1 of this document: https://www.debian.org/releases/stable/i386/release-notes... which I apparently missed at the time.

I still think it's a fairly major and surprising change, though I can grudgingly see why an upgrade to systemd wouldn't want to update the fstab file to make all mounts nofail for backward compatibility reasons.

They did a good job

Posted Sep 18, 2015 1:11 UTC (Fri) by mchapman (subscriber, #66589) [Link] (11 responses)

> This is more or less a simple change of defaults. And it was documented.

To add to this, it could be argued that the default hasn't changed, and that the old behaviour was just plain wrong. The mount(8) manpage documents a "nofail" option, but there is no corresponding "fail" option. This seems to imply that "fail on error" was always intended to be the default behaviour, and that the old pre-systemd initscripts had simply got it wrong.

I don't actually agree with this view -- if the old behaviour is what distros used then the documentation should have been written to say that -- but I can certainly see the logic in it. I do prefer systemd's stricter handling of mount points during boot though.

They did a good job

Posted Sep 18, 2015 2:15 UTC (Fri) by dlang (guest, #313) [Link] (10 responses)

you should read Ingo's comments from the kernel qotw http://lwn.net/Articles/657428/

it's worth going and reading his full post.

This attitude of "well, if we broke it, it was wrong anyway" is part of the reason people distrust the systemd folks so much.

They did a good job

Posted Sep 18, 2015 4:17 UTC (Fri) by mchapman (subscriber, #66589) [Link]

> you should read Ingo's comments from the kernel qotw http://lwn.net/Articles/657428/

I said it was "logical", not that it was right. :-)

They did a good job

Posted Sep 18, 2015 10:41 UTC (Fri) by MarcB (subscriber, #101804) [Link]

There is a significant difference between ABI/API and the internal behaviour of system components. The first is supposed to be stable (or not, depending on the project), the second is only supposed to be documented and it must be sane (much more so than an ABI, where you can - at a cost - work around insanity).

There explicitly is no stability guarantee in the latter case (as evidenced by every major distribution's release notes). If you do a major upgrade - if your distribution supports that at all - you have to read the release notes, understand them and perform any extra steps that apply to your system. This has always been the deal.
(And not only that, you also have to read the changelog of every major library you use out of the distribution's repo, or have very good testing in place - that is the real effort.)

Of course, individual users might have gotten away without doing that in the past. They were lucky.

Just to make this clear: Had Debian made such a change within the life-cycle of a distribution, I would have been really annoyed.
But during a major upgrade I expect things like that. And for me, this particular change is more than welcome. Aborting the boot on errors is the only sane thing to do. The init system cannot know and understand the filesystems and simply ignoring the errors can lead to anything, while erroring out always leads to the same thing: A very clear and very visible error that should be caught by even the most simple monitoring.

They did a good job

Posted Sep 18, 2015 14:01 UTC (Fri) by deater (subscriber, #11746) [Link]

> you should read Ingo's comments from the kernel qotw

I thought that qotw was posted ironically due to the number of times his projects have broken various ABIs.

They did a good job

Posted Sep 18, 2015 19:08 UTC (Fri) by davidstrauss (guest, #85867) [Link] (6 responses)

> This attitude of "well, if we broke it, it was wrong anyway" is part of the reason people distrust the systemd folks so much.

systemd may have a default and an opinion, but it was Debian's prior, unusual default that created this divergence. If you're Saab -- with a long history of having the ignition in the center console -- and switch to stock parts, it is not the fault of the stock steering column manufacturer that ignition is now in a different place than you had put it before.

Also, Debian could have chosen to configure the post-upgrade system to carry over "nofail," but they didn't. It's not like systemd forces you to have mounts fail hard. Complain to the Debian maintainers if you don't like the way they're choosing to configure systemd.

They did a good job

Posted Sep 26, 2015 10:22 UTC (Sat) by rleigh (guest, #14622) [Link] (5 responses)

It wasn't an "unusual default". It had behaved this way for ~20 years and was the intended and expected behaviour for a Debian system.

They did a good job

Posted Sep 26, 2015 11:16 UTC (Sat) by zdzichu (subscriber, #17118) [Link] (4 responses)

Could you show where it was documented to behave that way?

They did a good job

Posted Oct 9, 2015 22:49 UTC (Fri) by nix (subscriber, #2304) [Link] (3 responses)

You're saying that only documented behaviour needs to be preserved? That users are expected to read *all* the documentation that applies to every little part of their system, and if they don't, it's their own damn fault for being annoyed when something that turns out not to have been documented randomly changes behaviour and breaks things?

This is a very legalistic (read, unhelpful) way to make a system. It's a way to make a system that, to be blunt, doesn't work very well.

They did a good job

Posted Oct 10, 2015 9:24 UTC (Sat) by zdzichu (subscriber, #17118) [Link] (2 responses)

If it's not documented, how can it be "intended"? How can you judge if the behavior is correct and it is not a bug? See also encrypted swap support in article – it was another bug exposed by systemd.

They did a good job

Posted Oct 10, 2015 9:49 UTC (Sat) by cebewee (guest, #94775) [Link] (1 responses)

A behavior does not need to be "intended" to be relied upon. It suffices if this behavior is consistently exposed. Then a change has a high chance of breaking user expectations. Quoting dlang's comment:

> you should read Ingo's comments from the kernel qotw http://lwn.net/Articles/657428/

Now, there are of course cases where breaking changes are justified (and I won't judge whether this was the case here), but you cannot blame people for being upset when said changes break their system.

They did a good job

Posted Oct 11, 2015 0:15 UTC (Sun) by mathstuf (subscriber, #69389) [Link]

One of systemd's goals is to make a layer for Linux that can be depended upon. Minor differences like this are the "death by 1000 papercuts" that makes cross-distro deployment a pain. Breaking every other distro in the same way Debian has for years is certainly not the better solution here. More docs would have been better. Even better would be some check in the updater about possible semantic changes in /etc/fstab entries.

They did a good job

Posted Sep 19, 2015 14:21 UTC (Sat) by cortana (subscriber, #24596) [Link] (4 responses)

It would have been nice if the upgrade to systemd would have added 'nofail' to entries in /etc/fstab. That way, the old behaviour would be preserved for existing systems, in a way that is easy to disable once you read the release notes and decide you want the new behaviour.
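As a sketch of what such an upgrade step might look like, the function below appends 'nofail' to the options field of each fstab entry that lacks it. The function name is made up, and skipping the root filesystem is my own assumption rather than anything Debian does; rewritten lines are also re-joined with tabs, so the original column alignment is not preserved.

```python
def add_nofail(fstab_text, skip=("/",)):
    """Return fstab text with 'nofail' added to each entry's mount options."""
    out = []
    for line in fstab_text.splitlines():
        stripped = line.strip()
        fields = stripped.split()
        # Leave comments, blank lines and malformed entries untouched.
        if not stripped or stripped.startswith("#") or len(fields) < 4:
            out.append(line)
            continue
        mount_point, options = fields[1], fields[3].split(",")
        if mount_point not in skip and "nofail" not in options:
            options.append("nofail")
            fields[3] = ",".join(options)
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


sample = """# <file system> <mount point> <type> <options> <dump> <pass>
UUID=aaaa / ext4 errors=remount-ro 0 1
UUID=bbbb /home ext4 defaults 0 2
"""
result = add_nofail(sample)
print(result)
```

Running it a second time changes nothing, so an administrator who later removes 'nofail' from an entry would only get it back if the step were run again.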

They did a good job

Posted Sep 19, 2015 15:16 UTC (Sat) by mathstuf (subscriber, #69389) [Link] (3 responses)

One of their goals was to make an upgrade the same as a new install. Plus, who reads release notes in detail? Does Debian have a nice section for "breaking changes" and the like?

They did a good job

Posted Sep 19, 2015 18:36 UTC (Sat) by cortana (subscriber, #24596) [Link] (2 responses)

Everyone who upgrades should read the release notes first. They mention this exact issue: http://www.debian.org/releases/stable/amd64/release-notes...

They did a good job

Posted Sep 19, 2015 21:52 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

That is exactly what I was looking for :) . Bonus points if they give it as reading material in the upgrade tool (and ponies if they filter/sort it by packages you have installed that are affected).

They did a good job

Posted Sep 19, 2015 22:04 UTC (Sat) by mathstuf (subscriber, #69389) [Link]

Hmm, this looks interesting:

> Lack of security support for the ecosystem around libv8 and Node.js
"Don't use Debian's packages for deploying node.js-based services."

I wonder how many will ignore this :( .

They did a good job

Posted Sep 18, 2015 14:00 UTC (Fri) by drag (guest, #31333) [Link] (1 responses)

> Instead of just booting with the /home drive removed, it instead sat there on the console with a fancy and colorful display letting me know that a partition was missing,

How is this different than what life was like before systemd?

This sort of crap is why 'enterprise' machines have things like ILOM and iDrac and before that serial terminal servers.

They did a good job

Posted Sep 19, 2015 0:55 UTC (Sat) by deater (subscriber, #11746) [Link]

> How is this different than what life was like before systemd?

For the previous 20 years I've run Linux/UNIX, if /home was unable to mount, the system would continue anyway and just do without. You could still log in, but obviously your files wouldn't be there and ssh and login might complain loudly. Then you could remotely do whatever was needed to get /home fixed/restored.

The new behavior unhelpfully just dumps you to a prompt, which is bad if you are headless and depending on being able to ssh in (or even worse, the machine is your network gateway so all of your other systems drop off the net because packet forwarding never gets started).

The machine doesn't have iLOM. A serial console would have helped in this situation but I have a lot of machines and it would be a wiring nightmare for all of them to have serial consoles.

So anyway yes, this is a distribution issue not necessarily an init issue. It's just hard when your 20-year old mental model of how a Linux server behaves is majorly changed without much warning. I'm used to other major OSes playing this game but I've been spoiled by Debian being a bit more sane over the years.

How Debian managed the systemd transition

Posted Sep 18, 2015 2:19 UTC (Fri) by flussence (guest, #85566) [Link] (22 responses)

> The first change is to udev, which will begin assigning predictable, stable names for network interfaces (in place of names using the ambiguous "eth0" form).

I hope, for all Debian users' sake, that they've gone an extra mile flushing out all the bugs in that.

I've seen quite a few instances (in Gentoo, which adopted it years ago) where someone's network setup was broken due to it: bus enumeration races causing a USB wifi dongle to change names every boot, the rules flip-flopping between bus and MAC naming patterns between udev versions, the "en"/"wl" prefix sometimes going missing entirely, and even multiple of these problems striking at once during the jump from install media to finished system.

In ten years of using it and participating in their help forums, I've yet to come across a *single* complaint that eth0 unexpectedly became eth1. Maybe Gentoo users simply lack sufficiently insane hardware?

How Debian managed the systemd transition

Posted Sep 18, 2015 3:02 UTC (Fri) by dlang (guest, #313) [Link] (17 responses)

predictable and stable if you happen to have the mac address of every interface memorized.

If my laptop only has one ethernet interface, "eth0" is predictable and stable.

enx00249b0e398f is neither (and yes, this is exactly what the new udev gave me for my laptop)
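For reference, a name of that form is just the prefix "enx" followed by the card's MAC address as twelve lowercase hex digits, so it can be worked out from the address without memorizing anything. A small sketch (the function name is made up for illustration):

```python
def mac_to_ifname(mac):
    """Build the MAC-based interface name for an address like 00:24:9b:0e:39:8f."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a MAC address: {mac!r}")
    return "enx" + digits


name = mac_to_ifname("00:24:9B:0E:39:8F")
print(name)
```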

Now, I have managed a lot of systems with lots of interfaces over the years (22 gig-e interfaces on one system was one of my 'standard' configs).

I have had a couple cases where interface names have changed during kernel upgrades.

In one case it was because a new driver enabled an interface that previously had been unsupported

In the other case, it was because a different driver was used to support one card and it got sorted ahead of the other driver supporting other cards.

I can understand that people who do everything with modules loaded at runtime can run into grief by loading the modules in a different order, but this falls under the category of 'it hurts when I do that, so stop doing that'

Debian has for years supported pinning the interface names to the MAC addresses based on the detection order the first time the OS booted. This has actually caused me more grief than anything else, because when I replaced the card that held eth0-eth3 with an identical card, now I had eth22-eth25 and no eth0-eth3 (until I removed the udev rule that did this)

If you can't tell, I think the new udev 'solution' is far worse than any problem it may solve.

the good news is that it's fairly easy to disable

# ln -s /dev/null /etc/udev/rules.d/80-net-setup-link.rules

unfortunately this doesn't give me eth0 back, it's now usb0, but at least I don't have to remember the 12 digit hex id!

How Debian managed the systemd transition

Posted Sep 18, 2015 13:31 UTC (Fri) by epa (subscriber, #39769) [Link] (12 responses)

Maybe a halfway house would be device name 'eth_only' when there is exactly one card in the machine (a fairly common case). It would never change as long as there is only one. If you installed a second Ethernet adaptor then you would be forced to switch over to the funky udev names or else use eth0 and eth1 with some risk that they might be reordered for some random reason.

How Debian managed the systemd transition

Posted Sep 18, 2015 17:26 UTC (Fri) by kreijack (guest, #43513) [Link] (11 responses)

I think that a more reasonable solution is to still provide the ethN interface name as alias of the new name. The ethN names may depend on the discovery order, but for the common case (a PC with only one ethernet card) this avoids the problem.

BTW I have never seen a swap of the ethernet cards; maybe because I use only PCI ones?

How Debian managed the systemd transition

Posted Sep 18, 2015 18:13 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (5 responses)

> I think that a more reasonable solution is to still provide the ethN interface name as alias of the new name.
Network code doesn't support it and network devices have always been a 'special case' in Linux. Notice how there are no '/dev/ethX' files.

How Debian managed the systemd transition

Posted Sep 19, 2015 14:34 UTC (Sat) by cortana (subscriber, #24596) [Link] (3 responses)

I wonder why no one has ever looked into increasing the maximum network interface name length, and adding a new kind of 'n' device node that network interfaces can use, so that they can show up in /dev just like other devices.

Then we could have /dev/net/by-{index,slot,path,mac} symlinks pointing to the original /dev/eth0 and friends. Both the original names and the predictable names would be there for those who use them, and all names would be usable simultaneously.

That way, you no longer have to worry about a network interface having its name changed from eth0 to enp0s3 when you reboot with a newer version of udev, as happened to me last week.

You would also be able to pick and choose which name you want to use depending on your use case, just as how, with disks, it's sometimes useful to use the symlink in /dev/disk/by-path for a disk that is commonly hot-swapped (such as a backup disk you change every week) but at the same time it is useful to use /dev/disk/by-uuid for a permanently-mounted filesystem, such as / or /home.

How Debian managed the systemd transition

Posted Sep 19, 2015 20:40 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> I wonder why no one has ever looked into increasing the maximum network interface name length, and adding a new kind of 'n' device node that network interfaces can use, so that they can show up in /dev just like other devices.
As usual - legacy. The interface name is limited by IFNAMSIZ value which is used in multiple contexts, so you can not use long strings like '/dev/ethN' for interface names. And also there are no provisions for multiple interface names in getifaddrs() calls or similar functions.

Adding new functions that work with regular device nodes is certainly possible, but probably nobody cares too much about it.

How Debian managed the systemd transition

Posted Sep 20, 2015 0:02 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

if they can do eth<mac> then /dev/ethN isn't unreasonably long

How Debian managed the systemd transition

Posted Sep 20, 2015 7:37 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

IFNAMSIZ is equal to 16, so you only have 15 letters for the full path.
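A quick way to see what fits: the sketch below checks a few candidate names against that limit. The helper name is made up, and the only thing checked is length, not whether the kernel would accept the characters in the name.

```python
IFNAMSIZ = 16  # from the Linux headers; the buffer includes the trailing NUL


def fits_ifnamsiz(name):
    """True if the name plus its terminating NUL fits in an IFNAMSIZ buffer."""
    return len(name.encode()) < IFNAMSIZ


for candidate in ("eth0", "enx00249b0e398f", "/dev/eth0", "/dev/enx00249b0e398f"):
    print(f"{candidate}: {len(candidate)} bytes, fits={fits_ifnamsiz(candidate)}")
```

The MAC-based name from earlier in the thread is exactly 15 characters, so it uses up the whole budget; a short path like /dev/eth0 would fit, but a path in front of a MAC-based name would not.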

And the second problem is that you don't actually _need_ your devices to be mounted at '/dev'. A device node can be anywhere on the system.

It doesn't even require anything exotic - a host might simply wish to get info about a TAP/TUN device in a container.

The correct way to fix it would be by creating a new API that uses file descriptors instead of interface names.

How Debian managed the systemd transition

Posted Sep 20, 2015 10:29 UTC (Sun) by kreijack (guest, #43513) [Link]

> > I think that a more reasonable solution is to still provide the ethN interface name as alias of the new name.
> Network code doesn't support it and network devices have always been a 'special case' in Linux.
> Notice how there are no '/dev/ethX' files.

It may be that a technical solution for doing that doesn't exist *now*. But I don't see any reason that prevents us from implementing one.
Let me rephrase: either my suggestion would cause some problem, or it is impossible today only because nobody has coded that solution.
The former may be (or is) a blocker; the latter could be solved by some hours of coding (and several days of testing :-) )

How Debian managed the systemd transition

Posted Sep 19, 2015 1:34 UTC (Sat) by dlang (guest, #313) [Link] (4 responses)

> BTW I have never seen a swap of the ethernet cards;

a swap can happen if you have cards that use different drivers and the driver load order changes (either from driver renaming if the drivers are statically linked, or the device discovery order if they are modules)

a swap can also happen if the bus the cards are on has no inherent order and devices are ordered by when they respond. I'm told that USB can detect the physical location of a device, but afaik, all the drivers ignore this and rely on which device responds first to device probing.

But if the device has a MAC address as part of the device, the existing tools can keep the ordering consistent.
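A minimal Python sketch of why a MAC-keyed table makes the result independent of probe order. The pin table, the MAC addresses, and the function name `assign_names` are hypothetical, and the sketch ignores how the rename itself is carried out on a real system.

```python
# Hypothetical pin table: MAC address -> desired interface name.
PINNED = {
    "00:24:9b:0e:39:8f": "eth0",
    "52:54:00:12:34:56": "eth1",
}


def assign_names(discovered_macs, pinned=PINNED):
    """Map each discovered MAC to its pinned name, whatever the probe order."""
    names = {}
    for mac in discovered_macs:
        key = mac.lower()
        if key in pinned:
            names[key] = pinned[key]
    return names


if __name__ == "__main__":
    boot_a = ["00:24:9b:0e:39:8f", "52:54:00:12:34:56"]
    boot_b = list(reversed(boot_a))  # same cards, opposite response order
    print(assign_names(boot_a) == assign_names(boot_b))
```

A card with no stable identifier never appears in such a table, which is the case the next paragraph is about.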

If there is no such identifier built into the card, I don't believe that the new process is really any more reliable.

How Debian managed the systemd transition

Posted Sep 20, 2015 10:22 UTC (Sun) by kreijack (guest, #43513) [Link] (3 responses)

> a swap can happen if you have cards..
What you are saying is correct... But I repeat: in my (very limited) experience I have never seen that.
What I do see is the ethernet name changing from eth0 to eth1 when I move a disk image from one host to another, and that causes me more headaches than having several ethernet devices with different hardware...

What I mean is that linking the ethernet name to the hardware is useful in some contexts, but in others not; and I suspect that these "others" are more common than the former...

How Debian managed the systemd transition

Posted Sep 20, 2015 10:52 UTC (Sun) by dlang (guest, #313) [Link]

I agree, I always disable that on my systems. But if someone is dealing with multiple USB interfaces, or wants to do async module loading (with all the ordering race conditions it introduces) in search of faster boot times, I can see why it could be useful.

How Debian managed the systemd transition

Posted Sep 25, 2015 20:16 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

> What I mean is that linking the ethernet name to the hardware is useful in some contexts

You mean, like in a firewall, for instance ...

I saw a comment somewhere where eth0 and eth1 got swapped. In other words, until someone noticed, the firewall's soft, unprotected, meant for the internal network, interface was the interface to the hostile outside world ...

Cue major panic, lan disconnected from the internet, rebuild the firewall from scratch, ...

At the end of the day, unpredictable behaviour is a security risk. What I think happened was that the system had always been booted from cold, and had always been predictable. Then for some reason, one day it did a warm-start, and came back with the interfaces swapped over ... oops ...

Cheers,
Wol

How Debian managed the systemd transition

Posted Sep 25, 2015 20:36 UTC (Fri) by dlang (guest, #313) [Link]

actually, if the firewall interfaces get swapped, it's not going to talk to anything, because it is going to send the reply packets out the wrong interface (unless you also have dynamic routing)

Yes, it is possible to configure a system so that its IPs and routes will swap with the interface changes, but the firewall rules won't, but that's getting into rather contrived territory.

How Debian managed the systemd transition

Posted Sep 18, 2015 16:42 UTC (Fri) by HenrikH (subscriber, #31152) [Link] (3 responses)

Actually the current pinning seems to be a much better solution than creating a dev with the MAC as its name, just as long as one remembers to edit /etc/udev/rules.d/70-persistent-net.rules if one has changed the NIC. Now with enx00249b0e398f one would possibly have to edit tons and tons of config files if unlucky.
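The name quoted above appears to follow the pattern "enx" followed by the twelve hex digits of the MAC address. A minimal Python sketch of that mapping, assuming that pattern holds; `mac_to_enx` is an illustrative helper, not part of udev or systemd.

```python
def mac_to_enx(mac: str) -> str:
    """Build an 'enx' + MAC style name from a 48-bit MAC address string."""
    digits = mac.replace(":", "").replace("-", "").lower()
    if len(digits) != 12 or any(c not in "0123456789abcdef" for c in digits):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return "enx" + digits


if __name__ == "__main__":
    print(mac_to_enx("00:24:9b:0e:39:8f"))
```

Because the name is derived from the hardware address, replacing the NIC changes the name, which is why every config file that mentions it would need updating.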

How Debian managed the systemd transition

Posted Sep 19, 2015 18:08 UTC (Sat) by cortana (subscriber, #24596) [Link] (2 responses)

The problem with the old 'pinning' approach was that it was racy. I have had to recover servers that stopped working because the interface usually known as eth0 is now called eth1_rename, and there is another interface now called eth0, or some such.

How Debian managed the systemd transition

Posted Sep 20, 2015 0:04 UTC (Sun) by dlang (guest, #313) [Link] (1 responses)

Interesting, I never ran into that.

It may be that I got bit by the rename due to card changes and disabled it entirely before the odds caught up with me.

How Debian managed the systemd transition

Posted Sep 20, 2015 13:14 UTC (Sun) by cortana (subscriber, #24596) [Link]

It is a probabilistic problem; you might never see it in your whole career; or you might see it twice in a week. It is annoying when those who have never seen it happen to them (and I am not including you in this set) assume that it is not a problem, and rubbish the efforts of those who are trying to fix it properly by removing the underlying race condition.

How Debian managed the systemd transition

Posted Sep 18, 2015 15:59 UTC (Fri) by lsl (subscriber, #86508) [Link] (1 responses)

Something in Debian sid recently felt the need to rename a virtio device from eth0 to ens3 with the result that the VM was offline until someone logged in on the console to do a s/eth0/ens3/. No idea if that is a bug or just the normal risk of running unstable.
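For illustration, the s/eth0/ens3/ fix can be sketched in Python as a whole-word substitution. The sample text is a hypothetical interfaces-style fragment and `rename_interface` is a made-up helper; which files actually need editing depends on the system.

```python
import re


def rename_interface(config_text: str, old: str, new: str) -> str:
    """Replace whole-word occurrences of `old` with `new` in `config_text`."""
    pattern = re.compile(rf"\b{re.escape(old)}\b")
    # A function replacement keeps `new` literal (no backslash escapes).
    return pattern.sub(lambda _match: new, config_text)


if __name__ == "__main__":
    # Hypothetical config fragment, for demonstration only.
    sample = "auto eth0\niface eth0 inet dhcp\n"
    print(rename_interface(sample, "eth0", "ens3"))
```

The word boundaries keep a longer name such as eth01 from being rewritten by accident, which a bare s/eth0/ens3/ would not.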

How Debian managed the systemd transition

Posted Sep 18, 2015 19:12 UTC (Fri) by lsl (subscriber, #86508) [Link]

It appears that this was indeed considered a bug and rolled back with a new update. The device regained its eth0 name, so the VM is once again unreachable. Thankfully, crap like this is easy to deal with on virtualized systems.

How Debian managed the systemd transition

Posted Sep 20, 2015 12:09 UTC (Sun) by patrakov (subscriber, #97174) [Link]

Well, I could not say "eth0 became eth1", but I had a case when eth1 became eth2. On Debian.

How Debian managed the systemd transition

Posted Oct 5, 2015 14:22 UTC (Mon) by Creideiki (subscriber, #38747) [Link]

Another problem is that while the interface names might (sometimes) be predictable across time on a single machine, they are emphatically not predictable across several machines at the same time. Especially with hardware manufacturers changing how they wire up their PCI buses without changing anything other than the revision number printed on the motherboard.

I've had a batch of computers, all the same nominal model from the same (big) manufacturer, all installed from the same netboot image, half calling their only Ethernet interface one thing and half calling it another. System administration scripting was so much easier when all machines had an eth0 connecting to the main network.

How Debian managed the systemd transition

Posted Sep 24, 2015 4:09 UTC (Thu) by jb.1234abcd (guest, #95827) [Link] (5 responses)

Yo, men !

http://ibin.co/2Ge1kuufRD1G

The plot:

Recently a poster presented a view of systemD(efeat) - mostly pro, with
the exception of its packaging.
http://lists.freedesktop.org/archives/systemd-devel/2015-...

His packaging proposal was brushed off as wrong.

http://lists.freedesktop.org/archives/systemd-devel/2015-...
"It's 119 executables now, btw."
"It's a set of basic building blocks distributions can build an OS from."
"We provide people with sources, with a git tree, and it's up to them how they
decide to package that."
"(...) we keep all components of our system together in one repo, under
a single release schedule and without stable, internal APIs."
"(...) We need to keep things maintainable. And you don't make things
maintainable (...) by forcing us to stabilize internal APIs (...)".
"(...) the only folks who should care about our updates are those who
are technically versed enough to not need version numbers, but who can
read our NEWS files."
"Well, it's supposed to be a steady stream of smaller additions instead
of major feature additions in long intervals."
"We can drop things from our git tree from time to time, (...)".

There is a darker side behind the facade.

http://ewontfix.com/14/
"... an aggressive, dictatorial marketing strategy including elements such as:
Engulfing other "essential" system components like udev and making them difficult or impossible to use without systemd (but see eudev).
Setting up for API lock-in (having the DBus interfaces provided by systemd become a necessary API that user-level programs depend on).
Dictating policy rather than being scoped such that the user, administrator, or systems integrator (distribution) has to provide glue. This eliminates bikesheds and thereby fast-tracks adoption at the expense of flexibility and diversity.
"

An0nym0usC0ward
"True, systemd consists of 69 [edit: 119] binaries. But they are so tightly coupled that to my knowledge nobody so far was able to provide a fully compatible alternative implementation for any of them. Besides, since the APIs internal to those systemd binaries are not only undocumented but also constantly changing, providing a compatible alternative implementation means aiming for a moving target.
Even if LP holds the 69 [edit: 119] binaries as proof that systemd isn't monolithic, to me that's BS. The strong coupling is what makes for a monolith."

The plot thickens:

http://lists.freedesktop.org/archives/systemd-devel/2015-...
"systemd is a core port of Linux ecosystems these days, just like
the kernel. It's a foundation layer that finally brings some much
needed order, coherency, and vertical integration to the mess that
is Linux userspace."

The plot thickens more:

http://lists.freedesktop.org/archives/systemd-devel/2015-...
"To further resonate that. Just like with kernel, every vendor make
their own longterm maintenance thing of systemd.
Look at Centos vs Debian kernel, they are widely different, even if
released from same series or at the same time.
Ditto systemd, integration done in Debian, Ubuntu, openSUSE, Fedora
are all different as well."

Ouch !

The plot thickens even more:

Reason got an upper hand over vanity and a proposal of a prospect of
systemD(efeat) becoming a "Linux-based Init and Service Management System
Standard":
"Yeah, but no. I doubt writing standards like that makes any sense at all...
Sorry,
...
Lennart Poettering, Red Hat"

I hope the above topic and the following one will be discussed during their
upcoming systemD(efeat) conference:
http://lists.freedesktop.org/archives/systemd-devel/2015-...

Btw, this is Pieter Bruegel's take on it:
https://en.wikipedia.org/wiki/Satire#/media/File:Pieter_B...

jb

Sigh

Posted Sep 24, 2015 13:33 UTC (Thu) by corbet (editor, #1) [Link] (4 responses)

So I was really hoping we would get through this one without somebody trying to restart the whole systemd flamewar. No such luck. The last time you did this, just over a month ago, you were warned that your troll bit would be set the next time. This is the next time.

Sigh

Posted Sep 24, 2015 22:42 UTC (Thu) by alvherre (subscriber, #18730) [Link] (1 responses)

What is the "troll" bit? I don't think I see it set for this guy.

Sigh

Posted Sep 25, 2015 4:08 UTC (Fri) by bronson (subscriber, #4806) [Link]

I'm guessing it means any future trolling will be redacted or deleted? It's a symbolic bit?

Sigh

Posted Sep 25, 2015 1:12 UTC (Fri) by mgb (guest, #3226) [Link] (1 responses)

I found what you have designated as a troll to be a tad more useful than the article itself, which was old news to those of us who have been following events as they transpired.

But you clearly disagree with his opinion while I share it.

Sigh

Posted Sep 25, 2015 2:11 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

Umm…okay. So a comment *starting off* invoking Godwin's Law is more useful to you because you spent hours reading debian-devel. Great, maybe, just *maybe* there are some folks here who don't follow debian-devel and are interested in how the transition was done (such as me, who does some support for various Linux system issues at work, some of them being "how does systemd do this?"[1]).

I don't really see anything interesting in the original post other than a "see how much of an asshat I can be" link to a proposal for systemd.conf akin to asking to present a "90% of the world doesn't want to run Debian, nevermind know what it even is" poll done (based on results at a university computer lab) at a Debian conference asking them to give up and go back home.

The rest, I'm going to assume is cherry picking quotes because he has earned that reputation at least (and having read the source of one, yeah, there's no need to reconsider it right now).

[1]And the majority of the time it's "oh, that's much nicer", the rest being "hmm, that's different, but OK". But I'm going to assume you'll just deny their existence.

How Debian managed the systemd transition

Posted Sep 27, 2015 21:50 UTC (Sun) by sbergman27 (guest, #10767) [Link] (2 responses)

Based upon all my bad experiences with PulseAudio... I admit I was skeptical of systemd. But I have to admit, it's working well on my Debian 8 desktop. The administration tools are well designed. And after years of hearing about other people's fast boot times... which I'd decided I wasn't sure I believed... I was shocked when after the upgrade I was able to go from Grub screen to a fully operational system, logged in, and with Chrome up and on the Google page, in (drum-roll please) 5.7 seconds. And speed was not even the main focus of the systemd design. Just a side effect. Regarding the socket-based dependencies, I think it's clear that Poettering was right and Remnant was wrong. They really are as good as they look on paper.

How Debian managed the systemd transition

Posted Sep 28, 2015 17:30 UTC (Mon) by flussence (guest, #85566) [Link]

That's an impressive number! I doubt systemd would do much for me though; my browser already takes longer from exec to window draw than the rest of the boot process...

How Debian managed the systemd transition

Posted Sep 28, 2015 18:44 UTC (Mon) by bronson (subscriber, #4806) [Link]

Just another data point: I've only tried systemd on the server but I can't recommend it highly enough. I replaced a few different monit/god/custom setups started by all sorts of different distro-specific init.d files with a single systemd unit file. It worked first try and it's been stone stable ever since (so, no idea how helpful the support channels are since I haven't had a chance to use them).

Still waiting for the other shoe to drop.. but it's been excellent so far.


Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds