
The Grumpy Editor's guide to surviving the systemd debate

Posted Nov 13, 2014 18:10 UTC (Thu) by skorgu (subscriber, #39558)
In reply to: The Grumpy Editor's guide to surviving the systemd debate by anselm
Parent article: The Grumpy Editor's guide to surviving the systemd debate

Systemd in and of itself, sure. It is a bit of a lightning rod/poster child for a definite change in how systems are built, though, and teasing those two roles apart isn't always easy. This week I was trying to debug an early-boot process on CentOS 7, which supports both a complex set of shell scripts in the form of /etc/init.d/network (now started through systemd) and NetworkManager.

I decided NetworkManager was the Future, made my peace with the capitalization choices, read some docs and tried to use it. No dice. As far as I could tell, the NetworkManager service started but NetworkManager-wait-online.service didn't, which caused kdump to determine there was no network, which in turn prevented multi-user.target from being reached.

Ok. Let's look at the wait-online.service.

ExecStart=/usr/bin/nm-online -q --timeout=30

Great! I run nm-online and it hangs for 30 seconds before failing. What criteria is it using to check that the network is up? --help is useless, "nmcli networking connectivity" says "full", and I have no idea where to look next. strace and a look at the upstream code led me to the conclusion that it's doing something via dbus, but understanding exactly what would require me to understand a good chunk of the NetworkManager/dbus codebase.

Fine. Let's try the init script way. Hm, doesn't work. Looking at the logs (journalctl is good, why isn't -l the default though?) shows me that udev was renaming eth0 -> em1 at the same time as the network script was trying to bring up em1: a classic race condition. I wonder if there's a delay I can add?

grep -i delay /etc/sysconfig/network-scripts/*

Bam, add LINKDELAY=10 to the ifcfg-em1 file and move on with my life.
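
As a rough sketch, the resulting file might look something like the fragment below. Only LINKDELAY=10 and the em1 name come from the text above; the other keys and values are assumed, typical entries and will differ from system to system.

```shell
# /etc/sysconfig/network-scripts/ifcfg-em1 (sketch, not a complete file)

# Assumed: interface name after udev's eth0 -> em1 rename
DEVICE=em1
# Assumed: bring the interface up at boot
ONBOOT=yes
# Assumed: addressing method; yours may be static
BOOTPROTO=dhcp
# From the text above: the delay added to get past the rename race
LINKDELAY=10
```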

Is the problem here systemd? No, it's clear that this specific case was NetworkManager's fault and systemd was faithfully executing what it was told to. However, the /model/ of reducing complexity through tighter coupling, with dbus and C instead of files and shell scripts and monolithic entities instead of loose components, made this much harder to debug.

There's no reason to foam at the mouth and cancel my LWN subscription; we're all adults here and should be able to say "hey this isn't great" without going off the rails. I have no doubt that in a few years these bugs will have tripped up enough people and been fixed, better tools will exist or simply be more widely known, and it'll be fine, probably better than the old world.

However, we *are* all adults here and shouldn't pretend that systemd is just PID 1 and not also a central and highly visible component of a fundamental change in how Linux systems are built and operated, or that there aren't tradeoffs in discoverability and debuggability being made.



Posted Nov 13, 2014 19:33 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)

A better solution would have been to add an em1 dependency to the NetworkManager target. It can be done by creating a socket unit depending on the em1 device and adding it to the dependency list.
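
One way to express a dependency on the em1 device is sketched below. Note that this swaps in a plain drop-in with Wants=/After= on the device unit rather than the socket unit described above, and it has not been tested against the CentOS 7 setup in question. The helper name write_em1_dropin, the drop-in file name wait-for-em1.conf, and the choice of NetworkManager.service as the unit to order are all assumptions for illustration. It relies on systemd exposing network interfaces as sys-subsystem-net-devices-NAME.device units.

```shell
# Hypothetical helper: write a drop-in that orders a unit after the em1 device.
# $1 is the drop-in directory, e.g. /etc/systemd/system/NetworkManager.service.d
write_em1_dropin() {
    dropin_dir=$1
    mkdir -p "$dropin_dir" || return 1
    # Wants= pulls the device unit in; After= delays startup until it appears.
    cat > "$dropin_dir/wait-for-em1.conf" <<'EOF'
[Unit]
Wants=sys-subsystem-net-devices-em1.device
After=sys-subsystem-net-devices-em1.device
EOF
}
```

After writing the drop-in into the real unit directory, "systemctl daemon-reload" is needed before systemd picks it up.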

Posted Nov 14, 2014 18:59 UTC (Fri) by skorgu (subscriber, #39558)

I didn't think of that, thanks for the suggestion!

Though I'm not sure it would have actually helped the NetworkManager case since the network-online target /never/ came up, even when the system manifestly had connectivity (and even nmcli agreed).

Posted Nov 14, 2014 19:03 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)

I don't think that socket targets actually depend on the network-online target; they are activated directly by udev events. So it should have helped.

Also, can you file a bug for this? It's clearly an issue that should be fixed.

