so by that logic, and the fact that upstart existed before systemd, shame on the systemd people for creating a new project rather than just joining an existing project.
GNOME and/or systemd
Posted Nov 1, 2012 13:40 UTC (Thu) by HelloWorld (guest, #56129)
Posted Nov 1, 2012 18:38 UTC (Thu) by dlang (✭ supporter ✭, #313)
And this is why you won't get all these people joining systemd
Posted Nov 1, 2012 19:41 UTC (Thu) by Cyberax (✭ supporter ✭, #52523)
Upstart is clearly flailing in that regard. It definitely is NOT better and is actually getting worse with time.
What's funny is that upstart's principal developer has stopped using Ubuntu because Ubuntu is not able to support the "forked world" anymore: http://netsplit.com/2012/10/30/goodbye-ubuntu/
Posted Nov 1, 2012 20:39 UTC (Thu) by HelloWorld (guest, #56129)
Posted Nov 1, 2012 22:42 UTC (Thu) by nix (subscriber, #2304)
Posted Nov 1, 2012 23:16 UTC (Thu) by HelloWorld (guest, #56129)
Here's a rather witty quote by Georg Christoph Lichtenberg: "I cannot say whether things will get better if we change; what I can say is they must change if they are to get better."
Posted Nov 1, 2012 23:42 UTC (Thu) by apoelstra (subscriber, #75205)
Given a choice between "I know how this is broken", and "I don't know how this is broken, and even if I did, that could change in a month", I'd take the former.
Having said that, I've been using systemd since it became the default on Fedora, and I've really had to go out of my way to break it.
Posted Nov 2, 2012 0:56 UTC (Fri) by HelloWorld (guest, #56129)
Sorry, but "it changes too fast and is thus too unstable" is just a knee-jerk reaction to any kind of change, be it for the better or for the worse. It is *not* what I consider a sensible criticism.
Posted Nov 2, 2012 8:01 UTC (Fri) by paulj (subscriber, #341)
So if you take a piece of software, with its features just developed to an acceptable state, then compare that software with itself after a long period of use, it should be obvious that a lot more bugs will be /known/ about after the period of use. That information alone is worth something if you want to understand the behaviour and stability of a system in the face of whatever inputs. It means if needs be you could choose to limit the input in order to avoid whatever bugs. Further, if during that period of use those bugs are fixed (and fixes strictly limited to that), then the software at the end stands a good chance of being less buggy than the software at the beginning, for the given features.
This all sounds obvious to me. In case it isn't to you, or you think it's hand-wavingly subjective, let me point out that the likes of Red Hat objectively earn billions of $ per annum exploiting this.
Posted Nov 2, 2012 18:29 UTC (Fri) by nix (subscriber, #2304)
*None* of these properties are true of systemd as PID 1 yet, and given its rate of continued development they seem unlikely to be true for many years to come... and I really really need PID 1 never ever to die. I actually use my system for useful work and cannot afford to turn PID 1 of all things into Lennart's debugging playground. (Note that I had no objections to turning my desktop's sound card into Lennart's debugging playground: if that fails all that happens is that I have a pause with no music until I figure out what is wrong and fix it. If PID 1 dies, I lose everything I'm working on, instantly, and quite possibly get a bunch of stuff I did recently replaced with zero-byte files as well. Not an acceptable tradeoff. And it uses large, complex libraries like libdbus which I *know* to have had crash and security bugs, and fairly recently at that. I'm not willing to tolerate the risk of losing PID 1 to such bugs.)
systemd is just approaching the degree of stability where I might be willing to tolerate it in unimportant virtual machines with no data I value. Anywhere else, you must be joking. (In this it is very like filesystems: I don't use btrfs anywhere I need to keep running in order to get my work done either.)
Lennart's extensively documented contempt for people not running Fedora is just the icing on the cake. I have no desire to see a systemd crash get answered with 'run Fedora' like PulseAudio bugs have been in the past. But that's not the really important thing. The complexity and resulting potential instability of PID 1 is the important thing. Complexity is OK in e.g. Emacs or PulseAudio: if they die, you still have a system you can debug them with. If PID 1 dies, you don't, and you probably have disk corruption too. If PID 1 dying didn't instantly panic the kernel, this might be different -- but it does, so I am incredibly paranoid about it.
Posted Nov 2, 2012 19:59 UTC (Fri) by HelloWorld (guest, #56129)
> and if they don't, kill(1) does).
Except when sshd(8) was shut down before and you thus don't have a shell to run kill(1) from, which is common when rebooting a machine. I actually remember a comment from somebody here on LWN who had to drive hundreds of miles to a server room precisely because of this.
> However, it works in the sense that the system boots reliably, 100% of the time: sysvinit /sbin/init has never ever once failed for me, probably because it is dead stable and simple and does almost nothing and never changes.
Otoh, all those pig ugly shell scripts you mentioned earlier can and do fail all the time. Cyberax pointed to this blog post earlier, and that's but one example.
You compare sysvinit to systemd and ignore all those by no means trivial shell scripts which do all the actual work and which, unlike sysvinit itself, change all the time and are riddled with bugs. This is exactly the kind of apples-to-oranges comparison I meant earlier.
> And it uses large, complex libraries like libdbus which I *know* to have had crash and security bugs, and fairly recently at that. I'm not willing to tolerate the risk of losing PID 1 to such bugs
Otoh, you are willing to use the Linux kernel, which is much, *much*, *MUCH* bigger and more complex than libdbus, gets a major release with tons of changes every ~3 months and has security issues so regularly that its developers don't even bother documenting them anymore. Yeah, that makes sense.
> In this it is very like filesystems: I don't use btrfs anywhere I need to keep running in order to get my work done either
Heh, so you stick to tried-and-true file systems like ext4? Oh, the irony :)
> The complexity and resulting potential instability of PID 1 is the important thing.
systemd is *way* less complex than what it replaces. This guy put it best:
Posted Nov 2, 2012 21:58 UTC (Fri) by nix (subscriber, #2304)
How do those scripts make sure that things like double-forking perl scripts started by apache are killed when apache is? Oh wait: they don't.
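To illustrate the double-fork problem mentioned above, here is a minimal Python sketch (the function name `double_fork` is made up for this example). It shows that once the intermediate process exits, the grandchild is reparented away from it, so whoever started the intermediate has no parent/child link left to follow. Which process adopts the orphan (init or a subreaper) depends on the system.

```python
import os

def double_fork():
    """Double-fork; return (intermediate_pid, grandchild's parent PID afterwards)."""
    report_r, report_w = os.pipe()   # grandchild -> us: its parent PID
    gate_r, gate_w = os.pipe()       # us -> grandchild: "intermediate is gone"
    mid = os.fork()
    if mid == 0:
        if os.fork() == 0:
            # Grandchild: block until the top-level process closes the gate.
            os.close(gate_w)
            os.close(report_r)
            os.read(gate_r, 1)       # returns at EOF
            os.write(report_w, str(os.getppid()).encode())
            os._exit(0)
        os._exit(0)                  # intermediate exits immediately
    os.close(report_w)
    os.close(gate_r)
    os.waitpid(mid, 0)               # intermediate reaped; grandchild is an orphan
    os.close(gate_w)                 # release the grandchild
    new_parent = int(os.read(report_r, 32).decode())
    os.close(report_r)
    return mid, new_parent
```

Calling `double_fork()` returns the PID of the intermediate process and the PID the grandchild now sees as its parent; the two differ, which is why killing by remembered PID misses such processes.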
> Except when sshd(8) was shut down before and you thus don't have a shell to run kill(1) from, which is common when rebooting a machine.
> Otoh, all those pig ugly shell scripts you mentioned earlier can and do fail all the time.
> You compare sysvinit to systemd and ignore all those by no means trivial shell scripts which do all the actual work
> Otoh, you are willing to use the Linux kernel
The Linux kernel introduces instability, granted, and every upgrade is a bit hair-raising (even stable kernel upgrades these days! :) ). However, systemd introduces more instability, and just like the kernel -- and unlike almost everything else other than glibc -- can instantly wedge and panic my system if it goes wrong. If I would prefer less instability to more, and am happy with the minimal features sysvinit provides (as I am), it follows that I should avoid systemd. As I do.
> Heh, so you stick to tried-and-true file systems like ext4?
The latter factor should not be underestimated. Lennart is also responsive, but he is all bristling with opinions I strongly disagree with, sharp edges and active but invisible laser beams: dealing with him is a lot more stressful than dealing with e.g. tytso. I do not like the thought of getting a load of social stress when already stressed out from trying to fix a system-destabilizing problem, and that's what I suspect I'd get from systemd.
This too is something that probably doesn't apply to other people: a huge proportion of my life has always been devoted to stress management, and 'X is more stressful than Y' is almost always a reason to choose Y over X, regardless of any other benefits of X. In this case, systemd bugfixing is both more likely and more likely to be stressful than the nigh-nonexistent sysvinit bugfixing: thus, sysvinit wins by default, regardless of any killer features it may or may not have. Other people have different priorities.
> systemd is *way* less complex than what it replaces.
sysvinit is dead simple and utterly stable. systemd PID 1 is terrifyingly complex by comparison. Sorry, for people who value stability, systemd is still completely out of the question, and will be for years. You'll note that this would be true no matter the code quality of systemd, no matter its benefits, it could be the best code in the history of the human race -- where things that can panic the kernel if they go wrong are concerned, that's irrelevant beside the failure risk.
Posted Nov 2, 2012 22:20 UTC (Fri) by dlang (✭ supporter ✭, #313)
systemd bugfixing would be bad enough, but systemd is continually growing new, and more sophisticated features as well.
Posted Nov 2, 2012 23:05 UTC (Fri) by raven667 (subscriber, #5198)
Posted Nov 2, 2012 23:12 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
Posted Nov 2, 2012 23:11 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
I once had to make a very real 4am trip to our datacenter when a rebooting node got stuck in the bind9 initscript, right after it killed sshd.
Systemd makes sure that services always stop successfully. Reliably and automatically.
Posted Nov 3, 2012 2:45 UTC (Sat) by HelloWorld (guest, #56129)
> Er, if that's happened your network interfaces have almost certainly been shut down as well. How on earth is systemd supposed to help here?
systemd helps because, unlike sysvinit, it can reliably terminate a service so you don't have to mess around with kill(1) in the first place.
> You don't get it. They don't fail for me.
Well, lucky you. They do fail for others. I've had at least one boot failure due to a broken init script myself. And it broke for others too, see Scott's blog entry.
> but it is more unstable than sysvinit PID 1,
So you keep saying. But where are all those bugs that supposedly crash systemd all the time? I'm not from Missouri, but you'll have to show me anyway.
Besides, as Cyberax and raven667 have pointed out, there's not actually a whole lot going on in systemd itself; most of the recent activity is in systemd-journald and systemd-logind, neither of which runs as PID 1.
Posted Nov 3, 2012 10:06 UTC (Sat) by rleigh (subscriber, #14622)
Code has bugs, and the number of bugs increases as a function of the code size. systemd is much bigger than sysvinit, with a correspondingly larger probability of hitting such a bug. init is absolutely critical, and having it kept as tiny and simple as possible is essential for a reliable system.
This is not to say that systemd can't be large and complex, just that the complexity should not be in PID1. There's no reason why systemd's PID1 can't be as small as sysvinit's, with the rest in other processes. As a good example, look at s6, which has focussed on reliability to an even greater extent. There is no reason why systemd and other init systems couldn't adopt this approach.
At present, running a safety-critical or guaranteed reliable system with systemd is an untenable proposition. The risk of failure is too high. This isn't about "bugs that supposedly crash systemd"--it doesn't matter if any have been found or not. It's about the fact that a fault in PID1 will bring the system down, and managing that risk. systemd is a much greater risk than sysvinit. It's more reliable in other ways, as discussed in the thread. But that improvement is irrelevant so long as systemd PID1 remains a critical point of failure that is impossible to validate for correctness.
Posted Nov 3, 2012 11:39 UTC (Sat) by HelloWorld (guest, #56129)
Posted Nov 3, 2012 11:44 UTC (Sat) by rleigh (subscriber, #14622)
This is unhelpful, and completely ignores what I said.
Posted Nov 3, 2012 11:58 UTC (Sat) by HelloWorld (guest, #56129)
And again, if you care so much about reliability, the elephant in the room is the Linux kernel. If that is an acceptable risk, then so is systemd.
Posted Nov 3, 2012 12:19 UTC (Sat) by rleigh (subscriber, #14622)
Arguing that reliability in a critical system component is not of concern is absurd. Yes, the Linux kernel can contain bugs. If you do care about reliability, you'll take steps to mitigate the chance of failure. This is a completely separate issue from the robustness and reliability of PID1; there's no need to confuse the discussion by mixing them together.
Ignoring the fact that this is an important concern does not do systemd any favours. systemd could certainly be changed to move the vast majority of the complexity in PID1 to a separate program running in a separate process. There's no intrinsic need for anything to be in PID1 except process reaping, starting/restarting another process and handling shutdown. Everything else can be done in another process; even shutdown--you can just exec the shutdown program. Take a look at how s6 is structured--it has a lot to be said for it, and there's no reason why systemd can't do this.
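As a rough illustration of how little is on that list, here is a Python sketch of the reap-and-restart part only. It is not how sysvinit, s6 or systemd are written; the names `spawn` and `supervise` and the `max_restarts` bound are assumptions made so the loop can run as an ordinary process rather than as a real PID 1. Shutdown handling is left out.

```python
import os

def spawn(argv):
    """Fork and exec argv; return the child's PID."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)            # only reached if exec failed
    return pid

def supervise(argv, max_restarts):
    """Run argv, reap every exited child, and restart the supervised one.

    A real PID 1 would loop forever; max_restarts bounds this sketch.
    Returns the exit code of each run (negative signal number if killed).
    """
    exit_codes = []
    child = spawn(argv)
    while True:
        pid, status = os.wait()      # reaps any child, including adopted orphans
        if pid != child:
            continue                 # an inherited zombie: reaping it is all we do
        if os.WIFEXITED(status):
            exit_codes.append(os.WEXITSTATUS(status))
        else:
            exit_codes.append(-os.WTERMSIG(status))
        if len(exit_codes) > max_restarts:
            return exit_codes
        child = spawn(argv)          # restart the supervised program
```

For example, `supervise(["sh", "-c", "exit 3"], max_restarts=2)` runs the command three times in total and collects its exit code each time. Everything more elaborate (dependencies, logging, service definitions) would live in the program being supervised, not in this loop.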
Posted Nov 5, 2012 1:04 UTC (Mon) by HelloWorld (guest, #56129)
Anyway, I think the risk is tolerable. I have never seen systemd crash on the systems I've been using. Also systemd doesn't just abort on SIGSEGV, it serializes its state and then execs itself anew. The code to do that is used in other places too (i.e. configuration reloading and reboot-less upgrades), so it's not some obscure code path that is never tested.
You'd have to be very unlucky to hit a bug that makes systemd crash and corrupts its internal state enough for the recovery mechanism to fail.
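The general serialize-then-re-exec pattern described here can be sketched in Python as follows. This is not systemd's code and makes no claim about what systemd actually does on a crash: the JSON format, the function names and the `--deserialize` argument as used here are assumptions for illustration, and the signal-handler trigger is omitted.

```python
import json
import os
import sys
import tempfile

def serialize_state(state):
    """Write state as JSON to an unnamed temp file; return an inheritable fd."""
    f = tempfile.TemporaryFile()
    f.write(json.dumps(state).encode())
    f.flush()
    fd = os.dup(f.fileno())          # keep the file alive after f is closed
    f.close()
    os.set_inheritable(fd, True)     # the fd must survive exec
    return fd

def deserialize_state(fd):
    """Read back what serialize_state wrote, then close the fd."""
    os.lseek(fd, 0, os.SEEK_SET)
    with os.fdopen(fd, "rb") as f:
        return json.loads(f.read().decode())

def reexec_with_state(state):
    """Replace this process with a fresh copy that is handed the state fd.

    The new copy is expected to parse the (hypothetical) --deserialize
    argument and call deserialize_state on that fd.
    """
    fd = serialize_state(state)
    os.execv(sys.executable,
             [sys.executable, sys.argv[0], "--deserialize", str(fd)])
```

The process keeps its PID across `exec`, which is the property that matters for PID 1. The obvious weak point, as noted above, is that the state being serialized may itself already be corrupted.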
Posted Nov 5, 2012 20:22 UTC (Mon) by nix (subscriber, #2304)
> Also systemd doesn't just abort on SIGSEGV, it serializes its state and then execs itself anew.
Posted Nov 5, 2012 20:50 UTC (Mon) by jimparis (subscriber, #38647)
> Also systemd doesn't just abort on SIGSEGV, it serializes its state and then execs itself anew.
Now that is neat.
> Also systemd doesn't just abort on SIGSEGV, it serializes its state and then execs itself anew.
Posted Nov 5, 2012 22:12 UTC (Mon) by raven667 (subscriber, #5198)
Posted Nov 6, 2012 0:23 UTC (Tue) by HelloWorld (guest, #56129)
Anyway, if you configure systemd to spawn a shell, you can exec systemd from there, so not everything is lost.
Posted Nov 3, 2012 20:32 UTC (Sat) by Cyberax (✭ supporter ✭, #52523)
What are you going to do now?
Posted Nov 4, 2012 1:16 UTC (Sun) by rleigh (subscriber, #14622)
There is a difference between the reliability of PID1 (e.g. /sbin/init) and the reliability of the programs run by that init such as rc (/etc/init.d/rc) for runlevel change, getty, and then e.g. individual init scripts run by rc/startpar.
In the case of sysvinit, init itself is small, simple and robust. It does little more than run rc on runlevel change, respawn gettys and handle a few other events such as shutdown signals. There is nothing stopping systemd, or a systemd-like complex init running as a respawnable service run directly from init (like getty), layering the more complex stuff on top of an ultra-simple PID1. This is partly what openrc does, building a more complex dependency-based boot on top of sysvinit.
The point here is that a bug in rc or getty will not kill init. And a bug in an init script will not kill rc. PID1 will carry on running, as will your system, if there is a bug in one of these higher level layers. Even in the case of sysvinit, there is scope to strip down PID1 even further--the runlevel change and service respawning could be moved into a separate process, as could shutdown.
While systemd does split some functionality out into additional binaries, the chance of a bug compromising PID1 functioning is much, much higher. Upstart is in a similar situation. Neither of these /need/ to have the complexity directly in PID1.
Posted Nov 4, 2012 1:33 UTC (Sun) by Cyberax (✭ supporter ✭, #52523)
> The point here is that a bug in rc or getty will not kill init. And a bug in an init script will not kill rc. PID1 will carry on running, as will your system, if there is a bug in one of these higher level layers.
Yeah. It's kinda like old classic cars with a thick steel frame: they can run just fine after a collision. You just scrape the driver off the steering wheel, replace the glass, and your descendants can drive it as if nothing has happened!
What use is a robust PID1 if it can't do ANYTHING reliably?
Posted Nov 4, 2012 12:27 UTC (Sun) by nix (subscriber, #2304)
Well, in this case the sysvinit PID1 is !@#*&^!*& unreliable. It can't even kill processes robustly. It can't start them robustly as well - I've had more than one hangup during startup.
You keep on giving complaints about sysvinit that have nothing to do with PID 1 robustness, which is my primary concern when choosing an init implementation. sysvinit never fails to reap zombies: it never fails to run its single rc script per runlevel change (those scripts might later hang, but that is not PID 1's fault). It never, ever dies.
I would be happy with systemd were its PID 1 incredibly simple and never changing and all the work done by something else (which can change as often as it likes without causing instant kernel panics if it goes wrong). But instead its PID 1 is more of a kitchen sink than I'd like. Even sysvinit PID 1 really does too much: I'm definitely going to have a look at s6 and see if it has moved things like process supervision to some other binary. PID 1 should not do this job.
Posted Nov 4, 2012 17:25 UTC (Sun) by Cyberax (✭ supporter ✭, #52523)
Yet it makes absolutely NO sense to view PID1 functionality in itself. It can't do anything, and any script it runs becomes mission-critical. It's easy to make a non-bootable (or non-haltable) system by making a small mistake in a myriad of twisty [not so] little scripts. And it makes no freaking sense that PID1 itself worked fine.
A car analogy: sysv is a metal cube with thick metal walls. It's very safe (since it can't move) and simple. But to make it actually do anything useful you need to add wheels, an engine, a steering system, windows and windshields, etc. And in the end it turns out that a cube on wheels doesn't really work as a car and isn't safe anymore.
Posted Nov 4, 2012 16:59 UTC (Sun) by mathstuf (subscriber, #69389)
IIRC, those comments were from a time when Ubuntu packaging was causing many of the bugs users were running into. Fedora's package was maintained closer to upstream and (I would expect) a newer version with other fixes.
Posted Nov 2, 2012 2:09 UTC (Fri) by Cyberax (✭ supporter ✭, #52523)
Besides, for simple uses systemd is already quite OK. I'm switching to it once it becomes available in Debian.
Posted Nov 2, 2012 18:30 UTC (Fri) by nix (subscriber, #2304)
Posted Nov 2, 2012 17:24 UTC (Fri) by intgr (subscriber, #39733)
Yes. But Poettering seriously studied and considered Upstart. The design of Upstart didn't allow what he wanted to achieve: http://0pointer.de/blog/projects/systemd.html
I believe he said in one of the systemd presentations that he did approach Upstart developers with his ideas, but they did not reach an agreement.
Posted Nov 2, 2012 17:54 UTC (Fri) by dlang (✭ supporter ✭, #313)
I was pointing out that by that logic, the systemd people should never have started systemd; they should have joined one of the other existing projects and modified that instead.
In other words, stop criticizing people for working on the project that they think is right, or you will find that the same criticism works against you as well.
I don't think that there's any niche left where there isn't _some_ existing project that could be enhanced to fit whatever you want to have done. There can be very good reasons for wanting to do something new instead of joining and modifying the existing project (which can include the fact that the devs of the existing project may not want the changes you want)
Posted Nov 5, 2012 2:28 UTC (Mon) by HelloWorld (guest, #56129)
Too much unity leads to stagnation and monopolies, too much diversity leads to chaos, bikeshedding and infighting. Neither is good for the Linux community as a whole, and we should work on finding a middle ground. Today, we're way too far on the bikeshedding side of that fine line, and I happen to think that upstart is one part of that problem.
Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds