
Changes coming for systemd and control groups

Posted Jun 24, 2013 21:32 UTC (Mon) by heijo (guest, #88363)
In reply to: Changes coming for systemd and control groups by mezcalero
Parent article: Changes coming for systemd and control groups

Why not just sync filesystems and hard reboot?

All this work seems totally useless.
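For illustration, a minimal sketch of that "sync and hard reboot" approach using the kernel's magic SysRq interface (assuming kernel.sysrq is enabled); it deliberately bypasses all userspace shutdown logic:

    sync                             # flush dirty pages from userspace
    echo u > /proc/sysrq-trigger     # emergency remount of all filesystems read-only
    echo b > /proc/sysrq-trigger     # reboot immediately: no sync, no unmount, no init involvement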



Changes coming for systemd and control groups

Posted Jun 24, 2013 22:36 UTC (Mon) by foom (subscriber, #14868) [Link]

Is it possible for a single "sync" to produce a safe point across all combinations of filesystems stacked on top of other filesystems, etc., without unmounting things?

I'd expect not (and thus what systemd does is likely safer), but I dunno.

Changes coming for systemd and control groups

Posted Jul 2, 2013 20:56 UTC (Tue) by zblaxell (subscriber, #26385) [Link]

Actually, the sync is often the step that causes a shutdown to hang when the "clean" shutdown procedure is not going well (e.g. due to flaky disks or kernel bugs). In that situation it is better to ask the watchdog hardware to reboot and, failing that, to tell the kernel to reboot itself without risking a failure in sync. Less is better.

Changes coming for systemd and control groups

Posted Jul 2, 2013 23:14 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

And guess what: systemd HAS support for hardware and software watchdogs. It's right there in the documentation: http://0pointer.de/blog/projects/watchdog.html

And you might note that traditional rc-scripts have nothing of the kind. I once had a very nice 3am trip to our datacenter to power-cycle a server that was stuck trying to stop BIND during a reboot.
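For reference, the pieces fit together roughly like this (a sketch only; the daemon path and the intervals here are invented for illustration):

    # /etc/systemd/system.conf -- PID 1 keeps the hardware watchdog fed
    [Manager]
    RuntimeWatchdogSec=30s
    ShutdownWatchdogSec=10min

    # example service unit -- systemd restarts the daemon if it stops checking in
    [Service]
    Type=notify
    ExecStart=/usr/local/bin/mydaemon
    WatchdogSec=20s
    Restart=on-failure

    # inside the daemon (or a wrapper), at least once every 20s:
    systemd-notify WATCHDOG=1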

Changes coming for systemd and control groups

Posted Jul 3, 2013 15:39 UTC (Wed) by zblaxell (subscriber, #26385) [Link]

Yes, I've had to disable that systemd feature too; however, in this case it's because I usually deploy a much more application-specific watchdog process. systemd's built-in watchdog might be inadequate or non-portable, but it is not a radical departure from legacy behavior and it is better than not having any watchdog code at all.

Obviously every system based on rc-scripts has been able to run such a daemon for decades prior to systemd's existence. The page at the link you provided even has a link to such a daemon that is at least 14 years old. I recently had an unsolicited email conversation with one of that daemon's maintainers about their plans to extend the daemon to do more invasive application-specific aliveness checking (I thought the idea wasn't insane, but I probably wouldn't use it because watchdog daemons are trivial to implement while solutions to political problems arising from software integration in critical code paths are not).

The more complicated the shutdown code is, the more likely it is to fail. If we try to stop mostly-stateless server daemons (like BIND) which are explicitly designed for and respond well to a famous widely-deployed 20-year-old system-wide SIGTERM/pause/SIGKILL sequence using anything with more failure modes than exactly that sequence, then embarrassing failure is simply inevitable.
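That sequence is, roughly, what the traditional Debian sendsigs shutdown step amounts to; a sketch in shell, with the pause length picked arbitrarily:

    killall5 -15    # SIGTERM to every process outside the calling session
    sleep 5         # give well-behaved daemons a chance to exit cleanly
    killall5 -9     # SIGKILL whatever is still around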

Less is better. systemd has lots of tactical cleverness in its implementation, but at the same time it gets the basic strategy wrong.

I have a test machine running systemd. On February 13, 2013 I executed the 'reboot' command and today I'm still waiting for it to finish. Interestingly, the rest of this particular system's function seems to be unimpaired--including systemctl and systemd services--and I still push software to it for testing regularly. I'm now tempted to take bets on how many years it will take for that machine to reboot. It happens to have two independent battery-backed power supplies... };->

Changes coming for systemd and control groups

Posted Jul 4, 2013 4:13 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

>The more complicated the shutdown code is, the more likely it is to fail. If we try to stop mostly-stateless server daemons (like BIND) which are explicitly designed for and respond well to a famous widely-deployed 20-year-old system-wide SIGTERM/pause/SIGKILL sequence using anything with more failure modes than exactly that sequence, then embarrassing failure is simply inevitable.
I had to go to our datacenter to power-cycle our server precisely because the BIND9 rc-scripts did not do the timeout correctly. All on a stock Debian installation, no cgroups or systemd in sight.

Changes coming for systemd and control groups

Posted Jul 4, 2013 13:02 UTC (Thu) by jubal (subscriber, #67202) [Link]

…what kind of “server” doesn't offer remote powercycle these days?

Changes coming for systemd and control groups

Posted Jul 4, 2013 17:17 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

That was about 4 years ago, and at that time I had to jump through some hoops to get a separate, protected management circuit from our datacenter for IPMI.

Changes coming for systemd and control groups

Posted Jul 4, 2013 17:59 UTC (Thu) by zblaxell (subscriber, #26385) [Link]

FWIW I modify the Debian init scripts too. rm -f /etc/rc?.d/K* is a pretty good start. The K* scripts are only needed when switching between runlevels, not when strictly booting up (when there is nothing to kill) or shutting down (when imminent termination is inevitable).

The problem in the BIND case isn't SysVInit or rc-style scripts, and it's not systemd's prerogative to solve. The problem is someone put BIND code on the critical path for rebooting. That is the mistake that needs to be corrected. Repeat for daemons we might find in a thousand other packages with code that is spuriously placed where it does not belong.

Server daemons that have special state-preserving needs can have scripts that try to bring them down with a non-blocking timeout (or systemd can do it itself). In practice, such servers don't get rebooted intentionally so the extra code executes only under unusual conditions where criteria for success are strict, or routine conditions (i.e. supervised upgrade of the software) where the criteria for success are greatly relaxed. That means the code doesn't get a lot of field testing, and its worst-case behavior only shows up in situations that are already full of unrelated surprises.
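A bounded, non-blocking stop is easy to express in either world. A sketch, with the daemon name, pidfile, and timeouts invented for illustration:

    # sysvinit/rc-script style: escalate TERM -> KILL with a hard upper bound
    start-stop-daemon --stop --oknodo --retry TERM/30/KILL/5 \
        --pidfile /var/run/mydaemon.pid

    # systemd style: the manager enforces the stop timeout itself
    [Service]
    ExecStart=/usr/sbin/mydaemon
    TimeoutStopSec=30
    SendSIGKILL=yes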

If I'm responsible for an application, then servers are just buildings for my application to live in. I rearrange the interior walls and fixtures of the building for the convenience of my application. If I need to reboot the server, it's because that building is on fire and I need a new one. I'll try to rescue my application state first--asynchronously, and with application-specific tools. When I have finished that, I'll tell the server to reboot. With that reboot request I implicitly guarantee there is no longer state on the server that I care about losing--my application is not running any more, or its state is so badly broken that I've given up. It would be convenient to umount filesystems and clean up state outside of my application if possible (in threads or processes separate from the rebooting thread due to the high risk of failure) but it is never necessary. The only necessary code in this situation is a hard reboot. Anything in the reboot critical path that isn't rebooting is a bug.

If I'm responsible for the server, then applications are cattle and I might want robot cowboys to organize them. This case is the same as the previous case, since my server would effectively be running a single customer-hosting application. systemd makes some sense as that application--although still not necessarily as PID 1, and certainly not as the sole owner of a variety of important kernel features. If a customer stopped paying for service or did something disruptive, I might intentionally destroy their process state with SIGKILL and cgroups. My customer agreement would have the phrase "terminated at any time" sprinkled liberally throughout the text so that nobody can claim to be surprised.
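With the cgroup-v1 freezer interface, that could look roughly like the following (the group path "customer42" is hypothetical):

    # freeze the customer's group so nothing can fork out from under us, then kill it all
    echo FROZEN > /sys/fs/cgroup/freezer/customer42/freezer.state
    while read pid; do kill -9 "$pid"; done \
        < /sys/fs/cgroup/freezer/customer42/cgroup.procs
    echo THAWED > /sys/fs/cgroup/freezer/customer42/freezer.state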

Changes coming for systemd and control groups

Posted Jul 5, 2013 1:22 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

No, the problem of unreliable scripts in the boot process IS something that MUST be solved by systemd (or its equivalent).

Bugs ALWAYS happen and they MUST be accounted for. That's why we use OSes with memory protection and separate address spaces.

If I have to manually check all the scripts to make sure that an error in a BlinkKeyboardLightDaemon doesn't stop the entire boot process, then such a system has no place except in a trashcan.

Changes coming for systemd and control groups

Posted Jul 5, 2013 3:14 UTC (Fri) by zblaxell (subscriber, #26385) [Link]

Wait, did we switch topics to the boot process now? (as opposed to shutdown/reboot)

I have no complaints about the way systemd starts processes. The insanity starts when it's time to keep processes alive or reboot the system.

If we are still talking about reboot, it sounds like your solution to the problem of having buggy software in a critical code path is to add even more software in the critical path to supervise the buggy software and contain the impact of known bugs without fixing them. Presumably you also run this in some sort of nested container so that if systemd is buggy, some higher level of supervision (maybe another systemd?) can detect the problem and execute even more code in response. That layer could be buggy too, so it's nested supervisor software all the way down? Just the first level (the one with the initial bug) sounds insane to me, and every level of recursion squares the insanity.

"Yo dawg, I heard you like software, so I put some software on your software so you can run code to run your code..."

My approach is to look at the unnecessary code, and realize that even if that code was utterly perfect, it would not do anything more useful than no code at all, but would use more time, space, and power to achieve the null result. The sane thing to do is identify such code and simply remove it.

Changes coming for systemd and control groups

Posted Jul 5, 2013 3:30 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> Wait, did we switch topics to the boot process now? (as opposed to shutdown/reboot)
It doesn't matter.

>If we are still talking about reboot, it sounds like your solution to the problem of having buggy software in a critical code path is to add even more software in the critical path to supervise the buggy software and contain the impact of known bugs without fixing them.
Yup. My solution is to put in place a fairly SMALL amount of carefully tested code that can cope with whatever crap is thrown at it.

Your solution is to build a house of cards, carefully checking each card, because we know that all bugs are easy to spot and fix.

