Of course this goes to a General Resolution

Posted Feb 14, 2014 19:50 UTC (Fri) by smurf (subscriber, #17840)
In reply to: Of course this goes to a General Resolution by mbt
Parent article: The Debian technical committee vote concludes

> core process-starter/stopper

If you want to stop processes you need to be PID 1, otherwise you get a race condition.

> device-event handler,

Starting processes depends on events.

> CGroups manager,

When you start a server you need to put it in a CGroup.

> logger,

On-disk logging is separate. In-memory logging cannot be, since you want to start logging before root is even mounted.

> cron manager,

Because cron jobs don't just depend on the time of day, these days. They depend on being on AC power, or on the database running, or whatever. So best handled within the

> login/session manager,

separate.

> power management

For instance, shutting down? init needs to cleanly stop all its children and then exec something that's in RAM, otherwise the root file system cannot be unmounted cleanly.

You seem to forget that all these jobs need to communicate closely with each other, need to run all the time, and need to serialize their internal state if they're to be updated without rebooting (i.e. you cannot just kill and restart your event handler and expect that everything is magically still hunky-dory).

So you either deal with five processes and their communication overhead and the subtle race conditions which are *certain* to crop up … or you put all of this into one single-threaded program which always has consistent local state which you can serialize safely if you want to re-exec yourself after an upgrade … and which you call "pid 1", necessarily.
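To make the "serialize and re-exec" idea concrete, here is a minimal Python sketch. It is not how systemd does it; the state layout, the function names and the `--deserialize` flag are assumptions invented for this illustration. The point is only that a single-threaded program can dump its state to a file descriptor, exec a new copy of itself (which keeps the same PID), and pick the state back up from that descriptor.

```python
import json
import os
import sys
import tempfile


def serialize_state(state):
    """Dump state as JSON into an unnamed temp file; return an inheritable fd."""
    f = tempfile.TemporaryFile()
    f.write(json.dumps(state).encode())
    f.flush()
    fd = os.dup(f.fileno())       # keep our own descriptor after f is closed
    f.close()
    os.set_inheritable(fd, True)  # the descriptor must survive exec()
    return fd


def deserialize_state(fd):
    """Read the state back from a descriptor produced by serialize_state()."""
    os.lseek(fd, 0, os.SEEK_SET)  # the offset was left at the end of the data
    with os.fdopen(fd, "rb") as f:
        return json.loads(f.read().decode())


def reexec(state):
    """Replace the running program with a fresh copy; the PID stays the same.

    The "--deserialize" flag is a hypothetical name for this sketch: the new
    copy is expected to look for it in its arguments and call
    deserialize_state() on the descriptor number that follows.
    """
    fd = serialize_state(state)
    os.execv(sys.executable,
             [sys.executable, sys.argv[0], "--deserialize", str(fd)])
```

Because there is only one thread and one copy of the state, nothing can change between the dump and the exec. With several cooperating daemons you would have to coordinate that snapshot across all of them.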

In summary, can we please stop pretending that Lennart & Co. (a) put all of this into one huge binary that's /sbin/systemd, or (b) cobbled the part that _is_ PID 1 together out of spite, or because they didn't know better? 'Cause, you know, it's quite obvious that they actually thought about this and had/have sound technical reasons.


Posted Feb 18, 2014 17:23 UTC (Tue) by HelloWorld (guest, #56129)

> If you want to stop processes you need to be PID 1, otherwise you get a race condition.
You mean because orphans are reparented to init? That can be avoided with prctl(PR_SET_CHILD_SUBREAPER). So I don't really see why systemd needs to run as PID 1 instead of, say, 2 nowadays.
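For anyone who hasn't seen the subreaper flag in action, the following is a small Linux-only sketch in Python using `ctypes`. The `prctl` option numbers come from `<linux/prctl.h>`; the helper names (`become_subreaper`, `is_subreaper`, `orphan_adopted_by`) are made up for this example. It orphans a grandchild on purpose and reports which process the kernel reparented it to.

```python
import ctypes
import os
import time

# Option numbers from <linux/prctl.h>; Linux-only, kernel 3.4 or later.
PR_SET_CHILD_SUBREAPER = 36
PR_GET_CHILD_SUBREAPER = 37

libc = ctypes.CDLL(None, use_errno=True)


def become_subreaper():
    """Ask the kernel to reparent orphaned descendants to this process."""
    if libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1), 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def is_subreaper():
    """Read the flag back with PR_GET_CHILD_SUBREAPER."""
    flag = ctypes.c_int(0)
    if libc.prctl(PR_GET_CHILD_SUBREAPER, ctypes.byref(flag), 0, 0, 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return bool(flag.value)


def orphan_adopted_by():
    """Orphan a grandchild and return the PID of whoever adopted it."""
    r, w = os.pipe()
    middle = os.fork()
    if middle == 0:
        # Intermediate process: fork the grandchild, then exit immediately.
        os.close(r)
        middle_pid = os.getpid()
        if os.fork() == 0:
            # Grandchild: wait until the intermediate parent has exited,
            # then report our own PID and the PID of our new parent.
            while os.getppid() == middle_pid:
                time.sleep(0.01)
            os.write(w, f"{os.getpid()} {os.getppid()}".encode())
            os._exit(0)
        os._exit(0)
    os.close(w)
    os.waitpid(middle, 0)
    data = b""
    while True:
        chunk = os.read(r, 64)
        if not chunk:
            break
        data += chunk
    os.close(r)
    grandchild, new_parent = (int(x) for x in data.split())
    if new_parent == os.getpid():
        os.waitpid(grandchild, 0)  # we adopted it, so we must reap it
    return new_parent
```

After `become_subreaper()`, `orphan_adopted_by()` should return the caller's own PID. Without that call the orphan would normally end up with PID 1, or with some other subreaper further up the process tree.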


Posted Feb 18, 2014 18:44 UTC (Tue) by smurf (subscriber, #17840)

Thanks, I wasn't aware of the SUBREAPER call.

Anyway, the point isn't whether it's possible to run systemd as PID-2. You'd still need a way to signal PID-1 that it should please pivot its root file system to the RAM disk and exec /shutdown, so that your root file system can be unmounted cleanly. And probably some other minor quibbles that seem perfectly solvable, but also completely unnecessary when you can just have the features in PID-1.

Which begs the question: why bother? I still haven't seen any reason what the advantage of running systemd as pid-2 (or pid-2+pid-3+pid-4) would be. "Clean separation of responsibilities into separate processes" is not an argument I can accept, because they all need to run continuously and they all need to talk to each other, so you get increased complexity for no net gain.


Posted Feb 19, 2014 10:15 UTC (Wed) by HelloWorld (guest, #56129)

> You'd still need a way to signal PID-1 that it should please pivot its root file system to the RAM disk and exec /shutdown, so that your root file system can be unmounted cleanly.
Why is it PID 1 who needs to do that?

> Which begs the question: why bother? I still haven't seen any reason what the advantage of running systemd as pid-2 (or pid-2+pid-3+pid-4) would be.
The kernel will panic if PID 1 crashes, so it should be as simple as possible. Now, systemd never actually crashed on any of my systems, but why take the risk if you don't have to?


Posted Feb 19, 2014 11:06 UTC (Wed) by mchapman (subscriber, #66589)

> The kernel will panic if PID 1 crashes, so it should be as simple as possible. Now, systemd never actually crashed on any of my systems, but why take the risk if you don't have to?

I think the alternative complicates things.

If systemd running as PID 2 and marked as a child subreaper were to crash, then its children would be inherited by PID 1. Even if PID 1 were to restart systemd, the new systemd wouldn't be able to wait on those reparented processes any more. PID 1 would be responsible for reaping them when they exit, and PID 1 would need to pass on notifications to that effect to the systemd process (so that it could re-exec them or whatever).
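The underlying restriction is that a process can only wait on its own children. A rough Python sketch of that (helper names are invented for this example): the top-level process forks a child, the child forks a grandchild, and the top-level process then tries `waitpid()` on both while they are still alive.

```python
import os


def can_wait_on(pid):
    """Return True if the kernel lets us wait on pid, False on ECHILD."""
    try:
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        return False
    return True


def wait_on_child_and_grandchild():
    """Return (can wait on child, can wait on grandchild) for the caller."""
    report_r, report_w = os.pipe()  # carries the grandchild's PID upwards
    hold_r, hold_w = os.pipe()      # closing hold_w releases both descendants
    child = os.fork()
    if child == 0:
        os.close(report_r)
        os.close(hold_w)
        grandchild = os.fork()
        if grandchild == 0:
            os.close(report_w)
            os.read(hold_r, 1)      # blocks until the top process closes hold_w
            os._exit(0)
        os.write(report_w, str(grandchild).encode())
        os.close(report_w)
        os.read(hold_r, 1)
        os.waitpid(grandchild, 0)   # the child can reap its own child
        os._exit(0)
    os.close(report_w)
    os.close(hold_r)
    grandchild = int(os.read(report_r, 64))
    os.close(report_r)
    result = (can_wait_on(child), can_wait_on(grandchild))
    os.close(hold_w)                # let both descendants exit
    os.waitpid(child, 0)
    return result
```

The grandchild here plays the role of a service that belongs to some other process: the caller gets `ECHILD` for it. A restarted service manager would be in the same position with respect to services that had already been reparented to PID 1.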

In short, I think using a separate child subreaper brings as many problems as it solves.


Posted Feb 19, 2014 11:34 UTC (Wed) by smurf (subscriber, #17840)

> Why is it PID 1 who needs to do that?

*Every* program which holds a file open on the root file system (or any file system, for that matter) needs to either exit, or exec() a program within the new (RAM disk) root. Otherwise you cannot unmount the root FS.

PID-1 may not exit. Therefore it's its job to exec the last step. PID-2 cannot do that. (OK, it could call the unmount-and-reboot program on the RAM disk after triggering PID-1 to exec a new init there, but again: what would be the point of that additional complexity?)