Another daemon for managing control groups
Posted Dec 5, 2013 13:03 UTC (Thu) by mezcalero (subscriber, #45103)
In reply to: Another daemon for managing control groups by rwmj
Parent article: Another daemon for managing control groups
However, there are a few things you cannot turn off. That's PID 1's process management, of which cgroups are a part. That's udev device management, and that's the journal (you can turn off the journal's storage on disk, but we still need it to pass the stdout/stderr of all daemons through).
And yes, there are tons of reasons why you want cgroup management in PID 1: it's trivially easy and simple there, and it is not if you do it outside of PID 1. If you do it out-of-process, you need to replicate pretty much the entire state of your service manager in your cgroup manager, since the entities that are resource managed are in 95% of the cases exactly the same entities that are service managed. So splitting them up means you need some form of IPC that constantly replicates the entire tree of services from PID 1 in that other cgroup daemon. That's fragile and messy, and a lot of unnecessary code. IPC always is. Then, there is the issue of cyclic deps: a good PID 1 knows at any time, securely and reliably, which service each process belongs to. That's easy to do with cgroups. However, if cgroup management is done outside of PID 1 in a different process, then PID 1 suddenly becomes a client of that other process, while that other process is itself a managed process that PID 1 wants to manage using cgroups. To start and manage the daemon that lets you deal with cgroups, you need cgroups in the first place. And that's just broken.
Beyond that: when doing cgroups you need an execution queue of some kind and you need a dependency tree between your groups, so that you can always rebuild your tree from the root down, in the right order, with each property set before you go to the next child. You also need to dynamically react to devices coming and going and rebuild/refresh your tree, since many cgroup props take device major/minor pairs, which are pretty much dynamically assigned to devices these days. You also need to be able to deal with groups coming and going, and possibly propagate changes to other groups on the same level. Soooo, an execution queue that is based on a dependency tree that is influenced by services/cgroups coming and going and devices coming and going, that's pretty much exactly what systemd *is*. If you avoid doing this in PID 1 then you are just recreating systemd a second time in a different process. A lot more code to write, to test, to spend resources on. I am sorry, but I am just not going to write or help build such a needlessly complex system. You can have it easy and small, and today. I am really not interested in having it complicated, redundant and larger, at some later day.
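To make the "rebuild from the root down" ordering concrete, here is a minimal Python sketch. It is not systemd's code: the names `apply_order`, `rebuild`, `parents`, `props` and `fake_write` are all made up for illustration, the group paths and values are example data, and the "write" is only a stand-in that records what would be written instead of touching the cgroup filesystem.

```python
from collections import deque


def apply_order(parents):
    """Return group paths ordered so every parent precedes its children.

    `parents` maps each group path to its parent path (None for the root).
    """
    children = {}
    roots = []
    for path, parent in parents.items():
        if parent is None:
            roots.append(path)
        else:
            children.setdefault(parent, []).append(path)

    order = []
    queue = deque(sorted(roots))  # the execution queue, seeded with the root
    while queue:
        node = queue.popleft()
        order.append(node)
        # Children are queued only after their parent has been handled.
        queue.extend(sorted(children.get(node, [])))
    return order


def rebuild(parents, props, write):
    """Apply all properties of a group before moving on to its children."""
    for path in apply_order(parents):
        for key, value in sorted(props.get(path, {}).items()):
            write(path, key, value)


# Hypothetical example tree and properties (illustrative values only).
parents = {
    "/": None,
    "/system.slice": "/",
    "/user.slice": "/",
    "/system.slice/httpd.service": "/system.slice",
}
props = {
    "/system.slice": {"cpu.shares": "1024"},
    "/system.slice/httpd.service": {
        "cpu.shares": "512",
        "memory.limit_in_bytes": "1073741824",
    },
}

log = []


def fake_write(path, key, value):
    # Stand-in for writing an attribute file; it only records the call.
    log.append((path, key, value))


rebuild(parents, props, fake_write)
```

Reacting to a device or group coming or going would, in this sketch, amount to updating `parents`/`props` and calling `rebuild` again; keeping that state in sync with the service manager is exactly the part that gets duplicated if it lives in a second process.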
(And no, a library certainly doesn't work at all for this: you need to cross the privilege boundary and you need a single active queue dispatcher, which doesn't really make a library such a good choice...)
