Posted Jun 1, 2007 14:02 UTC (Fri) by utoddl
In reply to: Process containers
Parent article: Process containers
Fair enough. Let's see if I can connect the dots.
Ignore for the moment the implementation of either groups or process containers, and just look at the semantics. A given process can be in multiple groups; child processes inherit groups from their parents; special circumstances can alter which groups are added to or dropped from a process's group list. Likewise for processes in containers. If you were to replace the labels in the diagram from the article with numbers, you could implement a process's "in-container-x" property with the existing group mechanism.
Process group lists have always been a lightweight set of properties that processes carry around and pass on through fork(). Notwithstanding the fact that (almost) nothing except file systems uses them, it seems somebody finally noticed that the semantics of passing properties around in this way are useful for the other things the article mentions, like processor affinity and throttling.
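To make the inheritance part concrete, here is a minimal sketch (assuming a Unix-like system where Python's os.fork() is available). The helper name groups_in_child is mine, not anything from the article or the kernel. It forks and has the child report its own supplementary group list back to the parent over a pipe.

```python
import os

def groups_in_child():
    """Fork, and return the supplementary group list the child reports."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: send our own group list back to the parent, then exit.
        os.close(read_fd)
        data = ",".join(str(g) for g in sorted(os.getgroups()))
        os.write(write_fd, data.encode())
        os.close(write_fd)
        os._exit(0)
    # Parent: read what the child saw, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as f:
        text = f.read()
    os.waitpid(pid, 0)
    return [int(g) for g in text.split(",") if g]

parent_groups = sorted(os.getgroups())
child_groups = groups_in_child()
print(parent_groups == child_groups)
```

Nothing in the child asks for those groups; they simply come along through fork(), which is the whole point.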
AFS (and later OpenAFS) piggy-backed process authentication group (PAG) membership on the group mechanism. The AFS kernel module would add a group (actually a pair of group numbers) to a process's group list to create a new PAG. Child processes would inherit these just like any other groups through fork(), but no file system -- including AFS -- used these group numbers to check file access. Instead, AFS would use these numbers to associate a process with a specific PAG, which is just a set of processes that share a cached token. The token *is* used for access control, but membership in a PAG is just a property like any other group membership. The semantics of group membership and inheritance just happen to be exactly what you want for an authenticated file system like AFS.
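A toy model of that arrangement, purely for illustration: the reserved range PAG_BASE, the Proc class, and the token table are all hypothetical and are not the encoding or data structures AFS actually uses. The only things taken from the description above are that a pair of group numbers marks the PAG, that fork() copies the group list, and that the token is looked up by PAG rather than by the groups themselves.

```python
from dataclasses import dataclass

PAG_BASE = 0x7F00  # hypothetical: group numbers at or above this mark a PAG

@dataclass(frozen=True)
class Proc:
    pid: int
    groups: tuple

    def fork(self, child_pid):
        # A child starts with a copy of its parent's group list.
        return Proc(child_pid, self.groups)

def enter_new_pag(proc, pag_pair):
    """Drop any old PAG marker groups and append a new pair."""
    plain = tuple(g for g in proc.groups if g < PAG_BASE)
    return Proc(proc.pid, plain + pag_pair)

def pag_of(proc):
    """Return the marker pair identifying the process's PAG, or None."""
    marks = tuple(g for g in proc.groups if g >= PAG_BASE)
    return marks if len(marks) == 2 else None

tokens = {}  # cached token per PAG, keyed by the marker pair

login = enter_new_pag(Proc(100, (1000, 20)), (0x7F01, 0x7F02))
tokens[pag_of(login)] = "token-for-alice"

child = login.fork(101)          # inherits the PAG along with the groups
other = Proc(200, (1000, 20))    # same ordinary groups, but no PAG

print(tokens.get(pag_of(child)))  # token-for-alice
print(tokens.get(pag_of(other)))  # None
```

Note that the marker groups never take part in any file permission check here; they only answer the question "which PAG is this process in?"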
Besides that, though, these semantics happen to be exactly what you want for processor affinity, bandwidth throttling, CPU limits, etc. But rather than piggy-backing these capabilities onto the existing group mechanism as AFS did, they've invented another parallel mechanism for passing process properties around. Group membership and process container "in-ness" are just properties after all.
To be fair, the time-tested group mechanism has its limits. Group lists are rather short (or they were last time I ran into that issue). They also aren't explicitly hierarchical like process containers (though what that buys us wasn't immediately obvious to me upon reading the article). It wouldn't surprise me if the old UNIX groups were eventually reimplemented as containers. Then you could eventually have hierarchical UNIX groups!
The point of my "camel in the tent" comment was that the way AFS piggy-backed the process properties it was interested in on top of groups was met with skepticism and sometimes outright contempt by some kernel developers. The reasons include NIH (Not Invented Here -- AFS predates Linux by a fair few years); the kernel module being maintained out-of-tree (it builds for several OSes other than Linux, and not just for their current versions, so it contains a lot of "cruft", at least in the eyes of the kernel hard-core); and its being hobbled by the IPL, the IBM Public License (basically IBM's GPL with a "we can take it proprietary later if we want" clause). AFS on recent kernels has switched to using keyrings -- yet another special-purpose property propagation mechanism -- to implement PAGs, but the other factors still keep AFS/OpenAFS on the outside looking in.
The kernel goes through this periodic process where some new functionality is added, then somebody points out that this new thing and this other old thing have similar operations, then some common code is developed that they can both use or one gets folded into the other. We've seen it over and over, and I wouldn't be surprised to see it happen with groups and process properties.