
Another daemon for managing control groups

Posted Dec 8, 2013 20:41 UTC (Sun) by anselm (subscriber, #2796)
In reply to: Another daemon for managing control groups by Jandar
Parent article: Another daemon for managing control groups

Systemd still isn't udevd. Udevd is a separate program which can be used without systemd – it is just part of the same source tarball.

It does make technical sense for systemd's PID 1 process to be in charge of cgroups. See, for example, this message where Lennart Poettering explains why this is a reasonable idea.



Another daemon for managing control groups

Posted Dec 9, 2013 11:39 UTC (Mon) by jubal (subscriber, #67202) [Link] (37 responses)

Are you sure we're reading the same text? Let me quote Lennart's explanation why this is a reasonable idea:

> a single-agent, we should make a kick-ass implementation that is
> flexible and scalable, and full-featured enough to not require
> divergence at the lowest layer of the stack.  Then build systemd on
> top of that. Let systemd offer more features and policies and
> "semantic" APIs.

Well, what if systemd is already kick-ass? I mean, if you have a problem 
with systemd, then that's your own problem, but I really don't think why 
I should bother?

I for sure am not going to make the PID 1 a client of another daemon. 
That's just wrong. If you have a daemon that is both conceptually the 
manager of another service and the client of that other service, then 
that's bad design and you will easily run into deadlocks and such. Just 
think about it: if you have some external daemon for managing cgroups, 
and you need cgroups for running external daemons, how are you going to 
start the external daemon for managing cgroups? Sure, you can hack 
around this, make that daemon special, and magic, and stuff -- or you 
can just not do such nonsense. There's no reason to repeat the fuckup 
that cgroup became in kernelspace a second time, but this time in 
userspace, with multiple manager daemons all with different and slightly 
incompatible definitions of what a unit to manage actually is...

We want to run fewer, simpler things on our systems, we want to reuse as 
much of the code as we can. You don't achieve that by running yet 
another daemon that does worse what systemd can anyway do simpler, 
easier and better.

The least you could grant us is to have a look at the final APIs we will 
have to offer before you already imply that systemd cannot be a valid 
implementation of any API people could ever agree on.
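
The bootstrapping problem in the quoted message (a manager that is also a client of the thing it manages) can be pictured as a cycle in a start-up dependency graph. The following is only a minimal sketch: the service names and the `find_cycle` helper are hypothetical and are not taken from systemd, cgmanager, or the mailing-list thread.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    color = {name: WHITE for name in deps}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GREY
        path.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # dep is already on the current path: we found a cycle.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[node] = BLACK
        return None

    for name in deps:
        if color[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


# Hypothetical layout 1: init needs an external cgroup daemon to place services
# into cgroups, but that daemon is itself a service that init has to start.
separate = {"init": ["cgroup-daemon"], "cgroup-daemon": ["init"]}

# Hypothetical layout 2: cgroup management lives inside init itself.
combined = {"init": [], "some-service": ["init"]}

print(find_cycle(separate))
print(find_cycle(combined))
```

A real init system can break such a cycle by special-casing the daemon, which is the "make that daemon special" workaround the message mentions; the sketch only shows why some special-casing is needed.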

This is a reply to Tim Hockin's:

If systemd is the only upstream implementation of this single-agent
idea, we will have to invent our own, and continue to diverge rather
than converge.  I think that, if we are going to pursue this model of
a single-agent, we should make a kick-ass implementation that is
flexible and scalable, and full-featured enough to not require
divergence at the lowest layer of the stack.  Then build systemd on
top of that. Let systemd offer more features and policies and
"semantic" APIs.

We will build our own semantic APIs that are, necessarily, different
from systemd.  But we can all use the same low-level mechanism.

Note Lennart's usual non-confrontational style of discussing obvious things and his typical respect for other participants in the discussion.

Another daemon for managing control groups

Posted Dec 9, 2013 18:46 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link] (36 responses)

Note Lennart's usual non-confrontational style of discussing obvious things and his typical respect for other participants in the discussion.

If you want to dispute what he says, please address the technical point he goes into at some length. Either he has a valid point, in which case you need to address it technically, or he doesn't, in which case you can refute it. But as long as you spend your time ignoring his technical arguments and complaining that he's abrasive, people are left to conclude that you can't justify your objections to systemd and are just hating on it because of who designed it.

Another daemon for managing control groups

Posted Dec 9, 2013 21:49 UTC (Mon) by jspaleta (subscriber, #50639) [Link] (30 responses)

It's also instructive to go back and read the full cgroups list thread from the beginning to put the very short back and forth between Tim and Lennart into context. There is a ton of discussion well before Lennart pops in.

Here's the deal... everybody except the systemd developers basically ignored the new "sane" cgroups interface roadmap for more than a year. The systemd developers responded to the kernel developers' call and put effort into it. Everybody else basically just put their heads in the sand for a year... and now that systemd has the first implementation meant to work with the new "sane" interface, the whole concept of having a single hierarchy with a dedicated manager is coming under fire.

I still don't see why the API systemd exposes is not adequate. I'm not saying that it is or is not. I'm saying I haven't seen anyone put effort into actually pointing out how it's inadequate. I see a lot of discussion about why enforcing a single hierarchy is inadequate. I see a lot of discussion about how certain kernel-side changes enforced by the "sane_behavior" cgroupfs mount option are going to require working userspace code. But an actual technical deficiency in the systemd abstraction for working with the "sane_behavior" kernel development work... I haven't seen it.

And it's still not clear to me that the new cgmanager implementation being spun up right now is going to meet the requirements imposed by sane_behavior. I've taken a look at the cgmanager code, and I'm not convinced that it's "sane_behavior" compatible at present. I think it's relying on mechanisms marked as insane (for example, it appears to me that it's manipulating the tasks object, which I think was marked insane as of July in an upstream patch commit)... if I'm reading the archived cgroups mailing list discussion correctly.

I really need to stress this point. I do not think the alternative implementation consortium, which is pinning its hopes on cgmanager as a usable "sane_behavior" compatible manager implementation, is fully on the same page with where the kernel devs are going with the sane_behavior work. There is a wealth of discussion archived in the cgroups mailing list through all of 2013 about marking functionality as sane or insane, which anyone interested should go back and read up on. cgmanager may be a single hierarchy solution, but I'm not convinced that it's going to meet the strictures of the "sane_behavior" mount option, of which a unified hierarchy is just one of the requirements.

I might be missing something significant, but it really seems like there continues to be a disconnect between what kernel side is doing to create the sane behavior and how cgmanager is being developed right now.

Another daemon for managing control groups

Posted Dec 9, 2013 22:14 UTC (Mon) by anselm (subscriber, #2796) [Link] (29 responses)

I still don't see why the API systemd exposes is not adequate. I'm not saying that it is or is not. I'm saying I haven't seen anyone put effort into actually pointing out how it's inadequate.

That seems to be a motif in most discussions around systemd. Many people are quick to point out how (a) they think Lennart Poettering is a jerk, and (b) systemd is not what they're used to and they don't like it – but ask them to come up with something constructive and all you get is stunned silence.

Another daemon for managing control groups

Posted Dec 9, 2013 23:08 UTC (Mon) by khim (subscriber, #9252) [Link] (28 responses)

Note that here we have an even worse situation.

Think about it. Kernel developers said that they could not offer an API which is safe to use from untrusted code. Then they said that it's impossible to offer an API which is safe to use from trusted code unless it's used by a single process. That is why they are planning to impose this change on userspace.

Fine. Now cgmanager comes along and basically proclaims: hey, kernel developers are lazy and stupid. They claim that it's impossible to offer a sane cgroups API to userspace if it's supposed to be used by more than one process, but we'll just quickly go and cobble one together in a hurry!

Now, I can easily imagine that they will solve their problem for some special case (note how Tim Hockin clearly says that we don't use udev (yet?) so we don't have these problems) and that they will be able to cobble together some solution which will mostly work, as long as you are not doing “anything bad”. But will they be able to invent something usable for the general case? I highly doubt it. Why? It's easy: kernel developers cannot do that. And the kernel has access to all the knobs userspace can access, plus tons of other, userspace-invisible knobs, and the kernel itself can be changed if that's really needed. And yet they could not do what the cgmanager authors are planning to do.

Now, I don't fully understand why kernel developers insist that only one process must manage cgroups and why this work cannot be delegated (even to other trusted processes, not to normal processes created by untrusted users). But if the cgmanager authors are indeed able to solve that issue sanely (which probably mostly means “safely in the general case”), then what James says will make perfect sense: If anyone, systemd included, wants to do a new API, it must support all use cases as well. Ideally, it should be agreed to and in the kernel as well rather than having some userspace filter. Yes, this is an ideal outcome, and yes, if it's possible to do that then it obviously must be done. But the whole story of a single controller in userspace started with the supposition that such a solution is impossible, remember?

Basically the whole discussion boils down to one simple point: systemd does not need any changes no matter what happens to cgmanager! If cgmanager can indeed offer a sane, safe and flexible API to other daemons, then the kernel developers have made a serious error in judgment, and a similar API can be offered directly by the kernel and used by systemd. If this endeavour is found to be impossible and cgmanager only manages to cover some corner cases (e.g. only cases where hardware is static and does not appear and disappear willy-nilly), then, again, systemd should not be changed: it must work in the general case, not just when cgmanager is safe to use!

Of course, if cgmanager plans to offer a somewhat restricted (and thus actually doable) API and not just a drop-in replacement for what we have now, then said API can be added to systemd, but so far it does not look that way. On the contrary, it looks like the plan is to offer the exact same API which the kernel offers today (with all its problems), just as a separate daemon rather than a kernel API. This is not progress!

Another daemon for managing control groups

Posted Dec 10, 2013 2:50 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (27 responses)

>Why? It's easy: kernel developers can not do that
Wrong. They do not WANT to do that.

It's as simple as that. Right now delegation of cgroups to untrusted users _WORKS_ if one limits themselves to fully hierarchic groups like 'memory' or 'cpu'.

The proposed insane_behavior simply does the minimum amount required for SystemD to work and not an inch more.

Another daemon for managing control groups

Posted Dec 10, 2013 3:18 UTC (Tue) by pizza (subscriber, #46) [Link] (26 responses)

> Right now delegation of cgroups to untrusted users _WORKS_ if one limits themselves to fully hierarchic groups like 'memory' or 'cpu'.

Are you, as a kernel developer, willing to take the very massive chance that userspace will limit themselves to such an arrangement?

(If so, that is a very different stance than you typically take when it comes to such things...)

Another daemon for managing control groups

Posted Dec 10, 2013 3:28 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (25 responses)

> Are you, as a kernel developer, willing to take the very massive chance that userspace will limit themselves to such an arrangement?
Does the kernel forbid setting the SUID bit on /bin/bash? It's the same thing. If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

And trusting userspace to have interest in its own security is OK. For example, one can easily screw the kernel up by granting untrusted and malicious users excessive permissions on /sys. One can easily do "chmod -R a+w /sys", for FSM's sake!

Does it mean that the kernel should forbid changing the mode and ownership of /sys nodes and lock them down to be accessible only from SystemD? Oh wait, I don't want to give SystemD developers new ideas.

Another daemon for managing control groups

Posted Dec 10, 2013 3:53 UTC (Tue) by pizza (subscriber, #46) [Link] (7 responses)

Wow, way to throw out some strawmen arguments, peppered liberally with more uncalled-for Lennart-bashing. How was that in any way related to what I'd asked? How is *any* of this cgroup API nonsense Lennart's doing?

It's not like he wrote the original cgroup API, pointed out the problems with it, or wrote the new API. systemd was a user of the old API then adapted to kernel-imposed requirements with the new API, not the other way around.

I find it disappointing that you are trying to split hairs here, given how generously you tend to berate others for doing the same.

Another daemon for managing control groups

Posted Dec 10, 2013 4:07 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> How is *any* of this cgroup API nonsense Lennart's doing?

Stop pretending. Systemd drives cgroup development (Tejun Heo works at RedHat, remember) and Lennart gleefully supported "single-writer" model.

> It's not like he wrote the original cgroup API, pointed out the problems with it, or wrote the new API. systemd was a user of the old API then adapted to kernel-imposed requirements with the new API, not the other way around.

Wrong. Tejun Heo basically redesigned cgroups to better fit SystemD model. Everything else (nesting, delegation) is brushed aside as "impossible" or "insecure" without explanations (really, try to find them - there are none).

And it shows, Lennart basically goes out and says: "if we allow delegation and nesting then we'd have to actually treat cgroups interface as a kernel ABI, and if we don't then we can pretend that systemd is the sole user of cgroups and do whatever we want with it in future".

Now, I really like SystemD and I'd like to see it in Debian. But I totally disdain the behavior of cgroups developers.

Another daemon for managing control groups

Posted Dec 10, 2013 5:47 UTC (Tue) by raven667 (guest, #5198) [Link] (1 responses)

Well, you've laid out a clear worldview of how you see things, and I don't think there is anything that can be added or taken away. I have to add, though, that I don't see the corruption and negativity that you see, so I don't know if that means I'm blind or if you just need to relax. All the best.

Another daemon for managing control groups

Posted Dec 10, 2013 5:53 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

It's not a corruption in the sense of taking bribes or anything. It's just tunnel vision - cgroups and systemd developers simply do only what they need, and nothing more. All the requests to be more reasonable are simply ignored.

I'd very much prefer if systemd and cgroups were developed by two different competing companies. Absent that, I hope that someone with a large enough cluebat makes cgroup developers see the error of their ways.

Another daemon for managing control groups

Posted Dec 10, 2013 13:50 UTC (Tue) by corbet (editor, #1) [Link] (3 responses)

Having watched this whole process, I have to disagree a bit. Tejun set in to the task of reworking cgroups to make them more maintainable and usable; the systemd folks found themselves having to react to that change. Systemd was using the multiple hierarchy feature, keeping its own special cgroup hierarchy off to the side. Remember the PaxControlGroups document? See this note from last June (and this one) where systemd's response to the cgroup changes was worked out.

Now, in the process of redesigning things, Tejun has certainly talked to the users of cgroups, including the systemd developers. Doing otherwise would not have been smart. But I do not think it's fair to say that systemd has driven these changes.

Another daemon for managing control groups

Posted Dec 10, 2013 16:12 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

I was referring to these messages as well. They read exactly as if Tejun Heo privately talked to SystemD developers and then systemd and cgroup folks presented the future changes as a fait-accompli with no public discussions.

Another daemon for managing control groups

Posted Dec 10, 2013 17:23 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

In other words, your assertion is a guess and you don't really know but the language you use makes it sound almost like a conspiracy. It is misleading at best.

Another daemon for managing control groups

Posted Dec 10, 2013 18:06 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Well, from my vantage point it definitely looks like cgroups developers are doing only what systemd developers want. It might be accidental, but I really don't care.

Another daemon for managing control groups

Posted Dec 10, 2013 10:49 UTC (Tue) by khim (subscriber, #9252) [Link] (15 responses)

If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

Sure. Setuid was an interesting hack, but in hindsight it's obvious that it created a lot of security problems and gave very few practical benefits. Windows uses central daemons with DBUS-services to implement such functionality and it works just fine there.

The only big question is how to support backward compatibility: it may be a bigger hassle than keeping the setuid bit around.

Another daemon for managing control groups

Posted Dec 10, 2013 16:13 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (13 responses)

Except that any DBUS-based service would get the same troubles, only more complicated. The need to create a privileged process remains, but it's obscured by complex interfaces.

Another daemon for managing control groups

Posted Dec 10, 2013 17:13 UTC (Tue) by khim (subscriber, #9252) [Link] (12 responses)

Really? What's so complicated about an interface which is supposed to just start the program? It just needs to check credentials and do that. It always starts applications in a predetermined environment with known starting conditions.

Compare with today's approach, where a bazillion parts of the kernel must know about the suid bit (euid vs uid), many libraries need to know about the suid bit (euid vs uid), glibc must specifically handle startup of setuid binaries (and there were many exploits around this process), and binaries often need special handling if they are ever supposed to run as suid binaries. Sorry, but the argument is not convincing.
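
The "euid vs uid" distinction mentioned here is the check that setuid-aware code ends up repeating. As a minimal sketch (the helper names are made up for illustration, and on Linux the setuid bit is ignored for interpreted scripts, so an ordinary Python run will normally report no elevation):

```python
import os


def runs_with_elevated_ids() -> bool:
    """True when the effective IDs differ from the real IDs, as in a setuid/setgid program."""
    return os.geteuid() != os.getuid() or os.getegid() != os.getgid()


def describe_ids() -> str:
    """Human-readable summary of the real and effective user IDs of this process."""
    return f"real uid={os.getuid()} effective uid={os.geteuid()}"


print(describe_ids())
print("elevated:", runs_with_elevated_ids())
```

Libraries that must not trust the environment of a setuid caller typically perform a comparison of this kind before honoring things like environment variables or user-supplied paths.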

Note that even today, when the suid bit is actually available, many programs do not use it and use a centralized-privileged-daemon scheme instead (things like apache, ftp, mysql and countless other daemons). Strange, isn't it?

Sorry, but the setuid bit is obviously a mistake. It's not easy to replace the setuid bit with a DBUS interface today, and perhaps it's not even worth trying (the transition pain can easily outweigh any potential gain), but the design itself is obviously too complex and too fragile. That's not even worth discussing.

Another daemon for managing control groups

Posted Dec 10, 2013 18:10 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (11 responses)

>Really? What's so complicated about an interface which is supposed to just start the program? It just needs to check credentials and do that. It always starts applications in a predetermined environment with known starting conditions.

Because it will have ALL the faults of suid and lots of additional faults of a half-baked userspace implementation. For example, think about signals (especially RT signals and SIGSTOP/SIGKILL). I can kill my SUID program using a straightforward "kill" utility, how would you do this with SuidD?

I'm actually speaking from experience - we have such a daemon in our system. It's simply not possible to replicate all the kernel-level functionality.

SystemD is repeating ALL the problems of this approach. For example, they have to cobble something together to handle delegation to containers while simple bind-mount is enough right now to nest cgroups.

Another daemon for managing control groups

Posted Dec 10, 2013 21:46 UTC (Tue) by khim (subscriber, #9252) [Link] (10 responses)

Because it will have ALL the faults of suid and lots of additional faults of a half-baked userspace implementation.

Really? Which ones?

For example, think about signals (especially RT signals and SIGSTOP/SIGKILL).

What about signals?

I can kill my SUID program using a straightforward "kill" utility

Yup. And it is a problem security-wise.

how would you do this with SuidD?

Most likely answer: you would not be able to do that. You will probably have some high-level knobs, but you will not be able to just send random signals to random privileged programs. And this is a “good thing”™.

I'm actually speaking from experience - we have such a daemon in our system. It's simply not possible to replicate all the kernel-level functionality.

Of course not! It would be a pointless exercise to just shuffle functionality around. It's the other way around: suid is a problem because it gives you a huge amount of rope to hang yourself with. SuidD will give you a much, much smaller amount of rope. Yes, this will also mean that some brain-dead designs will become impossible, but this will just mean that you will need to spend a bit more time thinking about the design of your system up front. What real-world task are you trying to solve with signals? Why do you think it can only be solved by giving some random shell script the right to affect a privileged process on your system?

SystemD is repeating ALL the problems of this approach.

Wow. Thanks for bringing that to my attention. Seriously, no sarcasm. This cgroups ≈ suid analogy really helps to show just why it's a bad idea to give access to just some random user to the capabilities of cgroups… but it still does not explain why only one daemon can ever manipulate cgroups. OK, it needs to be a privileged daemon, but it's still not entirely clear to me just why it must be PID 1.

For example, they have to cobble something together to handle delegation to containers while simple bind-mount is enough right now to nest cgroups.

Well, it was always a good idea to have a daemon which does that, so I'm not sure why you are only now trying “to cobble something together”. The problematic fact is that all these solutions must be tied somehow to systemd; they cannot just exist as yet another daemon on the side. But this is not systemd's fault: AFAICS it was imposed by kernel-side changes.

Another daemon for managing control groups

Posted Dec 10, 2013 22:39 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

>Most likely answer: you would not be able to do that. You will probably have some high-level knobs, but you will not be able to just send random signals to random privileged programs. And this is a “good thing”™.

And here's the problem - it's "my way or the highway".

Let's discuss a very simple SUID program - good old 'ping' utility. A user should be able to watch live its output, so some kind of shim utility should be used to transfer standard FDs to the DBUS service. This shim must also be running all the time while the DBUS ping service is running.

So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).

So far so good, until I press Ctrl-Z. Whoops. SIGSTOP can't be captured.
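
The claim about uncatchable signals can be checked directly. The sketch below tries to install a handler for a few signals and reports which ones the kernel allows; `can_install_handler` is a hypothetical helper written for this illustration. One detail worth noting: on a typical terminal Ctrl-Z delivers SIGTSTP, which a shim could catch and forward, while SIGSTOP and SIGKILL sent via `kill` cannot be caught at all.

```python
import signal


def can_install_handler(sig: signal.Signals) -> bool:
    """Try to install a no-op handler for sig, restoring the previous handler on success."""
    try:
        previous = signal.signal(sig, lambda signum, frame: None)
    except (OSError, ValueError):
        # OSError: the kernel refuses a handler for this signal.
        # ValueError: not called from the main thread.
        return False
    if previous is not None:
        signal.signal(sig, previous)
    return True


for sig in (signal.SIGINT, signal.SIGTSTP, signal.SIGSTOP, signal.SIGKILL):
    print(sig.name, "catchable" if can_install_handler(sig) else "not catchable")
```

So a forwarding shim can relay Ctrl-C and Ctrl-Z, but it has no way to notice an explicit `kill -STOP` or `kill -KILL` aimed at itself and pass it on to the privileged side.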

And that's without going into the gory details of controlling terminals, ptys and realtime signals (can you say 'priority inversion'?). It doesn't matter that YOU don't like signals - they are de facto used in the world out there.

But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want. Can you tell me how DBUS services are secured? How can I audit this security? Can I write an AppArmor policy to restrict '/usr/bin/firefox' access to '*cgroup*'?

Oh, and we have this nice Criu project - but it won't be able to checkpoint the DBUS-based service (it can't checkpoint only one end of a Unix socket).

And we can leave out minor details like confusing 'ps' output.

In the end, the DBUS-based solution is going to be an inferior and unreliable construct. And that's exactly what is happening with SystemD and cgroups right now. They are building an inferior wrapper on top of a kernel interface, that's in itself WORSE than the status quo.

>Wow. Thanks for bringing that to my attention. Seriously, no sarcasm. This cgroups ≈ suid analogy really helps to show just why it's a bad idea to give access to just some random user to the capabilities of cgroups…
Yes, probably there are several tight spots in the cgroups API that might give users too many capabilities to harm the system. But so do /sys, /proc and namespaces - yet all of them are accessible to users.

Another daemon for managing control groups

Posted Dec 11, 2013 0:35 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

And that's without going into the gory details of controlling terminals, ptys and realtime signals (can you say 'priority inversion'?). It doesn't matter that YOU don't like signals - they are de facto used in the world out there.

It's not about signals. It's about system design. Any time indirection goes from an unprivileged process to a privileged one it must be accounted for and controlled. That's really hard to do with the setuid approach, and most programs don't bother to do it.

Let's discuss a very simple SUID program - good old 'ping' utility.

Sure, let's do that. Consider the fact that said utility plays with very low-level stuff and can easily hurt not just your system but also neighboring systems. Let's see if we can actually do that:
$ ping -f www.google.com
PING www.google.com (74.125.143.106) 56(84) bytes of data.
ping: cannot flood; minimal interval, allowed for user, is 200ms

Wow! Lookie: there is a protection! But does it actually work? Of course not: you can still run 1000 pings in parallel and this will have basically the same effect.
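
The arithmetic behind that point is simple: a per-process minimum interval bounds each process, not the user. A rough sketch, assuming every process sends at exactly the minimum allowed interval (the function name is made up for illustration):

```python
def aggregate_packets_per_second(min_interval_ms: int, processes: int) -> float:
    """Total send rate when each process sends one packet per min_interval_ms."""
    return processes * 1000 / min_interval_ms


# One ping limited to a 200 ms interval, as in the error message above.
print(aggregate_packets_per_second(200, 1))

# 1000 such pings started in parallel by the same user.
print(aggregate_packets_per_second(200, 1000))
```

With the 200 ms limit from the output above, one process tops out at 5 packets per second, while 1000 parallel processes reach 5000 packets per second in total.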

In most cases what you need is something similar to tcptraceroute -f 30 -q 10 (which works without any special permissions), anyway.

So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).
So far so good, until I press Ctrl-Z. Whoops. SIGSTOP can't be captured.
Which is good, because it's NOT a good idea to do something to a highly privileged process behind its back. The actual privileged ping process may notice that the shim is no longer responding and will probably stop doing its work. That's fine, I don't see anything wrong with that.

Basically you are explaining why the current [broken] interface is hard to replicate with a SuidD daemon. That's fine, I agree with you: it's really hard to replace it with anything sane, and perhaps it's not even a good idea to try to do that right now. It still does not mean that it was a good idea to build it in this form initially.

But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.

How is it different from /proc or /sys access?

Yes, probably there are several tight spots in the cgroups API that might give users too many capabilities to harm the system. But so do /sys, /proc and namespaces - yet all of them are accessible to users.

And that's why we must assume that any process started under any user, with full access to all syscalls and to /proc and /sys, has root access more or less automatically. It's basically impossible to make the Linux kernel secure because its attack surface is so vast.

Looks like people are really starting to think about it, but it's hard to change everything at once, so they are starting from the most recent piece of the puzzle (which can be changed without affecting too many users yet). I'm just not sure what they are planning to do after that: sure, they will secure one tiny piece of the whole, but how exactly will that help if everything else remains in the same hodge-podge-with-bazillion-security-holes state?

Another daemon for managing control groups

Posted Dec 11, 2013 0:51 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

>It's not about signals. It's about system design. Any time indirection goes from an unprivileged process to a privileged one it must be accounted for and controlled.
Welcome to Linux audit framework (that nobody uses, but it's there).

>Sure, let's do that. Consider the fact that said utility plays with very low-level stuff and can easily hurt not just your system but also neighboring systems.
Irrelevant. This particular warning is obsolete, anyway. I can just as well flood the network with UDP datagrams.

>Which is good, because it's NOT a good idea to do something to a highly privileged process behind its back.
Nope. It's a good idea, because SUID processes are specifically meant to interact with users. And SIGSTOP is one of the well-known ways to interact.

>The actual privileged ping process may notice that the shim is no longer responding and will probably stop doing its work. That's fine, I don't see anything wrong with that.
So there should be a heartbeat service? What about power consumption (all those spurious wakeups)?

You're digging the hole even deeper.

>> But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.
>How is it different from /proc or /sys access?
That's it - it's not different at all. Except that I have easy to use tools to restrict access to /sys and /proc - AppArmor or SELinux (for masochists). I'm not aware of similar infrastructure for DBUS.

>And that's why we must assume that any process started under any user, with full access to all syscalls and to /proc and /sys, has root access more or less automatically. It's basically impossible to make the Linux kernel secure because its attack surface is so vast.
Container people managed to fix this. It's possible to start a namespaced container with its own view of /proc and /sys with full root access in it and it will be reasonably secure.

And puzzle comparison is apt - for many years full container support was known as 'containers puzzle' (just search LWN). Many people diligently chipped away all the pieces to make full isolation possible. And it's finally there.

Except now cgroups developers say: "It's too complicated for us, we'll just throw in the towel and make it impossible even if it works right now for many users. For their own good."

Another daemon for managing control groups

Posted Dec 11, 2013 10:33 UTC (Wed) by zdzichu (subscriber, #17118) [Link] (1 responses)

>> But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.
>How is it different from /proc or /sys access?
That's it - it's not different at all. Except that I have easy to use tools to restrict access to /sys and /proc - AppArmor or SELinux (for masochists). I'm not aware of similar infrastructure for DBUS.

It's built into D-Bus. See http://dbus.freedesktop.org/doc/dbus-daemon.1.html (search for "policy") or the contents of the /etc/dbus-1/ directory.

BTW, the proper spelling is "systemd" (no arbitrary uppercase letters).

Another daemon for managing control groups

Posted Dec 11, 2013 19:38 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Nope. Can't use this policy to limit access to certain _processes_.

So we have two "solutions" already: polkit and DBUS policies. Which one is it?

Another daemon for managing control groups

Posted Dec 11, 2013 3:41 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> Can I write an AppArmor policy to restrict '/usr/bin/firefox' access to '*cgroup*'?

There's polkit (which is used to restrict access to APIs such as sleep/hibernate/shutdown) which can be used. I don't know where it acts, though (whether at the dbus-daemon level or in the receiver, which makes *another* call out to polkit asking for permission).

Another daemon for managing control groups

Posted Dec 14, 2013 12:48 UTC (Sat) by kleptog (subscriber, #1183) [Link] (1 responses)

Is it just me, or is this example flawed:
> Let's discuss a very simple SUID program - good old 'ping' utility. A user should be able to watch live its output, so some kind of shim utility should be used to transfer standard FDs to the DBUS service. This shim must also be running all the time while the DBUS ping service is running.

> So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).

Pressing Ctrl-C is different from sending a SIGINT. Namely, when you press Ctrl-C, the kernel sends a SIGINT to anything using that terminal. I imagine the shim would pass through all necessary file descriptors and hence Ctrl-C will work fine.

Would it be weird if you were allowed to Ctrl-C a process, but not be allowed to send it a signal from another terminal?

(Hmm, ping drops back to the normal user after opening the socket; does that mean another process could ptrace it and get access to the socket that way? ptrace blocks it now, but it is something to consider.)

Another daemon for managing control groups

Posted Dec 14, 2013 18:47 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

The shim is using the controlling terminal so it'll get a signal. Not the privileged binary. I checked.

Anyway, you'll still have the problem with SIGSTOP.
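
To make the point above concrete, here is a minimal Python sketch of what such a shim would have to do. It is an illustration under stated assumptions, not how any real D-Bus service works: the "privileged worker" is just a `sleep` process started in its own session (so the terminal's Ctrl-C cannot reach it directly), and the helper names `start_detached_child` and `install_forwarder` are hypothetical.

```python
import signal
import subprocess


def start_detached_child():
    """Start a stand-in for the privileged worker (hypothetical: plain `sleep`).

    start_new_session=True puts it in its own session, so it is not part of
    the terminal's foreground process group and never sees Ctrl-C itself.
    """
    return subprocess.Popen(["sleep", "30"], start_new_session=True)


def install_forwarder(child, sig=signal.SIGINT):
    """Make the shim relay `sig` to the child whenever the shim receives it.

    Returns the previously installed handler. Raises OSError for signals
    that cannot be caught at all, such as SIGSTOP and SIGKILL.
    """
    def forward(signum, frame):
        # The shim, not the worker, got the signal; pass it on explicitly.
        child.send_signal(signum)

    return signal.signal(sig, forward)
```

With `install_forwarder(child)` in place, a SIGINT delivered to the shim is passed on and terminates the worker. Calling `install_forwarder(child, signal.SIGSTOP)` fails with `OSError`, because no process may install a handler for SIGSTOP; the shim is simply stopped and never gets a chance to relay anything, which is the remaining problem mentioned above.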

Another daemon for managing control groups

Posted Dec 11, 2013 3:36 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

> but it's still not entirely clear to me just why it must be PID 1.

It doesn't *have* to be; it's just that with systemd, PID 1 is the simplest place to put it, since it already dances around with cgroups pretty heavily. Using IPC to another service to manage cgroups makes that service a Special Snowflake that can't be set up using cgroups (since it would have to answer questions about how to start it before it starts).

Another daemon for managing control groups

Posted Dec 11, 2013 16:15 UTC (Wed) by khim (subscriber, #9252) [Link]

Well, that's the problem: PID 1 needs cgroups to manage services, and the container manager (presumably well-tested and quite privileged) needs them too. Why couldn't they both use cgroups? Why must one be a client of the other? That requirement was never properly explained.

Another daemon for managing control groups

Posted Dec 11, 2013 22:08 UTC (Wed) by nix (subscriber, #2304) [Link]

See Ian Jackson's 'userv'. I wish it had got more traction than the "none" it did...

Another daemon for managing control groups

Posted Dec 13, 2013 10:07 UTC (Fri) by gioele (subscriber, #61675) [Link]

> If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

That already exists and is widely used in Red Hat and Debian systems: Polkit. https://en.wikipedia.org/wiki/Polkit

Another daemon for managing control groups

Posted Dec 10, 2013 17:59 UTC (Tue) by jubal (subscriber, #67202) [Link] (4 responses)

You might want to read the whole of my comment (yes, it contains two rather long quotes) and judge for yourself.

The italicised note was just a commentary, not a point in itself. The quotes were the point.

Another daemon for managing control groups

Posted Dec 10, 2013 20:58 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

I did read your comment, including the long quotes. I think Lennart is making a cogent point about why it makes sense to roll cgroup control into systemd. He might have been more polite about it, but his technical point seems sound. You have not yet addressed that technical point.

Another daemon for managing control groups

Posted Dec 11, 2013 13:27 UTC (Wed) by jubal (subscriber, #67202) [Link] (2 responses)

I don't see any technical point in telling other customers of the cgroups subsystem (and Google is probably the largest one of all, and will be for the foreseeable future) that their only options are to start using systemd or match its behaviour to the iota, just because it's more convenient for the authors of the systemd environment.

Another daemon for managing control groups

Posted Dec 11, 2013 15:23 UTC (Wed) by raven667 (guest, #5198) [Link] (1 responses)

What do you mean by "match its behavior"? Do you mean having a system with a single writer, or do you mean its D-Bus API? If you aren't using systemd as PID 1, and you don't need your userspace cgroup management client to be portable to other people who do run systemd, then what systemd does or doesn't do is of no consequence to you, right? If anything, you need to work with the kernel developers to make sure you understand their concerns and they understand your use cases. As has been pointed out many times, the single userspace cgroups manager was an idea the kernel team had in order to remove cgroupfs as an attack surface for untrusted customer containers; they only wanted cgroupfs to provide the mechanism for changing settings, without also complicating the internals by encoding the security policy in the kernel.

Another daemon for managing control groups

Posted Dec 11, 2013 16:10 UTC (Wed) by jubal (subscriber, #67202) [Link]

That's grand and I don't think we're really in disagreement.

You may note I was referring to an old e-mail exchange (June, I believe), where Tim Hockin proposed to use a low-level library to provide basic cgroup management functions, a library that would be then used by both systemd and any non-systemd cgroup manager.

This approach was, as you may see, rejected by Mr. Poettering et co. as not viable and possibly non-constructive.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds