
Another daemon for managing control groups

Posted Dec 8, 2013 15:27 UTC (Sun) by mpr22 (subscriber, #60784)
In reply to: Another daemon for managing control groups by Jandar
Parent article: Another daemon for managing control groups

The problem is that systemd is the *entire system*.

systemd is not a text editor, interactive MUA, C (or LISP / Fortran / Pascal / insert-your-compiled-language-here) compiler, interactive command-line interpreter, interactive IRC client, Perl (or Python / Ruby / Lua / insert-your-dynamic-language-here) interpreter, assembler, linker, window manager, media player, web browser, PDF viewer, or any of a myriad of other interactive or user-invoked-batch tools I would need or want my desktop/workstation Linux systems to have.

Neither is it a web server, MTA, news server, IRC daemon, or any of a myriad of other things I might need or want my Linux servers to have.

In short: I cannot do useful work with nothing but a Linux kernel and systemd, so in what sense, then, is it the "entire system"?



Another daemon for managing control groups

Posted Dec 8, 2013 20:15 UTC (Sun) by Jandar (subscriber, #85683) [Link] (39 responses)

Once upon a time systemd wasn't udevd or cgroupsd. Where is the boundary?

Another daemon for managing control groups

Posted Dec 8, 2013 20:41 UTC (Sun) by anselm (subscriber, #2796) [Link] (38 responses)

Systemd still isn't udevd. Udevd is a separate program which can be used without systemd – it is just part of the same source tarball.

It does make technical sense for systemd's PID 1 process to be in charge of cgroups. See, for example, this message where Lennart Poettering explains why this is a reasonable idea.

Another daemon for managing control groups

Posted Dec 9, 2013 11:39 UTC (Mon) by jubal (subscriber, #67202) [Link] (37 responses)

Are you sure we're reading the same text? Let me quote Lennart's explanation why this is a reasonable idea:

> a single-agent, we should make a kick-ass implementation that is
> flexible and scalable, and full-featured enough to not require
> divergence at the lowest layer of the stack.  Then build systemd on
> top of that. Let systemd offer more features and policies and
> "semantic" APIs.

Well, what if systemd is already kick-ass? I mean, if you have a problem 
with systemd, then that's your own problem, but I really don't think why 
I should bother?

I for sure am not going to make the PID 1 a client of another daemon. 
That's just wrong. If you have a daemon that is both conceptually the 
manager of another service and the client of that other service, then 
that's bad design and you will easily run into deadlocks and such. Just 
think about it: if you have some external daemon for managing cgroups, 
and you need cgroups for running external daemons, how are you going to 
start the external daemon for managing cgroups? Sure, you can hack 
around this, make that daemon special, and magic, and stuff -- or you 
can just not do such nonsense. There's no reason to repeat the fuckup 
that cgroup became in kernelspace a second time, but this time in 
userspace, with multiple manager daemons all with different and slightly 
incompatible definitions what a unit to manage actually is...

We want to run fewer, simpler things on our systems, we want to reuse as 
much of the code as we can. You don't achieve that by running yet 
another daemon that does worse what systemd can anyway do simpler, 
easier and better.

The least you could grant us is to have a look at the final APIs we will 
have to offer before you already imply that systemd cannot be a valid 
implementation of any API people could ever agree on.

This is a reply to Tim Hockin's:

If systemd is the only upstream implementation of this single-agent
idea, we will have to invent our own, and continue to diverge rather
than converge.  I think that, if we are going to pursue this model of
a single-agent, we should make a kick-ass implementation that is
flexible and scalable, and full-featured enough to not require
divergence at the lowest layer of the stack.  Then build systemd on
top of that. Let systemd offer more features and policies and
"semantic" APIs.

We will build our own semantic APIs that are, necessarily, different
from systemd.  But we can all use the same low-level mechanism.

Note Lennart's usual non-confrontational style of discussing obvious things and his typical respect for other participants in the discussion.

Another daemon for managing control groups

Posted Dec 9, 2013 18:46 UTC (Mon) by rgmoore (✭ supporter ✭, #75) [Link] (36 responses)

Note Lennart's usual non-confrontational style of discussing obvious things and his typical respect for other participants in the discussion.

If you want to dispute what he says, please address the technical point he goes into at some length. Either he has a valid point, in which case you need to address it technically, or he doesn't, in which case you can refute it. But as long as you spend your time ignoring his technical arguments and complaining that he's abrasive, people are left to conclude that you can't justify your objections to systemd and are just hating on it because of who designed it.

Another daemon for managing control groups

Posted Dec 9, 2013 21:49 UTC (Mon) by jspaleta (subscriber, #50639) [Link] (30 responses)

It's also instructive to go back and read the full cgroups list thread from the beginning to put the very short back and forth between Tim and Lennart into context. There is a ton of discussion well before Lennart pops in.

Here's the deal... everybody except the systemd developers basically ignored the new "sane" cgroups interface roadmap for a year plus. systemd responded to the kernel dev side call and put effort into it. Everybody else basically just put their head in the sand for a year... and now that systemd has the first implementation meant to work with the new "sane" interface, the whole concept of having a single hierarchy with a dedicated manager is coming under fire.

I still don't see why the API systemd exposes is not adequate. I'm not saying that it is or is not. I'm saying I haven't seen anyone put effort into actually pointing out how it's inadequate. I see a lot of discussion about why enforcing a single hierarchy is inadequate. I see a lot of discussion about how certain kernel-side changes enforced by the "sane_behavior" cgroupfs mount option are going to require working userspace code. But an actual technical deficiency in the systemd abstraction for working with the "sane_behavior" kernel development work... I haven't seen it.

And it's still not clear to me that the new cgmanager implementation being spun up right now is going to meet the requirements imposed by sane_behavior. I've taken a look at the cgmanager code, and I'm not convinced that it's "sane_behavior" compatible at present. I think it's relying on mechanisms marked as insane (for example, it appears to me that it's manipulating the tasks object, which I think was marked insane as of July in an upstream patch commit), if I'm reading the archived cgroups mailing list discussion correctly.

I really need to stress this point. I do not think the alternative implementation consortium, which is pinning its hopes on cgmanager as a usable "sane_behavior" compatible manager implementation, is fully on the same page with where kernel devs are going with the sane_behavior work. There is a wealth of discussion archived in the cgroups mailing list through all of 2013 about marking functionality as sane or insane, which anyone interested should go back and read up on. cgmanager may be a single hierarchy solution, but I'm not convinced that it's going to meet the strictures of the "sane_behavior" mount option, of which a unified hierarchy is just one of the requirements.

I might be missing something significant, but it really seems like there continues to be a disconnect between what kernel side is doing to create the sane behavior and how cgmanager is being developed right now.

Another daemon for managing control groups

Posted Dec 9, 2013 22:14 UTC (Mon) by anselm (subscriber, #2796) [Link] (29 responses)

I still don't see why the API systemd exposes is not adequate. I'm not saying that it is or is not. I'm saying I haven't seen anyone put effort into actually pointing out how it's inadequate.

That seems to be a motif in most discussions around systemd. Many people are quick to point out how (a) they think Lennart Poettering is a jerk, and (b) systemd is not what they're used to and they don't like it – but ask them to come up with something constructive and all you get is stunned silence.

Another daemon for managing control groups

Posted Dec 9, 2013 23:08 UTC (Mon) by khim (subscriber, #9252) [Link] (28 responses)

Note that here we have an even worse situation.

Think about it. Kernel developers said that they could not offer an API which is safe to use from untrusted code. Then they said that it's impossible to offer an API which is safe to use from trusted code unless it's used by a single process. That is why they are planning to impose this change on userspace.

Fine. Now cgmanager comes along and basically proclaims: hey, kernel developers are lazy and stupid. They claim that it's impossible to offer a sane cgroups API to userspace if it's supposed to be used by more than one process, but we'll just quickly go and cobble one together in a hurry!

Now, I can easily imagine that they will solve their problem for some special case (note how Tim Hockin clearly says that we don't use udev (yet?) so we don't have these problems) and they will be able to cobble together some solution which will mostly work, if you are not doing “anything bad”. But will they be able to invent something usable for the general case? I highly doubt it. Why? It's easy: kernel developers can not do that. The kernel has access to all the knobs userspace can access and also to tons of other, userspace-invisible knobs, and kernel developers can also change the kernel if it's really needed, and yet they could not do what the cgmanager authors are planning to do.

Now, I don't fully understand why kernel developers insist that only one process must manage cgroups and why this work can not be delegated (even to other trusted processes, not to normal processes created by untrusted users), but if the cgmanager authors are indeed able to solve that issue sanely (which probably mostly means “safely in the general case”) then what James says will make perfect sense: "If anyone, systemd included, wants to do a new API, it must support all use cases as well. Ideally, it should be agreed to and in the kernel as well rather than having some userspace filter." Yes, this is an ideal outcome and yes, if it's possible to do that then it obviously must be done, but the whole story with a single controller in userspace started with the supposition that such a solution is impossible, remember?

Basically the whole discussion boils down to one simple point: systemd does not need any changes no matter what happens to cgmanager! If, indeed, cgmanager can offer a sane, safe and flexible API to other daemons then kernel developers have made a serious error in judgment, and a similar API can be offered directly by the kernel and used by systemd. If this endeavour is found to be impossible and cgmanager only manages to cover some corner cases (e.g. only cases where hardware is static and does not appear and disappear willy-nilly) then, again, systemd should not be changed: it must work in the general case, not just when cgmanager is safe to use!

Of course if cgmanager plans to offer a somewhat restricted (and thus actually doable) API and not just a drop-in replacement for what we have now, then said API can be added to systemd, but so far it does not look that way. On the contrary, it looks like the plan is to offer the exact same API which the kernel offers today (with all its problems), just now as a separate daemon, not a kernel API. This is not progress!

Another daemon for managing control groups

Posted Dec 10, 2013 2:50 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (27 responses)

>Why? It's easy: kernel developers can not do that
Wrong. They do not WANT to do that.

It's as simple as that. Right now delegation of cgroups to untrusted users _WORKS_ if one limits themselves to fully hierarchic groups like 'memory' or 'cpu'.
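
For readers unfamiliar with what "delegation" means in practice here: with the cgroup (v1) filesystem interface of that era, handing a subtree to an unprivileged user amounts to creating a child directory and changing its ownership, along with the files used to move processes into it. A minimal sketch in Python, assuming a v1 controller mount; the function name, paths, and IDs are hypothetical, and real setups may need to adjust more files than this:

```python
import os

def delegate_cgroup(controller_root, name, uid, gid):
    """Create a child cgroup under controller_root and hand it to (uid, gid).

    Sketch only: assumes cgroup v1 semantics, where owning the directory
    allows creating sub-groups and owning 'tasks' / 'cgroup.procs' allows
    moving processes into the group.
    """
    path = os.path.join(controller_root, name)
    os.makedirs(path, exist_ok=True)
    os.chown(path, uid, gid)
    for control_file in ("tasks", "cgroup.procs"):
        target = os.path.join(path, control_file)
        # On a real cgroup mount the kernel creates these files itself.
        if os.path.exists(target):
            os.chown(target, uid, gid)
    return path

# Hypothetical usage, as root, on a mounted memory controller:
#   delegate_cgroup("/sys/fs/cgroup/memory", "user-1000", 1000, 1000)
```

Whether such a delegated subtree is safe for every controller is exactly what the thread is arguing about; the sketch only shows the mechanism.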

The proposed insane_behavior simply does the minimum amount required for SystemD to work and not an inch more.

Another daemon for managing control groups

Posted Dec 10, 2013 3:18 UTC (Tue) by pizza (subscriber, #46) [Link] (26 responses)

> Right now delegation of cgroups to untrusted users _WORKS_ if one limits themselves to fully hierarchic groups like 'memory' or 'cpu'.

Are you, as a kernel developer, willing to take the very massive chance that userspace will limit themselves to such an arrangement?

(If so, that is a very different stance than you typically take when it comes to such things...)

Another daemon for managing control groups

Posted Dec 10, 2013 3:28 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (25 responses)

> Are you, as a kernel developer, willing to take the very massive chance that userspace will limit themselves to such an arrangement?
Does the kernel forbid setting the SUID bit on /bin/bash? It's the same thing. If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

And trusting userspace to have interest in its own security is OK. For example, one can easily screw the kernel up by granting untrusted and malicious users excessive permissions on /sys. One can easily do "chmod -R a+w /sys", for FSM's sake!
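
Both examples above (a setuid /bin/bash, a world-writable /sys) are ordinary file mode bits that userspace is free to set and that an administrator can audit. A small sketch of such a check in Python; the helper names are made up for illustration:

```python
import os
import stat

def has_setuid(path):
    """True if the file at path has the set-user-ID mode bit."""
    return bool(os.stat(path).st_mode & stat.S_ISUID)

def is_world_writable(path):
    """True if 'other' users have write permission on path."""
    return bool(os.stat(path).st_mode & stat.S_IWOTH)

# Report on a shell binary if it exists on this system.
for candidate in ("/bin/bash", "/bin/sh"):
    if os.path.exists(candidate):
        print(candidate, "setuid:", has_setuid(candidate),
              "world-writable:", is_world_writable(candidate))
```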

Does it mean that the kernel should forbid changing mode and ownership on /sys nodes and lock it down to be accessible only from SystemD? Oh wait, I don't want to give SystemD developers new ideas.

Another daemon for managing control groups

Posted Dec 10, 2013 3:53 UTC (Tue) by pizza (subscriber, #46) [Link] (7 responses)

Wow, way to throw out some strawman arguments, peppered liberally with more uncalled-for Lennart-bashing. How was that in any way related to what I'd asked? How is *any* of this cgroup API nonsense Lennart's doing?

It's not like he wrote the original cgroup API, pointed out the problems with it, or wrote the new API. systemd was a user of the old API then adapted to kernel-imposed requirements with the new API, not the other way around.

I find it disappointing that you are trying to split hairs here, given how generously you tend to berate others for doing the same.

Another daemon for managing control groups

Posted Dec 10, 2013 4:07 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (6 responses)

> How is *any* of this cgroup API nonsense Lennart's doing?

Stop pretending. Systemd drives cgroup development (Tejun Heo works at RedHat, remember) and Lennart gleefully supported the "single-writer" model.

> It's not like he wrote the original cgroup API, pointed out the problems with it, or wrote the new API. systemd was a user of the old API then adapted to kernel-imposed requirements with the new API, not the other way around.

Wrong. Tejun Heo basically redesigned cgroups to better fit SystemD model. Everything else (nesting, delegation) is brushed aside as "impossible" or "insecure" without explanations (really, try to find them - there are none).

And it shows, Lennart basically goes out and says: "if we allow delegation and nesting then we'd have to actually treat cgroups interface as a kernel ABI, and if we don't then we can pretend that systemd is the sole user of cgroups and do whatever we want with it in future".

Now, I really like SystemD and I'd like to see it in Debian. But I totally disdain the behavior of cgroups developers.

Another daemon for managing control groups

Posted Dec 10, 2013 5:47 UTC (Tue) by raven667 (guest, #5198) [Link] (1 responses)

Well you've laid out a clear worldview of how you see things; I don't think there is anything that can be added or taken away. I have to add though that I don't see the corruption and negativity that you see, so I don't know if that means I'm blind or if you just need to relax. All the best.

Another daemon for managing control groups

Posted Dec 10, 2013 5:53 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

It's not corruption in the sense of taking bribes or anything. It's just tunnel vision - cgroups and systemd developers simply do only what they need, and nothing more. All the requests to be more reasonable are simply ignored.

I'd very much prefer if systemd and cgroups were developed by two different competing companies. Absent that, I hope that someone with a large enough cluebat makes cgroup developers see the error of their ways.

Another daemon for managing control groups

Posted Dec 10, 2013 13:50 UTC (Tue) by corbet (editor, #1) [Link] (3 responses)

Having watched this whole process, I have to disagree a bit. Tejun set in to the task of reworking cgroups to make them more maintainable and usable; the systemd folks found themselves having to react to that change. Systemd was using the multiple hierarchy feature, keeping its own special cgroup hierarchy off to the side. Remember the PaxControlGroups document? See this note from last June (and this one) where systemd's response to the cgroup changes was worked out.

Now, in the process of redesigning things, Tejun has certainly talked to the users of cgroups, including the systemd developers. Doing otherwise would not have been smart. But I do not think it's fair to say that systemd has driven these changes.

Another daemon for managing control groups

Posted Dec 10, 2013 16:12 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

I was referring to these messages as well. They read exactly as if Tejun Heo privately talked to SystemD developers and then systemd and cgroup folks presented the future changes as a fait-accompli with no public discussions.

Another daemon for managing control groups

Posted Dec 10, 2013 17:23 UTC (Tue) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

In other words, your assertion is a guess and you don't really know but the language you use makes it sound almost like a conspiracy. It is misleading at best.

Another daemon for managing control groups

Posted Dec 10, 2013 18:06 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

Well, from my vantage point it definitely looks like cgroups developers are doing only what systemd developers want. It might be accidental, but I really don't care.

Another daemon for managing control groups

Posted Dec 10, 2013 10:49 UTC (Tue) by khim (subscriber, #9252) [Link] (15 responses)

If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

Sure. Setuid was an interesting hack, but in hindsight it's obvious that it created a lot of security problems and gave very few practical benefits. Windows uses central daemons with DBUS-services to implement such functionality and it works just fine there.

The only big question is how to support backward compatibility: it may be a bigger hassle than keeping the setuid bit around.

Another daemon for managing control groups

Posted Dec 10, 2013 16:13 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (13 responses)

Except that any DBUS-based service would get the same troubles, only more complicated. The need to create a privileged process remains, but it's obscured by complex interfaces.

Another daemon for managing control groups

Posted Dec 10, 2013 17:13 UTC (Tue) by khim (subscriber, #9252) [Link] (12 responses)

Really? What's so complicated in the interface which is supposed to just start the program? It just needs to check credentials and do that. It always starts applications in a pre-determined environment with known starting conditions.

Compare with today's approach, where a bazillion parts of the kernel must know about the suid bit (euid vs uid), many libraries need to know about the suid bit (euid vs uid), glibc must specifically handle startup of setuid binaries (and there were many exploits around this process), and binaries often need special handling if they are ever supposed to run as suid binaries. Sorry, but the argument is not convincing.
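
The "euid vs uid" distinction mentioned above is visible to any process: a setuid binary starts with an effective UID that differs from the real UID of the user who launched it, and careful code has to check for that. A minimal sketch in Python; the helper names are illustrative only:

```python
import os

def ids_differ(real_uid, effective_uid):
    """True when the effective UID is not the real UID (setuid-style execution)."""
    return real_uid != effective_uid

def current_process_is_setuid_like():
    """Apply the check to the running process."""
    return ids_differ(os.getuid(), os.geteuid())

print("real UID:", os.getuid(), "effective UID:", os.geteuid())
print("running with a different effective UID:", current_process_is_setuid_like())
```

Code that may run this way typically treats its environment and inherited file descriptors as untrusted when the two IDs differ, which is the kind of special handling the comment refers to.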

Note that even today, when the suid bit is actually available, many programs do not use it and use a centralized-privileged-daemon scheme instead (things like apache, ftp, mysql and countless other daemons). Strange, isn't it?

Sorry, but the setuid bit is obviously a mistake. It's not easy to replace the setuid bit with a DBUS interface today and perhaps it's not even worth trying (transition pain can easily outweigh any potential gain), but the design itself is obviously too complex and too fragile. That's not even worth discussing.

Another daemon for managing control groups

Posted Dec 10, 2013 18:10 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (11 responses)

>Really? What's so complicated in the interface which is supposed to just start the program? It just needs to check credentials and do that. It always starts applications in a pre-determined environment with known starting conditions.

Because it will have ALL the faults of suid and lots of additional faults of a half-baked userspace implementation. For example, think about signals (especially RT signals and SIGSTOP/SIGKILL). I can kill my SUID program using a straightforward "kill" utility, how would you do this with SuidD?

I'm actually speaking from experience - we have such a daemon in our system. It's simply not possible to replicate all the kernel-level functionality.

SystemD is repeating ALL the problems of this approach. For example, they have to cobble something together to handle delegation to containers while simple bind-mount is enough right now to nest cgroups.

Another daemon for managing control groups

Posted Dec 10, 2013 21:46 UTC (Tue) by khim (subscriber, #9252) [Link] (10 responses)

Because it will have ALL the faults of suid and lots of additional faults of a half-baked userspace implementation.

Really? Which ones?

For example, think about signals (especially RT signals and SIGSTOP/SIGKILL).

What about signals?

I can kill my SUID program using a straightforward "kill" utility

Yup. And it is a problem security-wise.

how would you do this with SuidD?

Most likely answer: you would not be able to do that. You will probably have some high-level knobs but you will not be able to just send random signals to random privileged programs. And this is “good thing”™.

I'm actually speaking from experience - we have such a daemon in our system. It's simply not possible to replicate all the kernel-level functionality.

Of course not! It would be a pointless exercise to just shuffle functionality around. It's the other way around: suid is a problem because it gives you a huge amount of rope to hang yourself with. SuidD will give you a much, much smaller amount of rope. Yes, this will also mean that some brain-dead designs will become impossible, but this will just mean that you will need to spend a bit more time thinking about the design of your system upfront. What real-world task are you trying to solve with signals? Why do you think it can only be solved by giving some random shell script the right to affect a privileged process on your system?

SystemD is repeating ALL the problems of this approach.

Wow. Thanks for bringing that to my attention. Seriously, no sarcasm. This cgroups ≈ suid analogy really helps to show just why it's a bad idea to give access to just some random user to the capabilities of cgroups… but it still does not explain why only one daemon can ever manipulate cgroups. OK, it needs to be a privileged daemon, but it's still not entirely clear to me just why it must be PID 1.

For example, they have to cobble something together to handle delegation to containers while simple bind-mount is enough right now to nest cgroups.

Well, it was always a good idea to have a daemon which does that, thus I'm not sure why you are just now trying “to cobble something together”. The problematic fact is that all these solutions must be tied somehow to systemd, they can not just exist as yet-another-daemon on the side, but this is not systemd's fault; AFAICS it was imposed by kernel-side changes.

Another daemon for managing control groups

Posted Dec 10, 2013 22:39 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link] (7 responses)

>Most likely answer: you would not be able to do that. You will probably have some high-level knobs but you will not be able to just send random signals to random privileged programs. And this is “good thing”™.

And here's the problem - it's "my way or the highway".

Let's discuss a very simple SUID program - good old 'ping' utility. A user should be able to watch live its output, so some kind of shim utility should be used to transfer standard FDs to the DBUS service. This shim must also be running all the time while the DBUS ping service is running.

So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).

So far so good, until I press Ctrl-Z. Whoops. SIGSTOP can't be captured.

And that's without going into the gory details of controlling terminals, ptys and realtime signals (can you say 'priority inversion'?). It doesn't matter that YOU don't like signals - they are de facto used in the world out there.
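
The point about uncatchable signals is easy to check from userspace. The sketch below (Python, helper name invented for illustration) tries to install a handler for a few signals. One detail worth keeping in mind when reading the example above: the terminal's Ctrl-Z normally delivers SIGTSTP, which a shim could intercept and forward, whereas SIGSTOP and SIGKILL sent explicitly (for example with kill) can never be caught.

```python
import signal

def can_install_handler(signum):
    """Return True if this process is allowed to install a handler for signum."""
    try:
        previous = signal.signal(signum, lambda received, frame: None)
    except (OSError, ValueError):
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    # Put back whatever was there before; fall back to the default action.
    signal.signal(signum, previous if previous is not None else signal.SIG_DFL)
    return True

for name in ("SIGINT", "SIGTSTP", "SIGSTOP", "SIGKILL"):
    print(name, "catchable:", can_install_handler(getattr(signal, name)))
```

So a forwarding shim can relay Ctrl-C and Ctrl-Z, but it has no way to notice that it was itself stopped or killed outright, which is the gap the comment is pointing at.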

But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want. Can you tell me how DBUS services are secured? How can I audit this security? Can I write an AppArmor policy to restrict '/usr/bin/firefox' access to '*cgroup*'?

Oh, and we have this nice Criu project - but it won't be able to checkpoint the DBUS-based service (it can't checkpoint only one end of a Unix socket).

And we can leave out minor details like confusing 'ps' output.

In the end, the DBUS-based solution is going to be an inferior and unreliable construct. And that's exactly what is happening with SystemD and cgroups right now. They are building an inferior wrapper on top of a kernel interface, that's in itself WORSE than the status quo.

>Wow. Thanks for bringing that to my attention. Seriously, no sarcasm. This cgroups ≈ suid analogy really helps to show just why it's a bad idea to give access to just some random user to the capabilities of cgroups…
Yes, probably there are several tight spots in the cgroups API that might give users too many capabilities to harm the system. But so do /sys, /proc and namespaces - yet all of them are accessible to users.

Another daemon for managing control groups

Posted Dec 11, 2013 0:35 UTC (Wed) by khim (subscriber, #9252) [Link] (3 responses)

And that's without going into the gory details of controlling terminals, ptys and realtime signals (can you say 'priority inversion'?). It doesn't matter that YOU don't like signals - they are de facto used in the world out there.

It's not about signals. It's about system design. Any time indirection goes from an unprivileged process to a privileged one it must be accounted for and controlled. It's really hard to do with the setuid approach and most programs don't bother to do that.

Let's discuss a very simple SUID program - good old 'ping' utility.

Sure, let's do that. Consider the fact that said utility plays with very low-level stuff and can easily hurt not just your system but also neighboring systems. Let's see if we can actually do that:
$ ping -f www.google.com
PING www.google.com (74.125.143.106) 56(84) bytes of data.
ping: cannot flood; minimal interval, allowed for user, is 200ms

Wow! Lookie: there is a protection! But does it actually work? Of course not: you can still run 1000 pings in parallel and this will have basically the same effect.

In most cases what you need is something similar to tcptraceroute -f 30 -q 10 (which works without any special permissions), anyway.

So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).
So far so good, until I press Ctrl-Z. Whoops. SIGSTOP can't be captured.
Which is good because it's NOT a good idea to do something to a highly privileged process behind its back. The actual privileged ping process may notice that the shim is no longer responding and will probably stop doing its work. That's fine, I don't see anything wrong with that.

Basically you are explaining why the current [broken] interface is hard to replicate with a SuidD daemon. That's fine, I agree with you: it's really hard to replace it with anything sane and perhaps it's not even a good idea to try to do that right now. It still does not mean that it was a good idea to build it in this form initially.

But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.

How is it different from /proc or /sys access?

Yes, probably there are several tight spots in the cgroups API that might give users too many capabilities to harm the system. But so do /sys, /proc and namespaces - yet all of them are accessible to users.

And that's why we must assume that any process started under any user, with full access to all syscalls and /proc and /sys, has root access more or less automatically. It's basically impossible to make the Linux kernel secure because its attack surface is so vast.

Looks like people are really starting to think about it, but it's hard to change everything at once, thus they are starting from the most recent piece of the puzzle (which can be changed without affecting too many users yet). I'm just not sure what they are planning to do after that: sure, they will secure one tiny piece of the whole, but how exactly will it help if everything else remains in the same hodge-podge-with-bazillion-security-holes state?

Another daemon for managing control groups

Posted Dec 11, 2013 0:51 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

>It's not about signals. It's about system design. Any time indirection goes from an unprivileged process to a privileged one it must be accounted for and controlled.
Welcome to the Linux audit framework (that nobody uses, but it's there).

>Sure, let's do that. Consider the fact that said utility plays with very low-level stuff and can easily hurt not just your system but also neighboring systems.
Irrelevant. This particular warning is obsolete, anyway. I can just as well flood the network with UDP datagrams.

>Which is good because it's NOT a good idea to do something to a highly privileged process behind its back.
Nope. It's a good idea, because SUID processes are specifically meant to interact with users. And SIGSTOP is one of the well-known ways to interact.

>The actual privileged ping process may notice that the shim is no longer responding and will probably stop doing its work. That's fine, I don't see anything wrong with that.
So there should be a heartbeat service? What about power consumption (all those spurious wakeups)?

You're digging the hole even deeper.

>> But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.
>How is it different from /proc or /sys access?
That's it - it's not different at all. Except that I have easy to use tools to restrict access to /sys and /proc - AppArmor or SELinux (for masochists). I'm not aware of similar infrastructure for DBUS.

>And that's why we must assume that any process started under any user, yet with full access to all syscalls and /proc and /sys, has root access more or less automatically. It's basically impossible to make the Linux kernel secure because its attack surface is so vast.
Container people managed to fix this. It's possible to start a namespaced container with its own view of /proc and /sys with full root access in it and it will be reasonably secure.

And puzzle comparison is apt - for many years full container support was known as 'containers puzzle' (just search LWN). Many people diligently chipped away all the pieces to make full isolation possible. And it's finally there.

Except now cgroups developers say: "It's too complicated for us, we'll just throw in the towel and make it impossible even if it works right now for many users. For their own good."

Another daemon for managing control groups

Posted Dec 11, 2013 10:33 UTC (Wed) by zdzichu (subscriber, #17118) [Link] (1 responses)

>> But let's go on. Suppose that we have a browser running in a sandbox. Should it be able to access DBUS? Likely. But I definitely don't want it to access the SUID-runner service, while my beloved Tilda should be able to start whatever processes I want.
>How is it different from /proc or /sys access?
That's it - it's not different at all. Except that I have easy to use tools to restrict access to /sys and /proc - AppArmor or SELinux (for masochists). I'm not aware of similar infrastructure for DBUS.

It's built-in in D-Bus. See http://dbus.freedesktop.org/doc/dbus-daemon.1.html (search for policy) or content of /etc/dbus-1/ directory.
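For readers who have not looked at those files, here is a rough sketch of what such a policy looks like, parsed with Python's standard XML library. The service name `org.example.SuidRunner` and the user `alice` are invented for the example; the `<policy>`, `<allow>` and `<deny>` elements and the `send_destination` attribute are the ones described in the dbus-daemon manual page. The helper `list_rules` is hypothetical and only lists rules; it does not evaluate them the way dbus-daemon would.

```python
import xml.etree.ElementTree as ET

# Hypothetical policy fragment: service and user names are made up.
POLICY = """
<busconfig>
  <policy context="default">
    <deny send_destination="org.example.SuidRunner"/>
  </policy>
  <policy user="alice">
    <allow send_destination="org.example.SuidRunner"/>
  </policy>
</busconfig>
"""


def list_rules(xml_text):
    """Return (selector, verdict, destination) for every allow/deny rule."""
    rules = []
    for policy in ET.fromstring(xml_text).iter("policy"):
        # A policy block is selected by context, user or group.
        selector = ", ".join(f"{k}={v}" for k, v in sorted(policy.attrib.items()))
        for rule in policy:
            if rule.tag in ("allow", "deny"):
                rules.append((selector, rule.tag, rule.get("send_destination")))
    return rules


if __name__ == "__main__":
    for selector, verdict, dest in list_rules(POLICY):
        print(f"[{selector}] {verdict} send to {dest}")
```

Note that each `<policy>` block is selected by user, group or context rather than by the executable making the call.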

BTW. the proper spelling is "systemd" (no arbitrary uppercase letters).

Another daemon for managing control groups

Posted Dec 11, 2013 19:38 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Nope. Can't use this policy to limit access to certain _processes_.

So we have two "solutions" already: polkit and DBUS policies. Which one is it?

Another daemon for managing control groups

Posted Dec 11, 2013 3:41 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

> Can I write an AppArmor policy to restrict '/usr/bin/firefox' access to '*cgroup*'?

There's polkit (which is used to restrict access to APIs such as sleep/hibernate/shutdown) which can be used. I don't know where it acts though (whether at the dbus-daemon level or in the receiver, which would make *another* call out to polkit asking for permission).

Another daemon for managing control groups

Posted Dec 14, 2013 12:48 UTC (Sat) by kleptog (subscriber, #1183) [Link] (1 responses)

Is it just me, or is this example flawed:
Let's discuss a very simple SUID program - good old 'ping' utility. A user should be able to watch live its output, so some kind of shim utility should be used to transfer standard FDs to the DBUS service. This shim must also be running all the time while the DBUS ping service is running.

So far so good. But now I want to stop the service, so I press Ctrl-C. And nothing happens, unless the shim captures this signal and somehow communicates it to the DBUS service (oh, and don't forget to authenticate the transmission).

Pressing Ctrl-C is different from sending a SIGINT. Namely, when you press Ctrl-C, the kernel sends a SIGINT to anything using that terminal. I imagine the shim would pass through all necessary file descriptors and hence CTRL-C will work fine.
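The terminal half of this can be tried out with a pseudo-terminal. The sketch below is only an illustration under stated assumptions: it relies on a Linux-style pty whose default settings have ISIG enabled and Ctrl-C (byte 0x03) as the interrupt character, and the function names are made up. The parent never calls kill(); it just "types" Ctrl-C into the terminal, and the line discipline delivers SIGINT to the foreground process group of that terminal.

```python
import os
import pty
import select
import signal
import time


def read_until(fd, marker, timeout=5.0):
    """Read from fd until marker shows up, the timeout expires, or EOF."""
    data = b""
    deadline = time.monotonic() + timeout
    while marker not in data:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            break
        try:
            chunk = os.read(fd, 1024)
        except OSError:  # EIO once the child has closed its side
            break
        if not chunk:
            break
        data += chunk
    return data


def ctrl_c_reaches_child():
    """Type Ctrl-C into a pty and report whether the child got SIGINT."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: new session, the pty is its controlling terminal.
        def on_sigint(signum, frame):
            os.write(1, b"SIGINT\n")
            os._exit(0)

        signal.signal(signal.SIGINT, on_sigint)
        os.write(1, b"ready\n")
        time.sleep(10)
        os._exit(1)
    try:
        read_until(master_fd, b"ready")   # wait until the handler is installed
        os.write(master_fd, b"\x03")      # "press" Ctrl-C; no kill() involved
        output = read_until(master_fd, b"SIGINT")
    finally:
        os.close(master_fd)
        os.waitpid(pid, 0)
    return b"SIGINT" in output


if __name__ == "__main__":
    print("child saw SIGINT:", ctrl_c_reaches_child())
```

A process that is not in that terminal's foreground process group, such as a helper started on the other side of a bus, would not be signalled this way.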

Would it be weird if you were allowed to Ctrl-C a process, but not be allowed to send it a signal from another terminal?

(Hmm, ping drops back to the normal user after opening the socket. Does that mean another process could ptrace it and get access to the socket that way? ptrace blocks it now, but it is something to consider.)

Another daemon for managing control groups

Posted Dec 14, 2013 18:47 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

The shim is using the controlling terminal so it'll get a signal. Not the privileged binary. I checked.

Anyway, you'll still have the problem with SIGSTOP.
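The reason SIGSTOP is the awkward case is that, unlike SIGINT, it cannot be caught, blocked or ignored, so a shim has no handler in which to notice it and pass it on. A minimal Python check of that rule (the helper name `can_install_handler` is made up for the example):

```python
import signal


def can_install_handler(signum):
    """Return True if this process may install its own handler for signum.

    Must be called from the main thread, as with any signal.signal() use.
    """
    try:
        previous = signal.signal(signum, lambda s, f: None)
    except OSError:
        # The kernel refuses handlers for SIGKILL and SIGSTOP.
        return False
    if previous is not None:
        signal.signal(signum, previous)  # put the old handler back
    return True


if __name__ == "__main__":
    for name in ("SIGINT", "SIGTERM", "SIGSTOP", "SIGKILL"):
        print(name, can_install_handler(getattr(signal, name)))
```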

Another daemon for managing control groups

Posted Dec 11, 2013 3:36 UTC (Wed) by mathstuf (subscriber, #69389) [Link] (1 responses)

> but it's still not entirely clear to me just why it must be PID 1.

It doesn't *have* to be, just that with systemd, PID 1 is the simplest place to put it since it already dances around with cgroups pretty heavily and IPC to another service to manage cgroups makes that service a Special Snowflake that can't be set up using cgroups (since it would have to answer questions about how to start it before it starts).

Another daemon for managing control groups

Posted Dec 11, 2013 16:15 UTC (Wed) by khim (subscriber, #9252) [Link]

Well, that's the problem: PID 1 needs cgroups to manage services, and the container manager (presumably well-tested and quite privileged) needs them too. Why can't they both use cgroups? Why must one be a client of the other? That requirement was never properly explained.

Another daemon for managing control groups

Posted Dec 11, 2013 22:08 UTC (Wed) by nix (subscriber, #2304) [Link]

See Ian Jackson's 'userv'. I wish it had got more traction than the "none" it did...

Another daemon for managing control groups

Posted Dec 13, 2013 10:07 UTC (Fri) by gioele (subscriber, #61675) [Link]

> If Lennart were designing Linux security then he'd rip out SUID bits and create a SuidD that would provide DBUS-based services to start SUID processes.

That already exists and is widely used in RedHat and Debian systems: Polkit. https://en.wikipedia.org/wiki/Polkit

Another daemon for managing control groups

Posted Dec 10, 2013 17:59 UTC (Tue) by jubal (subscriber, #67202) [Link] (4 responses)

You might want to read the whole of my comment (yes, it contains two rather long quotes) and judge for yourself.

The italicised note was just a commentary, not a point in itself. The quotes were the point.

Another daemon for managing control groups

Posted Dec 10, 2013 20:58 UTC (Tue) by rgmoore (✭ supporter ✭, #75) [Link] (3 responses)

I did read your comment, including the long quotes. I think Lennart is making a cogent point about why it makes sense to roll cgroup control into systemd. He might have been more polite about it, but his technical point seems sound. You have not yet addressed that technical point.

Another daemon for managing control groups

Posted Dec 11, 2013 13:27 UTC (Wed) by jubal (subscriber, #67202) [Link] (2 responses)

I don't see any technical point in telling other customers of the cgroups subsystem (and Google is probably the largest one at all, and will be in the foreseeable future) that their only options are to start using systemd or match its behaviour to the iota, just because it's more convenient for the authors of the systemd environment.

Another daemon for managing control groups

Posted Dec 11, 2013 15:23 UTC (Wed) by raven667 (guest, #5198) [Link] (1 responses)

What do you mean "match its behavior"? Do you mean having a system with a single writer, or do you mean its D-Bus API? If you aren't using systemd as PID 1 and you don't need your userspace cgroup management client to be portable to other people who do run systemd, then what systemd does or doesn't do is of no consequence to you, right? If anything, you need to work with the kernel developers to make sure you understand their concerns and they understand your use cases. As has been pointed out many times, the single userspace cgroups manager was an idea the kernel team had in order to remove cgroupfs as an attack surface for untrusted customer containers. They only wanted cgroupfs to provide the mechanism for changing settings, without also complicating the internals by encoding the security policy in the kernel.

Another daemon for managing control groups

Posted Dec 11, 2013 16:10 UTC (Wed) by jubal (subscriber, #67202) [Link]

That's grand and I don't think we're really in disagreement.

You may note I was referring to an old e-mail exchange (June, I believe), where Tim Hockin proposed to use a low-level library to provide basic cgroup management functions, a library that would be then used by both systemd and any non-systemd cgroup manager.

This approach was, as you may see, rejected by Mr. Poettering et co. as not viable and possibly non-constructive.

Another daemon for managing control groups

Posted Dec 8, 2013 20:36 UTC (Sun) by cas (guest, #52554) [Link] (19 responses)

> systemd is not an text editor, interactive MUA [...]
> Neither is it a web server, MTA, news server, IRC daemon, or any of
> a myriad of other things

not yet, give it time, be patient. you can't expect it to have *everything* in only a few years.

but it's well on the way, it's already assimilated logging, and there is no credible argument at all for that to be part of init or PID1.

Another daemon for managing control groups

Posted Dec 8, 2013 20:50 UTC (Sun) by mpr22 (subscriber, #60784) [Link]

And funnily enough, the systemd-journald executable does not run as PID 1 and is not a symlink to the systemd executable.

Another daemon for managing control groups

Posted Dec 8, 2013 21:25 UTC (Sun) by anselm (subscriber, #2796) [Link] (17 responses)

Systemd's logging functionality is neither part of init nor PID1. The journald process is separate from the init process.

What systemd's init process does as far as logging is concerned is arrange for output from a service process's stderr channel to go to journald (if the service process doesn't talk to journald directly either via the syslogd interface or else journald's native C library interface). This is (a) very convenient for the authors of service processes, and (b) hardly a difficult or problematic thing to do in principle. Thus it is probably worth the (little) additional complexity in PID 1.
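The general idea of a supervisor owning a child's stderr and labelling what arrives there can be sketched in a few lines of Python. This is not systemd's actual mechanism, only an illustration: `run_and_capture_stderr` and the unit name `demo.service` are made up, and a real supervisor would stream lines as they arrive instead of waiting for the child to exit.

```python
import subprocess
import sys


def run_and_capture_stderr(unit_name, argv):
    """Run argv and return its stderr lines, each tagged with unit_name."""
    # The child just writes to fd 2; it needs no logging code of its own.
    proc = subprocess.Popen(argv, stderr=subprocess.PIPE, text=True)
    _, err = proc.communicate()
    return [f"{unit_name}: {line}" for line in err.splitlines()]


if __name__ == "__main__":
    child = [sys.executable, "-c",
             "import sys; print('disk almost full', file=sys.stderr)"]
    for entry in run_and_capture_stderr("demo.service", child):
        print(entry)
```

The convenience for service authors is that plain writes to stderr end up attributed to the right service without any extra work in the service itself.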

Another daemon for managing control groups

Posted Dec 10, 2013 2:13 UTC (Tue) by cas (guest, #52554) [Link] (16 responses)

you say that as if it makes any practical difference at all. it doesn't.

can i run systemd as an init system *without* journald? no. systemd requires journald. it does not matter whether it forks a separate process or not.

and it does not help at all that systemd pushers say "just run *another* log daemon *as well as* (NOT instead of) journald if you don't like it". here's a free clue: someone saying "i don't want to run journald" is NOT saying "i want to run journald PLUS something else".

at best, if i play with the config, i can tell it not to store any log data on disk - but it's still running.

ditto with logind - WTF is the init system messing with logins? this is not something that an init system needs to do. it's just more systemd borg assimilation of everything.

and it's already got a half-arsed systemd/cron - systemd is the borg, it will assimilate everything.

so yeah, i fully expect systemd to have a mandatory web server within a few years (and if you're some kind of idiot that doesn't like it, just run something else on a high port and configure systemd-web to port-forward). and an irc server or text editor and whatever else lennartix feels like borging.

systemd probably won't have a C compiler because Pottering will no doubt invent his own super special language that is so much better than every other language ever that he'll just have to force everyone else to use it by making it mandatory for systemd. after all, he knows better than everyone else, so it's only fair that he gets to make that choice.

Another daemon for managing control groups

Posted Dec 10, 2013 2:33 UTC (Tue) by HelloWorld (guest, #56129) [Link]

> and it does not help at all that systemd pushers say "just run *another* log daemon *as well as* (NOT instead of) journald if you don't like it". here's a free clue: someone saying "i don't want to run journald" is NOT saying "i want to run journald PLUS something else".
And here's another free clue: journald can be considered an implementation detail as long as /var/log/journal doesn't exist. And complaining about implementation details usually doesn't make you look smart.

> ditto with logind - WTF is the init system messing with logins? this is not something that an init system needs to do. it's just more systemd borg assimilation of everything.
logind is completely optional.

The rest of your comment is so utterly stupid trolling that it doesn't deserve an answer.

Another daemon for managing control groups

Posted Dec 10, 2013 2:34 UTC (Tue) by dashesy (guest, #74652) [Link] (11 responses)

someone saying "i don't want to run journald"
Is analogous to "i don't want to eat my apple". It is like whining about kernel threads running without being manually spawned, even though they do not hurt performance or anything else. Only someone with OCD would care to control every single process running on a system.
Try it, it is one of the great things that happened to Linux and it is good for you.

Another daemon for managing control groups

Posted Dec 10, 2013 2:47 UTC (Tue) by cas (guest, #52554) [Link] (2 responses)

and here, along with the "response" by HelloWorld is a major part of the problem with systemd advocates.

*EVERY* *SINGLE* *OBJECTION* *TO* *ANYTHING* *ABOUT* *SYSTEMD* is dismissed with exactly the same unjustifiably arrogant 'we know better than you so just shut the fuck up and get with the program' response.

oddly enough, this is not likely to inspire any kind of trust or confidence.

worse, you lie. you say "oh, that's optional". except that it isn't optional. you can't disable journald. and if you try to disable logind or any of the other parts of systemd, you will break something - with a good chance of fucking up systemd so completely that it will fail to boot.

that does not meet the definition of "optional".

Another daemon for managing control groups

Posted Dec 10, 2013 3:10 UTC (Tue) by pizza (subscriber, #46) [Link]

> oddly enough, this is not likely to inspire any kind of trust or confidence.

Oddly enough, neither is raising objections that are plainly not supported by facts and have been debunked ad nauseam.

Another daemon for managing control groups

Posted Jan 3, 2014 20:41 UTC (Fri) by rodgerd (guest, #58896) [Link]

I think before you go off on another sweary rant about the quality of discussion you might want to consider raising your own.

Another daemon for managing control groups

Posted Dec 10, 2013 7:33 UTC (Tue) by anselm (subscriber, #2796) [Link] (7 responses)

Try it, it is one of the great things that happened to Linux and it is good for you.

Doesn't matter. What matters is that it (a) was developed by Lennart Poettering, and (b) is not like syslogd, so must be bad – who cares if it comes with all sorts of compatibility features?

It's not as if these people could come up with a technical argument why systemd's journal (or indeed systemd in general) isn't a reasonable idea overall if their life depended on it. Bashing the very concept on general principles is their thing, and it is probably best to ignore them until they manage to find something to say that has actual substance.

Another daemon for managing control groups

Posted Dec 10, 2013 7:44 UTC (Tue) by Cyberax (✭ supporter ✭, #52523) [Link]

There is an argument to be made about too tight dependency on journald.

Systemd overall is a great idea, devil is in the details. As usual. And one of these small details is boneheaded attitude towards cgroups sharing with other actors.

Another daemon for managing control groups

Posted Jan 3, 2014 20:43 UTC (Fri) by rodgerd (guest, #58896) [Link] (5 responses)

There's a reasonable objection to be made about binary logging formats - system recovery and general analysis can be vastly more painful with such. It's a toss-up, because in many day-to-day cases journald's filtering is a win. But it's hardly an invalid complaint, unless you're considering "how we use logs to do our job" to be a non-technical complaint, which would seem to be stretching.

Another daemon for managing control groups

Posted Jan 3, 2014 20:50 UTC (Fri) by raven667 (guest, #5198) [Link] (2 responses)

This is a reasonable place to do a risk analysis: how difficult is it to run the tool that can handle the file format of journald on a system that may not be working 100%? How often are the local logs not going to be sent via syslog to a central location? I don't think the designers would have done it this way unless they were convinced this was still a win, but it's a discussion that can be had.

Another daemon for managing control groups

Posted Jan 3, 2014 21:53 UTC (Fri) by dlang (guest, #313) [Link] (1 responses)

rsyslog can get the logs from journald very rapidly, pretty close to real-time.

the designers of journald didn't have any idea what rsyslog, syslog-ng and similar logging daemons were able to do at the time they started journald, their justification of journald is full of "syslog can't do this" arguments that were true of traditional syslog, but not true on any of the modern syslog daemons.

journald works pretty well for a single machine personal system, but it was built in ignorance of how logging works in larger environments.

Another daemon for managing control groups

Posted Jan 3, 2014 22:06 UTC (Fri) by raven667 (guest, #5198) [Link]

> journald works pretty well for a single machine personal system, but it was built in ignorance of how logging works in larger environments.

Maybe you are right, but it seems that when the systemd developers do something they do a ton of research beforehand, before committing resources. In any event it is pretty plainly stated that the journal isn't even trying to deal with networked logging or larger environments; its main use case is the single personal machine, and capturing logs from early boot which are normally lost. You don't judge a fish by how well it can climb trees.

Another daemon for managing control groups

Posted Jan 3, 2014 21:50 UTC (Fri) by dlang (guest, #313) [Link] (1 responses)

I would believe this more if there hadn't been a bug that prevented you from accessing the journald binary logs and went undetected for several months, until it made it into a Fedora release that included rsyslog reading the files and going into an endless loop doing so (the journald tools also went into an endless loop, so you can't just call this an rsyslog bug).

This tells me that the people working on this are not actually using these tools enough.

Another daemon for managing control groups

Posted Jan 3, 2014 21:54 UTC (Fri) by mchapman (subscriber, #66589) [Link]

> This tells me that the people working on this are not actually using these tools enough.

That is as ludicrous as saying any long-standing bug in the kernel shows that the kernel developers don't use Linux enough.

Bugs happen. People fix them (or at least, *should* fix them) when they discover them, not before.

Another daemon for managing control groups

Posted Dec 10, 2013 13:35 UTC (Tue) by mathstuf (subscriber, #69389) [Link] (2 responses)

I think you're missing some things. Whether willfully or because you inspire knee-jerk reactions which snowball away from the original assertions or whatever, I'll avoid passing judgement for now, but we'll see how your reply goes.

First, yes, systemd provides a PID 1 binary which acts as init. It's not like there aren't other alternatives to sysv init out there (OpenRC, upstart, SMF, and more), so this is nothing particularly new.

Second, systemd has a goal of bringing a single *interface* (if you want to replace components, feel free[1]) to managing a Linux *system*. They also bring an implementation of the interface, but that's easily understood as using an interface before freezing it is rarely a bad thing. Managing a system includes a lot of things these days, like it or not. From hardware (udev) to users (multiseat is supported only in logind/systemd AFAIK and requires hardware awareness), these things all need to be coordinated by *something*. So where do you propose that be done? Systemd developers have laid out interfaces for you to adhere to if you want to replace individual components at will. They choose to do it in (or near, more often) PID 1. You would probably want it to be some service, but then you have to tie it with PID 1 anyways for dealing with services which trigger on hardware or network events anyways. Not to mention some FD passing to PID 1 to pass to the service. That's a lot of code that can be avoided if your aim is to minimize what PID 1 does (and still meet the use cases).

Third, for the (limited) cron features, PID 1 knows these things already (when a service started, when it last finished, etc.) and knows them best. You're introducing more code by writing a daemon which does the same thing, because now you have to reach out from it to watch services dying and showing up as well. Either way is more code, but now I can say "check email every 5 minutes" (as a user-level .timer unit) based on when I log in, not the clock values. What would a crontab entry which does that look like?
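To make the difference between the two kinds of schedule concrete, here is a small Python sketch. It only computes firing times and does not use cron or systemd; both function names are invented for the illustration. A `*/5 * * * *` crontab entry fires on wall-clock multiples of five minutes, while a schedule relative to a start event (what systemd timer units express with monotonic settings such as `OnStartupSec=` and `OnUnitActiveSec=`) fires at offsets from that event.

```python
from datetime import datetime, timedelta


def cron_every_5_min(after, count):
    """Firing times of '*/5 * * * *': the next wall-clock multiples of 5 minutes."""
    t = after.replace(second=0, microsecond=0)
    t += timedelta(minutes=5 - t.minute % 5)  # next multiple strictly after 'after'
    return [t + timedelta(minutes=5 * i) for i in range(count)]


def relative_every_5_min(start, count):
    """Firing times measured from a start event such as a login."""
    return [start + timedelta(minutes=5 * (i + 1)) for i in range(count)]


if __name__ == "__main__":
    login = datetime(2013, 12, 10, 10, 2, 30)
    print("cron:    ", [t.time().isoformat() for t in cron_every_5_min(login, 3)])
    print("relative:", [t.time().isoformat() for t in relative_every_5_min(login, 3)])
```

With a login at 10:02:30 the clock-based schedule starts at 10:05:00, while the relative one starts at 10:07:30 and keeps that offset.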

Last, you do like strawmen (e.g., I see no indication of systemd gaining an IRC server, but a link could help here) and name calling (Borg). If you could keep your arguments technical in nature, maybe you'd have a clearer discussion with others about your issues.

What it sounds like you want to do is remove components entirely, ignoring the use cases they cover because you don't care for them or can't see why they might be important. Try reading the rationales behind the components. Lennart is very clear when it comes to that. As an example, journald is there because syslog *is not available* at early boot. Would you rather PID 1 to start caching log messages (meaning more code), then swamp syslog as soon as it gets to that point? Or stick with the "no logs before syslog" behavior of today (or maybe that should be yesterday; systemd solved this over a year ago)? I guess you could delay everything until after syslog starts, but then you are sticking a delay in where nothing can be parallelized.

[1]Go ahead and implement the journald interface which just forwards stuff to syslog if you want. The interface has to be there, not systemd's implementation. But of course, journald already does that and by default usually I might add (journald persistence wasn't default in Fedora for a while, but probably will if/when rsyslog is kicked out of @base).

Another daemon for managing control groups

Posted Dec 11, 2013 22:18 UTC (Wed) by nix (subscriber, #2304) [Link] (1 responses)

Would you rather PID 1 to start caching log messages (meaning more code), then swamp syslog as soon as it gets to that point?
If your syslogd cannot cope with a splurge of on-boot-time log messages, your syslogd is unbelievably awful (and would never be used).

Straw man.

Another daemon for managing control groups

Posted Dec 11, 2013 22:59 UTC (Wed) by mathstuf (subscriber, #69389) [Link]

I'll grant that a decent syslog should be able to handle the initial logs, but that still means PID 1 is caching things.
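The caching in question does not have to be elaborate. As a sketch of the idea only (the class `EarlyLogBuffer` and its bound of 1000 lines are assumptions for the example, not anything taken from systemd or a syslog daemon), a bounded buffer that holds messages until the real logger shows up could look like this in Python:

```python
from collections import deque


class EarlyLogBuffer:
    """Hold log lines until a sink is attached, then forward everything."""

    def __init__(self, max_lines=1000):
        # Bounded: once full, the oldest lines are dropped.
        self._pending = deque(maxlen=max_lines)
        self._sink = None

    def log(self, line):
        if self._sink is None:
            self._pending.append(line)   # logger not up yet: cache
        else:
            self._sink(line)             # logger is up: pass straight through

    def attach(self, sink):
        """Call once the real logger is running; flushes the backlog in order."""
        self._sink = sink
        while self._pending:
            sink(self._pending.popleft())


if __name__ == "__main__":
    buf = EarlyLogBuffer()
    buf.log("early boot message")
    buf.attach(print)
    buf.log("message after the logger started")
```

Whichever process holds that buffer is the one doing the caching, which is the cost being weighed here.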


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds