
All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Linux.com has a high-level look at control groups (cgroups), focusing on the problems with the current implementation and the plans to fix them going forward. It also looks at what the systemd project is doing to support a single, unified controller hierarchy, rather than the multiple hierarchies that exist today. "'This is partly because cgroup tends to add complexity and overhead to the existing subsystems and building and bolting something on the side is often the path of the least resistance,' said Tejun Heo, Linux kernel cgroup subsystem maintainer. 'Combined with the fact that cgroup has been exploring new areas without firm established examples to follow, this led to some questionable design choices and relatively high level of inconsistency.'"

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 15, 2013 18:53 UTC (Thu) by vrfy (subscriber, #13362) [Link]

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 15, 2013 21:08 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

Yeah, and Tejun Heo with Lennart Poettering, in the best traditions of cgroups development, went on and made an uber-idiotic decision to stop using the traditional Linux API and instead create an ugly monster API based on SystemD.

Do they even *THINK* about what they're doing?

If they replace the filesystem-based API then we lose:
1) Access control.
2) Ability to delegate subtrees to unprivileged users of VMs.
3) Familiar API.
4) Ease of system audit and setup.
5) Namespaces? Never heard of them!

And no, whines about "but cgroups can't really delegate stuff now" go to /dev/null - I can delegate memory and cpuacct cgroups just fine. So go on and fix the rest of cgroups to be delegatable.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 15, 2013 21:19 UTC (Thu) by deepfire (subscriber, #26138) [Link]

The boldness of our dear winners is going to write history, again. Minor details like these will be forgotten.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 15, 2013 21:47 UTC (Thu) by ovitters (subscriber, #27950) [Link]

LWN could do without "uber-idiotic", "traditional Linux API", "ugly monster", "whines" and incorrect spellings like SystemD (it is systemd). Thanks in advance.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 15, 2013 22:18 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

It's a little bit trollish, sure. But nevertheless, it's true.

Right now cgroups is a traditional filesystem-based API which can be manipulated by pretty much anything. It supports the usual delegation: just set the appropriate permissions on the filesystem and bind-mount it in a namespace.
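As a rough illustration of the permission-based delegation described above, here is a minimal Python sketch. The helper name `delegate_subtree` and the choice of which knob files to hand over are assumptions made for this example, not an established tool; the bind-mount step is left out because it needs root and a real mount namespace.

```python
import os
from pathlib import Path


def delegate_subtree(group_dir, uid, gid, files=("tasks", "cgroup.procs")):
    """Hand a cgroup directory over to uid:gid (hypothetical helper).

    Owning the directory lets the user create and remove child groups;
    owning the listed files lets the user move tasks into the group.
    """
    group = Path(group_dir)
    group.mkdir(parents=True, exist_ok=True)
    os.chown(group, uid, gid)
    changed = [group]
    for name in files:
        knob = group / name
        # On a real cgroup mount the kernel creates these files itself.
        if knob.exists():
            os.chown(knob, uid, gid)
            changed.append(knob)
    return changed
```

On a real system this would be called by root on something like `/sys/fs/cgroup/memory/alice` (an assumed mount point and group name), after which the unprivileged user can manage everything below that directory with ordinary file operations.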

Tejun Heo wants to replace it with a single daemon. No plans for delegation or access control have been presented so far.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 8:21 UTC (Fri) by ovitters (subscriber, #27950) [Link]

I am and was not disagreeing with what you said. I know too little to have a real opinion on this. :P

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 6:15 UTC (Sat) by Rudd-O (subscriber, #61155) [Link]

You're blatantly misinforming people.

Stop.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 15:47 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

No, I'm not.

There's a plan to add a new kernel option ('insane_behavior=1') to cgroups, which will:
1) Join them into a single hierarchy.
2) Disallow changing ownership and access rights.
3) Lock cgroups for modifications, except by one process.

So yes, technically the filesystem interface won't go away. It will just be completely useless.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 14:27 UTC (Fri) by SEJeff (subscriber, #51588) [Link]

Funnily enough, I agree with both of you. Cyberax's points are entirely valid. As someone who likes to use cgroups and their various incarnations to manually carve up systems for maximum performance and utilization, the echo / mkdir / rmdir based interface is a breeze. Rolling this all into systemd is not going to make what I do impossible, but it will make what I currently do quite easily very, very difficult.
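For readers who have not used it, the echo / mkdir / rmdir workflow mentioned above can be sketched in a few lines of Python. The function names are invented for this example; the only thing taken from the existing (v1) interface is that a group is a directory and that writing a PID to its `tasks` file moves that task into the group.

```python
import os
from pathlib import Path


def create_group(controller_root, name):
    """mkdir: creating a directory is what creates the cgroup."""
    group = Path(controller_root) / name
    group.mkdir()
    return group


def move_task(group, pid):
    """echo PID > tasks: writing a PID moves that task into the group."""
    with open(Path(group) / "tasks", "w") as f:
        f.write(f"{pid}\n")


def remove_group(group):
    """rmdir: removing the directory destroys the (empty) cgroup."""
    os.rmdir(group)
```

On a real controller mount, `remove_group` only succeeds once no tasks remain in the group; the kernel-provided control files do not need to be deleted first.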

Myself and many others in the HPC industry are likely not pleased with this decision whatsoever. That being said, I do love systemd overall, just not this specific part of it.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 15:09 UTC (Fri) by ovitters (subscriber, #27950) [Link]

So make noise, but make sure it is the right noise. Saying the decision is terrible for X, Y, Z is ok. But as soon as it includes "OMG Lennart" or anything similar I'd just zone out and not listen anymore. Which is bad if your criticism was valid and important.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 16:13 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

Well, "OMG Lennart" is also a good way to start a flame war which might attract interest to the topic :)

BTW: I really like systemd/journal/etc. overall.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 15:36 UTC (Fri) by raven667 (subscriber, #5198) [Link]

Engaging with the cgroups and systemd maintainers where they congregate so that you understand the problems they are facing and they understand the problems you are solving is going to be the best way to make sure that your use cases are handled in ways that are acceptable to you. The reason that systemd maintainers are at the head of the table is that this software is the most visible and widely deployed user of the cgroups interface and they are engaged with the cgroup maintainers. Anyone who relies on this interface who sits back and waits is going to be disappointed and frustrated if their needs aren't anticipated and met.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 16:12 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

I think people told them multiple times that this is a bad idea, including guys from Google and other Linux maintainers.

Maybe talking with them at the Kernel Summit will help?

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 6:16 UTC (Sat) by Rudd-O (subscriber, #61155) [Link]

Cgroups are not being rolled into systemd. You will still be able to do cd, mkdir, mv and whatever.

Stop misinforming people.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 19:46 UTC (Fri) by hnaz (subscriber, #67104) [Link]

Nobody is replacing the filesystem interface. The problem is just that because it IS a filesystem interface, people added knobs and exported information like there is no tomorrow. We would never have been this reckless with system calls even though both are unchangeable ABI.

While the rework of the cgroups interface is coordinated with systemd people, cgroups won't be based on or depend on systemd.

Cgroups are going to be more consistent across controllers and they are going to assume a single agent setting up and administrating a group (could be systemd, could be your custom management script, could be you from the shell). There is a ton of cruft in the code that bends over backwards to support multiple users accessing the control and info knobs _at the same time_ but nobody is doing that anyway.

You can still delegate if you like, but delegate to a single entity.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 20:09 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link]

> Nobody is replacing the filesystem interface.
They are (by switching to the 'single writer' model). And that's the problem.

> While the rework of the cgroups interface is coordinated with systemd people, cgroups won't be based on or depend on systemd.
Yeah, sure. Right now cgroups do whatever systemd people want, and damn the consequences.

> Cgroups are going to be more consistent across controllers and they are going to assume a single agent setting up and administrating a group (could be systemd, could be your custom management script, could be you from the shell).
Yeah, so there'll be no way for my user processes to set up their own cgroups.

And I actually DO use this. For example, I have a 'deeptime' utility to get the accurate runtime for a process tree which works by creating a cpuacct cgroup.
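The 'deeptime' utility itself is not shown in this thread, so the following is only a guess at how such a tool could work, not its actual implementation. It relies on the v1 `cpuacct.usage` file, which reports the group's accumulated CPU time in nanoseconds; the function names and the `/sys/fs/cgroup/cpuacct` mount point are assumptions.

```python
import os
import subprocess
from pathlib import Path

NS_PER_SEC = 1_000_000_000


def read_cpu_seconds(group_dir):
    """cpuacct.usage holds the group's total CPU time in nanoseconds."""
    raw = (Path(group_dir) / "cpuacct.usage").read_text()
    return int(raw.strip()) / NS_PER_SEC


def measure_tree(cmd, cpuacct_root="/sys/fs/cgroup/cpuacct"):
    """Run cmd in a fresh cpuacct group and return the CPU seconds used
    by the command and all of its descendants (hypothetical helper)."""
    group = Path(cpuacct_root) / f"deeptime-{os.getpid()}"
    group.mkdir()

    def enter_group():
        # Runs in the child before exec; descendants inherit the group.
        (group / "tasks").write_text(str(os.getpid()))

    try:
        subprocess.run(cmd, preexec_fn=enter_group)
        return read_cpu_seconds(group)
    finally:
        # Fails with EBUSY if descendants are still running in the group.
        group.rmdir()
```

Note that this only works if the calling user is allowed to create directories under the cpuacct hierarchy, which is exactly the kind of access being argued about here.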

> There is a ton of cruft in the code that bends over backwards to support multiple users accessing the control and info knobs _at the same time_ but nobody is doing that anyway.
I understand, it's better if nobody (or only systemd, as a concession) has access to cgroups.

Also, somehow, /proc and /sys filesystems work without single entities controlling access to them.

> You can still delegate if you like, but delegate to a single entity.
How? And what if I don't WANT a single entity?

Don't get me wrong, I'm all for general cgroups interface cleanups. There are lots of missing or inconsistent features (there's no way to get notifications when a cgroup becomes empty, for example). The multiple-tree structure also complicates things - I totally understand that. IMO, merging the blkio and memory controllers totally makes sense.

But then we have the freezer controller and cpuacct. I fail to see how merging them into a single controller would help. There are no places in the kernel where it's unclear if a resource should belong to cpuacct/freezer or to some other controller (like in the case of blkio and memcg). So why are they being merged?

For example, right now I'm using cpuacct to measure time spent in various kinds of processes (i.e. 10 cpu-seconds were spent in 'make', 20 cpu-seconds were spent in 'bash', etc). How am I going to do this with single cgroups, assuming that I need to gather statistics across several memcg system partitions?

Then there's freezer controller - we use it to atomically pause a graph of dependent processes (e.g. a Postgres database and its clients) in case of low-memory conditions. How are we going to do this with the single tree model?
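The freezer use case above can be sketched as follows, assuming the v1 freezer interface where writing `FROZEN` or `THAWED` to a group's `freezer.state` file changes the state of every task in it, and where the file may read back `FREEZING` while the transition is in progress. The helper name and the timeout values are made up for this example.

```python
import time
from pathlib import Path


def set_freezer_state(group_dir, state, timeout=5.0, poll=0.05):
    """Request FROZEN or THAWED for a whole group and wait until the
    kernel reports that state. Returns False if the timeout expires."""
    if state not in ("FROZEN", "THAWED"):
        raise ValueError(f"unsupported freezer state: {state}")
    knob = Path(group_dir) / "freezer.state"
    knob.write_text(state)
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # The file can read "FREEZING" while tasks are still being stopped.
        if knob.read_text().strip() == state:
            return True
        time.sleep(poll)
    return False
```

A low-memory handler would call `set_freezer_state(group, "FROZEN")` on the group holding the database and its clients, and `set_freezer_state(group, "THAWED")` once memory pressure has eased; the group path itself is whatever the administrator created for that purpose.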

So yes, I see that the development of cgroups follows the usual fouled-up course of previous cgroups development. Second-system syndrome in full force.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 8:52 UTC (Sat) by jospoortvliet (subscriber, #33164) [Link]

See, now you've done it. By opening with a trollish statement, nobody replies to your actual description of the problems you have - you got yourself ignored.

And as an innocent bystander who's always interested in following debates like these, I'm deprived of a reply just as you are ;-)

Yet I totally understand that you started with that statement - I did the same two days ago and now nobody listens to my legitimate complaints about $SUBJECT anymore :D

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 6:17 UTC (Sat) by Rudd-O (subscriber, #61155) [Link]

Thank you. Cyberax has been persistently spreading disinformation and outright lies about what's going to happen. These people are toxic.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 17, 2013 10:41 UTC (Sat) by jubal (subscriber, #67202) [Link]

You might want to actually read what he wrote before you start to denigrate him; ritually repeating phrases about FUD, misinformation and toxicity does not award you any points (and makes you look like a toxic personality, mind).

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 16, 2013 18:01 UTC (Fri) by landley (guest, #6789) [Link]

I was interested right up until it mentioned systemd.

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 18, 2013 4:18 UTC (Sun) by psiekl (subscriber, #74032) [Link]

> The hierarchical nature of cgroup means users can change permissions on subdirectories and give access to a non-privileged security domain, i.e. non-root users, Heo said. This, in turn, means an individual application can interact directly with the cgroup filesystem and access the kernel control knobs, effectively exposing the raw knobs to the full kernel API without the required review.

The same could be said about /dev - is that next on the chopping block?

All About the Linux Kernel: Cgroup's Redesign (Linux.com)

Posted Aug 18, 2013 14:19 UTC (Sun) by Cyberax (✭ supporter ✭, #52523) [Link]

Just you wait. Very soon /proc, /sys and /debug will be rolled into systemd.

/dev is already rolled in (in form of udev).

containers: lxc and systemd

Posted Aug 21, 2013 22:14 UTC (Wed) by lbt (subscriber, #29672) [Link]

I'm interested in running systemd-based containees (? the thing inside the lightweight container) in non-systemd distros - which essentially means using lxc.

I expect others would like to run non-systemd-based containees inside a systemd system.

As I understand it, this is not going to be possible - which seems to be a rather significant regression in Linux lightweight containers.

Any information on this?

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds