Resource management in KDE

October 19, 2020

This article was contributed by Marta Rybczyńska

Applications that run on the Linux desktop have changed significantly under the hood in recent years; for example, they use more processes than before. Desktop environments need to adapt to this change. During Akademy 2020, KDE developers David Edmundson and Henri Chain delivered a talk (YouTube video) about how KDE, working with other desktop environments, is starting to use advanced kernel features to give users more control over their systems. This talk complements a presentation by GNOME developers that was recently covered here.

Process-management issues in Plasma

Edmundson started by explaining that the job of a desktop environment is to deliver applications to the user. Users "need to be in control", he said. That role has become more complicated in recent years. Some time ago, when a user was running a web browser like Firefox or a chat application like Kopete, the management of running processes was easy. The user could run a ps command and would see just one line of output for each of those applications. This was easy to understand and self-explanatory.

Now, the situation is "very different". When a user opens a Firefox instance they can get a dozen processes; Discord in a Flatpak ("because it is cool now") launches 13 processes. The ps output is unreadable; it consists of "random names doing random things". Just understanding that output is difficult; aggregating the results to get an idea of how much CPU time or power the application is using has become even more challenging. There is thus a need to track processes properly in desktop environments, since the available data no longer means anything. We "need some metadata", Edmundson concluded.

Fairness is also an increasingly important issue. Edmundson gave an example of Krita, an advanced graphics application. It performs some heavy processing, all contained within a single process. On the other hand, Discord has those 13 processes, many of which will be making heavy use of the CPU "because it is written in Electron". The system's CPU scheduler will see those two applications as 14 opaque processes, not knowing what they correspond to. This means that Krita could get only 1/14 of the available CPU time, even though it represents half of the applications running. Metadata about running applications needs to propagate through the whole software stack to be available to the scheduler, he said.
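The arithmetic behind that example can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: every process is busy all the time, and the scheduler treats its peers (processes in one case, applications in the other) as strictly equal. The process list is invented to match the numbers in the talk.

```python
from collections import Counter

# Hypothetical snapshot: one entry per busy process, tagged with its application.
processes = ["krita"] + ["discord"] * 13


def share_per_process(procs):
    """CPU share of each application when every process is an equal peer."""
    counts = Counter(procs)
    total = len(procs)
    return {app: n / total for app, n in counts.items()}


def share_per_application(procs):
    """CPU share when each application (one group each) is an equal peer."""
    apps = set(procs)
    return {app: 1 / len(apps) for app in apps}


per_process = share_per_process(processes)
per_app = share_per_application(processes)
print("krita, per-process scheduling:", per_process["krita"])
print("krita, per-application scheduling:", per_app["krita"])
```

With per-process scheduling Krita ends up with 1/14 of the CPU; once the scheduler knows which processes belong together, it gets one half.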

One of Plasma's tasks is mapping windows to applications. More precisely, it tries to map windows to their associated desktop files, the configuration files containing metadata that are used, for example, to create menu entries. Applications open windows and "we hope we can match it all up". The Plasma developers use a lot of hacks and heuristics to perform this matching, but "we do not like guessing", he said. He gave the example of a Firefox window being used to watch an Akademy talk like his. There is an audio icon inside that window, but this icon is not managed by the same process as the one controlling the outer window, he explained. Plasma needs to find the link between them, and "it is an arbitrary guessing game".

Introducing control groups

Chain came in to explain that "the good news is that this is essentially a solved problem". The solution exists in the form of control groups (or cgroups), a kernel feature available since the 2.6.24 kernel release in January 2008. Control groups partition the process space into a hierarchical structure. Cgroups were originally designed for servers; they are heavily used in containers, he added.

Resource controllers can be attached to control groups; there are three main ones: CPU, memory and I/O (for disk access). Their basic functions are to limit overall resource use, influence the scheduler, and account for resource use at the group level. Another feature of cgroups is a finer control over the out-of-memory policy. The CPU controller influences scheduling through "weights", which can be defined in relation to siblings in the process tree. Groups with a higher weight receive proportionately more CPU time. For the desktop, the weight strategy is more interesting than setting priorities with nice. Among other things, weights can be used to raise a process's priority within a control group as well as to lower it.
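As a rough sketch of what "proportionately more CPU time" means, the following Python function splits CPU time among sibling groups according to their weights. The group names and weight values are invented for illustration; the 1 to 10000 range is the one accepted by the cgroups v2 cpu.weight file, where 100 is the default. The result only describes the case where all siblings are competing for the CPU at once; an idle group leaves its share to the others.

```python
def cpu_shares(weights):
    """Split CPU time among sibling groups in proportion to their weights."""
    for name, weight in weights.items():
        # cgroups v2 accepts cpu.weight values from 1 to 10000 (default 100).
        if not 1 <= weight <= 10000:
            raise ValueError(f"weight for {name!r} is out of range: {weight}")
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical siblings: two groups at the default weight, one at half of it.
siblings = {"editor": 100, "browser": 100, "indexer": 50}
shares = cpu_shares(siblings)
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

Unlike a nice value, a weight only has meaning relative to the siblings at the same level of the tree, which is why raising one group's weight is as easy as lowering another's.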

Cgroups can be controlled directly through the filesystem interface under /sys/fs/cgroup, but in practice there is a need for a central daemon to manage the placement of processes within groups. On Linux systems, systemd has supported that "forever", Chain said, adding that every systemd unit is already placed into its own cgroup. Systemd is capable of delegating management of the cgroup tree to another user-space process; this is done by a separate systemd instance. Systemd also provides a D-Bus interface that can be used from user space, allowing cgroups to be controlled from any application. Configuration is also a solved problem, since settings can be applied at several levels.
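One way to see which cgroup systemd has placed a process in is to read /proc/<pid>/cgroup, where each line has the form "hierarchy-ID:controller-list:path" and the cgroups v2 hierarchy is the entry with ID 0 and an empty controller list. The Python sketch below parses that format; the sample contents, including the unit names in the paths, are invented for illustration.

```python
def unified_cgroup_path(proc_cgroup_text):
    """Return the cgroups v2 path from the contents of /proc/<pid>/cgroup."""
    for line in proc_cgroup_text.splitlines():
        if not line.strip():
            continue
        hierarchy_id, controllers, path = line.split(":", 2)
        # The v2 (unified) hierarchy is listed as "0::<path>".
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


# Hypothetical file contents on a system using only cgroups v2.
sample_unified = "0::/user.slice/user-1000.slice/user@1000.service/app.slice/example.scope\n"

# Hypothetical file contents on a hybrid system: v1 hierarchies plus the v2 line.
sample_hybrid = (
    "2:cpu,cpuacct:/user.slice\n"
    "1:name=systemd:/user.slice/user-1000.slice/session-2.scope\n"
    "0::/user.slice/user-1000.slice/session-2.scope\n"
)

print(unified_cgroup_path(sample_unified))
print(unified_cgroup_path(sample_hybrid))
```

On a Linux system, passing the contents of /proc/self/cgroup to the function shows where the calling process itself lives.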

Cgroups were redesigned in 2015 to solve a number of problems; the resulting API is called cgroups v2. However, most distributions configure systemd to work using a hybrid hierarchy (described in this document) that does not use all of the new features. All modern distributions except Fedora use this model at this time. In consequence, systemd is usually not capable of grouping at the user level.

Chain then explained that even cgroups v2 has some constraints. For example, inner nodes cannot contain processes: if a cgroup has other cgroups as children, processes cannot be attached to that parent cgroup. The reason is to keep accounting simple "for the kernel". Those inner nodes are called slices in systemd.
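The constraint can be modeled with a small tree check. This Python sketch is a simplification of the rule as Chain described it: it flags any group that holds processes and has children at the same time. The kernel's actual rule has exceptions that are not modeled here (the root cgroup, for instance, is exempt), and the class, group names, and PIDs are all invented.

```python
from dataclasses import dataclass, field


@dataclass
class Cgroup:
    name: str
    pids: list = field(default_factory=list)
    children: list = field(default_factory=list)


def violations(group, prefix=""):
    """Return paths of groups that hold processes and also have child groups."""
    path = f"{prefix}/{group.name}"
    found = []
    if group.children and group.pids:
        found.append(path)
    for child in group.children:
        found.extend(violations(child, path))
    return found


# Valid layout: the slice is an inner node, processes live only in the leaves.
good = Cgroup("app.slice", children=[
    Cgroup("editor.scope", pids=[4101]),
    Cgroup("chat.scope", pids=[4200, 4201]),
])

# Invalid layout: a process is attached directly to the inner node.
bad = Cgroup("app.slice", pids=[999], children=[
    Cgroup("editor.scope", pids=[4101]),
])

print(violations(good))
print(violations(bad))
```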

Configurations for cgroups can be prepared in advance. When no configuration is available, there are two ways to leverage systemd for new applications. One uses scopes: the process is launched as usual, and then systemd is asked to put it into a cgroup. A scope thus contains a group of processes that were started independently of systemd. The other uses services: as with typical daemons, systemd launches the application and manages its lifetime.
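Both modes can be tried from a shell with the systemd-run tool, which creates a transient scope when given --scope and a transient service otherwise. The Python sketch below only builds the command lines; it does not run them. It is not how Plasma does it (the talk mentions code in KIO), and the helper name, unit names, and program are hypothetical.

```python
def systemd_run_argv(program_argv, *, as_scope, unit=None, slice_name=None):
    """Build a systemd-run command line for the user's systemd instance."""
    cmd = ["systemd-run", "--user"]
    if as_scope:
        # Scope: systemd-run starts the program itself; systemd only tracks the cgroup.
        cmd.append("--scope")
    if unit:
        cmd.append(f"--unit={unit}")
    if slice_name:
        cmd.append(f"--slice={slice_name}")
    # Without --scope, systemd launches the program as a transient service.
    cmd.extend(program_argv)
    return cmd


scope_cmd = systemd_run_argv(["krita"], as_scope=True,
                             unit="example-krita.scope", slice_name="app.slice")
service_cmd = systemd_run_argv(["krita"], as_scope=False,
                               unit="example-krita.service", slice_name="app.slice")
print(" ".join(scope_cmd))
print(" ".join(service_cmd))
```

The resulting lists could be passed to subprocess.run() on a system with a running systemd user instance.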

Chain showed an example output of the systemd-cgls command, which shows the control-group hierarchy. This output (available in the slides from the talk [PDF]) shows slices and services; the systemd user instance is also visible (as the user@1000.service entry).

Edmundson explained that Plasma 5.19 spawns every application in its own cgroup. This is a tiny part of the code in KIO, he said, but it required a lot of effort across the KDE code base to unify the methods for launching applications. "This is something that became fragmented in the last 20 years", he noted, "so it was worth unifying anyway". The developers found dozens of edge cases that needed to be fixed. Their goal is to do the migration safely.

Introducing cgroups is a sizeable change; the first step is to start using cgroups with scopes with as few changes from the previous setup as possible. Application processes are started, then they are tagged as belonging to a cgroup. Once it is all working, developers will be able to start using more metadata. For now, the scheduler is using it already. Taskmanager patches are in review, and libksysguard can expose this information to the user. Edmundson showed the old ksysguard alongside a rewritten version; they show the same information, but organized differently. In the new version, the main list shows only the application name; the processes list is shown separately. Aggregation is more reliable than before, he noted.
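The aggregation that the new system monitor relies on can be approximated as follows: once every process carries the name of its cgroup, per-application totals are a simple group-by. This Python sketch is not the libksysguard implementation; the sample data, PIDs, and unit names are invented.

```python
from collections import defaultdict

# Hypothetical samples: (PID, cgroup path, CPU seconds used so far).
samples = [
    (4101, "/app.slice/example-browser.scope", 12.5),
    (4102, "/app.slice/example-browser.scope", 3.25),
    (4103, "/app.slice/example-browser.scope", 0.25),
    (4200, "/app.slice/example-editor.scope", 7.0),
]


def aggregate_by_cgroup(samples):
    """Sum CPU time and collect PIDs per cgroup (the last path component)."""
    totals = defaultdict(lambda: {"cpu_seconds": 0.0, "pids": []})
    for pid, cgroup_path, cpu_seconds in samples:
        unit = cgroup_path.rsplit("/", 1)[-1]
        totals[unit]["cpu_seconds"] += cpu_seconds
        totals[unit]["pids"].append(pid)
    return dict(totals)


per_app = aggregate_by_cgroup(samples)
for unit, info in per_app.items():
    print(unit, info["cpu_seconds"], info["pids"])
```

No guessing based on process names or parent PIDs is needed: the cgroup is the application.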

Cooperation with GNOME

"Plasma doing things on its own isn't going far", Edmundson said as he explained how the collaboration with other projects was put in place. The Plasma team started working on the problem and talking to GNOME developers. It turned out that both teams wanted to do the same thing (albeit for different reasons — GNOME was concentrating on out-of-memory handling). They started working together, unifying the tagging of desktop files, which map application names to cgroup names. They came up with a common set of slices the applications will be running in. The KDE and GNOME teams met with some kernel developers to work on this problem; they came up with a common solution. "It is going to be universal", Edmundson said.

Upcoming features

The basic functionality is currently working and the Plasma team is planning its next steps. One of them consists of statically assigning weights to some services and programs to control the CPU time they use. There will be three slices, one of which will be a specific session slice for KWin and Plasma. This is especially important on Wayland, where "KWin is doing some very important jobs". Another slice will include all user-run applications; they will be competing for system resources within their slice. Finally, the background slice will include processes (like file indexing) that need to run at some point but which can run at a lower priority.
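Because weights apply among siblings at each level of the tree, the share an application ends up with is the product of its slice's share and its own share inside that slice. The Python sketch below works through a three-slice layout like the one described. The weights are invented for illustration (the talk did not give values), the slice names are used as labels only, and all groups are assumed to be busy.

```python
def effective_shares(tree, parent_share=1.0, prefix=""):
    """Map each leaf path to its overall CPU share.

    tree maps a group name to (weight, children), where children is a
    nested tree of the same shape or None for a leaf.
    """
    total = sum(weight for weight, _ in tree.values())
    result = {}
    for name, (weight, children) in tree.items():
        share = parent_share * weight / total
        path = f"{prefix}/{name}"
        if children:
            result.update(effective_shares(children, share, path))
        else:
            result[path] = share
    return result


# Hypothetical weights: the session slice is favored, the background slice is not.
tree = {
    "session.slice": (200, None),
    "app.slice": (100, {
        "krita": (100, None),
        "discord": (100, None),
    }),
    "background.slice": (20, None),
}

shares = effective_shares(tree)
for path, share in shares.items():
    print(f"{path}: {share:.2%}")
```

However many applications are started, they only divide up the application slice's portion among themselves; the session slice's portion is unaffected.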

In the future, users will be able to apply limits to applications. For example, it will be possible to limit the file indexer to only 10% of the CPU, set out-of-memory parameters, and stop fork bombs from happening.
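At the kernel level, a CPU limit of that kind is expressed in the cgroups v2 cpu.max file as a quota and a period, both in microseconds: the group may run for at most "quota" microseconds in every "period". The Python sketch below converts a percentage into that form, assuming the default period of 100 ms and a percentage measured against a single CPU (so values above 100 are possible on multi-core machines). The function name is hypothetical; in systemd the same limit would be requested with the CPUQuota= setting.

```python
def cpu_max_for_percent(percent, period_us=100_000):
    """Return the cpu.max contents ("quota period") for a CPU percentage."""
    if percent <= 0:
        raise ValueError("percent must be positive")
    quota_us = round(period_us * percent / 100)
    return f"{quota_us} {period_us}"


indexer_limit = cpu_max_for_percent(10)
print(indexer_limit)
```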

Chain has been working on changing resource allocation based on run-time information. He created a small library wrapping the systemd D-Bus API to expose resource control for currently running applications. The audience was encouraged to take a look. The library includes a demo application (YouTube video available), which talks to the window manager to get the currently focused application in order to change its weight. This feature is useful for improving battery life on mobile systems, Chain explained in a response to a question from the audience.
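The idea behind the demo can be sketched without the library itself. In the Python below, a policy function picks a higher weight for the focused application's unit, and a second function builds the equivalent systemctl command line; the commands are only printed, not executed. Chain's library talks to systemd over D-Bus instead, so the use of systemctl here is a stand-in, and the unit names and weight values are invented.

```python
def weights_for_focus(units, focused, focused_weight=300, default_weight=100):
    """Give the focused unit a higher CPU weight than all the others."""
    return {
        unit: focused_weight if unit == focused else default_weight
        for unit in units
    }


def set_property_argv(unit, weight):
    """Build a systemctl command that changes a unit's CPU weight at run time."""
    # --runtime keeps the change from persisting across reboots.
    return ["systemctl", "--user", "set-property", "--runtime",
            unit, f"CPUWeight={weight}"]


# Hypothetical running applications, with the editor currently focused.
units = ["example-editor.scope", "example-chat.scope"]
plan = weights_for_focus(units, focused="example-editor.scope")
commands = [set_property_argv(unit, weight) for unit, weight in plan.items()]
for cmd in commands:
    print(" ".join(cmd))
```

A real implementation would rerun this each time the window manager reports a focus change.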

Another upcoming change is to move from scopes to systemd services. Process lifetime management will then be done by systemd, with more configuration options available than with scopes. The use of services also avoids a bug in systemd versions older than 238, where applications could not be grouped into scopes. The implementation is already complete, hidden behind a flag, but the developers have run into issues with systemd, and the experience is currently not as good as with scopes. More work is also needed to get crash handling working, Chain added.

The remaining work is converting the rest of Plasma, including the background services. The goal is to have the Plasma session as a systemd unit. Another goal is to make the experience familiar for sysadmins. That was requested by a distributor, so that there is no new layer to learn and administrators could continue to use familiar tools. The desktop files are being converted "magically" to systemd, thanks to the work from Benjamin Berg (from the GNOME team). Also, all environment variables should "just work".

Conclusion

The KDE and GNOME teams have accomplished some important tasks to prepare the desktop to use cgroups. As the basic infrastructure is already in place, users can check right now how those improvements influence the overall experience. More effort is still needed to complete the work, and we should be witnessing improvements in this area in the months and years to come.

Index entries for this article
GuestArticles	Rybczynska, Marta
Conference	Akademy/2020

Resource management in KDE

Posted Oct 20, 2020 11:03 UTC (Tue) by ovitters (guest, #27950)

How well does this work under cgroups v1 and v2? Does some of this work in v1? Is it advisable to have distributions switch to v2? The article goes into v1 and v2 and the article might explain what is possible under v1 vs v2 but not clearly enough for me.

For Mageia I only noticed comments that v2 does not work nicely with some software, plus v2 doesn't seem to provide benefits. In case KDE and GNOME have features that rely on v2 then someone might spend the effort of switching to v2. Though I'd expect KDE and GNOME to let distributions know (e.g. distributor-list for GNOME).

Resource management in KDE

Posted Oct 20, 2020 16:30 UTC (Tue) by Conan_Kudo (subscriber, #103240)

The current assumption with the cgroup code in GNOME and KDE Plasma is that cgroup v2 is in use. That is, the unified hierarchy is enabled (as systemd calls it). Today, that is active in Fedora (since Fedora 31), and I expect it will be enabled in openSUSE Tumbleweed soon.

Resource management in KDE

Posted Oct 20, 2020 21:43 UTC (Tue) by ms (subscriber, #41272)

Last I checked (~2 months ago), Docker still requires v1. So for a huge number of us doing development work who sadly depend on Docker, the hybrid "unified" hierarchy is the furthest you can get to turning v1 off.

Resource management in KDE

Posted Oct 20, 2020 21:45 UTC (Tue) by Conan_Kudo (subscriber, #103240)

Moby supports cgroup v2 in the mainline code, it just hasn't been released yet. Once the first Moby/Docker 20.xx release arrives, that should be resolved. Alternatively, you can ship a snapshot release...

Resource management in KDE

Posted Oct 21, 2020 8:17 UTC (Wed) by ms (subscriber, #41272)

Very useful to know - thank you.

Resource management in KDE

Posted Oct 20, 2020 23:05 UTC (Tue) by intelfx (subscriber, #130118)

There is Podman, which is mostly compatible with Docker and has supported the unified hierarchy (via the crun runtime) for some time now. This is how I solved this problem for myself.

Resource management in KDE

Posted Oct 21, 2020 13:20 UTC (Wed) by hchain (guest, #142655)

cgroups v2 is the intended successor, hybrid hierarchy is only a transition.
New cgroup-related features are only happening on v2, both in the kernel and in systemd.
The "processes only in leaf nodes" rule, while a constraint, actually makes things easier to manage and reason with.

This page https://systemd.io/CGROUP_DELEGATION/ has a lot of details on cgroups v1 vs v2 from a systemd point of view.
Highlight: "To say this clearly, legacy and hybrid modes have no future. If you develop software today and don’t focus on the unified mode, then you are writing software for yesterday, not tomorrow."

Another limitation is that managing user resources through systemd will never be possible with cgroups v1 / hybrid hierarchy, as this would require delegation to the user systemd instance, and "Delegation to less privileges processes is not safe in cgroup v1 (as a limitation of the kernel), hence systemd won’t facilitate access to it."

Also, docker (git master) finally supports cgroups v2, which should be in the next release (https://github.com/moby/moby/milestone/76). AFAIK, most other container runtimes have had support for a while.

Resource management in KDE

Posted Oct 21, 2020 16:05 UTC (Wed) by d_ed (guest, #142332)

Good question.

Super summary is:
- We work with both, systemd abstracts the settings
- Our system monitor works with both
- Better "fairness" should work with both
- Applying additional resource constraints (CPU/Memory/etc) is V2 only and within that some features rely on additional setups or versions.