RealtimeKit and the audio problem
Audio skipping can have a number of causes. If the stream is traveling over the net, those causes may be entirely external to the system involved. The rest of the time, resource contention is usually the problem. And, often, the limiting resource is the CPU. The response time requirements for audio are not especially tight, but, if the processor gets busy with something else for too long, it's still possible that the system will fail to get audio data to the hardware before the stream runs dry. If the sound card runs out of sound to play, it will go silent, thus introducing a skip into the audio stream. Even if the skip improves the material being played (one of the current flood of Michael Jackson retrospectives, say), users still tend to get unhappy.
Efforts to improve audio latency have taken several forms. One is the ongoing effort to identify and fix latency problems throughout the kernel. New scheduling algorithms (the completely fair scheduler, for example) have also helped. But, even after all that work has been done, there seems to be little alternative to running core audio processing code with realtime scheduling priority. And that is where the trouble starts.
More than four years ago, the audio community came forward with its proposed solution to the problem: the realtime security module. This module used the Linux security module (LSM) API to allow the system administrator to grant realtime scheduling privileges to specific users and groups. This solution slightly reduced the security of the system (opening it up to denial of service attacks by the privileged users), but it also solved the latency issues in ways which made audio developers happy.
Kernel developers didn't like this approach, though. The seeming misuse of the LSM API - which is supposed to only limit privileges, never enhance them - was part of the problem. Beyond that, the realtime module looked like an ad hoc solution that would not stand the test of time. It was never merged; the kernel developers, instead, opted for an approach based on resource limits. As of 2.6.12, any process is allowed to set a realtime priority up to the value of its RLIMIT_RTPRIO limit. By default, this limit does not allow any realtime scheduling at all, but the system administrator can change the default for specific users or groups by editing the limits.conf file read by the PAM subsystem.
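To make the resource-limit approach concrete, here is a minimal sketch of how a process might inspect its own RLIMIT_RTPRIO value. It uses Python's standard resource module; the helper names (max_realtime_priority, describe_rtprio_limit) are invented for this illustration, and the ceiling of 99 assumes the usual Linux realtime priority range.

```python
import resource

def max_realtime_priority(soft_limit):
    """Highest realtime priority an RLIMIT_RTPRIO soft limit permits.

    A limit of 0 means no realtime scheduling at all. An unlimited
    value is capped at 99, assumed here to be the top of the Linux
    realtime priority range.
    """
    if soft_limit == resource.RLIM_INFINITY:
        return 99
    return max(0, min(soft_limit, 99))

def describe_rtprio_limit():
    """Summarize what the calling process may request (illustrative helper)."""
    # RLIMIT_RTPRIO is Linux-specific, so look it up defensively.
    rlimit = getattr(resource, "RLIMIT_RTPRIO", None)
    if rlimit is None:
        return "RLIMIT_RTPRIO is not available on this platform"
    soft, _hard = resource.getrlimit(rlimit)
    ceiling = max_realtime_priority(soft)
    if ceiling == 0:
        return "realtime scheduling is not permitted (limit is 0)"
    return f"realtime priorities 1..{ceiling} may be requested"

if __name__ == "__main__":
    print(describe_rtprio_limit())
```

On a default installation this will typically report that realtime scheduling is not permitted. An administrator could raise the limit with a limits.conf line along the lines of "@audio - rtprio 95", where the group name and the value are purely illustrative.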
This feature would seem to solve the problem, and, indeed, the media-focused distributions make good use of it. The major distributions tend not to use RLIMIT_RTPRIO, though, because it makes it so easy for a process (malicious or just buggy) to completely freeze the machine. Once a process with realtime priority goes into a tight loop, there is little that the user or administrator can do to stop it short of hitting the reset button. That sort of behavior creates unhappy users and inflammatory bug tracker entries - both things that distributors hate. So those distributors have mostly avoided enabling this feature.
More recent kernels have seen the addition of features which could mitigate this problem somewhat. A new limit (RLIMIT_RTTIME) sets an upper bound on the amount of time a realtime process can monopolize the CPU; after that time, it must make a blocking system call or, eventually, be killed by the kernel. This limit solves the rogue process problem, but it does little against deliberate denial-of-service attacks, which can get around the limit by continually forking new processes. As a result, RLIMIT_RTTIME doesn't make nervous distributors feel much better.
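As a rough illustration of how a process might apply RLIMIT_RTTIME to itself, the sketch below lowers the soft limit using Python's standard resource module. The function name cap_realtime_cpu_time and the 200ms figure are assumptions made for the example; the limit is expressed in microseconds of CPU time consumed without a blocking system call.

```python
import resource

def cap_realtime_cpu_time(soft_us):
    """Set this process's RLIMIT_RTTIME soft limit to soft_us microseconds.

    Returns the (soft, hard) pair in effect afterwards, or None on
    platforms that lack the limit. The hard limit is left unchanged,
    so no special privilege is needed.
    """
    rlimit = getattr(resource, "RLIMIT_RTTIME", None)
    if rlimit is None:
        return None
    _soft, hard = resource.getrlimit(rlimit)
    # The soft limit may never exceed the hard limit.
    if hard != resource.RLIM_INFINITY:
        soft_us = min(soft_us, hard)
    resource.setrlimit(rlimit, (soft_us, hard))
    return resource.getrlimit(rlimit)

if __name__ == "__main__":
    # 200,000 microseconds is an arbitrary value chosen for illustration.
    print(cap_realtime_cpu_time(200_000))
```

The limit only matters once the process is actually running under a realtime policy; a normally scheduled process is unaffected by it. On Linux, a process which exceeds the soft limit is sent SIGXCPU, and reaching the hard limit results in SIGKILL.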
The other feature of note is realtime group scheduling, which allows a group of processes to be given a "realtime bandwidth" value limiting the amount of available CPU time the group can use at realtime priority. That, in turn, limits the ability of the group as a whole to completely take over the system. The group scheduling feature looks like it should be a complete solution to the problem; it allows groups of processes to be given access to the realtime scheduler while limiting their ability to affect the overall operation of the system.
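The "realtime bandwidth" idea comes down to simple arithmetic: a runtime allowance divided by a period. The sketch below shows that calculation, assuming the system-wide knobs /proc/sys/kernel/sched_rt_runtime_us and /proc/sys/kernel/sched_rt_period_us; per-group values live in analogous control-group files (cpu.rt_runtime_us and cpu.rt_period_us) when realtime group scheduling is configured. The helper names are made up for this example.

```python
from pathlib import Path

def realtime_share(runtime_us, period_us):
    """Fraction of each period that realtime tasks may consume.

    A negative runtime is treated as "no limit", which is how the
    kernel interprets -1 in these files.
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if runtime_us < 0:
        return 1.0
    return min(runtime_us / period_us, 1.0)

def read_int(path):
    """Read an integer from a file, or return None if that is not possible."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None

if __name__ == "__main__":
    runtime = read_int("/proc/sys/kernel/sched_rt_runtime_us")
    period = read_int("/proc/sys/kernel/sched_rt_period_us")
    if runtime is None or period is None:
        print("realtime bandwidth settings are not readable here")
    else:
        print(f"realtime tasks may use {realtime_share(runtime, period):.0%} of the CPU")
```

A group allowed 950,000 microseconds out of every 1,000,000 could use at most 95% of the CPU at realtime priority, leaving the rest for everything else.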
When Lennart Poettering set out to solve the audio skipping problem, though, he didn't use any of the above solutions. The security issues associated with resource limits were more than he was willing to deal with, and he describes the group scheduling feature this way:
It is true that a process cannot simultaneously be in a container-related control group and an audio-related control group if both groups want to control scheduling-related parameters.
Lennart's proposal is a new daemon called "RealtimeKit"; it has already found its way into the Fedora Rawhide distribution. The RealtimeKit daemon has a relatively simple job: it grants realtime scheduling priority to processes in response to requests sent via D-Bus. There are, of course, some catches:
- Any process requesting realtime priority must have the RLIMIT_RTTIME limit set to ensure that it cannot completely take over the system.
- There are administrator-set limits on the number of processes which can be running with realtime priority at any given time.
- The requesting process must have the SCHED_RESET_ON_FORK policy flag set in the kernel.
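The three conditions above amount to a small admission check. The sketch below models that logic in Python; it is not RealtimeKit's actual code, and every name in it (Request, admit, max_realtime_processes) is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Request:
    """What a hypothetical daemon might know about a requesting process."""
    pid: int
    rttime_limit_us: Optional[int]  # None means RLIMIT_RTTIME is not set
    reset_on_fork: bool             # True if SCHED_RESET_ON_FORK is set

def admit(request, active_count, max_realtime_processes):
    """Return (granted, reason) for a request for realtime priority."""
    if request.rttime_limit_us is None:
        return (False, "RLIMIT_RTTIME is not set")
    if not request.reset_on_fork:
        return (False, "SCHED_RESET_ON_FORK is not set")
    if active_count >= max_realtime_processes:
        return (False, "too many realtime processes")
    return (True, "granted")

if __name__ == "__main__":
    req = Request(pid=1234, rttime_limit_us=200_000, reset_on_fork=True)
    print(admit(req, active_count=3, max_realtime_processes=25))
```

The order of the checks is arbitrary here; what matters is that a request failing any one of them is refused.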
The SCHED_RESET_ON_FORK flag is implemented by a kernel patch written by Lennart and accepted into Ingo Molnar's "tip" tree; this patch has not, as of this writing, been merged for 2.6.31. This flag, when set, prevents any child processes from inheriting any enhanced scheduling priorities from the parent. It thus is effective against fork bombs and other multi-process attacks; it allows RealtimeKit to give realtime priority to a single process in the knowledge that this priority will not be passed on to any others.
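On kernels which carry the flag, it is combined with the scheduling policy using a bitwise OR. The following sketch shows what that might look like from Python's standard os module; the helper names are invented for the example, and the fallback constant 0x40000000 is the value Linux uses for SCHED_RESET_ON_FORK.

```python
import os

def with_reset_on_fork(policy):
    """Combine a scheduling policy with the reset-on-fork flag."""
    # Fall back to the Linux constant if this Python build lacks the name.
    flag = getattr(os, "SCHED_RESET_ON_FORK", 0x40000000)
    return policy | flag

def request_realtime(priority=1):
    """Try to move the calling process to round-robin realtime scheduling.

    Returns True on success and False if the platform lacks the call or
    the kernel refuses, for example because RLIMIT_RTPRIO is 0.
    """
    if not hasattr(os, "sched_setscheduler"):
        return False
    try:
        os.sched_setscheduler(
            0,  # 0 means the calling process
            with_reset_on_fork(os.SCHED_RR),
            os.sched_param(priority),
        )
    except OSError:
        return False
    return True

if __name__ == "__main__":
    print("realtime granted" if request_realtime() else "realtime refused")
```

With the flag in place, any child created by fork() starts out with ordinary scheduling rather than inheriting the parent's realtime priority. An unprivileged process running with the default limits should expect the request to be refused.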
As a solution, RealtimeKit looks like it should work, but its reception in the audio development community was chilly at best. As Paul Davis put it:
There are a number of reasons for the objections to RealtimeKit. It adds a dependency on D-Bus regardless of whether audio developers want it. It's far from a POSIX interface. RealtimeKit takes certain decisions (such as whether to use the round-robin or FIFO scheduling classes) out of the developers' hands. It's not sufficient for the needs of pro audio users. And so on. But the real complaints would appear to be these:
- The audio community feels a little burned. They were told four years ago that they needed to drop their preferred solution (the realtime security module) in favor of the rlimit-based solution, which is now, in turn, being pushed aside for RealtimeKit. How long will it be, they wonder, until yet another solution is put forward as the real answer?
- Nobody asked the audio developers how they would like a solution to look; instead, RealtimeKit was simply presented to them as the new way to go.
Lennart's response suggests that he's not likely to go asking the Linux audio development ("lad") community for input in the future:
More seriously, he points out that RealtimeKit does not break the existing rlimit-based system. Instead, it just adds an option for distributors who want to make realtime scheduling available to specific processes in a safe way. So nothing which works now will stop working under RealtimeKit. That is little comfort, though, to developers and users who feel that they will be forced to run RealtimeKit to use their audio applications in the future.
The Linux audio community contains no end of highly talented and highly motivated developers. But audio support under Linux still falls short of what it should really be. Audio development has suffered from a lack of consensus on solutions, a lack of communications between different development communities, and a lack of a maintainer with a view of the full problem. So it is not surprising that our audio applications don't always play well together and don't always work as well as we would like.
Lennart has dedicated a good part of his life toward improving this situation. A certain amount of controversy and pain has accompanied this work; one need not look very far to find no end of PulseAudio horror stories. But PulseAudio seems to be getting better, and Linux may well be getting closer to having basic (non-professional) audio "just work" out of the box. The goals of this work are hard to criticize, and criticism of the results might just be on the wane.
Perhaps RealtimeKit is the missing link which will enable distributors to improve audio responsiveness; Fedora looks to be the laboratory in which this particular experiment will be conducted. If RealtimeKit works as advertised, the audio community will eventually move to make use of it - regardless of whether they were asked ahead of time or not. For better or for worse, that's often how our community works: problems are solved not by talk, but by a determined developer who creates code that works.
(See the RealtimeKit README file for more information.)
