The development cycle for GStreamer 1.0 took longer than many (including some in the project itself) had originally anticipated. A big part of the reason was the GStreamer team's desire to deliver a stable and well-rounded 1.0 release — but that does not mean that the 1.0 milestone designated a "completed" product with no room for improvement. Several sessions at the 2012 GStreamer Conference in San Diego explored what is yet to come for the multimedia framework, including technical improvements and secure playback for untrusted content.
GStreamer release manager Tim-Philipp Müller gave the annual status report, which both recapped the previous year's developments and set the stage for what lies ahead. Several new content formats need support, including the various flavors of Digital Video Broadcasting (DVB) television (all of which are based on MPEG-2), the new streaming standard Dynamic Adaptive Streaming over HTTP (MPEG-DASH) in both server and client implementations, and the 3D Multiview Video Coding (MVC) format. There is also room for improving the MPEG Transport Stream (MPEG TS) demultiplexer, where Müller said "lots of stuff is happening," to the point where it can be confusing to follow. GStreamer also still lacks support for playlists, which is a very common feature that ends up being re-implemented by applications.
In addition to the media formats, there are several additional subtitle formats that the framework needs to support. But subtitle support requires extension in other areas as well, such as porting all of the subtitle elements to the new overlay compositor API, which allows an application to offload compositing to the video hardware. A related feature request is for a way to overlay subtitles at the native resolution of the display hardware, rather than at the video content's resolution. The two resolutions can be different, and to be readable subtitles should be rendered at the sharpness provided by the display resolution. The project also wants to expose more control over subtitle rendering options to the application, again to provide smarter choices and clearer rendering.
Hardware accelerated rendering has taken major steps forward in recent releases, but it, too, has room for improvement. Müller mentioned NVIDIA's Video Decode and Presentation API for Unix (VDPAU) as needing work, and said the libva plugin that implements Video Acceleration API (VA-API) support needed to be moved to the "good" plugin module and be used for playing back more content. He also said more work was required on GStreamer's OpenGL support. Although OpenGL output is possible, a lot could be done to improve it and make integration more natural. For starters, all OpenGL-based GStreamer elements must currently be routed through special glupload or gldownload elements; being able to directly connect OpenGL elements to other video elements would simplify coding for application developers. Second, OpenGL coding is easier when operations remain in a single thread, which conflicts with GStreamer's heavy use of multi-threading. There is a long list of other proposed OpenGL improvements, including numerous changes to the OpenGL structures.
The project is also intent on playing better with other device form factors, including set-top boxes and in-vehicle systems. In some cases, there is already outside work that simply needs to be tracked more closely — for example, the MeeGo project's in-vehicle infotainment (IVI) platform wrote its own metadata extraction plugin that reportedly has excellent performance, but has not been merged into GStreamer. In other cases, the project will need to implement entirely new features, such as Digital Living Network Alliance (DLNA) functionality and other "smart TV" standards from the consumer electronics industry.
GStreamer developers have plenty of room to improve the framework's day-to-day functionality, too. Müller noted that the new GStreamer 1.0 API introduces reworked memory management features (to reduce overhead by cutting down costly buffer copy operations), but that many plugins still need optimization in order to fully take advantage of the improvements. It is also possible that the project could speed up the process of probing elements for their features by removing the current GstPropertyProbe and relying on D-Bus discovery. There is also room for improvement in stream switching. As he explained, you certainly do not want to decode all eight audio tracks of a DVD video when you are only listening to one of them, but when users switch audio tracks, they expect the new one to start playing immediately and without hiccups.
Some refactoring work may take place as well. A big target is the gstreamer-plugins-bad module, which is huge in comparison to the gstreamer-plugins-good and gstreamer-plugins-ugly modules. Historically, the "good" module includes plugins that are high-quality and freely redistributable, the "ugly" module includes plugins with distributability problems, and the "bad" module contains everything else not up to par. But plugins can end up in -bad for a variety of reasons, he said: some because they do not work well, others because they are missing documentation or are simply still in development. Splitting the module up (perhaps adding a gstreamer-plugins-staging) would simplify maintenance. The project is also considering moving its Bluetooth plugins out of the BlueZ code base and into GStreamer itself, again for maintenance reasons. Post-1.0 development will also allow the project to push forward on some of its add-on layers, such as the GStreamer Streaming Server (GSS) and GStreamer Editing Services (GES) libraries.
Finally, there have been several improvements to the GStreamer developer tools in the past year, including the just-launched SDK and Rene Stadler's log visualizer. Collecting those utilities into some sort of "GStreamer tools" package could make life easier for developers. The project is committed to accelerating its development cycle for the same reason: faster releases mean improvements get pushed out to application authors sooner, and code does not stagnate relying on old releases. Müller announced that the project was switching to a more mainstream N.odd unstable, N.even stable numbering scheme, with the addendum that the framework will stick to 1.x numbers until there is the need to make an ABI break for 2.0.
On a different note, Guillaume Emont presented a session about his ongoing experiments with sandboxing GStreamer media playback. The principal use case is playing web-delivered content inside a browser. The Internet may have been invented to watch videos of cute animals, he said, but that does not mean you should trust arbitrary data found online. In particular, untrusted data is dangerous when used in combination with complex pieces of software like media decoders.
For media playback, the security risk stems from the fact that although the decoder itself should not be considered evil and untrustworthy (as one might regard a Java applet), the process becomes untrustworthy when it must handle untrusted data. Thus, GStreamer should be able to use the same decoder plugin on untrusted and trusted content, but when handling untrusted content the framework must have a way to initialize the player, then drop its privilege level to isolate it.
Emont's work with sandboxing GStreamer playback started with setuid-sandbox, a standalone version of the sandbox from Google's Chromium browser. Setuid-sandbox creates a separate PID namespace and chroot for the sandboxed process. Although it is not very fine-grained, Emont thought it a good place to start and produced a working implementation of a sandboxed GStreamer playback pipeline.
The pipeline takes the downloaded content as usual and writes it to a file descriptor sink (fdsink element). When the fdsink element reaches the READY state, the file descriptor is opened by an fdsrc element inside a setuid-sandbox, where the content is then demultiplexed, decoded into GStreamer buffers, and written to a shmsink shared memory sink. The shmsink is the last stage in the sandboxed process; outside the sandbox, the pipeline accesses the shared memory and plays back the contents. This design sandboxes the demultiplexing and decoding steps in the pipeline, which Emont said were the most likely to contain exploitable bugs.
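As a rough sketch, the two halves of that design can be expressed as gst-launch-1.0 pipeline descriptions. This is illustrative only: fdsrc, oggdemux, theoradec, shmsink, and shmsrc are real GStreamer elements, but the file descriptor number, socket path, caps, and codec choice are assumptions, and the setuid-sandbox wrapper itself is elided.

```shell
# Sandboxed half: parse and decode the untrusted data, then hand the
# raw frames out through shared memory. In Emont's design this pipeline
# runs inside setuid-sandbox with its privileges dropped.
SANDBOXED="fdsrc fd=3 ! oggdemux ! theoradec \
  ! shmsink socket-path=/tmp/video-shm wait-for-connection=true"

# Trusted half: read the already-decoded frames back out of shared
# memory and display them; no untrusted parsing happens on this side.
TRUSTED="shmsrc socket-path=/tmp/video-shm \
  ! video/x-raw,format=I420,width=1280,height=720,framerate=24/1 \
  ! videoconvert ! autovideosink"

# Print the commands rather than executing them, since setting up the
# sandbox itself is out of scope for this sketch.
echo "gst-launch-1.0 $SANDBOXED"
echo "gst-launch-1.0 $TRUSTED"
```

Note that the trusted side must specify the raw video caps explicitly, because the shared-memory transport carries no format negotiation of its own.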
The playback pipeline worked, he said, but there were several issues. First, he discovered that many GStreamer elements do not acquire all of their resources by the time they reach the READY state, though they do by the time they reach the PAUSED state that follows. It might be possible to modify these elements to get their resources earlier, he said, or to add an ALL_RESOURCES_ACQUIRED signal. Next, he noted that the memory created by the shmsink inside the sandbox could not be cleaned up by the sandboxed process, but only by the "broker" portion of the pipeline outside the sandbox. A more noticeable problem was that sandboxing the decoder made it impossible to seek within the file. Finally, the sandboxing process as a whole adds significant overhead; Emont reported that a 720p Theora video would consume 30-40% of the CPU inside the sandboxed pipeline, compared to 20-30% under normal circumstances.
Some of the problems (such as the READY/PAUSED state issue and the lack of seekability) might be solvable by sandboxing the entire pipeline, he said, or by adding proxy elements to allow for remote pipeline control. Either way, going forward there is still a lot of work to do.
It is also possible that setuid-sandbox is simply not the best sandboxing solution. There are others that Emont said he was interested in trying out for comparison. He outlined the options and their various pros and cons. Seccomp, for example, is even less flexible, which probably makes it a poor replacement. On the other hand, seccomp's new mode that combines with Berkeley Packet Filters (BPF) provides a much greater degree of control. It also has the advantage of being usable without end-user intervention. SELinux, in contrast, could be used to define a strict playback policy, but it is under the control of the machine's administrators. GStreamer and application developers could make suggestions for users, but ultimately SELinux is not under the developers' control. Finally, Emont did his experiments on Linux, but in the long term GStreamer really needs a sandboxing framework that is cross-platform, and perhaps provides some sort of fallback mechanism between different sandboxing options.
Emont's work is still experimental, and more to the point he is not conducting it as part of GStreamer's core development. But he did make a good case for its eventual inclusion. Certainly any part of a large framework like GStreamer has bugs and therefore the potential to be exploited by an attacker. But isolating the un-decoded media payload from the rest of the system already goes a long way toward protecting the user. As did Müller's talk, Emont's presentation shows that GStreamer may reach 1.0 soon, but it is still far from "complete."
The problem, simply put, is this: the objective of secure boot is to prevent the system from running any unsigned code in a privileged mode. So, if one boots a Linux system that, in turn, gives access to the machine to untrusted code, the entire purpose has been defeated. The consequences could hurt both locally (bad code could take control of the machine) and globally (the signing key used to boot Linux could be revoked), so it is an outcome that is worth avoiding. Doing so, however, requires placing limitations in the kernel so that not even root can circumvent the secure boot chain of trust.
The form of those limitations can now be seen in Matthew Garrett's secure boot support patch set. These patches may see some changes before finding their way into the mainline, but chances are that their overall form will not evolve that much.
The first step is to add a new capability bit. Capabilities describe privileged operations that a given process can perform; they vary from CAP_DAC_OVERRIDE (able to override file permissions) to CAP_NET_BIND_SERVICE (can bind to a low-numbered TCP port) to CAP_SYS_ADMIN (can do a vast number of highly privileged things). The new capability, called CAP_SECURE_FIRMWARE, enables actions that are not allowed in the secure boot environment. Or, more to the point, its absence blocks actions that might otherwise enable the running of untrusted code.
Naturally, the first thing reviewers complained about was the name. It describes actions that can be performed in the absence of "secure firmware"; some reviewers have also disputed whether it has anything to do with security in the first place. So the capability will probably be renamed, though nobody has come up with an obvious replacement yet.
Whatever it is eventually called, this capability will normally be available to privileged processes. If the kernel determines (by asking the firmware) that it has been booted in the secure mode, though, this capability will be removed from the bounding set before init is run; once a capability is removed from that set, no process can ever obtain it. Matthew's patch set also adds a boot-time parameter (secureboot_enable=) that can be used to simulate a secure boot on hardware that lacks that feature.
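The bounding-set mechanism itself can be observed on any running Linux system, secure boot or not: the CapBnd line in /proc/self/status is the hexadecimal mask of capabilities the process can still acquire. A quick sketch:

```shell
# Read the capability bounding set of the current process. Once a bit
# has been cleared from this mask -- as Garrett's patches do before
# init runs in secure boot mode -- no descendant can ever set it again.
capbnd=$(grep '^CapBnd:' /proc/self/status | cut -f2)
echo "bounding set mask: $capbnd"
```

On an ordinary system run as root, every bit is typically set; under the patches, booting in secure mode would show the new capability's bit cleared everywhere.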
In the secure boot world, processes lacking the new capability can no longer access I/O memory or x86 I/O ports. Either of those could be used to convince a device to overwrite the running kernel with hostile code using DMA, compromising the system, so they cannot be allowed. One consequence is that graphics cards without kernel mode setting (KMS) support cannot be used; fortunately, the number of systems with (1) UEFI firmware and (2) non-KMS graphics is probably countable using an eight-bit signed value. Other user-space device drivers will be left out in the cold as well. Someday, Matthew says, it may be possible to enable I/O access on systems where the I/O memory management unit can enforce restrictions on the range of DMA operations, but, for now, all such access is denied.
Similarly, all write access to /dev/mem and /dev/kmem must be disabled, even if the kernel configuration would otherwise allow such access.
The strongest comments came in response to another limitation — the disabling of the kexec() system call. This call replaces the running kernel with a new kernel and boots the result without going through the system's firmware. It can be used for extra-fast reboots, though the most common use, arguably, is to boot a special kernel to create a crash dump after a system failure. Booting an arbitrary kernel obviously goes against the spirit of secure boot, so it cannot be allowed.
Eric Biederman, in particular, complained about this limitation, saying:
Matthew responded that, in fact, we can't always trust root, and never have trusted it fully:
In this case, the proper solution would appear to be to allow kexec() to succeed if the target kernel has been properly signed. That support has not yet been implemented, though. It's apparently on the to-do list, but, as Matthew said: "We ship with the code we have, not the code we want."
One other important piece of the puzzle, of course, is module loading; if unsigned modules can be loaded into the kernel, the game is over. But, unlike kexec(), module loading cannot simply be turned off, so the implementation of some sort of signing mechanism cannot be put off. The module signing implementation is not part of Matthew's patch set, though; instead, David Howells has been working on the problem for some time now. This code has been delayed as the result of strong disagreements on how signing should be implemented; a solution was worked out at the 2012 Kernel Summit and this feature, in the form of a new patch set from Rusty Russell, should find its way into the mainline as soon as the 3.7 development cycle.
The end result is that, by the time users have machines with UEFI secure boot capabilities, the kernel should be able to do its part. Whether users will like the result is another story. There is great value in knowing that the system is running the software you want it to be running, and many users will appreciate that. But others may find that the system is refusing to run the software they want; that is harder to appreciate. If things go well, the restrictions required by UEFI secure boot will come to be seen like other capability-based restrictions in Linux: occasionally obnoxious, but good for the long-term stability of the system and ultimately circumventable if need be.
At LinuxCon 2012, Bradley Kuhn, executive director of the Software Freedom Conservancy (SFC), presented a session on funding free software development. SFC's primary mission is to provide organizational and legal support to free software projects, but it has also been successful at raising funds to support development time — a task that many projects find difficult.
Kuhn started the discussion with an account of his introduction to free software, which began when he accidentally hit a key sequence in Emacs that brought up the text of Richard Stallman's GNU Manifesto. Reading the Manifesto was inspirational, said Kuhn, who has subsequently pursued a career in free software — even serving as director of the Free Software Foundation (FSF).
But on this occasion, he told the story not just as an introduction, but also to point out an oft-overlooked section of the document. Toward the end of the Manifesto, Stallman discusses several possible alternatives to the proprietary software funding model. Stallman argues that (contrary to the common objection that "no one will code for free") free software will always have developers, they will just earn smaller salaries than they would writing proprietary software. He cites examples of people who take jobs writing software in not-for-profit situations like MIT's Artificial Intelligence Lab, and says that free software is no different. Developers do tend to move to higher-paying jobs when they can work on the same projects, he said, but there are many who write free software out of commitment to the ideals.
Stallman suggests several alternative funding models under which developers could make money working on free software. One is a "Software Tax" in which software users each pay a small amount into a general National Science Foundation (NSF)-like fund that makes grants to developers. Another is that hardware manufacturers will underwrite porting efforts; a third is that user groups will form and collect money through dues, then pay developers with it.
Few people remember it, Kuhn said, but in the early days FSF itself functioned much like one of the user groups Stallman describes in the Manifesto. It accepted donations and directly paid developers to work on GNU software. A long list of core projects, including GNU Make, glibc, and GDB, were originally written by paid FSF employees. It was only later, as these original developers took jobs working on free software at companies like Red Hat and Google, that FSF turned its primary attention to advocacy issues.
Today, Kuhn said, the majority of free software is written by for-profit companies. Although that situation is a boon for free software, the resulting code bases tend to drift in the direction of the company's needs. He then quoted Samba's Jeremy Allison (a Google employee) as saying "It's the duty of all Free Software developers to steal as much time as they can from their employers for software freedom." Since not everyone is in a position to "be a Jeremy," Kuhn said, some developers need to be funded by non-profit organizations in order to mitigate the risks of for-profit control.
But proliferation of free software non-profits can be detrimental: it confuses users, and each organization has administrative overhead (boards, officers, and legal filings) that can steal time from development. There are several "umbrella" non-profits that attempt to offload the administrative overhead from the developers, including the Apache Software Foundation (ASF), Software in the Public Interest (SPI), and the SFC.
In addition to the administrative and legal functions of these organizations, each has some mechanism for funding or underwriting software development for its members. Donations to the ASF go into a general fund, from which individual member projects can apply for disbursement for specific work. SFC and SPI use a different model, in which each member project has separate earmarked funds.
Most of SFC's disbursement goes toward funding developer travel to conferences and workshops, Kuhn said. It also handles financial arrangements for conference organizing, Google Summer of Code, and other contracts, but the most interesting thing it does is manage paid contracts for software developers. Typically these contracts are fixed-length affairs that raise targeted funds for the contract through donation drives, as opposed to, for example, earmarking funds that accumulate through an ongoing donation button on the project's web site.
Kuhn recounted several recent success stories from different SFC member projects. The first was the Twisted engine for Python. Back in 2008, the project was confronted with a familiar scenario: it was successful enough that many core developers got high-paying jobs working on Twisted consulting, which in turn led to bit-rot of core functionality. The project decided to hold a fundraising drive, and collected enough donations to pay founder Jean-Paul Calderone to work for two years on bug-squashing, integration, and maintenance of the core — work that was vital to the project, but not exciting enough to attract a full-time position from the typical corporate Twisted user.
In 2010, SFC did a similar fundraising drive to pay Matt Mackall to maintain the Mercurial source code management system. Mackall said he was able to support himself full-time on Linux kernel-space development, but that it was hard to repeatedly "context switch" to Python userspace and work on Mercurial. The SFC fundraising drive funded Mackall full time from April 2010 through June 2012.
The PyPy Python interpreter project launched three successful fundraising initiatives in one year to support specific development projects. The initiatives for PyPy's Py3k implementation of Python 3 and its port of the Numpy scientific computing package each raised $42,000 in drives held a month apart in late 2011. The project has also raised more than $21,000 and counting this year to fund development of software transactional memory support. Kuhn related that he had been concerned at one point that the frequency of the fundraising drives would wear out the potential donor pool, but the project forged ahead, and SFC is now funding four PyPy developers.
A member of the audience asked what SFC thought about using Kickstarter for fundraising, to which Kuhn replied "who is going to Kickstarter for Python stuff who isn't also reading your blog?" PyPy's recent success, he explained, probably owes more to the fact that PyPy is a hot commodity in Python circles right now. It has little trouble finding donors as a result, but by raising the funds through drives hosted at its own site, it avoids having to pay Kickstarter or another broker a potentially hefty cut of the donations.
The tough part, he continued, is what to do when you are no longer on top of the popularity bubble. Free software has a big "I gave at the office" problem, he said. Many of free software's most passionate users (and thus potential donors) already spend their own time working on free software. Consequently, they react to fundraising efforts with questions like "I code all day long, now you want me to give money, too?"
Kuhn did not offer any simple solutions to the ongoing fundraising issue, but perhaps that is because there are none. Like Yorba, SFC is interested in exploring the possibility of funding free software projects, which makes Kuhn's report on SFC's successes an interesting counterpart to Yorba director Adam Dingle's examination of other funding methods.
It is clear that SFC's success stories differ from generic Kickstarter or bounty-style drives in a few key respects. First, they are tied to funding work by well-known contributors with good standing in the projects — often key maintainers. Second, they are tied to a development contract of specific length. But they still differ in other important details: although the PyPy initiatives were also tied to a specific feature set, the Twisted and Mercurial drives were done to fund the harder-to-price tasks of bug fixing and routine maintenance. Free software development is not a homogeneous process, so there is certainly no one-size-fits-all answer to the fundraising question. But it is reassuring to know that organizations like SFC (with its commitment to software freedom) can still find success where money is involved.
While the Linux Security Summit (LSS) was held later in the week, it was logically part of the minisummits that accompanied the Kernel Summit—organizer James Morris made a forward-reference report on LSS as part of the minisummit reports. Day one was filled with talks on various topics of interest to the assembled security developers, while day two was mostly devoted to reports from the kernel security subsystems. We plan to write up much of LSS over the coming weeks; the first installment covers a talk given by SELinux developer Dan Walsh on secure Linux containers.
Walsh's opening slide had a picture of a "secure" Linux container (a plastic "unix ware" storage container), but his talk was a tad more serious. Application sandboxes are becoming more common for isolating general-purpose applications from each other. There are a variety of Linux tools that can be used to create sandboxes, including seccomp, SELinux, the Java virtual machine, and virtualization. The idea behind sandboxing is the age-old concept of "defense in depth".
There is another mechanism that can be used to isolate applications: containers. When most people think of containers, they think of LXC, which is a command-line tool created by IBM. But, the Linux kernel knows nothing about containers, per se, and LXC is built atop Linux namespaces. The secure containers project did not use LXC directly; instead it uses libvirt-lxc.
Using namespaces, child processes can have an entirely different view of the system than does the parent. Namespaces are not all that new; RHEL5 and Fedora 6 used the pam_namespace module to partition logins into "secret" vs. "top secret", for example. The SELinux sandbox also used namespaces and was available in RHEL6 and Fedora 8. More recently, Fedora 17 uses systemd, which has PrivateTmp and PrivateNetwork directives for unit files that can be used to give services their own view of /tmp or the network. There are 20-30 services in Fedora 17 that are running with their own /tmp, Walsh said.
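For a concrete idea of what those directives look like, a minimal unit-file fragment using them might read as follows (a hypothetical sketch; the directive names are real systemd ones, documented in systemd.exec(5)):

```ini
# Give a service its own private /tmp and cut it off from the network.
[Service]
PrivateTmp=yes
PrivateNetwork=yes
```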
In addition, Red Hat offers the OpenShift service which allows anyone to have their own Apache webserver for free on Red Hat servers. It is meant to remove the management aspect so that developers can concentrate on developing web applications that can eventually be deployed elsewhere. Since there are many different Apache instances running on the OpenShift servers, sandboxing is used to keep them from interfering with each other.
There are several different kinds of namespaces in Linux. The mount namespace gives processes their own view of the filesystem, while the PID namespace gives them their own set of process IDs. The IPC and Network namespaces allow for private views of those resources, and the UTS namespace allows the processes to have their own host and domain names. The UID namespace is another that is not yet available, and one that concerns Walsh because of its intrusiveness. It would give a private set of UIDs, such that UID 0 inside of the namespace is not the same as root outside.
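The kernel exposes a process's namespace memberships under /proc, which makes the concept easy to poke at. A small sketch, runnable on any reasonably recent Linux system:

```shell
# Each entry under /proc/self/ns/ is a symbolic link whose target names
# the namespace instance, e.g. "mnt:[4026531840]". Two processes share
# a given namespace exactly when these targets match.
mnt_ns=$(readlink /proc/self/ns/mnt)
net_ns=$(readlink /proc/self/ns/net)
echo "mount namespace:   $mnt_ns"
echo "network namespace: $net_ns"
```

A process inside one of Walsh's containers would show different link targets than the host for the namespaces it has unshared.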
Secure Linux containers uses libvirt-lxc to set up namespaces that effectively create containers to hold processes that are isolated from those in other containers. Libvirt-lxc has a C API, but also has bindings for several different higher-level languages. It can set up a container, with a firewall, SELinux type enforcement (TE) and multi-category security (MCS), bind mounts that pass through to the host filesystem, and so on. Once that is done, it can start an init process (systemd in this case) inside the container so that it appears to be almost a full Linux system inside the container. In addition, these containers can be managed using control groups (cgroups) so that no one container can monopolize resources like memory or CPU.
But, libvirt-lxc has a complex API that is XML-based. Walsh wanted something simpler, so he created libvirt-sandbox with a key-value based configuration. He intends to replace the SELinux sandbox using libvirt-sandbox, but it is not quite ready for that yet.
To make things even easier, Walsh created a Python script that makes it "dirt simple" for an administrator to build a container or set of containers. He said that Red Hat is famous for building "cool tools that no one uses" because they are too complicated, so he set out to make something very simple to use.
The tool can be used as follows:
virt-sandbox-service create -C -u httpd.service apache1

That call will do multiple things under the covers. It creates a systemd unit file for the container, which means that standard systemd commands can be used to manage it. In addition, if someone puts a GUI on systemd someday, administrators can use that to manage their containers, he said. It also creates the filesystems for the container. It does not use a full chroot(), Walsh said, because he wants to be able to share /usr between containers. For this use case (an Apache web server container), he wants the individual containers to pick up any updates that come from doing a yum update on the host.
It also clones the /var and /etc configuration files into its own copy. In a perfect world, the container would bind mount over /etc, but it can't do that, partly because /etc has so many needed configuration files ("/etc is a cesspool of garbage" was his colorful way of describing that). In addition, it allocates a unique SELinux MCS label that restricts the processes inside the container. "Containers are not for security", he said, because root inside the container can always escape, so the container gets wrapped in SELinux to restrict it.
Once the container has been created, it can be started with:
virt-sandbox-service start apache1

Similarly, the stop command can terminate the container. One can also use the connect command to get a shell in the container.
virt-sandbox-service execute -C ifconfig apache1

will run a command in the container. For example, there is no separate cron running in each of the containers; instead, execute is used to do things like logrotate from the host's cron.
The systemd unit file that gets created can start and stop multiple container instances with a single command. Beyond that, using the ReloadPropagatedFrom directive in the unit file will allow an update of the host's apache package to restart all of the servers in the containers. So:
systemctl reload httpd.service

will trigger a reload in all container instances, while:
systemctl start httpd@.service

will start up all such services (which means all of the defined containers).
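The generated template unit is roughly of this shape (a hypothetical sketch, not the exact file virt-sandbox-service writes; %i is systemd's instance-name specifier, replaced with, for example, "apache1"):

```ini
# /etc/systemd/system/httpd@.service -- sketch of a per-container
# template unit. ReloadPropagatedFrom= makes a "systemctl reload
# httpd.service" on the host propagate into every running instance.
[Unit]
Description=Sandboxed Apache container %i
ReloadPropagatedFrom=httpd.service

[Service]
ExecStart=/usr/bin/virt-sandbox-service start %i
ExecStop=/usr/bin/virt-sandbox-service stop %i

[Install]
WantedBy=multi-user.target
```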
This is all recent work, Walsh said. It works "relatively well", but still needs work. There are other use cases for these containers, beyond just the OpenShift-like example he used. For instance, the Fedora project uses Mock to build packages, and Mock runs as root. That means there are some 3000 Fedora packagers who could do "bad stuff" on the build systems, so putting Mock into a secure container would provide better security. Another possibility would be to run customer processes (e.g. Hadoop) on a GlusterFS node. Another service that Walsh has containerized is MySQL, and more are possible.
Walsh demonstrated virt-sandbox-service at the end of his talk, showing some of the differences inside and outside of the container, including a surprising answer to getenforce inside the container. It reports that SELinux is disabled, but that is a lie, he said, to stop various scripts from trying to do SELinux things within the container. He also showed that the eth0 device inside the container did not even appear in the host's ifconfig output (nor, of course, did the host's wlan0 appear in the container).
A number of steps have been taken to try to prevent root from breaking out of the container, but there is more to be done. Both mount and mknod, for example, will fail inside the container. These containers are not as secure as full virtualization, Walsh said, but they are much easier to manage than handling the multiple full operating systems that virtualization requires. For many use cases, secure containers may be the right fit.
Our editorial team and content monitors almost immediately noticed a flood of livid Twitter messages about the ban and attempted to restore the broadcast. Unfortunately, we were not able to lift the ban before the broadcast ended. We had many unhappy viewers as a result, and for that I am truly sorry. As a long-time Firefly, Stargate and Game of Thrones fan among others, I am especially disheartened by this.
Created: September 5, 2012; Updated: September 11, 2012
Description: From the CVE entry:
Auth/Verify/LDAP.pm in Bugzilla 2.x and 3.x before 3.6.11, 3.7.x and 4.0.x before 4.0.8, 4.1.x and 4.2.x before 4.2.3, and 4.3.x before 4.3.3 does not restrict the characters in a username, which might allow remote attackers to inject data into an LDAP directory via a crafted login attempt.
Created: September 4, 2012; Updated: April 5, 2013
Description: From the Mandriva advisory:
A denial of service flaw was found in the way Fetchmail, a remote mail retrieval and forwarding utility, performed base64 decoding of certain NTLM server responses. Upon sending the NTLM authentication request, Fetchmail did not check if the received response was actually part of the NTLM protocol exchange, or a server-side error message and session abort. A rogue NTLM server could use this flaw to cause the fetchmail executable to crash.
Package(s): gimp; CVE #(s): CVE-2012-2763 CVE-2012-3236
Created: September 4, 2012; Updated: November 9, 2012
Description: From the CVE entries:
Buffer overflow in the readstr_upto function in plug-ins/script-fu/tinyscheme/scheme.c in GIMP 2.6.12 and earlier, and possibly 2.6.13, allows remote attackers to execute arbitrary code via a long string in a command to the script-fu server. (CVE-2012-2763)
fits-io.c in GIMP before 2.8.1 allows remote attackers to cause a denial of service (NULL pointer dereference and application crash) via a malformed XTENSION header of a .fit file, as demonstrated using a long string. (CVE-2012-3236)
Created: September 5, 2012; Updated: April 9, 2013
Description: gnome-keyring seems to obey the configuration asking it to stop caching passphrases, but after a while it neither caches the passphrase nor asks for it. See the Red Hat bugzilla for details.
Created: September 4, 2012; Updated: September 6, 2012
Description: From the Red Hat bugzilla:
A security flaw was found in the XMPP Dialback protocol implementation of jabberd2, OpenSource server implementation of the Jabber protocols (Verify Response and Authorization Response were not checked within XMPP protocol server to server session). A rogue XMPP server could use this flaw to spoof one or more domains, when communicating with vulnerable server implementation, possibly leading into XMPP's Server Dialback protections bypass.
Package(s): java-1.6.0-openjdk; CVE #(s): CVE-2012-0547 CVE-2012-1682
Created: September 4, 2012; Updated: October 19, 2012
Description: From the Red Hat advisory:
It was discovered that the Beans component in OpenJDK did not perform permission checks properly. An untrusted Java application or applet could use this flaw to use classes from restricted packages, allowing it to bypass Java sandbox restrictions. (CVE-2012-1682)
A hardening fix was applied to the AWT component in OpenJDK, removing functionality from the restricted SunToolkit class that was used in combination with other flaws to bypass Java sandbox restrictions. (CVE-2012-0547)
Package(s): java-1.7.0-openjdk; CVE #(s): CVE-2012-3136 CVE-2012-4681
Created: September 4, 2012; Updated: April 19, 2013
Description: From the Red Hat advisory:
Multiple improper permission check issues were discovered in the Beans component in OpenJDK. An untrusted Java application or applet could use these flaws to bypass Java sandbox restrictions.
Package(s): keystone; CVE #(s): CVE-2012-3542 CVE-2012-3426
Created: September 4, 2012; Updated: November 29, 2012
Description: From the Ubuntu advisory:
Dolph Mathews discovered that OpenStack Keystone did not properly restrict to administrative users the ability to update users' tenants. A remote attacker that can reach the administrative API can use this to add any user to any tenant. (CVE-2012-3542)
Derek Higgins discovered that OpenStack Keystone did not properly implement token expiration. A remote attacker could use this to continue to access an account that has been disabled or has a changed password. (CVE-2012-3426)
Created: August 30, 2012; Updated: September 6, 2012
Description: From the Mageia advisory:
This security update for Mariadb corrects a problem that is not yet being publicly disclosed.
In addition, a problem preventing the feedback plugin from working has been corrected.
Created: September 6, 2012; Updated: April 10, 2013
Description: From the Red Hat bugzilla entry:
Mesa, as used in Google Chrome before 21.0.1183.0 on the Acer AC700, Cr-48, and Samsung Series 5 and 5 550 Chromebook platforms, and the Samsung Chromebox Series 3, allows remote attackers to execute arbitrary code via unspecified vectors that trigger an "array overflow."
Created: September 6, 2012; Updated: September 18, 2012
Description: From the Debian advisory:
It was discovered that Moin, a Python clone of WikiWiki, incorrectly evaluates ACLs when virtual groups are involved. This may allow certain users to have additional permissions (privilege escalation) or lack expected permissions.
Created: August 31, 2012; Updated: April 10, 2013
Description: From the CVE entry:
OCaml Xml-Light Library before r234 computes hash values without restricting the ability to trigger hash collisions predictably, which allows context-dependent attackers to cause a denial of service (CPU consumption) via unspecified vectors.
Created: August 31, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
It was discovered that otrs2, a ticket request system, contains a cross-site scripting vulnerability when email messages are viewed using Internet Explorer. This update also improves the HTML security filter to detect tag nesting.
Created: September 5, 2012; Updated: October 25, 2012
Description: From the Red Hat advisory:
A flaw was found in the way QEMU handled VT100 terminal escape sequences when emulating certain character devices. A guest user with privileges to write to a character device that is emulated on the host using a virtual console back-end could use this flaw to crash the qemu-kvm process on the host or, possibly, escalate their privileges on the host.
Created: August 30, 2012; Updated: January 17, 2013
Description: From the CVE entry:
The good_client function in rquotad (rquota_svc.c) in Linux DiskQuota (aka quota) before 3.17 invokes the hosts_ctl function the first time without a host name, which might allow remote attackers to bypass TCP Wrappers rules in hosts.deny.
Created: August 30, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
It was discovered that rtfm, the Request Tracker FAQ Manager, contains multiple cross-site scripting vulnerabilities in the topic administration page.
Package(s): tor; CVE #(s): CVE-2012-3517 CVE-2012-3518 CVE-2012-3519
Created: August 30, 2012; Updated: February 4, 2013
Description: From the CVE entries:
Use-after-free vulnerability in dns.c in Tor before 0.2.2.38 might allow remote attackers to cause a denial of service (daemon crash) via vectors related to failed DNS requests. (CVE-2012-3517)
The networkstatus_parse_vote_from_string function in routerparse.c in Tor before 0.2.2.38 does not properly handle an invalid flavor name, which allows remote attackers to cause a denial of service (out-of-bounds read and daemon crash) via a crafted (1) vote document or (2) consensus document. (CVE-2012-3518)
routerlist.c in Tor before 0.2.2.38 uses a different amount of time for relay-list iteration depending on which relay is chosen, which might allow remote attackers to obtain sensitive information about relay selection via a timing side-channel attack. (CVE-2012-3519)
Package(s): typo3-src; CVE #(s): CVE-2012-3527 CVE-2012-3528 CVE-2012-3529 CVE-2012-3530 CVE-2012-3531
Created: August 31, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
CVE-2012-3527: An insecure call to unserialize in the help system enables arbitrary code execution by authenticated users.
CVE-2012-3528: The TYPO3 backend contains several cross-site scripting vulnerabilities.
CVE-2012-3529: Authenticated users who can access the configuration module can obtain the encryption key, allowing them to escalate their privileges.
Package(s): wireshark; CVE #(s): CVE-2012-4286 CVE-2012-4294 CVE-2012-4295 CVE-2012-4298
Created: August 30, 2012; Updated: September 6, 2012
Description: From the CVE entries:
The pcapng_read_packet_block function in wiretap/pcapng.c in the pcap-ng file parser in Wireshark 1.8.x before 1.8.2 allows user-assisted remote attackers to cause a denial of service (divide-by-zero error and application crash) via a crafted pcap-ng file. (CVE-2012-4286)
Buffer overflow in the channelised_fill_sdh_g707_format function in epan/dissectors/packet-erf.c in the ERF dissector in Wireshark 1.8.x before 1.8.2 allows remote attackers to execute arbitrary code via a large speed (aka rate) value. (CVE-2012-4294)
Array index error in the channelised_fill_sdh_g707_format function in epan/dissectors/packet-erf.c in the ERF dissector in Wireshark 1.8.x before 1.8.2 might allow remote attackers to cause a denial of service (application crash) via a crafted speed (aka rate) value. (CVE-2012-4295)
Integer signedness error in the vwr_read_rec_data_ethernet function in wiretap/vwr.c in the Ixia IxVeriWave file parser in Wireshark 1.8.x before 1.8.2 allows user-assisted remote attackers to execute arbitrary code via a crafted packet-trace file that triggers a buffer overflow. (CVE-2012-4298)
Created: August 31, 2012; Updated: January 1, 2013
Description: From the CVE entry:
SQL injection vulnerability in frontends/php/popup_bitem.php in Zabbix 1.8.15rc1 and earlier, and 2.x before 2.0.2rc1, allows remote attackers to execute arbitrary SQL commands via the itemid parameter.
Page editor: Jake Edge
Brief items
The current development kernel was released on September 1. "Shortlog appended, as you can see it's just fairly random. I'm hoping we're entering the boring/stable part of the -rc windows, and that things won't really pick up speed just because people are getting home."
Stable updates: no stable updates have been released in the last week, and none are in the review process as of this writing.
Kernel development news
The "regression testing" slot on day 1 of the 2012 Kernel Summit consisted of presentations from Dave Jones and Mel Gorman. Dave's presentation described his new fuzz testing tool, while Mel's was concerned with some steps to improve benchmarking for detecting regressions.
Dave Jones talked about a testing tool that he has been working on for the last 18 months. That tool, Trinity, is a type of system call fuzz tester. Dave noted that fuzz testing is nothing new, and that the Linux community has had fuzz testing projects for around a decade. The problem is that past fuzz testers take a fairly simplistic approach, passing random bit patterns in the system call arguments. This suffices to find the really simple bugs, for example, detecting that a numeric value passed to a file descriptor argument does not correspond to a valid open file descriptor. However, once these simple bugs are fixed, fuzz testers tend to simply encounter the error codes (EINVAL, EBADF, and so on) that system calls (correctly) return when they are given bad arguments.
What distinguishes Trinity is the addition of some domain-specific intelligence. The tool includes annotations that describe the arguments expected by each system call. For example, if a system call expects a file descriptor argument, then rather than passing a random number, Trinity opens a range of different types of files, and passes the resulting descriptors to the system call. This allows fuzz testing to get past the simplest checks performed on system call arguments, and find deeper bugs. Annotations are available to indicate a range of argument types, including memory addresses, pathnames, PIDs, lengths, and so on. Using these annotations, Trinity can generate tests that are better targeted at the argument type (for example, the Trinity web site notes that powers of two plus or minus one are often effective for triggering bugs associated with "length" arguments). The resulting tests performed by Trinity are consequently more sophisticated than traditional fuzz testers, and find new types of errors in system calls.
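The idea can be sketched in a few lines of Python. This is purely illustrative: Trinity itself is written in C, and its annotation tables and value generators are far richer than the hypothetical stand-ins below.

```python
import random

# Hypothetical sketch of Trinity-style annotation-driven fuzzing.
# The table entries and generators are illustrative stand-ins,
# not Trinity's real annotations.

OPEN_FDS = [0, 1, 2]  # stand-ins for the descriptors Trinity opens itself

def interesting_lengths():
    # Powers of two plus or minus one often trigger off-by-one and
    # overflow bugs in "length" arguments.
    vals = []
    for shift in range(1, 33):
        base = 1 << shift
        vals.extend((base - 1, base, base + 1))
    return vals

# Per-argument-type generators: instead of random bits, each argument
# type gets values plausible enough to pass the first validity checks.
ARG_GENERATORS = {
    "fd":   lambda: random.choice(OPEN_FDS),
    "len":  lambda: random.choice(interesting_lengths()),
    "addr": lambda: random.choice([0, 0x1000, 2**32 - 1]),
}

# Annotations describing the argument types each system call expects.
SYSCALL_TABLE = {
    "read":  ["fd", "addr", "len"],
    "write": ["fd", "addr", "len"],
}

def fuzz_args(syscall):
    # Build one set of fuzzed arguments for the named system call.
    return [ARG_GENERATORS[t]() for t in SYSCALL_TABLE[syscall]]
```

Because the file descriptor argument is always a real, open descriptor, such a call gets past the kernel's EBADF check and exercises deeper code paths, which is precisely the point of the annotations.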
Ted Ts'o asked whether it's possible to bias the tests performed by Trinity in favor of particular kernel subsystems. In response, Dave noted that Trinity can be directed to open the file descriptors that it uses for testing off a particular filesystem (for example, an ext4 partition).
Dave stated that Trinity is run regularly against the linux-next tree as well as against Linus's tree. He noted that Trinity has found bugs in the networking code, filesystem code, and many other parts of the kernel. One of the goals of his talk was simply to encourage other developers to start employing Trinity to test their subsystems and architectures. Trinity currently supports the x86, ia64, powerpc, and sparc architectures.
Mel Gorman's talk slot was mainly concerned with improving the discovery of performance regressions. He noted that, in the past, "we talked about benchmarking for patches when they get merged. But there's been much inconsistency over time." In particular, he called out the practice of writing commit changelog entries that simply give benchmark statistics from running a particular benchmarking tool as being nearly useless for detecting regressions.
Mel would like to see more commit changelogs that provide enough information to perform reproducible benchmarks. Leading by example, Mel uses his own benchmarking framework, MMTests, and he has posted historical results from kernels 2.6.32 through to 3.4. What he would like to see is changelog entries that, in addition to giving benchmark results, identify the benchmark framework they use and include (pointers to) the specific configuration used with the framework. (The configuration could be in the changelog, or if too large, it could be stored in some reasonably stable location such as the kernel Bugzilla.)
H. Peter Anvin responded that "I hope you know how hard it is for submitters to give us real numbers at all." But this didn't deter Mel from reiterating his desire for sufficient information to reproduce benchmarking tests; he noted that many regressions take a long time to be discovered, which increases the importance of being able to reproduce past tests.
Ted Ts'o observed that there seemed to be a need for a per-subsystem approach to benchmarking. He then asked whether individual subsystems would even be able to come to consensus on what would be a reasonable set of metrics, and noted that those metrics should not take too long to run (since metrics that take a long time to execute are unlikely to be executed in practice). Mel offered that, if necessary, he would volunteer to help write configuration scripts for kernel subsystems. From there, discussion moved into a few other related topics, without reaching any firm resolutions. However, performance regressions are a subject of great concern to kernel developers, and the topic of reproducible benchmarking is one that will likely be revisited soon.
The "distributions and upstream" session of day 1 of the 2012 Kernel Summit focused on a question enunciated by Ted Ts'o: "From an upstream perspective, how can we better help distros?" Responding to that question were two distributor representatives: Ben Hutchings for Debian and Dave Jones for Fedora.
Ben Hutchings asked that, when considering merging a new feature, kernel developers not accept the argument that "this feature is expensive, but that's okay because we'll make it an option". He pointed out that this argument is based on a logical fallacy, since in nearly every case distributions will enable the option, because some users will need it. As an example, Ben mentioned memory cgroups (memcg), which, in their initial release, imposed a significant performance cost.
A second point that Ben made was that there are still features that distributions are adding that are not being merged upstream. As an example from last year, he mentioned Android. As a current example, he noted the union mounts feature, which is still not upstream. Inasmuch as keeping features such as these outside of the mainline kernel creates more work for distributions, he would like to see such features more actively merged.
Dave Jones made three points. The first of these was that a lot of Kconfig help texts are "really awful". As a consequence, distribution maintainers have to read the code in order to work out if a feature should be enabled.
Dave's second point was that it would be useful to have an explicit list of regressions at around the -rc3 or -rc4 point in the release cycle; his problem is that regressions often become visible only much later. Finally, Dave noted that Fedora sees a lot of reports from lockdep that no other distributions seem to see. The basic problem underlying both of these points is of course lack of early testing, and at this point Ted Ts'o mused: "can we make it easier for users to run the kernel-of-the-day [in particular, -rc1 and rc2 kernels] and allow them to easily fall back to a stable kernel if it doesn't work out?" There was, however, no conclusive response in the ensuing discussion.
Returning to the general subject of Kconfig, Matthew Garrett echoed and elaborated on one of the points made by Ben Hutchings, noting that Kconfig is important for kernel developers (so that they can strip down a kernel for fast builds). However, because distributors will nearly always enable configuration options (as described above), kernel developers need to ask themselves, "If you don't expect an option to be enabled [by distributors], then why is the option even present?". In passing, Andrea Arcangeli noted one of his pet irritations, one with which most people who have ever built a kernel will be familiar. When running make oldconfig, it is very easy to overstep as one types Enter to accept the default "no" for most options; one suddenly realizes that the answer to an earlier question should have been "yes". At that point of course, there is no way to go back, and one must instead restart from the beginning. (Your editor observes that improving this small problem could be a nice way for a budding kernel hacker to get their hands dirty.)
The lightning talks on day 1 of the 2012 Kernel Summit were over in, one could say, a flash. There were just two very brief discussions.
Paul McKenney noted that a small number of read-copy update (RCU) users have for some time requested the ability to offload RCU callbacks. Normally, RCU callbacks are invoked on the CPU that registered them. This works well in most cases, but it can result in unwelcome variations in the execution times of user processes running on the same CPU. This kind of variation (also known as operating system jitter) can be reduced by offloading the callbacks—arranging for that CPU's RCU callbacks to be invoked on some other CPU. Paul asked if the ability to offload RCU callbacks was of interest to others in the room. A number of developers responded in the affirmative.
Dan Carpenter noted the existence of Smatch, his static analysis tool that detects various kinds of errors in C source code, pointing out that by now "many of you have received emails from me". (The emails that he referred to contained kernel patches and lists of bugs or potential bugs in kernel code. In the summary of his LPC 2011 presentation, Dan noted that Smatch has resulted in hundreds of kernel patches.) Dan's main point was simply to request other ideas from kernel developers on what checks to add to Smatch; he noted that there is a mailing list, firstname.lastname@example.org, to which suggestions can be sent.
The presentation given by Fengguang Wu on day 1 of the 2012 Kernel Summit was about testing for build and boot regressions in the Linux kernel. In the presentation, Fengguang described the test framework that he has established to detect and report these regressions in a more timely fashion.
To summarize the problem that Fengguang is trying to resolve, it's simplest to look at things from the perspective of a maintainer making periodic kernel releases. The most obvious example is of course the mainline tree maintained by Linus, which goes through a series of release candidates on the way to the release of a stable kernel. The linux-next tree maintained by Stephen Rothwell is another example. Many other developers depend on these releases. If, for some reason, those kernel releases don't successfully build and boot, then the daily work of other kernel developers is impaired while they resolve the problem.
Of course, Linus and Stephen strive to ensure that these kinds of build and boot errors don't occur: before making kernel releases, they do local testing on their development systems, and ensure that the kernel builds, boots, and runs for them. The problem comes in when one considers the variety of hardware architectures and configuration options that Linux provides. No single developer can test all combinations of architectures and options, which means that, for some combinations, there are inevitably build and boot errors in the mainline -rc and linux-next releases. These sorts of regressions appear even in the final releases performed by Linus; Fengguang noted the results found by Geert Uytterhoeven, who reported that (for example) in the Linux 3.4 release, his testing found around 100 build error messages resulting from regressions. (Those figures are exaggerated because some errors occur on obscure platforms that see less maintainer attention. But they include a number of regressions on mainstream platforms that have the potential to disrupt the work of many kernel developers.) Furthermore, even when a build problem appears in a series of kernel commits but is later fixed before a mainline -rc release, this still creates a problem: developers performing bisects to discover the causes of other kernel bugs will encounter the build failures during the bisection process.
As Fengguang noted, the problem is that it takes some time for these regressions to be detected. By that time, it may be difficult to determine what kernel change caused the problem and who it should be reported to. Many such reports on the kernel mailing list get no response, since it can be hard to diagnose user-reported problems. Furthermore, the developer responsible for the problem may have moved on to other activities and may no longer be "hot" on the details of work that they did quite some time ago. As a result, there is duplicated effort and lost time as the affected developers resolve the problems themselves.
According to Fengguang, these sorts of regressions are an inevitable part of the development process. Even the best of kernel developers may sometimes fail to test for regressions. When such regressions occur, the best way to ensure they are resolved is to quickly and accurately determine the cause of the regression and promptly notify the developer who caused the regression.
Fengguang's solution is an automated system that detects these regressions and informs kernel developers by email that their commit X triggered bug Y. Crucially, the email reports are generated nearly immediately (one-hour response time) after commits are merged into the tested repositories. (For this reason, Fengguang calls his system a "0-day kernel test" system.) Since the relevant developer is informed quickly, it's more likely they'll be "hot" on the technical details, and able to fix the problem quickly.
Fengguang's test framework at the Intel Open Source Technology Center consists of a server farm that includes five build servers (three Sandy Bridge and two Itanium systems). On these systems, kernels are built inside chroot jails. The built kernel images are then boot tested inside over 100 KVM instances on another eight test boxes. The system builds and boots each tested kernel configuration, on a commit-by-commit basis for a range of kernel configurations. (The system reuses build outputs from previous commits so as to expedite the build testing. Thus, the build time for the first commit of an allmodconfig build is typically ten minutes, but subsequent commits require two minutes to build on average.)
Tests are currently run against Linus's tree, linux-next, and more than 180 trees owned by individual kernel maintainers and developers. (Running tests against individual maintainers' trees helps ensure that problems are fixed before they taint Linus's tree and linux-next.) Together, these trees produce 40 new branch heads and 400 new commits on an average working day. Each day, the system build tests 200 of the new commits. (The system allows trees to be categorized as "rebasable" or "non-rebasable". The latter are usually big subsystem trees for which the maintainers take responsibility to do bisectability tests before publishing commits. Rebasable trees are tested on a commit-by-commit basis. For non-rebasable trees, only the branch head is built; only if that fails does the system go through the intervening commits to locate the source of the error. This is why not all 400 of the daily commits are tested.)
The current machine power allows the build test system to test 140 kernel configurations (as well as running sparse and coccinelle) for each commit. Around half of these configurations are randconfig configurations, which are regenerated each day in order to increase test coverage over time. (randconfig builds the kernel with randomized configuration options, so as to test unusual kernel configurations.) Most of the built kernels are boot tested, including the randconfig ones. Boot tests for the head commits are repeated multiple times to increase the chance of catching less-reproducible regressions. In the end, 30,000 kernels are boot tested each day. In the process, the system catches four new static errors or warnings per day, and one boot error every second day.
The responses from the kernel developers in the room were extremely positive to this new system. Andrew Morton noted he'd received a number of useful reports from the tool. "All contained good information, and all corresponded to issues I felt should be fixed." Others echoed Andrew's comments.
One developer in the room asked what he should do if he has a scratch branch that is simply too broken to be tested. Fengguang replied that his build system maintains a blacklist, and specific branches can be added to that blacklist on request. In addition, a developer can include a line containing the string Dont-Auto-Build in a commit message; this causes the build system to skip testing of the whole branch.
Many problems in the system have already been fixed as a consequence of developer feedback: the build test system is fairly mature; the boot test system is already reasonably usable, but has room for further improvement. Fengguang is seeking further input from kernel developers on how his system could be improved. In particular, he is asking kernel developers for runtime stress and functional test scripts for their subsystems. (Currently the boot test system runs a limited set of tools—trinity, xfstests, and a handful of memory management tests—for catching runtime regressions.)
Fengguang's system has already clearly had a strong positive impact on the day-to-day life of kernel developers. With further feedback, the system is likely to provide even more benefit.
Anyone who has paid even slight attention to the progress of the mainlining of the Android modifications to the Linux kernel will be aware that the process has had its ups and downs. An initial attempt to mainline the changes via the staging tree ended in failure when the code was removed in kernel 2.6.33 in late 2010. Nevertheless, at the 2011 Kernel Summit, kernel developers indicated a willingness to mainline code from Android, and starting with Linux 3.3, various Android pieces were brought back into the staging tree. (On the Android side this was guided by the Android Mainlining Project.) The purpose of John Stultz's presentation on day 1 of the 2012 Kernel Summit was to review the current status of upstreaming of the Android code and outline the work yet to be done.
John began by reviewing the progress in recent kernel releases. Linux 3.3 reintroduced a number of pieces to staging, including ashmem, binder, logger, and the low-memory killer. With the Linux 3.3 release, it became possible to boot Android on a vanilla kernel. Linux 3.4 added some further pieces to the staging tree and also saw a lot of cleanup of the previously merged code. Subsequent kernels have seen further Android code move to the staging tree, including the wakeup_source feature and the Android Gadget driver. In addition, some code in the staging tree has been converted to use upstream kernel features; for example, Android's alarm-dev feature was converted to use the alarm timers feature added to Linux in kernel 3.0.
As of now (i.e., after the closure of the 3.6 merge window), there still remain some major features to merge, including the ION memory allocator. In addition, various Android pieces still remain in the staging tree (for example, the low-memory killer, ashmem, binder, and logger), and these need to be reworked (or replaced), so that the equivalent functionality is provided in the mainline kernel. However, one has the impression that these technical issues will all be solved, since there's been a general improvement in relations on both sides of the Android/upstream fence; John noted that these days there is much less friction between the two sides, more Android developers are participating in the Linux community, and the Linux community seems more accepting of Android as a project. Nevertheless, John noted a few things that could still be improved on the Android side. In particular, for many releases, the Android developers provided updated code branches for each kernel release, but in more recent times they have skipped doing this for some kernel releases.
Following John's presentation, there was relatively little discussion, which is perhaps an indication of the fact that kernel developers are reasonably satisfied with the current status and momentum of Android upstreaming. Matthew Garrett asked if John has any feeling about whether other projects are making use of the upstreamed Android code. In response, John noted that Android code is being used as the default Board Support Package for some projects, such as Firefox OS. He also mentioned that the volatile ranges code that he is currently developing has a number of potential uses outside of Android.
Matthew was also curious to know whether there is anything that the Linux kernel developers could do to help make the design process for features that are going into Android more open. Right now, most Android features are developed in-house, but perhaps a solution developed in the open might have satisfied other users' requirements as well. There was some back and forth as to how practical any other kind of model would be, especially given vendors' focus on product deadlines; the implicit conclusion was that anything other than the status quo was unlikely.
Overall, the current status of Android upstreaming is very positive, and certainly rather different from the situation a couple of years ago.
By several accounts, day one of this year's Kernel Summit was largely argument-free. There were plenty of discussions, even minor disagreements, but nothing approaching some of the battles of yore. Day three looked like it might provide an exception to that pattern with a discussion of two different patch sets that are both targeted at cryptographically signing kernel modules. In the end, though, the pattern continued, with an interesting, but tame, session.
Kernel modules are inserted into the running kernel, so a rogue module could be used to compromise the kernel in ways that are hard to detect. One way to prevent that from happening is to require that kernel modules be cryptographically signed using keys that are explicitly allowed by the administrator. Before loading a module, the kernel can check the signature and refuse to load any module that can't be verified. Those modules could come from a distribution or be built with a custom kernel. Since modules can be loaded based on a user action (e.g. attaching a device or using a new network protocol) or come from a third party (e.g. binary kernel modules), ensuring that only approved modules can be loaded is a commonly requested feature.
Rusty Russell, who maintains the kernel module subsystem, called the meeting to try to determine how to proceed on module signing. David Howells has one patch set that is based on what has been in RHEL for some time, while Dmitry Kasatkin posted another that uses the digital signature support added to the kernel for integrity management. Howells's patches have been around, in various forms, since 2004, while Kasatkin's are relatively new.
Russell prefaced the discussion with an admonishment that he was not interested in discussing the "politics, ethics, or morality" of module signing. He invited anyone who did want to debate those topics to a meeting at 8pm, which was shortly after he had to leave for his plane. The reason we will be signing modules, he said, is because Linus Torvalds wants to be able to sign his modules.
Kasatkin's approach would put the module signature in the extended attributes (xattrs) of the module file, Russell began, but Kasatkin said that choice was only a convenience. His patches are now independent of the integrity measurement architecture (IMA) and the extended verification module (EVM), both of which use xattrs. He originally used xattrs because of the IMA/EVM origin of the signature code he is using, and he did not want to change the module contents. Since then, he noted a response from Russell to Howells's approach and has changed his patches to add the module signature to the end of the file.
That led Russell into a bit of a historical journey. The original patches from Howells put the signature into an ELF section in the module file. But, because there was interest in having the same signature on both stripped and unstripped module files, there was a need to skip over some parts of the module file when calculating the hash that goes into the signature.
The amount of code needed to parse ELF was "concerning", Russell said. Currently, there are some simple sanity checks in the module-loading code, without any checks for malicious code because the belief was that you had to be root to load a module. While that is still true, the advent of things like secure boot and IMA/EVM has made checking for malicious code a priority. But Russell wants to ensure that the code doing that checking is as simple as possible to verify, which was not true when putting module signatures into ELF sections.
Greg Kroah-Hartman pointed out that you have to do ELF parsing to load the module anyway. There is a difference, though. If the module is being checked for maliciousness, that parsing happens after the signature is checked. Any parsing that is done before that verification is potentially handling untrusted input.
Russell would rather see the signature appended to the module file in some form. It could be a fixed-length signature block, as suggested by Torvalds, or there could be some kind of "magic string" followed by a signature. That would allow for multiple signatures on a module. Another suggestion was to change the load_module() system call so that the signature was passed in, which would "punt" the problem to user space "that I don't maintain anymore", Russell said.
Russell's suggestion was to just do a simple backward search from the end of the module file to find the magic string, but Howells was not happy with that approach for performance reasons. Instead, Howells added a 5-digit ASCII number for the length of the signature, which Russell found a bit inelegant. Looking for the magic string "doesn't take that long", he said, and module loading is not that performance-critical.
There were murmurs of discontent in the room about that last statement. There are those who are very sensitive about module loading times because it impacts boot speed. But, Russell said that he could live with ASCII numbers, as long as there was no need to parse ELF sections in the verification code. He does like the fact that modules can be signed in the shell, which is the reason behind the ASCII length value.
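The appended-signature scheme under discussion is easy to sketch. The following is a hypothetical illustration only — the magic string, the five-digit ASCII length, and the overall layout are invented for this example, and do not represent the format that was eventually merged. The verifier checks for a trailing magic marker, reads the ASCII length just before it, and from that locates the signature bytes without any ELF parsing:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: <module data><signature><5-digit ASCII len><MAGIC> */
#define SIG_MAGIC "~Sig~"   /* invented marker, not the real kernel magic */

/* Locate an appended signature; returns 0 on success. */
static int find_signature(const unsigned char *buf, size_t len,
                          const unsigned char **sig, size_t *sig_len)
{
    size_t magic_len = strlen(SIG_MAGIC);
    char lenbuf[6];
    size_t n;

    if (len < magic_len + 5)
        return -1;
    /* Check the trailing magic string. */
    if (memcmp(buf + len - magic_len, SIG_MAGIC, magic_len) != 0)
        return -1;
    /* The 5 ASCII digits just before the magic give the signature size. */
    memcpy(lenbuf, buf + len - magic_len - 5, 5);
    lenbuf[5] = '\0';
    n = (size_t)strtoul(lenbuf, NULL, 10);
    if (n > len - magic_len - 5)
        return -1;
    *sig = buf + len - magic_len - 5 - n;
    *sig_len = n;
    return 0;
}
```

This variant uses Howells's explicit-length idea; Russell's alternative would instead scan backward from the end of the file for the magic string, trading a little search time for a format that is easier to produce from a shell script.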
There are Red Hat customers asking for SHA-512 digests signed with 4K RSA keys, Howells said, but that may change down the road. That could make picking a size for a fixed-length signature block difficult. But, as Ted Ts'o pointed out, doing a search for the magic string is in the noise in comparison to doing RSA with 4K keys. The kernel crypto subsystem can use hardware acceleration to make that faster, Howells said. But, Russell was not convinced that the performance impact of searching for the magic string was significant and would like to see some numbers.
James Bottomley asked where the keys for signing would come from. Howells responded that the kernel build process can create a key. The public part would go into the kernel for verification purposes, while the private part would be used for signing. After the signing is done, that ephemeral private key could be discarded. There is also the option to specify a key pair to use.
Torvalds said that it was "stupid" to have stripped modules with the same signature as the unstripped versions. The build process should just generate signatures for both. Having logic to skip over various pieces of the module just adds a new attack point. Another alternative is to only generate signatures for the stripped modules as the others are only used for debugging and aren't loaded anyway, so they can be unsigned, he said. Russell agreed, suggesting that the build process could just call out to something to do the signing.
For binary modules, such as the NVIDIA graphics drivers, users would have to add the NVIDIA public key to the kernel ring, Peter Jones said.
Kees Cook brought up an issue that is, currently at least, specific to Chrome OS. In Chrome OS, there is a trusted root partition, so knowing the origin of a module would allow those systems to make decisions about whether or not to load them. Right now, the interface doesn't provide that information, so Cook suggested changing the load_module() system call (or adding a new one) that passed a file descriptor for the module file. Russell agreed that an additional interface was probably in order to solve that problem.
In the end, Russell concluded that there was a reasonable amount of agreement about how to approach module signing. He planned to look at the two patch sets, try to find the commonality between the two, and "apply something". In fact, he made a proposal, based partly on Howells's approach, on September 4. It appends the signature to the module file after a magic string as Russell has been advocating. As he said when wrapping up the discussion, his patch can provide a starting point to solving this longstanding problem.
Catalin Marinas led a discussion of kernel support for 64-bit ARM processors as part of day two of the ARM minisummit. He concentrated on the status of the in-flight patches to add that support, while pointing to his LinuxCon talk later in the week for more details about the architecture itself.
A second round of the ARM-64 patches was posted to the linux-kernel mailing list in mid-August. After some complaints about the "aarch64" name for the architecture, it was changed to "arm64", at least for the kernel source directory. That name will really only be seen by kernel developers as uname will still report "aarch64", in keeping with the ELF triplet used by the binaries built with GCC.
Some of the lessons learned from the ARM 32-bit support have been reflected in arm64. It will target a single kernel image by default, for example. That means that device tree support is mandatory for AArch64 platforms. Since there are not, as yet, any AArch64 platforms, the patches contain simplified platform code based on that of the Versatile Express.
There are two targets for AArch64 devices: embedded and server. It is possible that ACPI support will be required for the servers. As far as Marinas knows, there is no ACPI implementation out there, but it is not clear what Microsoft is doing in that area.
The code for generic timers and the generic interrupt controller (GIC) lives under the drivers directory. That code could be shared with arch/arm, but there is a need to #ifdef the inline assembly code.
There is an intent to push back on the system-on-a-chip (SoC) vendors regarding things like firmware initialization, boot protocol, and a standardized secure mode API. SoC vendors (and thus, their ARM sub-trees) should be providing the standard interfaces, rather than heading out on their own. The ARM maintainers can choose not to accept ports that do not conform.
That may work for devices targeted at Linux, but there may be SoC vendors who initially target another operating system, as Olof Johansson noted. There will likely need to be some give and take for things such as the boot protocol when Windows, iOS, or OS X targeted devices are submitted. Marinas said that the aim would be for standardization, but they "may have to cope" with other choices at times.
The first code from SoC vendors is not expected before the end of the year, Marinas said. Arnd Bergmann half-jokingly suggested that he would be happy to get a leaked version of that code at any time. The first SoCs might well just be existing 32-bit ARMv7 SoCs with an AArch64 CPU (aka ARMv8) dropped in. That may be the path for embedded applications, though the vendors targeting the server market are likely to be starting from scratch.
That led to a discussion of how to push the arm64 patches forward. Marinas would like to push the core architecture code forward, while working to clean up the example SoC code. He would like to target the 3.8 kernel for the core. Bergmann was strongly in favor of getting it all into linux-next soon, and targeting a merge for the 3.7 development cycle.
Marinas is concerned that including the SoC code will delay inclusion as it will require more review. He also wants to make sure that there is a clean base for those who want to use it as a basis for their own SoC code. That should take two weeks or so, Marinas said. He hopes to get it into linux-next sometime after 3.7-rc1, but Bergmann encouraged a faster approach. There is nothing very risky about doing so, Johansson pointed out, as a new architecture cannot break any existing code.
There is some concern about the 2MB limit on device tree binary (dtb) files because some network controllers (and other devices) may have firmware blobs larger than that. Bergmann noted that those blobs may not be able to be shipped in the kernel, but could be put into firmware and loaded from there. It turns out that the flattened device tree format already has a length entry in its header that can be used to support multiple dtbs, which will allow the 2MB limit to be worked around.
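The workaround relies on the fact that every flattened device tree begins with a fixed header containing a big-endian totalsize field at byte offset 4. A loader can therefore step from one dtb to the next in a concatenated blob. The sketch below illustrates the idea; the helper names are invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDT_MAGIC 0xd00dfeedU  /* flattened device tree magic number */

/* Read a big-endian 32-bit value from the blob. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count concatenated dtbs by stepping over each header's totalsize
 * field (offset 4 in the fdt header). */
static int count_dtbs(const uint8_t *buf, size_t len)
{
    int count = 0;
    size_t off = 0;

    while (off + 8 <= len && be32(buf + off) == FDT_MAGIC) {
        uint32_t total = be32(buf + off + 4);
        if (total < 8 || off + total > len)
            break;  /* malformed header; stop walking */
        off += total;
        count++;
    }
    return count;
}
```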
The existing arm64 emulation does not have any DMA, so support for that feature is currently untested. In addition, some SoCs are likely to only support 32-bit DMA. Bergmann suggested an architecture-independent implementation that used dma_ops pointers to provide both coherent and non-coherent versions, but Marinas would like to do something simpler (i.e. coherent only) to start with. Since the "hardware" currently lacks DMA, "all DMA is coherent" seems like a reasonable model, Bergmann said. Since no one will be affected by any bugs in the code, he suggested getting it into linux-next as soon as possible.
Tony Lindgren asked if ARM maintainer Russell King had any comments on the patches. Marinas said that there were not many, at least so far. Bergmann said that he didn't think King was convinced that having a separate arm64 directory (as opposed to adding 64-bit support to the existing arm directory) was the right approach.
Many of the decisions were made for ARM 15 years ago, Marinas said, and some of those make it messy to drop arm64 on top of arm. Some day, when the arm tree only supports ARMv7, it may make sense to merge with arm64. The assembly code cannot be shared, because they are two different architectures, Bergmann said. In addition, the system calls cannot be shared and the platform code is going to be done very differently for arm64, he said.
But, there is room for sharing some things between the two trees, Marinas said. That includes some of the device tree files, perf, the generic timer, the GIC driver code, as well as KVM and Xen if and when they are merged. In theory, the ptrace() and signal-handling code could be shared as well.
Progress is clearly being made for arm64, and we will have to wait and see how quickly it can make its way into the mainline.
The ARM big.LITTLE architecture is an asymmetric multi-processor platform, with powerful and power-hungry processors coupled with less-powerful (in both senses) CPUs using the same instruction set. Big.LITTLE presents some challenges for the Linux scheduler. Paul McKenney gave a readout of the status of big.LITTLE support at the ARM minisummit, which he really meant to serve as an "advertisement" for the scheduling micro-conference at the Linux Plumbers Conference that started the next day.
The idea behind big.LITTLE is to do frequency and voltage scaling by other means, he said. Because of limitations imposed by physics, there is a floor to frequency and voltage scaling on any given processor, but that can be worked around by adding another processor with fewer transistors. That's what has been done with big.LITTLE.
There are basically two ways to expose the big.LITTLE system to Linux. The first is to treat each pair as a single CPU, switching between them "almost transparently". That has the advantage that it requires almost no changes to the kernel and applications don't know that anything has changed. But, there is a delay involved in making the switch, which isn't taken into account by the power management code, so the power savings aren't as large as they could be. In addition, that approach requires paired CPUs (i.e. one of each size), but some vendors are interested in having one little and many big CPUs in their big.LITTLE systems.
The other way to handle big.LITTLE is to expose all of the processors to Linux, so that the scheduler can choose where to run its tasks. That requires more knowledge of the behavior of processes, so Paul Turner has a patch set that gathers that kind of information. Turner said that the scheduler currently takes averages on a per-CPU basis, but when processes move between CPUs, some information is lost. His changes cause the load average to move with the processes, which will allow the scheduler to make better decisions.
Turner's patches are on their third revision, and have been "baking on our systems at Google" for a few months. There are no real to-dos outstanding, he said. Peter Zijlstra said that he had wanted to merge the previous revision, but that there was "some funky math" in the patches, which has since been changed. Turner said that he measured a 3-4% performance increase using the patches, which means we get "more accurate tracking at lower cost". It seems likely that the patches will be merged soon.
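The core idea of per-entity load tracking can be sketched in a few lines: each task's recent runnable history is accumulated as a geometrically decayed sum, with each period's contribution halving after 32 further periods, and that sum travels with the task when it migrates. The code below is a simplified floating-point model of the idea for illustration only, not the kernel's fixed-point implementation:

```c
#include <assert.h>
#include <math.h>

/* Decay factor: a period's contribution halves every 32 periods
 * (approximately 0.5^(1/32)), as in the load-tracking patches. */
#define DECAY_Y 0.97857206

/* Fold one more period into a task's load sum: decay the history,
 * then add the fraction of the period the task was runnable. */
static double update_load(double sum, double runnable_fraction)
{
    return sum * DECAY_Y + runnable_fraction;
}
```

A task that is always runnable converges toward the geometric-series limit 1 / (1 - y), while an idle task's sum halves every 32 periods — which is what lets the scheduler distinguish steadily busy tasks from bursty ones when placing them.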
McKenney said that Turner's patches have been adapted by Morten Rasmussen to be used on big.LITTLE systems. The measurements are used to try to determine where a task should be run. Over time, though, the task's behavior can change, so the scheduler checks to see if that has happened and if the placement still makes sense. There are still questions about when "race to idle" versus spreading tasks around makes the most sense, and there have been some related discussions of that recently on the linux-kernel mailing list.
Currently, the CPU hotplug support is less than ideal for removing CPUs that have gone idle. But Thomas Gleixner is reworking things to "make hotplug suck less", McKenney said. For heavy workloads, the process of offlining a processor can take multiple seconds. After Gleixner's rework, that drops to 300ms for an order of magnitude decrease. Part of the solution is to remove stop_machine() calls from the offlining path. There are multiple reasons for making hotplug work better, McKenney said, including improving read-copy update (RCU), reducing realtime disruption, and providing a low-cost way to clear things off of a CPU for a short time. He also noted that it is not an ARM-only problem that is being solved here, as x86 suffers from significant hotplug delays too.
The session finished up with a brief discussion of how to describe the architecture of a big.LITTLE system to the kernel. Currently, each platform has its own way of describing the processors and caches in its header files, but a more general way, perhaps using device tree or some kind of runtime detection mechanism, is desired.
Generic DMA engines are present in many ARM platforms to enable devices to move data between main memory and device-specific regions. Arnd Bergmann led a discussion about the DMA engine APIs as part of the last day of the ARM minisummit. DMA is the last ARM subsystem that does not have generic device tree bindings, he said, so he hoped the assembled developers could agree on some. Without those bindings, the code that uses DMA is forced to be platform-specific, which impedes progress toward the goal of building a single kernel image for multiple ARM platforms.
Bergmann said that there are many things currently blocked by the lack of device tree bindings for DMA. Those bindings need to describe the kinds of DMA channels available in the hardware, along with their attributes. Two proposals have been made to add support for the generic DMA engines. Jon Hunter has a patch set that implements a particular set of bindings, but he couldn't attend the meeting, so Bergmann presented them. The other patches were from DMA engine maintainer Vinod Koul.
The differences between the two are a bit hard to decipher. Both approaches attempt to keep any information about how to set up DMA channels from both the device driver using them and from the DMA engine driver that provides them. That knowledge would reside in the DMA engine core. With Koul's patches, there would be a global lookup table that would be populated by the platform-specific code from various sources (device tree, ACPI, etc.). That table would list the connections between devices and DMA engine drivers. Hunter's patches solve the problem simply for the device tree case, without requiring interaction with the platform-specific code.
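The lookup-table approach can be illustrated with a toy model — all names and structures here are invented for the example, not Koul's actual patches. Platform code registers device-to-engine mappings in a central table, and a device driver then requests its channel by its own name alone, so neither the driver nor the DMA engine driver needs to know how the connection was configured:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry in the (hypothetical) central DMA routing table. */
struct dma_map_entry {
    const char *dev_name;    /* requesting device */
    const char *engine_name; /* DMA engine that serves it */
    int         channel;     /* channel number on that engine */
};

static struct dma_map_entry dma_table[16];
static size_t dma_table_len;

/* Called by platform code, which may have read the mapping from
 * the device tree, ACPI, or board files. */
static void dma_register_mapping(const char *dev, const char *engine, int chan)
{
    if (dma_table_len < 16) {
        dma_table[dma_table_len].dev_name = dev;
        dma_table[dma_table_len].engine_name = engine;
        dma_table[dma_table_len].channel = chan;
        dma_table_len++;
    }
}

/* A driver asks only by its own name; the routing knowledge stays
 * in the core, not in the driver or the engine. */
static const struct dma_map_entry *dma_lookup_channel(const char *dev)
{
    for (size_t i = 0; i < dma_table_len; i++)
        if (strcmp(dma_table[i].dev_name, dev) == 0)
            return &dma_table[i];
    return NULL;
}
```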
The discussion got technically quite deep, as Bergmann admitted with a grin after the session, but the upshot is that the two approaches are not completely at odds. At the end of the session, it was agreed that both patches could be merged ("more or less", Koul said). The DMA engine core would be able to find the connection in either the device tree or via the lookup table, but will use the same device driver interfaces either way. Bergmann said that he hoped to see something in the 3.7 kernel. In between those two discussions, some things about the device tree bindings were hammered out as well.
One of the first problems noted with the bindings described in Hunter's patch was the use of numerical values (derived from flag bits) to describe attributes of DMA channels. "These magic numbers are not a readability triumph", Mark Brown said. He went on to suggest adding some kind of preprocessor support to the device tree compiler (dtc), which turns the text representation into a flattened device tree binary (dtb). That would make the flags readable, Tony Lindgren said, but he wondered if such a preprocessor was "years off".
One way around the magic number problem is to use names instead, though dealing with strings in device tree is difficult, Bergmann said. Some platforms have complicated arrangements of controllers and DMA engines, he said, using an example of an MMC (memory card) controller with two channels, one of which is connected to three different DMA engines. In order to make the request API for a DMA channel relatively simple, it would make sense to name each channel, someone suggested. One problem there is that most devices (80% perhaps) either have a single channel or just one for each direction, Bergmann said. Forcing those devices to explicitly name them adds complexity.
But most were in favor of using the names. In addition to naming the channels, standardizing the property names would make it easier to scan the whole device tree for properties of interest. Allowing devices to come up with their own property names will make that impossible. Also, when new functional units that implement DMA get added to a platform, standardized names will make it easier to incorporate them into existing device trees. So, names for each of a device's channels, along with a standard set of property names, would seem to be in the cards.
This was the last non-hacking session in the ARM minisummit, which seemed to be a great success overall. Some issues that had been lingering were discussed and resolved—or at least plans to do so were made. In addition, the status of some newer features (e.g. big.LITTLE and AArch64) was presented, so that questions could be raised and answered in real time, rather than over a sometimes slow mailing list or IRC channel. Beyond the discussions, both afternoons featured hacking sessions where it sounds like some real work got done.
[ I would like to thank Will Deacon and Arnd Bergmann for reviewing parts of the ARM minisummit coverage, though any remaining errors are mine, of course. ]
Page editor: Jonathan Corbet
Distributions
Ubuntu's new application upload process highlights its vision of the desktop and what the project thinks needs to be done to make things happen there.
Serious Linux users tend not to think of availability of software as a problem; distribution repositories typically carry tens of thousands of packages, after all, and any of those packages can be installed with a single command. The problem with distribution repositories, from Ubuntu's point of view, is that they can be stale and inaccessible to application developers. The packages in the repository tend to date from before a given distribution release's freeze date; by the time an actual distribution gets onto a user's machine, the applications found there may be well behind the curve. In some cases, applications may have lost their relevance entirely; as Steve Langasek put it:
Beyond that, getting a package into a distribution's repository is not something just anybody can do; developers must either become a maintainer for a specific distribution or rely on somebody else to create and add a package for their application. And, in most distributions, there is no place in the repository at all for proprietary applications.
Ubuntu's owner Canonical sees these problems as significant shortcomings that are holding back the creation of applications for the Linux desktop; that, in turn, impedes the development and adoption of Linux as a whole. So, a few years back, Canonical set out to remedy these problems through the creation of the Ubuntu Software Centre (USC), a repository by which developers could get applications to their users quickly. The USC is not tied to the distribution release cycle; applications added there become available to users immediately. There is a mechanism for the handling of payments, allowing proprietary applications to be sold to users. A glance through the USC shows a long list of applications (some of which are non-free) and other resources like fonts and electronic books. Guides to nearby beer festivals are, alas, still in short supply.
Naturally, Canonical does not want to provide an unsupervised means by which arbitrary software can be installed on its users' systems. Experience shows that it would not take long for malware authors, spammers, and others to make their presence felt. So the process for putting an application into the USC involves a review step. For paid applications, for which Canonical takes a 20% share of the price, there appears to be a fully-funded mechanism that can review and place applications quickly. For free applications, though, review is done by a volunteer board and that group, it seems, has been having a hard time keeping up with the workload. The result is long delays in getting applications into the USC, discouraged developers, and frustration all around.
The new upload process proposal aims to improve the situation for free applications; Canonical does not seem to intend to change the process for paid applications. There are a number of changes intended to make life easier for everybody involved, but the key would appear to be this:
In other words, Canonical wants to make the process as automatic as possible, but not so automatic that Bad Things make it into the USC.
The first step requires developers to register with the USC, then request access to upload one or more specific packages. Getting that access will require convincing Canonical that they hold the copyrights to the code or are otherwise authorized to do the upload; it will apparently not be possible for third parties to upload software without explicit permission, even if the software is licensed in a way that would allow that to happen. A review board will look at the uploader's application and approve it if that seems warranted.
Once a developer has approval, there are a few more steps involved in putting an application into the USC. The first is to package it appropriately with the Quickly tool and submit it for an upload. That is mostly basic packaging work. Uploads through this mechanism will be done in source form; binaries will, it seems, be built within the USC itself.
But, before the application can be made available, it must be accompanied by a security policy. The mechanism is superficially similar to the privilege scheme used by Android, but the USC bases its security on the AppArmor mandatory access control mechanism instead. The creation of a full AppArmor profile can be an involved process; Canonical has tried to make things simpler by automating most of the work. The uploader need only declare the specific access privileges needed by the application. These include access to the X server, access to the network, the ability to print, and use of the camera. Interestingly, access to spelling checkers requires an explicit privilege.
All (free) USC applications will run within their own sandbox with limited access to the rest of the system. Only files and directories found in a whitelist will be accessible, for example. Applications will be prevented from listening to (or interfering with) any other application's X server or D-Bus communications. There will be a "helper" mechanism by which applications can request access to non-whitelisted files; the process will, inevitably, involve putting up a dialog and requiring the user to allow the access to proceed. That, naturally, will put some constraints on what these applications can usefully do; it is hard to imagine a new compiler working well in this environment, for example. The payoff is that, with these restrictions in place, it should not be possible for any given application to damage the system or expose information that the user does not want disclosed.
And, with all that structure in place, Canonical feels that it is safe to allow applications into the USC without the need for a manual review. That should enable applications to get to users more quickly while taking much of the load off the people who are currently reviewing uploads.
Current USC practice requires all files to be installed under /opt; this rule complies with the filesystem hierarchy standard and prevents file conflicts with the rest of the distribution. The problem, according to David Planella (one of the authors of the proposal), is that a lot of things just don't work when installed under /opt:
In other words, the /opt restriction was seen as making life difficult for developers and Ubuntu lacks the resources and will to fix the problems; the restriction has thus been removed in the proposal. With Ubuntu, Debian, and USC packages all installing files into the same directory hierarchy, an eventual conflict seems certain. There has been talk of forcing each USC package to use its own subdirectory under /usr, a solution that, evidently, is easier than /opt, but nothing has been settled as of this writing.
Presumably some solution will be found and something resembling this proposal will eventually be put into place. The result should be a leaner, faster USC that makes it possible to get applications to users quickly. Whether that will lead to the fabled Year of the Linux Desktop remains to be seen. The "app store" model has certainly helped to make other platforms more attractive; if its absence has been one of the big problems for Linux, we should find out fairly soon.
Now, there are many reasons for that: difficulty of publishing is far from the only one. But it would be a subtle error to think that an application not existing for Ubuntu at all means that difficulty of publishing is unimportant. It may be one of the reasons nobody bothered to develop the application in the first place.
openSUSE
As part of the Tumbleweed lifecycle, with the 12.2 release of openSUSE, "the openSUSE:Tumbleweed repo is now empty so that you can start out with a 'clean' 12.2 release. It will stay that way for a few weeks for things to settle down with 12.2, and then will start to add packages back to it (new kernel, KDE 4.9, etc.) as time permits."
Ubuntu family
"Despite our best intentions and the Ubuntu App Review Board's epic efforts, we're currently putting a strain on reviewers (who cannot keep up with the incoming stream of apps) and providing an unsatisfactory experience for app authors (who have to endure long delays to be able to upload their apps)." In response, the Ubuntu developers have come up with a detailed proposal for a new process; comments are sought. "We should not rely on manual reviews of software before inclusion. Manual reviews have been found to cause a significant bottleneck in the MyApps queue and they won't scale effectively as we grow and open up Ubuntu to thousands of apps."
Newsletters and articles of interest
Page editor: Rebecca Sobol
GStreamer is a framework designed for application development, but the memory and processing demands of multimedia mean that it leans heavily on the support of the operating system's underlying media layers. At the 2012 GStreamer Conference, representatives from Video4Linux, ALSA, and Wayland were on hand to report on recent developments and ongoing work in the world of Linux media capture, sound, and display technology.
Hans Verkuil presented a session on the Video4Linux (V4L) subsystem, which primarily handles video input, along with related matters. The major change in the V4L arena, he said, has been the emergence of the system-on-chip (SoC). In the desktop paradigm of years past, V4L had relatively simple hardware to deal with: video capture cards and webcams, the majority of which had similar capabilities. SoCs are markedly different: many include discrete components like hardware decoders and video scalers, and the system provides a flexible AV pipeline, with multiple ways to route data through the on-board components depending on the processing needed.
Initially most SoC vendors wrote their own, proprietary modules to make up for the features V4L lacked, he said, but V4L has caught up. The core framework now includes a v4l2_subdev structure to communicate with sub-devices like decoders and scalers. Although these devices can vary from board to board in theory, he said, in practice most vendors tend to stick with the same parts over many hardware generations. There is also a new Media Controller API to manage multi-function devices (including USB webcams with an integrated microphone, in addition to the flexible SoC routing mentioned above), and the 3.1 kernel introduced a new control framework that provides a consistent interface for brightness, contrast, frame rate, and other settings.
V4L's roots were in the standard-definition era, so the project has also struggled to make life easier for HDTV users. The initial attempt was the Presets API in kernel 2.6.33, which provided fixed settings for video in a handful of HDTV formats (720p30, 1080p60, etc.). That API eventually proved too coarse for vendors, and was replaced in kernel 3.5 with the Timings API, which allows custom modeline-like video settings. The Event API is another recent addition, significantly improved in 3.1, which allows code to subscribe to immediate notification on events like the connection or disconnection of an input port.
The videobuf2 framework is another major overhaul; the previous incarnation of the framework (which provides an abstraction layer between applications and video device drivers) did not conform to V4L's own API, and its memory management was so flawed that most drivers did not even use it. The new framework separates buffer operations from memory management operations and, by removing the need for each driver to implement its own memory management, should simplify device driver code significantly.
Other noteworthy changes include support for the H.264 codec, new input cropping controls, and the long-awaited ability for radio tuners to tune multiple frequency bands (such as FM and AM). Radio Data System (RDS) support has also been upgraded, and now includes the Traffic Message Channel (TMC) coding used in many urban areas. Cisco hired a student for the summer to write a new RDS library to replace the older, broken one. Finally, a contiguous memory allocator was written by Samsung and others for kernel 3.5, which helps video hardware allocate the large chunks of physically contiguous memory it needs for direct memory access.
There is further work still in the pipeline, of course, and Verkuil mentioned three topics of importance to GStreamer. The first is buffer sharing; video decoding pipelines would prefer to avoid copying large buffers whenever possible, but currently V4L's video buffers are specific to an individual video node. Integrating V4L with DMAbuf is probably the solution, he said, and is likely to arrive in kernel 3.8. The second is better support for newer video connector types like HDMI and DisplayPort — in particular hot-pluggability and signal detection, for use by embedded devices that need to set up these connections without user intervention. Finally, he hopes to complete a V4L compliance testing tool, which he described as 90% finished. The tool is used to test device drivers against the API, and drivers are required to pass before they get into the kernel. Verkuil said that the tool is actually stricter than the published API, because it checks for a number of optional features which are easy to implement, and can annoy users if they are left out.
Takashi Iwai presented an update on the ALSA subsystem. In recent years, ALSA has not seen as many major changes as the various video subsystems have, but there are still plenty of challenges. The first is that, like video, more and more hardware devices now support decoding compressed audio in hardware. Kernel 3.3 added an API for offloading audio decoding to a hardware device, though the bigger improvement is likely to be kernel 3.7's merger of compressed audio hardware decoding for the ALSA System on Chip (ASoC) layer.
ASoC accounts for the majority of ALSA code (both in terms of lines and number of commits), Iwai said, followed by the HD-audio layer used in the majority of modern laptops. The third-largest component is USB-audio, which provides a single generic driver used by all USB audio devices. But while USB devices can share a common driver, the HD-audio layer covers roughly 4000 devices, each of which has a different configuration (in regard to which pin performs which function). It is not possible for the ALSA project to maintain and update 4000 separate configuration files, he said, so it instead relies on user reports to discover differences between hardware. That is a pain point, but hardware vendors tend to reuse the same pin configurations, so most devices work out of the box.
Ongoing work in ALSA includes the Use Case Manager (UCM) abstraction layer, a high-level device management layer that describes hardware routing and configuration for common tasks like "phone call" or "music playback." Jack detection is another continuing development. Currently there is no API to detect whether or not a connector has a jack plugged in, so multiple methods are in use, including Android's external connector class extcon and ALSA's general controls API.
Also still in the works is improved power management, both for HD-audio devices and for hardware decoders. Improvements are expected to land with kernel 3.7. HD-audio devices might also benefit from the ability to "patch" device firmware and change the pin configuration, so that recompiling the driver can be avoided.
The biggest outstanding issue at present is a channel mapping API, which encodes the surround-sound position associated with the speaker attached to each output channel (e.g., Front Left, Center, Right Rear, Low-Frequency Effects). Each channel needs to receive its own PCM audio stream, but there are multiple standards on the market, and the problem becomes even trickier when the system needs to combine channels for a setup with fewer speakers. There is a proposal in the works, which was discussed at length later in the week at the Linux Plumbers Conference audio mini-summit.
Kristian Høgsberg presented an update on the Wayland display protocol and how it will differ from X. The session was not overly GStreamer-specific, but more of an introduction to Wayland. Since Wayland is not being used in the wild yet, preparing GStreamer developers in advance should simplify the eventual transition.
Høgsberg related the reasons for Wayland's creation — namely that as separate window managers and compositors have become the norm on Linux desktops, the X server itself is increasingly doing little but acting as a middleman. Many of the earlier functions of the X server have been moved out into separate libraries, such as Freetype, Fontconfig, Qt, and GTK+. Other key functions, such as mode-setting and input devices, are handled at lower levels, and many applications use Cairo or OpenGL to paint their window contents. Compositing was the final blow, however: in a compositing desktop, each window gets a private buffer of its own, which is drawn to the screen by the compositor. In this situation, X does nothing but add cost: another copy operation for the buffer, and more memory.
He described the basics of the Wayland protocol, which he said he expected to reach 1.0 status before the end of the year. That event will not mark Wayland's world domination, however. Weston, the reference compositor, already runs on most video hardware, but the major desktop projects and distributions will each implement their own Wayland support in their existing compositors (e.g., Mutter or KWin), and that is when the majority of users will first encounter Wayland.
The more practical section of the talk followed, an explanation of how Wayland handles video content. An application allocates a pixel buffer and shares it with the compositor; the compositor then attaches the buffer to an output "surface." Whenever a new frame is drawn to the screen, the compositor sends a notification to the application, which can then send the next frame. The big difference is that Wayland always works with complete frames. In contrast, X is fundamentally a stream protocol: it sends a series of events that must be de-queued and processed.
Video support is really only a matter of extending the color spaces that Wayland understands, he said. A video buffer may contain YUV data, for example. Wayland needs to be able to put YUV data into a rendering surface, and to composite RGB and YUV data together (such as in a video overlay).
This is still a work in progress, with a variety of options under consideration. One would allow only RGB buffers, and require client applications to handle the conversion, which could be costly in CPU usage. Another is to decode the frames directly into OpenGL textures and let OpenGL worry about the conversions. A third is to allocate shared-memory YUV buffers, then have the compositor copy them into OpenGL textures and perform the conversion at composite time. The entire puzzle is further complicated when one adds in the possibility of hardware-decoded video content, which is increasingly common. If the possibilities sound a tad confusing, do not worry: Høgsberg said the project itself is still unsure which approach will be best.
GStreamer's video acceleration API (VA-API) plugin already supports Wayland, so whichever path Wayland takes as it finalizes 1.0, GStreamer support should follow in short order. Of course, GStreamer itself is also preparing for its 1.0 release. But as the Wayland, ALSA, and Video4Linux talks demonstrate, multimedia support on Linux is in an ever-changing state.
Newsletters and articles
Page editor: Nathan Willis
Brief items

"The Document Foundation will primarily focus on the ODF Technical Committees, to represent the largest independent free software community focused on the development and the promotion of "the best free office suite" based on the Open Document Format. LibreOffice is available in over 100 native language versions, more than twice than any comparable software, and is therefore the most sophisticated, feature rich, complete and widespread ODF implementation worldwide."
Articles of interest
Calls for Presentations

"The call for papers is public, meaning that all proposals get published on the website for others to vote and comment on. This approach allows the organizers to pick subjects that have most interest in the community. The comments are only visible to speakers and organizers to avoid influencing the votes."
Upcoming Events

"LPI will also present on the subject of "Catching the Wave of Open Source Careers" during OLF's Career Track."
|DjangoCon US||Washington, DC, USA|
|Hardening Server Indonesia Linux Conference 2012||Malang, Indonesia|
|International Conference on Open Source Systems||Hammamet, Tunisia|
|Debian FTPMaster sprint||Fulda, Germany|
|Debian Bug Squashing Party||Berlin, Germany|
|KPLI Meeting Indonesia Linux Conference 2012||Malang, Indonesia|
|PyTexas 2012||College Station, TX, USA|
|Bitcoin Conference||London, UK|
|SNIA Storage Developers' Conference||Santa Clara, CA, USA|
|Postgres Open||Chicago, IL, USA|
|SUSECon||Orlando, Florida, US|
|Automotive Linux Summit 2012||Gaydon/Warwickshire, UK|
|2012 X.Org Developer Conference||Nürnberg, Germany|
|openSUSE Summit||Orlando, FL, USA|
|September 21||Kernel Recipes||Paris, France|
|GNU Radio Conference||Atlanta, USA|
|OpenCms Days||Cologne, Germany|
|PuppetConf||San Francisco, US|
|PyCon UK 2012||Coventry, West Midlands, UK|
|September 28||LPI Forum||Warsaw, Poland|
|Ohio LinuxFest 2012||Columbus, OH, USA|
|PyCon India 2012||Bengaluru, India|
|Velocity Europe||London, England|
|PyCon South Africa 2012||Cape Town, South Africa|
|GNOME Boston Summit 2012||Cambridge, MA, USA|
|Korea Linux Forum 2012||Seoul, South Korea|
|Open Source Developer's Conference / France||Paris, France|
|Debian Bug Squashing Party in Utrecht||Utrecht, Netherlands|
|October 13||2012 Columbus Code Camp||Columbus, OH, USA|
|Debian BSP in Alcester (Warwickshire, UK)||Alcester, Warwickshire, UK|
|PyCon Ireland 2012||Dublin, Ireland|
|FUDCon:Paris 2012||Paris, France|
|OpenStack Summit||San Diego, CA, USA|
|Linux Driver Verification Workshop||Amirandes, Heraklion, Crete|
|LibreOffice Conference||Berlin, Germany|
|MonkeySpace||Boston, MA, USA|
|14th Real Time Linux Workshop||Chapel Hill, NC, USA|
|PyCon Ukraine 2012||Kyiv, Ukraine|
|Gentoo miniconf||Prague, Czech Republic|
|PyCarolinas 2012||Chapel Hill, NC, USA|
|LinuxDays||Prague, Czech Republic|
|openSUSE Conference 2012||Prague, Czech Republic|
|PyCon Finland 2012||Espoo, Finland|
|PostgreSQL Conference Europe||Prague, Czech Republic|
|Droidcon London||London, UK|
|PyData NYC 2012||New York City, NY, USA|
|Firebird Conference 2012||Luxembourg, Luxembourg|
|October 27||pyArkansas 2012||Conway, AR, USA|
|October 27||Central PA Open Source Conference||Harrisburg, PA, USA|
|October 27||Linux Day 2012||Hundreds of cities, Italy|
|Technical Dutch Open Source Event||Eindhoven, Netherlands|
|Ubuntu Developer Summit - R||Copenhagen, Denmark|
|Linaro Connect||Copenhagen, Denmark|
|PyCon DE 2012||Leipzig, Germany|
|October 30||Ubuntu Enterprise Summit||Copenhagen, Denmark|
|MeetBSD California 2012||Sunnyvale, California, USA|
|OpenFest 2012||Sofia, Bulgaria|
|Embedded Linux Conference Europe||Barcelona, Spain|
|LinuxCon Europe||Barcelona, Spain|
|ApacheCon Europe 2012||Sinsheim, Germany|
|Apache OpenOffice Conference-Within-a-Conference||Sinsheim, Germany|
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds