The development cycle for GStreamer 1.0 took longer than many (including some in the project itself) had originally anticipated. A big part of the reason was the GStreamer team's desire to deliver a stable and well-rounded 1.0 release — but that does not mean that the 1.0 milestone designated a "completed" product with no room for improvement. Several sessions at the 2012 GStreamer Conference in San Diego explored what is yet to come for the multimedia framework, including technical improvements and secure playback for untrusted content.
GStreamer release manager Tim-Philipp Müller gave the annual status report, which both recapped the previous year's developments and set the stage for what lies ahead. Several new content formats need support, including the various flavors of Digital Video Broadcasting (DVB) television (all of which are based on MPEG-2), the new streaming standard Dynamic Adaptive Streaming over HTTP (MPEG-DASH) in both server and client implementations, and the 3D Multiview Video Coding (MVC) format. There is also room for improving the MPEG Transport Stream (MPEG TS) demultiplexer, where Müller said "lots of stuff is happening," to the point where it can be confusing to follow. GStreamer also still lacks support for playlists, which is a very common feature that ends up being re-implemented by applications.
In addition to the media formats, there are several additional subtitle formats that the framework needs to support. But subtitle support requires extension in other areas as well, such as porting all of the subtitle elements to the new overlay compositor API, which allows an application to offload compositing to the video hardware. A related feature request is for a way to overlay subtitles at the native resolution of the display hardware, rather than at the video content's resolution. The two resolutions can be different, and to be readable subtitles should be rendered at the sharpness provided by the display resolution. The project also wants to expose more control over subtitle rendering options to the application, again to provide smarter choices and clearer rendering.
Hardware accelerated rendering has taken major steps forward in recent releases, but it, too, has room for improvement. Müller mentioned NVIDIA's Video Decode and Presentation API for Unix (VDPAU) as needing work, and said the libva plugin that implements Video Acceleration API (VA-API) support needed to be moved to the "good" plugin module and be used for playing back more content. He also said more work was required on GStreamer's OpenGL support. Although OpenGL output is possible, a lot could be done to improve it and make integration more natural. For starters, all OpenGL-based GStreamer elements must currently be routed through special glupload or gldownload elements; being able to directly connect OpenGL elements to other video elements would simplify coding for application developers. Second, OpenGL coding is easier when operations remain in a single thread, which conflicts with GStreamer's heavy use of multi-threading. There is a long list of other proposed OpenGL improvements, including numerous changes to the OpenGL structures.
The project is also intent on playing better with other device form factors, including set-top boxes and in-vehicle systems. In some cases, there is already outside work that simply needs to be tracked more closely — for example, the MeeGo project's in-vehicle infotainment (IVI) platform wrote its own metadata extraction plugin that reportedly has excellent performance, but has not been merged into GStreamer. In other cases, the project will need to implement entirely new features, such as Digital Living Network Alliance (DLNA) functionality and other "smart TV" standards from the consumer electronics industry.
GStreamer developers have plenty of room to improve the framework's day-to-day functionality, too. Müller noted that the new GStreamer 1.0 API introduces reworked memory management features (to reduce overhead by cutting down costly buffer copy operations), but that many plugins still need optimization in order to fully take advantage of the improvements. It is also possible that the project could speed up the process of probing elements for their features by removing the current GstPropertyProbe and relying on D-Bus discovery. There is also room for improvement in stream switching. As he explained, you certainly do not want to decode all eight audio tracks of a DVD video when you are only listening to one of them, but when users switch audio tracks, they expect the new one to start playing immediately and without hiccups.
Some refactoring work may take place as well. A big target is the gstreamer-plugins-bad module, which is huge in comparison to the gstreamer-plugins-good and gstreamer-plugins-ugly modules. Historically, the "good" module includes plugins that are high-quality and freely redistributable, the "ugly" module includes plugins with distributability problems, and the "bad" module contains everything else not up to par. But plugins can end up in -bad for a variety of reasons, he said: some because they do not work well, others because they are missing documentation or are simply still in development. Splitting the module up (perhaps adding a gstreamer-plugins-staging) would simplify maintenance. The project is also considering moving its Bluetooth plugins out of the BlueZ code base and into GStreamer itself, again for maintenance reasons. Post-1.0 development will also allow the project to push forward on some of its add-on layers, such as the GStreamer Streaming Server (GSS) and GStreamer Editing Services (GES) libraries.
Finally, there have been several improvements to the GStreamer developer tools in the past year, including the just-launched SDK and Rene Stadler's log visualizer. Collecting those utilities into some sort of "GStreamer tools" package could make life easier for developers. The project is committed to accelerating its development cycle for the same reason: faster releases mean improvements get pushed out to application authors sooner, and code does not stagnate relying on old releases. Müller announced that the project was switching to a more mainstream N.odd unstable, N.even stable numbering scheme, with the addendum that the framework will stick to 1.x numbers until there is the need to make an ABI break for 2.0.
On a different note, Guillaume Emont presented a session about his ongoing experiments with sandboxing GStreamer media playback. The principal use case is playing web-delivered content inside a browser. The Internet may have been invented to watch videos of cute animals, he said, but that does not mean you should trust arbitrary data found online. In particular, untrusted data is dangerous when used in combination with complex pieces of software like media decoders.
For media playback, the security risk stems from the fact that although the decoder itself should not be considered evil and untrustworthy (as one might regard a Java applet), the process becomes untrustworthy when it must handle untrusted data. Thus, GStreamer should be able to use the same decoder plugin on untrusted and trusted content, but when handling untrusted content the framework must have a way to initialize the player, then drop its privilege level to isolate it.
Emont's work with sandboxing GStreamer playback started with setuid-sandbox, a standalone version of the sandbox from Google's Chromium browser. Setuid-sandbox creates a separate PID namespace and chroot for the sandboxed process. Although it is not very fine-grained, Emont thought it a good place to start and produced a working implementation of a sandboxed GStreamer playback pipeline.
The pipeline takes the downloaded content as usual and writes it to a file descriptor sink (fdsink element). When the fdsink element reaches the READY state, the file descriptor is opened by an fdsrc element inside a setuid-sandbox, where the content is then demultiplexed, decoded into GStreamer buffers, and written to a shmsink shared memory sink. The shmsink is the last stage in the sandboxed process; outside the sandbox, the pipeline accesses the shared memory and plays back the contents. This design sandboxes the demultiplexing and decoding steps in the pipeline, which Emont said were the most likely to contain exploitable bugs.
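As a rough sketch, the two halves of that design can be expressed as gst-launch-1.0 pipeline descriptions. This is illustrative only: fdsrc, oggdemux, theoradec, shmsink, and shmsrc are real GStreamer elements, but the file descriptor number, socket path, caps, and codec choice are assumptions, and the setuid-sandbox wrapper itself is elided.

```shell
# Sandboxed half: parse and decode the untrusted data, then hand the
# raw frames out through shared memory. In Emont's design this pipeline
# runs inside setuid-sandbox with its privileges dropped.
SANDBOXED="fdsrc fd=3 ! oggdemux ! theoradec \
  ! shmsink socket-path=/tmp/video-shm wait-for-connection=true"

# Trusted half: read the already-decoded frames back out of shared
# memory and display them; no untrusted parsing happens on this side.
TRUSTED="shmsrc socket-path=/tmp/video-shm \
  ! video/x-raw,format=I420,width=1280,height=720,framerate=24/1 \
  ! videoconvert ! autovideosink"

# Print the commands rather than executing them, since setting up the
# sandbox itself is out of scope for this sketch.
echo "gst-launch-1.0 $SANDBOXED"
echo "gst-launch-1.0 $TRUSTED"
```

Note that the trusted side must specify the raw video caps explicitly, because the shared-memory transport carries no format negotiation of its own.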
The playback pipeline worked, he said, but there were several issues. First, he discovered that many GStreamer elements do not acquire all of their resources by the time they reach the READY state, though they do by the time they reach the PAUSED state that follows. It might be possible to modify these elements to get their resources earlier, he said, or to add an ALL_RESOURCES_ACQUIRED signal. Next, he noted that the memory created by the shmsink inside the sandbox could not be cleaned up by the sandboxed process, but only by the "broker" portion of the pipeline outside the sandbox. A more noticeable problem was that sandboxing the decoder made it impossible to seek within the file. Finally, the sandboxing process as a whole adds significant overhead; Emont reported that a 720p Theora video would consume 30-40% of the CPU inside the sandboxed pipeline, compared to 20-30% under normal circumstances.
Some of the problems (such as the READY/PAUSED state issue and the lack of seekability) might be solvable by sandboxing the entire pipeline, he said, or by adding proxy elements to allow for remote pipeline control. Either way, going forward there is still a lot of work to do.
It is also possible that setuid-sandbox is simply not the best sandboxing solution. There are others that Emont said he was interested in trying out for comparison. He outlined the options and their various pros and cons. Seccomp, for example, is even less flexible, which probably makes it a poor replacement. On the other hand, seccomp's new mode that combines with Berkeley Packet Filters (BPF) provides a much greater degree of control. It also has the advantage of being usable without end-user intervention. SELinux, in contrast, could be used to define a strict playback policy, but it is under the control of the machine's administrators. GStreamer and application developers could make suggestions for users, but ultimately SELinux is not under the developers' control. Finally, Emont did his experiments on Linux, but in the long term GStreamer really needs a sandboxing framework that is cross-platform, and perhaps provides some sort of fallback mechanism between different sandboxing options.
Emont's work is still experimental, and more to the point he is not conducting it as part of GStreamer's core development. But he did make a good case for its eventual inclusion. Certainly any part of a large framework like GStreamer has bugs and therefore the potential to be exploited by an attacker. But isolating the un-decoded media payload from the rest of the system already goes a long way toward protecting the user. As did Müller's talk, Emont's presentation shows that GStreamer may reach 1.0 soon, but it is still far from "complete."
The problem, simply put, is this: the objective of secure boot is to prevent the system from running any unsigned code in a privileged mode. So, if one boots a Linux system that, in turn, gives access to the machine to untrusted code, the entire purpose has been defeated. The consequences could hurt both locally (bad code could take control of the machine) and globally (the signing key used to boot Linux could be revoked), so it is an outcome that is worth avoiding. Doing so, however, requires placing limitations in the kernel so that not even root can circumvent the secure boot chain of trust.
The form of those limitations can now be seen in Matthew Garrett's secure boot support patch set. These patches may see some changes before finding their way into the mainline, but chances are that their overall form will not evolve that much.
The first step is to add a new capability bit. Capabilities describe privileged operations that a given process can perform; they vary from CAP_DAC_OVERRIDE (able to override file permissions) to CAP_NET_BIND_SERVICE (can bind to a low-numbered TCP port) to CAP_SYS_ADMIN (can do a vast number of highly privileged things). The new capability, called CAP_SECURE_FIRMWARE, enables actions that are not allowed in the secure boot environment. Or, more to the point, its absence blocks actions that might otherwise enable the running of untrusted code.
Naturally, the first thing reviewers complained about was the name. It describes actions that can be performed in the absence of "secure firmware"; some reviewers have also disputed whether it has anything to do with security in the first place. So the capability will probably be renamed, though nobody has come up with an obvious replacement yet.
Whatever it is eventually called, this capability will normally be available to privileged processes. If the kernel determines (by asking the firmware) that it has been booted in the secure mode, though, this capability will be removed from the bounding set before init is run; once a capability is removed from that set, no process can ever obtain it. Matthew's patch set also adds a boot-time parameter (secureboot_enable=) that can be used to simulate a secure boot on hardware that lacks that feature.
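The bounding-set mechanism itself can be observed on any running Linux system, secure boot or not: the CapBnd line in /proc/self/status is the hexadecimal mask of capabilities the process can still acquire. A quick sketch:

```shell
# Read the capability bounding set of the current process. Once a bit
# has been cleared from this mask -- as Garrett's patches do before
# init runs in secure boot mode -- no descendant can ever set it again.
capbnd=$(grep '^CapBnd:' /proc/self/status | cut -f2)
echo "bounding set mask: $capbnd"
```

On an ordinary system run as root, every bit is typically set; under the patches, booting in secure mode would show the new capability's bit cleared everywhere.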
In the secure boot world, processes lacking the new capability can no longer access I/O memory or x86 I/O ports. Either of those could be used to convince a device to overwrite the running kernel with hostile code using DMA, compromising the system, so they cannot be allowed. One consequence is that graphics cards without kernel mode setting (KMS) support cannot be used; fortunately, the number of systems with (1) UEFI firmware and (2) non-KMS graphics is probably countable using an eight-bit signed value. Other user-space device drivers will be left out in the cold as well. Someday, Matthew says, it may be possible to enable I/O access on systems where the I/O memory management unit can enforce restrictions on the range of DMA operations, but, for now, all such access is denied.
Similarly, all write access to /dev/mem and /dev/kmem must be disabled, even if the kernel configuration would otherwise allow such access.
The strongest comments came in response to another limitation — the disabling of the kexec() system call. This call replaces the running kernel with a new kernel and boots the result without going through the system's firmware. It can be used for extra-fast reboots, though the most common use, arguably, is to boot a special kernel to create a crash dump after a system failure. Booting an arbitrary kernel obviously goes against the spirit of secure boot, so it cannot be allowed.
Eric Biederman, in particular, complained about this limitation, saying:
Matthew responded that, in fact, we can't always trust root, and never have trusted it fully:
In this case, the proper solution would appear to be to allow kexec() to succeed if the target kernel has been properly signed. That support has not yet been implemented, though. It's apparently on the to-do list, but, as Matthew said: "We ship with the code we have, not the code we want."
One other important piece of the puzzle, of course, is module loading; if unsigned modules can be loaded into the kernel, the game is over. But, unlike kexec(), module loading cannot simply be turned off, so the implementation of some sort of signing mechanism cannot be put off. The module signing implementation is not part of Matthew's patch set, though; instead, David Howells has been working on the problem for some time now. This code has been delayed as the result of strong disagreements on how signing should be implemented; a solution was worked out at the 2012 Kernel Summit and this feature, in the form of a new patch set from Rusty Russell, should find its way into the mainline as soon as the 3.7 development cycle.
The end result is that, by the time users have machines with UEFI secure boot capabilities, the kernel should be able to do its part. Whether users will like the result is another story. There is great value in knowing that the system is running the software you want it to be running, and many users will appreciate that. But others may find that the system is refusing to run the software they want; that is harder to appreciate. If things go well, the restrictions required by UEFI secure boot will come to be seen like other capability-based restrictions in Linux: occasionally obnoxious, but good for the long-term stability of the system and ultimately circumventable if need be.
At LinuxCon 2012, Bradley Kuhn, executive director of the Software Freedom Conservancy (SFC), presented a session on funding free software development. SFC's primary mission is to provide organizational and legal support to free software projects, but it has also been successful at raising funds to support development time — a task that many projects find difficult.
Kuhn started the discussion with an account of his introduction to free software, which began when he accidentally hit a key sequence in Emacs that brought up the text of Richard Stallman's GNU Manifesto. Reading the Manifesto was inspirational, said Kuhn, who has subsequently pursued a career in free software — even serving as director of the Free Software Foundation (FSF).
But on this occasion, he told the story not just as an introduction, but also to point out an oft-overlooked section of the document. Toward the end of the Manifesto, Stallman discusses several possible alternatives to the proprietary software funding model. Stallman argues that (contrary to the common objection that "no one will code for free") free software will always have developers, they will just earn smaller salaries than they would writing proprietary software. He cites examples of people who take jobs writing software in not-for-profit situations like MIT's Artificial Intelligence Lab, and says that free software is no different. Developers do tend to move to higher-paying jobs when they can work on the same projects, he said, but there are many who write free software out of commitment to the ideals.
Stallman suggests several alternative funding models under which developers could make money working on free software. One is a "Software Tax" in which software users each pay a small amount into a general National Science Foundation (NSF)-like fund that makes grants to developers. Another is that hardware manufacturers will underwrite porting efforts; a third is that user groups will form and collect money through dues, then pay developers with it.
Few people remember it, Kuhn said, but in the early days FSF itself functioned much like one of the user groups Stallman describes in the Manifesto. It accepted donations and directly paid developers to work on GNU software. A long list of core projects, including GNU Make, glibc, and GDB, were originally written by paid FSF employees. It was only later, as these original developers took jobs working on free software at companies like Red Hat and Google, that FSF turned its primary attention to advocacy issues.
Today, Kuhn said, the majority of free software is written by for-profit companies. Although that situation is a boon for free software, the resulting code bases tend to drift in the direction of the company's needs. He then quoted Samba's Jeremy Allison (a Google employee) as saying "It's the duty of all Free Software developers to steal as much time as they can from their employers for software freedom." Since not everyone is in a position to "be a Jeremy," Kuhn said, some developers need to be funded by non-profit organizations in order to mitigate the risks of for-profit control.
But proliferation of free software non-profits can be detrimental: it confuses users, and each organization has administrative overhead (boards, officers, and legal filings) that can steal time from development. There are several "umbrella" non-profits that attempt to offload the administrative overhead from the developers, including the Apache Software Foundation (ASF), Software in the Public Interest (SPI), and the SFC.
In addition to the administrative and legal functions of these organizations, each has some mechanism for funding or underwriting software development for its members. Donations to the ASF go into a general fund, from which individual member projects can apply for disbursement for specific work. SFC and SPI use a different model, in which each member project has separate earmarked funds.
Most of SFC's disbursement goes toward funding developer travel to conferences and workshops, Kuhn said. It also handles financial arrangements for conference organizing, Google Summer of Code, and other contracts, but the most interesting thing it does is manage paid contracts for software developers. Typically these contracts are fixed-length affairs that raise targeted funds for the contract through donation drives, as opposed to, for example, earmarking funds that accumulate through an ongoing donation button on the project's web site.
Kuhn recounted several recent success stories from different SFC member projects. The first was the Twisted engine for Python. Back in 2008, the project was confronted with a familiar scenario: it was successful enough that many core developers got high-paying jobs working on Twisted consulting, which in turn led to bit-rot of core functionality. The project decided to hold a fundraising drive, and collected enough donations to pay founder Jean-Paul Calderone to work for two years on bug-squashing, integration, and maintenance of the core — work that was vital to the project, but not exciting enough to attract a full-time position from the typical corporate Twisted user.
In 2010, SFC did a similar fundraising drive to pay Matt Mackall to maintain the Mercurial source code management system. Mackall said he was able to support himself full-time on Linux kernel-space development, but that it was hard to repeatedly "context switch" to Python userspace and work on Mercurial. The SFC fundraising drive funded Mackall full time from April 2010 through June 2012.
The PyPy Python interpreter project launched three successful fundraising initiatives in one year to support specific development projects. The initiatives for PyPy's Py3k implementation of Python 3 and its port of the Numpy scientific computing package each raised $42,000 in drives held a month apart in late 2011. The project has also raised more than $21,000 and counting this year to fund development of software transactional memory support. Kuhn related that he had been concerned at one point that the frequency of the fundraising drives would wear out the potential donor pool, but the project forged ahead, and SFC is now funding four PyPy developers.
A member of the audience asked what SFC thought about using Kickstarter for fundraising, to which Kuhn replied "who is going to Kickstarter for Python stuff who isn't also reading your blog?" PyPy's recent success, he explained, probably owes more to the fact that PyPy is a hot commodity in Python circles right now. It has little trouble finding donors as a result, but by raising the funds through drives hosted at its own site, it avoids having to pay Kickstarter or another broker a potentially hefty cut of the donations.
The tough part, he continued, is what to do when you are no longer on top of the popularity bubble. Free software has a big "I gave at the office" problem, he said. Many of free software's most passionate users (and thus potential donors) already spend their own time working on free software. Consequently, they react to fundraising efforts with questions like "I code all day long, now you want me to give money, too?"
Kuhn did not offer any simple solutions to the ongoing fundraising issue, but perhaps that is because there are none. Like Yorba, SFC is interested in exploring the possibility of funding free software projects, which makes Kuhn's report on SFC's successes an interesting counterpart to Yorba director Adam Dingle's examination of other funding methods.
It is clear that SFC's success stories differ from generic Kickstarter or bounty-style drives in a few key respects. First, they are tied to funding work by well-known contributors with good standing in the projects — often key maintainers. Second, they are tied to a development contract of specific length. But they still differ in other important details: although the PyPy initiatives were also tied to a specific feature set, the Twisted and Mercurial drives were done to fund the harder-to-price tasks of bug fixing and routine maintenance. Free software development is not a homogeneous process, so there is certainly no one-size-fits-all answer to the fundraising question. But it is reassuring to know that organizations like SFC (with its commitment to software freedom) can still find success where money is involved.
While the Linux Security Summit (LSS) was held later in the week, it was logically part of the minisummits that accompanied the Kernel Summit—organizer James Morris made a forward-reference report on LSS as part of the minisummit reports. Day one was filled with talks on various topics of interest to the assembled security developers, while day two was mostly devoted to reports from the kernel security subsystems. We plan to write up much of LSS over the coming weeks; the first installment covers a talk given by SELinux developer Dan Walsh on secure Linux containers.
Walsh's opening slide had a picture of a "secure" Linux container (a plastic "unix ware" storage container), but his talk was a tad more serious. Application sandboxes are becoming more common for isolating general-purpose applications from each other. There are a variety of Linux tools that can be used to create sandboxes, including seccomp, SELinux, the Java virtual machine, and virtualization. The idea behind sandboxing is the age-old concept of "defense in depth".
There is another mechanism that can be used to isolate applications: containers. When most people think of containers, they think of LXC, which is a command-line tool created by IBM. But, the Linux kernel knows nothing about containers, per se, and LXC is built atop Linux namespaces. The secure containers project did not use LXC directly; instead it uses libvirt-lxc.
Using namespaces, child processes can have an entirely different view of the system than does the parent. Namespaces are not all that new; RHEL5 and Fedora 6 used the pam_namespace module to partition logins into "secret" vs. "top secret", for example. The SELinux sandbox also used namespaces and was available in RHEL6 and Fedora 8. More recently, Fedora 17 uses systemd, which has PrivateTmp and PrivateNetwork directives for unit files that can be used to give services their own view of /tmp or the network. There are 20-30 services in Fedora 17 that are running with their own /tmp, Walsh said.
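For a concrete idea of what those directives look like, a minimal unit-file fragment using them might read as follows (a hypothetical sketch; the directive names are real systemd ones, documented in systemd.exec(5)):

```ini
# Give a service its own private /tmp and cut it off from the network.
[Service]
PrivateTmp=yes
PrivateNetwork=yes
```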
In addition, Red Hat offers the OpenShift service which allows anyone to have their own Apache webserver for free on Red Hat servers. It is meant to remove the management aspect so that developers can concentrate on developing web applications that can eventually be deployed elsewhere. Since there are many different Apache instances running on the OpenShift servers, sandboxing is used to keep them from interfering with each other.
There are several different kinds of namespaces in Linux. The mount namespace gives processes their own view of the filesystem, while the PID namespace gives them their own set of process IDs. The IPC and Network namespaces allow for private views of those resources, and the UTS namespace allows the processes to have their own host and domain names. The UID namespace is another that is not yet available, and one that concerns Walsh because of its intrusiveness. It would give a private set of UIDs, such that UID 0 inside of the namespace is not the same as root outside.
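The kernel exposes a process's namespace memberships under /proc, which makes the concept easy to poke at. A small sketch, runnable on any reasonably recent Linux system:

```shell
# Each entry under /proc/self/ns/ is a symbolic link whose target names
# the namespace instance, e.g. "mnt:[4026531840]". Two processes share
# a given namespace exactly when these targets match.
mnt_ns=$(readlink /proc/self/ns/mnt)
net_ns=$(readlink /proc/self/ns/net)
echo "mount namespace:   $mnt_ns"
echo "network namespace: $net_ns"
```

A process inside one of Walsh's containers would show different link targets than the host for the namespaces it has unshared.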
Secure Linux containers uses libvirt-lxc to set up namespaces that effectively create containers to hold processes that are isolated from those in other containers. Libvirt-lxc has a C API, but also has bindings for several different higher-level languages. It can set up a container, with a firewall, SELinux type enforcement (TE) and multi-category security (MCS), bind mounts that pass through to the host filesystem, and so on. Once that is done, it can start an init process (systemd in this case) inside the container so that it appears to be almost a full Linux system inside the container. In addition, these containers can be managed using control groups (cgroups) so that no one container can monopolize resources like memory or CPU.
But, libvirt-lxc has a complex API that is XML-based. Walsh wanted something simpler, so he created libvirt-sandbox with a key-value based configuration. He intends to replace the SELinux sandbox using libvirt-sandbox, but it is not quite ready for that yet.
To make things even easier, Walsh created a Python script that makes it "dirt simple" for an administrator to build a container or set of containers. He said that Red Hat is famous for building "cool tools that no one uses" because they are too complicated, so he set out to make something very simple to use.
The tool can be used as follows:
virt-sandbox-service create -C -u httpd.service apache1

That call will do multiple things under the covers. It creates a systemd unit file for the container, which means that standard systemd commands can be used to manage it. In addition, if someone puts a GUI on systemd someday, administrators can use that to manage their containers, he said. It also creates the filesystems for the container. It does not use a full chroot(), Walsh said, because he wants to be able to share /usr between containers. For this use case (an Apache web server container), he wants the individual containers to pick up any updates that come from doing a yum update on the host.
It also clones the /var and /etc configuration files into its own copy. In a perfect world, the container would bind mount over /etc, but it can't do that, partly because /etc has so many needed configuration files ("/etc is a cesspool of garbage" was his colorful way of describing that). In addition, it allocates a unique SELinux MCS label that restricts the processes inside the container. "Containers are not for security", he said, because root inside the container can always escape, so the container gets wrapped in SELinux to restrict it.
Once the container has been created, it can be started with:
virt-sandbox-service start apache1

Similarly, the stop command can terminate the container. One can also use the connect command to get a shell in the container.
virt-sandbox-service execute -C ifconfig apache1

will run a command in the container. For example, there is no separate cron running in each of the containers; instead, execute is used to do things like logrotate from the host's cron.
The systemd unit file that gets created can start and stop multiple container instances with a single command. Beyond that, using the ReloadPropagatedFrom directive in the unit file will allow an update of the host's apache package to restart all of the servers in the containers. So:
systemctl reload httpd.service

will trigger a reload in all container instances, while:
systemctl start httpd@.service

will start up all such services (which means all of the defined containers).
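The generated template unit is roughly of this shape (a hypothetical sketch, not the exact file virt-sandbox-service writes; %i is systemd's instance-name specifier, replaced with, for example, "apache1"):

```ini
# /etc/systemd/system/httpd@.service -- sketch of a per-container
# template unit. ReloadPropagatedFrom= makes a "systemctl reload
# httpd.service" on the host propagate into every running instance.
[Unit]
Description=Sandboxed Apache container %i
ReloadPropagatedFrom=httpd.service

[Service]
ExecStart=/usr/bin/virt-sandbox-service start %i
ExecStop=/usr/bin/virt-sandbox-service stop %i

[Install]
WantedBy=multi-user.target
```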
This is all recent work, Walsh said. It works "relatively well", but still needs work. There are other use cases for these containers, beyond just the OpenShift-like example he used. For instance, the Fedora project uses Mock to build packages, and Mock runs as root. That means there are some 3000 Fedora packagers who could do "bad stuff" on the build systems, so putting Mock into a secure container would provide better security. Another possibility would be to run customer processes (e.g. Hadoop) on a GlusterFS node. Another service that Walsh has containerized is MySQL, and more are possible.
Walsh demonstrated virt-sandbox-service at the end of his talk, showing some of the differences inside and outside of the container, including a surprising answer to getenforce inside the container. It reports that SELinux is disabled, but that is a lie, he said, to stop various scripts from trying to do SELinux things within the container. He also showed that the eth0 device inside the container did not even appear in the host's ifconfig output (nor, of course, did the host's wlan0 appear in the container).
A number of steps have been taken to try to prevent root from breaking out of the container, but there is more to be done. Both mount and mknod, for example, will fail inside the container. These containers are not as secure as full virtualization, Walsh said, but they are much easier to manage than handling the multiple full operating systems that virtualization requires. For many use cases, secure containers may be the right fit.
Our editorial team and content monitors almost immediately noticed a flood of livid Twitter messages about the ban and attempted to restore the broadcast. Unfortunately, we were not able to lift the ban before the broadcast ended. We had many unhappy viewers as a result, and for that I am truly sorry. As a long-time Firefly, Stargate and Game of Thrones fan among others, I am especially disheartened by this.
Created: September 5, 2012; Updated: September 11, 2012
Description: From the CVE entry:
Auth/Verify/LDAP.pm in Bugzilla 2.x and 3.x before 3.6.11, 3.7.x and 4.0.x before 4.0.8, 4.1.x and 4.2.x before 4.2.3, and 4.3.x before 4.3.3 does not restrict the characters in a username, which might allow remote attackers to inject data into an LDAP directory via a crafted login attempt.
Created: September 4, 2012; Updated: April 5, 2013
Description: From the Mandriva advisory:
A denial of service flaw was found in the way Fetchmail, a remote mail retrieval and forwarding utility, performed base64 decoding of certain NTLM server responses. Upon sending the NTLM authentication request, Fetchmail did not check if the received response was actually part of the NTLM protocol exchange, or a server-side error message and session abort. A rogue NTLM server could use this flaw to cause the fetchmail executable to crash.
Package(s): gimp; CVE #(s): CVE-2012-2763 CVE-2012-3236
Created: September 4, 2012; Updated: November 9, 2012
Description: From the CVE entries:
Buffer overflow in the readstr_upto function in plug-ins/script-fu/tinyscheme/scheme.c in GIMP 2.6.12 and earlier, and possibly 2.6.13, allows remote attackers to execute arbitrary code via a long string in a command to the script-fu server. (CVE-2012-2763)
fits-io.c in GIMP before 2.8.1 allows remote attackers to cause a denial of service (NULL pointer dereference and application crash) via a malformed XTENSION header of a .fit file, as demonstrated using a long string. (CVE-2012-3236)
Created: September 5, 2012; Updated: April 9, 2013
Description: gnome-keyring seems to obey the configuration asking it to stop caching passphrases, but after a while it neither caches the passphrase nor asks for it. See the Red Hat bugzilla for details.
Created: September 4, 2012; Updated: September 6, 2012
Description: From the Red Hat bugzilla:
A security flaw was found in the XMPP Dialback protocol implementation of jabberd2, OpenSource server implementation of the Jabber protocols (Verify Response and Authorization Response were not checked within XMPP protocol server to server session). A rogue XMPP server could use this flaw to spoof one or more domains, when communicating with vulnerable server implementation, possibly leading into XMPP's Server Dialback protections bypass.
Package(s): java-1.6.0-openjdk; CVE #(s): CVE-2012-0547 CVE-2012-1682
Created: September 4, 2012; Updated: October 19, 2012
Description: From the Red Hat advisory:
It was discovered that the Beans component in OpenJDK did not perform permission checks properly. An untrusted Java application or applet could use this flaw to use classes from restricted packages, allowing it to bypass Java sandbox restrictions. (CVE-2012-1682)
A hardening fix was applied to the AWT component in OpenJDK, removing functionality from the restricted SunToolkit class that was used in combination with other flaws to bypass Java sandbox restrictions. (CVE-2012-0547)
Package(s): java-1.7.0-openjdk; CVE #(s): CVE-2012-3136 CVE-2012-4681
Created: September 4, 2012; Updated: April 19, 2013
Description: From the Red Hat advisory:
Multiple improper permission check issues were discovered in the Beans component in OpenJDK. An untrusted Java application or applet could use these flaws to bypass Java sandbox restrictions.
Package(s): keystone; CVE #(s): CVE-2012-3542 CVE-2012-3426
Created: September 4, 2012; Updated: November 29, 2012
Description: From the Ubuntu advisory:
Dolph Mathews discovered that OpenStack Keystone did not properly restrict to administrative users the ability to update users' tenants. A remote attacker that can reach the administrative API can use this to add any user to any tenant. (CVE-2012-3542)
Derek Higgins discovered that OpenStack Keystone did not properly implement token expiration. A remote attacker could use this to continue to access an account that has been disabled or has a changed password. (CVE-2012-3426)
Created: August 30, 2012; Updated: September 6, 2012
Description: From the Mageia advisory:
This security update for Mariadb corrects a problem that is not yet being publicly disclosed.
In addition, a problem preventing the feedback plugin from working has been corrected.
Created: September 6, 2012; Updated: April 10, 2013
Description: From the Red Hat bugzilla entry:
Mesa, as used in Google Chrome before 21.0.1183.0 on the Acer AC700, Cr-48, and Samsung Series 5 and 5 550 Chromebook platforms, and the Samsung Chromebox Series 3, allows remote attackers to execute arbitrary code via unspecified vectors that trigger an "array overflow."
Created: September 6, 2012; Updated: September 18, 2012
Description: From the Debian advisory:
It was discovered that Moin, a Python clone of WikiWiki, incorrectly evaluates ACLs when virtual groups are involved. This may allow certain users to have additional permissions (privilege escalation) or lack expected permissions.
Created: August 31, 2012; Updated: April 10, 2013
Description: From the CVE entry:
OCaml Xml-Light Library before r234 computes hash values without restricting the ability to trigger hash collisions predictably, which allows context-dependent attackers to cause a denial of service (CPU consumption) via unspecified vectors.
Created: August 31, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
It was discovered that otrs2, a ticket request system, contains a cross-site scripting vulnerability when email messages are viewed using Internet Explorer. This update also improves the HTML security filter to detect tag nesting.
Created: September 5, 2012; Updated: October 25, 2012
Description: From the Red Hat advisory:
A flaw was found in the way QEMU handled VT100 terminal escape sequences when emulating certain character devices. A guest user with privileges to write to a character device that is emulated on the host using a virtual console back-end could use this flaw to crash the qemu-kvm process on the host or, possibly, escalate their privileges on the host.
Created: August 30, 2012; Updated: January 17, 2013
Description: From the CVE entry:
The good_client function in rquotad (rquota_svc.c) in Linux DiskQuota (aka quota) before 3.17 invokes the hosts_ctl function the first time without a host name, which might allow remote attackers to bypass TCP Wrappers rules in hosts.deny.
Created: August 30, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
It was discovered that rtfm, the Request Tracker FAQ Manager, contains multiple cross-site scripting vulnerabilities in the topic administration page.
Package(s): tor; CVE #(s): CVE-2012-3517 CVE-2012-3518 CVE-2012-3519
Created: August 30, 2012; Updated: February 4, 2013
Description: From the CVE entries:
Use-after-free vulnerability in dns.c in Tor before 0.2.2.38 might allow remote attackers to cause a denial of service (daemon crash) via vectors related to failed DNS requests. (CVE-2012-3517)
The networkstatus_parse_vote_from_string function in routerparse.c in Tor before 0.2.2.38 does not properly handle an invalid flavor name, which allows remote attackers to cause a denial of service (out-of-bounds read and daemon crash) via a crafted (1) vote document or (2) consensus document. (CVE-2012-3518)
routerlist.c in Tor before 0.2.2.38 uses a different amount of time for relay-list iteration depending on which relay is chosen, which might allow remote attackers to obtain sensitive information about relay selection via a timing side-channel attack. (CVE-2012-3519)
Package(s): typo3-src; CVE #(s): CVE-2012-3527 CVE-2012-3528 CVE-2012-3529 CVE-2012-3530 CVE-2012-3531
Created: August 31, 2012; Updated: September 6, 2012
Description: From the Debian advisory:
CVE-2012-3527: An insecure call to unserialize in the help system enables arbitrary code execution by authenticated users.
CVE-2012-3528: The TYPO3 backend contains several cross-site scripting vulnerabilities.
CVE-2012-3529: Authenticated users who can access the configuration module can obtain the encryption key, allowing them to escalate their privileges.
Package(s): wireshark; CVE #(s): CVE-2012-4286 CVE-2012-4294 CVE-2012-4295 CVE-2012-4298
Created: August 30, 2012; Updated: September 6, 2012
Description: From the CVE entries:
The pcapng_read_packet_block function in wiretap/pcapng.c in the pcap-ng file parser in Wireshark 1.8.x before 1.8.2 allows user-assisted remote attackers to cause a denial of service (divide-by-zero error and application crash) via a crafted pcap-ng file. (CVE-2012-4286)
Buffer overflow in the channelised_fill_sdh_g707_format function in epan/dissectors/packet-erf.c in the ERF dissector in Wireshark 1.8.x before 1.8.2 allows remote attackers to execute arbitrary code via a large speed (aka rate) value. (CVE-2012-4294)
Array index error in the channelised_fill_sdh_g707_format function in epan/dissectors/packet-erf.c in the ERF dissector in Wireshark 1.8.x before 1.8.2 might allow remote attackers to cause a denial of service (application crash) via a crafted speed (aka rate) value. (CVE-2012-4295)
Integer signedness error in the vwr_read_rec_data_ethernet function in wiretap/vwr.c in the Ixia IxVeriWave file parser in Wireshark 1.8.x before 1.8.2 allows user-assisted remote attackers to execute arbitrary code via a crafted packet-trace file that triggers a buffer overflow. (CVE-2012-4298)
Created: August 31, 2012; Updated: January 1, 2013
Description: From the CVE entry:
SQL injection vulnerability in frontends/php/popup_bitem.php in Zabbix 1.8.15rc1 and earlier, and 2.x before 2.0.2rc1, allows remote attackers to execute arbitrary SQL commands via the itemid parameter.
Page editor: Jake Edge
Brief items
The current development kernel was released on September 1. "Shortlog appended, as you can see it's just fairly random. I'm hoping we're entering the boring/stable part of the -rc windows, and that things won't really pick up speed just because people are getting home."
Stable updates: no stable updates have been released in the last week, and none are in the review process as of this writing.
Kernel development news
The "regression testing" slot on day 1 of the 2012 Kernel Summit consisted of presentations from Dave Jones and Mel Gorman. Dave's presentation described his new fuzz testing tool, while Mel's was concerned with some steps to improve benchmarking for detecting regressions.
Dave Jones talked about a testing tool that he has been working on for the last 18 months. That tool, Trinity, is a type of system call fuzz tester. Dave noted that fuzz testing is nothing new, and that the Linux community has had fuzz testing projects for around a decade. The problem is that past fuzz testers take a fairly simplistic approach, passing random bit patterns in the system call arguments. This suffices to find the really simple bugs, for example, detecting that a numeric value passed to a file descriptor argument does not correspond to a valid open file descriptor. However, once these simple bugs are fixed, fuzz testers tend to simply encounter the error codes (EINVAL, EBADF, and so on) that system calls (correctly) return when they are given bad arguments.
What distinguishes Trinity is the addition of some domain-specific intelligence. The tool includes annotations that describe the arguments expected by each system call. For example, if a system call expects a file descriptor argument, then rather than passing a random number, Trinity opens a range of different types of files, and passes the resulting descriptors to the system call. This allows fuzz testing to get past the simplest checks performed on system call arguments, and find deeper bugs. Annotations are available to indicate a range of argument types, including memory addresses, pathnames, PIDs, lengths, and so on. Using these annotations, Trinity can generate tests that are better targeted at the argument type (for example, the Trinity web site notes that powers of two plus or minus one are often effective for triggering bugs associated with "length" arguments). The resulting tests performed by Trinity are consequently more sophisticated than traditional fuzz testers, and find new types of errors in system calls.
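The idea can be sketched in a few lines of Python. This is purely illustrative: Trinity itself is written in C, and its annotation tables and value generators are far richer than the hypothetical stand-ins below.

```python
import random

# Hypothetical sketch of Trinity-style annotation-driven fuzzing.
# The table entries and generators are illustrative stand-ins,
# not Trinity's real annotations.

OPEN_FDS = [0, 1, 2]  # stand-ins for the descriptors Trinity opens itself

def interesting_lengths():
    # Powers of two plus or minus one often trigger off-by-one and
    # overflow bugs in "length" arguments.
    vals = []
    for shift in range(1, 33):
        base = 1 << shift
        vals.extend((base - 1, base, base + 1))
    return vals

# Per-argument-type generators: instead of random bits, each argument
# type gets values plausible enough to pass the first validity checks.
ARG_GENERATORS = {
    "fd":   lambda: random.choice(OPEN_FDS),
    "len":  lambda: random.choice(interesting_lengths()),
    "addr": lambda: random.choice([0, 0x1000, 2**32 - 1]),
}

# Annotations describing the argument types each system call expects.
SYSCALL_TABLE = {
    "read":  ["fd", "addr", "len"],
    "write": ["fd", "addr", "len"],
}

def fuzz_args(syscall):
    # Build one set of fuzzed arguments for the named system call.
    return [ARG_GENERATORS[t]() for t in SYSCALL_TABLE[syscall]]
```

Because the file descriptor argument is always a real, open descriptor, such a call gets past the kernel's EBADF check and exercises deeper code paths, which is precisely the point of the annotations.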
Ted Ts'o asked whether it's possible to bias the tests performed by Trinity in favor of particular kernel subsystems. In response, Dave noted that Trinity can be directed to open the file descriptors that it uses for testing off a particular filesystem (for example, an ext4 partition).
Dave stated that Trinity is run regularly against the linux-next tree as well as against Linus's tree. He noted that Trinity has found bugs in the networking code, filesystem code, and many other parts of the kernel. One of the goals of his talk was simply to encourage other developers to start employing Trinity to test their subsystems and architectures. Trinity currently supports the x86, ia64, powerpc, and sparc architectures.
Mel Gorman's talk slot was mainly concerned with improving the discovery of performance regressions. He noted that, in the past, "we talked about benchmarking for patches when they get merged. But there's been much inconsistency over time." In particular, he called out the practice of writing commit changelog entries that simply give benchmark statistics from running a particular benchmarking tool as being nearly useless for detecting regressions.
Mel would like to see more commit changelogs that provide enough information to perform reproducible benchmarks. Leading by example, Mel uses his own benchmarking framework, MMTests, and he has posted historical results from kernels 2.6.32 through to 3.4. What he would like to see is changelog entries that, in addition to giving benchmark results, identify the benchmark framework they use and include (pointers to) the specific configuration used with the framework. (The configuration could be in the changelog, or if too large, it could be stored in some reasonably stable location such as the kernel Bugzilla.)
H. Peter Anvin responded that "I hope you know how hard it is for submitters to give us real numbers at all." But this didn't deter Mel from reiterating his desire for sufficient information to reproduce benchmarking tests; he noted that many regressions take a long time to be discovered, which increases the importance of being able to reproduce past tests.
Ted Ts'o observed that there seemed to be a need for a per-subsystem approach to benchmarking. He then asked whether individual subsystems would even be able to come to consensus on what would be a reasonable set of metrics, and noted that those metrics should not take too long to run (since metrics that take a long time to execute are unlikely to be executed in practice). Mel offered that, if necessary, he would volunteer to help write configuration scripts for kernel subsystems. From there, discussion moved into a few other related topics, without reaching any firm resolutions. However, performance regressions are a subject of great concern to kernel developers, and the topic of reproducible benchmarking is one that will likely be revisited soon.
The "distributions and upstream" session of day 1 of the 2012 Kernel Summit focused on a question enunciated by Ted Ts'o: "From an upstream perspective, how can we better help distros?" Responding to that question were two distributor representatives: Ben Hutchings for Debian and Dave Jones for Fedora.
Ben Hutchings asked that, when considering merging a new feature, kernel developers not accept the argument that "this feature is expensive, but that's okay because we'll make it an option". He pointed out that this argument is based on a logical fallacy, since in nearly every case distributions will enable the option, because some users will need it. As an example, Ben mentioned memory cgroups (memcg), which, in their initial release, imposed a significant performance cost.
A second point that Ben made was that there are still features that distributions are adding that are not being merged upstream. As an example from last year, he mentioned Android. As a current example, he noted the union mounts feature, which is still not upstream. Inasmuch as keeping features such as these outside of the mainline kernel creates more work for distributions, he would like to see such features more actively merged.
Dave Jones made three points. The first of these was that a lot of Kconfig help texts are "really awful". As a consequence, distribution maintainers have to read the code in order to work out if a feature should be enabled.
Dave's second point was that it would be useful to have an explicit list of regressions at around the -rc3 or -rc4 point in the release cycle; his problem is that regressions often become visible only much later. Finally, Dave noted that Fedora sees a lot of reports from lockdep that no other distributions seem to see. The basic problem underlying both of these points is of course lack of early testing, and at this point Ted Ts'o mused: "can we make it easier for users to run the kernel-of-the-day [in particular, -rc1 and rc2 kernels] and allow them to easily fall back to a stable kernel if it doesn't work out?" There was, however, no conclusive response in the ensuing discussion.
Returning to the general subject of Kconfig, Matthew Garrett echoed and elaborated on one of the points made by Ben Hutchings, noting that Kconfig is important for kernel developers (so that they can strip down a kernel for fast builds). However, because distributors will nearly always enable configuration options (as described above), kernel developers need to ask themselves, "If you don't expect an option to be enabled [by distributors], then why is the option even present?". In passing, Andrea Arcangeli noted one of his pet irritations, one with which most people who have ever built a kernel will be familiar. When running make oldconfig, it is very easy to overstep as one types Enter to accept the default "no" for most options; one suddenly realizes that the answer to an earlier question should have been "yes". At that point of course, there is no way to go back, and one must instead restart from the beginning. (Your editor observes that improving this small problem could be a nice way for a budding kernel hacker to get their hands dirty.)
The lightning talks on day 1 of the 2012 Kernel Summit were over in, one could say, a flash. There were just two very brief discussions.
Paul McKenney noted that a small number of read-copy update (RCU) users have for some time requested the ability to offload RCU callbacks. Normally, RCU callbacks are invoked on the CPU that registered them. This works well in most cases, but it can result in unwelcome variations in the execution times of user processes running on the same CPU. This kind of variation (also known as operating system jitter) can be reduced by offloading the callbacks—arranging for that CPU's RCU callbacks to be invoked on some other CPU. Paul asked if the ability to offload RCU callbacks was of interest to others in the room. A number of developers responded in the affirmative.
Dan Carpenter noted the existence of Smatch, his static analysis tool that detects various kinds of errors in C source code, pointing out that by now "many of you have received emails from me". (The emails that he referred to contained kernel patches and lists of bugs or potential bugs in kernel code. In the summary of his LPC 2011 presentation, Dan noted that Smatch has resulted in hundreds of kernel patches.) Dan's main point was simply to request other ideas from kernel developers on what checks to add to Smatch; he noted that there is a mailing list, firstname.lastname@example.org, to which suggestions can be sent.
The presentation given by Fengguang Wu on day 1 of the 2012 Kernel Summit was about testing for build and boot regressions in the Linux kernel. In the presentation, Fengguang described the test framework that he has established to detect and report these regressions in a more timely fashion.
To summarize the problem that Fengguang is trying to resolve, it's simplest to look at things from the perspective of a maintainer making periodic kernel releases. The most obvious example is of course the mainline tree maintained by Linus, which goes through a series of release candidates on the way to the release of a stable kernel. The linux-next tree maintained by Stephen Rothwell is another example. Many other developers depend on these releases. If, for some reason, those kernel releases don't successfully build and boot, then the daily work of other kernel developers is impaired while they resolve the problem.
Of course, Linus and Stephen strive to ensure that these kinds of build and boot errors don't occur: before making kernel releases, they do local testing on their development systems, and ensure that the kernel builds, boots, and runs for them. The problem comes in when one considers the variety of hardware architectures and configuration options that Linux provides. No single developer can test all combinations of architectures and options, which means that, for some combinations, there are inevitably build and boot errors in the mainline -rc and linux-next releases. These sorts of regressions appear even in the final releases performed by Linus; Fengguang noted the results found by Geert Uytterhoeven, who reported that (for example) in the Linux 3.4 release, his testing found around 100 build error messages resulting from regressions. (Those figures are exaggerated because some errors occur on obscure platforms that see less maintainer attention. But they include a number of regressions on mainstream platforms that have the potential to disrupt the work of many kernel developers.) Furthermore, even when a build problem appears in a series of kernel commits but is later fixed before a mainline -rc release, this still creates a problem: developers performing bisects to discover the causes of other kernel bugs will encounter the build failures during the bisection process.
As Fengguang noted, the problem is that it takes some time for these regressions to be detected. By that time, it may be difficult to determine what kernel change caused the problem and who it should be reported to. Many such reports on the kernel mailing list get no response, since it can be hard to diagnose user-reported problems. Furthermore, the developer responsible for the problem may have moved on to other activities and may no longer be "hot" on the details of work that they did quite some time ago. As a result, there is duplicated effort and lost time as the affected developers resolve the problems themselves.
According to Fengguang, these sorts of regressions are an inevitable part of the development process. Even the best of kernel developers may sometimes fail to test for regressions. When such regressions occur, the best way to ensure they are resolved is to quickly and accurately determine the cause of the regression and promptly notify the developer who caused the regression.
Fengguang's solution is an automated system that detects these regressions and informs kernel developers by email that their commit X triggered bug Y. Crucially, the email reports are generated nearly immediately (one-hour response time) after commits are merged into the tested repositories. (For this reason, Fengguang calls his system a "0-day kernel test" system.) Since the relevant developer is informed quickly, it's more likely they'll be "hot" on the technical details, and able to fix the problem quickly.
Fengguang's test framework at the Intel Open Source Technology Center consists of a server farm that includes five build servers (three Sandy Bridge and two Itanium systems). On these systems, kernels are built inside chroot jails. The built kernel images are then boot tested inside over 100 KVM instances on another eight test boxes. The system builds and boots each tested kernel configuration, on a commit-by-commit basis for a range of kernel configurations. (The system reuses build outputs from previous commits so as to expedite the build testing. Thus, the build time for the first commit of an allmodconfig build is typically ten minutes, but subsequent commits require two minutes to build on average.)
Tests are currently run against Linus's tree, linux-next, and more than 180 trees owned by individual kernel maintainers and developers. (Running tests against individual maintainers' trees helps ensure that problems are fixed before they taint Linus's tree and linux-next.) Together, these trees produce 40 new branch heads and 400 new commits on an average working day. Each day, the system build tests 200 of the new commits. (The system allows trees to be categorized as "rebasable" or "non-rebasable". The latter are usually big subsystem trees for which the maintainers take responsibility to do bisectability tests before publishing commits. Rebasable trees are tested on a commit-by-commit basis. For non-rebasable trees, only the branch head is built; only if that fails does the system go through the intervening commits to locate the source of the error. This is why not all 400 of the daily commits are tested.)
The current machine power allows the build test system to test 140 kernel configurations (as well as running sparse and coccinelle) for each commit. Around half of these configurations are randconfig configurations, which are regenerated each day in order to increase test coverage over time. (randconfig builds the kernel with randomized configuration options, so as to test unusual kernel configurations.) Most of the built kernels are boot tested, including the randconfig ones. Boot tests for the head commits are repeated multiple times to increase the chance of catching less-reproducible regressions. In the end, 30,000 kernels are boot tested each day. In the process, the system catches four new static errors or warnings per day, and one boot error every second day.
The responses from the kernel developers in the room were extremely positive to this new system. Andrew Morton noted he'd received a number of useful reports from the tool. "All contained good information, and all corresponded to issues I felt should be fixed." Others echoed Andrew's comments.
One developer in the room asked what he should do if he has a scratch branch that is simply too broken to be tested. Fengguang replied that his build system maintains a blacklist, and specific branches can be added to that blacklist on request. In addition, a developer can include a line containing the string Dont-Auto-Build in a commit message; this causes the build system to skip testing of the whole branch.
Many problems in the system have already been fixed as a consequence of developer feedback: the build test system is fairly mature; the boot test system is already reasonably usable, but has room for further improvement. Fengguang is seeking further input from kernel developers on how his system could be improved. In particular, he is asking kernel developers for runtime stress and functional test scripts for their subsystems. (Currently the boot test system runs a limited set of tools—trinity, xfstests, and a handful of memory management tests—for catching runtime regressions.)
Fengguang's system has already clearly had a strong positive impact on the day-to-day life of kernel developers. With further feedback, the system is likely to provide even more benefit.
Anyone who has paid even slight attention to the progress of the mainlining of the Android modifications to the Linux kernel will be aware that the process has had its ups and downs. An initial attempt to mainline the changes via the staging tree ended in failure when the code was removed in kernel 2.6.33 in late 2010. Nevertheless, at the 2011 Kernel Summit, kernel developers indicated a willingness to mainline code from Android, and starting with Linux 3.3, various Android pieces were brought back into the staging tree. (On the Android side this was guided by the Android Mainlining Project.) The purpose of John Stultz's presentation on day 1 of the 2012 Kernel Summit was to review the current status of upstreaming of the Android code and outline the work yet to be done.
John began by reviewing the progress in recent kernel releases. Linux 3.3 reintroduced a number of pieces to staging, including ashmem, binder, logger, and the low-memory killer. With the Linux 3.3 release, it became possible to boot Android on a vanilla kernel. Linux 3.4 added some further pieces to the staging tree and also saw a lot of cleanup of the previously merged code. Subsequent kernels have seen further Android code move to the staging tree, including the wakeup_source feature and the Android Gadget driver. In addition, some code in the staging tree has been converted to use upstream kernel features; for example, Android's alarm-dev feature was converted to use the alarm timers feature added to Linux in kernel 3.0.
As of now (i.e., after the closure of the 3.6 merge window), there still remain some major features to merge, including the ION memory allocator. In addition, various Android pieces still remain in the staging tree (for example, the low-memory killer, ashmem, binder, and logger), and these need to be reworked (or replaced), so that the equivalent functionality is provided in the mainline kernel. However, one has the impression that these technical issues will all be solved, since there's been a general improvement in relations on both sides of the Android/upstream fence; John noted that these days there is much less friction between the two sides, more Android developers are participating in the Linux community, and the Linux community seems more accepting of Android as a project. Nevertheless, John noted a few things that could still be improved on the Android side. In particular, for many releases, the Android developers provided updated code branches for each kernel release, but in more recent times they have skipped doing this for some kernel releases.
Following John's presentation, there was relatively little discussion, which is perhaps an indication of the fact that kernel developers are reasonably satisfied with the current status and momentum of Android upstreaming. Matthew Garrett asked if John has any feeling about whether other projects are making use of the upstreamed Android code. In response, John noted that Android code is being used as the default Board Support Package for some projects, such as Firefox OS. He also mentioned that the volatile ranges code that he is currently developing has a number of potential uses outside of Android.
Matthew was also curious to know whether there is anything that the Linux kernel developers could do to help make the design process for features that are going into Android more open. Right now, most Android features are developed in-house, but perhaps a solution developed in the open might have satisfied other users' requirements as well. There was some back and forth as to how practical any other kind of model would be, especially given vendors' focus on product deadlines; the implicit conclusion was that anything other than the status quo was unlikely.
Overall, the current status of Android upstreaming is very positive, and certainly rather different from the situation a couple of years ago.
By several accounts, day one of this year's Kernel Summit was largely argument-free. There were plenty of discussions, even minor disagreements, but nothing approaching some of the battles of yore. Day three looked like it might provide an exception to that pattern with a discussion of two different patch sets that are both targeted at cryptographically signing kernel modules. In the end, though, the pattern continued, with an interesting, but tame, session.
Kernel modules are inserted into the running kernel, so a rogue module could be used to compromise the kernel in ways that are hard to detect. One way to prevent that from happening is to require that kernel modules be cryptographically signed using keys that are explicitly allowed by the administrator. Before loading a module, the kernel can check the signature and refuse to load any module that can't be verified. Those modules could come from a distribution or be built with a custom kernel. Since modules can be loaded based on a user action (e.g. attaching a device or using a new network protocol) or come from a third party (e.g. binary kernel modules), ensuring that only approved modules can be loaded is a commonly requested feature.
Rusty Russell, who maintains the kernel module subsystem, called the meeting to try to determine how to proceed on module signing. David Howells has one patch set that is based on what has been in RHEL for some time, while Dmitry Kasatkin posted another that uses the digital signature support added to the kernel for integrity management. Howells's patches have been around, in various forms, since 2004, while Kasatkin's are relatively new.
Russell prefaced the discussion with an admonishment that he was not interested in discussing the "politics, ethics, or morality" of module signing. He invited anyone who did want to debate those topics to a meeting at 8pm, which was shortly after he had to leave for his plane. The reason we will be signing modules, he said, is because Linus Torvalds wants to be able to sign his modules.
Kasatkin's approach would put the module signature in the extended attributes (xattrs) of the module file, Russell began, but Kasatkin said that choice was only a convenience. His patches are now independent of the integrity measurement architecture (IMA) and the extended verification module (EVM), both of which use xattrs. He originally used xattrs because of the IMA/EVM origin of the signature code he is using, and he did not want to change the module contents. Since then, he noted a response from Russell to Howells's approach and has changed his patches to add the module signature to the end of the file.
That led Russell into a bit of a historical journey. The original patches from Howells put the signature into an ELF section in the module file. But, because there was interest in having the same signature on both stripped and unstripped module files, there was a need to skip over some parts of the module file when calculating the hash that goes into the signature.
The amount of code needed to parse ELF was "concerning", Russell said. Currently, there are some simple sanity checks in the module-loading code, without any checks for malicious code because the belief was that you had to be root to load a module. While that is still true, the advent of things like secure boot and IMA/EVM has made checking for malicious code a priority. But Russell wants to ensure that the code doing that checking is as simple as possible to verify, which was not true when putting module signatures into ELF sections.
Greg Kroah-Hartman pointed out that you have to do ELF parsing to load the module anyway. There is a difference, though. If the module is being checked for maliciousness, that parsing happens after the signature is checked. Any parsing that is done before that verification is potentially handling untrusted input.
Russell would rather see the signature appended to the module file in some form. It could be a fixed-length signature block, as suggested by Torvalds, or there could be some kind of "magic string" followed by a signature. That would allow for multiple signatures on a module. Another suggestion was to change the load_module() system call so that the signature was passed in, which would "punt" the problem to user space "that I don't maintain anymore", Russell said.
Russell's suggestion was to just do a simple backward search from the end of the module file to find the magic string, but Howells was not happy with that approach for performance reasons. Instead, Howells added a 5-digit ASCII number for the length of the signature, which Russell found a bit inelegant. Looking for the magic string "doesn't take that long", he said, and module loading is not that performance-critical.
There were murmurs of discontent in the room about that last statement. There are those who are very sensitive about module loading times because it impacts boot speed. But, Russell said that he could live with ASCII numbers, as long as there was no need to parse ELF sections in the verification code. He does like the fact that modules can be signed in the shell, which is the reason behind the ASCII length value.
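The appended-signature scheme under discussion is easy to sketch. The following is a hypothetical illustration only — the magic string, the five-digit ASCII length, and the overall layout are invented for this example, and do not represent the format that was eventually merged. The verifier checks for a trailing magic marker, reads the ASCII length just before it, and from that locates the signature bytes without any ELF parsing:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: <module data><signature><5-digit ASCII len><MAGIC> */
#define SIG_MAGIC "~Sig~"   /* invented marker, not the real kernel magic */

/* Locate an appended signature; returns 0 on success. */
static int find_signature(const unsigned char *buf, size_t len,
                          const unsigned char **sig, size_t *sig_len)
{
    size_t magic_len = strlen(SIG_MAGIC);
    char lenbuf[6];
    size_t n;

    if (len < magic_len + 5)
        return -1;
    /* Check the trailing magic string. */
    if (memcmp(buf + len - magic_len, SIG_MAGIC, magic_len) != 0)
        return -1;
    /* The 5 ASCII digits just before the magic give the signature size. */
    memcpy(lenbuf, buf + len - magic_len - 5, 5);
    lenbuf[5] = '\0';
    n = (size_t)strtoul(lenbuf, NULL, 10);
    if (n > len - magic_len - 5)
        return -1;
    *sig = buf + len - magic_len - 5 - n;
    *sig_len = n;
    return 0;
}
```

This variant uses Howells's explicit-length idea; Russell's alternative would instead scan backward from the end of the file for the magic string, trading a little search time for a format that is easier to produce from a shell script.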
There are Red Hat customers asking for SHA-512 digests signed with 4K RSA keys, Howells said, but that may change down the road. That could make picking a size for a fixed-length signature block difficult. But, as Ted Ts'o pointed out, doing a search for the magic string is in the noise in comparison to doing RSA with 4K keys. The kernel crypto subsystem can use hardware acceleration to make that faster, Howells said. But, Russell was not convinced that the performance impact of searching for the magic string was significant and would like to see some numbers.
James Bottomley asked where the keys for signing would come from. Howells responded that the kernel build process can create a key. The public part would go into the kernel for verification purposes, while the private part would be used for signing. After the signing is done, that ephemeral private key could be discarded. There is also the option to specify a key pair to use.
Torvalds said that it was "stupid" to have stripped modules with the same signature as the unstripped versions. The build process should just generate signatures for both. Having logic to skip over various pieces of the module just adds a new attack point. Another alternative is to only generate signatures for the stripped modules as the others are only used for debugging and aren't loaded anyway, so they can be unsigned, he said. Russell agreed, suggesting that the build process could just call out to something to do the signing.
For binary modules, such as the NVIDIA graphics drivers, users would have to add the NVIDIA public key to the kernel ring, Peter Jones said.
Kees Cook brought up an issue that is, currently at least, specific to Chrome OS. In Chrome OS, there is a trusted root partition, so knowing the origin of a module would allow those systems to make decisions about whether or not to load them. Right now, the interface doesn't provide that information, so Cook suggested changing the load_module() system call (or adding a new one) that passed a file descriptor for the module file. Russell agreed that an additional interface was probably in order to solve that problem.
In the end, Russell concluded that there was a reasonable amount of agreement about how to approach module signing. He planned to look at the two patch sets, try to find the commonality between the two, and "apply something". In fact, he made a proposal, based partly on Howells's approach, on September 4. It appends the signature to the module file after a magic string as Russell has been advocating. As he said when wrapping up the discussion, his patch can provide a starting point to solving this longstanding problem.
Catalin Marinas led a discussion of kernel support for 64-bit ARM processors as part of day two of the ARM minisummit. He concentrated on the status of the in-flight patches to add that support, while pointing to his LinuxCon talk later in the week for more details about the architecture itself.
A second round of the ARM-64 patches was posted to the linux-kernel mailing list in mid-August. After some complaints about the "aarch64" name for the architecture, it was changed to "arm64", at least for the kernel source directory. That name will really only be seen by kernel developers as uname will still report "aarch64", in keeping with the ELF triplet used by the binaries built with GCC.
Some of the lessons learned from the ARM 32-bit support have been reflected in arm64. It will target a single kernel image by default, for example. That means that device tree support is mandatory for AArch64 platforms. Since there are not, as yet, any AArch64 platforms, the patches contain simplified platform code based on that of the Versatile Express.
There are two targets for AArch64 devices: embedded and server. It is possible that ACPI support will be required for the servers. As far as Marinas knows, there is no ACPI implementation out there, but it is not clear what Microsoft is doing in that area.
The code for generic timers and the generic interrupt controller (GIC) lives under the drivers directory. That code could be shared with arch/arm, but there is a need to #ifdef the inline assembly code.
There is an intent to push back on the system-on-a-chip (SoC) vendors regarding things like firmware initialization, boot protocol, and a standardized secure mode API. SoC vendors (and thus, their ARM sub-trees) should be providing the standard interfaces, rather than heading out on their own. The ARM maintainers can choose not to accept ports that do not conform.
That may work for devices targeted at Linux, but there may be SoC vendors who initially target another operating system, as Olof Johansson noted. There will likely need to be some give and take for things such as the boot protocol when Windows, iOS, or OS X targeted devices are submitted. Marinas said that the aim would be for standardization, but they "may have to cope" with other choices at times.
The first code from SoC vendors is not expected before the end of the year, Marinas said. Arnd Bergmann half-jokingly suggested that he would be happy to get a leaked version of that code at any time. The first SoCs might well just be existing 32-bit ARMv7 SoCs with an AArch64 CPU (aka ARMv8) dropped in. That may be the path for embedded applications, though the vendors targeting the server market are likely to be starting from scratch.
That led to a discussion of how to push the arm64 patches forward. Marinas would like to push the core architecture code forward, while working to clean up the example SoC code. He would like to target the 3.8 kernel for the core. Bergmann was strongly in favor of getting it all into linux-next soon, and targeting a merge for the 3.7 development cycle.
Marinas is concerned that including the SoC code will delay inclusion as it will require more review. He also wants to make sure that there is a clean base for those who want to use it as a basis for their own SoC code. That should take two weeks or so, Marinas said. He hopes to get it into linux-next sometime after 3.7-rc1, but Bergmann encouraged a faster approach. There is nothing very risky about doing so, Johansson pointed out, as a new architecture cannot break any existing code.
There is some concern about the 2MB limit on device tree binary (dtb) files because some network controllers (and other devices) may have firmware blobs larger than that. Bergmann noted that those blobs may not be able to be shipped in the kernel, but could be put into firmware and loaded from there. It turns out that the flattened device tree format already has a length entry in its header that can be used to support multiple dtbs, which will allow the 2MB limit to be worked around.
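The workaround relies on the fact that every flattened device tree begins with a fixed header containing a big-endian totalsize field at byte offset 4. A loader can therefore step from one dtb to the next in a concatenated blob. The sketch below illustrates the idea; the helper names are invented for this example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FDT_MAGIC 0xd00dfeedU  /* flattened device tree magic number */

/* Read a big-endian 32-bit value from the blob. */
static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Count concatenated dtbs by stepping over each header's totalsize
 * field (offset 4 in the fdt header). */
static int count_dtbs(const uint8_t *buf, size_t len)
{
    int count = 0;
    size_t off = 0;

    while (off + 8 <= len && be32(buf + off) == FDT_MAGIC) {
        uint32_t total = be32(buf + off + 4);
        if (total < 8 || off + total > len)
            break;  /* malformed header; stop walking */
        off += total;
        count++;
    }
    return count;
}
```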
The existing arm64 emulation does not have any DMA, so support for that feature is currently untested. In addition, some SoCs are likely to only support 32-bit DMA. Bergmann suggested an architecture-independent implementation that used dma_ops pointers to provide both coherent and non-coherent versions, but Marinas would like to do something simpler (i.e. coherent only) to start with. Since the "hardware" currently lacks DMA, "all DMA is coherent" seems like a reasonable model, Bergmann said. Since no one will be affected by any bugs in the code, he suggested getting it into linux-next as soon as possible.
Tony Lindgren asked if ARM maintainer Russell King had any comments on the patches. Marinas said that there were not many, at least so far. Bergmann said that he didn't think King was convinced that having a separate arm64 directory (as opposed to adding 64-bit support to the existing arm directory) was the right approach.
Many of the decisions were made for ARM 15 years ago, Marinas said, and some of those make it messy to drop arm64 on top of arm. Some day, when the arm tree only supports ARMv7, it may make sense to merge with arm64. The assembly code cannot be shared, because they are two different architectures, Bergmann said. In addition, the system calls cannot be shared and the platform code is going to be done very differently for arm64, he said.
But, there is room for sharing some things between the two trees, Marinas said. That includes some of the device tree files, perf, the generic timer, the GIC driver code, as well as KVM and Xen if and when they are merged. In theory, the ptrace() and signal-handling code could be shared as well.
Progress is clearly being made for arm64, and we will have to wait and see how quickly it can make its way into the mainline.
The ARM big.LITTLE architecture is an asymmetric multi-processor platform, with powerful and power-hungry processors coupled with less-powerful (in both senses) CPUs using the same instruction set. Big.LITTLE presents some challenges for the Linux scheduler. Paul McKenney gave a readout of the status of big.LITTLE support at the ARM minisummit, which he really meant to serve as an "advertisement" for the scheduling micro-conference at the Linux Plumbers Conference that started the next day.
The idea behind big.LITTLE is to do frequency and voltage scaling by other means, he said. Because of limitations imposed by physics, there is a floor to frequency and voltage scaling on any given processor, but that can be worked around by adding another processor with fewer transistors. That's what has been done with big.LITTLE.
There are basically two ways to expose the big.LITTLE system to Linux. The first is to treat each pair as a single CPU, switching between them "almost transparently". That has the advantage that it requires almost no changes to the kernel and applications don't know that anything has changed. But, there is a delay involved in making the switch, which isn't taken into account by the power management code, so the power savings aren't as large as they could be. In addition, that approach requires paired CPUs (i.e. one of each size), but some vendors are interested in having one little and many big CPUs in their big.LITTLE systems.
The other way to handle big.LITTLE is to expose all of the processors to Linux, so that the scheduler can choose where to run its tasks. That requires more knowledge of the behavior of processes, so Paul Turner has a patch set that gathers that kind of information. Turner said that the scheduler currently takes averages on a per-CPU basis, but when processes move between CPUs, some information is lost. His changes cause the load average to move with the processes, which will allow the scheduler to make better decisions.
Turner's patches are on their third revision, and have been "baking on our systems at Google" for a few months. There are no real to-dos outstanding, he said. Peter Zijlstra said that he had wanted to merge the previous revision, but that there was "some funky math" in the patches, which has since been changed. Turner said that he measured a 3-4% performance increase using the patches, which means we get "more accurate tracking at lower cost". It seems likely that the patches will be merged soon.
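The core idea of per-entity load tracking can be sketched in a few lines: each task's recent runnable history is accumulated as a geometrically decayed sum, with each period's contribution halving after 32 further periods, and that sum travels with the task when it migrates. The code below is a simplified floating-point model of the idea for illustration only, not the kernel's fixed-point implementation:

```c
#include <assert.h>
#include <math.h>

/* Decay factor: a period's contribution halves every 32 periods
 * (approximately 0.5^(1/32)), as in the load-tracking patches. */
#define DECAY_Y 0.97857206

/* Fold one more period into a task's load sum: decay the history,
 * then add the fraction of the period the task was runnable. */
static double update_load(double sum, double runnable_fraction)
{
    return sum * DECAY_Y + runnable_fraction;
}
```

A task that is always runnable converges toward the geometric-series limit 1 / (1 - y), while an idle task's sum halves every 32 periods — which is what lets the scheduler distinguish steadily busy tasks from bursty ones when placing them.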
McKenney said that Turner's patches have been adapted by Morten Rasmussen to be used on big.LITTLE systems. The measurements are used to try to determine where a task should be run. Over time, though, the task's behavior can change, so the scheduler checks to see if that has happened and if the placement still makes sense. There are still questions about when "race to idle" versus spreading tasks around makes the most sense, and there have been some related discussions of that recently on the linux-kernel mailing list.
Currently, the CPU hotplug support is less than ideal for removing CPUs that have gone idle. But Thomas Gleixner is reworking things to "make hotplug suck less", McKenney said. For heavy workloads, the process of offlining a processor can take multiple seconds. After Gleixner's rework, that drops to 300ms for an order of magnitude decrease. Part of the solution is to remove stop_machine() calls from the offlining path. There are multiple reasons for making hotplug work better, McKenney said, including improving read-copy update (RCU), reducing realtime disruption, and providing a low-cost way to clear things off of a CPU for a short time. He also noted that it is not an ARM-only problem that is being solved here, as x86 suffers from significant hotplug delays too.
The session finished up with a brief discussion of how to describe the architecture of a big.LITTLE system to the kernel. Currently, each platform has its own way of describing the processors and caches in its header files, but a more general way, perhaps using device tree or some kind of runtime detection mechanism, is desired.
Generic DMA engines are present in many ARM platforms to enable devices to move data between main memory and device-specific regions. Arnd Bergmann led a discussion about the DMA engine APIs as part of the last day of the ARM minisummit. DMA is the last ARM subsystem that does not have generic device tree bindings, he said, so he hoped the assembled developers could agree on some. Without those bindings, the code that uses DMA is forced to be platform-specific, which impedes progress toward the goal of building a single kernel image for multiple ARM platforms.
Bergmann said that there are many things currently blocked by the lack of device tree bindings for DMA. Those bindings need to describe the kinds of DMA channels available in the hardware, along with their attributes. Two proposals have been made to add support for the generic DMA engines. Jon Hunter has a patch set that implements a particular set of bindings, but he couldn't attend the meeting, so Bergmann presented them. The other patches were from DMA engine maintainer Vinod Koul.
The differences between the two are a bit hard to decipher. Both approaches attempt to keep any information about how to set up DMA channels from both the device driver using them and from the DMA engine driver that provides them. That knowledge would reside in the DMA engine core. With Koul's patches, there would be a global lookup table that would be populated by the platform-specific code from various sources (device tree, ACPI, etc.). That table would list the connections between devices and DMA engine drivers. Hunter's patches solve the problem simply for the device tree case, without requiring interaction with the platform-specific code.
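The lookup-table approach can be illustrated with a toy model — all names and structures here are invented for the example, not Koul's actual patches. Platform code registers device-to-engine mappings in a central table, and a device driver then requests its channel by its own name alone, so neither the driver nor the DMA engine driver needs to know how the connection was configured:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry in the (hypothetical) central DMA routing table. */
struct dma_map_entry {
    const char *dev_name;    /* requesting device */
    const char *engine_name; /* DMA engine that serves it */
    int         channel;     /* channel number on that engine */
};

static struct dma_map_entry dma_table[16];
static size_t dma_table_len;

/* Called by platform code, which may have read the mapping from
 * the device tree, ACPI, or board files. */
static void dma_register_mapping(const char *dev, const char *engine, int chan)
{
    if (dma_table_len < 16) {
        dma_table[dma_table_len].dev_name = dev;
        dma_table[dma_table_len].engine_name = engine;
        dma_table[dma_table_len].channel = chan;
        dma_table_len++;
    }
}

/* A driver asks only by its own name; the routing knowledge stays
 * in the core, not in the driver or the engine. */
static const struct dma_map_entry *dma_lookup_channel(const char *dev)
{
    for (size_t i = 0; i < dma_table_len; i++)
        if (strcmp(dma_table[i].dev_name, dev) == 0)
            return &dma_table[i];
    return NULL;
}
```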
The discussion got technically quite deep, as Bergmann admitted with a grin after the session, but the upshot is that the two approaches are not completely at odds. At the end of the session, it was agreed that both patches could be merged ("more or less", Koul said). The DMA engine core would be able to find the connection in either the device tree or via the lookup table, but will use the same device driver interfaces either way. Bergmann said that he hoped to see something in the 3.7 kernel. In between those two discussions, some things about the device tree bindings were hammered out as well.
One of the first problems noted with the bindings described in Hunter's patch was the use of numerical values (derived from flag bits) to describe attributes of DMA channels. "These magic numbers are not a readability triumph", Mark Brown said. He went on to suggest adding some kind of preprocessor support to the device tree compiler (dtc), which turns the text representation into a flattened device tree binary (dtb). That would make the flags readable, Tony Lindgren said, but he wondered if such a preprocessor was "years off".
One way around the magic number problem is to use names instead, though dealing with strings in device tree is difficult, Bergmann said. Some platforms have complicated arrangements of controllers and DMA engines, he said, using an example of an MMC (memory card) controller with two channels, one of which is connected to three different DMA engines. In order to make the request API for a DMA channel relatively simple, it would make sense to name each channel, someone suggested. One problem there is that most devices (80% perhaps) either have a single channel or just one for each direction, Bergmann said. Forcing those devices to explicitly name them adds complexity.
But most were in favor of using the names. In addition to naming the channels, standardizing the property names would make it easier to scan the whole device tree for properties of interest. Allowing devices to come up with their own property names will make that impossible. Also, when new functional units that implement DMA get added to a platform, standardized names will make it easier to incorporate them into existing device trees. So, names for each of a device's channels, along with a standard set of property names, would seem to be in the cards.
This was the last non-hacking session in the ARM minisummit, which seemed to be a great success overall. Some issues that had been lingering were discussed and resolved—or at least plans to do so were made. In addition, the status of some newer features (e.g. big.LITTLE and AArch64) was presented, so that questions could be raised and answered in real time, rather than over a sometimes slow mailing list or IRC channel. Beyond the discussions, both afternoons featured hacking sessions where it sounds like some real work got done.
[ I would like to thank Will Deacon and Arnd Bergmann for reviewing parts of the ARM minisummit coverage, though any remaining errors are mine, of course. ]
Page editor: Jonathan Corbet
Distributions
Ubuntu's new application upload process highlights its vision of the desktop and what the project thinks needs to be done to make things happen there.
Serious Linux users tend not to think of availability of software as a problem; distribution repositories typically carry tens of thousands of packages, after all, and any of those packages can be installed with a single command. The problem with distribution repositories, from Ubuntu's point of view, is that they can be stale and inaccessible to application developers. The packages in the repository tend to date from before a given distribution release's freeze date; by the time an actual distribution gets onto a user's machine, the applications found there may be well behind the curve. In some cases, applications may have lost their relevance entirely; as Steve Langasek put it:
Beyond that, getting a package into a distribution's repository is not something just anybody can do; developers must either become a maintainer for a specific distribution or rely on somebody else to create and add a package for their application. And, in most distributions, there is no place in the repository at all for proprietary applications.
Ubuntu's owner Canonical sees these problems as significant shortcomings that are holding back the creation of applications for the Linux desktop; that, in turn, impedes the development and adoption of Linux as a whole. So, a few years back, Canonical set out to remedy these problems through the creation of the Ubuntu Software Centre (USC), a repository by which developers could get applications to their users quickly. The USC is not tied to the distribution release cycle; applications added there become available to users immediately. There is a mechanism for the handling of payments, allowing proprietary applications to be sold to users. A glance through the USC shows a long list of applications (some of which are non-free) and other resources like fonts and electronic books. Guides to nearby beer festivals are, alas, still in short supply.
Naturally, Canonical does not want to provide an unsupervised means by which arbitrary software can be installed on its users' systems. Experience shows that it would not take long for malware authors, spammers, and others to make their presence felt. So the process for putting an application into the USC involves a review step. For paid applications, for which Canonical takes a 20% share of the price, there appears to be a fully-funded mechanism that can review and place applications quickly. For free applications, though, review is done by a volunteer board and that group, it seems, has been having a hard time keeping up with the workload. The result is long delays in getting applications into the USC, discouraged developers, and frustration all around.
The new upload process proposal aims to improve the situation for free applications; Canonical does not seem to intend to change the process for paid applications. There are a number of changes intended to make life easier for everybody involved, but the key would appear to be this:
In other words, Canonical wants to make the process as automatic as possible, but not so automatic that Bad Things make it into the USC.
The first step requires developers to register with the USC, then request access to upload one or more specific packages. Getting that access will require convincing Canonical that they hold the copyrights to the code or are otherwise authorized to do the upload; it will apparently not be possible for third parties to upload software without explicit permission, even if the software is licensed in a way that would allow that to happen. A review board will look at the uploader's application and approve it if that seems warranted.
Once a developer has approval, there are a few more steps involved in putting an application into the USC. The first is to package it appropriately with the Quickly tool and submit it for an upload. That is mostly basic packaging work. Uploads through this mechanism will be done in source form; binaries will, it seems, be built within the USC itself.
But, before the application can be made available, it must be accompanied by a security policy. The mechanism is superficially similar to the privilege scheme used by Android, but the USC bases its security on the AppArmor mandatory access control mechanism instead. The creation of a full AppArmor profile can be an involved process; Canonical has tried to make things simpler by automating most of the work. The uploader need only declare the specific access privileges needed by the application. These include access to the X server, access to the network, the ability to print, and use of the camera. Interestingly, access to spelling checkers requires an explicit privilege.
All (free) USC applications will run within their own sandbox with limited access to the rest of the system. Only files and directories found in a whitelist will be accessible, for example. Applications will be prevented from listening to (or interfering with) any other application's X server or D-Bus communications. There will be a "helper" mechanism by which applications can request access to non-whitelisted files; the process will, inevitably, involve putting up a dialog and requiring the user to allow the access to proceed. That, naturally, will put some constraints on what these applications can usefully do; it is hard to imagine a new compiler working well in this environment, for example. The payoff is that, with these restrictions in place, it should not be possible for any given application to damage the system or expose information that the user does not want disclosed.
And, with all that structure in place, Canonical feels that it is safe to allow applications into the USC without the need for a manual review. That should enable applications to get to users more quickly while taking much of the load off the people who are currently reviewing uploads.
Current USC practice requires all files to be installed under /opt; this rule complies with the filesystem hierarchy standard and prevents file conflicts with the rest of the distribution. The problem, according to David Planella (one of the authors of the proposal), is that a lot of things just don't work when installed under /opt:
In other words, the /opt restriction was seen as making life difficult for developers and Ubuntu lacks the resources and will to fix the problems; the restriction has thus been removed in the proposal. With Ubuntu, Debian, and USC packages all installing files into the same directory hierarchy, an eventual conflict seems certain. There has been talk of forcing each USC package to use its own subdirectory under /usr, a solution that, evidently, is easier than /opt, but nothing has been settled as of this writing.
Presumably some solution will be found and something resembling this proposal will eventually be put into place. The result should be a leaner, faster USC that makes it possible to get applications to users quickly. Whether that will lead to the fabled Year of the Linux Desktop remains to be seen. The "app store" model has certainly helped to make other platforms more attractive; if its absence has been one of the big problems for Linux, we should find out fairly soon.
Now, there are many reasons for that: difficulty of publishing is far from the only one. But it would be a subtle error to think that an application not existing for Ubuntu at all means that difficulty of publishing is unimportant. It may be one of the reasons nobody bothered to develop the application in the first place.
openSUSE
As part of the Tumbleweed lifecycle, with the 12.2 release of openSUSE, "the openSUSE:Tumbleweed repo is now empty so that you can start out with a 'clean' 12.2 release. It will stay that way for a few weeks for things to settle down with 12.2, and then will start to add packages back to it (new kernel, KDE 4.9, etc.) as time permits."
Ubuntu family
"Despite our best intentions and the Ubuntu App Review Board's epic efforts, we're currently putting a strain on reviewers (who cannot keep up with the incoming stream of apps) and providing an unsatisfactory experience for app authors (who have to endure long delays to be able to upload their apps)." In response, the Ubuntu developers have come up with a detailed proposal for a new process; comments are sought. "We should not rely on manual reviews of software before inclusion. Manual reviews have been found to cause a significant bottleneck in the MyApps queue and they won't scale effectively as we grow and open up Ubuntu to thousands of apps."
Newsletters and articles of interest
Page editor: Rebecca Sobol
GStreamer is a framework designed for application development, but the memory and processing demands of multimedia mean that it leans heavily on the support of the operating system's underlying media layers. At the 2012 GStreamer Conference, representatives from Video4Linux, ALSA, and Wayland were on hand to report on recent developments and ongoing work in the world of Linux media capture, sound, and display technology.
Hans Verkuil presented a session on the Video4Linux (V4L) subsystem, which primarily handles video input, along with related matters. The major change in the V4L arena, he said, has been the emergence of the system-on-chip (SoC). In the desktop paradigm of years past, V4L had relatively simple hardware to deal with: video capture cards and webcams, the majority of which had similar capabilities. SoCs are markedly different: many include discrete components like hardware decoders and video scalers, and the system provides a flexible AV pipeline, with multiple ways to route data through the on-board components depending on the processing needed.
Initially most SoC vendors wrote their own, proprietary modules to make up for the features V4L lacked, he said, but V4L has caught up. The core framework now includes a v4l2_subdev structure to communicate with sub-devices like decoders and scalers. Although these devices can vary from board to board in theory, he said, in practice most vendors tend to stick with the same parts over many hardware generations. There is also a new Media Controller API to manage multi-function devices (including USB webcams with an integrated microphone, in addition to the flexible SoC routing mentioned above), and the 3.1 kernel introduced a new control framework that provides a consistent interface for brightness, contrast, frame rate, and other settings.
V4L's roots were in the standard-definition era, so the project has also struggled to make life easier for HDTV users. The initial attempt was the Presets API in kernel 2.6.33, which provided fixed settings for video in a handful of HDTV formats (720p30, 1080p60, etc.). That API eventually proved too coarse for vendors, and was replaced in kernel 3.5 with the Timings API, which allows custom modeline-like video settings. The Event API is another recent addition, significantly improved in 3.1, which allows code to subscribe to immediate notification on events like the connection or disconnection of an input port.
The videobuf2 framework is another major overhaul; the previous incarnation of the framework (which provides an abstraction layer between applications and video device drivers) did not conform to V4L's own API, and its memory management was so flawed that most drivers did not even use it. The new framework separates buffer operations from memory management operations and, by removing the need for each driver to implement its own memory management, should simplify device driver code significantly.
Other noteworthy changes include support for the H.264 codec, new input cropping controls, and the long-awaited ability for radio tuners to tune multiple frequency bands (such as FM and AM). Radio Data System (RDS) support has also been upgraded, and now includes the Traffic Message Channel (TMC) coding used in many urban areas. Cisco hired a student for the summer to write a new RDS library to replace the older, broken one. Finally, a contiguous memory allocator was written by Samsung and others for kernel 3.5, which helps video hardware allocate the large chunks of physically contiguous memory it needs for direct memory access.
There is further work still in the pipeline, of course, and Verkuil mentioned three topics of importance to GStreamer. The first is buffer sharing; video decoding pipelines would prefer to avoid copying large buffers whenever possible, but currently V4L's video buffers are specific to an individual video node. Integrating V4L with DMAbuf is probably the solution, he said, and is likely to arrive in kernel 3.8. The second is better support for newer video connector types like HDMI and DisplayPort — in particular hot-pluggability and signal detection, for use by embedded devices that need to set up these connections without user intervention. Finally, he hopes to complete a V4L compliance testing tool, which he described as 90% finished. The tool is used to test device drivers against the API, and drivers are required to pass before they get into the kernel. Verkuil said that the tool is actually stricter than the published API, because it checks for a number of optional features which are easy to implement, and can annoy users if they are left out.
Takashi Iwai presented an update on the ALSA subsystem. In recent years, ALSA has not seen as many major changes as the various video subsystems have, but there are still plenty of challenges. The first is that, like video, more and more hardware devices now support decoding compressed audio in hardware. Kernel 3.3 added an API for offloading audio decoding to a hardware device, though the bigger improvement is likely to be kernel 3.7's merger of compressed audio hardware decoding for the ALSA System on Chip (ASoC) layer.
ASoC accounts for the majority of ALSA code (both in terms of lines and number of commits), Iwai said, followed by the HD-audio layer used in the majority of modern laptops. The third-largest component is USB-audio, which provides a single generic driver used by all USB audio devices. But while USB devices can share a common driver, the HD-audio layer covers roughly 4000 devices, each of which has a different configuration (in regard to which pin performs which function). It is not possible for the ALSA project to maintain and update 4000 separate configuration files, he said, so it instead relies on user reports to discover differences between hardware. That is a pain point, but hardware vendors tend to reuse the same pin configurations, so most devices work out of the box.
Ongoing work in ALSA includes the Use Case Manager (UCM) abstraction layer, a high-level device management layer that describes hardware routing and configuration for common tasks like "phone call" or "music playback." Jack detection is another continuing development. Currently there is no API to detect whether or not a connector has a jack plugged in, so multiple methods are in use, including Android's external connector class extcon and ALSA's general controls API.
Also still in the works is improved power management, both for HD-audio devices and for hardware decoders. Improvements are expected to land with kernel 3.7. HD-audio devices might also benefit from the ability to "patch" device firmware and change the pin configuration, so that recompiling the driver can be avoided.
The biggest outstanding issue at present is a channel mapping API, which encodes the surround-sound position associated with the speaker attached to each output channel (e.g., Front Left, Center, Right Rear, Low-Frequency Effects). Each channel needs to receive its own PCM audio stream, but there are multiple standards on the market, and the problem becomes even trickier when the system needs to combine channels for a setup with fewer speakers. There is a proposal in the works, which was discussed at length later in the week at the Linux Plumbers Conference audio mini-summit.
Kristian Høgsberg presented an update on the Wayland display protocol and how it will differ from X. The session was not overly GStreamer-specific, but more of an introduction to Wayland. Since Wayland is not being used in the wild yet, preparing GStreamer developers in advance should simplify the eventual transition.
Høgsberg related the reasons for Wayland's creation — namely that as separate window managers and compositors have become the norm on Linux desktops, the X server itself is increasingly doing little but acting as a middleman. Many of the earlier functions of the X server have been moved out into separate libraries, such as Freetype, Fontconfig, Qt, and GTK+. Other key functions, such as mode-setting and input devices, are handled at lower levels, and many applications use Cairo or OpenGL to paint their window contents. Compositing was the final blow, however: in a compositing desktop, each window gets a private buffer of its own, which is drawn to the screen by the compositor. In this situation, X does nothing but add cost: another copy operation for the buffer, and more memory.
He described the basics of the Wayland protocol, which he said he expected to reach 1.0 status before the end of the year. That event will not mark Wayland's world domination, however. Weston, the reference compositor, already runs on most video hardware, but the major desktop projects and distributions will each implement their own Wayland support in their existing compositors (e.g., Mutter or KWin), and that is when the majority of users will first encounter Wayland.
The more practical section of the talk followed, an explanation of how Wayland handles video content. An application allocates a pixel buffer and shares it with the compositor; the compositor then attaches the buffer to an output "surface." Whenever a new frame is drawn to the screen, the compositor sends a notification to the application, which can then send the next frame. The big difference is that Wayland always works with complete frames. In contrast, X is fundamentally a stream protocol: it sends a series of events that must be de-queued and processed.
Video support is really only a matter of extending the color spaces that Wayland understands, he said. A video buffer may contain YUV data, for example. Wayland needs to be able to put YUV data into a rendering surface, and to composite RGB and YUV data together (such as in a video overlay).
This is still a work in progress, with a variety of options under consideration. One would allow only RGB buffers, and require client applications to handle the conversion, which could be costly in CPU usage. Another is to decode the frames directly into OpenGL textures and let OpenGL worry about the conversions. A third is to allocate shared-memory YUV buffers, then have the compositor copy them into OpenGL textures and perform the conversion at composite time. The entire puzzle is further complicated when one adds in the possibility of hardware-decoded video content, which is increasingly common. If the possibilities sound a tad confusing, do not worry: Høgsberg said the project itself is still unsure which approach will be best.
GStreamer's video acceleration API (VA-API) plugin already supports Wayland, so whichever path Wayland takes as it finalizes 1.0, GStreamer support should follow in short order. Of course, GStreamer itself is also preparing for its 1.0 release. But as the Wayland, ALSA, and Video4Linux talks demonstrate, multimedia support on Linux is in an ever-changing state.
Newsletters and articles
Page editor: Nathan Willis
Brief items

"The Document Foundation will primarily focus on the ODF Technical Committees, to represent the largest independent free software community focused on the development and the promotion of "the best free office suite" based on the Open Document Format. LibreOffice is available in over 100 native language versions, more than twice than any comparable software, and is therefore the most sophisticated, feature rich, complete and widespread ODF implementation worldwide."
Articles of interest
Calls for Presentations

"The call for papers is public, meaning that all proposals get published on the website for others to vote and comment on. This approach allows the organizers to pick subjects that have most interest in the community. The comments are only visible to speakers and organizers to avoid influencing the votes."
Upcoming Events

"LPI will also present on the subject of "Catching the Wave of Open Source Careers" during OLF's Career Track."
|DjangoCon US||Washington, DC, USA|
|Hardening Server Indonesia Linux Conference 2012||Malang, Indonesia|
|International Conference on Open Source Systems||Hammamet, Tunisia|
|Debian FTPMaster sprint||Fulda, Germany|
|Debian Bug Squashing Party||Berlin, Germany|
|KPLI Meeting Indonesia Linux Conference 2012||Malang, Indonesia|
|PyTexas 2012||College Station, TX, USA|
|Bitcoin Conference||London, UK|
|SNIA Storage Developers' Conference||Santa Clara, CA, USA|
|Postgres Open||Chicago, IL, USA|
|SUSECon||Orlando, Florida, US|
|Automotive Linux Summit 2012||Gaydon/Warwickshire, UK|
|2012 X.Org Developer Conference||Nürnberg, Germany|
|openSUSE Summit||Orlando, FL, USA|
|September 21||Kernel Recipes||Paris, France|
|GNU Radio Conference||Atlanta, USA|
|OpenCms Days||Cologne, Germany|
|PuppetConf||San Francisco, US|
|PyCon UK 2012||Coventry, West Midlands, UK|
|September 28||LPI Forum||Warsaw, Poland|
|Ohio LinuxFest 2012||Columbus, OH, USA|
|PyCon India 2012||Bengaluru, India|
|Velocity Europe||London, England|
|PyCon South Africa 2012||Cape Town, South Africa|
|GNOME Boston Summit 2012||Cambridge, MA, USA|
|Korea Linux Forum 2012||Seoul, South Korea|
|Open Source Developer's Conference / France||Paris, France|
|Debian Bug Squashing Party in Utrecht||Utrecht, Netherlands|
|October 13||2012 Columbus Code Camp||Columbus, OH, USA|
|Debian BSP in Alcester (Warwickshire, UK)||Alcester, Warwickshire, UK|
|PyCon Ireland 2012||Dublin, Ireland|
|FUDCon:Paris 2012||Paris, France|
|OpenStack Summit||San Diego, CA, USA|
|Linux Driver Verification Workshop||Amirandes, Heraklion, Crete|
|LibreOffice Conference||Berlin, Germany|
|MonkeySpace||Boston, MA, USA|
|14th Real Time Linux Workshop||Chapel Hill, NC, USA|
|PyCon Ukraine 2012||Kyiv, Ukraine|
|Gentoo miniconf||Prague, Czech Republic|
|PyCarolinas 2012||Chapel Hill, NC, USA|
|LinuxDays||Prague, Czech Republic|
|openSUSE Conference 2012||Prague, Czech Republic|
|PyCon Finland 2012||Espoo, Finland|
|PostgreSQL Conference Europe||Prague, Czech Republic|
|Droidcon London||London, UK|
|PyData NYC 2012||New York City, NY, USA|
|Firebird Conference 2012||Luxembourg, Luxembourg|
|October 27||pyArkansas 2012||Conway, AR, USA|
|October 27||Central PA Open Source Conference||Harrisburg, PA, USA|
|October 27||Linux Day 2012||Hundreds of cities, Italy|
|Technical Dutch Open Source Event||Eindhoven, Netherlands|
|Ubuntu Developer Summit - R||Copenhagen, Denmark|
|Linaro Connect||Copenhagen, Denmark|
|PyCon DE 2012||Leipzig, Germany|
|October 30||Ubuntu Enterprise Summit||Copenhagen, Denmark|
|MeetBSD California 2012||Sunnyvale, California, USA|
|OpenFest 2012||Sofia, Bulgaria|
|Embedded Linux Conference Europe||Barcelona, Spain|
|LinuxCon Europe||Barcelona, Spain|
|ApacheCon Europe 2012||Sinsheim, Germany|
|Apache OpenOffice Conference-Within-a-Conference||Sinsheim, Germany|
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds