LWN.net Weekly Edition for May 5, 2016
In search of a home for Thunderbird
After nearly a decade of trying, Mozilla is finally moving forward with formally spinning off ownership of the Thunderbird email client to a third party. The identity of the new owner is still up for debate; Simon Phipps prepared a report [PDF] analyzing several possible options. But Mozilla does seem intent on divesting itself of the project for real this time. Whoever does take over Thunderbird development, though, will likely face a considerable technical challenge, since much of the application is built on frameworks and components that Mozilla will soon stop developing.
Bird versus fox
To say that Mozilla has had a difficult relationship with Thunderbird would be putting things mildly. The first release was in 2003, with version 1.0 following in late 2004. As early as 2007, though, Mozilla's Mitchell Baker announced that Mozilla wished to rid itself of Thunderbird and find a new home for the project. Instead, Mozilla ended up spinning Thunderbird off into a distinct unit (Mozilla Messaging) under the Mozilla Foundation umbrella. It then reabsorbed that unit in 2011, with Baker noting:
But, in July 2012, Mozilla began pulling paid developers from Thunderbird and left its development primarily in the hands of community volunteers, with a few Mozilla employees performing QA and build duties to support the Extended Support Release (ESR) program. At the time, Baker offered this justification:
By 2014, Mozilla had ramped down its involvement to the point where the Thunderbird team lacked any clear leadership, so the developer community voted to establish a Thunderbird Council made up of volunteers.
Most recently, Baker announced in December 2015 that Thunderbird would be formally separated from Mozilla. Phipps was engaged to research the options that he later published in the aforementioned report. In April 2016, Gervase Markham announced that the search for a new home for the project was underway, with Phipps's recommendations serving as a guide.
Lizard tech
For fans of Thunderbird, the repeated back-and-forth from Mozilla leadership can be a source of frustration on its own, but it probably does not help that Mozilla has started multiple other non-browser projects (such as ChatZilla, Raindrop, Grendel, and Firefox Hello) over the years while insisting that Thunderbird was a distraction from Firefox. Although it might seem like Mozilla management displays an inconsistent attitude toward messaging and other non-web application projects, each call for Mozilla to rid itself of Thunderbird has also highlighted the difficulty of maintaining Thunderbird and Firefox in the same engineering and release infrastructure.
In recent years, due in no small part to pressure coming from the rapid release schedule of Google's Chrome, the Firefox development process has shifted considerably. There are new stable releases made approximately every six weeks, and development builds are provided for the next two releases in separate release channels.
In addition, the Firefox codebase itself is changing. The XUL and XPCOM frameworks are on their way out, to be replaced with components and add-ons written in JavaScript. The Gecko rendering engine is also marked for replacement by Servo, and the entire Firefox architecture may be replaced with the multi-process Electrolysis model.
While these changes are exciting news for Firefox, none of them have made their way into Thunderbird. In April, Mozilla's Mark Surman highlighted the divergence issue in a blog post, noting:
Surman also pointed to a new job listing posted by Mozilla for a contractor who would oversee the transition. The posting describes two key responsibilities: to list all significant technical issues facing Thunderbird (including impact assessments) and to compile an outline of the options available to address those issues to move Thunderbird forward.
Former Mozilla developer Daniel Glazman responded to Surman's post on his own blog, with a more blunt assessment of the technical challenges facing Thunderbird developers. He pointed to the job posting's mention of XUL and XPCOM deprecation and said:
- rewrite the whole UI and the whole JS layer with it
- most probably rewrite the whole SMTP/MIME/POP/IMAP/LDAP/... layer
- most probably have a new Add-on layer or, far worse, no more Add-ons
Glazman concluded that it is too soon to select a new host for the Thunderbird project, given that a decision has yet to be made about how to rewrite the application. Furthermore, he pointed out, Mozilla has not yet begun the transition away from XUL and XPCOM in the Firefox codebase. Only when that process starts, he said, will it be possible to assess the complexity of such a move for Thunderbird.
As far as the build infrastructure goes, Markham sent a proposal to the Thunderbird Council in March suggesting a path forward for separating Thunderbird from the Firefox engineering infrastructure. It did not spawn much discussion, but there did not seem to be any objection either.
Out of the nest
For now, Mozilla seems set on finding a new fiscal and organizational sponsor for Thunderbird, with The Document Foundation and the Software Freedom Conservancy (both highlighted in Phipps's report) currently the leading candidates. But the discussion has only just begun on the technical aspects of maintaining and evolving Thunderbird as a standalone application.
Surman contended that the needs of Firefox and Thunderbird are simply too different today for them to be tied to the same codebase and release process. Essentially, the web changes rapidly, while email changes slowly. It is hard to argue with that assertion (setting aside discussions of how email should change), but Thunderbird fans might contend that Mozilla not contributing developer time to the Thunderbird codebase only exacerbates any inherent difference between the browser and email client.
Whether one thinks Mozilla has not adequately supported Thunderbird over the years or has done its level best, the Thunderbird and Firefox projects today are moving in different directions. Given their shared history, it may seem sad to watch them part ways, but perhaps the Thunderbird community can make the most of the opportunity and drive the application forward where Mozilla could (or would) not.
Caravel data visualization
One aspect of the heavily hyped Internet of Things (IoT) that can easily get overlooked is that each of the Things one hooks up to the Internet invariably spews out a nearly non-stop stream of data. While commercial IoT users—such as utility companies—generally have a well-established grasp of what data interests them and how to process it, the DIY crowd is better served by flexible tools that make exploring and transforming data easy. Airbnb maintains an open-source Python utility called Caravel that provides such tools. There are many alternatives, of course, but Caravel does a good job of ingesting data and smoothly molding it into nice-looking interactive graphs—with a few exceptions.
My own interest in data-visualization tools stems from IoT projects (namely home-automation and automotive sensors), but Caravel itself is in no way limited to such uses. Like most contemporary web-based service providers, Airbnb collects a lot of data about its users and their transactions (in this case, short-term housing rentals, renters, and property owners). The company also prides itself on having a slick-looking web interface, and Caravel reflects that: it sports modern charts and graphs—no crusty old PNGs with jagged lines generated by Graphviz here; everything is done in JavaScript.
In a nutshell, what Caravel provides is a connection layer supporting a variety of database types, the tools to configure the metrics of interest for any tables one wishes to explore, and an interactive utility for creating data visualizations. Several dozen visualization options are built in, and all of the charts the user creates can be saved and put into convenient "dashboards" for regular usage.
On top of all that, Caravel's interface is web-based and is almost entirely point-and-click. Perhaps the closest parallel would be to a tool like Orange, where the goal is to mask over the complexities of SQL and statistics. Caravel does not quite walk the user through adding new data sources or defining metrics, but it does take care of as many of the repetitive steps as it can.
For example, when you add a database table to your Caravel work space, there are rows of checkboxes by every field. If you want to track the minima, maxima, or sums for certain fields, you check them at load time, and those metrics are automatically available on the relevant pages of the application from then on. Similar checkboxes are available for selecting which fields should be used as categorical groups and which should be available for filtering the data set.
The first public release of Caravel was in September 2015. The most recent is version 0.8.9, from April 2016. The code is hosted at GitHub and packages are also available on the Python Package Index (PyPI). For the moment, only Python 2.7 is supported. On Linux, installation also requires the development packages for libssl and libffi. When Caravel is installed, one only needs to initialize the database and create an administrator account to get started.
A Caravel instance is multi-user, and the system supports an array of permissions and access controls. For testing, though, that is not necessary. Out of the box, the system provides a local web UI and comes pre-loaded with a demo data set. SQLite support is built in, and any other database (local or remote) with SQLAlchemy support can be used as well. Druid database clusters are also supported, and users can define a custom schema for any database that requires one. For those working with large data sets, the good news is that Caravel also supports a number of open-source caching layers, although none of them are required. All of these configuration options are presented in the web UI's "add a database" screen.
The bird's-eye view of Caravel usage is that the user adds a new database, then selects and adds each table of interest. From then on, working with Caravel is a matter of using the visualization builder to home in on a chart or graph that presents some meaningful information. The visualizations include everything from line charts to bubble graphs, box plots to directed graphs, and heatmaps to Sankey diagrams. There are also less scientific options, such as word clouds.
A visualization can be saved as a "slice," and any number of slices can be collected onto the same page as a "dashboard." Dashboards are updated regularly as the database is refreshed, so they can be deployed for internal or public consumption. Finally, although dashboard graphics are interactive JavaScript (with additional information shown where the mouse hovers), all charts and graphs can also be exported as image files.
This set of features is fairly complete, but one might well ask whether the implementation is up to snuff. For the most part, the answer to that question is yes.
Adding new databases and choosing which tables to use borders on trivial, thanks to the well-optimized add-and-edit pages. There are a few caveats, such as the fact that the user cannot simply add all of the tables of interest from a database at once—each table requires a separate round trip through the "add a table" page. And when Caravel does not like something about a table, it is hard to debug.
For example, Caravel includes special treatment for time-series data; the user can mark any field in a table as being of the datetime type and it will be automatically plugged into various time-series charts in the visualization tool. But Caravel could not make sense of the timestamps in one example data set I downloaded from datahub.io, and there is no easy way to inspect the data directly, nor does there seem to be any way in the UI to transform the timestamps into an acceptable datetime format, or even to see what Caravel thinks is wrong with them.
Clearly, this issue falls under a "you must know your data" warning, which is a fair expectation. But the error reporting that Caravel presents yanks the user right out of the UI, displaying a generic, low-level exception warning and a traceback from the Python interpreter.
And this sometimes happens through no fault of the user, such as when the user selects a new graph type from the drop-down menu in the visualization builder and the newly selected graph takes a different number of parameters. By and large, the visualization tool is quite handy—the point-and-click settings and controls are not merely a coat of "UI paint" on top; they help the user play around with their data sets to find the visualization settings that work best. Thus, it is all the more disappointing when that friendly interactivity breaks down.
There are a couple of troubling technical limitations to mention. First, users must construct any new metrics of interest (other than sample counts, sums, and minima/maxima) by entering raw SQL expressions. Some additional statistical tools would be handy. Perhaps more fundamental is the fact that Caravel cannot join or query multiple tables; all of the visualizations are therefore limited to what information one can extract from a single table.
It might be interesting to pair Caravel with a tool like OpenRefine that specializes in data transformation, but I suspect that for a great many users, what Caravel can do already will serve them well. It handles the database connectivity in the background, putting the emphasis on exploring and manipulating visualizations. The visualizations and dashboards it provides are top-notch by modern standards, but the fact that they are easy for the user to create is Caravel's real advantage.
Security
Replacing /dev/urandom
The kernel's random-number generator (RNG) has seen a great deal of attention over the years; that is appropriate, given that its proper functioning is vital to the security of the system as a whole. During that time, it has acquitted itself well. That said, there are some concerns about the RNG going forward that have led to various patches aimed at improving both randomness and performance. Now there are two patch sets to consider that significantly change the RNG's operation.
The first of these comes from Stephan Müller, who has two independent sets of concerns that he is trying to address:
- The randomness (entropy) in the RNG, in the end, comes from sources of physical entropy in the outside world. In practice, that means the timing of disk-drive operations, human-input events, and interrupts in general. But the solid-state drives deployed in current systems are far more predictable than rotating drives, many systems are deployed in settings where there are no human-input events at all, and, in any case, the entropy gained from those events duplicates the entropy from interrupts in general. The end result, Stephan fears, is that the current RNG is unable to pick up enough entropy to be truly random, especially early in the bootstrap process.
- The RNG has shown some scalability problems on large NUMA systems, especially when faced with workloads that consume large amounts of random data from the kernel. There have been various attempts to improve RNG scalability over the last year, but none have been merged to this point.
Stephan tries to address both problems by throwing out much of the current RNG and replacing it with "a new approach"; see this page for a highly detailed explanation of the goals and implementation of this patch set. It starts by trying to increase the amount of useful entropy that can be obtained from the environment, and from interrupt timing in particular. The current RNG assumes that the timing of a specific interrupt carries little entropy — less than one bit. Stephan's patch, instead, credits a full bit of entropy for each interrupt. Thus, in a sense, this is an accounting change: there is no more entropy flowing into the system than before, but it is being recognized at a higher rate, allowing early-boot users of random data to proceed.
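In rough outline, the difference looks like this (the helper names are invented for this sketch; neither implementation is structured exactly this way):

static void current_rng_interrupt(int irq)
{
        mix_into_pool(irq, random_get_entropy());
        /* Credit the pool with one bit for roughly every 64 interrupts. */
        if (++interrupt_count % 64 == 0)
                credit_entropy_bits(1);
}

static void lrng_interrupt(int irq)
{
        mix_into_pool(irq, random_get_entropy());
        /* Credit a full bit for every interrupt. */
        credit_entropy_bits(1);
}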
Other sources of entropy are used as well when they are available; these include a hardware RNG attached to the system or built into the CPU itself (though little entropy is credited for the latter source). Earlier versions of the patch used the CPU jitter RNG (also implemented by Stephan) as another source of entropy, but that was removed at the request of RNG maintainer Ted Ts'o, who is not convinced that differences in execution time are a trustworthy source of entropy.
The hope is that interrupt timings, when added to whatever other sources of entropy are available, will be sufficient to quickly fill the entropy pool and allow the generation of truly random numbers. As with current systems, data read from /dev/random will remove entropy directly from that pool and will not complete until sufficient entropy accumulates there to satisfy the request. The actual random numbers are generated by running data from the entropy pool through the SP800-90A deterministic random bit generator (DRBG).
For /dev/urandom, another SP800-90A DRBG is fed from the primary DRBG described above and used to generate pseudo-random data. Every so often (ten minutes at the outset), this secondary generator is reseeded from the primary. On NUMA systems, there is one secondary generator for each node, keeping the random-data generation node-local and increasing scalability.
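The resulting arrangement might be sketched as follows; the names are illustrative rather than taken from the patch set:

struct drbg {
        unsigned long last_seeded;              /* jiffies at the last (re)seed */
        /* ... SP800-90A DRBG state ... */
};

static struct drbg primary;                     /* fed from the entropy pool */
static struct drbg secondary[MAX_NUMNODES];     /* one instance per NUMA node */

static void urandom_generate(u8 *buf, size_t len)
{
        struct drbg *drbg = &secondary[numa_node_id()];

        /* Reseed this node's generator from the primary every ten minutes. */
        if (time_after(jiffies, drbg->last_seeded + 600 * HZ))
                drbg_reseed(drbg, &primary);    /* also updates last_seeded */

        drbg_generate(drbg, buf, len);          /* produce pseudo-random data */
}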
There has been a certain amount of discussion of Stephan's proposal, which is now in its third iteration, but Ted has said little beyond questioning the use of the CPU jitter technique. Or, at least, that was true until May 2, when he posted a new RNG of his own. Ted's work takes some clear inspiration from Stephan's patches (and from Andi Kleen's scalability work from last year) but it is, nonetheless, a different approach.
Ted's patch, too, gets rid of the separate entropy pool for /dev/urandom; this time, though, it is replaced by the ChaCha20 stream cipher seeded from the random pool. ChaCha20 is deemed to be secure and, it is thought, will perform better than SP800-90A. There is one ChaCha20 instance for each NUMA node, again, hopefully, helping to improve the scalability of the RNG (though Ted makes it clear that he sees this effort as being beyond the call of duty). There is no longer any attempt to track the amount of entropy stored in the (no-longer-existing) /dev/urandom pool, but each ChaCha20 instance is reseeded every five minutes.
When the system is booting, the new RNG will credit each interrupt's timing data with one bit of entropy, as does Stephan's RNG. Once the RNG is initialized with sufficient entropy, though, the RNG switches to the current system, which accounts far less entropy for each interrupt. This policy reflects Ted's unease with assuming that there is much entropy in interrupt timings; the timing of interrupts might be more predictable than one might think, especially on virtualized systems with no direct connection to real hardware.
Stephan's response to this posting has been gracious: "In general, I have no concerns with this approach either. And thank you that some of my concerns are addressed."
That, along with the fact that Ted is the ultimate decision-maker in this case, suggests that his patch set is the one that is more likely to make it into the mainline. It would be surprising to see that merging happen for 4.7 — something as sensitive as the RNG needs some review and testing time — but it could happen not too long thereafter.
Brief items
Security quotes of the week
Linux Kernel BPF JIT Spraying (grsecurity forums)
Over at the grsecurity forums, Brad Spengler writes about a recently released proof of concept attack on the kernel using JIT spraying. "What happened next was the hardening of the BPF interpreter in grsecurity to prevent such future abuse: the previously-abused arbitrary read/write from the interpreter was now restricted only to the interpreter buffer itself, and the previous warn on invalid BPF instructions was turned into a BUG() to terminate execution of the exploit. I also then developed GRKERNSEC_KSTACKOVERFLOW which killed off the stack overflow class of vulns on x64. A short time later, there was work being done upstream to extend the use of BPF in the kernel. This new version was called eBPF and it came with a vastly expanded JIT. I immediately saw problems with this new version and noticed that it would be much more difficult to protect -- verification was being done against a writable buffer and then translated into another writable buffer in the extended BPF language. This new language allowed not just arbitrary read and write, but arbitrary function calling." The protections in the grsecurity kernel will thus prevent this attack. In addition, the newly released RAP feature for grsecurity, which targets the elimination of return-oriented programming (ROP) vulnerabilities in the kernel, will also ensure that "the fear of JIT spraying goes away completely", he said.
May Android security bulletin
The Android security bulletin for May is available. It lists 40 different CVE numbers addressed by the May over-the-air update; the bulk of those are at a severity level of "high" or above. "Partners were notified about the issues described in the bulletin on April 04, 2016 or earlier. Source code patches for these issues will be released to the Android Open Source Project (AOSP) repository over the next 48 hours. We will revise this bulletin with the AOSP links when they are available. The most severe of these issues is a Critical security vulnerability that could enable remote code execution on an affected device through multiple methods such as email, web browsing, and MMS when processing media files."
New vulnerabilities
botan: side channel attack
Package(s): botan1.10
CVE #(s): CVE-2015-7827
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Debian advisory:
Use constant time PKCS #1 unpadding to avoid possible side channel attack against RSA decryption.
botan: insufficient randomness
Package(s): botan1.10
CVE #(s): CVE-2014-9742
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Debian LTS advisory:
A bug in Miller-Rabin primality testing was responsible for insufficient randomness.
chromium-browser: multiple vulnerabilities
Package(s): chromium-browser
CVE #(s): CVE-2016-1660 CVE-2016-1661 CVE-2016-1662 CVE-2016-1663 CVE-2016-1664 CVE-2016-1665 CVE-2016-1666
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Red Hat advisory:
Multiple flaws were found in the processing of malformed web content. A web page containing malicious content could cause Chromium to crash, execute arbitrary code, or disclose sensitive information when visited by the victim.
i7z: denial of service
Package(s): i7z
CVE #(s):
Created: April 29, 2016
Updated: May 4, 2016
Description: From the Fedora advisory:
i7z-gui: Print_Information_Processor(): i7z_GUI killed by SIGSEGV
java: three vulnerabilities
Package(s): java-1.6.0-ibm
CVE #(s): CVE-2016-0264 CVE-2016-0363 CVE-2016-0376
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Red Hat advisory:
CVE-2016-0264 IBM JDK: buffer overflow vulnerability in the IBM JVM
CVE-2016-0363 IBM JDK: insecure use of invoke method in CORBA component, incorrect CVE-2013-3009 fix
CVE-2016-0376 IBM JDK: insecure deserialization in CORBA, incorrect CVE-2013-5456 fix
jq: two vulnerabilities
Package(s): jq
CVE #(s): CVE-2015-8863 CVE-2016-4074
Created: May 4, 2016
Updated: December 8, 2016
Description: From the openSUSE bug report:
CVE-2015-8863: heap buffer overflow in tokenadd() function
http://seclists.org/oss-sec/2016/q2/134
CVE-2016-4074: stack exhaustion using jv_dump_term() function
http://seclists.org/oss-sec/2016/q2/140
kernel: two vulnerabilities
Package(s): kernel
CVE #(s): CVE-2016-3961 CVE-2016-3955
Created: April 28, 2016
Updated: May 4, 2016
Description: From the Xen advisory:
CVE-2016-3961: Huge (2Mb) pages are generally unavailable to PV guests. Since x86 Linux pvops-based kernels are generally multi-purpose, they would normally be built with hugetlbfs support enabled. Use of that functionality by an application in a PV guest would cause an infinite page fault loop, and an OOPS to occur upon an attempt to terminate the hung application. Depending on the guest kernel configuration, the OOPS could result in a kernel crash (guest DoS).
From the Red Hat bugzilla entry:
CVE-2016-3955: A Linux kernel built with USB over IP (CONFIG_USBIP_*) support is vulnerable to a buffer overflow issue. It could occur while receiving USB/IP packets, when the size value in the packet is greater than the actual transfer buffer. A user/process could use this flaw to crash the remote host via kernel memory corruption or potentially execute arbitrary code.
mercurial: code execution
Package(s): mercurial
CVE #(s): CVE-2016-3105
Created: May 3, 2016
Updated: May 18, 2016
Description: From the Slackware advisory:
This update fixes possible arbitrary code execution when converting Git repos. Mercurial prior to 3.8 allowed arbitrary code execution when using the convert extension on Git repos with hostile names. This could affect automated code conversion services that allow arbitrary repository names. This is a further side-effect of Git CVE-2015-7545. Reported and fixed by Blake Burkhart.
minissdpd: denial of service
Package(s): minissdpd
CVE #(s): CVE-2016-3178 CVE-2016-3179
Created: May 4, 2016
Updated: May 4, 2016
Description: From the Debian LTS advisory:
The minissdpd daemon contains an improper validation of array index vulnerability (CWE-129) when processing requests sent to the Unix socket at /var/run/minissdpd.sock. The Unix socket can be accessed by an unprivileged user to send an invalid request that causes an out-of-bounds memory access, crashing the minissdpd daemon.
ntp: multiple vulnerabilities
Package(s): ntp
CVE #(s): CVE-2015-8139 CVE-2015-8140
Created: April 29, 2016
Updated: May 4, 2016
Description: From the SUSE bug reports:
CVE-2015-8139: To prevent off-path attackers from impersonating legitimate peers, clients require that the origin timestamp in a received response packet match the transmit timestamp from its last request to a given peer. Under the assumption that only the recipient of the request packet will know the value of the transmit timestamp, this prevents an attacker from forging replies.
CVE-2015-8140: The ntpq protocol is vulnerable to replay attacks. The sequence number being included under the signature fails to prevent replay attacks for two reasons. Commands that don't require authentication can be used to move the sequence number forward, and NTP doesn't actually care what sequence number is used, so a packet can be replayed at any time. If, for example, an attacker can intercept an authenticated reconfiguration command that tells ntpd to connect with a server that turns out to be malicious, and a subsequent reconfiguration directive removed that malicious server, the attacker could replay the configuration command to re-establish an association to the malicious server.
ntp: multiple vulnerabilities
Package(s): ntp
CVE #(s): CVE-2016-1551 CVE-2016-1549 CVE-2016-2516 CVE-2016-2517 CVE-2016-2518 CVE-2016-2519 CVE-2016-1547 CVE-2016-1548 CVE-2016-1550
Created: May 2, 2016
Updated: May 16, 2016
Description: From the Slackware advisory:
CVE-2016-1551: Refclock impersonation vulnerability, AKA: refclock-peering
CVE-2016-1549: Sybil vulnerability: ephemeral association attack, AKA: ntp-sybil - MITIGATION ONLY
CVE-2016-2516: Duplicate IPs on unconfig directives will cause an assertion botch
CVE-2016-2517: Remote configuration trustedkey/requestkey values are not properly validated
CVE-2016-2518: Crafted addpeer with hmode > 7 causes array wraparound with MATCH_ASSOC
CVE-2016-2519: ctl_getitem() return value not always checked
CVE-2016-1547: Validate crypto-NAKs, AKA: nak-dos
CVE-2016-1548: Interleave-pivot - MITIGATION ONLY
CVE-2016-1550: Improve NTP security against buffer comparison timing attacks, AKA: authdecrypt-timing
openssl: multiple vulnerabilities
Package(s): openssl
CVE #(s): CVE-2016-2108 CVE-2016-2107 CVE-2016-2105 CVE-2016-2106 CVE-2016-2109
Created: May 3, 2016
Updated: June 1, 2016
Description: From the Ubuntu advisory:
Huzaifa Sidhpurwala, Hanno Böck, and David Benjamin discovered that OpenSSL incorrectly handled memory when decoding ASN.1 structures. A remote attacker could use this issue to cause OpenSSL to crash, resulting in a denial of service, or possibly execute arbitrary code. (CVE-2016-2108)
Juraj Somorovsky discovered that OpenSSL incorrectly performed padding when the connection uses the AES CBC cipher and the server supports AES-NI. A remote attacker could possibly use this issue to perform a padding oracle attack and decrypt traffic. (CVE-2016-2107)
Guido Vranken discovered that OpenSSL incorrectly handled large amounts of input data to the EVP_EncodeUpdate() function. A remote attacker could use this issue to cause OpenSSL to crash, resulting in a denial of service, or possibly execute arbitrary code. (CVE-2016-2105)
Guido Vranken discovered that OpenSSL incorrectly handled large amounts of input data to the EVP_EncryptUpdate() function. A remote attacker could use this issue to cause OpenSSL to crash, resulting in a denial of service, or possibly execute arbitrary code. (CVE-2016-2106)
Brian Carpenter discovered that OpenSSL incorrectly handled memory when ASN.1 data is read from a BIO. A remote attacker could possibly use this issue to cause memory consumption, resulting in a denial of service. (CVE-2016-2109)
As a security improvement, this update also modifies OpenSSL behaviour to reject DH key sizes below 1024 bits, preventing a possible downgrade attack.
openssl: information leak
Package(s): lib32-openssl openssl
CVE #(s): CVE-2016-2176
Created: May 4, 2016
Updated: May 12, 2016
Description: From the Arch Linux advisory:
ASN1 Strings that are over 1024 bytes can cause an overread in applications using the X509_NAME_oneline() function on EBCDIC systems. This could result in arbitrary stack data being returned in the buffer.
openvas: cross-site scripting
Package(s): openvas
CVE #(s): CVE-2016-1926
Created: May 2, 2016
Updated: May 9, 2016
Description: From the Red Hat bugzilla:
It was reported that openvas-gsa is vulnerable to cross-site scripting due to improper handling of parameters of the get_aggregate command. If the attacker has access to a session token of the browser session, the cross-site scripting can be executed. Affects versions >= 6.0.0 and < 6.0.8.
oxide-qt: code execution
Package(s): oxide-qt
CVE #(s): CVE-2016-1578
Created: April 28, 2016
Updated: May 4, 2016
Description: From the Ubuntu advisory:
A use-after-free was discovered when responding synchronously to permission requests. An attacker could potentially exploit this to cause a denial of service via application crash, or execute arbitrary code with the privileges of the user invoking the program. (CVE-2016-1578)
php: multiple vulnerabilities
Package(s): php
CVE #(s): CVE-2016-4537 CVE-2016-4538 CVE-2016-4539 CVE-2016-4540 CVE-2016-4541 CVE-2016-4542 CVE-2016-4543 CVE-2016-4544
Created: May 2, 2016
Updated: May 19, 2016
Description: The php package has been updated to version 5.6.21, which fixes several security issues and other bugs. See the upstream ChangeLog for more details.
php-ZendFramework: multiple vulnerabilities
Package(s): php-ZendFramework
CVE #(s):
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Mageia advisory:
The php-ZendFramework package has been updated to version 1.12.18 to fix a potential information disclosure and insufficient entropy vulnerability in the word CAPTCHA (ZF2015-09) and several other functions (ZF2016-01).
roundcubemail: three vulnerabilities
Package(s): roundcubemail
CVE #(s): CVE-2015-8864 CVE-2016-4068 CVE-2016-4069
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Red Hat bugzilla:
- Fix XSS issue in SVG images handling (CVE-2015-8864, CVE-2016-4068)
- Protect download urls against CSRF using unique request tokens (CVE-2016-4069)
subversion: multiple vulnerabilities
Package(s): subversion
CVE #(s): CVE-2016-2167 CVE-2016-2168
Created: April 29, 2016
Updated: June 8, 2016
Description: From the Debian advisory:
CVE-2016-2167: Daniel Shahaf and James McCoy discovered that an implementation error in the authentication against the Cyrus SASL library would permit a remote user to specify a realm string which is a prefix of the expected realm string, potentially allowing a user to authenticate using the wrong realm.
CVE-2016-2168: Ivan Zhakov of VisualSVN discovered a remotely triggerable denial of service vulnerability in the mod_authz_svn module during COPY or MOVE authorization checks. An authenticated remote attacker could take advantage of this flaw to cause a denial of service (Subversion server crash) via COPY or MOVE requests with a specially crafted header.
tardiff: two vulnerabilities
Package(s): tardiff
CVE #(s): CVE-2015-0857 CVE-2015-0858
Created: May 2, 2016
Updated: July 28, 2016
Description: From the Debian advisory:
CVE-2015-0857: Rainer Mueller and Florian Weimer discovered that tardiff is prone to shell command injections via shell meta-characters in filenames in tar files or via shell meta-characters in the tar filename itself.
CVE-2015-0858: Florian Weimer discovered that tardiff uses predictable temporary directories for unpacking tarballs. A malicious user can use this flaw to overwrite files with permissions of the user running the tardiff command line tool.
ubuntu-core-launcher: code execution
Package(s): ubuntu-core-launcher
CVE #(s): CVE-2016-1580
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Ubuntu advisory:
Zygmunt Krynicki discovered that ubuntu-core-launcher did not properly sanitize its input and contained a logic error when determining the mountpoint of bind mounts when using snaps on Ubuntu classic systems (eg, traditional desktop and server). If a user were tricked into installing a malicious snap with a crafted snap name, an attacker could perform a delayed attack to steal data or execute code within the security context of another snap. This issue did not affect Ubuntu Core systems.
xen: three vulnerabilities
Package(s): xen
CVE #(s): CVE-2016-4001 CVE-2016-4002 CVE-2016-4037
Created: May 2, 2016
Updated: May 4, 2016
Description: From the Red Hat bugzilla:
CVE-2016-4001: A Qemu emulator built with the Luminary Micro Stellaris Ethernet Controller is vulnerable to a buffer overflow issue. It could occur while receiving network packets in stellaris_enet_receive(), if the guest NIC is configured to accept large (MTU) packets. A remote user/process could use this flaw to crash the Qemu process on a host, resulting in DoS.
CVE-2016-4002: A Qemu emulator built with the MIPSnet controller emulator is vulnerable to a buffer overflow issue. It could occur while receiving network packets in mipsnet_receive(), if the guest NIC is configured to accept large (MTU) packets. A remote user/process could use this flaw to crash Qemu, resulting in DoS, or potentially execute arbitrary code with the privileges of the Qemu process on a host.
CVE-2016-4037: A Qemu emulator built with USB EHCI emulation support is vulnerable to an infinite loop issue. It occurs during communication between the host controller interface (EHCI) and a respective device driver. These two communicate via a split isochronous transfer descriptor list (siTD), and an infinite loop unfolds if there is a closed loop in this list. A privileged user inside the guest could use this flaw to consume excessive CPU cycles and resources on the host.
xerces-j2: denial of service
Package(s): xerces-j2
CVE #(s):
Created: May 4, 2016
Updated: May 24, 2016
Description: From the openSUSE advisory:
bsc#814241: Fixed possible DoS through very long attribute names
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 4.6-rc6, released on May 1. Linus said: "Things continue to be fairly calm, although I'm pretty sure I'll still do an rc7 in this series." As of this prepatch the code name has been changed to "Charred Weasel."
Stable updates: 4.5.3, 4.4.9, and 3.14.68 were released on May 4.
Quotes of the week
Instead, what I'm seeing now is a trend towards forcing existing filesystems to support the requirements and quirks of DAX and pmem, without any focus on pmem native solutions. i.e. I'm hearing "we need major surgery to existing filesystems and block devices to make DAX work" rather than "how do we make this efficient for a pmem native solution rather than being bound to block device semantics"?
Kernel development news
In pursuit of faster futexes
Futexes, the primitives provided by Linux for fast user-space mutex support, have been explored many times in these pages. They have gained various improvements over the years such as priority inheritance and robustness in the face of processes dying. But it appears that there is still at least one thing they lack: a recent patch set from Thomas Gleixner, along with a revised version, aims to correct the unfortunate fact that they just aren't fast enough.
The claim that futexes are fast (as advertised by the "f" in the name) is primarily based on their behavior when there is no contention on any specific futex. Claiming a futex that no other task holds, or releasing a futex that no other task wants, is extremely quick; the entire operation happens in user space with no involvement from the kernel. The claims that futexes are not fast enough, instead, focus on the contended case: waiting for a busy lock, or sending a wakeup while releasing a lock that others are waiting for. These operations must involve calls into the kernel as sleep/wakeup events and communication between different tasks are involved. It is expected that this case won't be as fast as the uncontended case, but hoped that it can be faster than it is, and particularly that the delays caused can be more predictable. The source of the delays has to do with shared state managed by the kernel.
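To see how the uncontended paths avoid the kernel entirely, consider a minimal user-space mutex along the lines of the classic design from Ulrich Drepper's "Futexes Are Tricky" paper. This is an illustrative sketch, not glibc's actual implementation; the futex word holds 0 (unlocked), 1 (locked), or 2 (locked with possible waiters), and the kernel is entered only when there is contention:

#include <linux/futex.h>
#include <stdatomic.h>
#include <sys/syscall.h>
#include <unistd.h>

static long sys_futex(atomic_int *uaddr, int op, int val)
{
        return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

static void futex_lock(atomic_int *f)
{
        int c = 0;

        /* Fast path: uncontended acquisition, no system call. */
        if (atomic_compare_exchange_strong(f, &c, 1))
                return;
        /* Slow path: mark the lock contended and sleep in the kernel. */
        if (c != 2)
                c = atomic_exchange(f, 2);
        while (c != 0) {
                sys_futex(f, FUTEX_WAIT, 2);
                c = atomic_exchange(f, 2);
        }
}

static void futex_unlock(atomic_int *f)
{
        /* Fast path: the old value was 1, so nobody is waiting. */
        if (atomic_fetch_sub(f, 1) != 1) {
                /* Slow path: there may be waiters; wake one of them. */
                atomic_store(f, 0);
                sys_futex(f, FUTEX_WAKE, 1);
        }
}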
Futex state: not everything is a file (descriptor)
Traditionally in Unix, most of the state owned by a process is represented by file descriptors, with memory mappings being the main exception. Uniformly using file descriptors provides a number of benefits: the kernel can find the state using a simple array lookup, the file descriptor limit stops processes from inappropriately overusing memory, state can easily be released with a simple close(), and everything can be cleaned up nicely when the process exits.
Futexes do not make use of file descriptors (for general state management) so none of these benefits apply. They use such a tiny amount of kernel space, and then only transiently, that it could be argued that the lack of file descriptors is not a problem. Or at least it could until the discussion around Gleixner's first patch set, where exactly this set of benefits was found to be wanting. While this first attempt has since been discarded in favor of a much simpler approach, exploring it further serves to highlight the key issues and shows what a complete solution might look like.
If we leave priority inheritance out of the picture for simplicity, there are two data structures that the kernel uses to support futexes. struct futex_q is used to represent a task that is waiting on a futex. There is at most one of these for each task and it is only needed while the task is waiting in the futex(FUTEX_WAIT) system call, so it is easily allocated on the stack. This piece of kernel state doesn't require any extra management.
The second data structure is a fixed-size hash table comprising an array of struct futex_hash_bucket; it looks something like this:
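/* Slightly simplified from kernel/futex.c, circa 4.6: */
struct futex_hash_bucket {
        atomic_t waiters;                 /* quick check for the presence of waiters */
        spinlock_t lock;                  /* protects the chain below */
        struct plist_head chain;          /* the futex_q entries of waiting tasks */
} ____cacheline_aligned_in_smp;

static struct futex_hash_bucket *futex_queues;    /* the global hash table */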
Each bucket has a linked list of futex_q structures representing waiting tasks together with a spinlock and some accounting to manage that list. When a FUTEX_WAIT or FUTEX_WAKE request is made, the address of the futex in question is hashed and the resulting number is used to choose a bucket in the hash table, either to attach the waiting task or to find tasks to wake up.
The performance issues arise when accessing this hash table, and they are issues that would not affect access in a file-descriptor model. First, the "address" of a futex can, in the general case, be either an offset in memory or an offset in a file and, to ensure that the correct calculation is made, the fairly busy mmap_sem semaphore must be claimed. A more significant motivation for the current patches is that a single bucket can be shared by multiple futexes. This makes the process of walking the linked list of futex_q structures to find tasks to wake up non-deterministic since the length could vary depending on the extent of sharing. For realtime workloads determinism is important; those loads would benefit from the hash buckets not being shared.
The single hash table causes a different sort of performance problem that affects NUMA machines. Due to their non-uniform memory architecture, some memory is much faster to access than other memory. Linux normally allocates memory resources needed by a task from the memory that is closest to the CPU that the task is running on, so as to make use of the faster access. Since the hash table is all at one location, the memory will probably be slow for most tasks.
Gleixner, the realtime tree maintainer, reported that these problems can be measured and that in real-world applications the hash collisions "cause performance or determinism issues". This is not a particularly new observation: Darren Hart reported in a summary of the state of futexes in 2009 that "the futex hash table is shared across all processes and is protected by spinlocks which can lead to real overhead, especially on large systems." What does seem to be new is that Gleixner has a proposal to fix the problems.
Buckets get allocated instead of shared
The core of Gleixner's initial proposal was to replace use of the global table of buckets, shared by all futexes, with dynamically allocated buckets — one for each futex. This was an opt-in change: a task needed to explicitly request an attached futex to get one that has its own private bucket in which waiting tasks are queued.
If we return to the file descriptor model mentioned earlier, kernel state is usually attached via some system call like open(), socket(), or pipe(). These calls create a data structure — a struct file — and return a file descriptor, private to the process, that can be used to access it. Often there will be a common namespace so that two processes can access the same thing: a shared inode might be found by name and referenced by two private files each accessed through file descriptors.
Each of these ideas is present in Gleixner's implementation, though with different names. In place of a file descriptor there is the task-local address of the futex, which is purely a memory address, never a file offset. It is hashed for use as a key to a new per-task hash table — the futex_cache. In place of the struct file, the hash table has a futex_cache_slot that contains information about the futex. Unlike most hash tables in the kernel, this one doesn't allow chaining: if a potential collision is ever detected, the size of the hash table is doubled.
In place of the shared inode, attached futexes have a shared futex_state structure that contains the global bucket for that futex. Finally, to serve as a namespace, the existing global hash table is used. Each futex_state contains a futex_q that can be linked into that table.
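Put together, the lookup path might be sketched like this (an illustrative sketch following the description above; the actual patch's definitions differ in detail):

struct futex_cache_slot {
        unsigned long uaddr;            /* task-local futex address: the "descriptor" */
        struct futex_state *state;      /* shared state: the "inode" */
};

struct futex_cache {
        unsigned int order;             /* the table holds (1 << order) slots */
        struct futex_cache_slot *slots;
};

/*
 * Lookups are constant-time and lock-free; there is no chaining, so a
 * colliding insertion doubles the table and rehashes instead.
 */
static struct futex_state *futex_cache_lookup(struct futex_cache *fc,
                                              unsigned long uaddr)
{
        struct futex_cache_slot *slot = &fc->slots[hash_long(uaddr, fc->order)];

        return slot->uaddr == uaddr ? slot->state : NULL;
}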
With this infrastructure in place, a task that wants to work with an attached futex must first attach it:
sys_futex(FUTEX_ATTACH | FUTEX_ATTACHED, uaddr, ....);
and can later detach it:
sys_futex(FUTEX_DETACH | FUTEX_ATTACHED, uaddr, ....);
All operations on the attached futex would include the FUTEX_ATTACHED flag to make it clear they expect an attached futex, rather than a normal on-demand futex.
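For example, a task waiting on an attached futex might make a call like:

sys_futex(FUTEX_WAIT | FUTEX_ATTACHED, uaddr, val, ....);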
The FUTEX_ATTACH operation behaves a little like open(): it finds or creates a futex_state by performing a lookup in the global hash table, and then attaches it to the task-local futex_cache. All future accesses to that futex will find the futex_state with a lookup in the futex_cache, which will be a constant-time lockless access to memory that is probably on the same NUMA node as the task. There is every reason to expect that this would be faster and Gleixner has some numbers to prove it, though he admitted they were from a synthetic benchmark rather than a real-world load.
It's always about the interface
The main criticisms that Gleixner's approach received were not that he was re-inventing the file-descriptor model, but that he was changing the interface at all.
Having these faster futexes was not contentious. Requiring, or even allowing, the application programmer to choose between the old and the new behavior is where the problem lies. Linus Torvalds in particular didn't think that programmers would make the right choice, primarily because they wouldn't have the required perspective or the necessary information to make an informed choice. The tradeoffs are really at the system level rather than the application level: large memory systems, particularly NUMA systems, and those designed to support realtime loads would be expected to benefit. Smaller systems or those with no realtime demands are unlikely to notice and could suffer from the extra memory allocations. So while a system-wide configuration might be suitable, a per-futex configuration would not. This seems to mean that futexes would need to automatically attach without requiring an explicit request from the task.
Torvald Riegel supported this conclusion from the different perspective provided by glibc. When using futexes to implement, for example, C11 mutexes, "there's no good place to add the attach/detach calls" and no way to deduce whether it is worth attaching at all.
It is worth noting that the new FUTEX_ATTACH interface makes the mistake of conflating two different elements of functionality. A task that issues this request is both asking for the faster implementation and agreeing to be involved in resource management, implicitly stating that it will call FUTEX_DETACH when the futex is no longer needed. Torvalds rejected the first of these as inappropriate and Riegel challenged the second as being hard to work with in practice. This effectively rang the death knell for explicit attachment.
Automatic attachment
Gleixner had already considered automatic attachment but had rejected it because of problems with, to use his list:
- Memory consumption
- Life time issues
- Performance issues due to the necessary allocations
Starting with the lifetime issues, it is fairly clear that the lifetime of the futex_state and futex_cache_slot structures would start when a thread needed to wake or wait on a futex. When the lifetime ends is the interesting question and, while it wasn't discussed, there seem to be two credible options. The easy option is for these structures to remain until the thread(s) using them exits, or at least until the memory containing the futex is unmapped. This could be long after the futex is no longer of interest, and so is wasteful.
The more conservative option would be to keep the structures on some sort of LRU (least-recently used) list and discard the state for old futexes when the total allocated seems too high. As this would introduce a new source of non-determinism in access speed, the approach is likely a non-starter, so wastefulness is the only option.
This brings us to memory consumption. Transitioning from the current implementation to attached futexes changes the kernel memory used per idle futex from zero to something non-zero. This may not be a small change. It is easy to imagine an application embedding a futex in every instance of some object that is allocated and de-allocated in large numbers. Every time such a futex suffers contention, an extra in-kernel data structure would be created. The number of such would probably not grow quickly, but it could just keep on growing. This would particularly put pressure on the futex_cache which could easily become too large to manage.
The performance issues due to extra allocations are not a problem with explicit attachment for, as Gleixner later clarified: "Attach/detach is not really critical." With implicit attachment they would happen at first contention, which would introduce new non-determinism. A realtime task working with automatically attached futexes would probably avoid this by issuing some no-op operation on a futex just to trigger the attachment at a predictable time.
All of these problems effectively rule out implicit attachment, meaning that, despite the fact that they remove nearly all the overhead for futex accesses, attached futexes really don't have a future.
Version two: no more attachment
Gleixner did indeed determine that attachment has no future and came up with an alternative scheme. The last time futex performance was a problem, the response was to increase the size of the global hash table and enforce an alignment of buckets to cache lines to improve SMP behavior. Gleixner's latest patch set follows the same idea with a bit more sophistication. Rather than increasing the single global hash table, a "sharding" approach is taken, creating multiple distinct hash tables for multiple distinct sets of futexes.
Futexes can be declared as private, in which case they avoid the mmap_sem semaphore and can only be used by threads in a single process. The set of private futexes for a given process forms the basis for sharding and can, with the new patches, have exclusive access to a per-process hash table of futex buckets. All shared futexes still use the single global hash table. This approach addresses the NUMA issue by having the hash table on the same NUMA node as the process, and addresses the collision issue by dedicating more hash-table space per process. It only helps with private futexes, but these seem to be the style most likely to be used by applications needing predictable performance.
The choice of whether to allocate a per-process hash table is left as a system-wide configuration which is where the tradeoffs can best be assessed. The application is allowed a small role in the choice of when that table is allocated: a new futex command can request immediate allocation and suggest a preferred size. This avoids the non-determinism that would result from the default policy of allocation on first conflict.
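A request of that sort might look something like the line below; the command name here is hypothetical, since the exact name is defined by the patch set:

sys_futex(FUTEX_PREALLOC_HASH, NULL, preferred_slots, ....);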
It seems likely that this is the end of the story for now. There has been no distaste shown for the latest patch set and Gleixner is confident that it solves his particular problem. There would be no point aiming for a greater level of perfection until another demonstrated concrete need comes along.
Network filesystem topics
Steve French and Michael Adam led a session in the filesystem-only track at the 2016 Linux Storage, Filesystem, and Memory-Management Summit on network filesystems and some of the pain points for them on Linux. One of the main topics was case-insensitive file-name lookups.
French said that doing case-insensitive lookups was a "huge performance issue in Samba". The filesystem won't allow the creation of files with the wrong case, but files created outside of Samba can have mixed case or file names that collide in a case-insensitive comparison. That could lead to deleting the wrong file, for example.
Ric Wheeler suggested that what was really being sought is case-insensitive lookups but preserving the case on file creation. Ted Ts'o said that he has never been interested in handling case-insensitive lookups because Unicode changes the case-folding algorithm with some frequency, which would lead to having to update the kernel code to match that. Al Viro noted that preserving the case can lead to problems in the directory entry (dentry) cache; if both foo.h and FOO.H have been looked up, they will hash to different dentries.
Ts'o said that they would need to hash to the same dentry. Wheeler suggested that the dentry could always be lower case and that the file could have an extended attribute (xattr) that contains the real case-preserved name. That could be implemented by Samba, but there is a problem, as Ts'o pointed out: the Unix side wants to see the file names with the case preserved.
David Howells wondered if the case could simply be folded before the hash is calculated. But the knowledge of case and case insensitivity is not a part of the VFS, Viro said, and the hash is calculated by the filesystems themselves. Ts'o said that currently case insensitivity is not a first-class feature; it is instead just some hacks here and there. If case insensitivity is going to be added to filesystems like ext4, there are some hurdles to clear. For example, there are on-disk hashes in ext4 and he is concerned that changes to the case-folding rules could cause the hash to change, resulting in lost files.
Adam said that handling the case problem is interesting, but there are other problems for network filesystems. He noted that NFS is becoming more like Samba over time. That means that some of the problems that Samba is handling internally will need to be solved for NFS as well, though there will be subtle differences between them.
Both the "birth time" attribute for files and rich ACLs were mentioned as areas where standard access mechanisms are needed, though there are plenty of others. The problem is that filesystems provide different ways to get these pieces of information, such as ioctl() commands or xattrs. French said there should be some kind of system call to hide those differences.
The perennially discussed xstat() system call was suggested as that interface, but discussions of xstat() always result in lots of bikeshedding about which attributes it should handle, Viro said. Ts'o said that "people try to do too much" with xstat(). In fact, there was a short session on xstat() later in the day that tried to reduce the scope of the system call with an eye toward getting something merged.
If there are twenty problems that can't be solved for network filesystems and five that can, even getting three of those solved would be a nice start, French said. There are issues for remote DMA (RDMA) and how to manage a direct copy from a device, for example. There are also device characteristics (e.g. whether it is an SSD) that applications want to know. Windows applications want to be able to determine attributes like alignment and seek penalty, but there is no consistent way to get that information. In addition, French said he doesn't want to have to decide whether a filesystem is integrity protected, but wants to query for it in some standard way.
Christoph Hellwig has been suggesting that filesystems move away from xattrs and to standardized filesystem ioctl() commands, French said. Ts'o said that the problem with xattrs is that they have become a kind of ASCII ioctl(); filesystems are parsing and creating xattrs that don't live on disk. At that point, the time for the session expired.
xstat()
The proposed xstat() system call, which is meant to extend the functionality of the stat() call to get additional file-status information, has been discussed quite a bit over the years, but has never been merged. The main impediment seems to be a lot of bikeshedding about how much information—and which specific pieces—will be returned. David Howells led a short filesystem-only discussion on xstat() at the 2016 Linux Storage, Filesystem, and Memory-Management Summit.
Howells presented a long list of items that could be added to the file-status structure returned by a call like xstat(): larger timestamps, the creation (or birth) time for a file, a data version number (for some filesystems), the inode generation number, and more. In general, xstat() would return more fields, some of which have grown larger than their stat() counterparts.
There is also space at the end of the structure for growth. There are ways for callers to indicate what information they are interested in, as well as ways for the filesystem to indicate which pieces of valid information have been returned.
Howells noted that Dave Chinner wanted more I/O parameters (e.g. preferred read and write sizes, erase block size). There were five to seven different numbers that Chinner wanted, but those could always be added later, he said.
There are also some useful information flags that will be returned. Those will indicate if the file is a kernel file (e.g. in /proc or /sys), if it is compressed (and thus will result in extra latency when accessed), if it is encrypted, or if it is a sparse file. Windows has many of these indications.
But Ted Ts'o complained that there are two different definitions of a compressed file. It could mean that the file is not compressible, because it has already been compressed, or it could mean that the filesystem has done something clever and a read will return the real file contents. It is important to clearly define what the flag means; the FS_IOC_GETFLAGS ioctl() command did not do so, he said, and he wanted to ensure that the same mistake is not made with xstat().
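For reference, this is roughly how an application queries that flag today; whether FS_COMPR_FL means "already compressed, don't bother" or "transparently compressed by the filesystem" is exactly the ambiguity Ts'o was complaining about:

#include <fcntl.h>
#include <linux/fs.h>      /* FS_IOC_GETFLAGS, FS_COMPR_FL */
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
        long flags;
        int fd = open("somefile", O_RDONLY);

        if (fd < 0)
                return 1;
        if (ioctl(fd, FS_IOC_GETFLAGS, &flags) == 0 &&
            (flags & FS_COMPR_FL))
                printf("compressed; but in which sense?\n");
        close(fd);
        return 0;
}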
There are other pieces of information that xstat() could return, Howells said: whether accessing the file will cause an automount action, for example, or "foreign user ID" information for filesystems that don't have Unix-style UIDs or whose UIDs do not map to the local system. The Windows file attributes (archive, hidden, read-only, and system) could also be returned.
Ts'o suggested leaving out anything that did not have a clear definition of what it meant. That might help get xstat() merged. Others can be added later, he said.
Howells then described more of the functionality in his current version. There are three modes of operation. The standard mode would work the same way that stat() works today; it would force a sync of the file and retrieve an update from the server (if there is one). The second would be a "sync if we need to" mode; if only certain information that is stored locally is needed, it would simply be returned, but if the information requested required an update from the server (e.g. atime), that will be done. The third, "no sync" mode, means that only local values will be used; "it might be wrong, but it will be fast". For local filesystems, all three modes work the same way.
Jeff Layton asked: "How do we get it in without excessive bikeshedding?" He essentially answered his own question by suggesting that Howells start small and simply add "a few things that people really want". Joel Becker suggested that only parameters with "actual users in some upstream" be supported. That could help trim the list, he said.
Howells said that he asked for comments from various upstream projects, but that only Samba had responded. Becker reiterated that whatever goes in should be guided by actual users, since it takes work to support adding these bits of information. Howells agreed, noting that the spare space, masks, and flags leave room for later expansion.
As it turns out, Howells posted a new patch set after LSFMM that reintroduces xstat() as the statx() system call.
Stream IDs and I/O hints
I/O hints are a way to try to give storage devices information that will allow them to make better decisions about how to store the data. One of the more recent hints allows data to be grouped into multiple "streams", each containing data that is associated in some way; streams were mentioned in a storage standards update session the previous day. Changho Choi and Martin Petersen led a session at the 2016 Linux Storage, Filesystem, and Memory-Management Summit to flesh out more about streams, stream IDs, and I/O hints in general.
Choi said that he is leading the multi-stream specification and software-development work at Samsung. There is no mechanism for storage devices to expose their internal organization to the host, which can lead to inefficient placement of data and inefficient background operations (e.g. garbage collection). Streams are an attempt to provide better collaboration between the host and the device. The host gives hints to the device, which will then place the data in the most efficient way. That leads to better endurance as well as improved and consistent performance and latency, he said.
A stream ID would be associated with operations for data that is expected to have the same lifetime. For example, temporary data, metadata, and user data could be separated into their own streams. The ID would be passed down to the device using the multi-stream interface and the device could put the data in the same erase blocks to avoid copying during garbage collection.
For efficiency, proper mapping of data to streams is essential, Choi said. Keith Packard noted that filesystems try to put writes in logical block address (LBA) order for rotating media and wondered if that was enough of a hint. Choi said that more information was needed. James Bottomley suggested that knowing the size and organization of erase blocks on the device could allow the kernel to lay out the data properly.
But there are already devices shipping with the multi-stream feature, from Samsung and others, Choi said. It is also part of the T10 (SCSI) standard and will be going into T13 (ATA) and NVM Express (NVMe) specifications.
Choi suggested exposing an interface for user space that would allow applications to set the stream IDs for writes. But Bottomley asked if there was really a need for a user-space interface. In the past, hints exposed to application developers went largely unused. It would be easier if the stream IDs were all handled by the kernel itself. He was also concerned that there would not be enough stream IDs available, so the kernel would end up using them all; none would be available to offer to user space.
Martin Petersen said that he was not against a user-space interface if one was needed, but suggested that it would be implemented with posix_fadvise() or something like that rather than directly exposing the IDs to user space. Choi thought that applications might have a better idea of the lifetime of their data than the kernel would, however.
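Such hints already have a per-file conduit in posix_fadvise(); a lifetime or stream hint could plausibly ride the same call. A minimal sketch using an existing advice value follows; any stream-specific advice value would be new, so none is invented here:

#include <fcntl.h>

/* Advise the kernel that this file will be read sequentially; a
 * data-lifetime hint could be delivered through the same interface. */
static void advise_sequential(int fd)
{
        posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
}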
At that point, Petersen took over to describe some research he had done on hints: how they are used and which are effective. There are several conduits for hints in the kernel, including posix_fadvise(), ioprio (available using ioprio_set()), the REQ_META flag for metadata, NFS v4.2, SCSI I/O advice hints, and so on. There are tons of different hints available; vendors implement different subsets of them.
So he wanted to try to figure out which hints actually make a difference. He asked internally (at Oracle) and externally about available hints, which resulted in a long list. From that, he pared the list back to hints that actually work. That resulted in half a dozen hints that characterize the data (see the sketch following the list):
- Transaction - filesystem or database journals
- Metadata - filesystem metadata
- Paging - swap
- Realtime - audio/video streaming
- Data - normal application I/O
- Background - backup, data migration, RAID resync, scrubbing
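As a sketch only (the enumeration and its names are illustrative inventions, not taken from Petersen's proposal), those six classes might look like this in code:

/* Illustrative names only; not from Petersen's patches. */
enum io_data_hint {
        HINT_TRANSACTION,   /* filesystem or database journals */
        HINT_METADATA,      /* filesystem metadata */
        HINT_PAGING,        /* swap */
        HINT_REALTIME,      /* audio/video streaming */
        HINT_DATA,          /* normal application I/O */
        HINT_BACKGROUND,    /* backup, migration, RAID resync, scrubbing */
};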
The original streams proposal requires that the block layer request a stream ID from a device by opening a stream. Eventually those streams would need to be closed as well. For NVMe, streams are closely tied to the hardware write channels, which are a scarce resource. The explicit stream open/close is not popular and is difficult to do in some configurations (e.g. multipath).
So Petersen is proposing a combination of hints and streams. Device hints would be set based on knowledge the kernel has about the I/O. The I/O priority would be used to set the background I/O class hint (though it might move to a REQ_BG request flag), other request flags (REQ_META, REQ_JOURNAL, and REQ_SWAP) would set those hints, and posix_fadvise() flags would also set the appropriate hints.
Stream IDs would be based on files, which would allow sending the file to different devices and getting roughly the same behavior, he said. The proposal would remove the requirement to open and close streams and would provide a common model for all device types, so flash controllers, storage arrays, and shingled magnetic recording (SMR) devices could all make better decisions about data placement. This solution is being proposed to the standards groups as a way to resolve the problems with the existing hints and multi-stream specifications.
Background writeback
The problems with background writeback in Linux have been known for quite some time. Recently, there has been an effort to apply what was learned by network developers solving the bufferbloat problem to the block layer. Jens Axboe led a filesystem and storage track session at the 2016 Linux Storage, Filesystem, and Memory-Management Summit to discuss this work.
The basic problem is that flushing block data from memory to storage (writeback) can flood the device queues to the point where any other reads and writes experience high latency. He has posted several versions of a patch set to address the problem and believes it is getting close to its final form. There are fewer tunables and it all just basically works, he said.
The queues are managed on the device side in ways that are "very loosely based on CoDel" from the networking code. The queues will be monitored and write requests will be throttled when the queues get too large. He thought about dropping writes instead (as CoDel does with network packets), but decided "people would be unhappy" with that approach.
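As a loose illustration of the idea (this is not Axboe's code; every name and number below is invented for the sketch), a CoDel-style controller shrinks the permitted writeback depth when completion latency has exceeded a target for a full window, and grows it back when the device keeps up:

#include <stdint.h>

struct wb_throttle {
        uint64_t first_above_us;   /* when latency first exceeded the target */
        unsigned int depth;        /* currently permitted writeback depth */
};

static void wb_update(struct wb_throttle *t, uint64_t now_us,
                      uint64_t latency_us)
{
        const uint64_t target_us = 10000;    /* 10ms completion-latency target */
        const uint64_t window_us = 100000;   /* 100ms observation window */

        if (latency_us > target_us) {
                if (!t->first_above_us)
                        t->first_above_us = now_us;
                else if (now_us - t->first_above_us > window_us && t->depth > 1)
                        t->depth /= 2;       /* throttle writeback; don't drop it */
        } else {
                t->first_above_us = 0;
                if (t->depth < 128)
                        t->depth++;          /* device is keeping up; ease off */
        }
}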
The problem is largely solved at this point. Both read and write latencies are improved, though some tweaking is still needed to make it work better. The algorithm is such that, if the device is fast enough, it "just stays out of the way". It also homes in on the right queue size quickly and, if there are no reads contending for the queues, it "does nothing at all". He did note that he had not yet rerun the "crazy Chinner test case".
Ted Ts'o asked about the interaction with the I/O controller for control groups that is trying to do proportional I/O. Axboe said he was not particularly concerned about that. Controllers for each control group will need to be aware of each other, but it should all "probably be fine".
David Howells asked about writeback that is going to multiple devices. Axboe said that still needs work. Someone else asked about background reads, which Axboe said could be added. Nothing is inherently blocking that, but the work still needs to be done.
Multipage bio_vecs
In the block layer, larger I/O operations tend to be more efficient, but current kernels limit how large those operations can be. The bio_vec array, which describes an I/O buffer as a list of (page, offset, length) tuples, can currently occupy only a single page. There have been efforts over the years to allow the array to span multiple pages, so that larger I/O operations can be described by a single bio_vec. Ming Lei led a session at the 2016 Linux Storage, Filesystem, and Memory-Management Summit to discuss patches supporting multipage bio_vec arrays.
Multipage bio_vec structures would consist of multiple, physically contiguous pages that could hold a larger array. It is the correct thing to do, Lei said: it will save memory, since fewer bio_vec structures will be needed, and it will increase the transfer size for each struct bio (which contains a pointer to a bio_vec). Currently, the single-page nature of a bio_vec means that only one megabyte of I/O can be contained in a single bio_vec; adding support for multiple pages will remove that limit.
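The arithmetic behind the one-megabyte figure is easy to reconstruct. Each entry in the array is a single (page, offset, length) tuple; the structure, as defined in current kernels, looks like this:

struct bio_vec {
        struct page     *bv_page;
        unsigned int    bv_len;
        unsigned int    bv_offset;
};

On a 64-bit system with 4096-byte pages, each entry occupies 16 bytes, so a single page holds 256 of them; 256 entries, each describing at most one 4096-byte page, yields the one-megabyte ceiling.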
Jens Axboe agreed that there are benefits to larger bio_vec arrays, but was concerned about requesters getting physically contiguous pages. That would have to be done when the bio is created. Lei said that it is not hard to figure out how many pages will be needed before creating the bio, though.
All of the "magic" is in the bio_vec and bio iterators, one developer in the audience said. So there would be a need to introduce new helpers to iterate over the multipage bio_vec. The new name for the helper would require that all callers change, which would provide a good opportunity to review all of the users of the helpers, Christoph Hellwig said.
The patches also clean up direct access to some fields in bio structures: bi_vcnt, which tracks the number of entries in the bio_vec array, and the pointer to the bio_vec itself (bi_io_vec).
Axboe was concerned about handling all of the different special cases. There need to be "some real wins" in the patch set, since the memory savings are not all that huge. He is "not completely sold on why multipage is needed".
Hellwig agreed that the memory savings were not particularly significant, but that there is CPU time wasted in iterating over the segments. At various levels of the storage stack, the kernel has to iterate over the bio and bio_vec structures that make up I/O requests, so consolidating that information will save CPU time. There are also many needed cleanups in the patches, he said, so those should be picked up; "then, hopefully, we can get to the multipage bio_vecs".
Axboe said that the patches have been posted, but are not all destined for 4.7. He will queue up some of the preparatory patches, but the rest "need some time to digest".
Exposing extent information to user space
In a short, filesystem-only session at the 2016 Linux Storage, Filesystem, and Memory-Management Summit, Josef Bacik led a discussion on exposing information on extents, which are contiguous ranges of blocks allocated for a file (or files) by the filesystem, to user space. That could be done either by extending the FIEMAP ioctl() command or by coming up with a new interface. Bacik said that he was standing in for Mark Fasheh, who was unable to attend the session.
FIEMAP just reports whether an extent is shared or not, but there are some applications that want to know which inodes are sharing the extents. There are reserved 64-bit fields in struct fiemap_extent that could be used to report the inode numbers, Bacik said. He asked if that seemed like a reasonable approach.
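For reference, the structure (from <linux/fiemap.h>) already carries the unused space Bacik was referring to:

struct fiemap_extent {
        __u64 fe_logical;        /* logical offset (bytes) of the extent */
        __u64 fe_physical;       /* physical offset (bytes) on the device */
        __u64 fe_length;         /* length of the extent in bytes */
        __u64 fe_reserved64[2];  /* the reserved fields Bacik mentioned */
        __u32 fe_flags;          /* FIEMAP_EXTENT_* flags (e.g. SHARED) */
        __u32 fe_reserved[3];
};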
Ric Wheeler wondered if there was really a need for applications to unwind all of this information. He asked: "Is there a backup application that will use this?" Jeff Mahoney responded that there is someone requesting the functionality.
Darrick Wong said that as part of his reverse mapping and reflink() work for XFS he has an interface that will allow applications to retrieve that kind of information. You can pass a range of physical block numbers to the reverse-map ioctl() and get back a list of objects (e.g. inodes) that think they own those blocks, he said.
Bacik said that sounded like the right interface: "Let's use that." Wong said that he would post some patches once he returned home from the summit.
DAX on BTT
In the final plenary session of the 2016 Linux Storage, Filesystem, and Memory-Management Summit, much of the team that works on the DAX direct-access mechanism led a discussion on how DAX should interact with the block translation table (BTT)—a mechanism aimed at making persistent memory have the atomic sector-write properties that users expect from block devices. Dan Williams took the role of ringleader, but Matthew Wilcox, Vishal Verma, and Ross Zwisler were also on-stage to participate.
Williams noted that Microsoft has adopted DAX for persistent memory and is even calling it DAX. Wilcox said that it was an indication that Microsoft is "listening to customers; they've changed".
BTT is a way to put block-layer-like semantics onto persistent memory, which handles writes at a cache-line granularity (i.e. 64 bytes), so that 512-byte (sector) writes are atomic. This eliminates the problem of "sector tearing", where a power or other failure causes a partial write to a sector resulting in a mixture of old and new data—a situation that applications (or filesystems) are probably not prepared to handle. Microsoft supports DAX on both BTT and non-BTT block devices, while Linux only supports it for non-BTT devices. Williams asked: "should we follow them [Microsoft] down that rabbit hole?"
The problem is that BTT is meant to fix a problem where persistent memory is treated like a block device, which is not what DAX is aimed at. Using BTT only for filesystem metadata might be one approach, Zwisler said. But Ric Wheeler noted that filesystems already put a lot of work into checksumming metadata, so using BTT for that would make things much slower for little or no gain.
Jeff Moyer pointed out that sector tearing can happen on block devices like SSDs, which is not what users expect. Joel Becker suggested that something like the SCSI atomic write command could be used by filesystems or applications that are concerned about torn sectors. That command guarantees that the sector is either written in full or not at all. There is no way to "magically save applications from torn sectors" unless they take some kind of precaution, he said.
There is a bit of a "hidden agenda" in supporting BTT, though, Williams said. Currently, the drivers are not aware of when DAX mappings are established and torn down, but that would change for BTT support. Wilcox said he has a patch series that addresses some parts of that by making the radix tree the source for that information.
task_diag and statx()
The interfaces supported by Linux to provide access to information about processes and files have been around for decades. One might think that, by this time, they would have reached a state of relative perfection. But things are not so perfect that developers are deterred from working on alternatives; the motivating factor in the two cases studied here is the same: reducing the cost of getting information out of the kernel while increasing the range of information that is available.
task_diag
There is no system call in Linux that provides information about running processes; instead, that information can be found in the /proc filesystem. Each process is represented by a directory under /proc; that directory contains a directory tree of its own with files providing information on just about every aspect of the process's existence. A quick look at the /proc hierarchy for a running bash instance reveals 279 files in 40 different directories. Whether one wants to know about control-group membership, resource usage, memory mappings, environment variables, open files, namespaces, out-of-memory-killer policies, or more, there is a file somewhere in that tree with the requisite information.
There are a lot of advantages to /proc, starting with the way it implements the classic Unix "everything is a file" approach. The information is readable as plain text, making it accessible from the command line or shell scripts. To a great extent, the interface is self-documenting, though some parts are more obvious than others. The contents of the stat file, for example, require an outside guide to be intelligible.
There are some downsides to this approach too, though. Accessing a file in /proc requires a minimum of three system calls — open(), read(), and close() — and that is after the file has been located in the directory hierarchy. Getting a range of information can require reading several files, with the system-call count multiplied accordingly. Some /proc files are expensive to read, and much of the resulting data may not be of interest to the reading process. There has been, to put it charitably, no unifying vision guiding the design of the /proc hierarchy, so each file there must be approached as a new parsing problem. It all adds up to a slow and cumbersome interface for applications that need significant amounts of information about multiple processes.
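To make that cost concrete, here is the minimal sequence needed to fetch a single /proc file; any parsing comes on top of this:

#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Three system calls to fetch one attribute file for one process. */
static ssize_t read_proc_stat(pid_t pid, char *buf, size_t len)
{
        char path[64];
        ssize_t n;
        int fd;

        snprintf(path, sizeof(path), "/proc/%d/stat", (int)pid);
        fd = open(path, O_RDONLY);      /* system call 1 */
        if (fd < 0)
                return -1;
        n = read(fd, buf, len - 1);     /* system call 2 */
        close(fd);                      /* system call 3 */
        if (n >= 0)
                buf[n] = '\0';
        return n;
}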
A possible solution comes in the form of the task_diag patch set from Andrey Vagin; it adds a binary interface allowing the extraction of lots of process information from the kernel using a single request. The starting point is a file called /proc/task-diag, which an interested process must open. That process then uses the netlink protocol to send a message describing the desired information, which can then be read back from the same file.
The request for information is contained within this structure:
struct task_diag_pid {
        __u64 show_flags;
        __u64 dump_strategy;
        __u32 pid;
};
The dump_strategy field tells the kernel which processes are of interest. Its value can be one of TASK_DIAG_DUMP_ONE (information about the single process identified by pid), TASK_DIAG_DUMP_THREAD (get information about all threads of pid), TASK_DIAG_DUMP_CHILDREN (all children of pid), TASK_DIAG_DUMP_ALL (all processes in the system) or TASK_DIAG_DUMP_ALL_THREADS (all threads in the system).
The show_flags field, instead, describes which information is to be returned for each process. With TASK_DIAG_SHOW_BASE, the "base" information will be returned:
struct task_diag_base {
        __u32 tgid;
        __u32 pid;
        __u32 ppid;
        __u32 tpid;
        __u32 sid;
        __u32 pgid;
        __u8  state;
        char  comm[TASK_DIAG_COMM_LEN];
};
Other possible flags include TASK_DIAG_SHOW_CREDS to get credential information, TASK_SHOW_VMA and TASK_SHOW_VMA_STAT for information on memory mappings, TASK_DIAG_SHOW_STAT for resource-usage statistics, and TASK_DIAG_SHOW_STATM for memory-usage statistics. If this interface is merged into the mainline, other options will surely follow.
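As a sketch of how the pieces fit together (the constants and structure come from the patch set and exist in no released kernel header; the netlink framing and reply parsing are omitted), a request for the base and credential information for all threads of one process might be built like this:

/* Sketch only: the TASK_DIAG_* values are defined by Vagin's patches. */
struct task_diag_pid req = {
        .show_flags    = TASK_DIAG_SHOW_BASE | TASK_DIAG_SHOW_CREDS,
        .dump_strategy = TASK_DIAG_DUMP_THREAD,  /* all threads of .pid */
        .pid           = 1234,
};
/* The request is wrapped in a netlink message, written to an open
 * /proc/task-diag file descriptor, and the replies are read() back. */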
The patches have been through a few rounds of review. Presumably something along these lines will eventually be merged, but it is not clear that the level of review required to safely add a new major kernel API has happened. There is also no man page for this feature yet. So it would not be surprising if a few more iterations were required before this one is declared to be ready.
statx()
Information about files in Linux, as with all Unix-like systems, comes via the stat() system call and its variants. Developers have chafed against its limitations for a long time. This system call, being enshrined in the POSIX standard, cannot be extended to return more information. It will likely return information that the calling process doesn't need — a wasted effort that can, for some information and filesystems, be expensive. And so on. For these reasons, an extended stat() system call has been a topic of discussion for many years.
Back in 2010, David Howells proposed an xstat() call that addressed a number of these problems, but that proposal got bogged down in discussion and was never merged. Six years later, Howells is back with a new version of this patch. Time will tell whether he is more successful this time around.
The new system call is now called statx(); the proposed interface is:
int statx(int dfd, const char *filename, unsigned atflag, unsigned mask, struct statx *buffer);
The file of interest is identified by filename; that file is expected to be found in or underneath the directory indicated by the file descriptor passed in dfd. If dfd is AT_FDCWD, the filename is interpreted relative to the current working directory. If filename is null, information about the file represented by dfd is returned instead.
The atflag parameter modifies how the information is collected. If it is AT_SYMLINK_NOFOLLOW and filename is a symbolic link, information is returned about the link itself. Other atflag values include AT_NO_AUTOMOUNT to prevent filesystems from being automatically mounted by the request, AT_FORCE_ATTR_SYNC to force a network filesystem to update attributes from the server before returning the information, and AT_NO_ATTR_SYNC to avoid updating from the server, even at the cost of returning approximate information. That last option can speed things up considerably when querying information about files on remote filesystems.
The mask parameter, instead, specifies which information the caller is looking for. The current patch set has fifteen options, varying from STATX_MODE (to get the permission bits) to STATX_GEN to get the current inode generation number (on filesystems that have such a concept). That mask appears again in the returned structure to indicate which fields are valid; that structure looks like:
struct statx {
        __u32 st_mask;          /* What results were written [uncond] */
        __u32 st_information;   /* Information about the file [uncond] */
        __u32 st_blksize;       /* Preferred general I/O size [uncond] */
        __u32 st_nlink;         /* Number of hard links */
        __u32 st_gen;           /* Inode generation number */
        __u32 st_uid;           /* User ID of owner */
        __u32 st_gid;           /* Group ID of owner */
        __u16 st_mode;          /* File mode */
        __u16 __spare0[1];
        __u64 st_ino;           /* Inode number */
        __u64 st_size;          /* File size */
        __u64 st_blocks;        /* Number of 512-byte blocks allocated */
        __u64 st_version;       /* Data version number */
        __s64 st_atime_s;       /* Last access time */
        __s64 st_btime_s;       /* File creation time */
        __s64 st_ctime_s;       /* Last attribute change time */
        __s64 st_mtime_s;       /* Last data modification time */
        __s32 st_atime_ns;      /* Last access time (ns part) */
        __s32 st_btime_ns;      /* File creation time (ns part) */
        __s32 st_ctime_ns;      /* Last attribute change time (ns part) */
        __s32 st_mtime_ns;      /* Last data modification time (ns part) */
        __u32 st_rdev_major;    /* Device ID of special file */
        __u32 st_rdev_minor;
        __u32 st_dev_major;     /* ID of device containing file [uncond] */
        __u32 st_dev_minor;
        __u64 __spare1[16];     /* Spare space for future expansion */
};
Many of those fields match those found in the classic struct stat or are close to them. Times have been split into separate second and nanosecond fields, enabling both high-precision timestamps and year-2038 compliance. The __spare1 array at the end is meant to allow other types of data to be added in the future. Finally, st_information gives general information about the file, including whether it's encrypted, whether it's a kernel-generated file, or whether it's stored on a remote server.
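Put together, a query under the proposed interface might look like the sketch below. It is illustrative only: no C-library wrapper exists for an unmerged system call, so a real program would have to invoke it via syscall() once a number is assigned.

struct statx sb;

/* Ask only for the mode and generation number; accept possibly stale
 * attributes from a network filesystem in exchange for speed. */
if (statx(AT_FDCWD, "somefile", AT_NO_ATTR_SYNC,
          STATX_MODE | STATX_GEN, &sb) == 0) {
        if (sb.st_mask & STATX_GEN)   /* did the filesystem provide it? */
                printf("inode generation: %u\n", sb.st_gen);
}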
The only response to this patch set, as of this writing, came from Jeff Layton, who suggested: "I think we really ought to resist excessive bikeshedding this time around". If the other developers accept that advice, then it's possible that an enhanced stat() interface might just get into the kernel sometime this year. Nobody will be able to complain that this particular change has been rushed.
Patches and updates
Kernel trees
Architecture-specific
Core kernel code
Device drivers
Device driver infrastructure
Networking
Security-related
Miscellaneous
Page editor: Jonathan Corbet
Distributions
Fedora's Node.js problem
Aligning distribution and major package schedules is often something of a tricky balancing act. Fedora is currently doing some of that; it is trying to figure out what to do with its Node.js package for Fedora 24. Node.js 5.10 is currently packaged for Fedora 24, but that release will only be supported until mid-year, which would mean that Fedora developers would have to backport security fixes for as long as a year. But the most recent release (6.0) came out on April 26 and may not yet support all of the dependent packages, which puts Fedora somewhere between a rock and a hard place.
Current Node.js package maintainer Stephen Gallagher posted a message about the problem on the release day for Node.js 6.0. That release is a "significant ABI-breaking release, which means there is no guarantee that existing modules will work with it at all", he said. It is slated to become the next long-term support release in October; on paper, that would make it a good choice for Fedora 24, since it will be supported until 2019. But Fedora 24 is nearly out the door, with a final freeze scheduled for the end of May and the release in mid-June. So any change, especially one that could break all of the packages dependent on Node.js, is worrisome, at best.
Currently, Fedora 24 has Node.js 5.x, but that will be unsupported by the Node.js developers relatively soon. One option would be to stick with that, but once Fedora 24 is released, the project's policy of disallowing major ABI changes in a stable release would mean that Fedora has to pick up the maintenance burden. Gallagher put it this way:
Another option might be to fall back to Node.js 4.x, which was released in October 2015 and will be supported until well after the end of life for Fedora 24, but that is not without potential problems either. Any of the dependent packages that have started using features from 5.x may not work. It all adds up to no "particularly good options", he said.
No one in the resulting thread seemed to like the "stick with 5.x" option.
There was some talk of just abandoning 5.x and moving to 6.x sometime around October, but that would require Fedora Engineering Steering Committee (FESCo) approval and would set something of a bad precedent. But "drago01" thought that concerns about upgrading the package during the release cycle were based on a misunderstanding of the update policy, which says that it is possible to upgrade when an upstream stops supporting a release if backporting fixes "would be impractical". The definition of "practical" is up to the packager and FESCo.
Gallagher pointed out that the policy is likely changing, however:
Tom Hughes noted that there actually might not be any problem falling back to 4.x, since most packages try to support both 4.x and 5.x. Gallagher confirmed that by querying the package repository:
I don't love the idea of regressing the versions post-Beta, but it's starting to look like the least-risky approach.
The beta release of Fedora 24 is targeted for May 10, so the regression to 4.x would happen after that. But if the repository metadata is accurate, there should be few repercussions to making the switch. Gallagher announced that plan on April 28 and asked Node.js users to test with the new package he had built.
There was also some discussion of which Node.js branches should be used in the future. Hughes asked if it would ever make sense to use versions without long-term support given Fedora's thirteen-month support cycle. Gallagher replied that it turns out those releases are generally only supported for nine months or so, which means that Fedora should always ship the long-term support releases. That seems to be the plan moving forward.
In the end, it would seem that it will be a fairly painless transition, even if it looked like there were only ugly choices at the start. As part of its general philosophy, Fedora targets the newest releases of packages, which is presumably why Node.js 5.x got picked up in the first place. But the burden of maintaining an unsupported release, especially for an internet-facing package like Node.js, is quite large. Better to distribute an older version than to risk exposing Fedora Node.js users to the vulnerabilities that will undoubtedly be uncovered during the life of Fedora 24.
Brief items
Devuan Jessie beta released
The Devuan community has finally gotten a beta release out for testing. "Debian GNU+Linux [sic] is a fork of Debian without systemd, on its way to become much more than that. This Beta release marks an important milestone towards the sustainability and the continuation of Devuan as an universal base distribution."
The Linux Embedded Development Environment launches
The Linux Embedded Development Environment (or LEDE) project, a fork (or "spinoff") of OpenWrt, has announced its existence. "We are building an embedded Linux distribution that makes it easy for developers, system administrators or other Linux enthusiasts to build and customize software for embedded devices, especially wireless routers. [...] Members of the project already include a significant share of the most active members of the OpenWrt community. We intend to bring new life to Embedded Linux development by creating a community with a strong focus on transparency, collaboration and decentralisation." The new project lives at lede-project.org. (Thanks to Mattias Mattsson).
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 659 (May 2)
- Lunar Linux weekly news (April 29)
- Ubuntu Kernel Team weekly newsletter (May 3)
- Ubuntu Weekly Newsletter, Issue 463 (May 1)
Ubuntu 16.04 Review: What’s New for Desktop Users (Linux.com)
Linux.com reviews Ubuntu 16.04 LTS on the desktop, including Snaps, a new way of packaging and delivering applications. "Snaps also offer relatively more security because each app is sandboxed -- although there is still some room for improvement. But, like any other new technology, it will get better with time. In regard to privacy and security, I should mention that Unity previously was heavily criticized for integrating online ads and services with Dash. It was seen as a privacy leak. Ubuntu 16.04, however, comes with the latest version of Unity for the desktop -- that's 7.4 -- which disables online search or ads as the default."
Page editor: Rebecca Sobol
Development
Sandboxing for the unprivileged with bubblewrap
Sandboxing a Linux application is a task often associated with server-side containers that are exposed to the Internet at large. One wants a database server or web application to be contained, for instance, so that if something goes awry, that does not harm the entire host operating system. But work progresses on sandboxing desktop applications, too. On that front, Alex Larsson and others from the GNOME community have recently started working on a new "user-level" sandboxing utility called bubblewrap. The tool can be used by unprivileged users to run an application in a sandboxed environment, so that it cannot access OS resources and is similarly restricted from accessing the user's home directory.
Much of the effort (or, at least, the highly publicized effort) that has gone into application-level sandboxing in recent years has been driven by the needs of those deploying containers at large scale—in particular, by projects like Docker and Kubernetes that are most popular for web applications. But the desire to isolate a process from the surrounding OS is also relevant to people running desktop Linux distributions.
Ubuntu's snap system and GNOME's xdg-app both incorporate measures to isolate an application process (via mechanisms like control groups and namespaces). But both of those frameworks provide quite a bit more than the sandbox itself. For instance, they define runtime environments that the sandboxed process can depend on for system services and they provide a way to package an application with its dependencies included.
Furthermore, as Larsson explained in an April 29 blog post, container-management tools themselves are generally designed for system administrators and, thus, run as a privileged user. That makes such tools a potential cause for security concern, as granting users access to them means trusting those users with considerable administrative privileges (and, in the case of security bugs, can lead to privilege escalations). In addition, there are several scenarios beyond container orchestration in which sandboxing a process is a wise idea. Development builds of software could be sandboxed to isolate them from the OS, for example, or a sandbox could be used to limit an application's privileges in some specific manner. In short, a sandboxing utility that unprivileged users could employ to isolate a process would find numerous uses.
In response to all of these issues, Larsson and the other xdg-app developers started the bubblewrap project in February. Bubblewrap's internals originated in Colin Walters's linux-user-chroot, which was then ported into xdg-app as a helper application. But the helper was tied directly into xdg-app in several places; it assumed, for instance, that the sandboxed process would be mounted on top of an xdg-app runtime. The standalone version is more generic and builds substantially on what linux-user-chroot was capable of.
The bubblewrap README notes that user namespaces could also be used to let unprivileged users do application sandboxing by allowing those users to run an existing container-management tool in their own user namespace. But it points to the ongoing concerns about the security of user namespaces as a rationale for the new project.
As it stands, bubblewrap provides only a subset of the functionality that user namespaces could allow. In particular, it does not provide control over iptables; the README argues that withholding such access prevents exposure to vulnerabilities like CVE-2016-3135.
Bubblewrap works by creating an empty mount namespace with the root on a temporary filesystem that will be destroyed after the sandboxed process exits. Using switches, the user can construct the desired filesystem environment within the mount namespace by bind-mounting the desired directories from the host system.
On the simple end of the spectrum,
bwrap --bind /some/chroot / foo.sh
will essentially create a traditional chroot jail. The final argument (here, foo.sh) is simply the executable to be started in the sandbox. But the configuration supported by bubblewrap can be substantially more complex. So, for instance, the example shown in the README:
bwrap --ro-bind /usr /usr \
      --dir /tmp \
      --proc /proc \
      --dev /dev \
      --ro-bind /etc/resolv.conf /etc/resolv.conf \
      --symlink usr/lib /lib \
      --symlink usr/lib64 /lib64 \
      --symlink usr/bin /bin \
      --symlink usr/sbin /sbin \
      --chdir / \
      --unshare-pid \
      --unshare-net \
      --dir /run/user/$(id -u) \
      --setenv XDG_RUNTIME_DIR "/run/user/`id -u`" \
      /bin/sh
creates read-only bind mounts from the host system for /usr and /etc/resolv.conf, but creates a fresh /tmp with the --dir switch and new /proc and /dev filesystems with the --proc and --dev switches.
The --unshare-pid switch creates a PID namespace in the sandbox and --unshare-net creates a network namespace. The other options currently supported are IPC namespaces, user namespaces (including specifying a custom UID and GID), and UTS namespaces.
Apart from specifying the UID and GID, no options are provided for the namespace switches. The sandbox isolates the executable from the host filesystem, IPC mechanism, and the network. In addition, the --seccomp file_descriptor switch can be used to load a set of seccomp rules for the sandbox.
Bubblewrap has now replaced the earlier helper application in xdg-app and is available for the rpm-ostree packaging system. The bubblewrap page notes, however, that it could also be useful in existing server-oriented container environments. Whether that idea catches on or not remains to be seen; in the meantime, bubblewrap provides an interesting extension of Linux's sandboxing capabilities into the unprivileged-user space.
Brief items
Quote of the week
De Maré: Mercurial 3.7 and 3.8
Mercurial revision-control system developer Mathias De Maré summarizes the changes in the 3.7 and 3.8 releases. "Mercurial 3.7 had a major focus on performance. This is — to a large degree — due to large users like Facebook and Mozilla working on both performance and scalability."
WebExtensions in Firefox 48
At the Mozilla blog, Andy McKay announces that the browser maker has officially declared WebExtensions ready to use for add-on development. "With the release of Firefox 48, we feel WebExtensions are in a stable state. We recommend developers start to use the WebExtensions API for their add-on development." The WebExtensions support released for Firefox 48 includes improvements to the "alarms, bookmarks, downloads, notifications, webNavigation, webRequest, windows and tabs" APIs, support for a new Content Security Policy that limits where resources can be loaded from, and support in Firefox for Android. LWN looked at the WebExtensions API in December.
Newsletters and articles
Development newsletters from the past week
- What's cooking in git.git (April 29)
- What's cooking in git.git (May 3)
- OCaml Weekly News (May 3)
- Perl Weekly (May 2)
- PostgreSQL Weekly News (May 1)
- Python Weekly (April 28)
- Ruby Weekly (April 28)
- This Week in Rust (May 2)
- Tahoe-LAFS Weekly News (May 3)
- Wikimedia Tech News (May 2)
A guide to inline assembly code in GCC
The "linux-insides" series of articles has gained an overview of inline assembly in GCC. "I've decided to write this to consolidate my knowledge related to inline assembly here. As inline assembly statements are quite common in the Linux kernel and we may see them in linux-insides parts sometimes, I thought that it would be useful if we would have a special part which contains descriptions of the more important aspects of inline assembly. Of course you may find comprehensive information about inline assembly in the official documentation, but I like the rules all in one place."
Page editor: Nathan Willis
Announcements
Brief items
The ACM 2015 technical awards
The Association for Computing Machinery has announced the recipients of its 2015 technical awards. They are Brent Waters, Michael Luby, Eric Horvitz, and: "Richard Stallman, recipient of the ACM Software System Award for the development and leadership of GCC (GNU Compiler Collection), which has enabled extensive software and hardware innovation, and has been a lynchpin of the free software movement."
X.Org votes to join SPI
The results of the X.Org election are in. There were two things up for a vote: four seats on the board of directors and amending the bylaws to join Software in the Public Interest (SPI). Unlike last year's election, this year's vote met the required 2/3 approval to join SPI (61 voters out of 65 members, with 54 voting "Yes", 4 "No", and 3 "Abstain"). In addition, Egbert Eich, Alex Deucher, Keith Packard, and Bryce Harrington were elected to the board.
Articles of interest
FSFE Newsletter - May 2016
This edition of the Free Software Foundation Europe newsletter covers how the EU jeopardises its goals in standardisation with FRAND licensing, the European Commission vs. Google Android case, and much more.
New Books
Humble Book Bundle: Hacking, presented by No Starch Press
No Starch Press and Humble Bundle present the "Humble Book Bundle: Hacking". "The bundle includes a selection of the company's finest--such as worldwide best seller "Hacking: The Art of Exploitation"; classics like "Hacking the Xbox"; and more recent best sellers like "Automate the Boring Stuff with Python," "Black Hat Python," and "Practical Malware Analysis." This bundle is a true bargain--valued at over US $350--and with Humble Bundle's pay-what-you-want model, customers can pay whatever price they think is fair."
Calls for Presentations
DebConf16 Call for Proposals deadline extended
The call for proposals deadline for DebConf has been extended until May 15. DebConf will be held July 2-9 in Cape Town, South Africa.
EuroPython 2016: Extra Hot Topics - Call for Proposals
There will be a second call for proposals for EuroPython, open only June 4-12. This call is strictly reserved for hot topics, emerging technologies, brand new developments in software & hardware, and recent results in research and science. EuroPython will take place July 17-24 in Bilbao, Spain.
Tracing Summit 2016 Call for Presentations
There will be a Tracing Summit on October 12 in Berlin, Germany, co-located with the Embedded Linux Conference. "The Tracing Summit is organized by the Linux Foundation Diagnostic and Monitoring Workgroup (http://diamon.org). This event focuses on the tracing area, gathering people involved in development and end-users of tracing tools as well as trace analysis tools. The main target of this Tracing Summit is to provide room for discussion between people in the various areas that benefit from tracing, namely parallel, distributed and/or real-time systems, as well as kernel development." The call for proposals ends July 15.
CFP Deadlines: May 5, 2016 to July 4, 2016
The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.
| Deadline | Event Dates | Event | Location |
|---|---|---|---|
| May 6 | October 26-27 | All Things Open | Raleigh, NC, USA |
| May 6 | July 13-15 | LinuxCon Japan | Tokyo, Japan |
| May 8 | August 12-16 | PyCon Australia 2016 | Melbourne, Australia |
| May 15 | July 2-9 | DebConf16 | Cape Town, South Africa |
| May 15 | September 1-8 | QtCon 2016 | Berlin, Germany |
| May 15 | June 11-12 | Linuxwochen Linz | Linz, Austria |
| May 16 | October 31-November 2 | O'Reilly Security Conference | New York, NY, USA |
| May 23 | October 17-19 | O'Reilly Open Source Convention | London, UK |
| May 23 | August 20-21 | FrOSCon - Free and Open Source Software Conference | Sankt-Augustin, Germany |
| May 24 | August 18-21 | Camp++ 0x7e0 | Komárom, Hungary |
| May 24 | November 9-11 | O'Reilly Security Conference EU | Amsterdam, Netherlands |
| May 25 | October 5-7 | International Workshop on OpenMP | Nara, Japan |
| May 29 | September 20-23 | PyCon JP 2016 | Tokyo, Japan |
| May 30 | September 13-16 | PostgresOpen 2016 | Dallas, TX, USA |
| June 3 | June 24-25 | French Perl Workshop 2016 | Paris, France |
| June 4 | July 30-31 | PyOhio | Columbus, OH, USA |
| June 5 | September 26-27 | Open Source Backup Conference | Cologne, Germany |
| June 5 | September 9-10 | RustConf 2016 | Portland, OR, USA |
| June 10 | August 25-26 | Linux Security Summit 2016 | Toronto, Canada |
| June 11 | October 3-5 | OpenMP Conference | Nara, Japan |
| June 15 | September 8-9 | First OpenPGP conference | Cologne, Germany |
| June 15 | November 16-17 | Paris Open Source Summit | Paris, France |
| June 20 | September 9-11 | Kiwi PyCon 2016 | Dunedin, New Zealand |
| June 22 | September 19-23 | Libre Application Summit | Portland, OR, USA |
| June 26 | October 11-13 | Embedded Linux Conference Europe | Berlin, Germany |
| June 30 | November 29-December 2 | Open Source Monitoring Conference | Nürnberg, Germany |
If the CFP deadline for your event does not appear here, please tell us about it.
Upcoming Events
Early LinuxCon + ContainerCon keynotes announced
The Linux Foundation has announced the lineup of early keynote speakers for LinuxCon and ContainerCon. Speakers include Cory Doctorow, Margaret Heffernan, Ainissa Ramirez, and Linus Torvalds. LinuxCon and ContainerCon will take place August 22-24 in Toronto, Canada.
Events: May 5, 2016 to July 4, 2016
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| May 1-June 29 | Open Source Innovation Spring | Paris, France |
| May 2-5 | FOSS4G North America | Raleigh, NC, USA |
| May 9-13 | ApacheCon North America | Vancouver, Canada |
| May 10-12 | Samba eXPerience 2016 | Berlin, Germany |
| May 14-15 | Open Source Conference Albania | Tirana, Albania |
| May 14-15 | Community Leadership Summit 2016 | Austin, TX, USA |
| May 16-19 | OSCON 2016 | Austin, TX, USA |
| May 17-21 | PGCon - PostgreSQL Conference for Users and Developers | Ottawa, Canada |
| May 24-25 | Cloud Foundry Summit | Santa Clara, CA, USA |
| May 26 | NLUUG - Spring conference 2016 | Bunnik, The Netherlands |
| May 28-June 5 | PyCon 2016 | Portland, OR, USA |
| June 1-2 | Apache MesosCon | Denver, CO, USA |
| June 4-5 | Coliberator 2016 | Bucharest, Romania |
| June 11 | TÜBIX 2016 | Tübingen, Germany |
| June 11-12 | Linuxwochen Linz | Linz, Austria |
| June 14-15 | PyData Paris 2016 | Paris, France |
| June 19-21 | DockerCon | Seattle, WA, USA |
| June 20-23 | OPNFV Summit | Berlin, Germany |
| June 21-22 | Deutsche OpenStack Tage | Köln, Deutschland |
| June 21-24 | Open Source Bridge | Portland, OR, USA |
| June 21-25 | Third Julia Conference | Cambridge, MA, USA |
| June 21-28 | Wikimania | Esino Lario, Italy |
| June 22-26 | openSUSE Conference 2016 | Nürnberg, Germany |
| June 22-24 | USENIX Annual Technical Conference | Denver, CO, USA |
| June 23-July 1 | DebCamp | Cape Town, South Africa |
| June 24-25 | Hong Kong Open Source Conference 2016 | Hong Kong, Hong Kong |
| June 24-25 | devopsdays Silicon Valley 2016 | Mountain View, CA, USA |
| June 24 | Swiss PostgreSQL Day | Rapperswil, Switzerland |
| June 24-25 | French Perl Workshop 2016 | Paris, France |
| June 27-July 1 | Hack in Paris | Paris, France |
| June 27-July 1 | 12th Netfilter Workshop | Amsterdam, Netherlands |
| July 2-9 | DebConf16 | Cape Town, South Africa |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
