LWN.net Weekly Edition for September 29, 2016
GTK+, version numbering, and long-term support
GTK+ version 3.22 was released on September 21, bringing with it a range of improvements to Wayland support, gesture support for pressure-sensitive tablets, several new widgets, and more. The release also marks a turning point for how stable and development branches of the code will be maintained. Moving forward, the project is adopting a new scheme that allows it to designate certain stable releases for long-term support. The plan also breaks with past releases where version numbering is concerned, though the project is keen to downplay that change in favor of focusing on the support that stable releases will offer to downstream projects.
The new release scheme was announced on September 1 in a Google+ post and an accompanying blog post written by Allan Day. The blog post explains more of the background issues that led up to the decision to adopt a new scheme.
GTK+ has long used the traditional "major.minor.micro" numbering scheme (sometimes called semantic versioning) that was once the approach favored by free-software projects. Bumping the major number indicated a significant API break, breaking backward compatibility. Bumping the minor number to the next even value indicated a stable update, while the odd values designated development branches. Micro (or patch) releases were reserved for bug-fix updates.
But, in the GTK+ 3.x era, Day notes, the project picked up significant development speed and also adopted a strict six-month release cycle. That pace has led to concerns over GTK+'s stability, particularly for projects other than GNOME, which shares many developers and other contributors with the GTK+ project. The GTK+ developers, however, want the project to be useful for a wide range of projects outside of GNOME, which prompted discussions earlier in 2016 about changing the release schedule.
Rethinking
In June, after the annual GTK+ hackfest, Allison Lortie announced the proposed change in a blog post that sparked a fair share of confusion and concern. Commenters were evidently perplexed by the proposal, which included difficult-to-parse statements like this:
One early comment by "Alex" summed up much of the general reaction:
To be fair, though, releasing x.0 versions that were unstable was certainly not the intent of the scheme announced. Rather, the plan was meant to suggest that GTK+ version 4 would continue to evolve over the course of the subsequent 4.y releases. Nevertheless, the confusion was demonstrably a problem. At GUADEC in August, the GTK+ team reexamined the topic with a promise to present an updated plan as soon as possible.
Rethinking again
The September 1 announcements, then, constitute that update, which will hopefully prove clearer to outsiders. In essence, the GTK+ x.0.0 releases moving forward will be designated stable, long-term support versions, with the project planning to release an x.0.0 release about once every two to three years. In between these releases, minor updates will also appear that may introduce new functionality. The minor releases will not be bound to a fixed six-month release cycle, however.
Next, the GTK+ development branches will be numbered x.9, to indicate that they are unstable releases being built in preparation for release x+1. This means that, in the future, there may be (for example) a stable, long-term support GTK+ 5.0 available, a series of updated releases (GTK+ 5.2, 5.4, and so on), and a development branch numbered 5.9.
Furthermore, any features deprecated in one x.0 release will be removed in the following x+1.0 release. This is another area where GTK+ has not historically had a strict policy, so stating and adhering to a regular deprecation formula will no doubt please many outside developers. The new plan also states that minor releases may add new widgets and update the GDK drawing backends used by the various window systems supported, but that no other changes will be made. Finally, micro releases for bug fixes and security updates will be made for three years.
Thus, the total lifespan of the x.0 long-term support releases will be three years. The wording is a bit ambiguous as to whether x.2 and other minor updates will also be supported for three years (potentially several months after the x.0 release), but that does not sound like the intent of the plan.
On a technical note, the blog post notes that future development releases of GTK+ will be labeled with the future stable release's version number in the pkg-config file, in order to make them parallel-installable with the current release. So, for example, the pkg-config file in GTK+ 3.90 will be gtk+-4.0, so it will not conflict with the current stable release, GTK+ 3.22.
Development releases are expected to appear about once every six months, all bearing version numbers in the x.9 range (e.g., x.90, x.92, x.94, etc.). That puts some indirect pressure on the project to release a stable y.0 release once the development version's minor number reaches .98, as Sébastien Wilmet noted on the GTK+ development list.
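The numbering rules described above can be restated as a short Python sketch. It is an illustration only: the GtkVersion type and the helper names are hypothetical, and the sketch assumes that any minor number of 90 or above marks a development release. The transitional GTK+ 3.22 long-term support release is an exception that it does not model.

```python
from typing import NamedTuple


class GtkVersion(NamedTuple):
    major: int
    minor: int
    micro: int = 0


def is_development(v: GtkVersion) -> bool:
    # Assumption: minor numbers in the x.9 range (90, 92, 94, ...) are unstable.
    return v.minor >= 90


def pkg_config_name(v: GtkVersion) -> str:
    # Development releases carry the *next* stable series in their pkg-config
    # name, so GTK+ 3.90 installs gtk+-4.0 and can sit beside GTK+ 3.22.
    series = v.major + 1 if is_development(v) else v.major
    return f"gtk+-{series}.0"


def describe(v: GtkVersion) -> str:
    if is_development(v):
        return f"development release leading to {v.major + 1}.0"
    if v.minor == 0:
        return "long-term support release"
    # Not modeled: 3.22, which is declared long-term support despite its number.
    return "stable minor update"


for version in (GtkVersion(3, 90), GtkVersion(5, 0), GtkVersion(5, 2)):
    print(version, pkg_config_name(version), describe(version))
```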
The new plan sets out a fairly regular numbering and release scheme but, of course, transitioning between the old and new schemes will be a tad awkward. This awkwardness takes the form of the new stable release, GTK+ 3.22, being declared the first long-term support version, even though it is not branded with an x.0 version number. Hopefully, that will be seen as a small price to pay for more predictable releases.
Downstreaming
The hope is that the plan for major and minor releases will better serve downstream project developers and Linux distributions. A guarantee of three years of security fixes should be enough for most Linux distributions, while the promise to make no significant changes to GTK+ internals in minor releases ought to be welcome news for downstream application developers. For distributions that offer their own long-term support releases with a lifespan longer than three years, Day asks that distribution representatives get in touch with the GTK+ project to develop a support plan.
Day's blog post also assures downstream developers that the project is committed to doing a better job of communicating changes—and of doing so in advance:
One of the other criticisms the project has faced in the past was that too many decisions were made within the relatively small set of core GTK+ developers, with that information not always making its way out into the wider GTK+ community in a timely fashion. The project must still deliver on this promise to ensure that changes are well-communicated to the outside world, but acknowledging the concern and making a public commitment to doing better are important steps.
Despite the increased emphasis on meeting the needs of downstream developers, there has not yet been a public statement from GTK+'s largest downstream project, GNOME, on whether (or how) it will adopt the same updated version-numbering and stability plan. In the past, GNOME and GTK+ version numbers have stayed in sync; with the newly announced plan, GNOME would have to adjust its numbering and release schedule as well in order to maintain that relationship.
Then again, perhaps no such change is warranted. A big part of the rationale for GTK+'s change was to better serve non-GNOME projects; enabling those two projects to move at different paces could be just what the developers want.
The anatomy of a Vulkan driver
Jason Ekstrand gave a presentation at the 2016 X.Org Developers Conference (XDC) on a driver that he and others wrote for the new Vulkan 3D graphics API on Intel graphics hardware. Vulkan is significantly different from OpenGL, which led the developers to make some design decisions that departed from those made for OpenGL drivers.
He started with an "obligatory brag slide" (slides [PDF]) that outlined the progress that had been made on the driver in only eight months, with roughly three and a half people. Ekstrand, Kristian Høgsberg, and Chad Versace, with help from a dozen others, got a Vulkan driver working that was released (as open source) on the same day that the Vulkan specification was released in February. Not everything was written from scratch; the driver uses the same internal representation and back-end compiler that Mesa uses. The driver passed the conformance tests on day one as well, which is not something that everyone in the industry can say, Ekstrand said.
Vulkan is a new industry-standard 3D rendering and compute API from Khronos, which is the same group that maintains OpenGL. It is not simply OpenGL++, he said, as it has been redesigned from the ground up. Vulkan is designed for modern GPUs and software. It will run on currently shipping (OpenGL ES 3.1 class) hardware.
A lot has happened since SGI released OpenGL 1.0 in 1992, which is why a new 3D API is needed. In the 24 years since that first release: GPUs have become more powerful and flexible, memory has become much cheaper, and multi-core CPUs are common. OpenGL has done "amazingly well" over that time, but it is showing its age at this point.
Multi-threaded programs are now commonplace, which makes OpenGL's state machine based on a singleton context kind of obsolete. Off-screen rendering is common as well. Beyond that, GPU hardware has become more standardized, so application developers don't want the API to hide the details of what the GPU is doing as OpenGL does.
Vulkan takes a different approach. It has an object-based API where there is no global state. All state is stored in the command buffer and there can be multiple command buffers. It is more explicit about what the GPU is doing: texture formats, memory management, and synchronization are all client-controlled. Those things are needed to support multi-threading, but also make drivers simpler.
Vulkan drivers do no error checking. There is a set of open-source, vendor-neutral validation layers that do much the same checking as is done in Mesa but they are meant to be disabled at runtime. The idea is for application developers to check their Vulkan code during development, so "why burn 10% of my CPU doing validation" when there are no errors in the Vulkan code?
There is a short distance between the API call and the driver in Vulkan, rather than traversing multiple layers as in Mesa. There is also a short distance between the driver function and actually putting data into the command buffer for the GPU. There are "no extra layers", Ekstrand said.
To handle multiple generations of hardware, each with its own packet format and packing scheme, the Vulkan driver has header files that are generated using Python scripts to process an XML representation of the formats. There is a function that uses that header file information to pack the command data into the buffer in the right way. It has debugging support that can assert() for various problems and the code can be run under Valgrind to find other kinds of problems.
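To make the idea of packing from a format description concrete, here is a minimal Python sketch. It does not reproduce the driver's generated C headers or its real XML schema: the element names, attribute names, and the EXAMPLE_CMD layout are all invented for illustration, and the sketch assumes that no field straddles a 32-bit boundary.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical command description; the schema is invented for this sketch.
COMMAND_XML = """
<instruction name="EXAMPLE_CMD" length="2">
  <field name="opcode" start="24" end="31"/>
  <field name="length" start="0" end="7"/>
  <field name="address" start="32" end="63"/>
</instruction>
"""


def load_layout(xml_text):
    """Return (number of dwords, {field name: (start bit, end bit)})."""
    root = ET.fromstring(xml_text)
    fields = {f.get("name"): (int(f.get("start")), int(f.get("end")))
              for f in root.iter("field")}
    return int(root.get("length")), fields


def pack(xml_text, **values):
    """Pack named field values into little-endian 32-bit dwords."""
    n_dwords, fields = load_layout(xml_text)
    dwords = [0] * n_dwords
    for name, value in values.items():
        start, end = fields[name]
        width = end - start + 1
        # In the spirit of the debugging asserts: catch values that overflow.
        assert 0 <= value < (1 << width), f"{name} does not fit in {width} bits"
        # Simplification: a field must live entirely inside one dword.
        assert start // 32 == end // 32, f"{name} straddles a dword boundary"
        dwords[start // 32] |= value << (start % 32)
    return struct.pack(f"<{n_dwords}I", *dwords)


packet = pack(COMMAND_XML, opcode=0x7A, length=2, address=0x1000)
print(packet.hex())
```

Describing the layout as data means that supporting another hardware generation is a matter of adding a description rather than hand-writing new shift-and-mask code.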
To handle four separate Intel GPU generations, the code is compiled four times to create one version per generation. That allows the driver to keep up with new hardware more easily. The hardware-generation checks for each command function (as in the Mesa driver) are compiled away and the right thing is done for the generation in use. This is one example of where the team got to rethink things because it is a new, from-scratch driver.
One of the challenges faced by the team was in memory allocation. Vulkan provides a collection of heaps where clients can allocate VkDeviceMemory objects. The client can place VkImage or VkBuffer objects at explicit offsets within the VkDeviceMemory object. This doesn't map well to allocation from LibDRM, he said, but it does map well to Graphics Execution Manager (GEM) buffer objects. Other objects have small amounts of driver-allocated memory for state that the driver needs to track. The team had to figure out how to manage all those pieces of memory. Complicating matters was that the Intel hardware has different base addresses for different types of allocations (e.g. shaders, surface states), so the state information needs to be stored with others of the same type.
He and Høgsberg came up with a "crazy" memory allocation structure that they are pretty proud of, Ekstrand said. For device memory objects, GEM buffers are used; there is also a pool of GEM buffers that are used for back buffers. For the state objects, there are block pools that are allocated as a buffer object that grows in both directions as needed. The pools are initialized to provide objects of a specific size. Allocating from either end of the pool is required because of some hardware-specific restrictions.
The block pools are implemented as a 2GB memfd that gets mmap()-ed into the driver. An address in the middle is then turned into a GEM buffer object. The block pool is used to implement both a traditional "allocate and free" style state pool as well as a pool that is used for state that is associated with a command buffer. The latter pool has no free function; it simply gets reset when the command buffer is thrown away. It is a complicated infrastructure, but has worked well, he said.
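The shape of that allocator can be sketched in Python. This is only a model of the bookkeeping, not the driver's code: the class and method names are hypothetical, offsets are plain integers measured from the center point rather than addresses in an mmap()-ed memfd, and block recycling is left out.

```python
class BlockPool:
    """Hands out fixed-size blocks on both sides of a fixed center point."""

    def __init__(self, block_size, initial_blocks=4):
        self.block_size = block_size
        self.front_next = 0   # next free offset at or above the center
        self.back_used = 0    # bytes already handed out below the center
        self.front_capacity = block_size * initial_blocks
        self.back_capacity = block_size * initial_blocks

    def alloc_front(self):
        if self.front_next + self.block_size > self.front_capacity:
            self.front_capacity *= 2   # grow upward; old offsets stay valid
        offset = self.front_next
        self.front_next += self.block_size
        return offset

    def alloc_back(self):
        if self.back_used + self.block_size > self.back_capacity:
            self.back_capacity *= 2    # grow downward; old offsets stay valid
        self.back_used += self.block_size
        return -self.back_used


class StateStream:
    """Bump allocator for per-command-buffer state: no free(), only reset()."""

    def __init__(self, pool):
        self.pool = pool
        self.block = None
        self.used = 0

    def alloc(self, size):
        assert size <= self.pool.block_size
        if self.block is None or self.used + size > self.pool.block_size:
            self.block = self.pool.alloc_front()   # start a fresh block
            self.used = 0
        offset = self.block + self.used
        self.used += size
        return offset

    def reset(self):
        # Everything is dropped at once when the command buffer goes away.
        # Simplification: blocks are not returned to the pool here.
        self.block = None
        self.used = 0
```

Because offsets are relative to the center, growing the pool in either direction never invalidates an offset that has already been handed out.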
Most hardware has support for compressed surfaces, but not all parts of the GPU understand all of the different formats. So a "resolve" operation is needed to decompress or recompress the surface at different points in the pipeline. Due to the multi-threaded nature of Vulkan, though, there is no real way to track when the resolves are needed on the CPU side. The Vulkan API provides two features ("render passes" and "layout transitions") that can help. Layout transitions are not currently used in the driver, but render passes delineate where resolves may be needed.
It is easier to write a Vulkan driver than one for OpenGL, Ekstrand said. The lack of error checking simplifies things to start with. The SPIR-V shader language is a bit easier to deal with than OpenGL's GLSL. Also, the Vulkan conformance tests consist of 115,000 tests that the driver developer doesn't have to write. It is a good set of tests, but there are still some holes, he said.
Some things are harder to do for Vulkan than for OpenGL. There is no CPU-side object state-tracking, for one thing. In addition, "applications have a lot more power for stupid". If the application is doing something wrong, which results in a bug filed against the driver, there is a good bit of work—without good tools—needed to track down the problem.
As far as sharing code between Vulkan and OpenGL drivers goes, there are a couple of different approaches. The approach taken was a "toolbox" that provides a number of different parts, from which a driver can be created. That approach has also provided better infrastructure for building other drivers in the future. Those looking for more details may want to view the YouTube video of the talk.
[I would like to thank the X.Org Foundation for sponsoring my travel to Helsinki for XDC.]
OpenType 1.8 and style attributes
In last week's look at the new revision of the OpenType font format, we focused primarily on the new variations font feature, which makes it possible to encode multiple design "masters" into a single font binary. This enables the renderer to generate a new font instance at runtime based on interpolating the masters in a particular permutation of their features (weight, width, slant, etc). Such new functionality will, at least in some cases, mean that application software will have to be reworked in order to present the available font variations to the end user in a meaningful fashion.
But there is another change inherent in the new feature that may not be as obvious at first glance. Variations fonts redefine the relationships between individual font files and font "families." There is a mechanism defined in the new standard to bridge the gap between the old world and the new, called the Style Attributes (STAT) table. For it to work in a meaningful fashion, though, it must be implemented by traditional, non-variations fonts as well—which may not be an easy sell.
There is no formal definition of a font family, but in general usage the term refers to a set of fonts that share core design principles and, in most cases, use a single name and come from the same designer or design team. The Ubuntu Font Family, for example, includes upright and italic fonts in four weights at the standard width, one weight of upright-only condensed width, and two weights (in upright and italic) of a monospaced variant.
The designers clearly present the fonts as a single conceptual unit, even though (for example) the monospaced version has several characters that use considerably different designs than the proportional version. Some people might argue that the monospace fonts are a separate family, and that together with the proportional fonts, they form a "superfamily." Since no one is in charge of the terminology, such disagreements happen. Similar ambiguities could be found in the Source Code Pro, Source Sans Pro, and Source Serif Pro fonts from Adobe, which were developed separately and take their design cues from unrelated historical typefaces.
An indisputable key to a font family, though, is the fact that the fonts belong together when they are presented to the user. In an OpenType variations font, there is a technical challenge at present but, conceptually, the task is easy: each of the various instances of the font comes from the same source and it can be addressed and otherwise treated as a set of coordinates in the overall design space: (weight=bold, width=normal, italic=no), for example, or (weight=750, width=200, italic=0), to be a bit more numerical. But there has never been a consistent way to map those sorts of design-space characteristics onto standard, non-variations font files. Doing so is the purpose of the STAT table.
Family matters
At the top level, the table lists all of the axes of variation used in the font family. Each axis has a string that can be displayed in user interfaces and an optional axisOrdering number. That ordering has a couple of possible interpretations. One is the order in which the axes should be sorted in a font name. For instance, if width sorts before weight, then a list would look like:
Foo Condensed
Foo Condensed Bold
Foo
Foo Bold
Foo Extended
Foo Extended Bold
and so forth. If weight sorts before width, though, then one would see:
Foo Condensed
Foo
Foo Extended
Foo Condensed Bold
Foo Bold
Foo Extended Bold
A different interpretation of the axisOrdering numbers would be to specify the order in which the various axes are shown in a font name. That is, whether to show "Foo Condensed Bold" or "Foo Bold Condensed" in the font menu.
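The first interpretation amounts to a multi-key sort in which axisOrdering decides which axis is the primary key. The following Python sketch reproduces the two listings above; the function name and the representation of styles as small dictionaries are assumptions made for illustration.

```python
from itertools import product

WIDTHS = ["Condensed", "", "Extended"]   # "" stands for the normal width
WEIGHTS = ["", "Bold"]                    # "" stands for the regular weight


def family_listing(axis_order):
    """Return the font names sorted with the axes in `axis_order` priority."""
    def sort_key(style):
        position = {"width": WIDTHS.index(style["width"]),
                    "weight": WEIGHTS.index(style["weight"])}
        return tuple(position[axis] for axis in axis_order)

    styles = [{"width": w, "weight": b} for w, b in product(WIDTHS, WEIGHTS)]
    names = []
    for style in sorted(styles, key=sort_key):
        parts = ["Foo", style["width"], style["weight"]]
        names.append(" ".join(p for p in parts if p))
    return names


print(family_listing(["width", "weight"]))
print(family_listing(["weight", "width"]))
```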
Complicating this interpretation is the fact that OpenType already supports several other mechanisms with which to specify a font's name including all of those design attributes, via the name table. The three options are Name IDs 1 and 2, which can be used to specify a Font Family Name and Font Subfamily Name (respectively), Name IDs 16 and 17, which can be used for a Typographic Family Name and Typographic Subfamily Name, and Name IDs 21 and 22, which are for a Weight/Width/Slope (WWS) Family Name and WWS Subfamily Name. Each pair of Name ID entries can take any string, which are intended to be concatenated together in Family Subfamily fashion. The redundancy of multiple such similar options has not escaped the community's notice, of course; it remains for historical reasons.
Complicating matters even more is the fact that different software platforms interpret name table data in their own peculiar ways, such as when parsing and tokenizing the strings in Name IDs. In the OpenType session at ATypI 2016, Peter Constable noted that Microsoft's Graphics Device Interface (GDI) and Windows Presentation Foundation (WPF) each has its own approach to assembling the font name from the Name IDs, and CSS uses a different approach altogether. The obvious question, he said, is why add yet another possible naming mechanism to the mix. The answer is that STAT does not impose a hierarchical solution like the Family/Subfamily options in name do; it defines the variation axes and that is all. Whereas name table entries can be arbitrary strings that may or may not make sense, the thinking goes, at least STAT axes are well-defined and can be reasoned about.
The mapping problem
Confusing though the naming issues may be, a more practical feature of the STAT table is that fonts can provide a mapping between the numeric values defined for OpenType 1.8 axes and the names commonly used by the various font classification systems and shown in user interfaces. The predefined axes each have an expected range. Italic (ital) must be between zero and one; slant (slnt) must be between -90 and 90 degrees; optical size (opsz) and width (wdth) must be greater than zero; weight (wght) must be between one and 1000. During the interpolation process for variable fonts, all of these values get normalized to [-1,1], but these human-readable ranges were selected to better map to how existing font families are described, including the conventions of CSS, GDI, and WPF.
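As a rough illustration of that normalization step, the sketch below maps a user-facing axis value onto [-1,1] with a piecewise-linear function that puts the axis default at zero. The formula is stated here as an assumption about how the mapping works, and the weight-axis limits used in the example (minimum 100, default 400, maximum 900) are hypothetical values for an imaginary font.

```python
def normalize(value, minimum, default, maximum):
    """Map a user-facing axis value onto [-1, 1], with the default at 0."""
    value = max(minimum, min(maximum, value))   # clamp to the axis range
    if value < default:
        return -(default - value) / (default - minimum)
    if value > default:
        return (value - default) / (maximum - default)
    return 0.0


# Hypothetical weight axis: minimum 100, default 400, maximum 900.
for weight in (100, 250, 400, 700, 900):
    print(weight, normalize(weight, 100, 400, 900))
```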
So it's quite simple to specify in the STAT table that the "Regular" weight of a font is 200, the "Semibold" is 450, and the "Bold" is 625, for example. The table even offers an ElidableAxisValueName flag to indicate that "Regular" (for example) can be dropped from the name shown in UIs, which is a nice convenience for end users.
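A small Python sketch shows how a user interface might turn such a mapping into a display name. The AxisValue type, the axis order, and the width entries are hypothetical; only the three weight values come from the example above, and the real table layout is more involved than this model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AxisValue:
    axis: str
    value: float
    name: str
    elidable: bool = False   # models the ElidableAxisValueName flag


AXIS_VALUES = [
    AxisValue("wght", 200, "Regular", elidable=True),
    AxisValue("wght", 450, "Semibold"),
    AxisValue("wght", 625, "Bold"),
    # Hypothetical width entries, added only to make the example two-axis.
    AxisValue("wdth", 100, "Normal", elidable=True),
    AxisValue("wdth", 75, "Condensed"),
]
AXIS_ORDER = ["wdth", "wght"]   # assumed naming order: width before weight


def display_name(family, coordinates):
    """Build a menu name, dropping any axis value flagged as elidable."""
    parts = [family]
    for axis in AXIS_ORDER:
        match = next(av for av in AXIS_VALUES
                     if av.axis == axis and av.value == coordinates[axis])
        if not match.elidable:
            parts.append(match.name)
    return " ".join(parts)


print(display_name("Foo", {"wdth": 100, "wght": 200}))
print(display_name("Foo", {"wdth": 75, "wght": 625}))
```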
Where things become trickier, however, is when a font family starts out with a few font files at first (possibly even just one) and adds more later. For example, consider a variations font that supports the weight and width axes, all in roman (non-italic) style. In that case, there would be no ital axis defined in the font file. But if a matching italic variations font is released later, then the original font's STAT table is suddenly incomplete, because it does not indicate where that first font lies on the font family's new roman-to-italic axis.
The OpenType 1.8 specification offers a fix for this through the STAT table. The newly released italic font should include (naturally) an ital variation axis, and the axis's record in the STAT table would include the relevant entries for the new font, plus one entry for the old font as well. The old font's record gets marked with the OlderSiblingFontAttribute flag, which is meant to indicate to the application or operating system where the old font gets mapped into the new, expanded font family on the ital axis. In our simple example, the entry for the old font would be a zero on the ital axis, but lots of other permutations are certainly possible.
So this feature lets one font file supply data about a separate, older font file that software implementations are expected to read and adhere to. The specification does not dictate how a program should go about determining which older font file is the one referenced by entries flagged with OlderSiblingFontAttribute. Presumably, some Name ID(s) from the name table are involved but, as we have seen, there are several of those to choose from. And it is possible that more than one older font might need to be retroactively referenced in such a fashion—consider yet another new font added to our example above that adds an optical-size variant. That font would have to include OlderSiblingFontAttribute information for the older fonts as well.
Assuming that the old and new font files are released by the same (non-pathological) person and there are no naming conflicts with other fonts on the system, there should be no misunderstandings. But it is not quite clear how software should interpret matters when the font name in the new font file seems to match more than one old font file. And the specification recommends that all new non-variations font families supply the STAT table with OlderSiblingFontAttribute flags, too. For a traditional font family like the Ubuntu Font Family, there are lots of individual files to be built and distributed (13 in the Ubuntu Font's case), with a lot of tables that could get confused or out-of-sync as updates are installed.
Practically speaking, it will be quite some time before OpenType variations fonts become the norm on most users' systems. So type foundries can be expected to release variations-font versions of their binaries as well as sets of individual, non-variations font files. Getting the STAT tables right may take some time; deciphering the font-family information that the tables encode, on the software side, may take some time as well.
Security
The trouble with new TLS version numbers
The TLS working group in the IETF is currently working on the next version of the encryption protocol: TLS 1.3. The new protocol will bring performance improvements by avoiding round trips and will deprecate a lot of dangerous cryptographic constructions. But, apart from technical improvements, it will also bring something that may seem trivial, but that could cause a lot of trouble: a new version number. That will probably lead to a redesign of the TLS version-negotiation mechanism.
When a new version of a protocol gets introduced, there must be some mechanism to keep compatibility with existing implementations. Not everyone will move to TLS 1.3; many legacy implementations will keep using TLS 1.2 or older versions for years to come.
TLS uses a version mechanism that may seem relatively simple, but it has been the source of a surprising number of problems. When a client connects to a server, it sends the highest version number it supports in the ClientHello message. The server can reply with any version equal to or lower than that. Therefore, if a client connects to a server with a maximum version number of 1.2 and the server only supports TLS 1.0, it will answer with that version. As long as the client still supports TLS 1.0, a successful connection can be established.
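The intended behavior fits in a few lines of Python. This is a sketch of the negotiation logic only, with hypothetical function names. Versions are written as the (major, minor) pairs used on the wire, where SSL 3 is 3.0 and TLS 1.0 through 1.2 are 3.1 through 3.3.

```python
SSL3, TLS10, TLS11, TLS12 = (3, 0), (3, 1), (3, 2), (3, 3)


def server_choose_version(client_max, server_supported):
    """Pick the highest server-supported version not above the client's maximum."""
    candidates = [v for v in server_supported if v <= client_max]
    return max(candidates) if candidates else None


def client_accepts(chosen, client_supported):
    """The client completes the handshake only if it supports the server's choice."""
    return chosen in client_supported


chosen = server_choose_version(TLS12, [SSL3, TLS10])
print(chosen, client_accepts(chosen, [TLS10, TLS11, TLS12]))
```

A correct server never fails merely because the client offers a version that the server does not know; it answers with the best version it does support.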
This ideal case often doesn't occur, however, due to faulty server implementations. Many servers simply fail once one tries to connect with a higher TLS version than they support. The failure can happen in a variety of ways. Some servers terminate the connection on a TCP level or send a TLS error alert, others simply wait until a timeout happens. Some also successfully send a TLS ServerHello and almost complete a handshake, but fail later during verification of the FinishedMessage, which is the last part of the handshake. All these behaviors are bugs in the server software.
Version intolerance
This problem is known as "version intolerance" and it has cropped up every time browsers and TLS implementations have introduced new protocol versions. An old web page documents the problem; it was written by Netscape in 2003 and can be found in the Mozilla wiki. Most of the affected devices were enterprise TLS appliances, although occasionally free implementations like OpenSSL were also affected.
Browser vendors have reacted to these problems with a questionable strategy: after a connection failure, the browser tries to reconnect with a lower TLS or SSL version. Back then, the only versions in widespread use were SSL 3 and TLS 1.0. While this avoided problems with broken servers, it introduced another problem: these downgrades occasionally happened because of dropped packets due to bad network connections. Therefore protocol features that were only supported in TLS 1.0 stopped working on an irregular basis.
One extension that TLS 1.0 introduced is called Server Name Indication (SNI) and it removed a limitation of the old SSL protocol, by allowing multiple domains with different certificates to be hosted on the same IP address. SNI allows shared hosting services that often host hundreds of websites on the same IP to deploy HTTPS. The deployment of SNI was severely hampered by the browsers' version fallbacks in TLS, because website visitors would randomly see the wrong certificate due to a connection downgrade to SSL 3.
The version fallbacks also introduced security issues. If browsers try to reconnect with a lower TLS or SSL version, then a man-in-the-middle attacker can force these version downgrades by blocking ClientHello messages with higher version numbers. At the Black Hat USA conference in 2014, Antoine Delignat-Lavaud presented an attack called "virtual host confusion" (YouTube video, paper [PDF]). The attack exploited the fact that an attacker can disable SNI by a forced version downgrade.
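The following Python sketch models why the fallback is dangerous. The network functions are hypothetical stand-ins for real connections: one delivers every ClientHello, and the other plays an attacker who silently drops any ClientHello that offers more than SSL 3.

```python
SSL3, TLS10, TLS11, TLS12 = (3, 0), (3, 1), (3, 2), (3, 3)


def honest_network(client_max, server_max):
    """Deliver the ClientHello; the server answers with the best common version."""
    return min(client_max, server_max)


def attacker_network(client_max, server_max):
    """Man in the middle: drop every ClientHello that offers more than SSL 3."""
    if client_max > SSL3:
        return None   # the client only sees a failed connection
    return min(client_max, server_max)


def connect_with_fallback(client_versions, server_max, network):
    """Browser-style fallback: retry with lower versions after each failure."""
    for version in sorted(client_versions, reverse=True):
        negotiated = network(version, server_max)
        if negotiated is not None:
            return negotiated
    return None


browser = [SSL3, TLS10, TLS11, TLS12]
print(connect_with_fallback(browser, TLS12, honest_network))
print(connect_with_fallback(browser, TLS12, attacker_network))
```

Both endpoints support TLS 1.2, yet the attacker steers the connection to SSL 3 without breaking any cryptography, simply by making the higher versions appear to fail.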
Later that year, Bodo Möller, Thai Duong, and Krzysztof Kotowicz discovered the POODLE attack — a padding oracle attack that exploits the fact that in SSL 3 the padding of the encryption was undefined and could have any value. But that alone wouldn't have been very interesting, because at that time SSL 3 was rarely used. In combination with version fallbacks, however, POODLE became a severe issue because almost all servers and clients still supported SSL 3. With version downgrades it was easy to force a connection to use the old protocol. The POODLE paper introduced the term "protocol downgrade dance" for the downgrade behavior of browsers.
In response to these kinds of problems, a mechanism called "Signaling Cipher Suite Value" (SCSV) was introduced. By including a special cipher suite value in a retried handshake, clients could signal to servers that the connection was a fallback; a non-defective server that supported a higher version would then refuse to establish the downgraded connection. SCSV got standardized as RFC 7507, but it quickly became almost obsolete, because browser vendors decided that they could get rid of the questionable version fallbacks entirely.
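In RFC 7507, the marker is a cipher suite value (TLS_FALLBACK_SCSV, 0x5600) that a client adds to its ClientHello when it retries with a lower version, and a server that could have negotiated something higher answers with an inappropriate_fallback alert (code 86). The Python sketch below shows the server-side check only; the function name and the example cipher list are hypothetical.

```python
SSL3, TLS10, TLS11, TLS12 = (3, 0), (3, 1), (3, 2), (3, 3)

TLS_FALLBACK_SCSV = 0x5600      # cipher suite value assigned by RFC 7507
INAPPROPRIATE_FALLBACK = 86     # alert code assigned by RFC 7507


def server_check_fallback(client_version, cipher_suites, server_max):
    """Return an alert code to send, or None to continue the handshake."""
    if TLS_FALLBACK_SCSV in cipher_suites and server_max > client_version:
        # The client says it is retrying at a lower version, but this server
        # supports a higher one, so the earlier failure was not legitimate.
        return INAPPROPRIATE_FALLBACK
    return None


# A downgraded retry at SSL 3 against a server that supports TLS 1.2.
print(server_check_fallback(SSL3, [0xC02F, TLS_FALLBACK_SCSV], TLS12))
```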
SCSV is notable, though, because it is a feature for the TLS standard that exists solely to work around buggy implementations. But it's not the only such feature. Some devices from the company F5 fail to allow connections if a handshake has a size between 256 and 512 bytes. Therefore a padding extension was introduced that simply expands the handshake to avoid those sizes. However, it later turned out that this solution would cause other implementations to fail, because they don't accept handshakes larger than 512 bytes.
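The padding workaround is simple arithmetic, sketched here in Python. The function name is hypothetical, and the sketch ignores details such as the space taken by the padding extension's own header; it only shows how a handshake message is pushed out of the problematic size range.

```python
def padding_bytes_needed(hello_length, low=256, high=512):
    """Bytes of padding that lift a handshake out of the [low, high) range."""
    if low <= hello_length < high:
        return high - hello_length
    return 0


for length in (200, 256, 300, 512):
    print(length, padding_bytes_needed(length))
```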
The return of fallbacks
Despite all the drama version fallbacks have caused, they may make a comeback. In a recent blog post, Google developer Adam Langley commented:
Langley was certain that there is no way to avoid TLS version fallbacks when TLS 1.3 gets introduced. The reason is that currently about three percent of the major web pages have problems with TLS 1.3 handshakes. In theory, browser vendors could skip the fallbacks and simply break non-compliant sites; however, that is unlikely to happen. A browser that breaks a large number of sites and devices will likely face a backlash from users and may push those users to choose another browser. Chrome has often faced heavy criticism from users when it deprecated insecure mechanisms in the past. When Google deprecated insecure Diffie-Hellman parameters, it broke connections to a Cisco RV042G router. While it is obvious that Cisco was at fault here, the user reactions that can be seen in Chrome's public forum blamed Google for its effort to make the Internet more secure.
TLS 1.3 contains a mechanism similar to SCSV that could avoid the worst consequences of version intolerance. By sending a specific value in the random number field of the handshake, a server can indicate that it doesn't want downgraded connections. Still, this is far from ideal, as it adds another layer of complexity. Ideally, vendors should just fix their TLS implementations.
Vendor responses
The vendors responsible for broken version negotiations mostly don't seem to care a lot. I have tried to identify affected vendors. Many of the buggy web pages use Citrix Netscaler devices. Citrix has informed me that it is aware of this problem, although it doesn't consider it to be a security issue. Citrix was unable to give any timeline on when this bug will be fixed.
Several products from IBM, among them IBM HTTP Server and Lotus Domino, are also affected. At first IBM security simply denied that there is a problem and claimed that the issue was already fixed in the current HTTP Server release. After informing them that I actually tested with the latest release and that it is still affected, the company looked into it. IBM informed me that it doesn't treat the issue as a security vulnerability. IBM was unable to give a concrete timeline when a fix will be available, but informed me that it will likely happen with the next version of its TLS implementation, GSKit, which will be released by the end of the year. A while later, IBM went back into denial mode and informed me that the issue was closed, because the company was unable to reproduce it — after it already confirmed that it was working on a fix.
So two major vendors didn't consider this issue a security vulnerability and didn't see any urgency to tackle it. While it is true that this issue itself doesn't cause a security problem for the device owner, past experience has shown that, down the line, these bugs can cause security issues because they force client implementations to adopt dangerous behavior.
The third vendor that could be identified was Cisco; version intolerance affects its ACE load-balancer devices. These devices are out of support and no longer receive updates. It was made clear to me that Cisco won't consider any exceptions to its end-of-life policy. So people who still use these devices will have to live with this bug, with no way of fixing it. Cisco did promise to verify whether devices that are still supported are also affected by this bug. As the software of these devices is proprietary, there is no way for users to fix these bugs themselves.
I also tried to contact operators of major affected web pages, but with limited success. The most notable web pages that fail with a TLS 1.3 handshake are apple.com, ebay.com, and various localized versions of PayPal. In many cases, only connections without a leading www are affected. The reason for that is probably that the www version of a site is often transferred to a content delivery network, while the domain without www is delivered by another device that simply forwards connections.
Apple and eBay didn't answer questions about their version intolerant web services; both sites are still affected. PayPal simply said that TLS issues aren't covered by their bug bounty program, but refused to discuss the issue any further.
Server operators can test their server for TLS version intolerance with the SSL Labs test or with the testssl.sh tool. Both tests have limitations and don't catch all instances of version intolerance. The most reliable way to test right now is to use the Beta or Dev channel release of Chrome and manually enable TLS 1.3 (via chrome://flags option "Maximum TLS version enabled") or use Firefox Nightly (set "security.tls.version.max" and "security.tls.version.fallback-limit" to "4" in about:config). Trying to access version intolerant sites that usually support HTTPS will result in a connection failure.
Rethinking version negotiation
Given the situation, Google developer David Benjamin proposed a different route with a redesign of the whole version negotiation mechanism. He suggested that the version could be negotiated with an extension that sends a list of supported newer versions. Obviously the same problem with version intolerance could happen again with such a solution in the future: servers may simply not work if they see any version in the extension that they don't know.
To avoid this, Benjamin proposed that browsers could randomly send bogus version numbers that get reserved with a guarantee that they will never be used for any real TLS version. Any correct implementation should just ignore all unsupported version values. Bugs in servers that fail when they see a version number they don't support would likely be discovered much earlier, so they probably will never make it into production releases. It is still possible that vendors could implement this in the wrong way by just ignoring the reserved bogus version numbers. However, it is hardly imaginable that one does so without outright trying to create non-compliant software.
Benjamin also proposed a generalized variant of this mechanism under the name Generate Random Extensions And Sustain Extensibility (GREASE). The same way that bogus version numbers are sent could be used for extensions and cipher suites to avoid bugs in those areas.
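The idea of list-based negotiation with reserved bogus values can be illustrated with a small sketch. It assumes the reserved code points that were later standardized for GREASE in RFC 8701 (0x0A0A, 0x1A1A, and so on up to 0xFAFA); the function names and the selection policy are hypothetical and do not reflect any real TLS stack.

```python
import random

# Reserved values of the form 0x?A?A (assumption: the set later fixed in RFC 8701).
GREASE_VALUES = [(n << 12) | 0x0A00 | (n << 4) | 0x0A for n in range(16)]

TLS12 = 0x0303  # wire value {3, 3}
TLS13 = 0x0304  # wire value {3, 4}


def client_offer(real_versions, rng=random):
    """Client side: advertise real versions plus one randomly chosen bogus one."""
    return [rng.choice(GREASE_VALUES)] + list(real_versions)


def server_select(offered, supported):
    """Server side: take the first offered version we know, ignore the rest."""
    for version in offered:
        if version in supported:
            return version
    return None  # no common version; a real server would send an alert


if __name__ == "__main__":
    offer = client_offer([TLS13, TLS12])
    print([hex(v) for v in offer])
    print(hex(server_select(offer, {TLS12})))
```

A server that fails on any unknown value would break against such a client right away, during its own testing, which is the point of the scheme.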
The proposal for a TLS version negotiation via an extension was received with skepticism during the last IETF conference in Berlin. It would further complicate an already complicated handshake. The existing ClientHello already contains two version numbers, the TLS record layer version and the real ClientHello version. The TLS record layer version never had any real meaning, so most implementations simply set it to the version value of TLS 1.0 and ignore it. TLS 1.3 will make this official and says that it must be ignored. What further adds to confusion is that the version numbers sent over the wire don't match the version numbers of the protocol. For historic reasons — all versions of TLS came after SSL version 3 — TLS 1.0 is indicated with the value pair {3, 1}, TLS 1.3 will be {3, 4}.
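The off-by-one between protocol names and wire values is easy to show in code. The following is a small sketch of the mapping described above; the function name is made up for illustration.

```python
def tls_wire_version(minor: int) -> tuple:
    """Map TLS 1.<minor> to the {major, minor} pair sent over the wire.

    All TLS versions follow SSL version 3, so the major byte stays 3 and
    the minor byte is one higher than the TLS minor version.
    """
    if not 0 <= minor <= 3:
        raise ValueError("only TLS 1.0 through TLS 1.3 are covered here")
    return (3, minor + 1)


if __name__ == "__main__":
    for minor in range(4):
        print("TLS 1.%d -> %s" % (minor, tls_wire_version(minor)))
```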
The TLS community was therefore uneasy with the idea of adding another layer of complexity. But Benjamin's latest proposal got more support on the mailing list than during the IETF conference. It now has rough consensus and will most likely be part of TLS 1.3.
The GREASE strategy is an interesting new paradigm for designing protocols in an ecosystem where many vendors ship low-quality products that implement specifications incorrectly. There is a need to stay compatible with an existing infrastructure of defective devices. Similar strategies have been used in other cases. HTTP/2, for example, is not negotiated over a normal HTTP request; instead, an extension mechanism for TLS called Application-Layer Protocol Negotiation (ALPN) is used to negotiate the higher version.
David Benjamin's GREASE concept goes one step further and tries to anticipate potential failures. He has tried to design a protocol where bugs will show up before products are shipped. It'll be interesting to see whether this leads to a less fragile TLS ecosystem.
Brief items
Security quotes of the week
However, events of the past week have convinced me that one of the fastest-growing censorship threats on the Internet today comes not from nation-states, but from super-empowered individuals who have been quietly building extremely potent cyber weapons with transnational reach.
More than 20 years after Gilmore first coined that turn of phrase, his most notable quotable has effectively been inverted — “Censorship can in fact route around the Internet.” The Internet can’t route around censorship when the censorship is all-pervasive and armed with, for all practical purposes, near-infinite reach and capacity. I call this rather unwelcome and hostile development the “The Democratization of Censorship.”
OpenSSL security advisory for September 26
This OpenSSL security advisory is notable in that it's the second one in four days; sites that updated after the first one may need to do so again. "This security update addresses issues that were caused by patches included in our previous security update, released on 22nd September 2016. Given the Critical severity of one of these flaws we have chosen to release this advisory immediately to prevent upgrades to the affected version, rather than delaying in order to provide our usual public pre-notification."
New vulnerabilities
bash: code execution
| Package(s): | bash | CVE #(s): | CVE-2016-0634 |
| Created: | September 26, 2016 | Updated: | December 13, 2016 |
| Description: | From the Red Hat bugzilla: A vulnerability was found in a way bash expands the $HOSTNAME. Injecting the hostname with malicious code would cause it to run each time bash expanded \h in the prompt string. |
| Alerts: | |
bind: denial of service
| Package(s): | bind | CVE #(s): | CVE-2016-2776 |
| Created: | September 28, 2016 | Updated: | October 25, 2016 |
| Description: | From the CVE entry: buffer.c in named in ISC BIND 9 before 9.9.9-P3, 9.10.x before 9.10.4-P3, and 9.11.x before 9.11.0rc3 does not properly construct responses, which allows remote attackers to cause a denial of service (assertion failure and daemon exit) via a crafted query. |
| Alerts: | |
drupal7-google_analytics: cross-site scripting
| Package(s): | drupal7-google_analytics | CVE #(s): | |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | The drupal "Google Analytics" module suffers from a cross-site scripting vulnerability. See this advisory for details. "This vulnerability is mitigated by the fact that an attacker must have a role with the permission 'Administer Google Analytics'." |
| Alerts: | |
drupal panels: multiple vulnerabilities
| Package(s): | drupal7-panels | CVE #(s): | |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | The Drupal "Panels" contrib module suffers from multiple "critical" vulnerabilities. "Much of the functionality to modify these panels rely on backend routes that call administrative forms. These forms did not provide any access checks, or site specific encoded urls. This can allow an attacker to guess the backend url as an anonymous user and see data loaded for the form." |
| Alerts: | |
dwarfutils: two vulnerabilities
| Package(s): | dwarfutils | CVE #(s): | CVE-2016-7510 CVE-2016-7511 |
| Created: | September 26, 2016 | Updated: | October 10, 2016 |
| Description: | From the Debian LTS advisory: It was discovered that there were out-of-bounds read issues in dwarfutils, a library to consume and produce DWARF debug information. |
| Alerts: | |
firefox: multiple vulnerabilities
| Package(s): | firefox | CVE #(s): | CVE-2016-5256 CVE-2016-5271 CVE-2016-5273 CVE-2016-5275 CVE-2016-5279 CVE-2016-5282 CVE-2016-5283 |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | Among the many vulnerabilities fixed in the firefox 49 release are CVE-2016-5256 (memory corruption bugs), CVE-2016-5271 (information disclosure), CVE-2016-5273 (code execution), CVE-2016-5275 (code execution), CVE-2016-5279 (information disclosure), CVE-2016-5282 (loading of favicons via non-whitelisted protocols), and CVE-2016-5283 (information disclosure). |
| Alerts: | |
freerdp: denial of service
| Package(s): | freerdp | CVE #(s): | CVE-2013-4118 |
| Created: | September 28, 2016 | Updated: | October 4, 2016 |
| Description: | From the openSUSE advisory: Add a NULL pointer check to fix a server crash. See the openSUSE bug report for more information. |
| Alerts: | |
Horde: cross-site scripting
| Package(s): | php-horde-Horde-Mime-Viewer | CVE #(s): | |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | According to this commit, Horde renders SVG images in the browser in a way that is subject to cross-site scripting attacks. |
| Alerts: | |
Horde: cross-site scripting
| Package(s): | php-horde-Horde-Text-Filter | CVE #(s): | |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | According to the Red Hat bug tracker, Horde suffers from a "possible XSS vulnerability with data:html and form action was found in Text Filter". |
| Alerts: | |
imagemagick: code execution
| Package(s): | imagemagick | CVE #(s): | |
| Created: | September 26, 2016 | Updated: | September 28, 2016 |
| Description: | From the Debian advisory: This updates fixes several vulnerabilities in imagemagick: Various memory handling problems and cases of missing or incomplete input sanitising may result in denial of service or the execution of arbitrary code if malformed SIXEL, PDB, MAP, SGI, TIFF and CALS files are processed. |
| Alerts: | |
irssi: heap corruption
| Package(s): | irssi | CVE #(s): | CVE-2016-7045 CVE-2016-7044 |
| Created: | September 22, 2016 | Updated: | October 11, 2016 |
| Description: | According to the irssi advisory, a missing length check can cause a range of memory to be overwritten. Evidently, only zeroes can be written, so opinions differ on whether this vulnerability is exploitable for code execution. |
| Alerts: | |
mactelnet: code execution
| Package(s): | mactelnet | CVE #(s): | CVE-2016-7115 |
| Created: | September 26, 2016 | Updated: | September 28, 2016 |
| Description: | From the CVE entry: Buffer overflow in the handle_packet function in mactelnet.c in the client in MAC-Telnet 0.4.3 and earlier allows remote TELNET servers to execute arbitrary code via a long string in an MT_CPTYPE_ENCRYPTIONKEY control packet. |
| Alerts: | |
mod_cluster: "remote exploits"
| Package(s): | mod_cluster | CVE #(s): | |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | The Fedora advisory says: "Fixed remote exploits in Apache HTTP Server mod_manager and mod_proxy_cluster modules". Further information appears to be unavailable. |
| Alerts: | |
mozilla: denial of service
| Package(s): | firefox, nss | CVE #(s): | CVE-2016-2827 |
| Created: | September 26, 2016 | Updated: | September 28, 2016 |
| Description: | From the CVE entry: The mozilla::net::IsValidReferrerPolicy function in Mozilla Firefox before 49.0 allows remote attackers to cause a denial of service (out-of-bounds read and application crash) via a Content Security Policy (CSP) referrer directive with zero values. |
| Alerts: | |
openssl: multiple vulnerabilities
| Package(s): | openssl | CVE #(s): | CVE-2016-2177 CVE-2016-2178 CVE-2016-2179 CVE-2016-2180 CVE-2016-2181 CVE-2016-2182 CVE-2016-2183 CVE-2016-6302 CVE-2016-6303 CVE-2016-6304 CVE-2016-6305 CVE-2016-6306 |
| Created: | September 22, 2016 | Updated: | January 23, 2017 |
| Description: | The September 22 2016 OpenSSL advisory lists a number of problems fixed in the 1.1.0a, 1.0.2i, and 1.0.1u releases. The most serious would appear to be CVE-2016-6305, a "moderate" denial-of-service vulnerability. |
| Alerts: | |
openssl: multiple vulnerabilities
| Package(s): | openssl | CVE #(s): | CVE-2016-6305 CVE-2016-6307 CVE-2016-6308 |
| Created: | September 23, 2016 | Updated: | September 28, 2016 |
| Description: | From the OpenSSL advisory: CVE-2016-6305 - OpenSSL 1.1.0 SSL/TLS will hang during a call to SSL_peek() if the peer sends an empty record. This could be exploited by a malicious peer in a Denial Of Service attack. CVE-2016-6307 - A TLS message includes 3 bytes for its length in the header for the message. This would allow for messages up to 16Mb in length. Messages of this length are excessive and OpenSSL includes a check to ensure that a peer is sending reasonably sized messages in order to avoid too much memory being consumed to service a connection. A flaw in the logic of version 1.1.0 means that memory for the message is allocated too early, prior to the excessive message length check. Due to way memory is allocated in OpenSSL this could mean an attacker could force up to 21Mb to be allocated to service a connection. This could lead to a Denial of Service through memory exhaustion. CVE-2016-6308 - A DTLS message includes 3 bytes for its length in the header for the message. This would allow for messages up to 16Mb in length. Messages of this length are excessive and OpenSSL includes a check to ensure that a peer is sending reasonably sized messages in order to avoid too much memory being consumed to service a connection. A flaw in the logic of version 1.1.0 means that memory for the message is allocated too early, prior to the excessive message length check. Due to way memory is allocated in OpenSSL this could mean an attacker could force up to 21Mb to be allocated to service a connection. This could lead to a Denial of Service through memory exhaustion. |
| Alerts: | |
openssl: denial of service
| Package(s): | openssl | CVE #(s): | CVE-2016-7052 |
| Created: | September 27, 2016 | Updated: | September 28, 2016 |
| Description: | From the CVE entry: crypto/x509/x509_vfy.c in OpenSSL 1.0.2i allows remote attackers to cause a denial of service (NULL pointer dereference and application crash) by triggering a CRL operation. |
| Alerts: | |
openvas-libraries: multiple vulnerabilities
| Package(s): | openvas-libraries | CVE #(s): | |
| Created: | September 23, 2016 | Updated: | September 28, 2016 |
| Description: | From the OpenVAS release notes: A number of memory leaks have been fixed. A bug which caused NASL arrays to be freed improperly causing memory corruption under certain circumstances has been fixed. |
| Alerts: | |
openvas-scanner: denial of service
| Package(s): | openvas-scanner | CVE #(s): | |
| Created: | September 23, 2016 | Updated: | September 28, 2016 |
| Description: | From the OpenVAS release notes: This release addresses a segmentation fault discovered after the release of OpenVAS Scanner 5.0.6 which could result in hanging or failing scans under certain circumstances. |
| Alerts: | |
pidgin: mysterious vulnerabilities
| Package(s): | pidgin | CVE #(s): | CVE-2016-1000030 CVE-2016-2379 |
| Created: | September 22, 2016 | Updated: | September 28, 2016 |
| Description: | Pidgin suffers from a hashed-password disclosure vulnerability (said hash being usable to login via a replay attack) and a problem described only as "X.509 certificates Improperly Imported" (CVE-2016-1000030). |
| Alerts: | |
policycoreutils: sandbox escape
| Package(s): | policycoreutils | CVE #(s): | CVE-2016-7545 |
| Created: | September 26, 2016 | Updated: | November 23, 2016 |
| Description: | From the Debian LTS advisory: It was discovered that there was a sandbox escape via the "TIOCSTI" ioctl in policycoreutils, a set of programs required for the basic operation of an SELinux-based system. |
| Alerts: | |
python-django: cross-site request forgery
| Package(s): | python-django | CVE #(s): | CVE-2016-7401 |
| Created: | September 27, 2016 | Updated: | October 24, 2016 |
| Description: | From the Debian advisory: Sergey Bobrov discovered that cookie parsing in Django and Google Analytics interacted such a way that an attacker could set arbitrary cookies. This allows other malicious web sites to bypass the Cross-Site Request Forgery (CSRF) protections built into Django. |
| Alerts: | |
qemu: multiple vulnerabilities
| Package(s): | qemu | CVE #(s): | CVE-2016-6490 CVE-2016-6833 CVE-2016-6834 CVE-2016-6836 CVE-2016-6888 CVE-2016-7156 CVE-2016-7157 CVE-2016-7422 |
| Created: | September 26, 2016 | Updated: | September 28, 2016 |
| Description: | From the Gentoo advisory: Multiple vulnerabilities have been discovered in QEMU. Local users within a guest QEMU environment can execute arbitrary code within the host or a cause a Denial of Service condition of the QEMU guest process. |
| Alerts: | |
shiro: access control bypass
| Package(s): | shiro | CVE #(s): | CVE-2016-6802 |
| Created: | September 23, 2016 | Updated: | September 28, 2016 |
| Description: | From the CVE entry: Apache Shiro before 1.3.2, when using a non-root servlet context path, specifically crafted requests can be used to by pass some security servlet filters, resulting in unauthorized access. |
| Alerts: | |
wireshark: denial of service
| Package(s): | wireshark-cli | CVE #(s): | CVE-2016-7175 |
| Created: | September 27, 2016 | Updated: | September 28, 2016 |
| Description: | From the CVE entry: epan/dissectors/packet-qnet6.c in the QNX6 QNET dissector in Wireshark 2.x before 2.0.6 mishandles MAC address data, which allows remote attackers to cause a denial of service (out-of-bounds read and application crash) via a crafted packet. |
| Alerts: | |
wordpress: multiple vulnerabilities
| Package(s): | wordpress | CVE #(s): | CVE-2015-8834 CVE-2016-4029 CVE-2016-6634 CVE-2016-6635 |
| Created: | September 23, 2016 | Updated: | October 3, 2016 |
| Description: | From the Debian-LTS advisory: CVE-2015-8834 - Cross-site scripting (XSS) vulnerability in wp-includes/wp-db.php in WordPress before 4.2.2 allows remote attackers to inject arbitrary web script or HTML via a long comment that is improperly stored because of limitations on the MySQL TEXT data type. NOTE: this vulnerability exists because of an incomplete fix for CVE-2015-3440. CVE-2016-4029 - WordPress before 4.5 does not consider octal and hexadecimal IP address formats when determining an intranet address, which allows remote attackers to bypass an intended SSRF protection mechanism via a crafted address. CVE-2016-6634 - Cross-site scripting (XSS) vulnerability in the network settings page in WordPress before 4.5 allows remote attackers to inject arbitrary web script or HTML via unspecified vectors. CVE-2016-6635 - Cross-site request forgery (CSRF) vulnerability in the wp_ajax_wp_compression_test function in wp-admin/includes/ajax-actions.php in WordPress before 4.5 allows remote attackers to hijack the authentication of administrators for requests that change the script compression option. |
| Alerts: | |
Page editor: Jake Edge
Kernel development
Brief items
Kernel release status
The current development kernel is 4.8-rc8, released on September 25. Linus said: "Things actually did start to calm down this week, but I didn't get the feeling that there was no point in doing one final rc, so here we are. I expect the final 4.8 release next weekend, unless something really unexpected comes up."
The September 25 4.8 regression list has 15 entries.
Stable updates: 4.7.5 and 4.4.22 were released on September 24. The 4.7.6 and 4.4.23 updates are in the review process as of this writing; they can be expected on or after September 30.
Kernel development news
A look at the 4.8 development cycle
As of this writing, the 4.8 development cycle is nearing its end. Linus has let it be known that a relatively unusual -rc8 release candidate will be required before the final release, but that still means that the cycle will only require 70 days, fitting into the usual pattern. A look at the development statistics for this release also fits the pattern about now.
With regard to the release cycle, it has become boringly regular in recent years. The 3.8 kernel, released on February 18, 2013, came out on a Sunday, as has every subsequent release with the exception of 3.11, which was released on Monday, September 2, 2013. In these last few years, the only cycle that has taken longer than 70 days was 3.13, which required 77 days. The extra week that time around was forced by Linus's travels, rather than anything inherent in that cycle itself. Since then, every cycle has taken 63 or 70 days, with the sole exception of 3.16, which showed up in 56 (and one could quibble that it was really a 63-day cycle as well; that was the time Linus experimented with opening the merge window before the previous final release had been made).
In this 70-day cycle, we have seen the addition of 13,253 non-merge changesets from 1,578 developers — so far; the numbers will increase slightly before the end. It is thus a busy cycle, though the record for the busiest (3.15, with 13,722 commits) remains unchallenged. Those developers grew the kernel by 350,000 lines this time around. The most active developers in this cycle were:
Most active 4.8 developers
By changesets
| Mauro Carvalho Chehab | 347 | 2.6% |
| Chris Wilson | 266 | 2.0% |
| Arnd Bergmann | 180 | 1.4% |
| Daniel Vetter | 144 | 1.1% |
| Geert Uytterhoeven | 139 | 1.0% |
| Wei Yongjun | 129 | 1.0% |
| Hans Verkuil | 121 | 0.9% |
| Arnaldo Carvalho de Melo | 117 | 0.9% |
| James Hogan | 107 | 0.8% |
| Paul Gortmaker | 100 | 0.8% |
| Trond Myklebust | 98 | 0.7% |
| David Hildenbrand | 92 | 0.7% |
| Christoph Hellwig | 90 | 0.7% |
| Krzysztof Kozlowski | 88 | 0.7% |
| Ville Syrjälä | 86 | 0.6% |
| Daniel Lezcano | 82 | 0.6% |
| Ben Dooks | 80 | 0.6% |
| Linus Walleij | 76 | 0.6% |
| Wolfram Sang | 75 | 0.6% |
| Christian König | 75 | 0.6% |
By changed lines
| Mauro Carvalho Chehab | 110741 | 13.2% |
| Markus Heiser | 77196 | 9.2% |
| Hans Verkuil | 17868 | 2.1% |
| Wolfram Sang | 15211 | 1.8% |
| Moni Shoua | 13039 | 1.6% |
| Christoph Hellwig | 12535 | 1.5% |
| Yuval Mintz | 12467 | 1.5% |
| Jani Nikula | 12397 | 1.5% |
| Chris Wilson | 11003 | 1.3% |
| Darrick J. Wong | 7453 | 0.9% |
| Arnaldo Carvalho de Melo | 7204 | 0.9% |
| Marc Zyngier | 6514 | 0.8% |
| Daniel Vetter | 6499 | 0.8% |
| Megha Dey | 5844 | 0.7% |
| Florian Fainelli | 5697 | 0.7% |
| Krzysztof Kozlowski | 5600 | 0.7% |
| Gavin Shan | 5343 | 0.6% |
| Bryant G. Ly | 5019 | 0.6% |
| Arnd Bergmann | 4914 | 0.6% |
| Adrian Hunter | 4906 | 0.6% |
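The percentages in the "by changesets" column appear to be each developer's share of the 13,253 non-merge changesets counted so far. A quick sketch of that arithmetic follows; the function name is made up for illustration, and the total is taken from the text above.

```python
TOTAL_CHANGESETS = 13253  # non-merge changesets in 4.8 at the time of writing


def share(count: int, total: int = TOTAL_CHANGESETS) -> float:
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    for name, count in [("Mauro Carvalho Chehab", 347),
                        ("Chris Wilson", 266),
                        ("Arnd Bergmann", 180)]:
        print("%-24s %4d %5.1f%%" % (name, count, share(count)))
```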
Mauro Carvalho Chehab, the maintainer for the media subsystem, is traditionally a highly active developer. To understand his position at the top of both columns this time around, one need only look back to the 4.8-rc1 announcement, where Linus said:
Many of those documentation updates, part of the transition in the kernel's formatted documentation subsystem, came from Mauro, who jumped on the task of converting the (considerable) media documentation with gusto. Other developers at the top of the "by changesets" column include Chris Wilson, whose work was focused on the Intel i915 driver; Arnd Bergmann who, when he's not maintaining the arm-soc subsystem, stays busy eliminating warnings from the kernel build; Daniel Vetter, an active DRM developer; and Geert Uytterhoeven, who did a lot of system-on-chip support work.
In the "changed lines" column, Markus Heiser worked on the media document conversion and contributed a fair amount of code to make the new documentation system work. Hans Verkuil did a lot of media driver work (including removing some unused drivers), Wolfram Sang spent time on the ks7010 driver in the staging tree (along with maintaining the I2C subsystem), and Moni Shoua contributed a single patch adding the "RDMA over converged Ethernet" driver to the InfiniBand subsystem.
Normally, work in the staging tree figures prominently in these statistics, but it is almost absent this time around. Indeed, only 386 patches have been applied to the staging tree in the 4.8 cycle, far fewer than the 916 seen in 4.7, or the 1,852 in 4.6. One might be tempted to think that the staging tree is slowing down, but that seems likely to be a temporary state of affairs. Indeed, it appears that the 4.9 development cycle will see over 2,300 staging commits for the addition of the greybus subsystem alone.
Work on the 4.8 kernel was supported by 217 employers that we were able to identify. The most active employers this time around were:
Most active 4.8 employers
By changesets
| Intel | 1960 | 14.8% |
| Red Hat | 1143 | 8.6% |
| (Unknown) | 806 | 6.1% |
| (None) | 746 | 5.6% |
| Linaro | 662 | 5.0% |
| IBM | 654 | 4.9% |
| Samsung | 637 | 4.8% |
| SUSE | 338 | 2.6% |
| | 294 | 2.2% |
| AMD | 281 | 2.1% |
| Oracle | 259 | 2.0% |
| Texas Instruments | 258 | 1.9% |
| Mellanox | 243 | 1.8% |
| Renesas Electronics | 223 | 1.7% |
| Broadcom | 217 | 1.6% |
| ARM | 204 | 1.5% |
| Huawei Technologies | 170 | 1.3% |
| NVidia | 166 | 1.3% |
| NXP Semiconductors | 163 | 1.2% |
| (Consultant) | 157 | 1.2% |
By lines changed
| Samsung | 120693 | 14.4% |
| Intel | 104291 | 12.4% |
| (None) | 102848 | 12.3% |
| Red Hat | 48563 | 5.8% |
| IBM | 42298 | 5.0% |
| Mellanox | 29226 | 3.5% |
| (Unknown) | 27671 | 3.3% |
| Linaro | 22960 | 2.7% |
| Broadcom | 18040 | 2.2% |
| Cisco | 17868 | 2.1% |
| MediaTek | 16292 | 1.9% |
| QLogic | 15986 | 1.9% |
| ARM | 14397 | 1.7% |
| Renesas | 14283 | 1.7% |
| (Consultant) | 14146 | 1.7% |
| Free Electrons | 11227 | 1.3% |
| Oracle | 10982 | 1.3% |
| Texas Instruments | 9789 | 1.2% |
| | 9534 | 1.1% |
| Renesas Electronics | 9482 | 1.1% |
The documentation work has shifted the numbers around here a bit but, for the most part, this table is as boring and unsurprising as usual. Samsung's position at the top of the "lines changed" column is, once again, the result of the formatted documentation transition.
In summary, this would appear to be another relatively normal busy development cycle. The kernel development machine appears to continue to hum along smoothly, with no serious process problems evident at this level, although, as the recent discussion on backporting showed, there are issues elsewhere in the community. Both the 4.8 kernel and the community that produces it appear to be working well.
A low-level hibernation bug hunt
This is a story about how several obscure and nasty hibernation bugs were fixed over the last few months and how hibernation on x86-64 was made to work correctly with kernel address space layout randomization (KASLR) at the same time. It is a success story, but it did not look like that in the beginning. That success would not have been possible without a series of bug reports that happened to appear just in the right order, one after another. Fortunately enough, in each case the bug in question was reliably reproducible on at least one system, which allowed it to be narrowed down to a particular kernel change or a specific piece of code. It also would not have been possible without the persistence and determination of the bug reporters and developers involved.
For me, it started with a problem report from Logan Gunthorpe forwarded to the Linux power-management development list by Ingo Molnar. In that report, Gunthorpe said that hibernation broke for him after a security-related change that had made the kernel set the "no execute" (NX) flag on memory pages in the gap between the kernel code and the read-only data section following it.
My initial idea about why that change might cause hibernation to fail was related to how resume from hibernation worked on x86-64, so let me explain that briefly to begin with.
Hibernation on x86-64
Hibernation is generally regarded as a power-management feature, but it really is a checkpoint/restore mechanism working on the system as a whole. When triggered, it creates a snapshot of all memory pages in use at that time and saves it in persistent storage. Of course, the snapshot of each page has to be saved along with the number of the page frame occupied by it, so that it can be put into the same page frame later on. All of that information combined is referred to as a "hibernation image".
Next, the system is turned off (that can be done in a few different ways which are not relevant here). When turned on again later, it undergoes full initialization, starting with the platform firmware, which invokes the bootloader that, in turn, loads a new kernel (that is what happens in Linux; the resume control flow in other operating systems may be different). That new kernel is then responsible for loading the hibernation image created earlier back into memory and for restoring its previous state, so it will be referred to as the "restore kernel" in what follows. In turn, the kernel that created the hibernation image and, therefore, is included in it will be referred to as the "image kernel".
Of course, the restore kernel is always different from the image kernel, but it may come from the same kernel binary, in which case the kernel code is the same in both of them. That is not a requirement on x86-64, though. Moreover, even if the kernel code (often referred to as the "kernel text") is the same, the layout of code and data in memory created by the restore kernel may be different from what the image kernel had used. For instance, if kernel address space layout randomization is in use, the physical location of the kernel code in the restore and image kernels usually will be different. Moreover, in Linux 4.8-rc1 (and later) KASLR will cause the virtual base address of the kernel identity mapping (the one that maps the entire physical address space of the system into the kernel's virtual address space) to be different in each of them as a rule.
When the restore kernel runs, it will first initialize itself and the hardware; then it will look for a hibernation image header. If it finds one, it reads image description data from there and, if all looks good, it will start to load the image.
The goal here is to put each memory page included in the image into the page frame it occupied before hibernation and pass control to the image kernel, which can take over from that point on (as the memory will then look the same as before hibernation to it). That is not as straightforward as it sounds, however, because at least some of the page frames in question will be occupied by the restore kernel itself or its data. To overcome that difficulty, the restore kernel takes several steps that each get it closer to its goal.
First of all, it allocates enough memory to hold all of the data pages and metadata (basically consisting of the page frame numbers to put those data pages into eventually) from the image. It uses two bitmaps to track the memory allocated in this step, to keep a record of (1) which page frames have been allocated and (2) which of them were in use before hibernation. The allocated ones that were not used before hibernation (i.e. their numbers are not included in the image metadata) are referred to as "safe", because they won't be overwritten with data coming from the image going forward.
Second, all of the image data pages are loaded into the allocated memory. The trick here is to store as many data pages from the image as possible in the page frames they occupied before hibernation; the bitmaps mentioned above are used for that. Namely, before loading a data page from the image, the page frame it occupied before hibernation is looked up in the bitmaps and, if it is present there (i.e. it was allocated in the previous step), the data page is loaded into it directly without the need to remember where it has been stored. If the page frame occupied by that data page before hibernation was not allocated in the previous step, the data page has to be stored in a safe page frame whose number has to be recorded along with the "target" location of the data page stored in it.
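The first two steps can be modeled in a few lines of Python. This is only a toy sketch, not the kernel's implementation: Python sets stand in for the two bitmaps, and the names load_image, allocated, and relocations are invented for the illustration.

```python
def load_image(image, allocated):
    """Toy model of loading a hibernation image.

    image:     dict mapping original page frame number -> page contents
    allocated: set of page frame numbers allocated by the restore kernel
    Returns (memory, relocations), where relocations lists
    (safe_pfn, target_pfn) pairs for pages that could not be loaded in place.
    """
    used_before = set(image)                # frames in use before hibernation
    safe = sorted(allocated - used_before)  # never overwritten by image data
    memory = {}
    relocations = []
    for target_pfn, data in image.items():
        if target_pfn in allocated:
            # The original frame was allocated: load the page directly into it.
            memory[target_pfn] = data
        else:
            # Park the page in a safe frame and remember where it belongs.
            if not safe:
                raise MemoryError("not enough safe page frames")
            safe_pfn = safe.pop()
            memory[safe_pfn] = data
            relocations.append((safe_pfn, target_pfn))
    return memory, relocations
```

With image frames 1, 2, and 3 and allocated frames 2, 3, 7, and 8, pages 2 and 3 are loaded in place, while page 1 waits in a safe frame until the final copy.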
The next step is to quiesce all devices and all CPUs except for one and, having done that, the restore kernel prepares to copy all of the image data pages stored in "safe" page frames previously to their "target" locations. That has to be done in an architecture-specific way and it has to take into account the fact that the restore kernel itself and its data will be overwritten in the process, so the following step will not be reversible.
On x86-64, the restore kernel creates temporary page tables consisting of safe pages only, so that they will not get overwritten with image data. These page tables only need to cover two mappings: the identity mapping necessary for the image data pages copying operation itself and the kernel text mapping allowing the restore kernel to pass control back to the image kernel. This transfer of control is done by jumping to an address representing the image kernel's entry point (that can be read from the image header). In addition, the code that will copy the image data pages and perform the final jump to the image kernel's entry point has to be relocated to a safe page in order to prevent it from overwriting itself inadvertently; the page it has been relocated to must be marked as executable. With all that in place, the restore kernel only needs to jump to the relocated code that will switch over to the temporary page tables, copy the image data pages still held in "safe" page frames to their "target" locations, and jump to the image kernel's entry point.
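The final, irreversible step can be sketched in a toy Python model. The function and parameter names are hypothetical. The point is only that everything the copy loop depends on (its own code page and the temporary page tables) must sit in safe frames, because every frame that was in use before hibernation is about to be overwritten.

```python
def final_copy(memory, relocations, image_pfns, code_pfn, page_table_pfns):
    """Copy parked pages to their target frames (toy model).

    memory:          dict pfn -> contents, as it looks after loading
    relocations:     list of (safe_pfn, target_pfn) pairs
    image_pfns:      set of all frames that were in use before hibernation
    code_pfn:        frame holding the relocated copy code
    page_table_pfns: set of frames holding the temporary page tables
    """
    needed = {code_pfn} | set(page_table_pfns)
    clobbered = needed & set(image_pfns)
    if clobbered:
        # The copy loop or its page tables would be overwritten mid-copy.
        raise ValueError("frames %s would be overwritten" % sorted(clobbered))
    for safe_pfn, target_pfn in relocations:
        memory[target_pfn] = memory.pop(safe_pfn)
    return memory
```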
Where things went wrong
That should sound reasonable enough, but it is what the restore kernel does today. At the time of Gunthorpe's bug report, however, the code in question was somewhat less straightforward.
Namely, it also created temporary page tables but, while the identity mapping covered by those tables was set up from scratch, the restore kernel's own text mapping was reused by hooking it up directly into the topmost page directory of the new page tables. That allowed the restore kernel to switch over to the temporary page tables before jumping to the relocated code, but it also imposed serious limitations on the final jump to the image kernel's entry point such that it would only work in quite specific conditions. As it turned out, those conditions were not guaranteed to be met in general; that was the source of the problem seen by Gunthorpe.
My first idea about what might have gone wrong was that, perhaps, the security change identified by Gunthorpe as the one that introduced the problem caused the page containing the image kernel's entry point to become non-executable in the restore kernel's text mapping. With that in mind I prepared a patch that would mark that page as executable at the right time and asked Gunthorpe to test it, but it did not make any difference.
That caused me to look at the addresses involved more closely; I quickly realized that reusing the restore kernel's text mapping in the temporary page tables was a mistake, because that mapping might very well be corrupted in the process of copying image data pages to their target locations. If that happened, the final jump to the image kernel's entry point would go to nowhere, triggering a page fault that couldn't be handled at that point. Clearly, the temporary page tables needed a kernel text mapping set up from scratch consisting of only safe pages, just like the identity mapping. I noticed, though, that it didn't have to cover the entire kernel text. In fact, it didn't have to cover the kernel text at all. It only had to cover the image kernel's entry point itself.
That was the case because the code performing the final jump to the image kernel's entry point would be relocated and it would be running from a page covered by the identity mapping, so it didn't need the kernel text mapping to run. Moreover, the virtual address of the image kernel's entry point passed in the image header had to be mapped to the physical address of its location in memory, but that might not match the restore kernel's text mapping. Hence, the kernel text mapping used for the final jump to the image kernel's entry point had to be based on the information provided by the image kernel. For that reason, I changed the image header format to include the physical address of the image kernel's entry point too.
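To make the idea concrete, here is a toy sketch of a kernel-text mapping that covers nothing but the entry point. It assumes, for illustration only, that a single 2MB large page is used for the mapping; the function names and addresses are made up, and the two header fields are represented by plain integers.

```python
LARGE_PAGE = 2 * 1024 * 1024   # assumed size of the single mapped page
MASK = ~(LARGE_PAGE - 1)


def minimal_text_mapping(entry_virt, entry_phys):
    """Return (virt_base, phys_base) of one large page covering the entry point."""
    if (entry_virt & ~MASK) != (entry_phys & ~MASK):
        raise ValueError("entry point offsets within the page differ")
    return entry_virt & MASK, entry_phys & MASK


def translate(mapping, virt):
    """Translate a virtual address through the one-page mapping."""
    virt_base, phys_base = mapping
    if not virt_base <= virt < virt_base + LARGE_PAGE:
        raise LookupError("address not covered by the minimal mapping")
    return phys_base + (virt - virt_base)
```

Because both addresses come from the image kernel, the translation does not depend on where the restore kernel happens to have placed its own text.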
It didn't take me too much time to come up with a patch implementing that idea. With that patch, however, the restore kernel would still switch over to the temporary page tables before jumping to the relocated code, so its text mapping still had to be reused to start with. It would be replaced with a new minimum kernel text mapping that covered the image kernel's entry point just prior to the final jump to it.
The plot thickens
That patch fixed the resume problem for Gunthorpe, but it wasn't perfect. Namely, Borislav Petkov reported that it introduced a strange memory corruption during resume from hibernation for him. That new problem occurred on every resume from hibernation on his system and manifested itself as a corruption of the context of a user-space process that attempted to run after the image kernel had brought all CPUs back online and had completed the resume of I/O devices.
That was really unusual, so we spent quite a lot of time on trying to understand why and how it might happen. Linus Torvalds suspected that the problem might be related to the way the patch played with the kernel-text mapping and he clearly didn't like that part of it anyway, so I decided to change the code flow to first jump to the relocated code and then switch over to the temporary page tables from there. That still allowed the kernel-text mapping in the temporary page tables to be minimal, but it avoided the need to replace one version of the kernel-text mapping with another one on the fly which, admittedly, had been an ugly hack.
I posted a patch created along these lines and, again, it worked for Gunthorpe, but it still triggered memory corruption during resume from hibernation for Petkov, so we went into a long debug session trying to figure out what was going on. Theories taken into consideration included platform firmware involvement, a hardware issue, or a bitmap implementation error in the hibernation core, but there were substantial weaknesses in every one of them.
Eventually, we were able to narrow the breakage down to a single line of code in a new function added by my patch, but it was completely unclear why that particular line of code would lead to the observed symptoms. Since that line of code looked like it might be using a local variable on the stack, I decided to check whether changing the new function to use fewer local variables would make any difference (the theory was that the stack might have been corrupted somehow, although how exactly that could have happened was still a mystery). Surprisingly enough, that change appeared to fix the problem for Petkov (in fact, it only hid the problem, but that was found to be the case quite a bit later). It did that so effectively that the memory corruption went away and could not be reproduced on Petkov's machine any more.
In the meantime, Yu Chen analyzed Gunthorpe's original report in detail and explained why the security-related kernel commit identified as the one that introduced the problem could actually make a difference. According to Chen, the setting of the NX flag on the gap between the kernel text and the read-only data was not as straightforward as it looked because it might cause kernel page tables to be split. Specifically, if the end of the kernel text fell into a large (2M) page, that page had to be split into normal (4K) pages for the NX bit to be set on the gap only. That required more page-table memory to be allocated dynamically; that allocation happened within the kernel-text mapping that would be overwritten by image data during resume from hibernation, so reusing it in the restore kernel's temporary page tables would lead to an unrecoverable error.
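The arithmetic behind that split is simple enough to sketch. The example below assumes that the gap runs from the end of the kernel text to the end of the large page containing it; that assumption, the function name, and any sample addresses are illustrative only.

```python
SMALL_PAGE = 4 * 1024
LARGE_PAGE = 2 * 1024 * 1024


def split_for_nx(text_end):
    """Describe how the large page holding text_end must be split.

    text_end is the first address past the kernel text, aligned to 4K.
    Returns None if no split is needed, otherwise a tuple
    (large_page_base, executable_4k_pages, nx_4k_pages).
    """
    if text_end % SMALL_PAGE:
        raise ValueError("text_end must be 4K-aligned")
    offset = text_end % LARGE_PAGE
    if offset == 0:
        return None                       # text ends on a large-page boundary
    executable = offset // SMALL_PAGE
    total = LARGE_PAGE // SMALL_PAGE      # 512 entries: one extra page table
    return text_end - offset, executable, total - executable
```

Every split of this kind costs one additional page of page-table memory, which is the dynamic allocation that Chen identified.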
In addition to that, Kees Cook reported that the fix for the issue reported by Gunthorpe also made hibernation work with KASLR on x86-64. At that time, KASLR worked on the kernel's text mapping only and randomized its physical base. As a result, the physical address of the base of the kernel text mapping used by the restore kernel would be different from what the image kernel had used most of the time. That prevented the restore kernel from mapping the virtual address of the image kernel's entry point (passed in the image header) to the correct physical address and resume from hibernation didn't work. That changed with the introduction of the minimal kernel-text mapping used for the final jump to the image kernel's entry point in my patch, because it mapped virtual addresses to physical addresses in the same way as the image kernel did.
In the face of this, and because the memory corruption seen by Petkov was apparently not reproducible with the last version of the resume fix (and I was quite confident that it could not be introduced by that fix itself anyway), I decided to go ahead with the fix and it finally landed in Linux 4.7 as kernel commit 65c0554b73c9. While the immediate problem was fixed, it was quite possible that the previous versions of the resume fix simply uncovered some obscure latent bug, so I made a few changes in the hibernation core to make it easier to debug in case the memory corruption problem or anything similar to it showed up again in the future. When I did that, though, I wasn't expecting the memory corruption issue to reappear a few days later in a report pointing to the kernel commit that was the true source of it. But, first, another problem had to be solved.
MWAIT vs. HLT
Meanwhile, my attention had been caught by another serious bug related to resume from hibernation on x86-64, but limited to Intel CPUs. At that point it had already been investigated for several weeks by Chen who had posted a couple of RFC patches to address it, but the reviewers looking at them pointed out some valid concerns to him.
That issue was related to the use of the MONITOR and MWAIT instructions of the CPU in the code that takes CPUs offline, in particular during resume from hibernation. CPU offlining is a complicated matter that involves migrating tasks and interrupts from the CPU going offline to ensure that it won't have anything to do from that point on. The last stage of the process is to make the CPU appear as though it is not functional from a software perspective. That is achieved by making it execute a "wait for something to happen" instruction in a tight endless loop with locally disabled interrupts.
There are two flavors of such "wait for something to happen" instructions in the Intel processors' instruction set. The first one is the old-school HLT instruction that causes the CPU to go into a relatively shallow low-power state and wait for an interrupt; if interrupts are locally disabled on the CPU, it will become almost completely unresponsive after executing that instruction (the only interrupts that can "revive" the CPU then are the non-maskable ones, but those are only used in very special situations). The second type of a "wait for something to happen" instruction is MWAIT, which goes together with MONITOR.
First, MONITOR takes an address identifying a range of memory that corresponds to a single line in the CPU's cache. Next, the MWAIT following it causes the CPU to enter a low-power state (and that state may be much deeper than the HLT-induced one) and wait for an event like an interrupt or a write to one of the MONITORed memory locations from another CPU in the system. Thus, from an energy consumption perspective, the MONITOR/MWAIT combination is much better than HLT, but that really wasn't important in the resume from hibernation case since CPUs stay offline for a very short time then. The important fact was that, during resume from hibernation, the memory locations MONITORed by the offline CPUs were almost guaranteed to be written to by the only online CPU that carries out the final resume stages described earlier.
Recall that, during those stages, the image data pages still held in safe page frames are copied into their target locations, which generally overlap with memory occupied by the restore kernel itself and by its data. In particular, with CPUs offline using MONITOR/MWAIT, they might (and usually did) overlap with the memory MONITORed by those offline CPUs. That was a recipe for disaster; because the page tables used by those CPUs might have been overwritten too at that point, an attempt to fetch the next instruction by any of them would lead to a page fault that could not be handled, so the kernel would panic and crash. Worse yet, the code those CPUs would be executing if woken up from the MWAIT-induced state inadvertently might have been overwritten at that point too.
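A toy model shows why the wakeups were nearly unavoidable. Copying a page writes every cache line in it, so any MONITORed address inside a target page frame will be hit. The names below are hypothetical and a 4K page size is assumed.

```python
PAGE_SIZE = 4096


def cpus_woken_by_copy(monitored, target_pfns):
    """Return the offline CPUs whose MONITORed address gets written.

    monitored:   dict mapping CPU number -> MONITORed physical address
    target_pfns: set of page frames that the final copy will overwrite
    """
    return sorted(cpu for cpu, addr in monitored.items()
                  if addr // PAGE_SIZE in target_pfns)
```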
The problem was figured out and a rough consensus about how to fix it had formed during the review of Chen's patches: everyone involved seemed to agree that, during resume from hibernation, the CPU offline code should use the HLT instruction instead of MONITOR/MWAIT. The question was how to implement that idea in the cleanest way possible.
Chen had already posted a couple of patches going in that direction when I started to look at the details of the code in question, but none of those approaches had been particularly attractive. My first attempts at fixing this issue were not any better, until I realized that the function to execute at the last stage of CPU offline was a callback pointed to by the play_dead field in the smp_ops structure, so replacing that callback temporarily with a special one using HLT during resume from hibernation would do the trick. The change needed for that was relatively isolated and, most importantly, it didn't add any overhead to the CPU offline code, so it was approved by Molnar and the final patch making the change shipped in Linux 4.8-rc1 as kernel commit 406f992e4a37.
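The callback swap can be illustrated with a short sketch. It is not the kernel's code: SmpOps mimics only the play_dead field of the smp_ops structure, and the two callbacks merely return a string naming the instruction they would use.

```python
from contextlib import contextmanager


class SmpOps:
    """Stand-in for the one field of smp_ops that matters here."""

    def __init__(self, play_dead):
        self.play_dead = play_dead


def mwait_play_dead():
    return "mwait"      # normal CPU offline path


def hlt_play_dead():
    return "hlt"        # special path for resume from hibernation


@contextmanager
def resume_offlining(ops):
    """Temporarily replace the play_dead callback, then restore it."""
    saved = ops.play_dead
    ops.play_dead = hlt_play_dead
    try:
        yield ops
    finally:
        ops.play_dead = saved
```

The ordinary offline path is left untouched, which is why a change of this kind adds no overhead outside of resume from hibernation.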
The mystery bug returns
At that point, I was thinking that the worst problems related to resume from hibernation on x86-64 were fixed, but I had forgotten about the mystery memory corruption issue previously reported by Petkov. To my surprise, just then it was reported again by Andre Reinke. For Reinke, however, it was a regression introduced in Linux 4.6 and he was able to identify kernel commit ef0f3ed5a4ac as the source of it.
In retrospect, it was quite obvious that resume from hibernation would be broken by that commit, because it added a FRAME_BEGIN macro to the assembly code that would run as the first thing after the restore kernel had jumped to the image kernel's entry point. Among other things, that macro generated a PUSH instruction that would be executed before writing the address of the original image kernel's page tables into the CR3 register of the CPU. Thus the CPU would still be using the temporary page tables created by the restore kernel when executing it and the value of its stack pointer would contain the address of a memory area that might contain image data now. In that case, the PUSH instruction would corrupt those image data pages by overwriting them with a stale value read from another CPU register.
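The effect of that stray PUSH is easy to model. In the sketch below, memory is a dictionary keyed by address and the stack pointer is a stale value inherited from the restore kernel; all addresses and values are invented.

```python
def push(memory, sp, value):
    """Model an x86-64 PUSH: decrement the stack pointer by 8, then store."""
    sp -= 8
    memory[sp] = value
    return sp
```

If the slot just below the stale stack pointer holds a freshly restored image page, the store silently replaces part of it, and nothing fails until that data is used much later.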
Ironically enough, the FRAME_BEGIN macro was there all the time when the memory corruption reported by Petkov was being investigated and nobody saw the problem with it then. It looks like everyone, myself included, was mentally blinded by the fact that it was a macro and no one could see the real sequence of CPU instructions it was resolving to. Had the PUSH instruction been located directly in that code, the issue probably would have been resolved earlier without a need for a pointer to the kernel commit that introduced it. That pointer did help a lot, though, because it made everyone look at the right places in the code and the bug was readily fixed by Josh Poimboeuf. His fix went into Linux 4.8-rc1 as kernel commit 4ce827b4cc58.
That would have ended the x86-64 hibernation saga, had KASLR not been extended during the Linux 4.8-rc1 merge window. That did happen, however, and it affected Petkov again, breaking resume from hibernation for him on another machine. He noticed that unsetting the new CONFIG_RANDOMIZE_MEMORY kernel configuration option (set by default) made hibernation work again on that system, so the investigation of the problem focused on the interactions between hibernation and the new KASLR-related changes.
After those changes, KASLR on x86-64 randomizes not only the (physical) base address of the kernel text mapping, but also the (virtual) base address of the kernel identity mapping, among other things. That obviously might not play well with resume from hibernation which, in principle, might not be prepared to deal with differences in kernel identity mapping base address between the restore and image kernels. Indeed, that turned out to be the case; two problems in that area were quickly found by KASLR developer Thomas Garnier, who posted prototype patches to fix them.
First, the assembly code carrying out the switch over to temporary page tables during resume from hibernation contained a direct reference to the __PAGE_OFFSET symbol, used with the assumption that it would always resolve to a number. However, with CONFIG_RANDOMIZE_MEMORY set that symbol resolves to a variable name and the code generated in that case was invalid. Clearly, it was necessary to avoid using __PAGE_OFFSET this way, but Garnier's prototype patch did that with the help of preprocessor directives, which wasn't particularly clean. There was a better way: pass the physical rather than the virtual address of the page tables to the assembly code. That physical address might be computed by the code written in C and passed to the assembly in the same variable that previously had been used to pass the virtual address of the temporary page tables. With that, the problematic reference to __PAGE_OFFSET from assembly would simply go away, so I posted a patch making that change which landed in Linux 4.8-rc1 as kernel commit c226fab47429.
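The conversion itself is a single subtraction. The sketch assumes the usual rule for this mapping (physical address equals virtual address minus the mapping's base); the function name and any base values used with it are illustrative. Once the base is chosen at boot, only code that can read it as a variable can do the conversion, which is why computing the result in C and handing it to the assembly is the cleaner option.

```python
def temp_pgt_phys(temp_pgt_virt, page_offset_base):
    """Convert the temporary page tables' virtual address to a physical one.

    page_offset_base is the run-time base of the identity mapping; with
    randomization it is a variable, not a build-time constant.
    """
    phys = temp_pgt_virt - page_offset_base
    if phys < 0:
        raise ValueError("address is below the identity mapping base")
    return phys
```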
Second, the kernel_ident_mapping_init() function called by the low-level code that creates temporary page tables during resume from hibernation made an assumption regarding the alignment of the base address of the kernel identity mapping that generally wasn't satisfied with CONFIG_RANDOMIZE_MEMORY set. That was easy enough to fix, but Garnier's prototype patch overlooked a corner case that was pointed out by Yinghai Lu, who posted his own version of that fix. Lu's patch worked, but it increased the complexity of the code in question which wasn't strictly necessary, so I prepared and posted yet another version of it that was approved by everyone involved and went into Linux 4.8-rc2 as kernel commit e4630fdd4763.
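The following sketch illustrates how an alignment assumption of that kind can go wrong. It is a hypothetical illustration rather than the code of kernel_ident_mapping_init(): it assumes four-level page tables whose top-level entries each cover 512GB, and it compares two ways of picking the top-level slot for a physical address mapped at a given base.

```python
PGDIR_SHIFT = 39        # each top-level entry covers 512GB (assumed)
PTRS_PER_PGD = 512


def pgd_index(addr):
    """Top-level page-table index of an address."""
    return (addr >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)


def slot_assuming_alignment(phys, base):
    # Correct only if base is a multiple of 512GB.
    return (pgd_index(phys) + pgd_index(base)) % PTRS_PER_PGD


def slot_general(phys, base):
    # Add the base first, so carries into the index bits are kept.
    return pgd_index(phys + base)
```

For a base that is a multiple of 512GB the two functions agree. For a base aligned only to 1GB they can differ by one slot, which puts the mapping in the wrong place.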
Still, those two fixes turned out to be insufficient to make the issue reported by Petkov go away. Moreover, the same issue was reported by Jiri Kosina in the meantime (the symptom seemed to be a triple fault during resume meaning, probably, an unhandled page fault). It was puzzling because it was reproducible on the affected systems 100% of the time, while other, similar, systems hibernated and resumed without any problems at all.
Fortunately, I had a test system that was similar to Petkov's failing one, so I was able to use his configuration file to generate a kernel for it. That allowed me to reproduce the problem locally and to verify that it was triggered by setting the CONFIG_DEBUG_LOCK_ALLOC configuration option. It still was not particularly clear why and how that option might lead to the observed failure, but Garnier was also able to reproduce it, and he found the reason why it appeared. That turned out to be a bug in the hibernation core introduced during the Linux 3.16 development cycle that caused a tracing function to be called before the processor state had been restored completely. As a result, a stale value of the GS register was used by that tracing function; that led to the observed triple fault, which Garnier was able to fix by simply changing the ordering of the code in question. That fix went into Linux 4.8-rc2 as kernel commit 62822e2ec4ad.
Working, at last
That finally made hibernation work for Petkov and Kosina again, even with both CONFIG_RANDOMIZE_MEMORY and CONFIG_DEBUG_LOCK_ALLOC set; only one thing remained unknown: why would CONFIG_DEBUG_LOCK_ALLOC make a difference before? That was explained by Kosina, who looked at the assembly output generated by the compiler for the affected code both with and without CONFIG_DEBUG_LOCK_ALLOC set and found that it was different in those two cases. Next, he was able to track the difference down to the definition of the __DECLARE_TRACE() macro, which generated additional code with CONFIG_DEBUG_LOCK_ALLOC set; that additional code used GS-relative addressing, which would lead to the observed failure if the GS value was stale.
In the end, in Linux 4.8-rc3 (and later) resume from hibernation on x86-64 works at last and it works with KASLR enabled. It took a couple of months to get to this point due to the nature of the bugs that needed to be fixed and due to the complexity of the affected code. As said in the beginning, that wouldn't have been possible without all of the developers and bug reporters involved and in particular I'd like to thank the following contributors for their input that shaped the final code changes: Logan Gunthorpe, Ingo Molnar, Borislav Petkov, Linus Torvalds, Chen Yu, Kees Cook, Andre Reinke, Josh Poimboeuf, Thomas Garnier, Yinghai Lu, and Jiri Kosina.
Page editor: Jonathan Corbet
Distributions
ARC++
At the 2016 X.Org Developers Conference (XDC) in Helsinki, David Reveman gave a talk about the ARC++ project, which allows Android apps to run unchanged on Chrome OS. In order to make that work, there was some significant impedance matching that needed to be done. Reveman described how it all worked in a session on the first day of the conference.
The name "ARC++" comes from a previous project, called the "App Runtime for Chrome" or ARC, that was launched in 2014. It was a plugin for the Chrome browser, but required developers to change their apps to run in that environment. The plugin had to emulate multiple Android layers, which had performance implications. In the end, ARC "never really took off", he said.
So a new project was started, ARC++, with the goal of allowing access to all of the Play Store apps without requiring changes to those apps and with minimal changes to the underlying Android framework. The goals also included keeping Chrome OS secure, while maintaining the Chrome OS update model. In ARC++, Android apps are isolated from Chrome OS as much as possible, by using Linux containers. The apps run in the containers, while Chrome OS runs as it normally does.
Reveman then went into an overview of the graphics stack for ARC++. It is a complicated stack, as can be seen in the diagram in his slides [PDF]. A YouTube video of the talk is also available for those interested in further details.
Android apps typically use the hardware-accelerated Canvas API that has been available since Android 4.0. Some other apps, especially games, use OpenGL ES (GLES) directly, though they may use the new Vulkan 3D graphics API in the future.
Everything in Android is rendered to a Surface; those Surfaces are produced by apps and placed into a queue that is consumed by SurfaceFlinger. The gralloc hardware abstraction layer (HAL) is used to allocate the buffers that underlie Surfaces, both in Android and ARC++. For ARC++, gralloc and the GLES driver use the Direct Rendering Manager (DRM) subsystem in the kernel for rendering. That allows apps to use fully accelerated GLES or to use other rendering APIs (e.g. Canvas) as needed. Some day, the apps may use Vulkan, but ARC++ doesn't care so long as the target is a gralloc buffer, he said.
For compositing in Android, Surfaces are sent to SurfaceFlinger, which uses GLES to do the compositing. For ARC++, though, the Hardware Composer HAL (HWComposer) handles all of the Surfaces. They are forwarded to Chrome OS for compositing along with the rest of the Chrome OS user interface.
For window management, ARC++ takes advantage of some of the recent multi-window work that has been done for Android. Certain operations are handled by Android and the others are managed by Chrome OS. The absolute positioning and resizing of windows are done by Android, while maximize, minimize, and full-screen operations are managed by Chrome OS. In addition, app switching, multiple profiles, screen magnifiers, and the like are handled by Chrome OS.
DRM and kernel modesetting (KMS) are used on Chrome OS. DRM is also used by Android and that is what allows efficiently integrating Android and Chrome OS, Reveman said. For both, DRM is used for rendering and buffer allocation; that is what allows easily sharing graphics buffers between the two. Chrome OS is a DRM master, so it can program the display controller, while Android does not need the modesetting capabilities, as it just needs access to the GPU via a render node.
Low-level input and graphics on Chrome OS are handled by the Ozone abstraction layer that targets everything from embedded system-on-chip (SoC) graphics to X11 and its alternatives (e.g. Wayland). It uses a GpuMemoryBuffer object to hold DRM buffers that have been allocated using gralloc on the Android side or DRM itself on the Chrome OS side. That abstraction allows platform-independent code, such as the Chrome browser compositor, to take advantage of low-level graphics buffers.
The pixel formats (i.e. the color schemes and sizes used for in-memory pixel representation) in Chrome OS are limited and more had to be added for Android apps. In order for a DRM buffer to be imported into Chrome OS from Android, the pixel format of the buffer has to be supported. Some formats were only supported by falling back to converting them in software, though that is rare now, he said.
Exosphere is a Chrome OS component that allows other clients to connect to the user interface. It protects Chrome OS from potentially malicious clients, such as Android apps, by doing validation of the operations requested. It is built on top of the GpuMemoryBuffer framework.
Applications on Chrome OS run within the Chrome browser. Its compositor has a multi-process architecture, with one browser process that starts a renderer process for each tab. Those renderer processes produce frames that get sent back to the browser process. One difference between Chrome and Android is in synchronization. It is relatively simple for Chrome, where there is just one process that talks to the DRM driver from a single thread. In ARC++, though, multiple threads in the Android container make things more complicated. Right now, there is something of a "fence dance" that is done to ensure Android does not reuse a buffer before the GPU has finished with it; in the future, it is expected that explicit synchronization will allow that dance to be removed.
For the graphics, window management, and input communication between Android and Chrome OS, the Wayland protocol was chosen. There are a number of benefits to that approach, including Wayland's limited API that allows easier validation from a security point of view, Reveman said. Most of the interfaces needed were already present, but ARC++ did add a few. Another advantage is that Wayland is well-tested and has a set of existing clients that could be used to test and validate the ARC++ implementation.
The project is currently going through the process of deciding which of the new interfaces should go upstream and which should be discarded in favor of upstream. Some existing Wayland interfaces did not do quite what was needed for ARC++ and the developers did not have time to work with upstream at the time. There is also interest in adding a few more interfaces, for things like explicit synchronization for releasing buffers and presentation timing, as well as protected buffers for digital rights management.
The code for ARC++ can be found in the Chromium source tree. For example, Exosphere can be found here and the Wayland extensions can be found in this repository.
Reveman gave a few demonstrations of ARC++, including the Play Store app running on Chrome OS and multiple YouTube apps running while switching between them. There is gamepad support as well. On a system in developer mode that is running an environment for regular Linux applications on a Chrome OS device, such as crouton, normal Wayland applications can communicate with the compositor and run in Chrome OS.
In answer to some questions from the audience, Reveman said that there were really no problems for regular Wayland applications due to the extensions made to the protocol. Chrome OS supports regular Wayland just fine; applications could even take advantage of the ARC++ extensions, though he doesn't recommend doing that. So far, there is a single Wayland application on Chrome OS—Android—and there are no plans to change that right now, but that could perhaps change down the road.
[I would like to thank the X.Org Foundation for sponsoring my travel to Helsinki for XDC.]
Brief items
Distribution quotes of the week
Here is a list of repositories that other people have found will help you meet certain needs. Fedora makes no guarantee that it won't eat your system, but we also don't make any such guarantee about anything we ship. We try our best but someday the Grue is going to eat you no matter what. Thanks for playing.
Debian Project mourns the loss of Kristoffer H. Rose
Ana Guerrero Lopez sadly reports that Kristoffer H. Rose died on September 17. "Kristoffer was a Debian contributor from the very early days of the project, and the upstream author of several packages that are still in the Debian archive nowadays, such as the LaTeX package Xy-pic and FlexML. On his return to the project after several years' absence, many of us had the pleasure of meeting Kristoffer during DebConf15 in Heidelberg. The Debian Project honours his good work and strong dedication to Debian and Free Software. Kristoffer's broad technical knowledge and his ability to share that knowledge with others will be missed. The contributions of Kristoffer will not be forgotten, and the high standards of his work will continue to serve as an inspiration to others."
Ubuntu 16.10 (Yakkety Yak) Final Beta released
The Ubuntu team has announced the final beta release of Ubuntu 16.10 Desktop, Server, and Cloud products. Beta images are also available for Kubuntu, Lubuntu, Ubuntu GNOME, Ubuntu Kylin, Ubuntu MATE, and Ubuntu Studio. The final release is expected on October 13.
Distribution News
Ubuntu family
Ubuntu Online Summit
The next Ubuntu Online Summit will take place November 15-16. "At the event we are going to celebrate the 16.10 release and all the great things which are new and get to talk about what's coming up in Ubuntu 17.04."
Newsletters and articles of interest
Distribution newsletters
- DistroWatch Weekly, Issue 680 (September 26)
- Lunar Linux weekly news (September 23)
- openSUSE news (September 22)
- openSUSE news (September 23)
- openSUSE Tumbleweed – Review of the Week (September 23)
- Ubuntu Weekly Newsletter, Issue 482 (September 25)
Firefox OS, B2G OS, and Gecko
Ari Jaaksi and David Bryant posted a note to the B2G (Boot to Gecko) OS community looking at the end of Firefox OS development and at what happens to the code base going forward. "In the spring and summer of 2016 the Connected Devices team dug deeper into opportunities for Firefox OS. They concluded that Firefox OS TV was a project to be run by our commercial partner and not a project to be led by Mozilla. Further, Firefox OS was determined to not be sufficiently useful for ongoing Connected Devices work to justify the effort to maintain it. This meant that development of the Firefox OS stack was no longer a part of Connected Devices, or Mozilla at all. Firefox OS 2.6 would be the last release from Mozilla. Today we are announcing the next phase in that evolution. While work at Mozilla on Firefox OS has ceased, we very much need to continue to evolve the underlying code that comprises Gecko, our web platform engine, as part of the ongoing development of Firefox. In order to evolve quickly and enable substantial new architectural changes in Gecko, Mozilla’s Platform Engineering organization needs to remove all B2G-related code from mozilla-central. This certainly has consequences for B2G OS. For the community to continue working on B2G OS they will have to maintain a code base that includes a full version of Gecko, so will need to fork Gecko and proceed with development on their own, separate branch." (Thanks to Paul Wise)
Page editor: Rebecca Sobol
Development
Systemd programming, 30 months later
Some time ago, we published a pair of articles about systemd programming that extolled the value of providing high-quality unit files in upstream packages. The hope was that all distributions would use them and that problems could be fixed centrally rather than each distribution fixing its own problems independently. Now, 30 months later, it seems like a good time to see how well that worked out for nfs-utils, the focus of much of that discussion. Did distributors benefit from upstream unit files, and what sort of problems were encountered?
Systemd unit files for nfs-utils first appeared in nfs-utils-1.3.0, released in March 2014. Since then, there have been 26 commits that touched files in the systemd subdirectory; some of those commits are less interesting than others. Two, for example, make changes to the set of unit files that are installed when you run "make install". If distributors maintained their unit files separately (like they used to maintain init scripts separately), this wouldn't have been an issue at all, so these cannot be seen as a particular win for upstreaming.
Most of the changes of interest are refinements to the ordering and dependencies between various services, which is hardly surprising given that dependencies and ordering are a big part of what systemd provides. With init scripts we didn't need to think about ordering very much, as those scripts ran the commands in the proper order. Systemd starts different services in parallel as much as possible, so it should be no surprise that more thought needs to be given to ordering and more bugs in that area are to be expected.
As hoped, the fixes came from a range of sources, including one commit from an Ubuntu developer that removed the default dependency on basic.target. That enabled the NFS service to start earlier, which is particularly useful when /var is mounted via NFS. Another, from a Red Hat developer, removed an ordering cycle caused by the nfs-client.target inexplicably being told to start before the GSS services it relies on, rather than after. A third, from the developer of OSTree, made sure that /var/lib/nfs/rpc-pipefs wasn't mounted until after the systemd-tmpfiles.service had a chance to create that directory. This is important in configurations where /var is not permanent.
Each of these changes involved subtle ordering dependencies that were not easy to foresee when the unit files were initially assembled. Some of them have the potential to benefit many users by improving robustness or startup time. Others have much narrower applicability, but still benefit developers by documenting the needs that others have. This makes it less likely that future changes will break working use cases and can allow delayed collaboration, as the final example will show.
rpcbind dependencies
There were two changes deserving of special note, partly because they required multiple attempts to get right and partly because they both involve dependencies that are affected by the configuration of the NFS services; they take quite different approaches to handling those dependencies. The first of these changes revised the dependency on rpcbind, which is a lookup service that maps an ONC-RPC service number into an Internet port number. When RPC services start, they choose a port number and register with rpcbind, so it can tell clients which port each service can be reached on.
When version 2 or version 3 of NFS is in use, rpcbind is required. It is necessary for three auxiliary protocols (MOUNT, LOCK, and STATUS), and is the preferred way to find the NFS service, though in practice that service always uses port 2049. When only version 4 of NFS is in use, rpcbind is not necessary, since NFSv4 incorporates all the functionality that was previously included in the three extra protocols and it mandates the use of port 2049. Some system administrators prefer not to run unnecessary daemons and so don't want rpcbind started when only NFSv4 is configured. There are two requirements to bear in mind when meeting this need; one is to make sure the service isn't started, the other is to ensure the main service starts even though rpcbind is missing.
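The register-then-look-up role that rpcbind plays can be pictured with a small toy model. This is only a sketch, not the rpcbind implementation: the class and method names are invented for illustration and the mountd port used in the usage note is arbitrary. The program numbers are the standard ONC-RPC assignments, and a lookup that finds nothing returns port 0, as the portmap GETPORT call does.

```python
# Toy model of an rpcbind-style registry (illustrative only).
# Standard ONC-RPC program numbers for the services discussed here.
PROGRAMS = {
    "nfs": 100003,
    "mountd": 100005,    # MOUNT protocol
    "nlockmgr": 100021,  # LOCK protocol
    "status": 100024,    # STATUS protocol
}


class ToyRpcbind:
    """Maps (program, version, protocol) to a port number."""

    def __init__(self):
        self._ports = {}

    def set(self, program, version, protocol, port):
        """Register a service; refuse if the tuple is already mapped."""
        key = (program, version, protocol)
        if key in self._ports:
            return False
        self._ports[key] = port
        return True

    def unset(self, program, version, protocol):
        """Remove a mapping; report whether one existed."""
        return self._ports.pop((program, version, protocol), None) is not None

    def getport(self, program, version, protocol):
        """Return the registered port, or 0 if nothing is registered."""
        return self._ports.get((program, version, protocol), 0)
```

A service such as mountd would call `set(PROGRAMS["mountd"], 3, "tcp", 20048)` at startup, and a client would later call `getport()` with the same tuple. An NFSv4-only client never needs that lookup because the port is fixed at 2049.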
As discussed in the earlier articles, systemd doesn't have much visibility into non-systemd configuration files, so it cannot easily detect if NFSv3 is enabled and start rpcbind only if it is. Instead it needs to explicitly be told to disable rpcbind with:
systemctl mask rpcbind
There is subtlety hiding behind this command. rpcbind uses three unit files: rpcbind.target, rpcbind.service, and rpcbind.socket. Previously, I recommended using the target file to activate rpcbind but that was a mistake. Target files can be used for higher-level abstractions as described then, but there is no guarantee that they will be. rpcbind.target is defined by systemd only to provide ordering with rpcbind (or equally "portmap"). This provides compatibility with SysV init, which has a similar concept. rpcbind.target cannot be used to activate those services, and so should be ignored by nfs-utils. rpcbind.socket describes how to use socket-activation to enable rpcbind.service, the main service. nfs-utils only cares about the sockets being ready to listen, so it should only have (and now does only have) dependencies on rpcbind.socket.
Masking rpcbind ensures that rpcbind.service doesn't run. The socket activation is not directly affected, but systemd sorts this out soon enough. Systemd will still listen on the various sockets at first but, as soon as some process tries to connect to one of those sockets, systemd will notice the inconsistency and will shut down the sockets as well. So this simple and reasonably obvious command does what you might expect.
Ensuring that other services cope with rpcbind being absent is as easy as using a Wants dependency rather than a Requires dependency. These ask the service to start, but won't fail if it doesn't. Some parts of NFS only "want" rpcbind to be running, but one, rpc.statd, cannot function without it, so it still Requires rpcbind. This has the effect of implicitly disabling rpc.statd when rpcbind is masked.
It's worth spending a while reflecting on why the command is "systemctl mask" rather than "systemctl disable", as I've often come across the expectation that enable and disable are the commands to enable or disable a unit file. As a concrete example, Martin Pitt stated in Ubuntu bug 1428486 that they are "the canonical way to enable/disable a unit", but this was not the first place that I found this expectation.
The reality is that enable is the canonical way to request activation of a unit file. It doesn't actually start it ("systemctl start" will do that), and it isn't the only way to activate a unit file, as some other unit file can do so with a Requires directive. This may seem to be splitting hairs, but the distinction is more clear with the disable command, which does not disable a unit file. Instead, it only reverts any explicit request made by enable that a unit be activated. It is quite possible that a unit file will still be fully functional even after running "systemctl disable" on it.
If you want to be sure that a unit file will be activated, then "systemctl enable" is probably the right thing to do. If you want to be sure that it is not activated, then "systemctl disable" won't provide that guarantee; you need "systemctl mask" instead. This command ensures that the unit file won't run even if some other unit file Requires it. So that is the command that we use to ensure rpcbind isn't running, and it could also be used to ensure rpc.statd isn't running, though that isn't really needed as masking rpcbind effectively masked rpc.statd as mentioned.
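The difference between disabling and masking, and between Wants and Requires, can be sketched with a toy model of unit activation. It is a deliberate simplification and not how systemd is implemented: it treats rpcbind as a single unit (ignoring the socket/service split), it ignores ordering, and the dependency layout in `UNITS` is an assumption chosen to mirror the description above rather than a copy of the real unit files.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    wants: tuple = ()     # soft dependencies (Wants=)
    requires: tuple = ()  # hard dependencies (Requires=)


def can_start(name, units, masked):
    """A unit can start unless it is masked or something it Requires cannot start."""
    if name in masked:
        return False
    return all(can_start(dep, units, masked) for dep in units[name].requires)


def activate(name, units, masked, running):
    """Start a unit and pull in whatever it Requires or Wants."""
    if name in running or not can_start(name, units, masked):
        return
    running.add(name)
    for dep in units[name].requires + units[name].wants:
        activate(dep, units, masked, running)


def boot(units, enabled, masked=frozenset()):
    """Return the set of units running after all enabled units are activated."""
    running = set()
    for name in sorted(enabled):
        activate(name, units, masked, running)
    return running


# Hypothetical layout: the server only wants rpcbind, statd requires it.
UNITS = {
    "nfs-server": Unit(wants=("rpcbind", "rpc-statd")),
    "rpc-statd": Unit(requires=("rpcbind",)),
    "rpcbind": Unit(),
}
```

In this model, `boot(UNITS, {"nfs-server"})` still starts rpcbind even though rpcbind itself is not enabled, because another unit pulls it in; that is the sense in which "disable" gives no guarantee. With `masked={"rpcbind"}` only nfs-server runs: the Wants dependency is quietly dropped, and rpc-statd is blocked because its Requires cannot be met.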
Ordering nfsd with respect to filesystem mounting using a generator
One dependency for the NFS server, which is particularly obvious in hindsight, is that it should only be started after the filesystems that it is exporting have been mounted. Without this ordering, an NFS client might manage to mount the filesystem that is about to have something mounted on top of it, which can cause confusion — or worse. The default dependencies imposed by systemd will start services after local-fs.target, which ensures all local filesystems are mounted. When the commit mentioned above removed the default dependencies to allow NFS to start earlier, it explicitly added local-fs.target. So this seems well in hand.
For remote filesystems mounted over NFS, we need the reverse ordering. In particular, if a filesystem is NFS mounted from the local host (a "loopback" mount), the NFS server should be started before the filesystem is mounted. This is particularly important during system shutdown when ordering is reversed. If the NFS server is stopped before the loopback NFS filesystem is unmounted, that unmount can hang indefinitely.
To avoid this hang, Pitt added a dependency so that nfs-server.service would start before (and so be stopped after) remote-fs-pre.target. This ensures that the NFS server will be running whenever a loopback NFS filesystem might be mounted. This seems like it makes perfect sense, but there is a wrinkle: sometimes, filesystems that are considered by systemd to be "remote" can be exported by NFS. A particular example is filesystems mounted from a network-attached block device, such as one accessed over iSCSI.
Had I confronted the need to export iSCSI filesystems before Pitt had added the dependency on remote-fs-pre.target, I probably would have simply told systemd to start nfs-server.service "After remote-fs.target". This would have solved the iSCSI situation, but broken the loopback NFS situation. Had the unit files not been upstream, this is undoubtedly what would have happened.
Instead, a more general solution was needed. The NFS server needs to start after the mounting of any filesystems that are exported, but before any NFS filesystem is mounted. Systemd is not able to make this determination itself, but fortunately it has a flexible extension mechanism so it can have the details explained to it. Using this extension mechanism isn't quite as easy as adding a script to /etc/init.d, but perhaps that is a good thing. It should probably only be used as a last resort, but it is good to have it when that resort is needed.
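The conflict between the two target-level orderings, and the way per-mount ordering avoids it, can be checked with a small dependency graph. This is a sketch under stated assumptions: the two mount unit names are hypothetical, and the `BASE` graph encodes only systemd's usual ordering of remote mounts (after remote-fs-pre.target, before remote-fs.target). It uses `graphlib` from the Python standard library, so it needs Python 3.9 or later.

```python
from graphlib import CycleError, TopologicalSorter

# Each key must start after every unit in its value set.
# export-iscsi.mount: a hypothetical iSCSI-backed filesystem that is exported.
# srv-nfs.mount: a hypothetical loopback NFS mount from the local server.
BASE = {
    "export-iscsi.mount": {"remote-fs-pre.target"},
    "srv-nfs.mount": {"remote-fs-pre.target"},
    "remote-fs.target": {"export-iscsi.mount", "srv-nfs.mount"},
}


def start_order(extra):
    """Return one valid start order for BASE plus the extra constraints."""
    graph = {unit: set(deps) for unit, deps in BASE.items()}
    for unit, deps in extra.items():
        graph.setdefault(unit, set()).update(deps)
    return list(TopologicalSorter(graph).static_order())


def has_cycle(extra):
    """Report whether the extra constraints create an ordering cycle."""
    try:
        start_order(extra)
    except CycleError:
        return True
    return False


# nfs-server.service Before=remote-fs-pre.target
BEFORE_REMOTE_PRE = {"remote-fs-pre.target": {"nfs-server.service"}}
# nfs-server.service After=remote-fs.target
AFTER_REMOTE = {"nfs-server.service": {"remote-fs.target"}}
# Per-mount ordering of the kind a generator can emit.
PER_MOUNT = {
    "nfs-server.service": {"export-iscsi.mount"},
    "srv-nfs.mount": {"nfs-server.service"},
}
```

Combining `BEFORE_REMOTE_PRE` with `AFTER_REMOTE` produces a cycle, and `AFTER_REMOTE` on its own puts the loopback mount ahead of the server, which is the ordering that can hang at shutdown. `PER_MOUNT` yields an order with the exported filesystem first, then the server, then the loopback mount.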
Before systemd reads all its unit files, either at startup or in response to "systemctl daemon-reload", it will run any programs found in various "generator" directories such as /usr/lib/systemd/system-generators. These programs are run in parallel, are expected to complete quickly, and will normally read a foreign (i.e. non-systemd) configuration file and create new unit files or drop-ins (which extend existing unit files) in a directory given to the program, typically /run/systemd/generator. These will then be read when other unit files and drop-ins are read, so they can exercise a large degree of control over systemd.
For the nfs-server dependency, with respect to various mount points, we want to read /etc/exports and add a RequiresMountsFor= directive for each exported directory. Then we want to read /etc/fstab and add a Before=MOUNT_POINT.mount directive for each MOUNT_POINT of an nfs or nfs4 filesystem. As library code already exists for reading both of these files, this all comes to less than 200 lines of code. Once the problem is understood, the answer is easy.
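The shape of such a generator can be sketched in a few lines. This is not the nfs-utils generator, which relies on existing library code to parse both files; the parsing here is deliberately naive (no line continuations, quoting, or path normalization), the path escaping is a simplified version of what `systemd-escape --path` does, and the function names and the drop-in file name are illustrative assumptions.

```python
import string
from pathlib import Path

_KEEP = set(string.ascii_letters + string.digits + ":_.")


def exported_dirs(exports_text):
    """Return the directory at the start of each /etc/exports entry."""
    dirs = []
    for line in exports_text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line.startswith("/"):
            dirs.append(line.split()[0])
    return dirs


def nfs_mount_points(fstab_text):
    """Return the mount points of nfs and nfs4 entries in /etc/fstab."""
    points = []
    for line in fstab_text.splitlines():
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[2] in ("nfs", "nfs4"):
            points.append(fields[1])
    return points


def mount_unit_name(path):
    """Simplified path escaping: '/' becomes '-', unusual bytes become \\xXX."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-.mount"
    out = []
    for i, ch in enumerate(trimmed):
        if ch == "/":
            out.append("-")
        elif ch in _KEEP and not (i == 0 and ch == "."):
            out.append(ch)
        else:
            out.extend("\\x%02x" % b for b in ch.encode())
    return "".join(out) + ".mount"


def drop_in(exports_text, fstab_text):
    """Build the text of a drop-in that extends nfs-server.service."""
    lines = ["[Unit]"]
    lines += ["RequiresMountsFor=" + d for d in exported_dirs(exports_text)]
    lines += ["Before=" + mount_unit_name(m) for m in nfs_mount_points(fstab_text)]
    return "\n".join(lines) + "\n"


def main(argv, exports="/etc/exports", fstab="/etc/fstab"):
    """Write the drop-in into the output directory systemd passes as argv[1]."""
    if len(argv) < 2:
        return 1
    texts = [Path(p).read_text() if Path(p).exists() else "" for p in (exports, fstab)]
    target = Path(argv[1]) / "nfs-server.service.d"
    target.mkdir(parents=True, exist_ok=True)
    (target / "order-with-mounts.conf").write_text(drop_in(*texts))
    return 0
```

A real generator would be an executable installed in one of the generator directories and would finish by calling `main(sys.argv)`. Given an export of /srv/export and a loopback mount on /srv/loop, the generated drop-in contains `RequiresMountsFor=/srv/export` and `Before=srv-loop.mount`.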
Generators everywhere?
Having experienced the power of systemd generators, I immediately started to wonder how else I might use them. It is tempting to use a generator to automatically disable rpcbind when only NFSv4 is in use, but I think that is a temptation best avoided. rpcbind isn't only used by NFS. NIS, the Network Information Service (previously called "yellow pages"), makes use of it, and sites could easily have their own local RPC services. It is best if disabling rpcbind remains a separate administrative decision, for which the "mask" function seems well suited.
In the earlier articles I described a modest amount of complexity required to pass local configuration through systemd to affect the parameters passed to various programs. Using a generator to process the configuration file could make all of that more transparent, or it might just replace one sort of complexity with another. While I don't agree with all the advice the systemd developers provide, this advice from the systemd.generator manual page is certainly worth considering:
Upstream now!
The evidence presented here supports the claim that keeping systemd unit files upstream can benefit all developers and users. The different experiences generated in different contexts were brought together into a single conversation so all could benefit from, and respond to, all the changes. This should not be surprising when one thinks of unit files as just another sort of code used to write the whole system. The only part that seems to be missing from upstream is a place to document the advice that "systemctl mask rpcbind" is the appropriate way to disable rpcbind and rpc-statd when only NFSv4 is in use. Maybe we need an nfs.systemd man page.
Brief items
Development quotes of the week
The developer community, hidden behind screens, have been bought by the big man money, and act to the people as blind policemen throwing bombs at the demanding population.
Newsletters and articles
Development newsletters
- Emacs News (September 26)
- These Weeks in Firefox (September 22)
- What's cooking in git.git (September 21)
- What's cooking in git.git (September 23)
- What's cooking in git.git (September 27)
- OCaml Weekly News (September 27)
- OpenStack Developer Mailing List Digest (September 23)
- Perl Weekly (September 26)
- PostgreSQL Weekly News (September 25)
- Python Weekly (September 22)
- Ruby Weekly (September 22)
- This Week in Rust (September 27)
- Wikimedia Tech News (September 26)
Mitchell: The MIT License, Line by Line
At his blog, Kyle E. Mitchell ("who is not your attorney") takes a close, line-by-line reading of the popular MIT software license. The details he points out begin on line one with the license's title: "'The MIT License' is a not a single license, but a family of license forms derived from language prepared for releases from the Massachusetts Institute of Technology. It has seen a lot of changes over the years, both for the original projects that used it, and also as a model for other projects. The Fedora Project maintains a kind of cabinet of MIT license curiosities, with insipid variations preserved in plain text like anatomical specimens in formaldehyde, tracing a wayward kind of evolution."
Despite the license being only 171 words, Mitchell finds quite a bit to expand on, such as the ambiguities of the phrase "to deal in the Software without restriction": "As a result of this mishmash of legal, industry, general-intellectual-property, and general-use terms, it isn’t clear whether The MIT License includes a patent license. The general language 'deal in' and some of the example verbs, especially 'use', point toward a patent license, albeit a very unclear one. The fact that the license comes from the copyright holder, who may or may not have patent rights in inventions in the software, as well as most of the example verbs and the definition of 'the Software' itself, all point strongly toward a copyright license." Nevertheless, Mitchell notes, "despite some crusty verbiage and lawyerly affectation, one hundred and seventy one little words can get a hell of a lot of legal work done".
Prodromou: Adopt a pump.io server
Evan Prodromou, creator of identi.ca and pump.io, has put a call out for interested parties to adopt the administration of public pump.io microblogging servers, which he is currently funding out of his own pocket. "Almost all of them are on $5/month Digital Ocean droplets, which makes them relatively cheap for a single person to support. If you decide you want to adopt a server, E14N will sell you the domain and all the software and data for $1. But you'll be obligated to keep the server running pump.io for at least a year, and if you decide you don't want to run it, you have to sell it back to me." There are currently around 25 servers in the federated network initially started by Prodromou, which does not count other pump.io instances. He notes that one important exception is the identi.ca site, which is significantly larger than the rest, and which he would like to find a trusted non-profit organization to maintain.
Page editor: Rebecca Sobol
Announcements
Brief items
Announcing the KDE Advisory Board
KDE e.V. introduces the KDE Advisory Board. "One of the core goals of the Advisory Board is to provide KDE with insights into the needs of the various organizations that surround us. We are very aware that we need the ability to combine our efforts for greater impact and the only way we can do that is by adopting a more diverse view from outside of our organization on topics that are relevant to us. This will allow all of us to benefit from one another's experience."
Articles of interest
Garrett: Microsoft aren't forcing Lenovo to block free operating systems
Matthew Garrett looks at the real problem behind the inability of some Lenovo laptops to run Linux. "The real problem here is that Intel do very little to ensure that free operating systems work well on their consumer hardware - we still have no information from Intel on how to configure systems to ensure good power management, we have no support for storage devices in "RAID" mode and we have no indication that this is going to get better in future. If Intel had provided that support, this issue would never have occurred."
Calls for Presentations
LibrePlanet call for proposals
The next LibrePlanet will be held March 25-26, 2017 in the Boston, MA area. "This year, the theme of LibrePlanet is "The Roots of Freedom." This encompasses the historical "roots" of the free software movement -- the Four Freedoms, the GNU General Public License and copyleft, and a focus on strong security and privacy protections -- and the concept of roots as a strong foundation from which the movement grows." The call for proposals closes November 14.
CFP Deadlines: September 29, 2016 to November 28, 2016
The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.
| Deadline | Event Dates | Event | Location |
|---|---|---|---|
| September 30 | November 12-13 | T-Dose | Eindhoven, Netherlands |
| September 30 | December 3 | NoSlidesConf | Bologna, Italy |
| September 30 | November 5-6 | OpenFest 2016 | Sofia, Bulgaria |
| September 30 | November 29-30 | 5th RISC-V Workshop | Mountain View, CA, USA |
| September 30 | December 27-30 | Chaos Communication Congress | Hamburg, Germany |
| October 1 | October 22 | 2016 Columbus Code Camp | Columbus, OH, USA |
| October 19 | November 19 | eloop 2016 | Stuttgart, Germany |
| October 25 | May 8-11 | O'Reilly Open Source Convention | Austin, TX, USA |
| October 26 | November 5 | Barcelona Perl Workshop | Barcelona, Spain |
| October 28 | November 25-27 | Pycon Argentina 2016 | Bahía Blanca, Argentina |
| October 30 | February 17 | Swiss Python Summit | Rapperswil, Switzerland |
| October 31 | February 4-5 | FOSDEM 2017 | Brussels, Belgium |
| November 11 | November 11-12 | Linux Piter | St. Petersburg, Russia |
| November 11 | January 27-29 | DevConf.cz 2017 | Brno, Czech Republic |
| November 13 | December 10 | Mini Debian Conference Japan 2016 | Tokyo, Japan |
| November 15 | March 2-5 | Southern California Linux Expo | Pasadena, CA, USA |
| November 15 | March 28-31 | PGConf US 2017 | Jersey City, NJ, USA |
| November 18 | February 18-19 | PyCaribbean | Bayamón, Puerto Rico, USA |
| November 20 | December 10-11 | SciPy India | Bombay, India |
| November 21 | January 16 | Linux.Conf.Au 2017 Sysadmin Miniconf | Hobart, Tas, Australia |
| November 21 | January 16-17 | LCA Kernel Miniconf | Hobart, Australia |
If the CFP deadline for your event does not appear here, please tell us about it.
Upcoming Events
Netdev 1.2 updates
Netdev 1.2 takes place October 5-7 in Tokyo, Japan. The final program is available, plus some travel tips, and more.
SFLC 2016 Annual Conference at Columbia Law School
The Software Freedom Law Center has announced the program for its conference, to be held October 28 in New York, NY. "We have assembled what we think will be a very lively and interesting program, which you can find summarized below. The ideas are free as in freedom, and---as always---attendance, NYS continuing legal education credits, lunch, and the various drinks are free as in beer."
Events: September 29, 2016 to November 28, 2016
The following event listing is taken from the LWN.net Calendar.
| Date(s) | Event | Location |
|---|---|---|
| September 27-29 | OpenDaylight Summit | Seattle, WA, USA |
| September 28-30 | Kernel Recipes 2016 | Paris, France |
| September 28-October 1 | systemd.conf 2016 | Berlin, Germany |
| September 30-October 2 | Hackers Congress Paralelní Polis | Prague, Czech Republic |
| October 1-2 | openSUSE.Asia Summit | Yogyakarta, Indonesia |
| October 3-5 | OpenMP Conference | Nara, Japan |
| October 4-6 | LinuxCon Europe | Berlin, Germany |
| October 4-6 | ContainerCon Europe | Berlin, Germany |
| October 5-7 | International Workshop on OpenMP | Nara, Japan |
| October 5-7 | Netdev 1.2 | Tokyo, Japan |
| October 6-7 | PyConZA 2016 | Cape Town, South Africa |
| October 7-8 | Ohio LinuxFest 2016 | Columbus, OH, USA |
| October 8-9 | Gentoo Miniconf 2016 | Prague, Czech Republic |
| October 8-9 | LinuxDays 2016 | Prague, Czechia |
| October 10-11 | GStreamer Conference | Berlin, Germany |
| October 11 | Real-Time Summit 2016 | Berlin, Germany |
| October 11-13 | Embedded Linux Conference Europe | Berlin, Germany |
| October 12 | Tracing Summit | Berlin, Germany |
| October 13 | OpenWrt Summit | Berlin, Germany |
| October 13-14 | Lua Workshop 2016 | San Francisco, CA, USA |
| October 17-19 | O'Reilly Open Source Convention | London, UK |
| October 18-20 | Qt World Summit 2016 | San Francisco, CA, USA |
| October 21-23 | Software Freedom Kosovo 2016 | Prishtina, Kosovo |
| October 22 | 2016 Columbus Code Camp | Columbus, OH, USA |
| October 22-23 | Datenspuren 2016 | Dresden, Germany |
| October 25-28 | OpenStack Summit | Barcelona, Spain |
| October 26-27 | All Things Open | Raleigh, NC, USA |
| October 27-28 | Rust Belt Rust | Pittsburgh, PA, USA |
| October 28-30 | PyCon CZ 2016 | Brno, Czech Republic |
| October 29-30 | PyCon.de 2016 | Munich, Germany |
| October 29-30 | PyCon HK 2016 | Hong Kong, Hong Kong |
| October 31 | PyCon Finland 2016 | Helsinki, Finland |
| October 31-November 1 | Linux Kernel Summit | Santa Fe, NM, USA |
| October 31-November 2 | O’Reilly Security Conference | New York, NY, USA |
| November 1-4 | PostgreSQL Conference Europe 2016 | Tallinn, Estonia |
| November 1-4 | Linux Plumbers Conference | Santa Fe, NM, USA |
| November 3 | Bristech Conference 2016 | Bristol, UK |
| November 4-6 | FUDCon Phnom Penh | Phnom Penh, Cambodia |
| November 5 | Barcelona Perl Workshop | Barcelona, Spain |
| November 5-6 | OpenFest 2016 | Sofia, Bulgaria |
| November 7-9 | Velocity Amsterdam | Amsterdam, Netherlands |
| November 9-11 | O’Reilly Security Conference EU | Amsterdam, Netherlands |
| November 11-12 | Seattle GNU/Linux Conference | Seattle, WA, USA |
| November 11-12 | Linux Piter | St. Petersburg, Russia |
| November 12-13 | T-Dose | Eindhoven, Netherlands |
| November 12-13 | Mini-DebConf | Cambridge, UK |
| November 12-13 | PyCon Canada 2016 | Toronto, Canada |
| November 13-18 | The International Conference for High Performance Computing, Networking, Storage and Analysis | Salt Lake City, UT, USA |
| November 14-16 | PGConfSV 2016 | San Francisco, CA, USA |
| November 14 | The Third Workshop on the LLVM Compiler Infrastructure in HPC | Salt Lake City, UT, USA |
| November 14-18 | Tcl/Tk Conference | Houston, TX, USA |
| November 16-18 | ApacheCon Europe | Seville, Spain |
| November 16-17 | Paris Open Source Summit | Paris, France |
| November 17 | NLUUG (Fall conference) | Bunnik, The Netherlands |
| November 18-20 | GNU Health Conference 2016 | Las Palmas, Spain |
| November 18-20 | UbuCon Europe 2016 | Essen, Germany |
| November 19 | eloop 2016 | Stuttgart, Germany |
| November 21-22 | Velocity Beijing | Beijing, China |
| November 24 | OWASP Gothenburg Day | Gothenburg, Sweden |
| November 25-27 | Pycon Argentina 2016 | Bahía Blanca, Argentina |
If your event does not appear here, please tell us about it.
Page editor: Rebecca Sobol
