
LWN.net Weekly Edition for August 27, 2015

A look at The Machine

By Jake Edge
August 26, 2015

LinuxCon North America

In what was perhaps one of the shortest keynotes on record (ten minutes), Keith Packard outlined the hardware architecture of "The Machine"—HP's ambitious new computing system. That keynote took place at LinuxCon North America in Seattle and was thankfully followed by an hour-long technical talk by Packard the following day (August 18), which looked at both the hardware and software for The Machine. It is, in many ways, a complete rethinking of the future of computers and computing, but there is a fairly long way to go between here and there.

The hardware

The basic idea of the hardware is straightforward. Many of the "usual computation units" (i.e. CPUs or systems on chip—SoCs) are connected to a "massive memory pool" using photonics for fast interconnects. That leads to something of an equation, he said: electrons (in CPUs) + photons (for communication) + ions (for memory storage) = computing. Today's computers transfer a lot of data and do so over "tiny little pipes". The Machine, instead, can address all of its "amazingly huge pile of memory" from each of its many compute elements. One of the underlying principles is to stop moving memory around to use it in computations—simply have it all available to any computer that needs it.

[Keith Packard]

Some of the ideas for The Machine came from HP's DragonHawk systems, which were traditional symmetric multiprocessing systems, but packed a "lot of compute in a small space". DragonHawk systems would have 12TB of RAM in an 18U enclosure, while the nodes being built for The Machine will have 32TB of memory in 5U. It is, he said, a lot of memory and it will scale out linearly. All of the nodes will be connected at the memory level so that "every single processor can do a load or store instruction to access memory on any system".

Nodes in this giant cluster do not have to be homogeneous, as long as they are all hooked to the same memory interconnect. The first nodes that HP is building will be homogeneous, just for pragmatic reasons. There are two circuit boards on each node, one for storage and one for the computer. Connecting the two will be the "next generation memory interconnect" (NGMI), which will also connect both parts of the node to the rest of the system using photonics.

The compute part of the node will have a 64-bit ARM SoC with 256GB of purely local RAM along with a field-programmable gate array (FPGA) to implement the NGMI protocol. The storage part will have four banks of memory (each with 1TB), each with its own NGMI FPGA. A given SoC can access memory elsewhere without involving the SoC on the node where the memory resides—the NGMI bridge FPGAs will talk to their counterpart on the other node via the photonic interface. Those FPGAs will eventually be replaced by application-specific integrated circuits (ASICs) once the bugs are worked out.

ARM was chosen because it was easy to get those vendors to talk with the project, Packard said. There is no "religion" about the instruction set architecture (ISA), so others may be used down the road.

Eight of these nodes can be collected up into a 5U enclosure, which gives eight processors and 32TB of memory. Ten of those enclosures can then be placed into a rack (80 processors, 320TB) and multiple racks can all be connected on the same "fabric" to allow addressing up to 32 zettabytes (ZB) from each processor in the system.

The storage and compute portions of each node are powered separately. The compute piece has two 25Gb network interfaces that are capable of remote DMA. The storage piece will eventually use some kind of non-volatile/persistent storage (perhaps even the fabled memristor), but is using regular DRAM today, since it is available and can be used to prove the other parts of the design before switching.

SoCs in the system may be running more than one operating system (OS) and for more than one tenant, so there are some hardware protection mechanisms built into the system. In addition, the memory-controller FPGAs will encrypt the data at rest so that pulling a board will not give access to the contents of the memory even if it is cooled (à la cold boot) or when some kind of persistent storage is used.

At one time, someone said that 640KB of memory should be enough, Packard said, but now he is wrestling with the limits of the 48-bit addresses used by the 64-bit ARM and Intel CPUs. That only allows addressing up to 256TB, so memory will be accessed in 8GB "books" (or, sometimes, 64KB "booklettes"). Beyond the SoC, the NGMI bridge FPGA (which is also called the "Z bridge") deals with two different kinds of addresses: 53-bit logical Z addresses and 75-bit Z addresses. Those allow addressing 8PB and 32ZB respectively.
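Those limits are easy to sanity-check with a little arithmetic; the sketch below simply restates the numbers from the talk (an n-bit byte address reaches 2^n bytes):

```python
# Address-space sizes implied by the address widths from the talk.
TB = 1 << 40
PB = 1 << 50
ZB = 1 << 70

def addressable(bits):
    """Bytes reachable with a flat bits-wide byte address."""
    return 1 << bits

assert addressable(48) == 256 * TB  # 48-bit ARM/Intel virtual addresses
assert addressable(53) == 8 * PB    # 53-bit logical Z addresses
assert addressable(75) == 32 * ZB   # 75-bit Z addresses

# The 75-bit space is carved into 8GB (2^33-byte) "books":
books = addressable(75) // (8 << 30)
print(books == 1 << 42)  # True: 2^42 books
```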

The logical Z addresses are used by the NGMI firewall to determine the access rights to that memory for the local node. Those access controls are managed outside of whatever OS is running on the SoC. So the mapping of memory is handled by the OS, while the access controls for the memory are part of the management of The Machine system as a whole.

NGMI is not intended to be a proprietary fabric protocol, Packard said, and the project is trying to see if others are interested. A memory transaction on the fabric looks much like a cache access. The Z address is presented and 64 bytes are transferred.

The software

Packard's group is working on GPL operating systems for the system, but others can certainly be supported. If some "proprietary Washington company" wanted to port its OS to The Machine, it certainly could. Meanwhile, though, other groups are working on other free systems, but his group is made up of "GPL bigots" who are working on Linux for the system. There will not be a single OS (or even distribution or kernel) running on a given instance of The Machine—it is intended to support multiple different environments.

Probably the biggest hurdle for the software is that there is no cache coherence within the enormous memory pool. Each SoC has its own local memory (256GB) that is cache coherent, but accesses to the "fabric-attached memory" (FAM) between two processors are completely uncoordinated by hardware. That has implications for applications and the OS that are using that memory, so OS data structures should be restricted to the local, cache-coherent memory as much as possible.

For the FAM, there is a two-level allocation scheme that is arbitrated by a "librarian". It allocates books (8GB) and collects them into "shelves". The hardware protections provided by the NGMI firewall are done on book boundaries. A shelf could be a collection of books that are scattered all over the FAM in a single load-store domain (LSD—not Packard's favorite acronym, he noted), which is defined by the firewall access rules. That shelf could then be handed to the OS to be used for a filesystem, for example. That might be ext4, some other Linux filesystem, or the new library filesystem (LFS) that the project is working on.

Talking to the memory in a shelf uses the POSIX API. A process does an open() on a shelf and then uses mmap() to map the memory into the process. Underneath, it uses the direct access (DAX) support to access the memory. For the first revision, LFS will not support sparse files. Also, locking will not be global throughout an LSD, but will be local to an OS running on a node.
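A minimal sketch of that access pattern is below; a regular temporary file stands in for a DAX-backed LFS shelf (the shelf name is hypothetical) so that the sketch runs anywhere:

```python
import mmap
import os
import tempfile

# With LFS and DAX, the same open()/mmap() calls would map
# fabric-attached memory directly into the process; here an ordinary
# file plays the role of the shelf.
shelf_path = os.path.join(tempfile.mkdtemp(), "shelf0")  # hypothetical name
with open(shelf_path, "wb") as f:
    f.write(b"\0" * 4096)  # a real shelf would be sized in 8GB books

fd = os.open(shelf_path, os.O_RDWR)
mem = mmap.mmap(fd, 4096)
mem[0:5] = b"hello"  # plain loads and stores hit the mapped memory
mem.flush()          # msync(); persistent memory needs explicit flushing
print(bytes(mem[0:5]))  # b'hello'
mem.close()
os.close(fd)
```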

For management of the FAM, each rack will have a "top of rack" management server, which is where the librarian will run. That is a fairly simple piece of code that just does bookkeeping and keeps track of the allocations in a SQLite database. The SoCs are the only parts of the system that can talk to the firewall controller, so other components communicate with a firewall proxy that runs in user space, which relays queries and updates. There are a "whole bunch of potential adventures" in getting the memory firewall pieces all working correctly, Packard said.
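The librarian's bookkeeping can be pictured with a toy model like the one below; the schema and names are invented for illustration, but the shape of the task matches the description: books are the allocation unit, a shelf is just a named collection of them, and SQLite tracks who owns what:

```python
import sqlite3

BOOK = 8 << 30  # 8GB book size

# Invented schema, for illustration only: one row per book in the FAM.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, shelf TEXT)")
db.executemany("INSERT INTO books (id, shelf) VALUES (?, NULL)",
               [(i,) for i in range(16)])  # pretend the FAM holds 16 books

def create_shelf(name, size_bytes):
    """Claim enough free books to cover size_bytes and label them."""
    nbooks = -(-size_bytes // BOOK)  # ceiling division
    free = [row[0] for row in db.execute(
        "SELECT id FROM books WHERE shelf IS NULL LIMIT ?", (nbooks,))]
    if len(free) < nbooks:
        raise MemoryError("not enough free books")
    db.executemany("UPDATE books SET shelf = ? WHERE id = ?",
                   [(name, b) for b in free])
    return free

print(create_shelf("scratch", 20 << 30))  # 20GB rounds up to 3 books
```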

The lack of cache coherence makes atomic operations on the FAM problematic, as traditional atomics rely on that feature. So the project has added some hardware to the bridges to do atomic operations at that level. There is a fam_atomic library to access the operations (fetch and add, swap, compare and store, and read), which means that each operation is done at the cost of a system call. Once again, this is just the first implementation; other mechanisms may be added later. One important caveat is that the FAM atomic operations do not interact with the SoC cache, so applications will need to flush those caches as needed to ensure consistency.

Physical addresses at the SoC level can change, so there needs to be support for remapping those addresses. But the SoC caches and DAX both assume static physical mappings. A subset of the physical address space will be used as an aperture into the full address space of the system and books can be mapped into that aperture.

Flushing the SoC cache line by line would "take forever", so a way to flush the entire cache when the physical address mappings change has been added. In order to do that, two new functions have been added to the Intel persistent memory library (libpmem): one to check for the presence of non-coherent persistent memory (pmem_needs_invalidate()) and another to invalidate the CPU cache (pmem_invalidate()).

In a system of this size, with the huge amounts of memory involved, there needs to be well-defined support for memory errors, Packard said. Read is easy—errors are simply signaled synchronously—but writes are trickier because the actual write is asynchronous. Applications need to know about the errors, though, so SIGBUS is used to signal an error. The pmem_drain() call will act as a barrier, such that errors in writes before that call will signal at or before the call. Any errors after the barrier will be signaled post-barrier.

There are various areas where the team is working on free software, he said, including persistent memory and DAX. There is also ongoing work on concurrent/distributed filesystems and non-coherent cache management. Finally, reliability, availability, and serviceability (RAS) are quite important to the project, so free software work is proceeding in that area as well.

Even with two separate sessions, it was a bit of a whirlwind tour of The Machine. As he noted, it is an environment that is far removed from the desktop world Packard had previously worked in. By the sound of it, there are plenty of challenges to overcome before The Machine becomes a working computing device—it will be an interesting process to watch.

[I would like to thank the Linux Foundation for travel assistance to Seattle for LinuxCon North America.]


Data visualizations in text

By Nathan Willis
August 26, 2015

TypeCon

Data visualization is often thought of in terms of pixels; considerable work goes into shaping large data sets into a form where spatial relationships are made clear and where colors, shapes, intensity, and point placement encode various quantities for rapid understanding. At TypeCon 2015 in Denver, though, researcher Richard Brath presented a different approach: taking advantage of readers' familiarity with the written word to encode more information into text itself.

Brath is a PhD candidate at London South Bank University where, he said, "I turn data into shapes and color and so on." Historically speaking, though, typography has not been a part of that equation. He showed a few examples of standard data visualizations, such as "heatmap" diagrams. Even when there are multiple variables under consideration, the only typography involved is plain text labels. "Word clouds" are perhaps the only well-known example of visualizations that involve altering text based on data, but even that is trivial: the most-frequent words or tags are simply bigger. More can certainly be done.

[Richard Brath]

Indeed, more has been done—at least on rare occasion; Brath has cataloged and analyzed instances where other type attributes have been exploited to encode additional information in visualizations. An oft-overlooked example, he said, is cartography: subtle changes in spacing, capitalization, and font weight are used to indicate many distinct levels of map features. The reader may not consciously recognize it, but the variations give cues as to which neighboring text labels correspond to which map features. Some maps even incorporate multiple types of underline and reverse italics in addition to regular italics (two features that are quite uncommon elsewhere).

Brath also showed several historical charts and diagrams (some dating back to the 18th Century) that used typographic features to encode information. Royal family trees, for example, would sometimes vary the weight, slant, and style of names to indicate the pedigree and status of various family members. A far more modern example of signifying information with font attributes, he said, can be seen in code editors, where developers take it for granted that syntax highlighting will distinguish between symbols, operators, and structures—hopefully without adversely impacting readability.

On the whole, though, usage of these techniques is limited to specific niches. Brath set out to catalog the typographic features that were employed, then to try to apply them to entirely new data-visualization scenarios. The set of features available for encoding information included standard properties like weight and slant, plus capitalization, x-height, width (i.e., condensed through extended), spacing, serif style, stroke contrast, underline, and the choice of typeface itself. Naturally, some of those attributes map well to quantitative data (such as weight, which can be varied continuously throughout a range), while others would only be useful for encoding categorical information (such as whether letters are slanted or upright).

He then began creating and testing a variety of visualizations in which he would encode information by varying some of the font attributes. Perhaps the most straightforward example was the "text-skimming" technique: a preprocessor varies the weight of individual words in a document based on their overall frequency in the language used. Unusual words are bolder, common words are lighter, with several gradations incorporated. Articles and pronouns can even be put into italics to further differentiate them from the more critical parts of the text. The result is a paragraph that, in user tests, readers can skim through at significantly higher speed; it is somewhat akin to overlaying a word-frequency cloud on top of the text itself.
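The preprocessing step can be sketched in a few lines; the frequency table and weight gradations below are invented for illustration (Brath's actual breakpoints are not specified in the talk):

```python
# Toy word-frequency table: 5 = very common, 1 = rare (invented data).
FREQ = {"the": 5, "a": 5, "is": 5, "memory": 2, "photonics": 1}

def css_weight(word):
    """Map a word's commonness to a CSS font weight: rarer = bolder."""
    common = FREQ.get(word.lower(), 1)  # unknown words treated as rare
    return {5: 300, 4: 400, 3: 500, 2: 600, 1: 700}[common]

def skimmable_html(text):
    return " ".join(
        f'<span style="font-weight:{css_weight(w)}">{w}</span>'
        for w in text.split())

print(skimmable_html("the memory is photonics"))
```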

[Movie review visualization]

A bit further afield was Brath's example of encoding numeric data linearly in a line of text. He took movie reviews from the Rotten Tomatoes web site and used each reviewer's numeric rating as the percentage of the text rendered in bold. The result, when all of the reviews for a particular film are placed together, effectively maps a histogram of the reviews onto the reviews themselves. In tests, he said, participants typically found it easier to extract information from this form than from Rotten Tomatoes's default layout, which places small numbers next to review quotes in a grid, intermingled with various images.

He also showed examples of visualization techniques that varied multiple font attributes to encode more than one variable. The first was a response to limitations of choropleth maps—maps where countries or other regions are colored (or shaded) to indicate a particular score on some numeric scale. While choropleths work fine for single variables, it is hard to successfully encode multiple variables using the technique, and looking back and forth between multiple single-variable choropleth maps makes it difficult for the reader to notice any correlations between them.

[Map data visualization]

Brath's technique encoded three variables (health expenditure as a percentage of GDP, life expectancy, and prevalence of HIV) into three font attributes (weight, case, and slant), using the three-letter ISO country codes as the text for each nation on the map. The result makes it easier to zero in on particular combinations of the variables (for example, countries with high health expenditures and short life expectancies) or, at least, easier than flipping back and forth between three maps.

His final example of multi-variable encoding used x-height and font width to encode musical notation into text. The use case presented was how to differentiate singing from prose within a book. Typically, the only typographic technique employed in a book is to offset the sung portion of the text and set it in italics. Brath, instead, tested varying the heights of the letters to indicate note pitch and the widths to indicate note duration.

The reaction to this technique from the audience at TypeCon was, to say the least, mixed. While it is clear that the song text encodes some form of rhythmic changes and variable intensity, it does not map easily to notes, and the rendered text is not exactly easy to look at. Brath called it a work in progress; his research is far from over.

He ended the session by encouraging audience members to visit his research blog and take the latest survey to test the impact of some of the visualization techniques firsthand. He also posed several questions to the crowd, such as why there are many font families that come in a variety of weights, but essentially none that offer multiple x-height options or italics at multiple angles of slant.

Brath's blog makes for interesting reading for anyone concerned with data visualizations or text. He often explores practical issues—for example, how overuse of color can negatively impact text legibility, which could have implications for code markup tools, or the difficulties to overcome when trying to slant text at multiple angles. Programmers, who spend much of their time staring at text, are no doubt already familiar with many ways in which typographic features can encode supplementary information (in this day and age, who does not associate a hyperlink closely with an underline, after all?). But there are certainly still many places where the attributes of text might be used to make data easier to find or understand.


Topics from the LLVM microconference

By Jake Edge
August 26, 2015

Linux Plumbers Conference

A persistent theme throughout the LLVM microconference at the 2015 Linux Plumbers Conference was that of "breaking the monopoly" of GCC, the GNU C library (glibc), and other tools that are relied upon for building Linux systems. One could quibble with the "monopoly" term, since it is self-imposed and not being forced from the outside, but the general idea is clear: using multiple tools to build our software will help us in numerous ways.

Kernel and Clang

Most of the microconference was presentation-oriented, with relatively little discussion. Jan-Simon Möller kicked things off with a status report on the efforts to build a Linux kernel using LLVM's Clang C compiler. The number of patches needed for building the kernel has dropped from around 50 to 22 "small patches", he said. Most of those are in the kernel build system or are for little quirks in driver code. Of those, roughly two-thirds can likely be merged upstream, while the others are "ugly hacks" that will probably stay in the LLVM-Linux tree.

There are currently five patches needed in order to build a kernel for the x86 architecture. Two of those are for problems building the crypto code (the AES_NI assembly code will not build with the LLVM integrated assembler and there are longstanding problems with variable-length arrays in structures). The integrated assembler also has difficulty handling some "assembly" code that is used by the kernel build system to calculate offsets; GCC sees it as a string, but the integrated assembler tries to actually assemble it.

The goal of building an "allyesconfig" kernel has not yet been realized, but a default configuration (defconfig) can be built using the most recent Git versions of LLVM and Clang. It currently requires disabling the integrated assembler for the entire build, but the goal is to disable it just for the files that need it.

Other architectures (including ARM for the Raspberry Pi 2) can be built using roughly half-a-dozen patches per architecture, Möller said. James Bottomley was concerned about the "Balkanization" of kernel builds once Linus Torvalds and others start using Clang for their builds; obsolete architectures and those not supported by LLVM may stop building altogether, he said. But microconference lead Behan Webster thought that to be an unlikely outcome. Red Hat and others will always build their kernels using GCC, he said, so that will be supported for quite a long time, if not forever.

Using multiple compilers

Kostya Serebryany is a member of the "dynamic testing tools" team at Google, which has the goal of providing tools for the C++ developers at the company to find bugs without any help from the team. He was also one of the proponents of the "monopoly" term for GCC, since it is used to build the kernel, glibc, and all of the distribution binaries. But, he said, making all of that code buildable using other compilers will allow various other tools to also be run on the code.

For example, the AddressSanitizer (ASan) can be used to detect memory errors such as stack overflow, use after free, using stack memory after a function has returned, and so on. Likewise, ThreadSanitizer (TSan), MemorySanitizer (MSan), and UndefinedBehaviorSanitizer (UBSan) can find various kinds of problems in C and C++ code. But all are based on Clang and LLVM, so only code that can be built with that compiler suite can be sanitized using these tools.

GCC already has some similar tools and the Linux kernel has added some as well (the kernel address sanitizer, for example), which have found various bugs, including quite a few security bugs. GCC's support has largely come about because of the competition with LLVM and still falls short in some areas, he said.

There are also techniques that go beyond "best effort" tools like the sanitizers. For example, fuzzing and hardening can be used to either find more bugs or eliminate certain classes of bugs. Coverage-guided fuzzing, he said, can be used to home in on problem areas in the code; LLVM's LibFuzzer can perform that kind of fuzzing. He noted that the Heartbleed bug can be "found" using LibFuzzer in roughly five seconds on his laptop.
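The basic idea behind fuzzing can be illustrated with a toy random fuzzer (this is an illustration of the concept, not LibFuzzer itself, which additionally mutates inputs based on the code paths each one exercises, finding bugs far faster than blind randomness):

```python
import random

# A "parser" with a hidden bug on one header value: the kind of crash
# a fuzzer is meant to discover.
def buggy_parse(data):
    if len(data) > 3 and data[0] == 0x42:
        raise RuntimeError("crash: unchecked header path")
    return len(data)

# Blind fuzz loop: throw random inputs at the parser until one crashes.
random.seed(0)  # deterministic for the sake of the example
for attempt in range(10000):
    sample = bytes(random.randrange(256) for _ in range(8))
    try:
        buggy_parse(sample)
    except RuntimeError:
        print(f"crasher found after {attempt + 1} attempts")
        break
```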

Two useful hardening techniques are also available with LLVM: control flow integrity (CFI) and SafeStack. CFI will abort the program when it detects certain kinds of undesired behavior—for example that the virtual function table for a program has been altered. SafeStack protects against stack overflows by placing local variables on a separate stack. That way, the return address and any variables are not contiguous in memory.

Serebryany said that it was up to the community to break the monopoly. He was not suggesting simply switching to using LLVM exclusively, but to ensuring that the kernel, glibc, and distributions all could be built with it. Furthermore, he said that continuous integration should be set up so that all of these pieces can always be built with both compilers. When other compilers arrive, they should also be added into the mix.

To that end, Webster asked if Google could help getting the kernel patches needed to build with Clang upstream. Serebryany said that he thought that, by showing some of the advantages of being able to build with Clang (such as the fuzzing support), Google might be able to help get those patches merged.

BPF and LLVM

The "Berkeley Packet Filter" (BPF) language has expanded its role greatly over the years, moving from simply being used for packet filtering to now providing the in-kernel virtual machine for security (seccomp), tracing, and more. Alexei Starovoitov has been the driving force behind extending the BPF language (into eBPF) as well as expanding its scope in the kernel. LLVM can be used to compile eBPF programs for use by the kernel, so Starovoitov presented about the language and its uses at the microconference.

He began by noting wryly that he "works for Linus Torvalds" (in the same sense that all kernel developers do). He merged his first patches into GCC some fifteen years ago, but he has "gone over to Clang" in recent years.

The eBPF language is supported by both GCC and LLVM using backends that he wrote. He noted that the GCC backend is half the size of the LLVM version, but that the latter took much less time to write. "My vote goes to LLVM for the simplicity of the compiler", he said. The LLVM-BPF backend has been used to demonstrate how to write a backend for the compiler. It is now part of LLVM stable and will be released as part of LLVM 3.7.

GCC is built for a single backend, so you have to specifically create a BPF version, but LLVM has all of its backends available using command-line arguments (--target bpf). LLVM also has an integrated assembler that can take the C code describing the BPF and turn it into in-memory BPF bytecode that can be loaded into the kernel.

BPF for tracing is currently a hot area, Starovoitov said. It is a better alternative to SystemTap and runs two to three times faster than Oracle's DTrace. Part of that speed comes from LLVM's optimizations plus the kernel's internal just-in-time compiler for BPF bytecode.

Another interesting tool is the BPF Compiler Collection (BCC). It makes it easy to write and run BPF programs by embedding them into Python (either directly as strings in the Python program or by loading them from a C file). Underneath the Python "bpf" module is LLVM, which compiles the program before the Python code loads it into the kernel. A simple printk() can easily be added into the kernel without recompiling it (or rebooting). He noted that Brendan Gregg has added a bunch of example tools to show how to use the C+Python framework.

Under the covers, the framework uses libbpfprog, which compiles a C source file into BPF bytecode using Clang/LLVM. It can also load the bytecode and any BPF maps into the kernel using the bpf() system call and attach the program(s) to various types of hooks (e.g. kprobes, tc classifiers/actions). The Python bpf module simply provides bindings for the library.

The presentation was replete with examples, which are available in the slides [PDF] as well.

Alternatives for the core

There was a fair amount of overlap between the last two sessions I was able to sit in on. Both Bernhard Rosenkraenzer and Khem Raj were interested in replacing more than just the compiler in building a Linux system. Traditionally, building a Linux system starts with GCC, glibc, and binutils, but there are now alternatives to those. How much of a Linux system can be built using those alternatives?

Some parts of binutils are still needed, Rosenkraenzer said. The binutils gold linker can be used instead of the traditional ld. (Other linker options were presented in Mark Charlebois's final session of the microconference, which I unfortunately had to miss.) The gas assembler from binutils can be replaced with Clang's integrated assembler for the most part, but there are still non-standard assembly constructs that require gas.

Tools like nm, ar, ranlib, and others will need to be made to understand three different formats: regular object files, LLVM bytecode, and the GCC intermediate representation. Rosenkraenzer showed a shell-script wrapper that could be used to add this support to various utilities.
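The dispatch logic such a wrapper needs can be sketched by peeking at a file's magic bytes; the ELF and LLVM bitcode magic values below are the real ones, while the demo file is of course contrived (GCC's intermediate representation for LTO lives inside ELF sections, so distinguishing it requires a deeper look than this):

```python
import os
import tempfile

def object_kind(path):
    """Classify an object file by its magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"\x7fELF":
        return "elf"            # plain binutils tools suffice
    if magic == b"BC\xc0\xde":
        return "llvm-bitcode"   # dispatch to llvm-nm, llvm-ar, etc.
    return "unknown"

# Demo on a fake bitcode file:
demo = os.path.join(tempfile.mkdtemp(), "demo.bc")
with open(demo, "wb") as f:
    f.write(b"BC\xc0\xde" + b"\0" * 16)
print(object_kind(demo))  # llvm-bitcode
```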

For the most part, GCC can be replaced by Clang. OpenMandriva switched to Clang as its primary compiler in 2014, and the soon-to-be-released OpenMandriva 3 is almost entirely built with Clang 3.7, though some packages are still built with gcc or g++. OpenMandriva still needed to build GCC, however, to get needed libraries such as libgcc, libatomic, and others (including, possibly, libstdc++).

The GCC compatibility claimed by Clang is too conservative, Rosenkraenzer said. The __GNUC__ macro definition in Clang is set to 4.2.1, but switching that to 4.9 produces better code. There are two likely reasons why Clang chose 4.2.1, and they are related: 4.2.1 was the last GPLv2 release of GCC, so some people may not be allowed to look at later versions; in addition, GCC 4.2.1 was the last version that was used to build the BSD portions of OS X.

There is a whole list of GCC-isms that should be avoided for compatibility with Clang. Rosenkraenzer's slides [PDF] list many of them. He noted that there have been a number of bugs found via Clang warnings or errors when building various programs—GCC did not complain about those problems.

Another "monopoly component" that one might want to replace would be glibc. The musl libc alternative is viable, but only if binary compatibility with other distributions is not required. But musl cannot be built with Clang, at least yet.

Replacing GCC's libstdc++ with LLVM's libc++ is possible but, again, binary compatibility is sacrificed. That is a bigger problem than it is for musl, though, Rosenkraenzer said. Using both is possible, but there are problems when libraries (e.g. Qt) are linked to, say, libc++ and a binary-only Qt program uses libstdc++, which leads to crashes. libc++ is roughly half the size of libstdc++, however, so environments like Android (which never used libstdc++) are making the switch.

Cross-compiling under LLVM/Clang is easier since all of the backends are present and compilers for each new target do not need to be built. There is still a need to build the cross-toolchains, though, for binutils, libatomic, and so on. Rosenkraenzer has been working on a tool to do automated bootstrapping of the toolchain and core system.

Conclusion

It seems clear that use of LLVM within Linux is growing and that growth is having a positive effect. The competition with GCC is helping both to become better compilers, while building our tools with both is finding bugs in critical components like the kernel. Whether it is called "breaking the monopoly" or "diversifying the build choices", this trend is making beneficial changes to our ecosystem.

[I would like to thank the Linux Plumbers Conference organizing committee for travel assistance to Seattle for LPC.]


Reviving the Hershey fonts

By Nathan Willis
August 26, 2015

TypeCon

At the 2015 edition of TypeCon in Denver, Adobe's Frank Grießhammer presented his work reviving the famous Hershey fonts from the Mid-Century era of computing. The original fonts were tailor-made for early vector-based output devices but, although they have retained a loyal following (often as a historical curiosity), they have never before been produced as an installable digital font.

Grießhammer started his talk by acknowledging his growing reputation for obscure topics—in 2013, he presented a tool for rapid generation of the Unicode box-drawing characters—but argued that the Hershey fonts were overdue for proper recognition. He first became interested in the fonts and their peculiar history in 2014, when he was surprised to find a well-designed commercial font that used only straight line segments for its outlines. The references indicated that this choice was inspired by the Hershey fonts, which led Grießhammer to dig into the topic further.

[Frank Grießhammer]

The fonts are named for their creator, Allen V. Hershey (1910–2004), a physicist working at the US Naval Weapons Laboratory in the 1960s. At that time, the laboratory used one of the era's most advanced computers, the IBM Naval Ordnance Research Calculator (NORC), a vacuum-tube and magnetic-tape based machine. NORC's output was provided by the General Dynamics S-C 4020, which could either plot on a CRT display or directly onto microfilm. It was groundbreaking for the time, since the S-C 4020 could plot diagrams and charts directly, rather than simply outputting tables that had to be hand-drawn by draftsmen after the fact.

By default, the S-C 4020 would output text by projecting light through a set of letter stencils, but Hershey evidently saw untapped potential in the S-C 4020's plotting capabilities. Using the plotting functions, he designed a set of high-quality Latin fonts (both upright and italics), followed by Greek, a full set of mathematical and technical symbols, blackletter and Lombardic letterforms, and an extensive set of Japanese glyphs—around 2,300 characters in total. Befitting the S-C 4020's plotting capabilities, the letters were formed entirely by straight line segments.

The format used to store the coordinates of the curves is, to say the least, unusual. Each coordinate point is stored as a pair of ASCII letters, where the numeric value of each letter is found by taking its offset from the letter R. That is, "S" has a value of +1, while "L" has a value of -6. The points are plotted with the origin in the center of the drawing area, with x increasing to the right and y increasing downward.
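The decoding rule can be sketched in a few lines of Python (an illustration of the scheme described above, not any particular project's parser):

```python
def decode_hershey_pair(pair):
    """Decode one R-relative Hershey coordinate pair, e.g. "SL" -> (1, -6).

    Each letter's value is its ASCII offset from 'R'; the first letter is
    x (increasing rightward), the second is y (increasing downward), with
    the origin at the center of the drawing area.
    """
    x, y = (ord(c) - ord('R') for c in pair)
    return x, y

print(decode_hershey_pair("SL"))  # (1, -6)
print(decode_hershey_pair("RR"))  # (0, 0) -- the origin
```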

[Hershey font sample]

Typographically, Hershey's designs were commendable; he drew his characters based on historical samples, implemented his own ligatures, and even created multiple optical sizes. Hershey then proceeded to develop four separate styles that each used different numbers of strokes (named "simplex," "duplex," "complex," and "triplex").

The project probably makes Hershey the inventor of "desktop publishing" if not "digital type" itself, Grießhammer said, but Hershey himself is all but forgotten. There is scant information about him online; Grießhammer has still not even been able to locate a photograph (although, he added, Hershey may be one of the unnamed individuals seen in group shots of the NORC room, which can be found online).

Hershey's vector font set has lived on as a subject for computing enthusiasts, however. The source files are in the public domain (a copy of the surviving documents is available from the Ghostscript project, for example) and there are a number of software projects online that can read their peculiar format and reproduce the shapes. At his GitHub page, Grießhammer has links to several of them, such as Kamal Mostafa's libhersheyfont. Inkscape users may also be familiar with the Hershey Text extension, which can generate SVG paths based on a subset of the Hershey fonts. In that form, the paths are suitable for use with various plotters, laser-cutters, or CNC mills; the extension was developed by Evil Mad Scientist Laboratories for use with such devices.

Nevertheless, there has never been an implementation of the designs in PostScript, TrueType, or OpenType format, so they cannot be used to render text in standard widgets or elements. Consequently, Grießhammer set out to create his own. He wrote a script to convert the original vector instructions into Bézier paths in UFO format, then had to associate the resulting shapes with the correct Unicode codepoints—Hershey's work having predated Unicode by decades.

The result is not quite ready for release, he said. Hershey's designs are zero-width paths, which makes sense for drawing with a CRT, but is not how modern outline fonts work. To be usable in TrueType or OpenType form, each line segment needs to be traced in outline form to make a thin rectangle. That can be done, he reported, but he is still working out what outlining options create the most useful final product. The UFO files, though, can be used to create either TrueType or OpenType fonts.
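The outlining step described above — tracing each zero-width segment as a thin rectangle — amounts to offsetting the segment's endpoints along its unit normal. A minimal geometric sketch (a hypothetical helper, not Grießhammer's actual conversion script):

```python
import math

def stroke_segment(p0, p1, width):
    """Return the four corners of a thin rectangle tracing segment p0->p1.

    Offsets both endpoints by width/2 along the segment's unit normal,
    which is the basic operation behind turning a zero-width stroke into
    a fillable outline.
    """
    (x0, y0), (x1, y1) = p0, p1
    dx, dy = x1 - x0, y1 - y0
    length = math.hypot(dx, dy)
    nx, ny = -dy / length, dx / length   # unit normal to the segment
    h = width / 2
    return [(x0 + nx * h, y0 + ny * h),
            (x1 + nx * h, y1 + ny * h),
            (x1 - nx * h, y1 - ny * h),
            (x0 - nx * h, y0 - ny * h)]

# A horizontal segment stroked at width 2 becomes a 10x2 rectangle.
print(stroke_segment((0, 0), (10, 0), 2.0))
```

A real converter also has to join consecutive segments and cap stroke ends cleanly, which is presumably where the "outlining options" Grießhammer mentions come in.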

When finished, Grießhammer said, he plans to release the project under an open source license at github.com/adobe-fonts/hershey. He hopes that it will not only be useful, but will also bring some more attention to Hershey himself and his contribution to modern digital publishing.

Comments (7 posted)

Page editor: Jonathan Corbet

Security

Nested NMIs lead to CVE-2015-3290

By Jake Edge
August 26, 2015

Non-maskable interrupts (or NMIs) are a hardware feature that is typically used to signal hardware errors or other unrecoverable faults. They differ from regular interrupts in that they can occur when interrupts are otherwise blocked (i.e. they are not maskable). NMIs can be caused by user-space programs, though, so their handling in the kernel needs to be bulletproof or it can lead to security holes. Since the beginning of 2014, it would seem that NMI handling has been subject to races that allow user-space programs to elevate their privileges—a bug that is known as CVE-2015-3290.

NMIs are used by profiling and debugging tools, such as perf, to determine where in the code the CPU is currently executing. NMIs also get nested, effectively, when an NMI handler causes an exception like a breakpoint or a page fault. Handling that nesting is complicated, which is what went astray and led to the bug.

The first notification about the problem came in a July 22 post to the oss-security mailing list from Andy Lutomirski about a number of NMI-handling security bugs. All are security-related, but one was embargoed to allow distributions to fix it before releasing any details. So he mentioned CVE-2015-3290 without giving any details, though he did include something of a warning: "*Patch your systems*".

The details came in a post-embargo advisory from Lutomirski on August 4. He described the problem in some detail and also provided a proof-of-concept program to tickle the bug. It requires that user space be able to do two things: arrange for nested NMIs to occur and for those NMIs to return to a 16-bit stack, which is normally set up for running 16-bit binaries using programs like DOSEMU. A 16-bit stack can be arranged via the modify_ldt() system call. One way to generate the required NMIs is to run under a heavy perf load, as the proof-of-concept exploit suggests.

The Linux nested-NMI handling relies on a small section of code that needs to be run atomically. That works fine on x86_64 when using iret to return to a 64-bit stack (which effectively does the needed steps in an atomic manner), but when the NMI is returning to a segment with a 16-bit stack, iret does not restore the register state correctly. So the kernel has a workaround (called "espfix64") that tries to handle that situation by doing a complicated stack-switching dance.

That stack switching is where the problem lies. There are approximately 19 instructions during which a second (i.e. nested) NMI will cause the "atomic" section to not be atomic. Furthermore, an attacker who can arrange (or luck into) landing in a two-instruction window within that region will be able to reliably elevate their privileges to those of root. During that window, the attacker controls the address to which the return from interrupt will go. Outside of the window, a nested NMI will cause various failures and crashes, which Lutomirski's exploit fixes up while it waits for one to hit the window:

A careful exploit (attached) can recover from all the crashy failures and can regenerate a valid *privileged* state if a nested NMI occurs during the two-instruction window. This exploit appears to work reasonably quickly across a fairly wide range of Linux versions.

The espfix64 code was added in Linux 3.13, which was released over a year and a half ago in January 2014. Given that Lutomirski's proof of concept works quickly, that means there are (or, hopefully, were) a lot of systems that could be easily affected by this flaw.

The fix uses a "sneaky trick", according to Lutomirski. Instead of checking the value of the 64-bit stack pointer register (i.e. RSP) to see if it points at the NMI stack to determine if there is a nested NMI, a different test is used. As he pointed out, malicious user-space code could point RSP there, issue a system call, then cause an NMI to happen, which would fool the kernel into believing it was processing a nested NMI when it wasn't.

Lutomirski uses the fact that the "direction flag" (DF) bit in the FLAGS register is atomically reset by the iret instruction, so he sets that bit to indicate that the kernel is processing an NMI. His fix also changes the system-call entry point so that a user-space program cannot set DF while it still controls the value of RSP.

CVE-2015-3290 and the rest of the NMI-handling problems that Lutomirski has found seem a little concerning, overall. NMIs are complex beasties and their handling even more so. It would be surprising if there are not other problems lurking there. But, for now, taking Lutomirski's advice should be high on everyone's list.

Comments (5 posted)

Brief items

Security quotes of the fortnight

Google has been ordered by the [UK] Information Commissioner’s office to remove nine links to current news stories about older reports which themselves were removed from search results under the ‘right to be forgotten’ ruling.

The search engine had previously removed links relating to a 10 year-old criminal offence by an individual after requests made under the right to be forgotten ruling. Removal of those links from Google’s search results for the claimant’s name spurred new news posts detailing the removals, which were then indexed by Google’s search engine.

Google refused to remove links to these later news posts, which included details of the original criminal offence, despite them forming part of search results for the claimant’s name, arguing that they are an essential part of a recent news story and in the public interest.

The Guardian

GOP presidential candidate Donald Trump immediately called FOX News to say that the EU's actions are a crude start but adding that, "When I'm president you're going to have a really wonderful censorship system here in the USA. It's going to make those Russian and European systems look like stupid, ugly women. You're going to forget there ever were mass arrests and deportations here. I know how to do censorship. You're going to love the Trump censorship system!"

An EU spokesperson noted that upon finalization of this global RTBF [right to be forgotten] censorship order, all search and other references to articles, stories, or other materials describing this order, including this posting, would be retroactively deleted.

Lauren Weinstein

The Snake Oil Competition (SOC) is an effort organized to identify new craptographic schemes in order to improve on the state-of-the-art, and to encourage the use of snake oil cryptography. Snake Oil cryptography is widely used in practice, but recent events show that more research is urgently needed to fill much needed gaps in the field.

The winner(s) will be invited to a special edition of the Journal of Craptology (JoC). The first prize is a bottle of premium snake oil, and 100 trillion ZWR (Third Zimbabwean Dollar), equivalent to 10^27 ZWD (First Zimbabwean Dollar). The loser will also be invited to the JoC.

snakeoil.cr.yp.to committee

Not just terrorists, but terrorists with a submarine! This is why Ft. Leavenworth, a prison from which no one has ever escaped, is unsuitable for housing Guantanamo detainees.

I've never understood the argument that terrorists are too dangerous to house in US prisons. They're just terrorists, it's not like they're Magneto.

Bruce Schneier reacts to a movie plot threat promulgated by a Kansas senator

TL;DR: doing RSA crypto with a public exponent value of "1" makes crypto very fast. Fast is not always good.
Kurt Seifried
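Seifried's point is easy to demonstrate: with a public exponent of 1, RSA "encryption" is the identity function, since c = m^1 mod n = m for any message smaller than the modulus. A toy illustration in Python (tiny, insecure numbers chosen purely for readability):

```python
p, q = 61, 53          # toy primes -- never anywhere near real key sizes
n = p * q              # modulus
e = 1                  # the broken public exponent

message = 42
ciphertext = pow(message, e, n)   # m**1 mod n
print(ciphertext)                 # 42 -- the "ciphertext" is the plaintext
```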

Comments (none posted)

Stagefright: Mission Accomplished? (Exodus Intelligence)

It would seem that reports of the demise of the Stagefright Android vulnerability may be rather premature. Exodus Intelligence is reporting that at least one of the fixes for integer overflow did not actually fully fix the problem, so MPEG4 files can still crash Android and potentially allow code execution. "Around July 31st, Exodus Intelligence security researcher Jordan Gruskovnjak noticed that there seemed to be a severe problem with the proposed patch. As the code was not yet shipped to Android devices, we had no ability to verify this authoritatively. In the following week, hackers converged in Las Vegas for the annual Black Hat conference during which the Stagefright vulnerability received much attention, both during the talk and at the various parties and events. After the festivities concluded and the supposedly patched firmware was released to the public, Jordan proceeded to investigate whether his assumptions regarding its fallibility were well founded. They were."
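The general failure mode here — an overflow check that does not actually cover the arithmetic that follows — is a common bug class. A schematic sketch (simulating 32-bit C arithmetic in Python; this is not the actual Stagefright code or patch):

```python
MASK32 = 0xFFFFFFFF  # simulate C uint32_t wraparound

def broken_size_check(chunk_size, extra):
    """Validates chunk_size alone; the later addition can still wrap to a
    small value, leading to an undersized buffer allocation."""
    assert chunk_size <= MASK32
    return (chunk_size + extra) & MASK32   # may silently wrap

def fixed_size_check(chunk_size, extra):
    """Detects wraparound by comparing the wrapped sum to an operand
    (assuming extra also fits in 32 bits)."""
    total = (chunk_size + extra) & MASK32
    if total < chunk_size:                 # the sum wrapped: overflow
        raise OverflowError("size overflow")
    return total

print(broken_size_check(0xFFFFFFF0, 0x20))  # 16 -- silently tiny
```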

Comments (37 posted)

Ruoho: Multiple Vulnerabilities in Pocket

On his blog, Clint Ruoho reports on multiple vulnerabilities he found in the Pocket service that saves articles and other web content for reading later on a variety of devices. Pocket integration has been controversially added to Firefox recently, which is what drew his attention to the service. "The full output from server-status then was synced to my Android, and was visible when I switched from web to article view. Apache’s mod_status can provide a great deal of useful information, such as internal source and destination IP address, parameters of URLs currently being requested, and query parameters. For Pocket’s app, the URLs being requested include URLs being viewed by users of the Pocket application, as some of these requests are done as HTTP GETs. These details can be omitted by disabling ExtendedStatus in Apache. Most of Pocket’s backend servers had ExtendedStatus disabled, however it remained enabled on a small subset, which would provide meaningful information to attackers." He was able to get more information, such as the contents of /etc/passwd on Pocket's Amazon EC2 servers. (Thanks to Scott Bronson and Pete Flugstad.)

Comments (30 posted)

Reports from the Linux Security Summit

The Linux Security Summit was held August 20-21 in Seattle, Washington. Unfortunately, that overlapped Linux Plumbers Conference, so LWN was unable to attend. But both James Morris and Paul Moore have nice writeups of the summit. From Morris's: "As with the previous year, we followed a two-day format, with most of the refereed presentations on the first day, with more of a developer focus on the second day. We had good attendance, and also this year had participants from a wider field than the more typical kernel security developer group. We hope to continue expanding the scope of participation next year, as it’s a good opportunity for people from different areas of security, and FOSS, to get together and learn from each other. This was the first year, for example, that we had a presentation on Incident Response, thanks to Sean Gillespie who presented on GRR, a live remote forensics tool initially developed at Google."

Comments (none posted)

New vulnerabilities

audit: unsafe escape-sequence handling

Package(s):audit CVE #(s):CVE-2015-5186
Created:August 19, 2015 Updated:August 31, 2015
Description:

From the CVE entry:

When auditing the filesystem the names of files are logged. These filenames can contain escape sequences, when viewed using the ausearch programs "-i" option for example this can result in the escape sequences being processed unsafely by the terminal program being used to view the data.

Alerts:
Fedora FEDORA-2015-13526 audit 2015-08-19
Fedora FEDORA-2015-13471 audit 2015-08-19
Mageia MGASA-2015-0333 audit 2015-08-30

Comments (none posted)

conntrack: denial of service

Package(s):conntrack CVE #(s):CVE-2015-6496
Created:August 20, 2015 Updated:January 4, 2016
Description: From the Debian advisory:

It was discovered that in certain configurations, if the relevant conntrack kernel module is not loaded, conntrackd will crash when handling DCCP, SCTP or ICMPv6 packets.

Alerts:
Fedora FEDORA-2015-1aee5e6f0b conntrack-tools 2016-01-03
Fedora FEDORA-2015-5eb2131441 conntrack-tools 2016-01-03
openSUSE openSUSE-SU-2015:1688-1 conntrack-tools 2015-10-06
Debian-LTS DLA-295-1 conntrack 2015-08-19
Debian DSA-3341-1 conntrack 2015-08-20
Mageia MGASA-2015-0363 conntrack-tools 2015-09-13

Comments (none posted)

extplorer: cross-site scripting

Package(s):extplorer CVE #(s):CVE-2015-0896
Created:August 24, 2015 Updated:May 4, 2016
Description: From the CVE entry:

Multiple cross-site scripting (XSS) vulnerabilities in eXtplorer before 2.1.7 allow remote attackers to inject arbitrary web script or HTML via unspecified vectors.

Alerts:
Debian-LTS DLA-453-1 extplorer 2016-05-03
Debian-LTS DLA-296-1 extplorer 2015-08-21

Comments (none posted)

firefox: multiple vulnerabilities

Package(s):firefox CVE #(s):CVE-2015-4473 CVE-2015-4474 CVE-2015-4475 CVE-2015-4477 CVE-2015-4478 CVE-2015-4479 CVE-2015-4480 CVE-2015-4481 CVE-2015-4482 CVE-2015-4483 CVE-2015-4484 CVE-2015-4485 CVE-2015-4486 CVE-2015-4487 CVE-2015-4488 CVE-2015-4489 CVE-2015-4490 CVE-2015-4491 CVE-2015-4492 CVE-2015-4493 CVE-2015-4495
Created:August 17, 2015 Updated:December 2, 2015
Description:
     * MFSA 2015-79/CVE-2015-4473/CVE-2015-4474 Miscellaneous memory safety
       hazards
     * MFSA 2015-80/CVE-2015-4475 (bmo#1175396) Out-of-bounds read with
       malformed MP3 file
     * MFSA 2015-81/CVE-2015-4477 (bmo#1179484) Use-after-free in MediaStream
       playback
     * MFSA 2015-82/CVE-2015-4478 (bmo#1105914) Redefinition of
       non-configurable JavaScript object properties
     * MFSA 2015-83/CVE-2015-4479/CVE-2015-4480/CVE-2015-4493 Overflow issues
       in libstagefright
     * MFSA 2015-84/CVE-2015-4481 (bmo#1171518) Arbitrary file overwriting
       through Mozilla Maintenance Service with hard links (only affected
       Windows)
     * MFSA 2015-85/CVE-2015-4482 (bmo#1184500) Out-of-bounds write with
       Updater and malicious MAR file (does not affect openSUSE RPM packages
       which do not ship the updater)
     * MFSA 2015-86/CVE-2015-4483 (bmo#1148732) Feed protocol with POST
       bypasses mixed content protections
     * MFSA 2015-87/CVE-2015-4484 (bmo#1171540) Crash when using shared
       memory in JavaScript
     * MFSA 2015-88/CVE-2015-4491 (bmo#1184009) Heap overflow in gdk-pixbuf
       when scaling bitmap images
     * MFSA 2015-89/CVE-2015-4485/CVE-2015-4486 (bmo#1177948, bmo#1178148)
       Buffer overflows on Libvpx when decoding WebM video
     * MFSA 2015-90/CVE-2015-4487/CVE-2015-4488/CVE-2015-4489 Vulnerabilities
       found through code inspection
     * MFSA 2015-91/CVE-2015-4490 (bmo#1086999) Mozilla Content Security
       Policy allows for asterisk wildcards in violation of CSP specification
     * MFSA 2015-92/CVE-2015-4492 (bmo#1185820) Use-after-free in
       XMLHttpRequest with shared workers
Alerts:
Gentoo 201512-10 firefox 2015-12-30
Gentoo 201605-06 nss 2016-05-31
openSUSE openSUSE-SU-2016:0876-1 thunderbird 2016-03-24
Mageia MGASA-2016-0105 firefox 2016-03-09
Debian DSA-3410-1 icedove 2015-12-01
SUSE SUSE-SU-2015:2081-1 firefox 2015-11-23
Fedora FEDORA-2015-13436 firefox 2015-08-18
Slackware SSA:2015-226-01 firefox 2015-08-14
openSUSE openSUSE-SU-2015:1390-1 firefox 2015-08-14
Fedora FEDORA-2015-13397 firefox 2015-08-15
openSUSE openSUSE-SU-2015:1389-1 firefox 2015-08-14
CentOS CESA-2015:1682 thunderbird 2015-08-25
SUSE SUSE-SU-2015:1528-1 MozillaFirefox, mozilla-nss 2015-09-10
CentOS CESA-2015:1682 thunderbird 2015-08-25
openSUSE openSUSE-SU-2015:1454-1 thunderbird 2015-08-28
SUSE SUSE-SU-2015:1449-1 MozillaFirefox, mozilla-nss 2015-08-28
openSUSE openSUSE-SU-2015:1453-1 thunderbird 2015-08-28
CentOS CESA-2015:1682 thunderbird 2015-08-25

Comments (none posted)

golang: HTTP request smuggling

Package(s):golang CVE #(s):CVE-2015-5739 CVE-2015-5740 CVE-2015-5741
Created:August 18, 2015 Updated:July 28, 2016
Description: From the Red Hat bugzilla entry:

There have been found potentially exploitable flaws in Golang net/http library affecting versions 1.4.2 and 1.5.

Problems:
* Double Content-length headers in a request does not generate a 400 error, the second Content-length is ignored.
* Invalid headers are parsed as valid headers (like "Content Length:" with a space in the middle)
Exploitations:
In a situation where the net/http agent HTTP communication with the final http clients is using some reverse proxy (reverse proxy cache, SSL terminators, etc), some requests can be made exploiting the net/http HTTP protocol violations.

Alerts:
openSUSE openSUSE-SU-2016:1894-1 go 2016-07-27
Fedora FEDORA-2015-15618 golang 2015-10-01
Fedora FEDORA-2015-15619 golang 2015-10-01
Fedora FEDORA-2015-13002 golang 2015-08-18
Fedora FEDORA-2015-12957 golang 2015-08-18

Comments (none posted)

jasper: denial of service

Package(s):jasper CVE #(s):CVE-2015-5203
Created:August 26, 2015 Updated:September 19, 2016
Description: From the Arch Linux advisory:

A double free issue has been discovered in the function jasper_image_stop_load. This vulnerability can be triggered by loading a specially crafted image through jasper.

A remote attacker is able to send a specially crafted image that triggers a double free leading to denial of service.

Alerts:
openSUSE openSUSE-SU-2016:2737-1 jasper 2016-11-05
openSUSE openSUSE-SU-2016:2722-1 jasper 2016-11-04
Fedora FEDORA-2016-bbecf64af4 jasper 2016-09-21
Fedora FEDORA-2016-5a7e745a56 jasper 2016-09-18
Mageia MGASA-2016-0298 jasper 2016-09-16
Fedora FEDORA-2016-7776983633 jasper 2016-08-15
Arch Linux ASA-201612-9 jasper 2016-12-09
openSUSE openSUSE-SU-2016:2833-1 jasper 2016-11-17
Arch Linux ASA-201508-10 jasper 2015-08-26

Comments (none posted)

kdepim: no attachment encryption

Package(s):kdepim CVE #(s):CVE-2014-8878
Created:August 18, 2015 Updated:August 26, 2015
Description: From the Mageia advisory:

This update fixes a security vulnerability in kdepim : kmail doesn't encrypt attachments when "automatic encryption" is selected

Alerts:
Mageia MGASA-2015-0315 kdepim 2015-08-18

Comments (none posted)

libstruts1.2-java: unclear vulnerability

Package(s):libstruts1.2-java CVE #(s):CVE-2014-0899
Created:August 18, 2015 Updated:August 26, 2015
Description: From the Debian-LTS advisory:

The Validator in Apache Struts 1.1 and later contains a function to efficiently define rules for input validation across multiple pages during screen transitions. This function contains a vulnerability where input validation may be bypassed. When the Apache Struts 1 Validator is used, the web application may be vulnerable even when this function is not used explicitly.

Alerts: (No alerts in the database for this vulnerability)

Comments (none posted)

mediawiki: multiple vulnerabilities

Package(s):mediawiki CVE #(s):
Created:August 24, 2015 Updated:August 26, 2015
Description: From the Mediawiki advisory:

I would like to announce the release of MediaWiki 1.25.2, 1.24.3, and 1.23.10.

* Internal review discovered that Special:DeletedContributions did not properly protect the IP of autoblocked users. This fix makes the functionality of Special:DeletedContributions consistent with Special:Contributions and Special:BlockList.

* Internal review discovered that watchlist anti-csrf tokens were not being compared in constant time, which could allow various timing attacks. This could allow an attacker to modify a user's watchlist via csrf.

* John Menerick reported that MediaWiki's thumb.php failed to sanitize various error messages, resulting in xss.

Additionally, several extensions have been updated to fix security issues.

Alerts: (No alerts in the database for this vulnerability)

Comments (none posted)

mysql: unspecified vulnerability

Package(s):rh-mysql56-mysql CVE #(s):CVE-2015-4756
Created:August 17, 2015 Updated:August 26, 2015
Description: From the Red Hat advisory:

CVE-2015-4756 mysql: unspecified vulnerability related to Server:InnoDB

Alerts:
Gentoo 201610-06 mysql 2016-10-11
openSUSE openSUSE-SU-2015:1629-1 mysql-community-server 2015-09-25
Red Hat RHSA-2015:1646-01 rh-mariadb100-mariadb 2015-08-20
Red Hat RHSA-2015:1630-01 rh-mysql56-mysql 2015-08-17

Comments (none posted)

nagios-plugins: three vulnerabilities

Package(s):nagios-plugins CVE #(s):CVE-2014-4702 CVE-2014-4701 CVE-2014-4703
Created:August 18, 2015 Updated:August 26, 2015
Description: From a Red Hat bugzilla entry:

CVE-2014-4702: Similar to the CVE-2014-4701 issue in the check_dhcp plug-in, the same flaw was found to affect check_icmp. A local attacker could obtain sensitive information by using this flaw to read parts of INI configuration files that belong to the root user.

From another Red Hat bugzilla entry:

CVE-2014-4701, CVE-2014-4703: It was reported that check_dhcp plugin allow local unprivileged user to read parts of INI config files belonging to root on a local system. It could allow an attacker to obtain sensitive information like passwords that should only be accessible by root user. The vulnerability is due to check_dhcp plugin having Root SUID permissions and inappropriate access control when reading user provided config file (through --extra-opts= option).

Alerts:
Fedora FEDORA-2015-12987 nagios-plugins 2015-08-18
Fedora FEDORA-2015-12972 nagios-plugins 2015-08-18

Comments (none posted)

net-snmp: code execution

Package(s):net-snmp CVE #(s):CVE-2015-5621
Created:August 18, 2015 Updated:September 8, 2015
Description: From the Red Hat advisory:

It was discovered that the snmp_pdu_parse() function could leave incompletely parsed varBind variables in the list of variables. A remote, unauthenticated attacker could use this flaw to crash snmpd or, potentially, execute arbitrary code on the system with the privileges of the user running snmpd. (CVE-2015-5621)

Alerts:
Ubuntu USN-2711-1 net-snmp 2015-08-17
Scientific Linux SLSA-2015:1636-1 net-snmp 2015-08-17
Oracle ELSA-2015-1636 net-snmp 2015-08-17
Oracle ELSA-2015-1636 net-snmp 2015-08-17
Mandriva MDVSA-2015:229 net-snmp 2015-05-06
Mageia MGASA-2015-0187 net-snmp 2015-05-05
CentOS CESA-2015:1636 net-snmp 2015-08-17
CentOS CESA-2015:1636 net-snmp 2015-08-17
Red Hat RHSA-2015:1636-01 net-snmp 2015-08-17
openSUSE openSUSE-SU-2015:1502-1 net-snmp 2015-09-07

Comments (none posted)

openshift: privilege escalation

Package(s):openshift CVE #(s):CVE-2015-5222
Created:August 21, 2015 Updated:August 26, 2015
Description: From the Red Hat advisory:

An improper permission check issue was discovered in the server admission control component in OpenShift. A user with build permissions could use this flaw to execute arbitrary shell commands on a build pod with the privileges of the root user.

Alerts:
Red Hat RHSA-2015:1650-01 openshift 2015-08-20

Comments (none posted)

openssh: multiple vulnerabilities

Package(s):openssh CVE #(s):CVE-2015-6565 CVE-2015-6563 CVE-2015-6564
Created:August 19, 2015 Updated:August 26, 2015
Description:

From the OpenSSH release notes:

sshd(8): OpenSSH 6.8 and 6.9 incorrectly set TTYs to be world- writable. Local attackers may be able to write arbitrary messages to logged-in users, including terminal escape sequences. Reported by Nikolay Edigaryev. (CVE-2015-6565)

sshd(8): Portable OpenSSH only: Fixed a privilege separation weakness related to PAM support. Attackers who could successfully compromise the pre-authentication process for remote code execution and who had valid credentials on the host could impersonate other users. Reported by Moritz Jodeit. (CVE-2015-6563)

sshd(8): Portable OpenSSH only: Fixed a use-after-free bug related to PAM support that was reachable by attackers who could compromise the pre-authentication process for remote code execution. Also reported by Moritz Jodeit. (CVE-2015-6564)

Alerts:
Scientific Linux SLSA-2015:2088-6 openssh 2015-12-21
Scientific Linux SLSA-2016:0741-1 openssh 2016-06-08
Red Hat RHSA-2016:0741-01 openssh 2016-05-10
Gentoo 201512-04 openssh 2015-12-21
Red Hat RHSA-2015:2088-06 openssh 2015-11-19
SUSE SUSE-SU-2015:1581-1 openssh 2015-09-21
Mageia MGASA-2015-0321 openssh 2015-08-21
Fedora FEDORA-2015-13520 openssh 2015-08-19
Fedora FEDORA-2015-13469 openssh 2015-08-27

Comments (none posted)

openstack-neutron: denial of service

Package(s):openstack-neutron CVE #(s):CVE-2015-3221
Created:August 25, 2015 Updated:August 26, 2015
Description: From the Red Hat advisory:

A Denial of Service flaw was found in the L2 agent when using the IPTables firewall driver. By submitting an address pair that will be rejected as invalid by the ipset tool, an attacker may cause the agent to crash.

Alerts:
Red Hat RHSA-2015:1680-01 openstack-neutron 2015-08-24

Comments (none posted)

owncloud: three vulnerabilities

Package(s):owncloud CVE #(s):CVE-2015-4715 CVE-2015-4717 CVE-2015-4718
Created:August 14, 2015 Updated:August 26, 2015
Description: From the Mageia advisory:

In ownCloud before 6.0.8 and 8.0.4, a bug in the SDK used to connect ownCloud against the Dropbox server might allow the owner of "Dropbox.com" to gain access to any files on the ownCloud server if an external Dropbox storage was mounted (CVE-2015-4715).

In ownCloud before 6.0.8 and 8.0.4, the sanitization component for filenames was vulnerable to DoS when parsing specially crafted file names passed via specific endpoints. Effectively this lead to a endless loop filling the log file until the system is not anymore responsive (CVE-2015-4717).

In ownCloud before 6.0.8 and 8.0.4, the external SMB storage of ownCloud was not properly neutralizing all special elements which allows an adversary to execute arbitrary SMB commands. This was caused by improperly sanitizing the ";" character which is interpreted as command separator by smbclient (the used software to connect to SMB shared by ownCloud). Effectively this allows an attacker to gain access to any file on the system or overwrite it, finally leading to a PHP code execution in the case of ownCloud’s config file (CVE-2015-4718).

Alerts:
Debian DSA-3373-1 owncloud 2015-10-18
Mageia MGASA-2015-0314 owncloud 2015-08-13

Comments (none posted)

pcre: code execution

Package(s):pcre CVE #(s):CVE-2015-8381
Created:August 14, 2015 Updated:December 2, 2015
Description: From the Red Hat bugzilla entry:

Latest version of PCRE is prone to a Heap Overflow vulnerability which could caused by the following regular expression.

    /(?J:(?|(:(?|(?'R')(\k'R')|((?'R')))H'Rk'Rf)|s(?'R'))))/
Alerts:
Red Hat RHSA-2016:2750-01 rh-php56 2016-11-15
Gentoo 201607-02 libpcre 2016-07-09
Red Hat RHSA-2016:1132-01 rh-mariadb100-mariadb 2016-05-26
openSUSE openSUSE-SU-2016:3099-1 pcre 2016-12-12
Ubuntu USN-2943-1 pcre3 2016-03-29
Arch Linux ASA-201508-11 pcre 2015-08-26
Fedora FEDORA-2015-12921 pcre 2015-08-13
Mageia MGASA-2015-0343 pcre 2015-09-08
Fedora FEDORA-2015-14242 pcre 2015-09-11
Fedora FEDORA-2015-14235 pcre 2015-09-11

Comments (none posted)

php: multiple vulnerabilities

Package(s):php CVE #(s):
Created:August 24, 2015 Updated:August 26, 2015
Description: The php package has been updated to version 5.6.12, fixing several bugs and security issues. See the upstream Changelog for more details.

Also 5.5.28 has been released: upstream changelog.

Alerts: (No alerts in the database for this vulnerability)

Comments (none posted)

python-django: multiple vulnerabilities

Package(s):python-django CVE #(s):CVE-2015-5963 CVE-2015-5964
Created:August 19, 2015 Updated:October 16, 2015
Description:

From the Debian advisory:

Lin Hua Cheng discovered that a session could be created when anonymously accessing the django.contrib.auth.views.logout view. This could allow remote attackers to saturate the session store or cause other users' session records to be evicted.

Additionally the contrib.sessions.backends.base.SessionBase.flush() and cache_db.SessionStore.flush() methods have been modified to avoid creating a new empty session as well.

Alerts:
Fedora FEDORA-2015-1dd5bc998f python-django 2015-11-19
Red Hat RHSA-2015:1894-01 python-django 2015-10-15
Red Hat RHSA-2015:1876-01 python-django 2015-10-08
openSUSE openSUSE-SU-2015:1598-1 python-django 2015-09-22
openSUSE openSUSE-SU-2015:1580-1 python-Django 2015-09-19
Mageia MGASA-2015-0327 python-django, python-django14 2015-08-27
Arch Linux ASA-201508-9 python-django 2015-08-25
Ubuntu USN-2720-1 python-django 2015-08-18
Debian DSA-3338-1 python-django 2015-08-18
Red Hat RHSA-2015:1766-01 python-django 2015-09-10
Debian-LTS DLA-301-1 python-django 2015-08-26
Red Hat RHSA-2015:1767-01 python-django 2015-09-10

Comments (none posted)

python-django-horizon: cross-site scripting

Package(s):python-django-horizon CVE #(s):CVE-2015-3219 CVE-2015-3988
Created:August 25, 2015 Updated:August 26, 2015
Description: From the CVE entries:

Cross-site scripting (XSS) vulnerability in the Orchestration/Stack section in OpenStack Dashboard (Horizon) 2014.2 before 2014.2.4 and 2015.1.x before 2015.1.1 allows remote attackers to inject arbitrary web script or HTML via the description parameter in a heat template, which is not properly handled in the help_text attribute in the Field class. (CVE-2015-3219)

Multiple cross-site scripting (XSS) vulnerabilities in OpenStack Dashboard (Horizon) 2015.1.0 allow remote authenticated users to inject arbitrary web script or HTML via the metadata to a (1) Glance image, (2) Nova flavor or (3) Host Aggregate. (CVE-2015-3988)

Alerts:
Debian DSA-3617-1 horizon 2016-07-06
Red Hat RHSA-2015:1679-01 python-django-horizon 2015-08-24

Comments (none posted)

qemu: two vulnerabilities

Package(s):qemu CVE #(s):CVE-2015-5166 CVE-2015-5165
Created:August 18, 2015 Updated:September 28, 2015
Description: From a Red Hat bugzilla entry:

CVE-2015-5165: Qemu emulator built with the RTL8139 emulation support is vulnerable to an information leakage flaw. It could occur while processing network packets under RTL8139 controller's C+ mode of operation.

A guest user could use this flaw to read uninitialised Qemu heap memory of up to 65K bytes.

From another Red Hat bugzilla entry:

CVE-2015-5166: Qemu emulator built with the IDE Emulation PCI PIIX3/4 support is vulnerable to a use-after-free flaw. It could occur when trying to write data to an I/O port inside guest. This issue is specific to the Xen platform.

A privileged (CAP_SYS_RAWIO) guest user on the Xen platform could use this flaw to crash the Qemu instance or potentially achieve a guest escape.

Alerts:
Oracle ELSA-2016-0997 qemu-kvm 2016-05-17
Debian-LTS DLA-479-1 xen 2016-05-18
Mageia MGASA-2016-0098 xen 2016-03-07
openSUSE openSUSE-SU-2015:2003-1 xen 2015-11-17
openSUSE openSUSE-SU-2015:1964-1 xen 2015-11-12
SUSE SUSE-SU-2015:1643-1 Xen 2015-09-25
Fedora FEDORA-2015-15946 xen 2015-09-26
Fedora FEDORA-2015-15944 xen 2015-09-27
Scientific Linux SLSA-2015:1833-1 qemu-kvm 2015-09-22
Oracle ELSA-2015-1833 qemu-kvm 2015-09-22
CentOS CESA-2015:1833 qemu-kvm 2015-09-22
Red Hat RHSA-2015:1833-01 qemu-kvm 2015-09-22
Scientific Linux SLSA-2015:1793-1 qemu-kvm 2015-09-15
Oracle ELSA-2015-1793 qemu-kvm 2015-09-15
Red Hat RHSA-2015:1793-01 qemu-kvm 2015-09-15
Mageia MGASA-2015-0368 qemu 2015-09-15
Ubuntu USN-2724-1 qemu, qemu-kvm 2015-08-27
Red Hat RHSA-2015:1683-01 qemu-kvm-rhev 2015-08-25
Red Hat RHSA-2015:1674-01 qemu-kvm-rhev 2015-08-24
SUSE SUSE-SU-2015:1421-1 xen 2015-08-21
Fedora FEDORA-2015-13402 qemu 2015-08-18
Debian DSA-3349-1 qemu-kvm 2015-09-02
Debian DSA-3348-1 qemu 2015-09-02
Mageia MGASA-2015-0369 qemu 2015-09-15
Red Hat RHSA-2015:1718-01 qemu-kvm-rhev 2015-09-03
SUSE SUSE-SU-2015:1479-2 xen 2015-09-02
SUSE SUSE-SU-2015:1479-1 xen 2015-09-02
Fedora FEDORA-2015-13404 qemu 2015-09-01

Comments (none posted)

request-tracker4: cross-site scripting

Package(s):request-tracker4 CVE #(s):CVE-2015-5475
Created:August 13, 2015 Updated:August 26, 2015
Description: From the Debian advisory:

It was discovered that Request Tracker, an extensible trouble-ticket tracking system is susceptible to a cross-site scripting attack via the user [and] group rights management pages (CVE-2015-5475) and via the cryptography interface, allowing an attacker with a carefully-crafted key to inject JavaScript into RT's user interface. Installations which use neither GnuPG nor S/MIME are unaffected by the second cross-site scripting vulnerability.

Alerts:
Debian DSA-3335-1 request-tracker4 2015-08-13
Fedora FEDORA-2015-13664 rt 2015-08-27
Fedora FEDORA-2015-13718 rt 2015-08-27

Comments (none posted)

roundup: multiple vulnerabilities

Package(s):roundup CVE #(s):CVE-2012-6130 CVE-2012-6131 CVE-2012-6132 CVE-2012-6133
Created:August 24, 2015 Updated:August 26, 2015
Description: From the CVE entries:

Cross-site scripting (XSS) vulnerability in the history display in Roundup before 1.4.20 allows remote attackers to inject arbitrary web script or HTML via a username, related to generating a link. (CVE-2012-6130)

Cross-site scripting (XSS) vulnerability in cgi/client.py in Roundup before 1.4.20 allows remote attackers to inject arbitrary web script or HTML via the @action parameter to support/issue1. (CVE-2012-6131)

Cross-site scripting (XSS) vulnerability in Roundup before 1.4.20 allows remote attackers to inject arbitrary web script or HTML via the otk parameter. (CVE-2012-6132)

From the Debian LTS advisory:

XSS flaws in ok and error messages
We solve this differently from the proposals in the bug-report by not allowing *any* html-tags in ok/error messages anymore. (CVE-2012-6133)

Alerts:
Debian-LTS DLA-298-1 roundup 2015-08-23

Comments (none posted)

ruby: information disclosure

Package(s):ruby1.8 CVE #(s):CVE-2009-5147
Created:August 26, 2015 Updated:December 17, 2015
Description: From the Debian LTS advisory:

"sheepman" fixed a vulnerability in Ruby 1.8: DL::dlopen could open a library with tainted name even if $SAFE > 0.

Alerts:
Fedora FEDORA-2015-c4409eb73a ruby 2016-01-08
Fedora FEDORA-2015-eef21b972e ruby 2015-12-29
Arch Linux ASA-201512-11 ruby 2015-12-17
Debian-LTS DLA-300-1 ruby1.9.1 2015-08-26
Debian-LTS DLA-299-1 ruby1.8 2015-08-26

Comments (none posted)

strongswan: incorrect payload processing

Package(s):strongswan CVE #(s):CVE-2015-3991
Created:August 19, 2015 Updated:August 26, 2015
Description:

From the Fedora advisory:

Incorrect payload processing for different IKE versions.

Alerts:
Fedora FEDORA-2015-5279 strongswan 2015-08-19
Fedora FEDORA-2015-5247 strongswan 2015-08-19

Comments (none posted)

twig: code execution

Package(s):twig CVE #(s):
Created:August 26, 2015 Updated:August 26, 2015
Description: From the Debian advisory:

James Kettle, Alain Tiemblo, Christophe Coevoet and Fabien Potencier discovered that twig, a templating engine for PHP, did not correctly process its input. End users allowed to submit twig templates could use specially crafted code to trigger remote code execution, even in sandboxed templates.

Alerts:
Debian DSA-3343-1 twig 2015-08-26

Comments (none posted)

uwsgi: denial of service

Package(s):uwsgi CVE #(s):
Created:August 18, 2015 Updated:August 26, 2015
Description: From the uwsgi announcement:

Hi, an emergency release fixing an HTTPS resource leak (spotted by André Cruz) is available

http://uwsgi-docs.readthedocs.org/en/latest/Changelog-2.0.11.1.html

If you use the uWSGI https router you should upgrade to avoid excessive file descriptors and memory allocation.

Alerts:
Fedora FEDORA-2015-12032 uwsgi 2015-08-18
Fedora FEDORA-2015-12020 uwsgi 2015-08-18

Comments (none posted)

virtualbox: unspecified vulnerability

Package(s):virtualbox CVE #(s):CVE-2015-2594
Created:August 18, 2015 Updated:September 14, 2015
Description: From the SUSE bug tracker:

Unspecified vulnerability in the Oracle VM VirtualBox component in Oracle Virtualization VirtualBox prior to 4.0.32, 4.1.40, 4.2.32, and 4.3.30 allows local users to affect confidentiality, integrity, and availability via unknown vectors related to Core.

Alerts:
Debian-LTS DLA-313-1 virtualbox-ose 2015-09-29
openSUSE openSUSE-SU-2015:1400-1 virtualbox 2015-08-18
Debian DSA-3359-1 virtualbox 2015-09-13

Comments (none posted)

vlc: code execution

Package(s):vlc CVE #(s):CVE-2015-5949
Created:August 20, 2015 Updated:February 17, 2016
Description: From the Debian advisory:

Loren Maggiore of Trail of Bits discovered that the 3GP parser of VLC, a multimedia player and streamer, could dereference an arbitrary pointer due to insufficient restrictions on a writable buffer. This could allow remote attackers to execute arbitrary code via crafted 3GP files.

Alerts:
Gentoo 201603-08 vlc 2016-03-12
openSUSE openSUSE-SU-2016:0476-1 vlc 2016-02-16
Debian DSA-3342-1 vlc 2015-08-20
Mageia MGASA-2015-0324 vlc 2015-08-25
Mageia MGASA-2015-0329 vlc 2015-08-27

Comments (none posted)

webkitgtk4: three unspecified vulnerabilities

Package(s):webkitgtk4 CVE #(s):
Created:August 18, 2015 Updated:August 26, 2015
Description: From the Fedora advisory:

WebKitGTK+ 2.8.5 includes fixes for 3 security issues.

Alerts:
Fedora FEDORA-2015-13001 webkitgtk4 2015-08-18

Comments (none posted)

wireshark: multiple vulnerabilities

Package(s):wireshark CVE #(s):
Created:August 24, 2015 Updated:August 26, 2015
Description: From the openSUSE advisory:

Wireshark was updated to fix several security vulnerabilities and bugs.

- Wireshark 1.12.7 [boo#941500] The following vulnerabilities have been fixed:

* Wireshark could crash when adding an item to the protocol tree. wnpa-sec-2015-21

* Wireshark could attempt to free invalid memory. wnpa-sec-2015-22

* Wireshark could crash when searching for a protocol dissector. wnpa-sec-2015-23

* The ZigBee dissector could crash. wnpa-sec-2015-24

* The GSM RLC/MAC dissector could go into an infinite loop. wnpa-sec-2015-25

* The WaveAgent dissector could crash. wnpa-sec-2015-26

* The OpenFlow dissector could go into an infinite loop. wnpa-sec-2015-27

* Wireshark could crash due to invalid ptvcursor length checking. wnpa-sec-2015-28

* The WCCP dissector could crash. wnpa-sec-2015-29

* Further bug fixes and updated protocol support as listed in: https://www.wireshark.org/docs/relnotes/wireshark-1.12.7....

Alerts: (No alerts in the database for this vulnerability)

Comments (none posted)

zendframework: XML external entity attack

Package(s):zendframework CVE #(s):CVE-2015-5161
Created:August 20, 2015 Updated:September 15, 2015
Description: From the Debian advisory:

Dawid Golunski discovered that when running under PHP-FPM in a threaded environment, Zend Framework, a PHP framework, did not properly handle XML data in multibyte encoding. This could be used by remote attackers to perform an XML External Entity attack via crafted XML data.

Alerts:
SUSE SUSE-SU-2016:1638-1 php53 2016-06-21
Debian-LTS DLA-499-1 php5 2016-05-31
Fedora FEDORA-2015-f1e18131bc php-ZendFramework 2015-11-09
Fedora FEDORA-2015-6d70a701bf php-ZendFramework 2015-11-09
Fedora FEDORA-2015-2e7c06c639 php-ZendFramework 2015-11-08
Debian DSA-3340-1 zendframework 2015-08-19
Fedora FEDORA-2015-13488 php-guzzle-Guzzle 2015-08-27
Fedora FEDORA-2015-13488 php-ZendFramework2 2015-08-27
Fedora FEDORA-2015-13529 php-guzzle-Guzzle 2015-08-27
Mageia MGASA-2015-0370 php-ZendFramework 2015-09-15
Mageia MGASA-2015-0371 php-ZendFramework 2015-09-15
Fedora FEDORA-2015-13529 php-ZendFramework2 2015-08-27
Debian-LTS DLA-302-1 zendframework 2015-08-27

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 4.2-rc8, released on August 23. In the end, Linus decided to wait one more week before putting out the final 4.2 release. "It's not like there are any real outstanding issues, and I waffled between just doing the release and doing another -rc. But we did have another low-level x86 issue come up this week, and together with the fact that a number of people are on vacation, I decided that waiting an extra week isn't going to hurt. But it was close. It's a fairly small rc8, and I really feel like it could have gone either way."

Previously, 4.2-rc7 came out on August 16.

Stable updates: 4.1.6, 3.14.51, and 3.10.87 were released on August 17.

Comments (none posted)

The bcachefs filesystem

Kent Overstreet, author of the bcache block caching layer, has announced that bcache has metamorphosed into a fully featured copy-on-write filesystem. "Well, years ago (going back to when I was still at Google), I and the other people working on bcache realized that what we were working on was, almost by accident, a good chunk of the functionality of a full blown filesystem - and there was a really clean and elegant design to be had there if we took it and ran with it. And a fast one - the main goal of bcachefs is to match ext4 and xfs on performance and reliability, but with the features of btrfs/zfs."

Comments (94 posted)

Kernel development news

The bcachefs filesystem

By Jonathan Corbet
August 25, 2015
The Linux kernel does not lack for filesystem support; many dozens of filesystem implementations are available for one use case or another. But, after all these years, Linux arguably lacks an established "next-generation" filesystem with advanced features and a design suited to contemporary hardware. That situation holds despite the existence of a number of competitors for that title; Btrfs remains at the top of the list, but others, such as tux3 and (still!) reiser4, are out there as well. In each case, it has taken rather longer than expected for the code to reach the required level of maturity. The list of putative next-generation filesystems has just gotten longer with the recent announcement of the "bcachefs" filesystem.

Bcachefs is an extension of bcache, which first appeared in LWN in 2010. Bcache was designed as a caching layer that improves block I/O performance by using a fast solid-state drive as a cache for a (slower, larger) underlying storage device. Bcache has been steadily developed over the last five years; it was merged into the mainline kernel during the 3.10 development cycle in 2013.

Mainline bcache is not a filesystem; instead, it looks like a special kind of block device. It manages the movement of blocks of data between fast and slow storage, working to ensure that the most frequently used data is kept on the faster device. This task is complex; bcache must manage data in a way that yields high performance while ensuring that no data is ever lost, even in the face of an unclean shutdown. Even so, at its interface to the rest of the system, bcache looks like a simple block device: give it numbered blocks of data, and it will store (and retrieve) them.

Users typically want something a bit higher-level than that; they want to be able to organize blocks into files, and files into directory hierarchies. That task is handled by a filesystem like ext4 or Btrfs. Thus, on current systems, bcache will be used in conjunction with a filesystem layer to provide a complete solution.

It seems that, over time, bcache has developed the potential to provide filesystem functionality on its own. In the bcachefs announcement, Kent Overstreet said:

Well, years ago (going back to when I was still at Google), I and the other people working on bcache realized that what we were working on was, almost by accident, a good chunk of the functionality of a full blown filesystem - and there was a really clean and elegant design to be had there if we took it and ran with it.

The actual running with this idea appears to have happened relatively recently, with the first publicly visible version of the bcachefs code being committed to the bcache repository in May 2015. Since then, it has seen a steady stream of commits from Kent; it was announced on the bcache mailing list in mid-July, and on linux-kernel just over a month later.

With the bcachefs code added, bcache has gained the namespace and file-management features that, until now, had to be supplied by a separate filesystem layer. Like Btrfs, it is a copy-on-write filesystem, meaning that data is never overwritten. Instead, a block that is overwritten moves to a new location, with the older version persisting as long as any references to it remain. Copy-on-write works well on solid-state storage devices and makes a number of advanced features relatively easy to implement.

Since the original bcache was a block-device management layer, bcachefs has some strong features in this area. Naturally, it offers multi-tier hybrid caching of data, and is able to integrate multiple physical devices into a single logical volume. Bcachefs does not appear to have any sort of higher-level RAID capability at this time, though; a basic replication mechanism is "like 80% done". Features like data checksumming and compression are supported.

The plans for the future include filesystem features like snapshots — an important Btrfs feature that is not yet available in bcachefs. Kent listed erasure coding as well, presumably as an alternative to higher-level RAID support. Native support for shingled magnetic recording drives is on the list, as is support for working with raw flash storage directly.

But none of those features are present in bcachefs now; work has been focused on getting the basic filesystem working in a reliable manner. Performance tuning has not been a priority thus far, but the filesystem claims reasonable performance numbers already — though, as Kent admitted, it suffers from the common (to copy-on-write filesystems) problem of "filling up" well before the underlying storage is actually filled with data. Importantly, the on-disk filesystem format has not yet been finalized — a clear sign that a filesystem is not yet ready for real-world use.

Another important (though unlisted) missing feature is a filesystem integrity checker ("fsck") utility.

Bcachefs looks like a promising filesystem, even if many of the intended features have not yet been implemented. But those who have watched filesystem development for any period of time will know what comes next: a surprisingly long wait while the code matures to the point that it can actually be trusted for production workloads. This process, it seems, cannot be hurried beyond a certain point; that is why other next-generation filesystem efforts are seemingly never quite ready. The low-level device-management code in bcachefs is tested and production-quality, but the filesystem code lacks that pedigree. Kent says that it "won't be done in a month (or a year)", but the truth is that it may not be done for several years yet; that is how filesystem development tends to go.

How many years depends, of course, on how many people test the filesystem and how much development effort it gets. Currently it has a development community of one — Kent — and he has noted that his full-time attention is "only going to last as long as my interest and my savings account hold out". If bcachefs acquires both a commercial sponsor and a wider development community, it may yet develop into that mature next-generation filesystem that we seem to never quite get (though Btrfs is there by some accounts). Until that happens, it should probably be looked at as an interesting idea with some advanced proof-of-concept code.

Comments (7 posted)

Steps toward power-aware scheduling

By Jonathan Corbet
August 25, 2015
Power-aware scheduling appears to have become one of those perennial linux-kernel topics that never quite reach a conclusion. Nobody disputes the existence of a problem to be solved, and potential solutions are not in short supply. But somehow none of those solutions ever quite makes it to the point of being ready for incorporation into the mainline scheduler. A few new patch sets showing a different approach to the problem have made the rounds recently. They may not be ready for merging either, but they do show how the understanding of the problem is evolving.

A sticking point in recent years has been the fact that there are a few subsystems related to power management and scheduling, and they are poorly integrated with each other. The cpuidle subsystem makes guesses about how deeply an idle CPU should sleep, but it does so based on recent history and without a view into the system's current workload. The cpufreq mechanism tries to observe the load on each CPU to determine the frequency and voltage the CPU should be operating at, but it doesn't talk to the scheduler at all. The scheduler, in turn, has no view of a CPU's operating parameters and, thus, cannot make optimal scheduling decisions.

It has become clear that this scattered set of mechanisms needs to be cleaned up before meaningful progress can be made on the current problem set. The scheduler maintainers have made it clear that they won't be interested in solutions that don't bring the various control mechanisms closer together.

Improved integration

One possible part of the answer is this patch set from Michael Turquette, currently in its third revision. Michael's patch replaces the current array of cpufreq governors with a new governor that is integrated with the scheduler. In essence, the scheduler occasionally calls directly into the governor, passing it a value describing the load that, the scheduler thinks, is currently set to run on the CPU. The governor can then select a frequency/voltage pair that enables the CPU to execute that load most efficiently.

The projected load on each CPU is generated by the per-entity load tracking subsystem. Since each process has its own tracked load, the scheduler can quickly sum up the load presented by all of the runnable processes on a CPU and pass that number on to the governor. If a process changes its state or is moved to another CPU, the load values can be updated immediately. That should make the new governor much more responsive than current governors, which must observe the CPU for a while to determine that a change needs to be made.

The per-entity load tracking code was a big step forward when it was added to the scheduler, but it still has some shortcomings. In particular, its concept of load is not tied to the CPU any given process might be running on. If different CPUs are running at different frequencies, the loads computed for processes on those CPUs will not be comparable. The problem gets worse on systems (like those based on the big.LITTLE architecture) where some CPUs are inherently more powerful than others.

The solution to this problem appears to be Morten Rasmussen's compute-capacity-invariant load/utilization tracking patch set. With these patches applied, all load and utilization values calculated by the scheduler are scaled relative to the current CPU capacity. That makes these values uniform across the system, allowing the scheduler to better judge the effects of moving a process from one CPU to another. It also will clearly help the power-management problem: matching CPU capacity to the projected load will work better if the load values are well-calibrated and understood.

With those two patch sets in place, the scheduler will be better equipped to run the system in a relatively power-efficient manner (though related issues like optimal task placement have not yet been addressed here). In the real world, though, not everybody wants to run in the most efficient mode all the time. Some systems may be managed more for performance than for power efficiency; the desired policy on other systems may vary depending on what jobs are running at the time. Linux currently supports a number of CPU-frequency governors designed to implement different policies; if the scheduler-driven governor is to replace all of those, it, too, must be able to support multiple policies.

Schedtune

One possible step in that direction can be seen in this patch set from Patrick Bellasi. It adds a tuning mechanism to the scheduler-driven governor so that multiple policies become possible. At its simplest, this tuning takes the form of a single, global value, stored in /proc/sys/kernel/sched_cfs_boost. The default value for this parameter is zero, which indicates that the system should be run for power efficiency. Higher values, up to 100, bias CPU frequency selection toward performance.

The exact meaning of this knob is fairly straightforward. At any given time, the scheduler can calculate the CPU capacity that it expects the currently runnable processes to require. The space between that capacity and the maximum capacity the CPU can provide is called the "margin." A non-zero value of sched_cfs_boost describes the percentage of the margin that should be made available via a more aggressive CPU-frequency/voltage selection.

So, for example, if the current load requires a CPU running at 60% capacity, the margin is 40%. Setting sched_cfs_boost to 50 will cause 50% of that margin to be made available, so the CPU should run at 80% of its maximum capacity. If sched_cfs_boost is set to 100, the CPU will always run at its maximum speed, optimizing the system as a whole for performance.

What about situations where the desired policy varies over time? A phone handset may want to run with higher performance while a phone call is active or when the user is interacting with the screen, but in the most efficient mode possible while checking for the day's obligatory pile of app updates. One could imagine making the desired power policy a per-process attribute, but Patrick opted to use the control-group mechanism instead.

With Patrick's patch set comes a new controller called "schedtune". That controller offers a single knob, called schedtune.boost, to describe the policy that should apply to processes within the group. One possible implementation would be to change the CPU's operating parameters every time a new process starts running, but there are a couple of problems with that approach. It could lead to excessive changing of CPU frequency and voltage, which can be counterproductive. Beyond that, though, a process needing high performance could find itself waiting behind another that doesn't; if the CPU runs slowly during that wait, the high-performance process may not get the response time it needs.

To avoid such problems, the controller looks at all running processes on the CPU and finds the one with the largest boost value. That value is then used to run all processes on the CPU.

The schedtune controller as currently implemented has a couple of interesting limitations. It can only handle a two-level control group hierarchy, and it can manage a maximum of sixteen possible groups. Neither of these characteristics fits well with the new, unified-hierarchy model for control groups, so the schedtune controller is highly likely to require modification before this patch set could be considered for merging into the mainline.

But, then, experience says that eventual merging may be a distant prospect in any case. The scheduler must work well for a huge variety of workloads, and cannot be optimized for one at the expense of others. Finding a way to add power awareness to the scheduler in a way that works for all workloads was never going to be an easy task. The latest patches show that progress is being made toward a general-purpose solution that, with luck, leaves the scheduler more flexible and maintainable than before. But whether that progress is reaching the point of being a solution that can be merged remains to be seen.

Comments (14 posted)

Porting Linux to a new processor architecture, part 1: The basics

August 26, 2015

This article was contributed by Joël Porquet

Although a simple port may amount to as little as 4,000 lines of code—exactly 3,775 for the mmu-less Hitachi H8/300 recently reintroduced in Linux 4.2-rc1—getting the Linux kernel running on a new processor architecture is a difficult process. Worse still, there is not much documentation available describing the porting process. The aim of this series of three articles is to provide an overview of the procedure, or at least one possible procedure, that can be followed when porting the Linux kernel to a new processor architecture.

After spending countless hours becoming almost fluent in many of the supported architectures, I discovered that a well-defined skeleton shared by the majority of ports exists. Such a skeleton can logically be split into two parts that intersect a great deal. The first part is the boot code, meaning the architecture-specific code that is executed from the moment the kernel takes over from the bootloader until init is finally executed. The second part concerns the architecture-specific code that is regularly executed once the booting phase has been completed and the kernel is running normally. This second part includes starting new threads, dealing with hardware interrupts or software exceptions, copying data from/to user applications, serving system calls, and so on.

Is a new port necessary?

As LWN reported about another porting experience in an article published last year, there are three meanings to the word "porting".

It can be a port to a new board with an already-supported processor on it. Or it can be a new processor from an existing, supported processor family. The third alternative is to port to a completely new architecture.

Sometimes, the answer to whether one should start a new port from scratch is crystal clear—if the new processor comes with a new instruction set architecture (ISA), that is usually a good indicator. Sometimes it is less clear. In my case, it took me a couple of weeks to figure out this first question.

At the time, May 2013, I had just been hired by the French academic computer lab LIP6 to port the Linux kernel to TSAR, an academic processor architecture that the system-on-chip research group was designing. TSAR is an architecture that follows many of the current trends: lots of small, single-issue, energy-efficient processor cores around a scalable network-on-chip. It also adds some nice innovations: a full-hardware cache-coherency protocol for both data/instruction caches and translation lookaside buffers (TLBs) as well as physically distributed but logically shared memory.

My dilemma was that the processor cores were compatible with the MIPS32 ISA, which meant the port could fall into the second category: "new processor from an existing processor family". But since TSAR had a virtual-memory model radically different from those of any MIPS processors, I would have been forced to drastically modify the entire MIPS branch in order to introduce this new processor, sometimes having almost no choice but to surround entire files with #ifndef TSAR ... #endif.

Quickly enough, it came down to the most logical—and interesting—conclusion:

    mkdir linux/arch/tsar

Get to know your hardware

Really knowing the underlying hardware is definitely the fundamental, and perhaps most obvious, prerequisite to porting Linux to it.

The specifications of a processor are often—logically or physically—split into at least two parts (as were, for example, the recently published specifications for the new RISC-V processor). The first part usually details the user-level ISA, which basically means the list of user-level instructions that the processor is able to understand—and execute. The second part describes the privileged architecture, which includes the list of kernel-level-only instructions and the various system registers that control the processor status.

This second part contains the majority—if not the entirety—of the information that makes a port special and thus often prevents the developer from opportunely reusing code from other architectures.

Among the important questions that should be answered by such specifications are:

  • What are the virtual-memory model of the processor architecture, the format of the page table, and the translation mechanism?

    Many processor architectures (e.g. x86, ARM, or TSAR) define a flexible virtual-memory layout. Their virtual address space can theoretically be split any way between the user and kernel spaces—although the default layout for 32-bit processors in Linux usually allocates the lower 3GiB to user space and reserves the upper 1GiB for kernel space. In some other architectures, this layout is strongly constrained by the hardware design. For instance, on MIPS32, the virtual address space is statically split into two regions of the same size: the lower 2GiB is dedicated to user space and the upper 2GiB to kernel space; the latter even contains predefined windows into the physical address space.

    The format of the page table is intimately linked to the translation mechanism used by the processor. In the case of a hardware-managed mechanism, when the TLB—a hardware cache of limited size containing recently used translations between virtual and physical addresses—does not contain the translation for a given virtual address (referred to as a TLB miss), a hardware state machine will transparently fetch the proper translation from the page table structure in memory and fill the TLB with it. This means that the format of the page table must be fixed—and certainly defined by the processor's specifications. In a software-based mechanism, a TLB miss exception is handled by a piece of code, which theoretically leaves complete liberty as to how the page table is organized—only the format of TLB entries is specified.

  • How does one enable and disable interrupts, switch from privileged mode to user mode and vice versa, determine the cause of an exception, and so on?

    Although all these operations generally only involve reading and/or modifying certain bit fields in the set of available system registers, they are always very particular to each architecture. It is for this reason that, most of the time, they are actually performed by small chunks of dedicated assembly code.

  • What is the ABI?

    Although one might think that the Application Binary Interface (ABI) only concerns the compilation tools—it defines how the stack is formatted into stack frames, how arguments and return values are passed to and from functions, and so on—it is actually essential knowledge when porting Linux. For example, as the recipient of system calls (which are typically defined by the ABI), the kernel has to know where to find the arguments and how to return a value; on a context switch, the kernel must know what constitutes the context of a thread and thus what to save and restore; and so on.

Get to know the kernel

Learning a few kernel concepts, especially concerning the memory layout used by Linux, will definitely help. I admit it took me a while to understand exactly what the distinction was between low memory and high memory, and between the direct mapping and vmalloc regions.

For a typical and simple port (to a 32-bit processor), in which the kernel occupies the upper 1GiB of the virtual address space, it is usually fairly straightforward. Within this 1GiB, Linux defines that the lower portion of it will be directly mapped to the lower portion of the system memory (hence referred to as low memory): meaning that if the kernel accesses the address 0xC0000000, it will be redirected to the physical address 0x00000000.

In contrast, in systems with more physical memory than that which is mappable in the direct mapping region, the upper portion of the system memory (referred to as high memory) is not normally accessible to the kernel. Other mechanisms must be used, such as kmap() and kmap_atomic(), in order to gain temporary access to these high-memory pages.

Above the direct mapping region is the vmalloc region, which is managed by vmalloc(). This allocation mechanism provides pages of memory that are virtually contiguous even though the underlying physical pages are not necessarily contiguous. It is particularly useful for allocating a large amount of memory, since it may otherwise be impossible to find an equivalent number of contiguous free physical pages.

Further reading about the memory management in Linux can be found in Linux Device Drivers [PDF] and this LWN article.

How to start?

With your head full of the processor's specifications and kernel principles, it is finally time to add some files to this newly created arch directory. But wait ... where and how should we start? As with any port—or, indeed, any code that must conform to a certain API—the procedure is a two-step process.

First, a minimal set of files that define a minimal set of symbols (functions, variables, defines) is necessary for the kernel to even compile. This set of files and symbols can often be deduced from compilation failures: if compilation fails because of a missing file/symbol, it is a good indicator that it should probably be implemented (or sometimes that some configuration options should be modified). In the case of porting Linux, this approach is particularly relevant when implementing the numerous headers that define the API between the architecture-specific code and the rest of the kernel.

After the kernel finally compiles and is able to be executed on the target hardware, it is useful to know that the boot code is very sequential. That allows many functions to be left empty at first and implemented only gradually, until the system finally becomes stable and reaches the init process. This approach works for almost all of the C functions executed after the early assembly boot code. It is advisable, however, to get the early_printk() infrastructure up and working first; otherwise debugging can be difficult.

Finally getting started: the minimal set of non-code files

Porting the compilation tools to the new processor architecture is a prerequisite to porting the Linux kernel, but here we'll assume it has already been performed. All that is left to do in terms of compilation tools is to build a cross-compiler. Since at this point it is likely that porting a standard C library has not been completed (or even started), only a stage-1 cross-compiler can be created.

Such a cross-compiler is only able to compile source code for bare metal execution, which is a perfect fit for the kernel since it does not depend on any external library. In contrast, a stage-2 cross-compiler has built-in support for a standard C library.

The first step of porting Linux to a new processor is the creation of a new directory inside arch/, which is located at the root of the kernel tree (e.g. linux/arch/tsar/ in my case). Inside this new directory, the layout is quite standardized:

  • configs/: default configurations for supported systems (i.e. *_defconfig files)
  • include/asm/: headers dedicated to internal use only, i.e. by Linux source code
  • include/uapi/asm/: headers that are meant to be exported to user space (e.g. to the C library)
  • kernel/: general kernel management
  • lib/: optimized utility routines (e.g. memcpy(), memset(), etc.)
  • mm/: memory management

The great thing is that once the new arch directory exists, Linux automatically knows about it. It only complains about not finding a Makefile, not about this new architecture:

    ~/linux $ make ARCH=tsar
    Makefile: ~/linux/arch/tsar/Makefile: No such file or directory

As shown in the following example, a minimal arch Makefile only has a few variables to specify:

    KBUILD_DEFCONFIG := tsar_defconfig

    KBUILD_CFLAGS += -pipe -D__linux__ -G 0 -msoft-float
    KBUILD_AFLAGS += $(KBUILD_CFLAGS)

    head-y := arch/tsar/kernel/head.o

    core-y += arch/tsar/kernel/
    core-y += arch/tsar/mm/

    LIBGCC := $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name)
    libs-y += $(LIBGCC)
    libs-y += arch/tsar/lib/

    drivers-y += arch/tsar/drivers/

  • KBUILD_DEFCONFIG must hold the name of a valid default configuration, which is one of the defconfig files in the configs directory (e.g. configs/tsar_defconfig).
  • KBUILD_CFLAGS and KBUILD_AFLAGS define compilation flags, respectively for the compiler and the assembler.
  • {head,core,libs,...}-y list the objects (or the subdirectories containing the objects) to be compiled into the kernel image (see Documentation/kbuild/makefiles.txt for detailed information).

Another file that has its place at the root of the arch directory is Kconfig. This file mainly serves two purposes: it defines new arch-specific configuration options that describe the features of the architecture, and it selects arch-independent configuration options (i.e. options that are already defined elsewhere in Linux source code) that apply to the architecture.

As this will be the main configuration file for the newly created arch, its content also determines the layout of the menuconfig command (e.g. make ARCH=tsar menuconfig). It is difficult to give a snippet of the file as it depends very much on the targeted architecture, but looking at the same file for other (simple) architectures should definitely help.
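
As a rough, hypothetical sketch (the symbols selected below are existing generic Linux options, but the exact set depends entirely on the architecture's features, so treat this as an illustration rather than a working file), the top of such a Kconfig might begin with:

```
config TSAR
	def_bool y
	select GENERIC_ATOMIC64
	select GENERIC_IRQ_SHOW

config MMU
	def_bool y

config PAGE_OFFSET
	hex
	default 0xC0000000
```

The select statements are how the architecture opts into generic kernel infrastructure instead of providing its own implementation.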

The defconfig file (e.g. configs/tsar_defconfig) is necessary to complete the files related to the Linux kernel build system (kbuild). Its role is to define the default configuration for the architecture, which basically means specifying a set of configuration options that will be used as a seed to generate a full configuration for the Linux kernel compilation. Once again, starting from defconfig files of other architectures should help, but it is still advised to refine them, as they tend to activate many more features than a minimalistic system would ever need—support for USB, IOMMU, or even filesystems is, for example, too early at this stage of porting.

Finally, the last "not really code but still really important" file to create is a script (usually located at kernel/vmlinux.lds.S) that instructs the linker how to place the various sections of code and data in the final kernel image. For example, it is usually necessary for the early assembly boot code to be placed at the very beginning of the binary, and it is this script that allows us to do so.
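
A heavily simplified linker-script sketch may make this concrete. It is a hypothetical illustration only (the architecture name, entry symbol, and section layout are made up for this example), and a real port would assemble most of its sections from the helper macros in include/asm-generic/vmlinux.lds.h:

```
/* Hypothetical, minimal kernel/vmlinux.lds.S sketch */
OUTPUT_ARCH(tsar)
ENTRY(kernel_entry)

SECTIONS
{
	. = 0xC0000000;		/* kernel virtual base address */

	.text : {
		*(.head.text)	/* early assembly boot code comes first */
		*(.text)
	}
	.data : { *(.data) }
	.bss  : { *(.bss) }
}
```

Placing the .head.text input section first is what guarantees that the boot code ends up at the entry point of the image.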

Conclusion

At this point, the build system is ready to be used: it is now possible to generate an initial kernel configuration, customize it, and even start compiling from it. However, the compilation stops very quickly since the port still does not contain any code.

In the next article, we will dive into some code for the second portion of the port: the headers, the early assembly boot code, and all the most important arch functions that are executed until the first kernel thread is created.

Comments (none posted)

Development statistics for the 4.2 kernel

By Jonathan Corbet
August 18, 2015
As of this writing, the 4.2-rc7 prepatch is out and the final 4.2 kernel looks to be (probably) on-track to be released on August 23. Tradition says that it's time for a look at the development statistics for this cycle. 4.2, in a couple of ways, looks a bit different from recent cycles, with some older patterns reasserting themselves.

At the end of the merge window, there was some speculation as to whether 4.2 would be the busiest development cycle yet. The current record holder is 3.15, which had 13,722 non-merge changesets at the time of its final release. 4.2, which had 13,555 at the -rc7 release, looks to fall a little short of that figure. So we will not have broken the record for the most changesets in any development cycle, but it was awfully close.

One record that did fall, though, is the number of developers contributing code to the kernel. The previous record holder (4.1, at 1,539) didn't keep that position for long; 1,569 developers have contributed to 4.2. Of those developers, 279 have made their first contribution to the Linux kernel. An eye-opening 1.09 million lines of code were added this time around with 285,000 removed, for a total growth of 800,000 lines of code.

The most active developers this time around were:

Most active 4.2 developers

By changesets:

    Ingo Molnar                  304  2.2%
    Mauro Carvalho Chehab        203  1.5%
    Herbert Xu                   171  1.3%
    Krzysztof Kozlowski          161  1.2%
    Geert Uytterhoeven           149  1.1%
    Al Viro                      140  1.0%
    Lars-Peter Clausen           137  1.0%
    H Hartley Sweeten            136  1.0%
    Thomas Gleixner              127  0.9%
    Hans Verkuil                 124  0.9%
    Tejun Heo                    110  0.8%
    Alex Deucher                  95  0.7%
    Paul Gortmaker                91  0.7%
    Vineet Gupta                  88  0.7%
    Jiang Liu                     84  0.6%
    Christoph Hellwig             79  0.6%
    Hans de Goede                 78  0.6%
    Arnaldo Carvalho de Melo      77  0.6%
    Mateusz Kulikowski            74  0.5%
    Takashi Iwai                  73  0.5%

By changed lines:

    Alex Deucher              425501  35.7%
    Johnny Kim                 33726   2.8%
    Raghu Vatsavayi            14484   1.2%
    Greg Kroah-Hartman         12500   1.0%
    Stephen Boyd               11062   0.9%
    Dan Williams               10736   0.9%
    Hans Verkuil               10641   0.9%
    Narsimhulu Musini          10263   0.9%
    Ingo Molnar                 9254   0.8%
    Jakub Kicinski              8531   0.7%
    Herbert Xu                  8515   0.7%
    Yoshinori Sato              7612   0.6%
    Saeed Mahameed              7493   0.6%
    Sunil Goutham               7471   0.6%
    Christoph Hellwig           7384   0.6%
    Vineet Gupta                7171   0.6%
    Mateusz Kulikowski          6852   0.6%
    Maxime Ripard               6767   0.6%
    Sudeep Dutt                 6647   0.6%
    Mauro Carvalho Chehab       6422   0.5%

Some years ago, Ingo Molnar routinely topped the per-changesets list, but he has been busy with other pursuits recently. That changed this time around, though, with a massive rewrite of the low-level x86 floating-point-unit management code. Mauro Carvalho Chehab continues to be an active maintainer of the media subsystem, and Herbert Xu's work almost entirely reflects his role as the maintainer of the kernel's crypto subsystem. Krzysztof Kozlowski contributed cleanups throughout the driver subsystem, and Geert Uytterhoeven, despite being the m68k architecture maintainer, did most of his work within the ARM tree and related driver subsystems.

On the "lines added" side, Alex Deucher accounted for nearly half of the entire growth of the kernel this time around with the addition of the new amdgpu graphics driver. Johnny Kim added the wilc1000 network driver to the staging tree, Raghu Vatsavayi added support for Cavium Liquidio Ethernet adapters, Greg Kroah-Hartman removed the obsolete i2o subsystem, and Stephen Boyd removed a bunch of old driver code while adding driver support for QCOM SPMI regulators and more.

The top contributor statistics in recent years have often been dominated by developers generating lots of cleanup patches or reworking staging drivers. One might expect to see a lot of that activity in an especially busy development cycle, but that is not the case for 4.2. Instead, the top contributors include many familiar names and core contributors. One might be tempted to think that the cleanup work is finally approaching completion, but one would be highly likely to be disappointed in future development cycles.

The most active companies supporting development in the 4.2 cycle (of 236 total) were:

Most active 4.2 employers

By changesets:

    Intel                  1665  12.3%
    Red Hat                1639  12.1%
    (Unknown)               884   6.5%
    (None)                  884   6.5%
    Samsung                 681   5.0%
    SUSE                    496   3.7%
    Linaro                  449   3.3%
    (Consultant)            412   3.0%
    IBM                     391   2.9%
    AMD                     286   2.1%
    Google                  246   1.8%
    Renesas Electronics     203   1.5%
    Free Electrons          203   1.5%
    Texas Instruments       191   1.4%
    Facebook                176   1.3%
    Oracle                  163   1.2%
    Freescale               156   1.2%
    ARM                     145   1.1%
    Cisco                   142   1.0%
    Broadcom                138   1.0%

By lines changed:

    AMD                  438094  36.8%
    Intel                 96331   8.1%
    Red Hat               62959   5.3%
    (None)                46140   3.9%
    (Unknown)             41886   3.5%
    Atmel                 34942   2.9%
    Samsung               29326   2.5%
    Linaro                22714   1.9%
    Cisco                 21170   1.8%
    SUSE                  18891   1.6%
    Code Aurora Forum     18435   1.5%
    Mellanox              18044   1.5%
    (Consultant)          15234   1.3%
    IBM                   15095   1.3%
    Cavium Networks       14580   1.2%
    Free Electrons        13640   1.1%
    Unisys                13428   1.1%
    Linux Foundation      12617   1.1%
    MediaTek              11856   1.0%
    Google                11811   1.0%

Once again, there are few surprises here. At 6.5%, the percentage of changes coming from volunteers is at its lowest point ever. AMD, unsurprisingly, dominated the lines-changed column with the addition of the amdgpu driver. Beyond that, it is mostly the usual companies supporting kernel development in the usual way.

The kernel community depends heavily on its testers and bug reporters; at least some of the time, their contribution is recorded as Tested-by and Reported-by tags in the patches themselves. In the 4.2 development cycle, 946 Tested-by credits were placed in 729 patches, and 682 Reported-by credits were placed in 611 patches. The most active contributors in this area were:

Most active 4.2 testers and reporters

Tested-by credits:

    Joerg Roedel                  40  4.2%
    Keita Kobayashi               35  3.7%
    Krishneil Singh               31  3.3%
    Arnaldo Carvalho de Melo      30  3.2%
    Ira Weiny                     24  2.5%
    Doug Ledford                  23  2.4%
    Alex Ng                       22  2.3%
    Aaron Brown                   21  2.2%
    Javier Martinez Canillas      19  2.0%
    ZhenHua Li                    19  2.0%

Reported-by credits:

    Wu Fengguang                  76  11.1%
    Dan Carpenter                 41   6.0%
    Russell King                  23   3.4%
    Ingo Molnar                   12   1.8%
    Stephen Rothwell              10   1.5%
    Linus Torvalds                 8   1.2%
    Hartmut Knaack                 7   1.0%
    Huang Ying                     6   0.9%
    Christoph Hellwig              5   0.7%
    Sudeep Holla                   5   0.7%

The power of Wu Fengguang's zero-day build robot can be seen here; it resulted in 11% of all of the credited bug reports in this development cycle. The work of all of the kernel's testers and bug reporters leads to a more stable kernel release for everybody. The biggest concern with these numbers, perhaps, is that we might still not be doing a thorough job of documenting the contribution of all of our testers and reporters.

All told, the kernel development community continues to run like a well-tuned machine, producing stable kernel releases on a predictable (and fast) schedule. Back in 2010, your editor worried that the community might be headed toward another scalability crisis, but such worries have proved to be unfounded, for now at least. There must certainly be limits to the volume of change that can be managed by the current development model, but we do not appear to have reached them yet.

Comments (6 posted)

Patches and updates

Kernel trees

Linus Torvalds Linux 4.2-rc7
Greg KH Linux 4.1.6
Sebastian Andrzej Siewior 4.1.5-rt5
Luis Henriques Linux 3.16.7-ckt16
Greg KH Linux 3.14.51
Greg KH Linux 3.10.87
Ben Hutchings Linux 3.2.71

Architecture-specific

Core kernel code

Development tools

John Kacur rt-tests-0.93

Device drivers

Device driver infrastructure

Documentation

Filesystems and block I/O

Memory management

Networking

Security-related

Eric W. Biederman Bind mount escape fixes
Andreas Gruenbacher Inode security label invalidation

Virtualization and containers

Miscellaneous

Page editor: Jonathan Corbet

Distributions

Copyright assignment and license enforcement for Debian

By Nathan Willis
August 26, 2015

DebConf

At DebConf 2015 in Heidelberg, Germany, Bradley Kuhn from the Software Freedom Conservancy (SFC) announced a new collaboration with the Debian project through which Debian contributors can engage the SFC to act on their behalf to conduct license-compliance efforts. The Debian Copyright Aggregation Project (DCAP) is a voluntary program, but it gives interested Debian developers a means to help ensure that others do not violate the licenses under which their work is published.

Kuhn's session was held on August 15. A video recording [WebM] has since been published, and an announcement has been posted on both the Debian and SFC web sites.

In his talk, Kuhn noted that Debian is one of the few free-software projects to have a fully democratic governance model and that it has remained a "staunchly non-commercial" project since the beginning. Those factors underscore Debian's concern for doing "the morally correct" things important to the hobbyist contributor. That is, Debian is composed of volunteers who make their contribution to Debian their first priority, with any affiliation to an employer coming after that. Consequently, Debian still acts on behalf of individual developers' wishes where a commercial entity might not.

Although Debian is fundamentally about people, he said, the project's most visible assets are those people's copyrights on the software in the Debian archive. Partnering with SFC on the Debian Copyright Aggregation Project is a way for individual developers to leverage those assets "to maximize fair treatment of others"—namely, by ensuring that the copyrights of individual Debian contributors are not violated by third parties failing to adhere to the terms of the relevant software license. The DCAP arrangements were agreed to in April by SFC and then Debian Project Leader (DPL) Lucas Nussbaum, with consultation from Software in the Public Interest (SPI).

DCAP has three dimensions. First, SFC can now accept copyright-assignment agreements from any Debian contributors who choose to participate. Participants can assign any subset of their copyrights that they choose to SFC. Second, any contributors who are not interested in assigning copyrights (which is an essentially permanent arrangement) have the option of signing an enforcement agreement instead, under which the contributor authorizes SFC to act as an "authorized copyright agent" in license-enforcement actions. That agreement lets developers retain all of their copyrights, and merely allows SFC to conduct enforcement work on their behalf.

The enforcement agreement specifically does not empower SFC to pursue litigation, he added in response to an audience member's question. A contributor interested in making that arrangement could raise it with SFC, but it would require coming to a separate agreement.

Third, SFC will provide license consulting, advice, and compliance services to Debian on an ongoing basis, in coordination with the DPL. The consultation service means that SFC will provide a certain number of pro-bono hours each month to answer questions forwarded by the DPL and to provide policy-related advice.

In the long run, Kuhn said, copyright assignment is a practical tool. It is a sad fact that developers (like everyone else) inevitably pass away or drop out of the project, at which point defending their copyrights becomes arduous at best, if not impossible. Others simply forget to defend their copyrights because they get busy. For those that care deeply about protecting their contributions to free software, the aggregation project will hopefully make the process easier.

Copyright assignment can be a thorny issue, Kuhn admitted. Critics will point to copyright assignment as a tool that companies sometimes use to take decision-making power out of the hands of the developers they employ. But that strategy—which tends to involve producing a proprietary version of a software product as well as an open-source version—only works when the company gets 100% of the copyrights involved. Debian will always be a multi-copyright project, so there is no chance that anyone (the SFC included) could turn copyright assignments against it. Furthermore, he said, assigning copyright to SFC is different from assigning it to a company, because SFC is a US charity and, under US law, a charity cannot be sold.

DCAP is designed to be flexible. Participation is entirely optional, and the enforcement agreements can be canceled at any time (with 30 days notice). Kuhn noted that another former DPL, Stefano Zacchiroli, was the first to sign up for DCAP. Zacchiroli assigned all his copyrights in Debian to the SFC, "past, present, and future." Paul Tagliamonte, in contrast, assigned SFC a subset of the copyrights on his Debian contributions via DCAP.

Developers must currently contact SFC by email (at debian-services@sfconservancy.org) to sign up for the project, but the joint announcement indicates that a self-service enrollment system is under development. Kuhn ended the talk by noting that the copyright-assignment and enforcement-agreement options exist to give Debian contributors a range of choices. He predicted that the enforcement agreement would be the more popular of the two, particularly since Harald Welte's gpl-violations.org project had shut down, and added that he was happy to be able to give something back to the Debian community after years of being a happy user.

[The author would like to thank the Debian project for travel assistance to attend DebConf 2015.]

Comments (2 posted)

Debian and binary firmware blobs

By Nathan Willis
August 26, 2015

DebConf

Debian's annual DebConf event is part conference, part hackathon; various teams and ad-hoc groups meet up over the course of the week to discuss future plans, get work done, and make decisions that are best reached with face-to-face conversation. At the 2015 DebConf, one of those face-to-face conversations dealt with the thorny problem of how Debian should handle binary firmware blobs. Because Debian is a dedicated free-software project, including proprietary firmware in the installation images offered to users is out of the question to most contributors. But that stance makes Debian impossible to install for at least some small percentage of would-be users—which is far from ideal. Nevertheless, the project may have hashed out a way to move forward.

The issue was explored in depth at an August 17 round-table discussion entitled "Firmware - a hard or soft problem?" The session was packed to overflowing; more than 40 people (plus one dog) attempted to cram into the meeting room. Moderating the discussion was Debian developer Steve McIntyre, who leads the debian-cd group responsible for creating and publishing the official Debian ISO images.

The root problem with binary-only firmware, he said, is that it is now quite common for computers—particularly laptops—to include components that cannot function at all without a loadable firmware blob. This includes "almost every WiFi chipset" on the market, which in turn makes it impossible for some users to even install Debian using the default ISO images—because those images quite deliberately do not include any non-free software. Various binary firmware blobs are available through Debian; they currently live in the "non-free" archive area alongside the Adobe Flash plugin and various other proprietary programs.

Most Debian project members recognize the inconvenience (and even counter-productivity) of this situation, and Debian has historically relied on an inelegant workaround. Namely, unofficial ISO images are built that include the binary firmware blobs necessary to bootstrap a Debian installation. To avoid being seen as an endorsement of non-free software, though, those unofficial images are not advertised and project members only direct new users to them reluctantly. "It's a pain," McIntyre said in summary.

Several possible solutions have been proposed in the past. One, for instance, was providing a downloadable tar archive of all of the binary firmware. If the installer determines that a given system requires a firmware blob, it could point the user to the firmware archive URL. This approach was rejected as unworkable because fetching and loading the tar archive during installation may not be practical. Many of the target computers may not have a second USB port (the installation media occupying one, of course) and users may not be able to run off and find a second memory stick.

Putting the tarball on a second partition on the installer USB stick was discussed, but evidently Windows makes it difficult or impossible to use more than one partition on a USB stick. That would inconvenience Windows users trying to switch to Debian. Given those hurdles, the tarball approach was deemed to not make life easier for end users than the current, unofficial-ISO approach.

It had also been suggested that Debian could simply enable the non-free section by default, and count on educating users to keep the distinction between free and proprietary software clear. This, too, was rejected—even if those participating in the discussion favored it (which they did not), the change would require a General Resolution, and many similar votes have happened in the past, all reconfirming Debian's commitment to not enabling non-free by default.

The proposal that did garner support, though, was splitting the non-free section into two or more parts based on the type of content it contained. Non-free firmware would be one of those; possibly other sections (like one for non-free documentation) would be created as well. In any case, such a split would underscore that there is an important difference between non-free firmware needed to get Debian installed and other proprietary applications or libraries.

Most in the room seemed to agree that this split makes sense. After all, one member of the group said, at least hardware that needs loadable firmware offers the possibility that free firmware will be developed and can be used at a later date; firmware that is burned in does not offer that hope. Whether firmware will be the only section to be carved out into a separate archive component is a matter that is still up for debate. Several other divisions were suggested, but deemed out-of-scope for the session.

However the split is implemented (a task which will ultimately be up to the Debian FTP masters), the next question will be how the availability of the firmware archive should be communicated to the user. Enabling it by default remains out of the question, but the attendees agreed that it was best to communicate the situation to the user during the installation process and give them the opportunity to enable the firmware archive and continue. As of now, a user attempting to install Debian on a machine that requires binary firmware will see that install fail, and only find clues to how the situation can be resolved by reading through some not-well-advertised text files.

Most thought that a "friendly" explanation of the issue was needed—one that said, in essence, "Because the hardware manufacturer of this component does not provide software we can distribute, Debian cannot run on this machine unless you install this non-free add-on" and provided links to more details about the free-software issues involved. All agreed that the wording of this message was critical; several commented that they liked the message that Canonical started using several years ago in its alerts when non-free drivers were needed.

A question was raised in the session about including an "email the manufacturer to ask for free-software support" tool in the installer, similar to the "write your Congressperson" advocacy seen in politics. While most agreed that encouraging some form of free-software advocacy was a worthy goal, the consensus was that identifying the correct company to email might not be possible. In many cases, the real culprit is a chipset maker, not the device maker, and problematic devices routinely switch to new chipsets without changing their USB device IDs. Since those device IDs are all Debian has access to, it may not be possible to unambiguously decide who should get the user's advocacy email.

This is a topic with many angles and plenty of nuance; in the interest of simplifying matters, the participants even agreed to avoid the question of how non-free firmware would impact efforts to get Debian endorsed by the Free Software Foundation. The session had to be drawn to a close before any firm plans were put together. But, as of now, Debian does seem prepared to provide separate access to the non-free firmware many users need to start using Debian in the first place. Doing so without compromising the project's longstanding commitment to free software requires a delicate balancing act, but project members appear to be willing to undertake the task.

[The author would like to thank the Debian project for travel assistance to attend DebConf 2015.]

Comments (3 posted)

Brief items

Distribution quotes of the week

NOBODY expects the Debian acquisition!
-- Romain Francoise

Debian's reached the age of 22
I wish I could be there with you
In Heidelberg, fair German city
To share, in person, this my ditty

...

Free software, arguments, warmth, good cheer
Too soon all over 'til next year
All of the best are there / on 'Net
Here's hope that it's the best Debconf yet

-- Andrew Cater

Once I got over the thrill of being the “superuser,” the unspeakable power I had previously seen only behind plate glass, I became enraptured not so much by Linux itself as by the process in which it had been created—hundreds of people hacking away at their own little corner of the system and using the Internet to swap code, slowly but surely making the system better with each change—and set out to make my own contribution to the growing community, a new distribution called Debian that would be easier to use and more robust because it would be built and maintained collaboratively by its users, much like Linux.
-- Ian Murdock

We do work that is important and often unpaid. We tend to have deep technical skills but exercise them in huge communities where interpersonal issues become magnified. We are activists and artists and architects all at once. We're changing the world in ways that are often unnoticed not only by the public, but by ourselves. This is true of the entire FOSS world, but it seems especially true of Gentoo.
-- Rich Freeman

Comments (4 posted)

Distribution News

Debian GNU/Linux

Debian and Software Freedom Conservancy announce Copyright Aggregation Project

Software Freedom Conservancy's Bradley M. Kuhn has announced the Conservancy's Debian Copyright Aggregation Project. "This new project, formed at the request of Debian developers, gives Debian contributors various new options to ensure the defense of software freedom. Specifically, Debian contributors may choose to either assign their copyrights to Conservancy for permanent stewardship, or sign Conservancy's license enforcement agreement, which delegates to Conservancy authority to enforce Free Software licenses (such as the GNU General Public License). Several Debian contributors have already signed both forms of agreement."

Full Story (comments: none)

Debian Installer Stretch Alpha 2 release

The second alpha of the Debian installer for 'Stretch' (version 9) has been released. The biggest change in this version is the update of the Linux kernel from the 4.0 series to the 4.1 series.

Full Story (comments: none)

Squeeze non-LTS architectures moving to archive.debian.org

The Long Term Support effort for Debian 6.0 'squeeze' only covers i386/amd64 architectures. Non-LTS architectures will move to archive.debian.org. "This does not (yet) affect other Squeeze suites, like backports, but they will follow soonish."

Full Story (comments: none)

Bits from the Wanna Build team

Mehdi Dogguy reports on the Wanna Build team meeting at DebCamp. "We have worked on getting arch:all packages buildable on our autobuilders. We've got a few patches added to make that happen. Architecture independent packages (arch:all) are now auto-built on dedicated amd64 builders. We tested our changes as much as we were able to and enabled arch:all uploads for Sid and Experimental. If your auto-built arch:all package doesn't make it through to ftp-master's archive, please do contact us so that we can have a look and get it fixed quickly."

Full Story (comments: none)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

The State of Fedora: 2015 Edition (Fedora Magazine)

Fedora Magazine reports on Fedora project leader Matthew Miller's keynote at Flock, which is the Fedora contributor conference. He outlined the state of the distribution using some graphs and statistics and said "we’re doing very well as a project and it’s thanks to all of you". The use of Internet Relay Chat (IRC) by the project was another topic: "Fedorans do like to work together. Last year there were 1,066 IRC meetings (official meetings, not just being in IRC talking), and 765 IRC meetings in 2015 alone. 'This shows how vibrant we are, but also is buried in IRC. There’s a lot of Fedora activity you don’t see on the Fedora Web site… I want to look at ways to make that more visible,' says Miller. There are efforts to make the activity more visible, says Miller. 'If I want to interact with the project, is somebody there? Yes, but we have millions of dead pages on the wiki… we need to make this more visible.' IRC is 'definitely a measure of engagement' but it’s also a high barrier of entry, says Miller. 'Wow that’s complicated. Wow, that’s still around?' is a common response from new contributors to IRC. The technology, and 'culture' can be confusing."

Comments (21 posted)

Sabayon Linux development in 2015

Sabayon developer Joost Ruis takes a look at recent developments in the Sabayon Linux project, including new Docker images. "These are forked directly from a Gentoo stage3 docker image. The result is a very clean chroot that is even closer to Gentoo. Our docker pulls in the stage3, adds Sabayon overlay, installs Entropy to a point where it can run. Then it checks the Portage database to list what packages are installed and replaces them with the packages from Entropy. (Ain’t that cool?). Now we can keep our minimal chroot current and easy make changes whenever we want. The docker base image is then being “squashed” so we can feed it as an image to our Molecule™ script that will build our iso images for us. With this move we also made the creation of spins more accessible to developers! Go fork us!"

Comments (none posted)

Page editor: Rebecca Sobol

Development

New features and new widgets in GTK+

By Nathan Willis
August 26, 2015

GUADEC
At the 2015 edition of GUADEC in Gothenburg, Sweden, a series of talks addressed the most recent work on the GTK+ widget toolkit. Matthias Clasen covered the most ground, describing updates to eight existing GTK+ widgets, while Timm Bäder and Matthew Waters presented new GTK+ widgets for image handling and GStreamer media pipelines, respectively.

Improved widgets and controls

Clasen styled his talk ("GTK+ can do this?") as a walkthrough of little-known options and tricks available in the toolkit. Many of the examples involve recent additions to the toolkit, though some of them predate the current GTK+ development cycle. There were enough tips and secrets, he said, that he hoped everyone in the audience would be able to say that they had learned something new.

First, he addressed scrollbar widgets. Recent changes to GTK+ scrollbars include support for kinetic scrolling and GTK+'s frame-synchronization protocol (which ensures that scrolling appears smooth). But there are little-known features available as well. Some users and developers have grumbled that GTK3 scrollbars lacked the up/down "stepper" buttons of earlier versions; Clasen explained that these steppers can now be added with a simple CSS rule:

    .scrollbar {
      -GtkScrollbar-has-forward-stepper: true;
      -GtkScrollbar-has-secondary-backward-stepper: true;
    }

Similarly, GTK3 scrollbars now support moving through a window in page-length increments with shift-clicks, as well as smooth scrolling via a right click.

[Matthias Clasen]

Clasen's second topic was output-only windows. Previously, input events that happened to land in a GTK+ overlay (say, a tooltip or transient popover) would get passed to the application's top-level parent window, which is not always what the developer desired. In many instances, the correct behavior is to pass the event to a window (say, a toolbar) somewhere in the middle of the hierarchy, and GTK3 now supports this with a GtkOverlay::pass-through property. He also noted that the pass-through functionality can be used to draw decorative overlays.

Clasen then showed how developers can add their own content to the popovers that appear when the user triggers a touch-selection event. The signal is called GtkTextView::populate-popup and, in addition to adding custom options for touch-selection popovers, it can be exploited to customize the right-click context menu of any GTK+ widget. That even includes scrollbar widgets, he noted. "I don't know why you would and I'm not sure it's a good idea," he added, "but if you need to do it, you can."

Arguably more practical than context menus on a scrollbar were two enhancements to GTK+ controls that Clasen demonstrated. One is that spinbuttons, which traditionally allow the user to enter a numeric value by clicking + or - buttons, can now be customized. The range of acceptable values has always been configurable, but now the labels can be, too. He showed an example where the underlying values on a "month" spinbutton were restricted to the interval 1–12, but where each value was presented as the corresponding month name. This was followed by examples that displayed the underlying numeric values as clock times and as hexadecimal digits.

The other enhanced control is the slider, where the user moves a handle across a scale to set the value. But a slider may not correspond to a variable that can accept continuous input. In previous versions of GTK+, it was possible to add tick marks to the scale so that the user could see discrete values along the slider, but it was still possible for the user to leave the slider in between the markers. This has now been fixed; developers can mark a slider as accepting a set of discrete values, and the handle will "stick" to the nearest acceptable value as the user moves it along the scale. To make the feature work, developers will also need to set the round-digits property on the scale, so that only discrete steps are returned:

    <object class="GtkScale">
      <property name="round-digits">0</property>
    </object>

There are also two new text-related features in GTK+, Clasen said. First, text-view widgets now support Pango markup, which allows the text to be styled or colored and allows various font features (like spacing or character variants) to be activated. Second, any Pango text can be turned into a Cairo path using pango_cairo_layout_path(), which allows it to then be manipulated with a wide variety of Cairo tools and transformations. This should be done with great care, he said, particularly since the Pango-to-Cairo conversion is not very efficient.

For each feature he discussed, Clasen showed example code as well as a live demo using the gtk3-demo application. For those who missed the talk (and until a video of the session is published) his slides [PDF] are available online, as is a blog post showing screenshots of many of the demonstrated features.

New widgets

There were, as one might expect, several other talks that addressed new or ongoing work within GTK+. Emmanuele Bassi gave an update on his work creating the spiritual successor to the Clutter toolkit, GTK+ Scene Graph Kit (GSK), although he said the code was not yet ready to be released for mass consumption. Several of the LibreOffice talks referenced Pranav Kant's Google Summer of Code (GSoC) project with GNOME Documents, in which he made progress toward a new GTK+ widget for accessing a LibreOffice document from any GTK+ program.

Another GSoC intern, Timm Bäder, presented a lightning talk about his project: a GTK+ widget named GtkImageView. The widget's purpose, he said, is to provide developers with a convenient way to show images to users—in particular, large images that are too big for GtkImage. That existing widget is optimized for icons, button images, and similar small content. It starts to break down when loading large images, however.

The new widget can load any image type supported by gdk-pixbuf, and it loads content asynchronously. It also implements scaling and rotation functions, and it supports GTK+'s internal scale-factor setting, so it works on high-DPI displays. Bäder is still at work on the widget, he said, and may add more features in the future, such as touch-gesture support.

[Matthew Waters]

The last new GTK+ widget discussed at the event was Matthew Waters's gtkgst, a widget for displaying the output of a GStreamer pipeline in a GTK+ application. Obviously both GTK+ and GStreamer are mature projects at this point, so Waters started off his talk by explaining the difficulties of working with both of them in a single application.

The main difficulty is that GStreamer pipelines are inherently complex beasts: they have to handle a wide range of video codecs, color spaces, scaling factors, and effects when showing a video—and even more variables when generating or editing video. Historically, embedding a video in a GTK+ window has added even more complications, requiring the application to push key and mouse events into GStreamer, to notify the GStreamer video sink of resize events, and to perform careful setup that is highly dependent on the details of the windowing system.

Waters's new widget is an attempt to wrap such details into a more convenient package. The code lives in GStreamer's "plugins-bad" package for now (although, when stable, it will likely move to "plugins-good"). An application developer only needs to set up the GStreamer video pipeline they require and connect it to the gtkgst sink; that will provide a GTK+ widget that can be placed anywhere in the widget hierarchy. That allows for clean separation between the GStreamer and GTK+ sides of the code, which should simplify development and troubleshooting. In response to an audience question, he said that gtkgst renders video far more smoothly than Clutter.

The current implementation renders the video into a GtkDrawingArea, but Waters is in the process of implementing it using OpenGL. That would enable hardware acceleration and multithreading, he said, although there are a number of challenges to overcome before it is ready for general usage. Both GTK+ and GStreamer have supported OpenGL for some time, but hooking them up to one another is not quite trivial. His code works on X11 and Wayland so far, and he hopes to add Mac OS X and Windows support in the future.

GTK+ is approaching 20 years of age and, while there are certainly longer continuously running projects in free software, it can be easy for a project of that age to stop adapting to ever-changing circumstances and developer expectations. Nevertheless, the toolkit seems resilient and continues to take on new uses with each passing release.

[The author would like to thank the GNOME Foundation for travel assistance to attend GUADEC 2015.]

Comments (2 posted)

Glibc wrappers for (nearly all) Linux system calls

By Jonathan Corbet
August 20, 2015
The GNU C Library (glibc) is a famously conservative project. In the past, that conservatism created a situation where there was no way to directly call a number of Linux system calls from a glibc-using program. As glibc has relaxed a bit in recent years, its developers have started to reconsider adding wrapper functions for previously inaccessible system calls. But, as the discussion shows, adding these wrappers is still not as straightforward as one might think.

A C programmer working with glibc now would look in vain for a straightforward way to invoke a number of Linux system calls, including futex(), gettid(), getrandom(), renameat2(), execveat(), bpf(), kcmp(), seccomp(), and a number of others. The only way to get at these system calls is via the syscall() function. Over the years, there have been requests to add wrappers for a number of these system calls; in some cases, such as gettid() and futex(), the requests were summarily rejected by the (at-the-time) glibc maintainer in fairly typical style. More recently these requests have been reopened and others have been entertained, but there have been no system-call wrappers added since glibc 2.15, corresponding roughly to the 3.2 kernel.

On the face of it, adding a new system-call wrapper should be a simple exercise. The kernel has already defined an API for the system call, so it is just a matter of writing a simple function that passes the caller's arguments through to the kernel implementation. Things quickly get more complicated than that, though, for a number of reasons, but they all come down to one root cause: glibc is not just a wrapper interface for kernel-supplied functionality. Instead, it provides a (somewhat standard-defined) API that is meant to be consistent and independent of any specific operating system.

There are provisions for adding kernel-specific functions to glibc now; those functions will typically fail (with errno set to ENOSYS) when called on a kernel that does not support them. Examples of such functions include the Linux-specific epoll_wait() and related system calls. As a general rule, though, the glibc developers, as part of their role maintaining the low-level API for the GNU system, would like to avoid kernel-specific additions.

This concern has had the effect of keeping a lot of Linux system-call wrappers out of the GNU C Library. It is not necessarily that the glibc developers do not want that functionality, but figuring out how a new function would fit into the overall GNU API is not a straightforward task. The ideal interface may not (from the glibc point of view) be the one exposed by the Linux kernel, so another may need to be designed. Issues like error handling, thread safety, support on non-Linux systems, and POSIX-thread cancellation points can complicate things considerably. In many cases, it seems that few developers have wanted to run the gauntlet of getting new system-call wrappers into the library, even if the overall attitude toward such wrappers has become markedly more friendly in recent years.

Back in May 2015, Joseph Myers proposed relaxing the rules just a little bit, at least in cases when the functionality provided by a wrapper might be of general interest. In such cases, Joseph suggested, there would be no immediate need to provide support for other operating-system kernels unless somebody found the desire and the time to do the work.

Roland McGrath is, by his own admission, the hardest glibc developer to convince about the value of adding Linux-specific system calls to the library. He still does not see a clear case for adding many Linux system-call wrappers to the core library; it is only clear, he said, when the system call is to be a part of the GNU API:

My top concern is adding cruft to the core libc ABIs. That means specifically symbols in the shared objects for libc, libpthread, librt, libdl, libm, and libutil.

I propose that we rule out adding any symbols to the core libc ABIs that are not entering the OS-independent GNU API.

Roland does not seem to believe that glibc should entirely refuse to support system calls that don't meet the above criterion, though. Instead, he suggested creating another library specifically for them. It would be called something like "libinux-syscalls" (so that one would link with "-linux-syscalls"). Functions relegated to this library should be simple wrappers, without internal state, with the idea that supporting multiple versions of the library would be possible.

There was some discussion on the details of this idea, but the core of it seems to be relatively uncontroversial. Also uncontroversial is the idea that glibc need not provide wrappers for system calls that are obsolete, that cannot be used without interfering with glibc (set_thread_area() is an example), or those that are expected to have a single caller (such as create_module()). So Carlos O'Donell has proposed a set of rules that would clear the way for the immediate addition of operating-system-independent system calls into the core and the addition of a system-dependent library for the rest.

Of course, "immediate" is a relative term. Any system-call wrappers will still need to be properly implemented and documented, with test cases and more. There is also, in some cases, the more fundamental question of what the API should look like. Consider the case of the futex() system call, which provides access to a fast mutual-exclusion mechanism. As defined by the kernel, futex() is a multiplexer interface, with a single entry point providing access to a range of different operations.

Torvald Riegel made the case that exposing this multiplexer interface would do a disservice to glibc users:

Keeping the multiplexing is bad for users. Can you tell me off-hand what goes in "uaddr2", "val", or "val3" for all the ops? Is it easy to remember based on the function signature? Can you remember in which cases "timeout" is actually "val2" and not a pointer but cast to uint32_t? So are we going to expect users to cast uint32_t's to a pointer to call one of the operations and consider that a useful API design? It's a nice way to potentially trigger compiler warnings though.

He proposed exposing a different API based around several functions with names like futex_wake() and futex_wait(); he also posted a patch implementing this interface. Joseph, while not disagreeing with that interface, insisted that the C library should provide direct access to the raw system call, saying: "The fact that, with hindsight, we might not have designed an API the way it was in fact designed does not mean we should embed that viewpoint in the choice of APIs provided to users". In the end, the two seemed to agree that both types of interface should, in some cases, be provided. If the C library can provide a useful higher-level interface, that may be appropriate to add, but more direct access to the system call as provided by the kernel should be there too.

The end result of all this is that we are likely to see a break in the logjam that has kept new system-call wrappers out of glibc. Some new wrappers could even conceivably show up in the 2.23 release, which can be expected sometime around February 2016. Even if the attitude and rules have changed, though, this is still glibc we are talking about, so catching up with the kernel may take a while yet. But one can take comfort in the fact that a path is now visible, even if it may yet be a slow one.

Comments (36 posted)

Brief items

Quotes of the week

"Software as a service" is a competitor to "software."
— Asheesh Laroia at DebConf

You're trying, awesome! But when your customer service video for Linux support starts with "Apply kernel patches", you have already failed.
— Sarah Sharp

Comments (5 posted)

Glibc 2.22 released

Version 2.22 of the GNU C Library is out. The biggest user-visible changes are an update to Unicode 7.0.0 and the addition of a vectorized math library for the x86_64 architecture. Beyond that, of course, there is a pile of bug fixes, a few of which address security-related problems.

Full Story (comments: 22)

Rkt 0.8 released

Version 0.8 of the rkt container specification has been released. The changelog notes that this version adds support for running under the LKVM hypervisor and adds experimental support for user namespaces. Other features include improved integration with systemd and additional functional tests. An accompanying blog post goes into further detail for many of these new features.

Comments (1 posted)

WordPress 4.3 released

Version 4.3 of the WordPress blogging platform has been released. New features include keyboard shortcuts for formatting text while editing posts, a site-icon creator, and support for sending password-reset links to users (rather than emailing users their lost passwords).

Comments (none posted)

Go 1.5 released

Version 1.5 of the Go language has been released. "This release includes significant changes to the implementation. The compiler tool chain was translated from C to Go, removing the last vestiges of C code from the Go code base. The garbage collector was completely redesigned, yielding a dramatic reduction [PDF] in garbage collection pause times. Related improvements to the scheduler allowed us to change the default GOMAXPROCS value (the number of concurrently executing goroutines) from 1 to the number of available CPUs. Changes to the linker enable distributing Go packages as shared libraries to link into Go programs, and building Go packages into archives or shared libraries that may be linked into or loaded by C programs (design doc)."

Comments (162 posted)

KDE Ships Plasma 5.4.0, Feature Release for August

KDE has released Plasma 5.4 with some new features. "This release of Plasma brings many nice touches for our users such as much improved high DPI support, KRunner auto-completion and many new beautiful Breeze icons. It also lays the ground for the future with a tech preview of Wayland session available. We're shipping a few new components such as an Audio Volume Plasma Widget, monitor calibration tool and the User Manager tool comes out beta."

Comments (16 posted)

ArgyllCMS 1.8.0 released with support for SwatchMate Cube colorimeter (Libre Graphics World)

Libre Graphics World has posted a look at the latest release of the ArgyllCMS color-management system. New in the release is support for several new hardware colorimeters, from a low-cost Kickstarter-funded device to a € 2,800 professional tool.

Comments (none posted)

ownCloud Desktop Client 2.0 is available

Version 2.0 of the desktop client for ownCloud has been released. This update adds support for working with multiple ownCloud accounts and a setting to let users synchronize only files underneath a specified file size.

Comments (none posted)

Newsletters and articles

Development newsletters from the past two weeks

Comments (none posted)

Schaller: An Open Letter to Apache Foundation and Apache OpenOffice team

Christian Schaller has posted an open letter to the Apache Software Foundation with a non-trivial request: "So dear Apache developers, for the sake of open source and free software, please recommend people to go and download LibreOffice, the free office suite that is being actively maintained and developed and which has the best chance of giving them a great experience using free software. OpenOffice is an important part of open source history, but that is also what it is at this point in time."

In this context, it's interesting to note that OpenOffice project chair Jan Iversen recently stepped down, listing resistance to an effort to cooperate with LibreOffice as one of the main reasons. The project currently looks set to name Dennis Hamilton (who is running unopposed) as its new chair.

Comments (146 posted)

Mozilla: The Future of Developing Firefox Add-ons

Mozilla has announced a significant set of changes for authors of Firefox add-ons. These include a new API (and the deprecation of XUL and XPCOM), a process-based sandboxing mechanism, mandatory signing of extensions, and more. "For our add-on development community, these changes will bring benefits, like greater cross-browser add-on compatibility, but will also require redevelopment of a number of existing add-ons. We’re making a big investment by expanding the team of engineers, add-on reviewers, and evangelists who work on add-ons and support the community that develops them. They will work with the community to improve and finalize the WebExtensions API, and will help developers of unsupported add-ons make the transition to newer APIs and multi-process support."

Comments (90 posted)

Page editor: Nathan Willis

Announcements

Brief items

The Open Mainframe Project

The Linux Foundation has announced the launch of the Open Mainframe Project. "In just the last few years, demand for mainframe capabilities have drastically increased due to Big Data, mobile processing, cloud computing and virtualization. Linux excels in all these areas, often being recognized as the operating system of the cloud and for advancing the most complex technologies across data, mobile and virtualized environments. Linux on the mainframe today has reached a critical mass such that vendors, users and academia need a neutral forum to work together to advance Linux tools and technologies and increase enterprise innovation."

Comments (15 posted)

GUADEC videos released

GUADEC was held in Gothenburg, Sweden on August 7–9. Videos of the presentations are available.

Comments (none posted)

FSF30: Get in on the party and User Freedom Summit

The Free Software Foundation will have a 30th birthday party in Boston, Massachusetts on October 3. There will be a User Freedom Summit in the daytime, before the party. "We know that not every free software fan can join us in person in Boston -- so we're hosting a party network where you can promote your own party (we'll even offer some ideas for making your event lots of fun!) We'll have a livestream of the Boston party, and welcome photos and reports from your own parties, too!"

Full Story (comments: none)

Articles of interest

Happy 24th birthday, Linux kernel (Opensource.com)

Opensource.com wishes Linux a happy 24th birthday, with a brief timeline of Linux history. "There's some debate in the Linux community as to whether we should be celebrating Linux's birthday today or on October 5 when the first public release was made, but Linus says he is O.K. with you celebrating either one, or both! So as we say happy birthday, let's take a quick look back at the years that have passed and how far we have come."

Comments (2 posted)

Ubuntu on the Mainframe: Interview with Canonical's Dustin Kirkland (Linux.com)

Linux.com has an interview with Dustin Kirkland of Canonical's Ubuntu Product and Strategy team, about Ubuntu on the mainframe and more. "Canonical is doing a lot of different things in the enterprise space, to solve different problems. One of the interesting works going on at Canonical is Fan networking. We all know that the world is running out of IPv4 addresses (or already has). The obvious solution to this problem is IPv6, but it’s not universally available. Kirkland said, "There are still places where IPv6 doesn't exist -- little places like Amazon web services where you end up finding lots of containers." The problem multiplies as many instances in cloud need IP addresses. "Each of those instances can run hundreds of containers, each of those containers then needs to be addressable," said Kirkland."

Comments (none posted)

Calls for Presentations

Speaking Opportunities: O'Reilly Fluent 2016 - Developing the Web

The O'Reilly Fluent Conference will take place March 8-10, 2016 in San Francisco, CA. "Fluent covers the full scope of the Web Platform and associated technologies, including WebGL, CSS3, mobile APIs, Node.js, AngularJS, ECMAScript 6, and more." The call for papers closes September 21.

Full Story (comments: none)

CFP Deadlines: August 27, 2015 to October 26, 2015

The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.

Deadline | Event Date(s) | Event | Location
August 31 | November 21–November 22 | PyCon Spain 2015 | Valencia, Spain
August 31 | October 19–October 22 | Perl Dancer Conference 2015 | Vienna, Austria
August 31 | November 5–November 7 | systemd.conf 2015 | Berlin, Germany
August 31 | October 9 | Innovation in the Cloud Conference | San Antonio, TX, USA
August 31 | November 10–November 11 | Open Compliance Summit | Yokohama, Japan
September 1 | October 1–October 2 | PyConZA 2015 | Johannesburg, South Africa
September 6 | October 10 | Programistok | Białystok, Poland
September 12 | October 10 | Poznańska Impreza Wolnego Oprogramowania | Poznań, Poland
September 15 | November 9–November 11 | PyData NYC 2015 | New York, NY, USA
September 15 | November 14–November 15 | NixOS Conference 2015 | Berlin, Germany
September 20 | October 26–October 28 | Samsung Open Source Conference | Seoul, South Korea
September 21 | March 8–March 10 | Fluent 2016 | San Francisco, CA, USA
September 25 | December 5–December 6 | openSUSE.Asia Summit | Taipei, Taiwan
September 27 | November 9–November 11 | KubeCon | San Francisco, CA, USA
September 28 | November 14–November 15 | PyCon Czech 2015 | Brno, Czech Republic
September 30 | November 28 | Technical Dutch Open Source Event | Eindhoven, The Netherlands
September 30 | November 7–November 8 | OpenFest 2015 | Sofia, Bulgaria
September 30 | December 27–December 30 | 32. Chaos Communication Congress | Hamburg, Germany
October 1 | April 4–April 6 | Web Audio Conference | Atlanta, GA, USA
October 2 | October 29 | FOSS4G Belgium 2015 | Brussels, Belgium
October 2 | December 8–December 9 | Node.js Interactive | Portland, OR, USA
October 15 | November 21 | LinuxPiter Conference | Saint-Petersburg, Russia

If the CFP deadline for your event does not appear here, please tell us about it.

Upcoming Events

Events: August 27, 2015 to October 26, 2015

The following event listing is taken from the LWN.net Calendar.

Date(s)                  Event                                                            Location
August 28-September 3    ownCloud Contributor Conference                                  Berlin, Germany
August 29                EmacsConf 2015                                                   San Francisco, CA, USA
September 2-6            End Summer Camp                                                  Forte Bazzera (VE), Italy
September 10-12          FUDcon Cordoba                                                   Córdoba, Argentina
September 10-13          International Conference on Open Source Software Computing 2015  Amman, Jordan
September 11-13          vBSDCon 2015                                                     Reston, VA, USA
September 15-16          verinice.XP                                                      Berlin, Germany
September 16-18          PostgresOpen 2015                                                Dallas, TX, USA
September 16-18          X.org Developer Conference 2015                                  Toronto, Canada
September 19-20          WineConf 2015                                                    Vienna, Austria
September 21-23          Octave Conference 2015                                           Darmstadt, Germany
September 21-25          Linaro Connect San Francisco 2015                                San Francisco, CA, USA
September 22-24          NGINX Conference                                                 San Francisco, CA, USA
September 22-23          Lustre Administrator and Developer Workshop 2015                 Paris, France
September 23-25          LibreOffice Conference                                           Aarhus, Denmark
September 23-25          Surge 2015                                                       National Harbor, MD, USA
September 24             PostgreSQL Session 7                                             Paris, France
September 25-27          PyTexas 2015                                                     College Station, TX, USA
September 28-30          Nagios World Conference 2015                                     Saint Paul, MN, USA
September 28-30          OpenMP Conference                                                Aachen, Germany
September 29-30          Open Source Backup Conference 2015                               Cologne, Germany
September 30-October 2   Kernel Recipes 2015                                              Paris, France
October 1-2              PyConZA 2015                                                     Johannesburg, South Africa
October 2-4              PyCon India 2015                                                 Bangalore, India
October 2-3              Ohio LinuxFest 2015                                              Columbus, OH, USA
October 5-7              LinuxCon Europe                                                  Dublin, Ireland
October 5-7              Qt World Summit 2015                                             Berlin, Germany
October 5-7              Embedded Linux Conference Europe                                 Dublin, Ireland
October 8                OpenWrt Summit                                                   Dublin, Ireland
October 8-9              CloudStack Collaboration Conference Europe                       Dublin, Ireland
October 8-9              GStreamer Conference 2015                                        Dublin, Ireland
October 9                Innovation in the Cloud Conference                               San Antonio, TX, USA
October 10               Programistok                                                     Białystok, Poland
October 10               Poznańska Impreza Wolnego Oprogramowania                         Poznań, Poland
October 10-11            OpenRISC Conference 2015                                         Geneva, Switzerland
October 14-16            XII Latin American Free Software                                 Foz do Iguacu, Brazil
October 17               Central Pennsylvania Open Source Conference                      Lancaster, PA, USA
October 18-20            2nd Check_MK Conference                                          Munich, Germany
October 19-23            Tcl/Tk Conference                                                Manassas, VA, USA
October 19-22            ZendCon 2015                                                     Las Vegas, NV, USA
October 19-22            Perl Dancer Conference 2015                                      Vienna, Austria
October 21-22            Real Time Linux Workshop                                         Graz, Austria
October 23-24            Seattle GNU/Linux Conference                                     Seattle, WA, USA
October 24-25            PyCon Ireland 2015                                               Dublin, Ireland

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2015, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds