
LWN.net Weekly Edition for September 22, 2016

Variations fonts and OpenType 1.8

By Nathan Willis
September 21, 2016

ATypI

The first day of the 2016 ATypI conference was marked by the release of a major update to the OpenType font format. A panel of speakers representing the major players in font technology over the past several decades announced the update together. Although there are several changes under the hood, the key new feature is the ability for a single font file to encode sets of delta values that programs can use to automatically interpolate changes in font weight, width, and other features, each throughout a continuous range. When software support is fully rolled out, that will all but eliminate the need to distribute font families as collections of individual font files, as well as remove many longstanding assumptions about font usage.

Variations

[Peter Constable]

Peter Constable from Microsoft started the session off, noting that the results being presented were a year in the making, based on discussions held at ATypI last year. But, he added, "there is nothing new under the sun." Most of the ideas encapsulated in the new revision have existed in other forms for quite some time.

A variations font (or, alternatively, variable font) is a single font file that can behave like multiple fonts. At runtime, any combination of settings on a continuous variation range (requested by the user or by software) can be used. Variations fonts can have multiple axes of variation (such as weight, width, and slant). There are no intrinsic limits to how many variation axes a font can provide, and there is no need for any particular glyph to be altered by every axis. For example, an "optical size" axis might be expected to increase the relative heights of lower-case glyphs with respect to the upper-case glyphs, but leave the punctuation unaltered.

The variations can affect not just glyph outlines, but font metrics, too: widths, font-wide ascent and descent values, left and right side bearings, horizontal and vertical kerning, and so on. Variations can also affect OpenType layout rules—for instance, by substituting simpler versions of a glyph at small optical sizes or at extremely heavy weights, or by adjusting the positioning rules used in complex scripts like Indic and Arabic. Furthermore, the variation data is stored in a set of newly defined tables, leaving the basic table set untouched. That means that any program incapable of reading the variation data will still see a valid font—just one that has only the default glyph shapes and metrics.

[Variation axes]

The new format is designed to be compact, Constable said. In tests, a two-axis variations font that would be suitable to replace a family of ten traditional instance fonts took up just 30% of the size. Ultimately, the savings could be slightly better than that, he said, although the test font had not yet implemented hinting. This means smaller file sizes for operating systems, embedding in documents, and delivering over the web.

Constable also noted that the new 1.8 specification was an example of cross-industry collaboration. Much of the groundwork now implemented in the OpenType variations font specification comes from ideas implemented in the 1990s in Apple's TrueType GX. In 2015, Microsoft noticed the increasing interest in variable font techniques (such as the responsive typography trend) and reached out to Apple to discuss the matter. Little did they know at that time that Google and Adobe were also discussing the idea but, within a few months, all of the players got together and began working in earnest on the revision.

One benefit of that approach is that the new revision supports all of the various flavors of OpenType, including the TrueType flavor preferred by Microsoft and the Compact Font Format (CFF) flavor initiated by Adobe. Just as importantly, the specification defines not just the data structures used in the files, but the algorithms used to interpret them. So how the values in a font instance are derived from the base font and the variations tables is precisely defined, meaning that there should be no ambiguity in how they are interpreted. This was an issue with TrueType GX, Constable noted: GX fonts could specify their own range of possible weight values, but how Apple chose to interpret that range was not specified. In the OpenType 1.8 implementation, by contrast, weights and the ranges on all other axes must first be normalized to the interval [-1,1] before any interpolation is done.
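That normalization step is simple enough to sketch in a few lines of Python. This is a paraphrase of the algorithm described in the specification, not code from any implementation, and the function name is ours:

    def normalize(value, minimum, default, maximum):
        # Clamp the requested value to the axis's declared range.
        value = max(minimum, min(maximum, value))
        # Values below the default scale against the minimum; values
        # above it scale against the maximum.
        if value < default:
            return (value - default) / (default - minimum)
        if value > default:
            return (value - default) / (maximum - default)
        return 0.0

    # A weight axis running from 100 to 900 with a default of 400:
    normalize(700, 100, 400, 900)   # 0.6
    normalize(250, 100, 400, 900)   # -0.5

Interpolation then happens entirely in this normalized space, which is what makes the results consistent across implementations.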

Axes and instances

[Ned Holbrook]

Apple's Ned Holbrook spoke next, delving a bit further into the specifics. Each axis of variation in a particular variations font has to be declared with a default, minimum, and maximum value. The axes are specified in a table named fvar. Five axes are predefined in the specification: weight, width, slant, italics (as distinct from slant), and optical size. If more axes prove generally useful, they can be added to the predefined list. In the meantime, designers can create and name any axes they wish to—say, stroke contrast.

The font's basic glyph, metric, and layout numbers (in the existing tables, that is) correspond to the default setting. In font-interpolation terms, each variation design included in the font file is called a "master." For each master, the font will include a set of delta values relative to the default. So, for each glyph outline that changes, the font includes the default outline, defined by a set of Bézier curve points, just as in a regular font. Then, for each master, a separate table includes a set of (delta X, delta Y) values, each of which corresponds to the adjustments made to one of the Bézier curve points. In OpenType fonts that use TrueType curves, the name of this variations table is gvar; in CFF-based OpenType fonts, the table is named CFF2.

The generic case used in most examples is a font with two variation axes: weight and width. That provides a two-dimensional [-1,1] space through which the proportions of the font can be varied. The "regular" version of the font would typically be at the origin, [0,0], and the font would include a set of deltas for the four corners: [1,1], [1,-1], [-1,-1], and [-1,1]. Those corners correspond to "maximum weight, maximum width," "maximum weight, minimum width," and so on.

So there are, in essence, five versions of the font encoded into the file. But the real benefit of the format is that a new font instance can be generated by interpolating between the masters provided, enabling a new instance to be created at runtime based only on a (weight, width) tuple requested by the user or software.
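In simplified form, deriving an instance amounts to scaling each master's deltas by how close the requested coordinates are to that master's corner, then adding the results to the default outline. The following sketch assumes the plain four-corner setup described above, with invented delta values; real gvar variation regions are more general than this:

    # Hypothetical two-axis setup: deltas stored per corner master.
    masters = {
        (1, 1): [(10, 4), (12, 0)],     # deltas for two outline points
        (1, -1): [(8, -2), (9, 0)],
        (-1, 1): [(-6, 3), (-7, 0)],
        (-1, -1): [(-5, -1), (-6, 0)],
    }
    default_points = [(100, 200), (300, 200)]

    def instance_points(wght, wdth):
        """Interpolate outline points at normalized (wght, wdth)."""
        points = [list(p) for p in default_points]
        for (mw, md), deltas in masters.items():
            # Per-axis factor: zero unless the request is on the
            # master's side of the default, then linear up to 1 at
            # the master's corner.
            fw = wght / mw if wght * mw > 0 else 0.0
            fd = wdth / md if wdth * md > 0 else 0.0
            scalar = fw * fd
            for point, (dx, dy) in zip(points, deltas):
                point[0] += scalar * dx
                point[1] += scalar * dy
        return [tuple(p) for p in points]

    instance_points(0.5, 0.5)   # halfway toward the (1, 1) master

At (1, 1) only the heavy-wide master contributes, reproducing its design exactly; everywhere else the masters blend.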

[Variations deltas]

Naturally, the existence of that many possible variations can present a challenge for fitting information into the traditional font menu—at least for non-graphic-design apps like word processors. The solution, Holbrook explained, is that variations font files can also include a set of "named instances" that include a traditional font name (e.g., Source Sans Extra Bold) and its associated tuples for the relevant variation axes. The system can choose to present only these options to the user, simplifying usage.
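For those who want to inspect these structures, the Python fontTools library (which gained variable-font support as part of this effort) can dump a font's axes and named instances. A minimal sketch, assuming a variations font named variable.ttf:

    from fontTools.ttLib import TTFont

    font = TTFont("variable.ttf")
    name = font["name"]

    for axis in font["fvar"].axes:
        print("axis %s: min=%g default=%g max=%g" % (
            axis.axisTag, axis.minValue, axis.defaultValue, axis.maxValue))

    for inst in font["fvar"].instances:
        # Each named instance pairs a subfamily name with a tuple of
        # axis coordinates, e.g. {'wght': 700.0, 'wdth': 100.0}.
        print(name.getDebugName(inst.subfamilyNameID), inst.coordinates)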

Looking a bit beyond the generic case, Holbrook explained that a font does not have to declare [0,0] to be its default instance (nor [0,0,0] for a three-axis font, and so on), nor does it have to set its masters at 1 and -1. If the designer chooses to make a font that only covers a small range of weights, for example, that is perfectly acceptable.

There is also a mechanism available to specify more complicated interpolation. By default, the interpolation between any two master designs is linear. But the font can include an optional avar table that specifies a scaling factor to use between specific masters. By including several such scaling regions, one can approximate non-linear interpolation if desired.
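The avar mapping itself is just a per-axis list of (from, to) pairs in normalized coordinates, with linear interpolation between adjacent pairs. A sketch, using an invented segment map and assuming the "from" values increase strictly:

    # Hypothetical weight-axis map: the upper half of the range is
    # bent toward the heavy end.
    segments = [(-1.0, -1.0), (0.0, 0.0), (0.5, 0.8), (1.0, 1.0)]

    def avar_map(value, segments):
        for (x0, y0), (x1, y1) in zip(segments, segments[1:]):
            if x0 <= value <= x1:
                # Linear interpolation within the matching segment.
                return y0 + (value - x0) * (y1 - y0) / (x1 - x0)
        return value

    avar_map(0.25, segments)   # 0.4 rather than 0.25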

Metrics and details

[Behdad Esfahbod]

Next, Google's Behdad Esfahbod discussed how interpolation between masters can affect things other than the glyph outlines. Essentially, any data in the font can be interpolated between masters. As is the case with the Bézier points that make up glyph outlines, changes relative to the default font instance are stored in new tables. Deltas for font-wide metrics are stored in the MVAR table, deltas for horizontal metrics in HVAR, for vertical metrics in VVAR, and for hinting in cvar.

Fonts may also need to vary positioning data, Esfahbod said, such as the attachment points used for diacritics or for glyphs in connected scripts. Glyph substitution rules can also change between masters; for example, in extra-bold weights the designer might wish to swap in a simplified version of a glyph (in comparison to the default) simply to prevent that glyph from turning into a difficult-to-read black blob. Unlike the other kinds of variation data discussed, these positioning and substitution changes are encoded in an existing table, GDEF, but in a new subtable called Item Variation Store.

Constable then spoke briefly about hinting. Hinting data, he said, consists mainly of rules that define relationships between different outline points (such as defining two points as belonging to the same stem) or between an outline point and a font metric (such as defining that a point belongs on the base line). Because those relationships do not usually change, interpolating the hinting between two masters is a relatively simple affair, affecting only some key values that are stored in the control value table (CVT). Hints for masters in a variations font can thus be generated almost automatically (although manual review remains a good idea); in the new revision, any such changes are encoded in the new cvar table.

[David Lemon]

Finally, Adobe's David Lemon spoke about the new CFF2 table, which encodes the glyph-variation deltas for CFF fonts. The original CFF table, he said, was created for PostScript Type 1 fonts, and was grafted into OpenType after the fact. Thus, despite its name ("Compact Font Format"), the table format actually includes quite a few redundancies, duplicating information usually found elsewhere in an OpenType file.

Twenty years later, he said, CFF is only used in OpenType. So the changes in OpenType 1.8 presented Adobe with a chance to do a little "housecleaning" of the format. CFF2 drops the redundant fields, further compacts internal data representations, and makes other improvements—in addition to supporting variations fonts. Among the other notable changes is that the new table format deprecates the rather unloved CID encoding originally developed for Chinese, Japanese, and Korean fonts.

Adobe contributed a CFF2 parser and rasterizer to FreeType, and is on schedule to update its font-building tools to support the new format by the end of September.

There were several other changes made in the new revision of the OpenType specification—Constable mentioned, for example, the MERG table, which lets fonts indicate to the text renderer that certain composite glyphs (e.g., a letter with an attached diacritic) should be cached after they are rendered, not before, as is the default. That will prevent rendering glitches seen when the base character exists in the renderer's cache but the diacritic does not. The format now also fully supports all of the chromatic font approaches proposed by various players in recent years.

But variations fonts are by far the biggest change; in fact, Constable called it the largest change to OpenType since the format's inception. Considerable work remains to be done before end users can reap all of the benefits of the new format, but there are plenty of places where improvements will be noticed. Smaller file sizes will benefit distributors and network services, fully adjustable fonts will benefit layout tools and publishers, and more flexibility in fonts will, hopefully, benefit end users.

Comments (5 posted)

BubbleKern

By Nathan Willis
September 21, 2016

ATypI

At ATypI 2016 in Warsaw, Toshi Omagari presented an open-source tool he has developed to partially automate the repetitive task of generating kerning data for fonts. The program is called BubbleKern and, although it does not fully automate the kerning process, it may strike a balance that many font designers find useful.

Kerning traditionally referred to carving cutouts into the physical metal sorts for particular letters, Omagari said, which we would today refer to as negative sidebearings. It enabled letters with an overhang, like "f", which could otherwise not fit next to standard letters. The wood-type printing era used an approach more like what is done in digital type today, however. Many standard blocks would be cut down so that they would fit together more closely. The visual gap between problematic pairs like "AW" is a common example; Omagari showed images of wood-type blocks cut down with carpentry tools to produce a better fit. That fine-tuning is akin to what designers do in digital type.

[Toshi Omagari]

But, generally speaking, kerning today refers to any kind of conditional spacing. Type designers regularly do kerning for standard alphabetic fonts, while it is a far more complicated process for some specialty fonts like those for typesetting mathematics. Most font editors provide only a basic mechanism for kerning: there is a text preview window, and the user can enter adjustments to change the distance between a pair of letters, as many times as is necessary to achieve an appropriate result.

Periodically, someone makes an attempt to automate or otherwise improve the kerning process in a font editor. Omagari showed screenshots of the oldest such tool he has found: Calamus Type Art, a 1989 program for the Atari that, against all odds, is still available for purchase today; the most recent update was 1999's version 2.0.

Coincidentally, the kerning tool in Calamus Type Art uses steps similar to BubbleKern's approach, although Omagari noted that he only learned about the other program after the fact.

BubbleKern is implemented as a set of scripts for the Glyphs font editor. Using it requires three steps. First, the user creates a drawing layer called "bubble" and, for each glyph of interest, draws a shape that encloses the glyph. These bubbles, he said, are meant to represent the necessary space each glyph requires. Unlike sidebearings, which are a single signed value, the bubbles can take the shape and slope of the glyph into consideration. Drawing the bubbles is a free-form operation; they can include as many straight lines or curves as necessary.

[BubbleKern bubbles]

Second, before the kerning step itself, the user selects all glyphs of interest—running the script against every glyph in a large font is generally regarded as a bad idea given the enormous number of permutations. Finally, the script reduces each bubble to a series of horizontal, rectangular bands 20 points high. Since a standard OpenType font is designed on a 1000-point grid, there are thus 50 bands per glyph. The script then compares pairs of selected glyphs, moving them toward each other until there is a collision between the bands. The adjustment required to make the bubbles touch is then saved as a kern in the font source file.

The process is inevitably an iterative one, Omagari said; the user will want to edit the bubbles and re-run the kerning script. So BubbleKern provides a visual preview of the bubble layer. It is also possible to run the kerning step on a selected sample text rather than to generate all permutations for selected glyphs; that may also save time for some designers.

Several corner cases require special attention. Composite glyphs (such as accented characters) simply inherit the bubbles of their base glyphs (that is, the unaccented character). Glyph pairs that will not ever collide are a more problematic case. The period and quotation marks, for instance, probably require some kerning but, because they do not overlap at any horizontal band, they would simply slide past each other. So BubbleKern's algorithm clamps the maximum possible kerning adjustment to one half of the width of the narrower glyph in the pair. Omagari reports that this seems to be a sensible limit, and it is hardcoded into the script.
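Putting the pieces from the last few paragraphs together, the core computation is easy to sketch in Python. The data layout and names here are ours, not BubbleKern's; the real script works on Glyphs objects:

    BAND_HEIGHT = 20                  # band height in font units
    BAND_COUNT = 1000 // BAND_HEIGHT  # 50 bands on a 1000-unit grid

    def kern_pair(left, right):
        # left and right are dicts with an 'advance' width and a
        # 'bands' list of per-band bubble extents: (min_x, max_x)
        # tuples, or None where the bubble is absent in that band.
        required = None
        for lband, rband in zip(left["bands"], right["bands"]):
            if lband is None or rband is None:
                continue              # no possible collision here
            # The right glyph's origin sits at advance + kern; the
            # bubbles just touch in this band with this kern value.
            k = lband[1] - rband[0] - left["advance"]
            required = k if required is None else max(required, k)
        if required is None:
            # The pair can never collide (e.g. period and quote):
            # clamp to half the width of the narrower glyph.
            return -min(left["advance"], right["advance"]) // 2
        return required

The required value is the largest (least negative) adjustment across all shared bands, so the tightest band wins.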

But there are some unsolved problems as well, even ignoring the fact that sloppily created bubbles will generate poor results. First, if the user draws bubbles around the serifs on their glyphs, they may get unexpected results, because serifs usually extend almost all the way to the glyph's sidebearings. Thus, the user needs to clip bubbles at the sidebearings (even if that means the bubble is much thinner around the serifs than it is elsewhere). Another issue is that the script currently cannot make positive kerns, such as might be needed between W and a quotation mark or apostrophe. Finally, the script is limited to generating kerning values in the TrueType-style kern table format; users who want to produce kerning data in other formats (namely the GPOS table used by CFF-flavored OpenType fonts) will have to convert the results with another tool.

Omagari estimates that BubbleKern takes care of 70 to 90% of the kerning required when making a font. That is not a perfect solution by any means, but it also has side benefits. For example, it can catch glyph pairs that need kerning even when the designer did not notice them. And it could be extended to cover more situations, such as kerning trigraphs or kerning fonts in the Nastaliq writing style, where collisions can occur between glyphs that are several characters apart in the underlying string of text.

There have been other attempts to automate the kerning process prior to Omagari's. Charles M. Chen's command-line tool Autokern, for example, used related ideas. BubbleKern has one obvious advantage over a program like Autokern, however: it can be integrated into the popular font editor used by a rapidly growing number of working font designers, including integration with the on-canvas drawing tools. So far, BubbleKern is only available for Glyphs (which is proprietary and runs only on Mac OS X), but its Apache 2.0 license would certainly allow it to be ported to other tools and platforms.

Comments (3 posted)

Page editor: Jonathan Corbet

Security

On the way to safe containers

By Jake Edge
September 21, 2016

Linux Security Summit

Stéphane Graber and Tycho Andersen gave a presentation at the Linux Security Summit in Toronto to introduce the LXD container hypervisor. They also outlined some of the security concerns and other problems they have encountered while developing it. Graber is the LXD maintainer and project lead at Canonical, while Andersen works on the team, mostly on checkpoint/restore for containers.

[Stéphane Graber]

LXD is a container-management tool that uses LXC containers, Graber said. It is designed to be simple to use, but comprehensive in what it covers. LXD is a root-privileged daemon, which gives it additional capabilities compared to the command-line-oriented LXC. It has a REST API that allows it to be easily scriptable, as well.

LXD is also considered "safe", though Graber did use air quotes when he said that. It uses every available kernel security feature "that we can get our hands on" for that, though user namespaces is one of the primary features it depends on. LXD also scales from a single container on a developer's laptop to a corporate data center with many hosts and containers to an OpenStack installation with thousands of compute nodes.

His slides [PDF] included a diagram of how all the pieces fit together (seen below at right): multiple hosts, all running Linux (and all running the same kernel version for those interested in container migration, he cautioned), each with the LXD daemon (using the LXC library) atop the kernel. The LXD REST API can then be used from the LXD command-line tool, the nova-lxd OpenStack plugin, from scripts, or even using curl, he said.
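As a taste of how scriptable that makes things, here is a minimal Python sketch that lists containers over the local Unix socket; the socket path and endpoint are taken from the LXD 2.x documentation, and error handling is omitted:

    import http.client
    import json
    import socket

    class UnixHTTPConnection(http.client.HTTPConnection):
        # http.client speaks HTTP but not Unix sockets, so override
        # connect() to dial the LXD socket instead of a TCP host.
        def __init__(self, path):
            super().__init__("localhost")
            self.unix_path = path

        def connect(self):
            self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
            self.sock.connect(self.unix_path)

    conn = UnixHTTPConnection("/var/lib/lxd/unix.socket")
    conn.request("GET", "/1.0/containers")
    reply = json.loads(conn.getresponse().read())
    print(reply["metadata"])    # e.g. ['/1.0/containers/my-container']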

So, that is what LXD is, Graber said, but there is a list of things that it is not as well. It is not a virtualization technology itself; it uses existing virtualization technologies to implement "system containers"—those running a full Linux distribution, rather than simply an application. It is not a fork of LXC; instead it uses the LXC API to manage its containers. It is also not another application container manager; it will work with Docker, rkt, and other application container systems.

[LXD diagram]

As part of its security measures, LXD uses all of the different namespace types. Graber said that a lot of work had gone into the control groups (cgroups) namespace over the last year, since LXD/LXC needed support for the cgroups version 1 (v1) API, which was not part of the original cgroup namespace work. For Linux Security Modules (LSMs), LXC supports both AppArmor and SELinux, though LXD only supports AppArmor at this point.

As far as capabilities go, LXD does drop a few, but must keep most of them available to the container since the application(s) that will be running in the system container are unknown. Those capabilities that the container will not need (e.g. CAP_MAC_ADMIN to override mandatory access control or CAP_SYS_MODULE to allow loading and unloading kernel modules) are dropped.

LXD also uses cgroups extensively and much of the talk was about "why they're great and why they're really bad and hopefully what we can do to try and make them better", Graber said. He has spent some time over the last year trying to work out user-friendly ways to express kernel resource limits. For example, LXD uses the CPU cgroup controller to handle CPU limits for the containers, which can be expressed as a number of cores or as a limit based on CPU time. Those time limits can be configured as a percentage of the CPU or in terms of a time budget (e.g. 10ms out of every 150ms).

Similarly, memory limits can be set using a percentage or a fixed amount. LXD does not expose the kernel memory limits, since "no one knows how to set those correctly". Swapping can be enabled on a per-container basis. Disk quotas can be used if the underlying filesystem supports them in the right way for LXD; for now, that means Btrfs and ZFS. Network traffic can also be limited on a per-container basis. Containers can get an overall priority that will be applied to scheduling and other decisions based on the relative priorities of all of the containers on the system.
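In LXD, those knobs surface as per-container configuration keys, which can be applied through the REST API shown earlier or with the lxc command-line tool. A sketch of such a configuration; the key names are from the LXD documentation of the period and the values are purely illustrative:

    # Illustrative resource limits for one container; each key maps
    # onto the cgroup plumbing described above.
    config = {
        "limits.cpu": "2",                     # two CPU cores
        "limits.cpu.allowance": "10ms/150ms",  # or a percentage: "25%"
        "limits.memory": "256MB",              # or a percentage of host RAM
        "limits.memory.swap": "false",         # disallow swapping
    }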

There are shared kernel resources that can cause problems when multiple containers are running, not necessarily because of malicious activity, but simply by accident. For example, inotify handles (used to track filesystem changes) are limited to 512 per user, which in practice means 512 shared between all of the containers. That is not enough when you are running systemd, which uses a lot of inotify handles and fails when it cannot get one rather than falling back to polling. There is no good reason to have this global limit, however, so tying the number of inotify handles to the user namespace is probably the right way to fix that.

Another problem area is the shared network tables. For example, Graber runs a "capture the flag" competition annually that uses 15,000 or so containers to simulate the internet. That creates a routing table with 3.3 million entries, which is too large for the kernel limits. The way the network tables are shared in the system means that a container user could fill up these tables so that other containers or the host can no longer open sockets. There is a similar problem with pseudoterminal slave (PTS) devices, he said.

Ulimits pose a related problem. Unless each container has its own range of user and group IDs (UIDs and GIDs), which would need to include 64K IDs per container to be "vaguely POSIX-compliant", ulimits will apply across container boundaries. Tying them to some kind of namespace would be better, but it is not entirely clear which namespace would make sense, he said.

The main isolation used for LXD containers is a user namespace. In addition, an AppArmor profile is installed to prevent cross-container access to files and other resources, though it is largely just a safety net. Some system calls are blacklisted using seccomp, as well.

Container checkpoint/restore

[Tycho Andersen]

At that point, Andersen took over the presentation to discuss checkpoint/restore for containers. He began with some oddities that he has come across—in sysctl handling, in particular. For example, sysctls for network namespaces change the value for the namespace that opened the /proc/sys file, while the IPC and UTS namespaces change the value for whichever namespace does the write() of the value. But the only user that can open() the IPC/UTS sysctl files is real root, which would imply that file descriptors for those files must be passed into unprivileged containers—but that won't work.

He then moved on to some other checkpoint/restore problem areas. In reality, checkpoint/restore is almost the antithesis of security, Andersen said. It requires looking inside the state of a process, which needs privileges, but there are some things that even root cannot do. Checkpoint/restore uses ptrace() to scrape a process's state, but there are security technologies that block some of that access.

For example, seccomp will kill a process if a blocked system call is made, so seccomp might need to be suspended while the checkpoint is being done. Similarly, LSMs can prevent some actions that a checkpoint or restore might need to do, so LSM access control might need to be paused during those operations. Andersen did note that when discussing this idea with Kees Cook it was not entirely well-received—in fact, Cook said the feature "gave him the creeps". Beyond those problems, there is also a need to handle the checkpoint of nested namespaces, he said.

Graber then gave a demo of LXD that included migrating running containers from one host to another. As with most demos, it was a bit hard to describe; those interested can check out the YouTube video of the talk. It did serve to show some of the capabilities of LXD, its command-line interface, and the ease of setting up, running, and managing containers using it. LXD is implemented in Go, while LXC is written in C.

As a recap, Graber summed up the LXD project and its wider implications. Unprivileged containers are safe by design, he said. LSMs can be used to provide a safety net to help ensure the security of those containers. It is still too easy to make a denial-of-service attack against the kernel, however, using PTSes, network tables, and other shared resources. Unprivileged APIs are regularly requested by container users, some of which are reasonable, though many others are not. Finally, checkpoint/restore for containers is working in some configurations, but there are lots of things still to be worked out.

[I would like to thank the Linux Foundation for travel support to attend the Linux Security Summit in Toronto.]

Comments (3 posted)

Brief items

Security quotes of the week

My new neighbor was using AirDrop to move some files from his phone to his iMac. I hadn't introduced myself yet, but I already knew his name. Meanwhile, someone with a Pebble watch was walking past, and someone named "Johnny B" was idling at the stoplight at the corner in their Volkswagen Beetle, following directions from their Garmin Nuvi. Another person was using an Apple Pencil with their iPad at a nearby shop. And someone just turned on their Samsung smart television.

I knew all this because each person advertised their presence wirelessly, either over "classic" Bluetooth or the newer Bluetooth Low Energy (BTLE) protocol—and I was running an open source tool called Blue Hydra, a project from the team at Pwnie Express.

Sean Gallagher

The FBI needs computer-security expertise, not backdoors.
Bruce Schneier

Comments (none posted)

New vulnerabilities

autotrace: code execution

Package(s):autotrace CVE #(s):CVE-2016-7392
Created:September 15, 2016 Updated:September 28, 2016
Description: From the Debian-LTS advisory:

Autotrace is a program for converting bitmaps to vector graphics. It had a bug that caused an out-of-bounds write. This was caused by not allocating sufficient memory to store the terminating NULL pointer in an array.

Alerts:
Mageia MGASA-2016-0327 autotrace 2016-09-28
Debian-LTS DLA-621-1 autotrace 2016-09-15

Comments (none posted)

chromium-browser: multiple vulnerabilities

Package(s):chromium-browser CVE #(s):CVE-2016-5170 CVE-2016-5171 CVE-2016-5172 CVE-2016-5173 CVE-2016-5174 CVE-2016-5175 CVE-2016-7395
Created:September 15, 2016 Updated:September 21, 2016
Description: From the Debian advisory:

CVE-2016-5170: A use-after-free issue was discovered in Blink/Webkit.

CVE-2016-5171: Another use-after-free issue was discovered in Blink/Webkit.

CVE-2016-5172: Choongwoo Han discovered an information leak in the v8 javascript library.

CVE-2016-5173: A resource bypass issue was discovered in extensions.

CVE-2016-5174: Andrey Kovalev discovered a way to bypass the popup blocker.

CVE-2016-5175: The chrome development team found and fixed various issues during internal auditing.

CVE-2016-7395: An uninitialized memory read issue was discovered in the skia library.

Alerts:
Gentoo 201610-09 chromium 2016-10-29
Fedora FEDORA-2016-2e50862950 chromium 2016-10-13
Ubuntu USN-3091-1 oxide-qt 2016-10-07
Mageia MGASA-2016-0309 chromium-browser-stable 2016-09-21
Fedora FEDORA-2016-b15185b72a chromium 2016-09-16
Arch Linux ASA-201609-13 chromium 2016-09-17
Red Hat RHSA-2016:1905-01 chromium-browser 2016-09-16
openSUSE openSUSE-SU-2016:2309-1 chromium 2016-09-15
openSUSE openSUSE-SU-2016:2310-1 chromium 2016-09-15
openSUSE openSUSE-SU-2016:2311-1 chromium 2016-09-15
Debian DSA-3667-1 chromium-browser 2016-09-15
Arch Linux ASA-201612-18 qt5-webengine 2016-12-17

Comments (none posted)

curl: code execution

Package(s):curl CVE #(s):CVE-2016-7167
Created:September 16, 2016 Updated:November 2, 2016
Description: From the Red Hat bugzilla entry:

It was found that provided string length arguments in four libcurl functions curl_escape(), curl_easy_escape(), curl_unescape and curl_easy_unescape were not properly checked and due to arithmetic in the functions, passing in the length 0xffffffff (2^32-1 or UINT_MAX or even just -1) would end up causing an allocation of zero bytes of heap memory that curl would attempt to write gigabytes of data into.

Alerts:
openSUSE openSUSE-SU-2016:2768-1 curl 2016-11-10
Ubuntu USN-3123-1 curl 2016-11-03
SUSE SUSE-SU-2016:2714-1 curl 2016-11-03
SUSE SUSE-SU-2016:2700-1 curl 2016-11-02
SUSE SUSE-SU-2016:2699-1 curl 2016-11-02
Fedora FEDORA-2016-80f4f71eff curl 2016-09-29
Mageia MGASA-2016-0316 curl 2016-09-21
Arch Linux ASA-201609-18 lib32-curl 2016-09-20
Arch Linux ASA-201609-19 curl 2016-09-20
Debian-LTS DLA-625-1 curl 2016-09-17
Slackware SSA:2016-259-01 curl 2016-09-15
Fedora FEDORA-2016-7a2ed52d41 curl 2016-09-15
Gentoo 201701-47 curl 2017-01-19

Comments (none posted)

distribution-gpg-keys: privilege escalation

Package(s):distribution-gpg-keys mock CVE #(s):CVE-2016-6299
Created:September 19, 2016 Updated:September 21, 2016
Description: From the Red Hat bugzilla:

It was found that mock's scm plug-in would parse a given spec file with root privileges. This could allow an attacker who is able to start a build of an rpm with a specially crafted spec file within mock's environment to elevate their privileges and escape the chroot.

Alerts:
Fedora FEDORA-2016-5 mock 2016-09-23
Fedora FEDORA-2016-5 distribution-gpg-keys 2016-09-23
Fedora FEDORA-2016-145afea99e mock 2016-09-16
Fedora FEDORA-2016-145afea99e distribution-gpg-keys 2016-09-16

Comments (none posted)

graphicsmagick: multiple vulnerabilities

Package(s):GraphicsMagick CVE #(s):CVE-2016-7446 CVE-2016-7447 CVE-2016-7448 CVE-2016-7449
Created:September 15, 2016 Updated:September 28, 2016
Description: From the GraphicsMagick release notes:

  • EscapeParenthesis(): I was notified by Gustavo Grieco of a heap overflow in EscapeParenthesis() used in the text annotation code. While not being able to reproduce the issue, the implementation of this function is completely redone.
  • Utah RLE: Reject truncated/absurd files which caused huge memory allocations and/or consumed huge CPU. Problem was reported by Agostino Sarubbo based on testing with AFL.
  • SVG/MVG: Fix another case of CVE-2016-2317 (heap buffer overflow) in the MVG rendering code (also impacts SVG).
  • TIFF: Fix heap buffer read overflow while copying sized TIFF attributes. Problem was reported by Agostino Sarubbo based on testing with AFL.

More information may be found in the CVE assignment email.

Alerts:
Debian-LTS DLA-683-1 graphicsmagick 2016-10-26
openSUSE openSUSE-SU-2016:2641-1 GraphicsMagick 2016-10-26
openSUSE openSUSE-SU-2016:2644-1 GraphicsMagick 2016-10-26
Debian-LTS DLA-651-1 graphicsmagick 2016-10-11
Mageia MGASA-2016-0325 graphicsmagick 2016-09-28
Fedora FEDORA-2016-390ec4a8f3 GraphicsMagick 2016-09-19
Fedora FEDORA-2016-0bdf82500f GraphicsMagick 2016-09-14

Comments (none posted)

jackrabbit: cross-site request forgery

Package(s):jackrabbit CVE #(s):CVE-2016-6801
Created:September 19, 2016 Updated:September 27, 2016
Description: From the Debian LTS advisory:

Lukas Reschke discovered that Apache Jackrabbit, a content repository implementation for Java, was vulnerable to Cross-Site-Request-Forgery in Jackrabbit's webdav module.

The CSRF content-type check for POST requests did not handle missing Content-Type header fields, nor variations in field values with respect to upper/lower case or optional parameters. This could be exploited to create a resource via CSRF.

Alerts:
Debian DSA-3679-1 jackrabbit 2016-09-27
Debian-LTS DLA-629-1 jackrabbit 2016-09-18

Comments (none posted)

kernel: denial of service

Package(s):kernel CVE #(s):CVE-2016-3841
Created:September 20, 2016 Updated:November 10, 2016
Description: From the CVE entry:

The IPv6 stack in the Linux kernel before 4.3.3 mishandles options data, which allows local users to gain privileges or cause a denial of service (use-after-free and system crash) via a crafted sendmsg system call.

Alerts:
Red Hat RHSA-2016:2695-01 kernel 2016-11-09
Red Hat RHSA-2016:2584-02 kernel-rt 2016-11-03
Red Hat RHSA-2016:2574-02 kernel 2016-11-03
Ubuntu USN-3083-2 linux-lts-trusty 2016-09-19
Ubuntu USN-3083-1 kernel 2016-09-19
SUSE SUSE-SU-2017:0494-1 the Linux Kernel 2017-02-17
SUSE SUSE-SU-2017:0333-1 kernel 2017-01-30
Scientific Linux SLSA-2016:2574-2 kernel 2016-12-14
SUSE SUSE-SU-2016:3069-1 kernel 2016-12-09
SUSE SUSE-SU-2016:2976-1 the Linux Kernel 2016-12-02

Comments (none posted)

mariadb: access restriction bypass

Package(s):mariadb CVE #(s):CVE-2016-6663
Created:September 15, 2016 Updated:September 21, 2016
Description: From the Arch Linux advisory:

- CVE-2016-6663 (access restriction bypass): In the past mariadb used to read the main configuration file from three different locations. One of them (the datadir) is unsafe because it's writeable by the sql-server. This way a remote attacker who could gain access to the sql-server could deploy a maliciously crafted configuration file.

Alerts:
Red Hat RHSA-2016:2749-01 rh-mysql56-mysql 2016-11-15
Debian DSA-3711-1 mariadb-10.0 2016-11-11
Mageia MGASA-2016-0371 mariadb 2016-11-09
Red Hat RHSA-2016:2595-02 mariadb 2016-11-03
Slackware SSA:2016-305-03 mariadb 2016-10-31
Red Hat RHSA-2016:2131-01 mariadb55-mariadb 2016-10-31
Arch Linux ASA-201609-10 mariadb 2016-09-14
CentOS CESA-2017:0184 mysql 2017-01-26
Oracle ELSA-2017-0184 mysql 2017-01-24
Scientific Linux SLSA-2017:0184-1 mysql 2017-01-24
Red Hat RHSA-2017:0184-01 mysql 2017-01-24
Scientific Linux SLSA-2016:2595-2 mariadb 2016-12-14
Red Hat RHSA-2016:2928-01 rh-mariadb101-mariadb 2016-12-08
Red Hat RHSA-2016:2927-01 rh-mariadb100-mariadb 2016-12-08
openSUSE openSUSE-SU-2016:3028-1 mariadb 2016-12-06
openSUSE openSUSE-SU-2016:3025-1 mariadb 2016-12-06
SUSE SUSE-SU-2016:2932-1 mariadb 2016-11-28
SUSE SUSE-SU-2016:2933-1 mariadb 2016-11-28

Comments (none posted)

moin: cross-site scripting

Package(s):moin CVE #(s):CVE-2016-7146 CVE-2016-7148 CVE-2016-9119
Created:September 19, 2016 Updated:December 2, 2016
Description: From the Red Hat bugzilla:

MoinMoin 1.9.8 is out, released 2014-10-17.

See https://moinmo.in/MoinMoinDownload

Strongly recommended for all users and contains bug fixes and enhanced password functionality.

From the Debian advisory:

Several cross-site scripting vulnerabilities were discovered in moin, a Python clone of WikiWiki. A remote attacker can conduct cross-site scripting attacks via the GUI editor's attachment dialogue (CVE-2016-7146), the AttachFile view (CVE-2016-7148) and the GUI editor's link dialogue (CVE-2016-9119).

Alerts:
Debian DSA-3715-1 moin 2016-11-15
Fedora FEDORA-2016-b3f93ead5b moin 2016-09-18
Fedora FEDORA-2016-a77985b7c7 moin 2016-12-01
Fedora FEDORA-2016-d40c768095 moin 2016-12-01
Fedora FEDORA-2016-cde4525fab moin 2016-12-01
Ubuntu USN-3137-1 moin 2016-11-23
Debian-LTS DLA-717-1 moin 2016-11-22

Comments (none posted)

mozilla: multiple vulnerabilities

Package(s):firefox thunderbird seamonkey CVE #(s):CVE-2016-5257 CVE-2016-5270 CVE-2016-5272 CVE-2016-5274 CVE-2016-5276 CVE-2016-5277 CVE-2016-5278 CVE-2016-5280 CVE-2016-5281 CVE-2016-5284
Created:September 21, 2016 Updated:January 5, 2017
Description: From the Red Hat advisory:

Multiple flaws were found in the processing of malformed web content. A web page containing malicious content could cause Firefox to crash or, potentially, execute arbitrary code with the privileges of the user running Firefox.

Alerts:
Ubuntu USN-3112-1 thunderbird 2016-10-27
openSUSE openSUSE-SU-2016:2543-1 thunderbird 2016-10-14
Debian-LTS DLA-658-1 icedove 2016-10-16
SUSE SUSE-SU-2016:2513-1 firefox 2016-10-12
openSUSE openSUSE-SU-2016:2485-1 thunderbird 2016-10-10
openSUSE openSUSE-SU-2016:2484-1 thunderbird 2016-10-10
Debian DSA-3690-1 icedove 2016-10-10
Mageia MGASA-2016-0336 thunderbird 2016-10-06
Scientific Linux SLSA-2016:1985-1 thunderbird 2016-10-04
SUSE SUSE-SU-2016:2431-1 firefox 2016-10-04
SUSE SUSE-SU-2016:2434-1 firefox 2016-10-04
Oracle ELSA-2016-1985 thunderbird 2016-10-03
Oracle ELSA-2016-1985 thunderbird 2016-10-03
CentOS CESA-2016:1985 thunderbird 2016-10-03
CentOS CESA-2016:1985 thunderbird 2016-10-03
CentOS CESA-2016:1985 thunderbird 2016-10-03
Red Hat RHSA-2016:1985-01 thunderbird 2016-10-03
Mageia MGASA-2016-0329 firefox/rootcerts/nss 2016-09-28
openSUSE openSUSE-SU-2016:2386-1 firefox, nss 2016-09-26
Debian-LTS DLA-636-1 firefox-esr 2016-09-27
Fedora FEDORA-2016-a6672dbd40 firefox 2016-09-25
Fedora FEDORA-2016-de277b9183 firefox 2016-09-25
openSUSE openSUSE-SU-2016:2368-1 firefox, nss 2016-09-24
Debian DSA-3674-1 firefox-esr 2016-09-22
Ubuntu USN-3076-1 firefox 2016-09-22
CentOS CESA-2016:1912 firefox 2016-09-22
CentOS CESA-2016:1912 firefox 2016-09-22
Arch Linux ASA-201609-22 firefox 2016-09-22
CentOS CESA-2016:1912 firefox 2016-09-22
Slackware SSA:2016-265-02 firefox 2016-09-21
Scientific Linux SLSA-2016:1912-1 firefox 2016-09-21
Oracle ELSA-2016-1912 firefox 2016-09-21
Oracle ELSA-2016-1912 firefox 2016-09-21
Oracle ELSA-2016-1912 firefox 2016-09-21
Red Hat RHSA-2016:1912-01 firefox 2016-09-21
Mageia MGASA-2017-0059 iceape 2017-02-20
Fedora FEDORA-2016-55f912fcdc seamonkey 2017-01-04
Gentoo 201701-15 firefox thunderbird 2017-01-04
Gentoo 201701-15 firefox 2017-01-03
Fedora FEDORA-2016-2bca1021a3 seamonkey 2017-01-02
Slackware SSA:2016-365-03 seamonkey 2016-12-30

Comments (none posted)

php: multiple vulnerabilities

Package(s):php CVE #(s):CVE-2016-7411 CVE-2016-7412 CVE-2016-7413 CVE-2016-7414 CVE-2016-7416 CVE-2016-7417 CVE-2016-7418
Created:September 19, 2016 Updated:October 14, 2016
Description: From the Arch Linux advisory:

The package php before version 7.0.11-1 is vulnerable to multiple issues that can lead to arbitrary code execution and denial of service.

- CVE-2016-7411 (arbitrary code execution): A memory Corruption vulnerability was found in php's unserialize method. This happened during the deserialized-object Destruction.

- CVE-2016-7412 (arbitrary code execution): Php's mysqlnd extension assumes the `flags` returned for a BIT field necessarily contains UNSIGNED_FLAG; this might not be the case, with a rogue mysql server, or a MITM attack. A malicious mysql server or MITM can return field metadata for BIT fields that does not contain the UNSIGNED_FLAG, which leads to a heap overflow.

- CVE-2016-7413 (arbitrary code execution): When WDDX tries to deserialize "recordset" element, use after free happens if close tag for the field is not found. This happens only when field names are set.

- CVE-2016-7414 (arbitrary code execution): The entry.uncompressed_filesize* method does not properly verify the input parameters. An attacker can create a signature.bin with size less than 8, when this value is passed to phar_verify_signature as sig_len a heap buffer overflow occurs.

- CVE-2016-7416 (arbitrary code execution): Big locale string causes stack based overflow inside libicu.

- CVE-2016-7417 (insufficient validation): The return value of spl_array_get_hash_table is not properly checked and used on spl_array_get_dimension_ptr_ptr.

- CVE-2016-7418 (denial of service): An attacker can trigger an Out-Of-Bounds Read in php_wddx_push_element of wddx.c. A DoS (null pointer dereference) vulnerability can be triggered in the wddx_deserialize function by providing a maliciously crafted XML string.

Alerts:
SUSE SUSE-SU-2016:2460-2 php7 2016-11-01
SUSE SUSE-SU-2016:2477-2 php5 2016-11-01
openSUSE openSUSE-SU-2016:2540-1 php5 2016-10-14
SUSE SUSE-SU-2016:2477-1 php5 2016-10-07
Debian DSA-3689-1 php5 2016-10-08
SUSE SUSE-SU-2016:2460-1 php7 2016-10-05
SUSE SUSE-SU-2016:2461-1 php53 2016-10-06
SUSE SUSE-SU-2016:2459-1 php53 2016-10-05
Ubuntu USN-3095-1 php5, php7.0 2016-10-04
openSUSE openSUSE-SU-2016:2444-1 php5 2016-10-04
Slackware SSA:2016-267-01 php 2016-09-23
Mageia MGASA-2016-0319 php 2016-09-25
Arch Linux ASA-201609-16 php 2016-09-18
Debian-LTS DLA-749-1 php5 2016-12-16
Gentoo 201611-22 php 2016-12-01

Comments (none posted)

php5: invalid free

Package(s):php5 CVE #(s):CVE-2016-4473
Created:September 19, 2016 Updated:September 21, 2016
Description: From the Debian LTS advisory:

An invalid free may occur under certain conditions when processing phar-compatible archives.

Alerts:
Red Hat RHSA-2016:2750-01 rh-php56 2016-11-15
SUSE SUSE-SU-2016:2460-2 php7 2016-11-01
SUSE SUSE-SU-2016:2460-1 php7 2016-10-05
Debian-LTS DLA-628-1 php5 2016-09-18

Comments (none posted)

php-adodb: cross-site scripting

Package(s):php-adodb CVE #(s):CVE-2016-4855
Created:September 19, 2016 Updated:September 21, 2016
Description: From the Red Hat bugzilla:

A cross-site scripting flaw was found in one of ADOdb's test scripts.

Alerts:
Mageia MGASA-2016-0363 php-adodb 2016-11-03
Fedora FEDORA-2016-7d6ca385a4 php-adodb 2016-09-18
Fedora FEDORA-2016-fed6f8c57d php-adodb 2016-09-18
Gentoo 201701-59 adodb 2017-01-24

Comments (none posted)

tomcat: privilege escalation

Package(s):tomcat CVE #(s):CVE-2016-1240
Created:September 15, 2016 Updated:September 21, 2016
Description: From the Debian-LTS advisory:

Dawid Golunski from legalhackers.com discovered that Debian's version of Tomcat 6 was vulnerable to a local privilege escalation. Local attackers who have gained access to the server in the context of the tomcat6 user through a vulnerability in a web application were able to replace the file with a symlink to an arbitrary file.

Alerts:
Ubuntu USN-3081-1 tomcat6, tomcat7, tomcat8 2016-09-19
Debian DSA-3670-1 tomcat8 2016-09-15
Debian DSA-3669-1 tomcat7 2016-09-15
Debian-LTS DLA-623-1 tomcat7 2016-09-15
Debian-LTS DLA-622-1 tomcat6 2016-09-15

Comments (none posted)

unadf: two vulnerabilities

Package(s):unadf CVE #(s):CVE-2016-1243 CVE-2016-1244
Created:September 21, 2016 Updated:September 26, 2016
Description: From the Debian LTS advisory:

It was discovered that there were two vulnerabilities in unadf, a tool to extract files from an Amiga Disk File dump (.adf):

- - CVE-2016-1243: stack buffer overflow caused by blindly trusting on pathname lengths of archived files.

Stack allocated buffer sysbuf was filled with sprintf() without any bounds checking in extracTree() function.

- - CVE-2016-1244: execution of unsanitized input

Shell command used for creating directory paths was constructed by concatenating names of archived files to the end of the command string.

Alerts:
Debian DSA-3676-1 unadf 2016-09-24
Debian-LTS DLA-631-1 unadf 2016-09-21

Comments (none posted)

virtualbox: unspecified vulnerability

Package(s):virtualbox CVE #(s):CVE-2016-3612
Created:September 15, 2016 Updated:September 21, 2016
Description: From the openSUSE advisory:

CVE-2016-3612: An unspecified vulnerability in the Oracle VM VirtualBox component in Oracle Virtualization VirtualBox before 5.0.22 allowed remote attackers to affect confidentiality via vectors related to Core.

Alerts:
openSUSE openSUSE-SU-2016:2314-1 virtualbox 2016-09-15

Comments (none posted)

wireshark: multiple vulnerabilities

Package(s):wireshark CVE #(s):CVE-2016-7176 CVE-2016-7177 CVE-2016-7178 CVE-2016-7179 CVE-2016-7180
Created:September 21, 2016 Updated:September 27, 2016
Description: From the CVE entries:

epan/dissectors/packet-h225.c in the H.225 dissector in Wireshark 2.x before 2.0.6 calls snprintf with one of its input buffers as the output buffer, which allows remote attackers to cause a denial of service (copy overlap and application crash) via a crafted packet. (CVE-2016-7176)

epan/dissectors/packet-catapult-dct2000.c in the Catapult DCT2000 dissector in Wireshark 2.x before 2.0.6 does not restrict the number of channels, which allows remote attackers to cause a denial of service (buffer over-read and application crash) via a crafted packet. (CVE-2016-7177)

epan/dissectors/packet-umts_fp.c in the UMTS FP dissector in Wireshark 2.x before 2.0.6 does not ensure that memory is allocated for certain data structures, which allows remote attackers to cause a denial of service (invalid write access and application crash) via a crafted packet. (CVE-2016-7178)

Stack-based buffer overflow in epan/dissectors/packet-catapult-dct2000.c in the Catapult DCT2000 dissector in Wireshark 2.x before 2.0.6 allows remote attackers to cause a denial of service (application crash) via a crafted packet. (CVE-2016-7179)

epan/dissectors/packet-ipmi-trace.c in the IPMI trace dissector in Wireshark 2.x before 2.0.6 does not properly consider whether a string is constant, which allows remote attackers to cause a denial of service (use-after-free and application crash) via a crafted packet. (CVE-2016-7180)

Alerts:
Arch Linux ASA-201609-27 wireshark-cli 2016-09-26
Mageia MGASA-2016-0321 wireshark 2016-09-25
Debian-LTS DLA-632-1 wireshark 2016-09-21
Debian DSA-3671-1 wireshark 2016-09-20

Comments (none posted)

zookeeper: buffer overflow

Package(s):zookeeper CVE #(s):CVE-2016-5017
Created:September 19, 2016 Updated:January 2, 2017
Description: From the Debian LTS advisory:

Lyon Yang discovered that the C client shells cli_st and cli_mt of Apache Zookeeper, a high-performance coordination service for distributed applications, were affected by a buffer overflow vulnerability associated with parsing of the input command when using the "cmd:" batch mode syntax. If the command string exceeds 1024 characters a buffer overflow will occur.

Alerts:
Mageia MGASA-2016-0328 zookeeper 2016-09-28
Debian-LTS DLA-630-1 zookeeper 2016-09-18
Fedora FEDORA-2016-5557ccf1f9 zookeeper 2016-12-31
Fedora FEDORA-2016-54a717d5d6 zookeeper 2016-12-31

Comments (none posted)

Page editor: Jake Edge

Kernel development

Brief items

Kernel release status

The current development kernel is 4.8-rc7, released on September 18. Linus said: "Normally rc7 is the last in the series before the final release, but by now I'm pretty sure that this is going to be one of those releases that come with an rc8. Things didn't calm down as much as I would have liked, there are still a few discussions going on, and it's just unlikely that I will feel like it's all good and ready for a final 4.8 next Sunday."

See this summary for the state of currently known regressions in the 4.8 release.

Stable updates: 4.7.4 and 4.4.21 were released on September 15.

Comments (none posted)

Quotes of the week

We know bug reports come from everyone, there is no such thing as "bug free software", and none of us are claiming it. What we are claiming is that you should stick to the tree that is tested by as many people as possible the closest (i.e. mainline) as that gets you the most bug fixes, as well as the ability to use the kernel community to help you out when you have problems. Otherwise you are on your own with your 2.5million lines added franken-kernel that no one will touch if they have a choice not to.
Greg Kroah-Hartman

Simply repeating "upstream first" over and over again and telling people that doing anything else is just silly isn't really helping move things forward. People have heard this but for a good chunk of the industry there's a big gap between that simple statement and something that can be practically acted on in any sort of direct fashion, it can easily just come over as dismissive and hostile. It's going to be much more productive to acknowledge the realities people are dealing with and talk about how people can improve their engagement with upstream, make the situation better and close the gaps.
Mark Brown

When someone says "pretty simple" regarding cryptography, it's often neither pretty nor simple.
Alex Elsayed

The point is, I suspect that the block layer community is all about throughput and the talk about latency and interactivity is seen as an annoying distraction.

Like the kids making noise about doing detours for catching Pokémons in the back seat of the car while you're in the driving seat, driving to some perceived important destination. If you see what I mean. Their problem is not really your problem, so you don't care much. It will be more "yeah yeah, we'll see about your Pokémons. Someday."

Linus Walleij

Comments (8 posted)

Kernel development news

Adding encryption to Btrfs

By Jonathan Corbet
September 21, 2016
One of the promises of the Btrfs filesystem is that its new design would facilitate the addition of modern features like compression and encryption. Compression has been there for a while, but Btrfs has yet to gain support for encryption; indeed, the ext4 filesystem got there first, over a year ago, with an implementation that is also used by the f2fs filesystem. Work to fill this gap is underway, as can be seen in this recently posted patch set from Anand Jain, but it would appear that encryption in Btrfs remains a distant goal.

It remains distant because it has become clear that this code will not be merged in anything like its current form. With luck, though, it should be the source of a lot of lessons that can be applied to later, hopefully more successful attempts. Sometimes, one simply has to stumble a few times when attacking a difficult problem space.

Crypto troubles

There is an aspect to cryptographic code development that has been learned the hard way many times over: this code needs to be written with help from people who understand cryptography well and know where the pitfalls are. Developers who set out without that domain knowledge are certain to make serious mistakes. So this is not a good way to introduce an encryption-related patch set:

Also would like to mention that a review from the security experts is due, which is important and I believe those review comments can be accommodated without major changes from here.

As Dave Chinner (among others) pointed out, it is far too late for a security review, which should really happen during the design phase. The ext4 encryption feature, he noted, did go through a design review phase ahead of the posting of any code, and quite a bit of useful feedback was the result.

In this case, it would appear that this kind of review would have been helpful. Eric Biggers, who is working on the ext4 encryption feature, looked at the code and came back with a harsh judgment:

You will also not get a proper review without a proper design document which details things like the threat model and the security properties provided. But I did take a short look at the code anyway because I was interested. The results were not pretty. As far as I can see the current proposal is fatally flawed as it does not provide confidentiality of file contents against a basic attack.

Alex Elsayed also pointed out some of the cryptographic problems in the code. It comes down to a poor choice of encryption modes that leaves a filesystem open to well-understood known-plaintext attacks. The reviewers said that a mode like XTS, which lacks this particular vulnerability, should have been used instead. Or, even better, an authenticated encryption (AE) approach should be used; AE modes are believed to be far more resistant to most known attacks. AE brings its own challenges, though; the (mostly obsolete) ecryptfs filesystem uses it, but the current ext4/f2fs implementation does not. A related issue, as Ted Ts'o pointed out, is the increasing importance of taking advantage of hardware-based encryption for performance; that will tend to rule out "exotic encryption modes" in favor of something boring (but hardware-supported) like AES.
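For readers unfamiliar with AE, the following minimal Python sketch (using the third-party cryptography package; this is not code from Btrfs, ext4, or the patch set) shows the property being discussed: decryption fails loudly if the ciphertext or its associated data has been tampered with, and the nonce must never repeat for a given key:

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    key = AESGCM.generate_key(bit_length=256)
    aead = AESGCM(key)

    nonce = os.urandom(12)   # must be unique per (key, message) pair
    ciphertext = aead.encrypt(nonce, b"extent contents", b"extent metadata")

    # Decryption authenticates both ciphertext and associated data;
    # any modification raises InvalidTag rather than returning junk.
    plaintext = aead.decrypt(nonce, ciphertext, b"extent metadata")

The catastrophic failure mode mentioned below is exactly what happens if the same (key, nonce) pair is ever used for two different messages.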

Crypto at the wrong level?

Another criticism of the patch set is that it implements a Btrfs-specific encryption infrastructure, rather than using the generic infrastructure added at the virtual filesystem (VFS) layer and used by ext4 and f2fs. One motivation for that approach is that Btrfs encryption is managed at the subvolume level, meaning that a single master key is used for the entire subvolume. Ext4 and f2fs, instead, lack the subvolume concept; they provide file-level encryption that allows different users to have different keys within the same filesystem. Another result is that Btrfs does not benefit from the work that has been done on the VFS infrastructure; as Chinner put it:

The generic file encryption code is solid, reviewed, tested and already widely deployed via two separate filesystems. There is a much wider pool of developers who will maintain it, review changes and know all the traps that a new implementation might fall into. There's a much bigger safety net here, which significantly lowers the risk of zero-day fatal flaws in a new implementation and of flaws in future modifications and enhancements.

He compared Btrfs-specific encryption to the Btrfs RAID5/6 implementation, which has had known problems for years and appears to be essentially unmaintained. "Encryption simply cannot be treated like this - it has to be right, and it has to be well maintained." Some Btrfs developers bristled at the description of the filesystem's RAID implementation, but there was general agreement that the VFS code should be used to the greatest extent possible — and improved in places where it cannot yet be used.

Btrfs does provide some unique challenges that will stress the capabilities of the existing VFS code. That code, for example, manages encryption keys as an inode attribute; that is how file-level encryption is supported. Btrfs throws a spanner into the works in a couple of ways:

  • If Btrfs snapshots are present, an inode is likely to be present in more than one of them. Without a great deal of care, these snapshots could be used to force a reuse of the encryption keys and "nonce" values used with a specific file; many AE algorithms will fail catastrophically if that happens.

  • In general, Btrfs does a lot of sharing of file blocks at the extent level. That is how the copy-on-write mechanism works in general, and features like deduplication will cause even more sharing to happen. Once again, this sharing could be used to expose encrypted traffic, or to simply tell when one party has modified a file that shares extents with another.

A solution to some of these problems would be to simply copy extents and do without the sharing when encryption is involved. But another solution falls out of the requirements: encryption in Btrfs probably needs to be managed at the extent level, rather than at the file level. That would reduce the potential for nonce-reuse attacks and would eliminate problems that would otherwise result if one file sharing an extent is modified in a way that changes the extent's offset within the file.

As Btrfs developer Zygo Blaxell put it, the Btrfs extent-use model already creates challenges for the VFS layer:

Currently any extent in the filesystem can be shared by any inode in the filesystem (assuming the two inodes have compatible attributes, which could include encryption policy), including multiple references from the same inode to the same extent at different logical offsets. This is the basis of the deduplication and copy_file_range features.

This confuses the VFS caching layer when dealing with deduped, reflinked, or snapshotted files. It's not surprising that VFS crypto has problems coping with it as well.

At the moment, encryption at the VFS level doesn't have any real concept of extents at all; extents are generally something that only specific filesystems know about. So the VFS file-encryption code is not suitable for solving the Btrfs encryption problem in its current form. As many have pointed out, though, the solution is not to start over, but to enhance the VFS code to get it to the point where it can do the job.

About the only definite conclusion that came from the discussion was that there is still a lot of work to do before the Btrfs encryption problem is even well understood, much less properly implemented. If nothing else, the patches posted so far have served as a focus point for a discussion that needs to happen and, hopefully, a starting point for the next try, sometime in the future. Once again, we see that cryptography is hard, and the intersection with a next-generation filesystem makes it even harder.

Comments (14 posted)

Automating stable-kernel creation

By Jake Edge
September 21, 2016

LinuxCon North America

At LinuxCon North America 2016 in Toronto, Sasha Levin presented some of the tools and techniques he uses to maintain stable kernels. He maintains the 4.1 and 3.18 stable kernels and wanted to make his life easier, so he started automating the process. While creating a stable kernel will never be a fully automatic task, he has developed some tools that can help.

Stable trees are just like more -rc cycles, he said with a grin. The intent is that stable kernels only get small changes (< 100 lines) from the mainline that fix a non-theoretical bug that users are running into. The criteria are pretty strict, but an exception is made for new device IDs; those are normally one-line changes that simply enable new hardware using the existing code. Stable kernels are typically supported for around ten weeks, roughly the period between mainline kernel releases.


There are also long-term support (LTS) stable kernels. Those follow the same rules, but continue to get support for much longer—typically two years or more. As time passes, fewer commits are made to the LTS kernels; since they don't add new features, they also don't add new bugs. But, on the other hand, fixing the bugs that are found is harder, since fixes often must be backported rather than simply cherry-picked from the mainline.

That means that the rate of LTS stable patches goes down, but each patch takes more time to handle. In addition, more people depend on those trees for servers and other critical infrastructure, where they don't want to change the kernel (and, in particular, update to a new major version) frequently. So it is important that those kernels are as reliable as they can be.

So, that "doesn't sound hard", Levin said, just look at every patch that goes into the mainline, decide if it is a fix, and add it to the tree if it is. But, of course, there are too many patches—around eight patches per hour, every hour of every day.

Even if someone could look at all those patches, it is not always obvious whether they fix a real problem or not. There is also the chance that Levin or some other stable maintainer misses a patch that does fix something. If no one is using that functionality, that isn't much of a problem, but if it is a critical security fix, that can be serious. On the flip side, if he takes a fix that he shouldn't have, it might introduce a security hole. For example, a few weeks earlier he took an XFS patch into the wrong kernel version and introduced a local privilege escalation.

Let's automate

So, "let's automate". He finds most of the patches needed for his trees by looking for "stable@" addresses or "Fixes" tags in the mainline commits. His first step, then, was to write a script that grabbed the logs and looked for those strings. But that was not enough.

As an example, he pointed to a commit that is a simple fix for a minor security bug (an information leak), but was marked neither for stable nor with a "Fixes" tag. So he can't rely on those tags alone to find the patches that should be added to his tree(s).

Another technique he uses is to search for certain keywords and phrases. Strings like "fix", "NULL dereference", "buffer overflow", and so on might indicate a commit he should look at more closely. He has around twenty of these strings that he looks for now, though he adds to the list occasionally.
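
Levin did not walk through his scripts' source, but the first stage is easy to approximate. A hypothetical Python version might scan git log output for the stable and "Fixes" tags along with a few of the heuristic phrases (the keyword list here is abbreviated and illustrative; his real list has around twenty entries):

    import re
    import subprocess

    TAG_RE = re.compile(r"stable@|^\s*Fixes:", re.I | re.M)
    KEYWORDS = ("fix", "null dereference", "buffer overflow", "use after free")

    def candidate_commits(rev_range):
        # %x00 and %x01 delimit hash and message so commits split cleanly.
        log = subprocess.run(
            ["git", "log", "--format=%H%x00%B%x01", rev_range],
            capture_output=True, text=True, check=True).stdout
        for entry in log.split("\x01"):
            if not entry.strip():
                continue
            sha, _, body = entry.partition("\x00")
            if TAG_RE.search(body) or any(k in body.lower() for k in KEYWORDS):
                yield sha.strip()

    for commit in candidate_commits("v4.7..v4.8-rc6"):
        print(commit)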

After that, he started "shamelessly stealing" Greg Kroah-Hartman's work. So Levin has a script, stable-show-missing, that looks at other stable trees to see what is missing in one or the other. Are there commits in Kroah-Hartman's (or another stable maintainer's) tree that are not in his? Or vice versa?

In a continuation of the "shameless stealing", he has a script that looks for backports of fixes into other stable trees. "Backports are evil", he said, and should be avoided, but it is important not to have multiple backports of the same fix in various trees. A single shared backport may still be wrong, but at least all of the trees carry the same code, so a single fix can be applied to all of them if needed. For example, if a fix has been backported from 4.8 into 4.4, he can run his tool to find and show the backported patches; if they apply cleanly to his tree, he can just adopt them.

Another tool, stable-deps, will give a list of commits that need to be applied before a particular fix can be applied. That list can be used to find stable-candidate commits that have been missed along the way. It can also show whether a fix is for a bug in some big feature that has been introduced since the kernel version he is working with. That makes it easier to drop those kinds of fixes without doing costly research on the mailing list.

When looking at a specific patch, there is always the question of whether it truly should be applied or not. There are multiple rules in the stable_kernel_rules.txt file; the first five are straightforward, but the rest is "lawyer talk", he said with a chuckle. In any case, his common check_relevant() function will find some of the obvious violations, though of course it is not perfect.

Finding the "stable@" address in a commit is a good indicator that it is stable material, but is no guarantee that it truly is. On the other hand, there may not be a stable indicator, but the fix should be applied. Even if there is a "Cc: stable@vger.kernel.org" line in the commit, there are multiple different ways that "tag" is formed. Some have angle brackets or other formatting differences; there is also, perhaps, a version indication (which can also come in a variety of formats).

These version tags (e.g. "Cc: stable@vger.kernel.org # v3.4+") are meant to help the stable maintainers quickly determine whether they should be interested in the patch or not. But there is no standard way of specifying the applicable versions, so check_relevant() tries to parse the version specification and to determine which kernel versions it actually corresponds to.
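
A rough approximation of that version parsing, hypothetical rather than Levin's actual check_relevant(), might look like this; it accepts a bare stable tag as well as the common "# v3.4+" style of annotation:

    import re

    STABLE_TAG = re.compile(
        r"stable@vger\.kernel\.org\s*(?:#\s*v?(?P<min>\d+\.\d+)\+?)?", re.I)

    def relevant_for(message, kernel=(4, 1)):
        """Return True if the stable tag, if any, covers this kernel."""
        m = STABLE_TAG.search(message)
        if m is None:
            return False                  # no stable tag at all
        if m.group("min") is None:
            return True                   # bare tag: all stable kernels
        minimum = tuple(int(p) for p in m.group("min").split("."))
        return kernel >= minimum

    print(relevant_for("Cc: stable@vger.kernel.org # v3.4+"))  # True for 4.1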

One problem he has encountered is the "fix for a fix". The "Fixes" tag refers to a commit that has been fixed, but that only works for mainline commit IDs. Once a fix has been cherry-picked into a stable tree, it will have a different commit ID than the corresponding change in the mainline. So a fix that references a mainline commit that has been cherry-picked into a stable tree is easy for a stable maintainer to miss. check_relevant() looks for that as well.
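
Stable-tree commits conventionally record their mainline origin in the commit message, either as a "commit <id> upstream." line or as "(cherry picked from commit <id>)". A tool can exploit that to build a mainline-to-stable map; the sketch below is an illustration of the idea, not Levin's code:

    import re
    import subprocess

    UPSTREAM_RE = re.compile(
        r"commit ([0-9a-f]{40}) upstream"
        r"|cherry picked from commit ([0-9a-f]{40})", re.I)

    def upstream_map(stable_range):
        """Map mainline commit IDs to their stable cherry-picks."""
        log = subprocess.run(
            ["git", "log", "--format=%H%x00%B%x01", stable_range],
            capture_output=True, text=True, check=True).stdout
        mapping = {}
        for entry in log.split("\x01"):
            if not entry.strip():
                continue
            sha, _, body = entry.partition("\x00")
            for full, picked in UPSTREAM_RE.findall(body):
                mapping[full or picked] = sha.strip()
        return mapping

A "Fixes" tag naming a mainline ID can then be checked against that map to catch the fix-for-a-fix case.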

There are certain patch authors who are themselves flags for a patch that should get stable consideration. He mentioned Linus Torvalds and David Miller as two maintainers that mostly just fix bugs, often important bugs. While Torvalds "tries to hide" security problems, the fact that he has authored a particular change is a big sign that it is significant.

Putting all of that together results in a stable-steal-commits tool. It can be run on upstream or various stable maintainers' trees and will create a new tree with the changes that are found with his tools. The resulting tree is not something that can be shipped as-is, obviously, since it needs lots of validation, but it is a starting point. In particular, it is important to run stable-show-missing and look carefully at the results. Running stable-steal-commits takes about 30 minutes on an -rc release after -rc1; it takes around two hours for an -rc1 release.

When he is validating the tree that is created, he often finds that some patches need to be yanked out of the tree or that other patches need to be pulled in. That is not something that Git handles easily, which is why Kroah-Hartman uses quilt to manage stable-tree patches. Levin has created stable-yank and stable-insert to handle those kinds of problems. They are currently being used quite a bit, he said; he is trying to convince Kroah-Hartman to drop quilt in favor of them.

He now has a GitHub repository containing multiple tools that he uses for his stable kernel work. He also introduced his scripts in a post to the linux-kernel mailing list nearly a year ago.

Levin showed a rant from Dave Chinner that complained about having to make the same set of comments for multiple stable trees and maintainers. He wanted to see more coordination between the stable maintainers so that he and others could simply make one set of comments that would (somehow) propagate to all of the other stable trees that might also cherry-pick the commit(s) in question.

To help fill in that "somehow", Levin has come up with stable "notes". It will grab reviews and other comments from the mailing list and store them as notes on the commits in a Git tree. Other stable maintainers can add Levin's tree as a remote repository and configure Git to consult the notes that he is adding from stable reviews. That will help reviewers and maintainers so that they do not need to do multiple reviews for multiple stable releases; it will also help stable maintainers coordinate more easily.
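
Mechanically, sharing the notes is plain Git. Another maintainer could pull Levin's notes with something along these lines (the remote name and URL are placeholders):

    import subprocess

    def fetch_review_notes(url, remote="levin"):
        # Add the other maintainer's tree, fetch its notes refs into a
        # local namespace, and show them alongside the commit log.
        subprocess.run(["git", "remote", "add", remote, url], check=False)
        subprocess.run(["git", "fetch", remote,
                        "refs/notes/*:refs/notes/%s/*" % remote], check=True)
        subprocess.run(["git", "log", "--notes=%s/*" % remote, "-3"],
                       check=True)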

The last piece of the puzzle is testing. Stable kernel candidates need to be tested before they can be released. He does local build tests and boots the kernels inside a virtual machine, but there is much more testing going on. The 0-day testing service and kernelci.org both test on every commit made to his Git repository. To him, it seems like these groups have "unlimited computing power or something" and their testing makes his life much easier. It is much better to find out about problems during the review cycle for the stable kernel rather than after it has been released.

[I would like to thank the Linux Foundation for travel assistance to attend LinuxCon North America in Toronto.]

Comments (none posted)

BBR congestion control

By Jonathan Corbet
September 21, 2016
Congestion-control algorithms are unglamorous bits of code that allow network protocols (usually TCP) to maximize the throughput of any given connection while simultaneously sharing the available bandwidth equitably with other users. New algorithms tend not to generate a great deal of excitement; the addition of TCP New Vegas during the 4.8 merge window drew little fanfare, for example. The BBR (Bottleneck Bandwidth and RTT) algorithm just released by Google, though, is attracting rather more attention; it moves away from the mechanisms traditionally used by these algorithms in an attempt to get better results in a network characterized by wireless links, meddling middleboxes, and bufferbloat.

The problem that any congestion-control algorithm must solve is that the net has no mechanism for informing an endpoint of the bandwidth available for a given connection. So the algorithm must, somehow, come to its own conclusions regarding just how much data it can send at any given time. Since the available bandwidth will generally vary over time, that bandwidth estimate must be revised occasionally. In other words, a congestion control algorithm must maintain an ongoing estimate of how much data can be sent, derived from the information that is available to it.

That information is somewhat sparse. These algorithms typically work by using one metric that they are easily able to measure: the number of packets that do not make it to the other end of the connection and must be retransmitted. When the network is running smoothly, dropped packets should be a rare occurrence. Once a router's buffers begin to fill, though, it will have no choice but to drop the packets it has no room for. Packet drops are thus a fairly reliable signal that a connection is overrunning the bandwidth available to it and should slow down.

The problem with this approach, on the network we have now, is that the buffers between any pair of endpoints can be huge. Oversized buffers have been recognized as a problem for some years now, and progress has been made in mitigating the resulting bufferbloat issues. But the world is still full of bloated routers and some link-level technologies, such as WiFi, require a certain amount of buffering for optimal performance. By the time an endpoint has sent enough data to overflow a buffer somewhere along the connection, the amount of data buffered could be huge. The packet-loss signal, in other words, comes far too late; by the time it is received, an endpoint could have been overdriving the connection for a long time.

Loss-based algorithms can also run into problems when short-lived conditions cause a dropped packet. They may slow down unnecessarily and, as a result, fail to make use of the bandwidth that is available.

Bottleneck Bandwidth and RTT

The BBR algorithm differs from most of the others in that it pays relatively little attention to packet loss. Instead, its primary metric is the actual bandwidth of data delivered to the far end. Whenever an acknowledgment packet is received, BBR updates its measurement of the amount of data delivered. The sum of data delivered over a period of time is a reasonably good indicator of the bandwidth the connection is able to provide, since the connection has demonstrably provided that bandwidth recently.
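
In rough terms, the bandwidth estimate is a windowed maximum over per-ACK delivery-rate samples. The toy Python class below captures the shape of the idea; the kernel implementation is considerably more careful about timestamps and application-limited periods:

    import collections
    import time

    class DeliveryRateEstimator:
        """Windowed-max filter over per-ACK delivery-rate samples."""

        def __init__(self, window=10):
            self.samples = collections.deque(maxlen=window)
            self.last_delivered = 0
            self.last_time = time.monotonic()

        def on_ack(self, delivered_total):
            # delivered_total: cumulative bytes acknowledged by the peer.
            now = time.monotonic()
            if now > self.last_time:
                rate = ((delivered_total - self.last_delivered)
                        / (now - self.last_time))
                self.samples.append(rate)
            self.last_delivered, self.last_time = delivered_total, now

        def bottleneck_bw(self):
            return max(self.samples) if self.samples else 0.0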

When a connection starts up, BBR will be in the "startup" state; in this mode, it behaves like most traditional congestion-control algorithms in that it starts slowly, but quickly ramps up the transmission speed in an attempt to measure the available bandwidth. Most algorithms will continue to ramp up until they experience a dropped packet; BBR, instead, watches the bandwidth measurement described above. In particular, it looks at the actual delivered bandwidth for the last three round-trip times to see if it changes. Once the bandwidth stops rising, BBR concludes that it has found the effective bandwidth of the connection and can stop ramping up; this has a good chance of happening well before packet loss would begin.
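
Assuming one bandwidth sample per round trip, the exit condition from startup can be expressed in a few lines; the 25% growth threshold is the figure given in the published BBR description, while the rest is a simplification:

    def startup_finished(bw_per_round, growth=1.25):
        # Leave startup when measured bandwidth has grown by less than
        # about 25% over the last three round trips.
        if len(bw_per_round) < 4:
            return False
        return bw_per_round[-1] < growth * bw_per_round[-4]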

The measured bandwidth is then deemed to be the rate at which packets should be sent over the connection. But in measuring that rate, BBR probably transmitted packets at a higher rate for a while; some of them will be sitting in queues waiting to be delivered. In an attempt to drain those packets out of the buffers where they languish, BBR will go into a "drain" state, during which it will transmit below the measured bandwidth until it has made up for the excess packets sent before.

Once the drain phase is done, BBR goes into the steady-state mode where it transmits at more-or-less the calculated bandwidth. That is "more-or-less" because the characteristics of a network connection will change over time, so the actual delivered bandwidth must be continuously monitored. Also, an increase in effective bandwidth can only be detected by occasionally trying to transmit at a higher rate, so BBR will scale the rate up by 25% about 1/8 of the time. If the bandwidth has not increased (transmitting at a higher rate does not result in data being delivered at a higher rate, in other words), that probe will be followed by a drain period to even things out again.
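
The steady-state behavior is often described as a cycle of "pacing gains" applied to the estimated bandwidth: one probing phase, one draining phase, and six cruising phases, which is where the "about 1/8 of the time" figure comes from. A minimal sketch of that cycle:

    # Pacing-gain cycle from the published BBR design: probe up 25%,
    # drain the resulting queue, then cruise at the estimated bandwidth.
    GAIN_CYCLE = [1.25, 0.75, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

    def pacing_rate(bottleneck_bw, round_number):
        return GAIN_CYCLE[round_number % len(GAIN_CYCLE)] * bottleneck_bw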

One interesting aspect of BBR is that, unlike most other algorithms, it does not use the congestion window as the primary means of controlling outgoing traffic. The congestion window limits the maximum amount of data that can be in flight at any given time; an increase in the window will generally result in a burst of packets consuming the newly available bandwidth. BBR, instead, uses the tc-fq packet scheduler to send out data at the proper rate. The congestion window is still set as a way of ensuring that there is never too much data in flight, but it is no longer the main regulatory mechanism.

There is one last complication: many network connections are subject to "policers", middleboxes that limit the maximum data rate any connection can reach. If such a box exists, there is little point in trying to exceed the rate it will allow. The BBR code looks for periods with a suspiciously constant bandwidth (within 4Kb/sec) and a high packet loss rate; should that happen, it concludes that there is a policer in the loop and limits the bandwidth to a level that will not cause that policer to start dropping packets.

The BBR patch set was posted by Neal Cardwell; the code itself carries signoffs from a number of people, including Van Jacobson and Eric Dumazet. Google has, they say, been using BBR for some time, and is evidently happy with the results; BBR works fine when only one side of the connection is using it, so each deployment should, if it lives up to its promises, make the net that much better. We shouldn't have to wait long to find out; networking maintainer David Miller has applied the patches, meaning that BBR should be available in the 4.9 kernel.
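
For those who want to experiment once a BBR-capable kernel is running, the congestion-control algorithm can be selected per socket with the TCP_CONGESTION socket option, shown here from Python; this assumes the bbr module is available and, for proper pacing, the fq queuing discipline on the outgoing interface:

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Use BBR for this connection only; the system-wide default comes
    # from the net.ipv4.tcp_congestion_control sysctl.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CONGESTION, b"bbr")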

Comments (48 posted)

Patches and updates

Kernel trees

Linus Torvalds Linux 4.8-rc7 Sep 18
Greg KH Linux 4.7.4 Sep 15
Sebastian Andrzej Siewior 4.6.7-rt13 Sep 15
Greg KH Linux 4.4.21 Sep 15
Steven Rostedt 4.4.21-rt30 Sep 21
Steven Rostedt 4.1.33-rt37 Sep 21
Steven Rostedt 3.18.42-rt44 Sep 21
Steven Rostedt 3.14.79-rt84 Sep 21
Steven Rostedt 3.12.63-rt84 Sep 21

Architecture-specific

Build system

Nicolas Pitre make POSIX timers configurable Sep 18

Core kernel code

Development tools

Device drivers

Device driver infrastructure

Documentation

Mauro Carvalho Chehab Create a book for Kernel development Sep 19
Jesper Dangaard Brouer XDP (eXpress Data Path) documentation Sep 20

Filesystems and block I/O

Kirill A. Shutemov ext4: support of huge pages Sep 15
Christoph Hellwig iomap based DAX path V3 Sep 16
Damien Le Moal ZBC / Zoned block device support Sep 20

Memory management

Networking

Security-related

Miscellaneous

Page editor: Jonathan Corbet

Distributions

The NTP pool system

September 21, 2016

This article was contributed by Tom Yates

NTP, the Network Time Protocol, quietly and without much fuss performs the critical internet function of knowing the correct time. Using it, a computer with imperfect communications links may join a distributed community of servers, each of which is either directly attached to a reliable clock, or is trying to best synchronize its clock to one or more better-synchronized members of the community. The NTP pool system has arisen as a method of providing such a community to the internet; it works well, but is not without its challenges.

NTP is quite complex in design; see these slides [PDF] for some details. Comparison of local and remote time stamps on transmission and reception over a pair of packet exchanges is used, iteratively, to estimate and correct for network latencies, though the inconstant nature of these latencies produces a floor below which errors, in practice, cannot be corrected. As the protocol's author notes [PDF], a huff-and-puff algorithm corrects for large outliers and asymmetric delays, and a popcorn spike suppressor clips noise spikes.
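
At the core of each exchange is some simple arithmetic. If t0 and t3 are the client's transmit and receive timestamps, and t1 and t2 are the server's receive and transmit timestamps, the standard NTP formulas yield the clock offset and the round-trip delay:

    def ntp_offset_delay(t0, t1, t2, t3):
        """Offset and delay for one NTP request/response exchange.

        Assumes symmetric network paths, which is exactly the
        assumption that the corrections described above compensate for.
        """
        offset = ((t1 - t0) + (t2 - t3)) / 2.0   # estimated clock offset
        delay = (t3 - t0) - (t2 - t1)            # round-trip network delay
        return offset, delay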

NTP is fairly simple, however, in operation. Timestamps are exchanged via UDP, with the service being assigned port number 123. The protocol is hierarchical, with servers being classified into strata according to how many server hops lie between them and a reliable clock. The reliable clocks themselves are at the theoretical stratum of 0 (no server participating in the protocol should ever advertise itself as being at stratum 0). An NTP server whose current time comes from a directly attached reliable clock is at stratum 1, the highest stratum that should actually be seen on the internet. Servers that take their current time from a stratum 1 server are at stratum 2, and so on down to stratum 16, which equates to "not synchronized".

The protocol has been in continuous use since 1985, making it one of the oldest protocols still around. Back then, the business of finding higher-stratum servers to bind to was done by consulting the list of public NTP servers, which was maintained by the protocol's author, David Mills, on his University of Delaware homepage. You picked a couple of likely-looking servers whose access policies allowed you, dropped their admins an email if they'd asked for it, and put them in your ntp.conf file. (I am old enough to be quietly nostalgic for the days when a three-line text file was enough to configure any reliable service.)

This informal approach worked well while there were enough high-stratum servers willing to support the rest of the internet population. That stopped being the case in 2003, when Mills posted to the comp.protocols.time.ntp newsgroup that he'd removed from the public lists the servers of a national time standards laboratory, at their request, following repeated violations of their declared access policy.

Pooling our efforts

Following some discussion, Adrian von Bidder proposed, then implemented, an approach by which clients would configure the same small number of generic time sources by hostname: the original suggestions were left, right, and center.time.fortytwo.ch, the left, right, and center having no meaning other than to enable multiple servers to be configured without repeating a fully-qualified domain name. Through the use of round-robin DNS, a relatively large number of servers would combine to provide the underlying service. Within 24 hours, the refinement was suggested of incorporating geographical (continent or country) information in the hostnames to assist clients who wished to use servers that were likely to be fewer network hops away. Later that day, the suggestion was made to change the domain name being used to pool.ntp.org.
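
The round-robin behavior is easy to observe today; repeated lookups of a pool name return a small, rotating subset of the thousands of servers in the zone:

    import socket

    # Each query returns a handful of addresses drawn from the pool.
    for family, _, _, _, sockaddr in socket.getaddrinfo(
            "0.pool.ntp.org", 123, proto=socket.IPPROTO_UDP):
        print(sockaddr[0])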

At this point, the skeleton of today's NTP pool service, which many Linux distributions come ready to use out of the box, was clearly visible. Modern pool hostnames, which look like 0.pool.ntp.org, 1.europe.pool.ntp.org, and 2.ar.pool.ntp.org, can clearly be seen foreshadowed in the previous discussion (again, the 0, 1, and 2 are meaningless placeholders to allow multiple servers to be drawn from the same zone without repeating a hostname). Later additions included a backend monitoring system, which continuously checks all pool servers to verify that they advertise good time and removes faulty servers from the pool until they do so again. The project also developed a DNS server that allowed for the weighting of servers in creating responses to queries of the pool zone, so that servers that had better internet connections could be returned more often than those with poorer connections.

In 2005, von Bidder relinquished the reins of the project. They were capably taken up by Ask Bjørn Hansen, who remains the project's lead developer (currently assisted by, amongst others, Guillaume Filion, Arnold Schekkerman, and John Winters). One major change since then was the addition in 2006 of the vendor pool; projects and commercial vendors that wish to ship systems preconfigured to use the NTP pool as a time source are strictly forbidden from using simple pool hostnames. Instead, they must ask the project for a zone dedicated to them (e.g. debian.pool.ntp.org, centos.pool.ntp.org, or linksys.pool.ntp.org), which they are allowed to configure into software and hardware they ship. The main technical reason for this is to enable a quick solution to a vendor that ships devices that, accidentally or otherwise, start to abuse the pool servers; the project notes that an accompanying procedural benefit is to have a process in place to talk to people who are going to use the pool at larger scale, to make sure they do so responsibly. Free-software organizations needing pool zones are requested to make a reference in their configuration file or documentation encouraging people to join the pool; the following example comes from the CentOS 7 config file, as installed:

    # Use public servers from the pool.ntp.org project.
    # Please consider joining the pool (http://www.pool.ntp.org/join.html).
    server 0.centos.pool.ntp.org iburst
    server 1.centos.pool.ntp.org iburst
    server 2.centos.pool.ntp.org iburst
    server 3.centos.pool.ntp.org iburst

Commercial or closed-source organizations are asked to make a financial contribution to the project, and possibly to add a few servers to the pool as well.

How are we doing for time?

One could argue that the public NTP service was saved from being a victim of its own success by two factors: firstly, the increasing affordability of GPS-driven time sources, which allow a stratum 1 server to be built without the need for an immediately-adjacent cesium beam clock or an expensive radio time signal receiver, and secondly, the creation of the NTP pool system. At the time of writing, the pool contains about 3,700 server addresses, of which about 30% are IPv6 addresses. Slightly more than two-thirds of the pool servers are in Europe, and most of the rest are in North America; this information can be found in real time, and in much more detail, on the pool's website. The most common stratum for a pool server is 2, which contains about 70% of the pool's servers; less than 10% are at stratum 1.

Because the system is distributed, there is no easy way to know how many clients are served by the pool system. One estimate in 2011 was five to fifteen million, and that number can only have increased since. It's pretty clear that running a pool server is a remarkably efficient way of helping people: each server helps, on average, well over a thousand end-users know the right time. Or, to look at it another way, because most pool servers are at stratum 2, their redistribution of a stratum-1 server's time reduces the load on the server of anyone kind enough to run a public stratum-1 server by a factor of a thousand or more.

But the pool system has its problems. The DNS infrastructure is in need of assistance. The system that monitors the quality of potential pool servers is located on the west coast of the US, and though its failure wouldn't result in the immediate unavailability of the pool's NTP servers, it is a single point of failure and could do with being parallelized to other sites around the world. Translation assistance is needed. Although it has been amazingly reliable for the last decade, Hansen notes that the system as a whole has several more single points of failure that he'd like to eliminate.

And more than anything else, the system needs more NTP servers, particularly in places other than Europe and North America. At the time of writing, 158 country zones have no servers at all. If a client queries an empty zone, the query generally falls back to the continental zone. This creates a disincentive to be one of the first few server operators in a zone, because all the queries descend on your server, but the alternative is that some countries remain without local time service. A proposal has been made to populate otherwise-empty or lightly-populated zones with randomly-selected volunteers from the global zone, to try to minimize the impact of being the first server in a country, but this can only go so far: China, a country not noted for unfettered links to the global internet and thus in particular need of local servers, has only eight servers in its zone. Until the world is well-supplied with servers, more are needed everywhere.

We noted above that the most common stratum of a pool server is 2. The average LWN reader will very quickly infer that a candidate pool server does not need a directly-attached clock. According to the project, all you need is a static IP address and a reliable internet connection. Assistance is offered to potential pool server operators in getting things configured correctly. Once this is done, your author, who has run a pool server for over five years, can attest that it is a remarkably painless and trouble-free process. It imposes a small load on the underlying hardware (I estimate it adds about 0.2 to the server's load average, and around 20GB of network traffic a month), and the pool allows you to select a lower connectivity level if you prefer a lighter load than this. If you want a way to help a lot of people do something important with minimal effort, running an NTP pool server is worth a look.

Comments (67 posted)

Brief items

Distribution quote of the week

Debian's bug system is a tool we use to improve the distribution, not a user support channel. We should not retain bugs that do not help us achieve that. It would be great if it could also be a user support channel, but this is just unachievable for a volunteer-maintained distribution like Debian, and we should avoid creating the impression that we promise to do this.
-- Russ Allbery

Comments (1 posted)

Updated Debian 8: 8.6 released

Debian 8.6 has been released. "This update mainly adds corrections for security problems to the stable release, along with a few adjustments for serious problems. Security advisories were already published separately and are referenced where available."

Full Story (comments: none)

Newsletters and articles of interest

Distribution newsletters

Comments (none posted)

Catanzaro: GNOME 3.22 core apps

Michael Catanzaro lays down the rules for which GNOME applications distributions should package if they want to claim to provide a "pure GNOME experience." "Selecting the right set of default applications is critical to achieving a quality user experience. Installing redundant or overly technical applications by default can leave users confused and frustrated with the distribution. Historically, distributions have selected wildly different sets of default applications. There’s nothing inherently wrong with this, but it’s clear that some distributions have done a much better job of this than others."

Comments (38 posted)

Page editor: Rebecca Sobol

Development

Automating font builds

By Nathan Willis
September 21, 2016

ATypI

At ATypI 2016 in Warsaw, Marek Jeziorek and Behdad Esfahbod presented the work that the font-tools team at Google has done to automatically build and test fonts from source. That work is increasingly important to Google's Noto font project, which is intended to cover every writing system and language on the planet. As one would expect, building and updating fonts that cover so much ground is far beyond what developers can do manually.

Jeziorek started out by describing the team and explaining its mission. There are currently six engineers involved, plus three people who work in a management capacity. The team is responsible for the Noto font family, which means that it must answer to two product teams: Chrome OS and Android. In addition, the team develops all the tools needed to make Noto usable, and those tools are intended to be useful for other projects both inside the company (for example, the Google Fonts project) and outside it.

Noto is an abbreviation for "No more Tofu," Jeziorek explained, a reference to the empty-square fallback character seen when a font does not have the glyphs needed to display the necessary text. The goal is for Noto to provide a quality font for every writing system and language, and to do so in a style that is consistent and looks harmonious across all of those various languages. The main benefit will be to the billion computer users estimated to get online for the first time in the next few years, who do not speak a language written with the Latin, Greek, or Cyrillic alphabets.

The plans right now include more than 100 writing systems and 800 languages, although that could be adjusted. The Noto font family is designed to follow Unicode; if there are additions in future revisions, Noto will add more support to match whatever changes. For each language, there will be a serif font design, a sans serif, and "sans UI" that targets interface design rather than setting long text runs. All of the fonts are open source and are hosted on GitHub.

The big challenge, he said, is coordinating with all of the one-hundred-plus designers that create fonts for their local language. Google works with partners, namely Adobe for Chinese, Japanese, and Korean (CJK) and Monotype for other languages, but those partners frequently employ outside contractors. The result is that the Noto project pulls in work created by a lot of independent type designers.

Each new font project begins with a design proposal sent in as a PDF, Jeziorek said. Several iterations of design feedback tend to follow (focusing on matching the Noto look and feel as well as on readability), after which the designer begins working on the final product. The files delivered when the font is complete are in source form, which is then built and tested for completeness. At the moment, he said, 29 Noto families have been released (each containing several fonts as mentioned earlier, and each including several weight and width variants) that build into 638 TrueType or 638 OpenType font binaries. That amounts to a lot of testing.

Esfahbod then talked about the build pipeline. Three years ago, he said, many existing font-manipulation scripts and tools targeted the Unified Font Object (UFO) format, and he began working on the FontTools Python library to improve its support for UFO. One of the first tasks, for example, was to build a tool to create a subset of an existing font. At ATypI 2014, however, he commented that he intended to keep extending FontTools because the other major production pipeline, Adobe's Font Development Kit for OpenType (AFDKO), was closed-source. "Then David Lemon gave me the good news," he said, that AFDKO was being released as an open-source project (with one or two components as small exceptions), as were several related libraries.
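
That subsetting work survives in FontTools today as the fontTools.subset module. A small example of the Python API follows; the file names are only illustrative:

    from fontTools import subset

    options = subset.Options()
    font = subset.load_font("NotoSans-Regular.ttf", options)

    subsetter = subset.Subsetter(options)
    subsetter.populate(unicodes=range(0x41, 0x5B))   # keep A-Z only
    subsetter.subset(font)

    subset.save_font(font, "NotoSans-AZ.ttf", options)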

So, for the first time, it looked like it was possible to create a full build pipeline using open-source software. When the team next renegotiated its contracts with the type designers, it asked them to provide UFO source files rather than binaries. Surprisingly, though, the replies were "we just switched to using Glyphs." Esfahbod talked to Glyphs developer Georg Seifert about the possibility of providing a scriptable Glyphs-to-UFO converter, and Seifert was receptive to the idea. But, in the end, Esfahbod just wrote the converter himself, since the Glyphs file format is well-documented on GitHub.

That new component made it possible to extract data straight from a Glyphs source file and pipe it into any of several available UFO-related tools, including the AFDKO. New engineers started working on additional pieces of the pipeline. James Godfrey-Kittle developed a library to use UFOs within FontTools; Sascha Brawer wrote a tool to bypass one of the few remaining binary components of AFDKO, the one that fixes overlapping contours in glyph outlines. Other pieces in the ever-growing Python toolkit include compreffor, which optimizes Compact Font Format (CFF) font tables for size; cu2qu, which converts cubic Béziers from a CFF font into the quadratic Béziers required for TrueType; a tool to do visual diffing between fonts; and much more.
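
Of those pieces, cu2qu has perhaps the simplest interface: it takes the control points of a cubic Bézier and an error tolerance, and returns the control points of an approximating quadratic spline. A quick illustration:

    from cu2qu import curve_to_quadratic

    # One cubic segment: start point, two off-curve handles, end point.
    cubic = [(0, 0), (35, 70), (65, 70), (100, 0)]

    # Approximate it, allowing at most one font unit of error.
    quadratic = curve_to_quadratic(cubic, 1.0)
    print(len(quadratic), "control points in the quadratic spline")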

The tool set eventually encompassed everything required to automatically build high-quality binary fonts directly from Glyphs source files. Nevertheless, there was no top-level process to coordinate all of the pieces, until Godfrey-Kittle wrote fontmake. As the name implies, fontmake is a full build system for font sources. The supported inputs include UFOs, Glyphs files, and several auxiliary formats like the designspace files used to define interpolation masters in MutatorMath. That library has been the industry standard to generate large font families from a small set of masters, although Esfahbod noted that the newly released OpenType 1.8 variations font feature may change a lot of workflows.
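
In its simplest form, fontmake is driven from the command line; a Glyphs source can be compiled to binaries with an invocation along these lines (the source file name is a placeholder):

    import subprocess

    # Build TrueType and CFF OpenType binaries from one Glyphs source.
    subprocess.run(["fontmake", "-g", "MyFamily.glyphs",
                    "-o", "ttf", "otf"], check=True)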

Up until recently, though, fontmake was useful for the Noto team but was not really friendly to outsiders. "We developers just want to write new code, and maybe fix a bug here and there," he said, not maintain the code. But that has begun to change, he reported. Brawer has developed testing and quality-assurance tools; Godfrey-Kittle and Cosimo Lupo have integrated fontmake with continuous integration, and Godfrey-Kittle has gotten the various dependencies under control, ensuring that fontmake works with a set of "known good" versions of the other libraries.

The result is a fully capable build tool, though Esfahbod noted that it is currently designed only to support Noto's precise build process. Other users will certainly want different or additional options. It is also embarrassingly slow in some places, he added, at least compared to C. "But that's not the point," he said. "The point is that it's faster for development. Font tooling is such a niche that there are not many people interested in developing for it, but Python is what that community wants to use."

More flexibility and stability are to come, he said, and there are features yet to be added as well. Work will need to be done to support the new OpenType 1.8 variations fonts, there are improvements to be made in how CFF operators are used, and there is still one last piece of the AFDKO that has no open-source equivalent: the PostScript auto-hinter used on CFF fonts.

At the pace that the tools team has been working, however, the wait might not be too long at all. For everyone outside of the Noto project, there are plenty of benefits to be reaped from a set of open-source build tools, from enabling distributions and other font distributors to automate their build scripts to enabling font designers to make use of concepts like continuous integration that other corners of the development community have already found useful.

Comments (1 posted)

Brief items

Quotes of the week

Two's complement arithmetic is very convenient for hardware, but has some counterintuitive mathematical properties at the margins. The C language is designed to let the obvious hardware instructions be used for its constructs, so it classes those edge cases as undefined behavior (that indeed may vary from one machine architecture to another).

This design decision was better for execution speed than for software reliability. In future revisions of the C standard, someone should argue that maybe we have enough execution speed now but not enough software reliability. So maybe the language should evolve in a way that defines these edge cases (and requires slower code on some oddball architectures). C is a living language and it does evolve.

John Gilmore

Here are some questions to ask before adding a library to your dependencies:

  • Is the library battle tested?
  • Is its API stable?
  • How does its code smell?
  • Is it backed by a community of developers that help maintain it?
  • Is it backed by stable organization? Is that organization known for responsible OSS stewardship?
  • Is it small enough that none of the other questions matter?

There’s plenty more you could ask. The point is: do your due diligence. If you decide to gamble, own it. Take responsibility.

Justin Kramer investigates npm install hairball. (Thanks to Paul Wise.)

Comments (12 posted)

Bash 4.4 and Readline 7.0 released

The GNU Bourne Again SHell (Bash) project has released version 4.4 of the tool. It comes with a large number of bug fixes as well as new features: "The most notable new features are mapfile's ability to use an arbitrary record delimiter; a --help option available for nearly all builtins; a new family of ${parameter@spec} expansions that transform the value of `parameter'; the `local' builtin's ability to save and restore the state of the single-letter shell option flags around function calls; a new EXECIGNORE variable, which adds the ability to specify names that should be ignored when searching for commands; and the beginning of an SDK for loadable builtins, which consists of a set of headers and a Makefile fragment that can be included in projects wishing to build their own loadable builtins, augmented by support for a BASH_LOADABLES_PATH variable that defines a search path for builtins loaded with `enable -f'. The existing loadable builtin examples are now installed by default with `make install'." In addition, the related Readline command-line editing library project has released Readline 7.0.

Full Story (comments: 30)

Emacs 25.1 released

Version 25.1 of the Emacs editor is available. New features include a dynamic module loader, experimental Cairo drawing, better TLS certificate validation, better Unicode input, a mechanism for embedding widgets within buffers, and more.

Full Story (comments: 9)

LLVM contemplates relicensing

The LLVM project is currently distributed under the BSD-like NCSA license, but the project is considering a change in the interest of better patent protection. "After extensive discussion involving many lawyers with different affiliations, we recommend taking the approach of using the Apache 2.0 license, with the binary attribution exception (discussed before), and add an additional exception to handle the situation of GPL2 compatibility if it ever arises."

Full Story (comments: 32)

CouchDB 2.0 released

The Apache CouchDB database project has announced its 2.0 release. New features include clustering support, a new query language, a new administrative interface, and more. "CouchDB 2.0 is 99% API compatible with the 1.x series and most applications should continue to just work."

Comments (none posted)

GNOME 3.22 released

The GNOME Project has announced the release of GNOME 3.22, "Karlsruhe". "This release brings comprehensive Flatpak support. GNOME Software can install and update Flatpaks, GNOME Builder can create them, and the desktop provides portal implementations to enable sandboxed applications. Improvements to core GNOME applications include support for batch renaming in Files, sharing support in GNOME Photos, an updated look for GNOME Software, a redesigned keyboard settings panel, and many more."

Full Story (comments: 42)

Newsletters and articles

Development newsletters from the past week

Comments (none posted)

Hutterer: Synaptics pointer acceleration

For this week's development horror story, it would be hard to do better than Peter Hutterer's quest to figure out how pointer acceleration works in the Synaptics driver. "Also a disclaimer: the last time some serious work was done on acceleration was in 2008/2009. A lot of things have changed since and since the server is effectively un-testable, we ended up with the mess below that seems to make little sense. It probably made sense 8 years ago and given that most or all of the patches have my signed-off-by it must've made sense to me back then. But now we live in the glorious future and holy cow it's awful and confusing."

Comments (30 posted)

Coghlan: The Python packaging ecosystem

Here's a lengthy piece from Nick Coghlan on how Python software gets to users. "There have been a few recent articles reflecting on the current status of the Python packaging ecosystem from an end user perspective, so it seems worthwhile for me to write-up my perspective as one of the lead architects for that ecosystem on how I characterise the overall problem space of software publication and distribution, where I think we are at the moment, and where I'd like to see us go in the future."

Comments (28 posted)

Garcia: WebKitGTK+ 2.14

Carlos Garcia Campos takes a look at the latest stable release of WebKitGTK+. "[The threaded compositor] is the most important change introduced in WebKitGTK+ 2.14 and what kept us busy for most of this release cycle. The idea is simple, we still render everything in the web process, but the accelerated compositing (all the OpenGL calls) has been moved to a secondary thread, leaving the main thread free to run all other heavy tasks like layout, JavaScript, etc. The result is a smoother experience in general, since the main thread is no longer busy rendering frames, it can process the JavaScript faster improving the responsiveness significantly." This release is also considered feature complete in Wayland.

Comments (8 posted)

The curious case of the switch statement (fuzzy notepad)

The fuzzy notepad blog is carrying a post about the switch statement with just about everything one might want to know about its past, present, and possible future. "As we’ve seen, the switch statement has had basically the same form for 49 years. The special case labels are based on syntax derived directly from fixed-layout FORTRAN on punchcards in 1957, several months before my father was born. I hate it."

Comments (24 posted)

Page editor: Nathan Willis

Announcements

Calls for Presentations

PGConf US 2017 Call For Presentations

PGConf US will take place March 28-31, 2017 in Jersey City, NJ. The call for presentations closes November 15. "This year we have announced several new additions to the conference such as an extra day of talks, a job fair, and increased sponsor opportunities. With the support of the United States PostgreSQL Association, we are dedicated to increasing benefits for all of our conference attendees, speakers, and sponsors."

Full Story (comments: none)

CFP Deadlines: September 22, 2016 to November 21, 2016

The following listing of CFP deadlines is taken from the LWN.net CFP Calendar.

September 25: FUDCon Phnom Penh (November 4-6), Phnom Penh, Cambodia
September 30: T-Dose (November 12-13), Eindhoven, Netherlands
September 30: NoSlidesConf (December 3), Bologna, Italy
September 30: OpenFest 2016 (November 5-6), Sofia, Bulgaria
September 30: 5th RISC-V Workshop (November 29-30), Mountain View, CA, USA
September 30: Chaos Communication Congress (December 27-30), Hamburg, Germany
October 1: 2016 Columbus Code Camp (October 22), Columbus, OH, USA
October 19: eloop 2016 (November 19), Stuttgart, Germany
October 25: O'Reilly Open Source Convention (May 8-11), Austin, TX, USA
October 26: Barcelona Perl Workshop (November 5), Barcelona, Spain
October 28: Pycon Argentina 2016 (November 25-27), Bahía Blanca, Argentina
October 30: Swiss Python Summit (February 17), Rapperswil, Switzerland
October 31: FOSDEM 2017 (February 4-5), Brussels, Belgium
November 11: Linux Piter (November 11-12), St. Petersburg, Russia
November 11: DevConf.cz 2017 (January 27-29), Brno, Czech Republic
November 13: Mini Debian Conference Japan 2016 (December 10), Tokyo, Japan
November 15: Southern California Linux Expo (March 2-5), Pasadena, CA, USA
November 15: PGConf US 2017 (March 28-31), Jersey City, NJ, USA
November 18: PyCaribbean (February 18-19), Bayamón, Puerto Rico, USA
November 20: SciPy India (December 10-11), Bombay, India

If the CFP deadline for your event does not appear here, please tell us about it.

Upcoming Events

Events: September 22, 2016 to November 21, 2016

The following event listing is taken from the LWN.net Calendar.

September 16-22: Nextcloud Conference, Berlin, Germany
September 19-23: Libre Application Summit, Portland, OR, USA
September 20-22: Velocity NY, New York, NY, USA
September 20-23: PyCon JP 2016, Tokyo, Japan
September 21-23: X Developers Conference, Helsinki, Finland
September 22-23: European BSD Conference, Belgrade, Serbia
September 23-25: OpenStreetMap State of the Map 2016, Brussels, Belgium
September 23-25: PyCon India 2016, Delhi, India
September 26-27: Open Source Backup Conference, Cologne, Germany
September 26-28: Cloud Foundry Summit Europe, Frankfurt, Germany
September 27-29: OpenDaylight Summit, Seattle, WA, USA
September 28-30: Kernel Recipes 2016, Paris, France
September 28-October 1: systemd.conf 2016, Berlin, Germany
September 30-October 2: Hackers Congress Paralelní Polis, Prague, Czech Republic
October 1-2: openSUSE.Asia Summit, Yogyakarta, Indonesia
October 3-5: OpenMP Conference, Nara, Japan
October 4-6: ContainerCon Europe, Berlin, Germany
October 4-6: LinuxCon Europe, Berlin, Germany
October 5-7: International Workshop on OpenMP, Nara, Japan
October 5-7: Netdev 1.2, Tokyo, Japan
October 6-7: PyConZA 2016, Cape Town, South Africa
October 7-8: Ohio LinuxFest 2016, Columbus, OH, USA
October 8-9: LinuxDays 2016, Prague, Czechia
October 8-9: Gentoo Miniconf 2016, Prague, Czech Republic
October 10-11: GStreamer Conference, Berlin, Germany
October 11-13: Embedded Linux Conference Europe, Berlin, Germany
October 11: Real-Time Summit 2016, Berlin, Germany
October 12: Tracing Summit, Berlin, Germany
October 13: OpenWrt Summit, Berlin, Germany
October 13-14: Lua Workshop 2016, San Francisco, CA, USA
October 17-19: O'Reilly Open Source Convention, London, UK
October 18-20: Qt World Summit 2016, San Francisco, CA, USA
October 21-23: Software Freedom Kosovo 2016, Prishtina, Kosovo
October 22-23: Datenspuren 2016, Dresden, Germany
October 22: 2016 Columbus Code Camp, Columbus, OH, USA
October 25-28: OpenStack Summit, Barcelona, Spain
October 26-27: All Things Open, Raleigh, NC, USA
October 27-28: Rust Belt Rust, Pittsburgh, PA, USA
October 28-30: PyCon CZ 2016, Brno, Czech Republic
October 29-30: PyCon HK 2016, Hong Kong, Hong Kong
October 29-30: PyCon.de 2016, Munich, Germany
October 31-November 2: O'Reilly Security Conference, New York, NY, USA
October 31: PyCon Finland 2016, Helsinki, Finland
October 31-November 1: Linux Kernel Summit, Santa Fe, NM, USA
November 1-4: Linux Plumbers Conference, Santa Fe, NM, USA
November 1-4: PostgreSQL Conference Europe 2016, Tallin, Estonia
November 3: Bristech Conference 2016, Bristol, UK
November 4-6: FUDCon Phnom Penh, Phnom Penh, Cambodia
November 5: Barcelona Perl Workshop, Barcelona, Spain
November 5-6: OpenFest 2016, Sofia, Bulgaria
November 7-9: Velocity Amsterdam, Amsterdam, Netherlands
November 9-11: O'Reilly Security Conference EU, Amsterdam, Netherlands
November 11-12: Seattle GNU/Linux Conference, Seattle, WA, USA
November 11-12: Linux Piter, St. Petersburg, Russia
November 12-13: T-Dose, Eindhoven, Netherlands
November 12-13: Mini-DebConf, Cambridge, UK
November 12-13: PyCon Canada 2016, Toronto, Canada
November 13-18: The International Conference for High Performance Computing, Networking, Storage and Analysis, Salt Lake City, UT, USA
November 14-16: PGConfSV 2016, San Francisco, CA, USA
November 14: The Third Workshop on the LLVM Compiler Infrastructure in HPC, Salt Lake City, UT, USA
November 14-18: Tcl/Tk Conference, Houston, TX, USA
November 16-17: Paris Open Source Summit, Paris, France
November 16-18: ApacheCon Europe, Seville, Spain
November 17: NLUUG (Fall conference), Bunnik, The Netherlands
November 18-20: GNU Health Conference 2016, Las Palmas, Spain
November 18-20: UbuCon Europe 2016, Essen, Germany
November 19: eloop 2016, Stuttgart, Germany

If your event does not appear here, please tell us about it.

Page editor: Rebecca Sobol


Copyright © 2016, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds