Leading items

Welcome to the LWN.net Weekly Edition for July 13, 2017

This edition contains the following feature content:

Emacs and Magit: both a review of the "Magit" interface to Git and a discussion of why the FSF is talking about starting a project to replace it.
User=0day considered harmful in systemd: surprising results from a disagreement over what makes a valid username.
OpenBSD kernel address randomized link: OpenBSD plans to randomly relink its kernel for every boot.
4.13 Merge window, part 1: coverage of the first 7,600 changesets merged for 4.13.
Hardened usercopy whitelisting: a grsecurity-derived technology to further restrict the copying of data between kernel and user space.
Highlights in Fedora 26: what's in the just-announced Fedora 26 release.

This week's edition also includes these inner pages:

Brief items: Brief news items from throughout the community.
Announcements: Newsletters, conferences, security updates, patches, and more.

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.

Emacs and Magit

By Jonathan Corbet
July 12, 2017

The Git source-code management system is widely known for its flexibility and for the distributed development model that it supports. Its reputation for ease of use is ... less well established. There should, thus, be an opening for front-end systems that can make Git easier to use. One of the most comprehensive Git front ends, Magit, works within the Emacs editor and has a wide following. But Magit has run into some turbulence within the Emacs development community that is blocking its wider distribution.

A look at Magit

Magit is an Emacs-Lisp (elisp) program, available from the MELPA package archive or directly from GitHub. While the web site does not directly address pronunciation, the usage found there suggests that it sounds more like "magic" than "maggot". Which is probably a good thing.

Once Magit has been installed on a user's system and hooked into Emacs, a magit-status command will bring up a window (in Emacs terminology — a "pane within the window" to many others) that is the primary control mechanism. It shows the information available from a "git status" command and can be used to directly look at the uncommitted changes found within the repository. The tab key can be used to expand or hide various subsections of this window.

Many Git operations are just a keystroke or two away when one is in this window. Position to a changed file and hit "s" to stage a change, or "u" to unstage it. Individual hunks can also be staged or unstaged if desired. Typing "c" brings up the commit options; another "c" will bring up a window for the commit message. From there, "^C^C" will finalize the commit. The "p" command can be used to push to another repository.

There is, of course, a set of options for examining the commit stream; many of them produce an ASCII-art version of the merge history shown by tools like gitk. Many of the common Git logging options (limiting to a specific author or searching for a given string) are available. The log display can be enhanced with diffstat information or the actual patch content. Magit can work with the reflog as well as with the regular commit stream.

The blame mode is another useful way to examine a file's history. When first started, it will annotate a source file with a set of headers indicating which commit added each line, along with author, date, and subject-line info. The actual commits are a keystroke away. Running magit-blame recursively goes to the version of the file just prior to when the lines under point were added. When a commit of interest is found, "M-w" puts the appropriate hash into the kill ring for easy access later.

Naturally, there is support for merging that can use either Smerge or ediff; the latter is, in your editor's opinion, one of the nicer Emacs features in general. Ediff can also be invoked to examine the difference between any two revisions of a file. There is support for rebasing, including handy commands for quickly editing a commit in the history and rebasing the branch on top of the result. For simple tasks like, for example, adding an "Acked-by" to a commit, Magit is far easier and quicker than using git rebase directly.

Most other common Git operations (managing remotes, branching, tagging, cherry-picking, stashing, generating pull requests, etc.) are supported. There is a special mode for bisecting. If the good-or-bad test can be scripted, then the entire bisection process can be run automatically by Magit. All told, it does appear to be a comprehensive interface to the Git functionality needed by most users.

Whether one sees Magit as a better way to use Git will depend, naturally, on one's attitude toward Emacs-based interfaces in general. It is all keyboard-based, of course, as befits a text editor; there is a pulldown menu if Emacs is running within its own window(s), but most users probably don't invoke it often. The vast number of operations and options provided goes well beyond what most users can be expected to remember — that is to be expected, since Git itself is like that. To get around this problem, Magit uses a series of "popups" displaying the available keystrokes. The experience is somewhat similar to working with the WordStar word processor, for the old-timers out there.

There are some complaints to be made about the system, even if the highly modal interface sits well. If a Git command puts out a strange error message (not an unusual occurrence with Git), Magit tends to throw up its hands and say "go look in that other buffer for the error message". Some operations can take a long time — measured in minutes for some logging commands your editor tried — and there is no visual indication that Magit is working or what progress it is making. A lot of (seemingly) irrelevant messages about reverted buffers and such show up in the message line when operations are proceeding. It's not a perfect or seamless interface, by any means.

Including Magit in Emacs

The Emacs editor ships with a wide array of add-on packages, but Magit is not one of them. That was the topic of a recent discussion on the Emacs development list where, to the dismay of some participants, Richard Stallman expressed a wish that "someone would write a package comparable to Magit" that could be included in Emacs. That would seem like a strange wish: Magit appears to be the best Emacs interface to Git, providing functionality that a lot of Emacs users wish to have. It is licensed under GPLv3. It is a large package with an active development community, not something that could be quickly replaced at anything resembling the current level of functionality.

So why is Stallman calling for a project to compete with an established and free Emacs package? The answer, of course, is that Magit contains the work of a long list of developers, quite a few of whom have not filed paperwork assigning their copyrights to the Free Software Foundation. The FSF still insists on such paperwork, refusing, in most cases, to take on software that it cannot claim ownership to.

This sort of discussion comes up occasionally on the Emacs list, with the same result every time. In this case, John Yates said that Magit "could emerge as one of those oh-so-elusive creatures: a true killer app for the emacs platform". He added that "sometimes community might be more important than copyright assignment" and asked Stallman to reconsider. Stefan Monnier, who has seen many such discussions, limited himself to saying:

If we want to distribute something like Magit with Emacs, there's no need to write a replacement: we can simply include Magit itself, since the license allows us to do so. The only hurdle is the one *we* (Emacs maintainers) put.

Stallman, however, is not known for being swayed by such arguments; his response in this case made it clear that he was not going to change his position. Either all of the significant contributors to Magit must sign papers with the FSF (with code from the holdouts being replaced), or an entirely new Emacs interface to Git must be written.

Phillip Lord has announced that he will make an attempt to get the papers for Magit in order. Only time will tell if he will succeed. Either way, though, a lot of work will be expended to either enable the distribution of software that is already distributable, or to try to replace it with equivalent functionality. It is hard to see this as a win for Emacs users or developers, and it would not appear to be a winning strategy for Emacs in general.

User=0day considered harmful in systemd

By Jake Edge
July 12, 2017

Validating user input is a long-established security best practice, but there can be differences of opinion about what should be done when that validation fails. A recently reported bug in systemd has fostered a discussion on that topic; along the way there has also been discussion about how much validation systemd should actually be doing and how much should be left up to the underlying distribution. The controversy all revolves around usernames that systemd does not accept, but that some distributions (and POSIX) find to be perfectly acceptable.

The bug was opened in late June by GitHub user "mapleray". It describes setting up a systemd service file with a "User=0day" entry, which means that the service should run as the 0day user. However, mapleray found that it ran as root instead, which is, at the least, rather surprising. It turns out that usernames starting with a digit are disallowed by systemd—so it ignores the line and puts a warning in the log. Since there is no user specified, systemd falls back to running it as the default user: root.
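
The rejection described above can be illustrated with a short C sketch. The rule used here (first character an ASCII letter or underscore, remaining characters ASCII letters, digits, underscores, or hyphens) is an assumption chosen to match the behavior reported in this article, where a leading digit or a "." makes a name invalid. It is not copied from the systemd source, and the function name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* ASCII-only character tests, so the result does not depend on the locale. */
static bool is_ascii_letter(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Hypothetical helper: returns true if NAME passes a strict,
 * systemd-like syntax check. The exact rule is an assumption. */
bool name_looks_valid(const char *name)
{
    if (name == NULL || name[0] == '\0')
        return false;

    /* A leading digit (as in "0day") is rejected here. */
    if (!is_ascii_letter(name[0]) && name[0] != '_')
        return false;

    for (size_t i = 1; name[i] != '\0'; i++) {
        char c = name[i];
        if (!is_ascii_letter(c) && !is_ascii_digit(c) &&
            c != '_' && c != '-')
            return false;
    }
    return true;
}
```

Under this rule, "app0" passes while "0day", "0app", and "j.random.hacker" do not. A caller that treats a failed check as "no user specified" would reproduce the root fallback reported in the bug.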

Lennart Poettering replied that systemd was functioning as intended. In his mind, 0day is not a valid username, thus systemd should not accept it. The comment didn't address whether a warning was sufficient, however. He duly marked it as "notabug" and closed it. Others were not so sure, however.

Many of the GitHub comments focused on whether 0day should be treated as a valid username. Some distributions do allow it; even those that disallow it via adduser will allow it when used with the lower-level useradd command (or with adduser --force-badname). POSIX is much more liberal than most distributions; it allows many different kinds of usernames, including names starting with "-" or "." and names containing ".", any of which might confuse various command-line programs (e.g. chown), users, and administrators.

Falling back to root

But others thought that focusing on username validity was misplaced. Whether or not 0day is accepted by systemd as a valid username, many found it hard to see why the service should be started running as root when an invalid username is encountered. As "rain-1" put it: "The real problem here is that the unit is run as root, despite '0day' not describing the root user, isn't it?" To some, at least, systemd is clearly doing the wrong thing; "RealDolos" summed it up this way:

Same as systemd should not pick and run some random binary when the specified ExecStart is invalid, it shouldn't run a service under a random uid if it cannot find the specified user (uid 0 being the random uid fairly picked by the Debian PRNG, of course)

Shortly thereafter, trolls showed up in the bug report, which led to the issue being locked so that only project members could comment. Poettering posted a summary of his thoughts on the bug, but it did not address the issue of whether systemd should ignore invalid usernames in User directives (and run the unit as root). For backward compatibility reasons (i.e. newer service files used by older versions of systemd), complaints are issued about syntax errors, like unknown directives or directives with syntactically invalid settings, but the unit file is still run, he explained. On the other hand, semantically invalid settings (e.g. non-existent, syntactically valid usernames) will cause a fatal error for the unit.

Meanwhile, Daniel Skowroński posted about the bug to the oss-security mailing list. He focused on the root fallback aspect, but that didn't stop others from reiterating some of the discussion about username validity. However, Ben Tasker suggested an all-too-plausible scenario where the root fallback behavior could lead to real problems:

It'd be all too easy for a reviewer to look at the unit file, note it runs as '0day', double check that the package creates a user called '0day' and be happy that it's going to work. Hopefully someone *would* notice but that might not happen until after a vulnerability in that particular package has been remotely exploited (giving root access, oh dear), which is a situation that's in no-one's interest.

Like several others in the thread, Tasker thought that a CVE should be assigned. But Simon McVittie was concerned about the scope of CVE assignment growing to problems that are not truly security related: "[...] if the working definition of a vulnerability gets stretched too far towards things that are 'just a bug', it will reduce the perceived importance of fixing CVEs promptly, harming the overall level of security in software". While Tasker agreed with McVittie's overall concern, he was adamant that this bug qualified for a CVE:

So, as a result, you've potentially got a daemon listening for connections from the outside world, whilst running as root (despite the fact you're expecting it to run as an unprivileged user). If someone can convince that daemon to run arbitrary code, it does so with superuser privileges.

IMHO the discussion on how that unit file makes it onto the system in the first place is an irrelevant distraction.

Mistakes happen, whether that's a package reviewer missing the connotations of the *valid* username in the unit file, or some eejit leaving a unit file world writeable (I've seen kernel modules with 0777 on public facing production systems in the past).

Kurt Seifried assigned a CVE for the issue, CVE-2017-1000082, on July 6.

Another sub-thread on oss-security concerned the locked issue on GitHub. Pali Rohár noted that the lock blocked further discussion there, which is "really *bad* for security related problems". But others pointed out that the only recourse maintainers have when bug discussions start going off the rails is to lock them. The alternative is far worse, Alan Coopersmith said:

Honestly, given the level of flaming and trolling that happens on issues like this, locking the report is the only sane option I can see once everyone started piling on. Forcing FOSS maintainers to accept infinite amounts of shitposting is a horrible way to reduce security by burning out all FOSS maintainers quickly and leaving software abandoned.

Changes coming?

Overlapping some of that discussion was a thread on systemd-devel asking some questions about the bug and whether the fallback behavior would change. Systemd developer Zbigniew Jędrzejewski-Szmek replied that it might:

I do agree that we might want to completely reject unit files when some crucial lines fail to parse, for example ExecStart or WorkingDirectory or User, but it's not as obvious as one would think at first.

When new configuration options are added, the same unit file can almost always be used with older systemd, and it'll just warn & ignore the parts it doesn't understand. Similarly, various configuration options might be unavailable on some architectures and with some compilation options. The current behaviour of warn&ignore provides for "soft degradation" in those cases.

To do this properly, we would need to figure out which options are a) important enough, b) supported on all compilation variants and architectures, and then add a generic mechanism to make errors in them fatal.

Much of the systemd-devel discussion, once again, turned to the question of why valid usernames (by some definitions) were being rejected by systemd at all. Poettering posted a lengthy defense of the practice and pointedly disagreed with the POSIX rules (which are quite lax). He noted that the systemd username rules came from an interest in supporting the most portable subset of usernames across distributions. He also pointed out that the users in question here are "system users", rather than regular users that might want a username like "j.random.hacker" (which systemd would treat like 0day):

Also, do note that system users are different concepts than regular users: system users are concepts required for system services which are usually put together by developers, packagers and administrators who hopefully understand these issues to some point and pick good names instead.

Jędrzejewski-Szmek ("keszybz" on GitHub) created a pull request (PR) to fix the fallback problem that has been merged into the master branch. Meanwhile, version 234 of systemd, which would include that fix and other changes since the release of 233 back in March, is planned for July 12 (though it has not been tagged as of this writing). So it would seem that units with invalid User settings will not be run (as root or any other user) in systemd moving forward.

Perhaps the best description of why this bug caused such an uproar came in a comment on the issue that will be closed by Jędrzejewski-Szmek's PR; Justin Azoff said:

If I typo User=app as User=app0 the unit will fail to start. If I typo it as User=0app the unit will start as root. This behavior is absolutely insane.

If a unit file has User=something it can be said for certain that the one thing that the administrator absolutely does not want is the unit to start as root.

One can argue that systemd should not be checking usernames for itself, but should instead consult the system by using getpwnam()—many did. But systemd is rather opinionated about how things should work, naming, and so forth, so it should not come as an enormous surprise that it enforces its rules on system usernames. As Poettering pointed out—several times—systemd is free software; users, distributions, and others are all welcome to modify it to suit their needs. So, whether it makes sense or no, that part of the behavior will likely stand.
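
For comparison, a lookup that defers to the system user database with getpwnam() might look like the following minimal sketch. The helper name and its fail-closed convention are illustrative assumptions, not systemd code; the point is only that an unknown name yields an error rather than UID 0.

```c
#include <errno.h>
#include <pwd.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: resolve NAME through the system user database.
 * Returns 0 and stores the UID on success. Returns -1 and leaves *uid
 * untouched when the name is NULL or unknown, so a failed lookup can
 * never silently turn into UID 0. */
int lookup_uid_strict(const char *name, uid_t *uid)
{
    if (name == NULL || uid == NULL)
        return -1;

    errno = 0;
    struct passwd *pw = getpwnam(name);
    if (pw == NULL)
        return -1;      /* not found, or a lookup error (errno is set) */

    *uid = pw->pw_uid;
    return 0;
}
```

A caller would start the service only when the function returns 0, and would treat any other result as a reason to refuse to run the unit.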

OpenBSD kernel address randomized link

By Jake Edge
July 12, 2017

A less than two-month-old project for OpenBSD, kernel address space randomized link (KARL), has turned the kernel into an object that is randomized on every boot. Instead of the code being stored in the same location for every boot of a given kernel, each boot will be unique. Unlike Linux's kernel address space layout randomization (KASLR), which randomizes the base address for all of the kernel code on each boot, KARL individually randomizes the object files that get linked into the binary. That means that a single information leak of a function address from the kernel does not leak information about the location of all other functions.
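
The weakness of base-only randomization can be shown with a few lines of arithmetic. In this sketch the function name and the offsets used in the usage note are made up for illustration; the only assumption is that every symbol keeps its link-time offset from a single randomized base.

```c
#include <stdint.h>

/* With base-only randomization, every symbol keeps its link-time
 * offset from the kernel base, so one leaked runtime address is
 * enough to compute any other one. */
uint64_t infer_address(uint64_t leaked_runtime_addr,
                       uint64_t leaked_link_offset,
                       uint64_t target_link_offset)
{
    uint64_t base = leaked_runtime_addr - leaked_link_offset;
    return base + target_link_offset;
}
```

For example, a leaked address of 0xffffffff81201000 for a function known to sit at offset 0x201000 gives a base of 0xffffffff81000000, and a function at offset 0x345000 must then be at 0xffffffff81345000. When the object files are linked in a different random order for each boot, there is no such fixed table of offsets for the attacker to consult.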

Theo de Raadt first posted about the idea on the OpenBSD tech mailing list on May 30. He described the current layout of the OpenBSD kernel code, which is effectively the boot code and assembly runtime (in locore.o), followed by the kernel .o files in a fixed order. His post had some changes that would split out the assembly runtime from locore.o and link it and all of the kernel .o files in a random order. The only piece that would be placed at a known address would be locore.o; it would be followed by a randomly sized gap, then by the kernel text that has its .o files arranged in a random order. There would also be random gaps before other sections (i.e. .rodata, .data, and .bss) that are placed after the kernel text.
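
The layout described above (a fixed first piece, a random gap, then the objects in random order) can be sketched in C as follows. This is a toy model under stated assumptions: the function names, the xorshift generator, and the parameters are all hypothetical, and the real work is done by the linker rather than by code like this.

```c
#include <stddef.h>
#include <stdint.h>

/* Small xorshift generator so the sketch is reproducible; a real
 * implementation would need a proper entropy source. */
static uint32_t next_random(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Shuffle the link order of N objects (Fisher-Yates), then assign
 * start addresses: a random gap of at most MAX_GAP bytes after the
 * fixed first piece, followed by the objects back to back in their
 * shuffled order. order[i] is the index of the object placed i-th;
 * addr[k] receives the start address of object k.
 * Modulo bias is ignored for simplicity. */
void layout_objects(const uint32_t *sizes, size_t n, uint32_t seed,
                    uint32_t fixed_end, uint32_t max_gap,
                    size_t *order, uint32_t *addr)
{
    uint32_t state = seed ? seed : 1;   /* xorshift must not start at 0 */

    for (size_t i = 0; i < n; i++)
        order[i] = i;
    for (size_t i = n; i > 1; i--) {
        size_t j = next_random(&state) % i;
        size_t tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }

    uint32_t gap = (uint32_t)(next_random(&state) % ((uint64_t)max_gap + 1));
    uint32_t cursor = fixed_end + gap;
    for (size_t i = 0; i < n; i++) {
        addr[order[i]] = cursor;
        cursor += sizes[order[i]];
    }
}
```

Calling the function with a different seed produces a different order and gap, which is the property that matters here: the address of one object says nothing reliable about the addresses of the others.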

Once the kernel was booted, locore.o would still contain a reference to the kernel text in the form of the address of main() (at a known offset in locore.o). So, once the kernel is up, locore.o is either unmapped or destroyed depending on the architecture. The idea is to thwart return-oriented programming (ROP) attacks against the kernel, De Raadt said:

There are a few mitigation strategies against BROP attack methodology. One can be summarized as "never reuse an address space". If a freshly linked kernel of this type was booted each time, we would be well on the way to satisfying that. Then other migitations efforts come into play.

That new layout set the stage for KARL, which De Raadt described in a post on June 12. Reworking the layout still means that the execution of a particular kernel version would always have the same addresses. For users of binary kernels provided by OpenBSD, for example, that would mean an attacker simply needs a copy of the kernel to extract the information they need—no real improvement on the current situation.

What KARL does is to ensure that, every time the kernel is booted, it gets a freshly linked version—new addresses for everything, essentially. De Raadt and Robert Peichaer created a "link kit" that contains all of the kernel .o files needed. After each boot, an rc script links a new kernel in the background and installs it. On the next reboot, a different kernel will be run that, in turn, creates the next kernel. De Raadt said that the link process takes less than a second on a fast machine.

There were still some more pieces of the puzzle to solve: kernel relinks were needed at install time, upgrade time, and when users install their own kernels. De Raadt was back with a post on June 30 that solved all of those problems. In particular, unique kernels are automatically built at install, upgrade, and boot time. In addition, if a new kernel is built and installed with "make install" the link kit will be updated with those objects so that the next boot will result in a new kernel. For debugging purposes, though, if someone copies their own kernel to /bsd, the relinks stop happening until they are re-enabled.

De Raadt asked that users test the feature hard, so that it will be ready for the 6.2 release slated for later this year. Overall, it is an interesting feature that should provide more security at a fairly low cost. It also came about quickly—eye-openingly so—though De Raadt said he has been talking about it with various other OpenBSD developers for five years or so (it "never got off to a good start, probably because I was trying to pawn the work off on others"). It is not inconceivable that Linux could do something similar some day, though one might guess that it would take a fair amount longer than a few months and one release cycle to make that kind of change.

4.13 Merge window, part 1

By Jonathan Corbet
July 10, 2017

As of this writing, just over 7,600 non-merge changesets have been pulled into the mainline repository for the 4.13 kernel release — and the patch rate does not look like it will be slowing down anytime soon. It will be another busy development cycle but, as has often been the case recently, many of the changes are internal cleanups that will not be visible to most users. That said, there are a number of interesting new features in this release.

Some of the most prominent user-visible changes include:

Support for non-blocking buffered I/O operations has been added at the block level. This, in turn, will help to improve asynchronous I/O support when used with buffered I/O.
The virtual filesystem and block layers have gained support for "lifetime hints"; these hints can be set on an open file using the fcntl() system call. The legal values are RWH_WRITE_LIFE_SHORT, for data that is not expected to stay around for long, through to RWH_WRITE_LIFE_EXTREME for data that is expected to last forever. The idea is that the storage device can use this information to optimally place the data; thus far, only the NVMe driver actually makes use of this information.
The perf tool has a new --smi-cost option to measure the cost of system-management interrupts.
The s390 architecture now supports five-level page tables. That means it can now support up to 16EB of RAM, which should be enough for a year or two.
The next-interrupt prediction patches have been merged, hopefully bringing better power-management decisions with them.
While the conversion of the kernel's documentation to reStructured Text is not complete, an important milestone was reached for 4.13: all of the old DocBook template files have been converted, and support for the DocBook toolchain has been removed.
Ubuntu has been carrying a long list of enhancements to the AppArmor security module out-of-tree for some time. In 4.13, the core "domain labeling" code has been merged into the mainline. There is still quite a bit of AppArmor code yet to be upstreamed but, with the core code in place, it should be possible to consider that code in future merge windows.
The structure randomization plugin is now part of the build system. It can be used to randomize the layout of the fields in structures at build time, hopefully adding some security to the system.
The kernel now generates and assigns a unique ID number for each BPF program and map; these IDs can be used to obtain file descriptors for those objects in user space. This commit contains a test program that demonstrates this feature's use.
There is a new BPF program type, BPF_PROG_TYPE_SOCK_OPS, which can be invoked at various points in a socket's lifecycle to tweak a number of connection parameters. Naturally, the developers didn't want to spoil the fun by documenting this feature, but they did let some details slip in this commit message.
The tcp_sack, tcp_window_scaling, and tcp_timestamps sysctl knobs are now maintained separately for each network namespace.
A kernel-based TLS implementation has been merged, enabling better performance for protocols like HTTPS.
The new SO_PEERGROUPS command for getsockopt() will return a list of all groups that the socket peer is a member of.
Zoned block devices have different rules for writing to different parts of the device. For example, one zone may only allow writes to consecutive blocks. The dm-zoned device-mapper module will make a zoned block device look like a normal block device, hiding the zoned device's inherent write constraints in the process. Some information can be found in Documentation/device-mapper/dm-zoned.txt.
The first step in a long-term plan to improve the swapping of transparent huge pages has been merged. In current kernels, huge pages are split into small pages as nearly the first step in swapping them out. In 4.13, that splitting will be delayed until after swap space has been allocated and the swap-cache accounting done. That reduces lock contention and, it is claimed, leads to a 15% performance improvement. The plan is to further delay the split in the future until huge pages can be directly written to and read from the swap device.
New hardware support includes:
- Audio: Everest Semi ES8316 codecs, Rockchip pulse density modulation interface controllers, STMicroelectronics STM32 digital audio interfaces, STMicroelectronics STM32 S/PDIF receivers, and ZTE ZX AUD96P22 codecs.
- Industrial I/O: Texas Instruments ADC084S021, ADC108S102 and ADC128S102 analog-to-digital converters (ADCs).
- Media: STMicroelectronics STM32 digital camera memory interfaces, STMicroelectronics STM32 HDMI CEC interfaces, Maxim Integrated MAX2175 tuners, Renesas digital radio interfaces, OmniVision OV5640 and OV13858 sensors, Freescale i.MX5/6 image processing units, Freescale MX5/6 camera sensor interfaces, and Dongwoon DW9714 voice coil motor interfaces.
- Networking: Cortina EDC CDR 10G Ethernet PHYs, Qualcomm Atheros QCA7000 UARTs, Microchip KSZ series switches, Allwinner H3/A83T/A64 EMAC Ethernet controllers, Marvell Alaska 10Gbit PHYs, and Quantenna 802.11ac QSR10g FullMAC wireless interfaces.
- Pin control: ZTE ZX296718 pin controllers, Ingenic JZ47xx SoC pin controllers, Intel Cannon Lake PCH pin controllers, Marvell Armada AP806 and CP110 pin controllers, Renesas RZ/A1 pin controllers, and Qualcomm IPQ8074 pin controllers.
- USB: Motorola CPCAP USB PHYs, Renesas R-Car generation 3 USB 3.0 PHYs, Broadcom Northstar2 USB DRD PHYs, Synopsys USB 2.0 device controllers, and USB type-C connector system software interfaces.
- Miscellaneous: Aspeed Virtual UARTs, Analog Devices ADG792A/ADG792G Multiplexers, Intel Thunderbolt internal connection managers, Infineon IR35221 digital DC-DC multiphase converters, Dialog Semiconductor DA9061 regulators, TI LP87565 power regulators, HiSilicon Hi6421v530 PMIC voltage regulators, Amlogic Meson SPICC controllers, STMicroelectronics STM32 SPI controllers, Intel ACPI INT0002 virtual GPIO controllers, Motorola CPCAP PMIC battery monitors, Inside Secure SafeXcel cryptographic engines, Cavium CNN55XX cryptographic accelerators, and Faraday Technology FTIDE010 parallel ATA controllers.
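
As an illustration of the lifetime hints mentioned in the list above, the following sketch sets a hint on an open file descriptor. It assumes the F_SET_RW_HINT fcntl() command, which takes a pointer to a 64-bit value; the fallback definitions carry the values from the Linux UAPI headers for C libraries that do not yet expose them, and the wrapper name is hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>

/* Fallback definitions for C libraries whose headers predate these
 * constants; the values are those of the Linux UAPI headers. */
#ifndef F_LINUX_SPECIFIC_BASE
#define F_LINUX_SPECIFIC_BASE 1024
#endif
#ifndef F_SET_RW_HINT
#define F_SET_RW_HINT (F_LINUX_SPECIFIC_BASE + 12)
#endif
#ifndef RWH_WRITE_LIFE_SHORT
#define RWH_WRITE_LIFE_SHORT 2
#endif
#ifndef RWH_WRITE_LIFE_EXTREME
#define RWH_WRITE_LIFE_EXTREME 5
#endif

/* Hypothetical wrapper: attach a write-lifetime hint to FD.
 * Returns 0 on success, or -1 with errno set (for example EBADF for
 * a bad descriptor, or EINVAL on kernels without this feature). */
int set_write_lifetime(int fd, uint64_t hint)
{
    return fcntl(fd, F_SET_RW_HINT, &hint);
}
```

A program writing short-lived temporary data might call set_write_lifetime(fd, RWH_WRITE_LIFE_SHORT) right after opening the file and simply carry on if the call fails, since the hint is advisory.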

Changes visible to kernel developers include:

There are many uses for universally unique identifiers (UUIDs) in the kernel. There are now two standard types for UUIDs: uuid_t and guid_t, replacing the uuid_be and uuid_le types used in some parts of the kernel previously. Various helper functions have been gathered and added to <linux/uuid.h>, and a number of kernel subsystems have been updated to use the new API.
The block-layer error-code refactoring described in this article has been merged.
The read-copy-update full-system idle detection mechanism has been removed, since nothing uses it. The sleepable RCU implementation has also been removed since it no longer seems to be needed.
The new CONFIG_REFCOUNT_FULL configuration option can be used to select the fully checked version of the refcount_t reference-count implementation; when it is not set, the overflow tests are dropped to gain a bit more performance. By default, this option is disabled.
The new "mux" driver subsystem provides support for multiplexer controllers that manage multiple devices.
The SPI driver subsystem has gained support for SPI slave mode.
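
The difference that overflow tests make to a reference count can be sketched in user space. This is an illustration only: the kernel's refcount_t uses atomic operations and differs in detail, and the structure and function names below are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* User-space illustration only; not the kernel implementation. */
struct refcount {
    unsigned int refs;
};

/* Unchecked increment: cheap, but wraps around to 0 at UINT_MAX,
 * which could later lead to a premature free. */
void ref_inc_unchecked(struct refcount *r)
{
    r->refs++;
}

/* Checked increment: refuses to revive a count that has already
 * dropped to 0, and stays pinned at UINT_MAX instead of wrapping.
 * Returns true only if the count was actually incremented. */
bool ref_inc_checked(struct refcount *r)
{
    if (r->refs == 0 || r->refs == UINT_MAX)
        return false;
    r->refs++;
    return true;
}
```

The checked variant pays for two extra comparisons on every increment, which is the kind of cost that a configuration option lets performance-sensitive builds avoid.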

By the usual schedule, the 4.13 merge window should close on July 16, with the final 4.13 release due in the first half of September. In other words, developers who are planning on attending the Linux Plumbers Conference or the North America Open Source Summit will want to be prepared for the next merge window to be open during the events.

A followup article, covering the rest of the 4.13 merge window, will be posted after the 4.13-rc1 release happens.

Hardened usercopy whitelisting

By Jonathan Corbet
July 7, 2017

There are many ways to attempt to subvert an operating-system kernel. One particularly effective way, if it can be arranged, is to attack the operations that copy data between user-space and kernel-space memory. If the kernel can be fooled into copying too much data back to user space, the result can be an information-disclosure vulnerability. Errors in the other direction can be even worse, overwriting kernel memory with attacker-controlled data. The kernel has gained some defenses against this sort of attack in recent development cycles, but there is more work yet to be merged.

Much of the heap memory used within the kernel is obtained from the slab allocator. The hardened usercopy patch set, merged for the 4.8 kernel, attempts to limit the impact of erroneous copy operations by ensuring that no single operation can cross the boundary between one slab-allocated object and the next. But the kernel gets a lot of large memory objects from the slab allocator, and it is often not necessary to copy the entire object between the kernel and user space. In cases where only part of an object needs to be copied, it would be useful to prevent a rogue copy operation from copying to or from parts of the structure that do not need to be exposed in this way.

For example, the large mm_struct structure describes a process's virtual address space; it contains quite a bit of security-sensitive information. One field in this structure, saved_auxv, is copied to and from user space. The prctl() functions that manipulate this field do not copy directly to or from the structure, but there is some obscure code in the ELF binary-format code that does pass that field directly to copy_to_user(). It would be nice if that copy operation could be restricted to that one field without the risk of exposing the rest of the structure.

Enabling protection at that level is the purpose of the hardened usercopy whitelisting patch set. Experience says we need to get the provenance of such patches right, so: this code originally comes from the grsecurity/PaX patch sets. David Windsor ported and reworked the code for mainline, and Kees Cook posted the set for review.

In short, this patch set extends the hardened usercopy mechanism by allowing the specification of a "usercopy region" within a slab-allocated object. Only data within that region can be copied to and from user space with functions like copy_to_user() or copy_from_user(). It is worth noting that no checking is applied to primitives like put_user(); the size of those operations is fixed and should not be subject to run-time attack.

Normally, a slab cache is allocated with kmem_cache_create(). This patch set adds a new function:

    struct kmem_cache *kmem_cache_create_usercopy(const char *name,
                            size_t obj_size, size_t align, unsigned long flags,
                            size_t useroffset, size_t usersize,
                            void (*ctor)(void *));

The useroffset and usersize parameters are new in this version of the function; they describe the region of objects allocated from this cache that can be copied between kernel and user space. If usersize is zero, no copying is allowed at all. Slabs created with kmem_cache_create() and objects obtained with functions like kmalloc() are fully whitelisted.
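
As a rough illustration of what those two parameters describe, the following user-space sketch computes an offset and a size for a single field using offsetof() and sizeof. The structure struct session, the struct usercopy_region type, and the FIELD_REGION() macro are hypothetical names invented for this example; none of them come from the patch set. A real caller would presumably pass the two values to kmem_cache_create_usercopy() rather than storing them in a structure.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical object layout; not a real kernel structure. */
struct session {
    unsigned long secret_key;   /* should never reach user space */
    char user_buf[64];          /* the only field meant to be copied */
    void *private_data;         /* should never reach user space */
};

/* User-space stand-in for the (useroffset, usersize) pair. */
struct usercopy_region {
    size_t useroffset;
    size_t usersize;
};

/* Describe one field of a structure as the whitelisted region. */
#define FIELD_REGION(type, field) \
    ((struct usercopy_region){ offsetof(type, field), \
                               sizeof(((type *)0)->field) })

/* Compile-time sanity check: the region must fit inside the object. */
static_assert(offsetof(struct session, user_buf) +
              sizeof(((struct session *)0)->user_buf) <=
              sizeof(struct session),
              "usercopy region must lie inside the object");

static struct usercopy_region session_region(void)
{
    return FIELD_REGION(struct session, user_buf);
}
```

In this model, everything outside of user_buf falls outside the region, which is the property the new parameters are meant to express.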

Whenever an object obtained from a slab allocator is passed to one of the user-space copy functions, the area to be copied will be checked to ensure that it lies entirely within the whitelisted window. If that test fails, a kernel oops will result.
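
The shape of that test can be sketched in user-space C. This is a simplified model under stated assumptions, not the code from the patch set: the function usercopy_allowed() and the structure struct session are hypothetical, and the function returns false where the kernel would generate an oops.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical object layout; not a real kernel structure. */
struct session {
    unsigned long secret_key;   /* kernel-only in this model */
    char user_buf[64];          /* whitelisted field */
    void *private_data;         /* kernel-only in this model */
};

/*
 * Model of the whitelist check: return true only if the n bytes starting
 * at ptr lie entirely inside the window
 * [obj + useroffset, obj + useroffset + usersize).
 */
static bool usercopy_allowed(const void *obj, size_t useroffset,
                             size_t usersize, const void *ptr, size_t n)
{
    assert(obj != NULL && ptr != NULL);

    uintptr_t base = (uintptr_t)obj;
    uintptr_t addr = (uintptr_t)ptr;

    if (addr < base)
        return false;           /* starts before the object */

    size_t offset = (size_t)(addr - base);

    if (offset < useroffset)
        return false;           /* starts before the window */
    if (n > usersize)
        return false;           /* longer than the whole window */

    /* Phrased as subtractions so that a huge n cannot wrap around. */
    return offset - useroffset <= usersize - n;
}
```

With these definitions, a 16-byte copy starting at user_buf would pass when the window is described by offsetof(struct session, user_buf) and sizeof(user_buf), while a request covering the whole structure would be refused. A usersize of zero rejects every non-empty copy.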

One implication of the above design is that any given object can only have a single region that may be exposed to user space. In cases where it is necessary to copy more than one field, those fields must be grouped together so that the single region covers them all. To get there, the patch set ends up reorganizing a few structures before whitelisting them. A dozen or so structures have been specifically whitelisted in the patch set.
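
The grouping requirement can be illustrated with another small user-space sketch. The structure struct device_state and the REGION_OFFSET() and REGION_SIZE() macros are hypothetical and exist only for this example; the idea is that, once the user-visible fields sit next to each other, a single offset and size can span from the first of them to the end of the last.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>   /* offsetof, size_t */

/* Hypothetical structure, laid out so the user-visible fields are adjacent. */
struct device_state {
    void *ops;                  /* kernel-only */
    /* start of the region exposed to user space */
    unsigned int user_flags;
    char user_name[16];
    /* end of the region */
    unsigned long secret;       /* kernel-only */
};

/* Offset of the first exposed field. */
#define REGION_OFFSET(type, first) offsetof(type, first)

/* Bytes from the start of the first field to the end of the last one. */
#define REGION_SIZE(type, first, last) \
    (offsetof(type, last) + sizeof(((type *)0)->last) - offsetof(type, first))

/* The span only makes sense if the fields are in the expected order. */
static_assert(offsetof(struct device_state, user_flags) <
              offsetof(struct device_state, user_name),
              "user_flags must precede user_name");

static const size_t dev_useroffset =
    REGION_OFFSET(struct device_state, user_flags);
static const size_t dev_usersize =
    REGION_SIZE(struct device_state, user_flags, user_name);
```

If the two exposed fields were separated by a kernel-only field, the same span would include it; that is why such structures have to be reorganized before they can be whitelisted.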

The final step in the patch set creates a new GFP_USERCOPY flag for memory allocations. There are certain system calls that can be used to force the kernel to allocate structures with a size controlled from user space. That is normally harmless, as long as the size is kept within reasonable bounds. But certain types of attacks can benefit from the ability to create objects of a specific size. If those allocations are marked with GFP_USERCOPY, they will be taken from a separate slab, making it harder to control the layout of parts of the heap area.

It's not clear when these patches will be pushed toward the mainline, but there do not appear to be any serious obstacles in their way.

Highlights in Fedora 26

By Jake Edge
July 12, 2017

The much-anticipated release of Fedora 26 was made on July 11. As usual, it came with a wide array of updated packages, everything from the kernel through programming languages to desktops, but internal tools and installation mechanisms have changed as well. Beyond that, the new Python Classroom Lab is aimed at teachers and instructors to make it easier to get a full-featured Python (of various flavors and with lots of extras) in several different easily installable forms. Though it was delayed by more than a month from its original planned release date (something the project embraces at some level), Fedora 26 looks like it was worth waiting for.

The distribution is delivered for workstations and servers, as well as a version, Atomic Host, for container deployments. The Fedora Cloud Base has virtual machine (VM) images for several different cloud options including raw, QEMU copy on write (qcow2), libvirt/KVM, VirtualBox, and two versions for the Amazon Public Cloud. For server installations, there is a preview version of the Modularity initiative, which will (eventually) allow mixing newer components with older ones in ways that will mirror some of the advantages of a rolling release model.

There is plenty for desktop users too. For those who want something different from the default GNOME 3.24 desktop, there are spins covering most of the popular desktop environments (KDE Plasma, XFCE, LXDE, Cinnamon, Mate, LXQt, which is a new spin, and more); there is even a spin for the Sugar on a Stick (SoaS) desktop that came out of the One Laptop per Child initiative. For more specialized uses of the distribution, Fedora Labs provides curated sets of packages for several interest areas: robotics, astronomy, multimedia, security, and games. These tools can be installed as a full Fedora image or simply added to an existing Fedora installation as one bundle.

Fedora is not just available for x86 systems, either. There are multiple images for ARM aarch64, PowerPC PPC64, and PowerPC PPC64LE available from the alternate architectures page. For Raspberry Pi enthusiasts or those interested in ARM server images, the Fedora ARM page is the place to look. The alternative downloads page also has minimal network installers, BitTorrent download links, testing images, and more.

Much of that diversity has been present in Fedora for some time now, though refinements and additions have been made over time, of course. One of the things that the distribution has been working on recently is upgrading from previous versions of Fedora. In the release announcement, Fedora project leader Matthew Miller described it this way:

We've put a lot of work into making upgrades easy and fast. In most cases, this will take half an hour or so, bringing you right back to a working system with no hassle.

As far as software goes, it all starts with a 4.11 kernel. On top of that, GNU C library (glibc) 2.25 and GCC 7 are part of Fedora 26. A mass rebuild of all of the distribution using the new compiler and libc was part of the process of building the release. Other programming languages have been updated as well: Python 3.6, Golang 1.8, Glasgow Haskell Compiler (GHC) 8.0, PHP 7.1 (and Zend Framework 3.0), Ruby 2.4, and so on. System tools and libraries were not left out: OpenSSL 1.1.0, DNF 2.0, Cyrus IMAP 3.0, etc.

On the workstation side, there are some updates as well. LibreOffice 5.3 is in Fedora 26 and there have been many enhancements to the GNOME desktop. The tracker indexing service has been sandboxed for better security and Qt applications are better integrated with the desktop theme. In addition, the Fedora Media Writer application can now create bootable SD cards for ARM devices such as Raspberry Pi.

That's all pretty standard stuff for a new Fedora release. Much of the effort lately has been "under the hood" to some extent. The modularity work is proceeding and the prototype Boltron server based on that work is coming soon. Miller highlighted that in the announcement:

Stay tuned later this week for Fedora Boltron, a preview of a new way to put together Fedora Server from building blocks which move at different speeds. (What if my dev stack was a rolling release on a stable base? Or, could I get the benefits from base platform updates while keeping my web server and database at known versions?) We're also working on a big continuous integration project focused on Fedora Atomic, automating testing so developers can work rapidly without breaking things for others.

The Python Classroom Lab is a more-visible effort that comes out of the Python SIG and its Fedora Loves Python initiative. The lab comes as a Vagrant VM or Docker container to make it easy to install on various systems or it can be installed on Fedora as a bundle. It provides multiple versions of Python (3.6, 2.7, and PyPy 3.3) that are even more "batteries included" than the language itself. The scientific Python stack, IPython, Jupyter Notebook, Git, the Ninja IDE, and more are all part of the bundle. It should provide instructors with a known environment they can rely on for all their students to use.

One perhaps amusing side note is that the web site for the classroom lab (and quite a few others) tries to switch the language of the content based on the browser's geographic location, which is probably quite helpful to most. If, however, your language abilities do not match your current location, as is true for me, it gets a little annoying to keep switching back to English after following links.

Looking ahead, Fedora 27 is currently scheduled for October 24, but a "rain date" of October 31 is listed as well. That is fairly tight for a Fedora release, with less than four months of development time between the two. One reason for the push for late October is to try to align better with the GNOME 3.26 release that is planned for September. It is hoped that the elimination of alpha releases will help make that possible, but that initiative remains unproven at this point.

Miller has been thinking well beyond that, though, as he noted in a post to the Fedora devel mailing list on July 6. He has created "super-drafty F28 and F29 schedules" that try to align the releases with Fedora's traditional May and October release dates. That would put Fedora 28 on May 1, 2018 and Fedora 29 at the end of October 2018 (each with their own rain dates). So far, most of the complaints have been about the super-short Fedora 27 cycle, though there have been some concerns about the mass rebuild scheduled for Fedora 28 as well. That rebuild may conflict with the GCC development schedule to some extent.

One can probably be forgiven for suspecting that we may not see Fedora 27 by Halloween—or possibly even in November. There are lots of things going on in the project right now, new initiatives, plans, and so on, that would seem to make meeting the current schedule a formidable task. But only time will tell. In the meantime, there is much of interest going on in the Fedora world; it will be interesting to see where it all leads.

Page editor: Jonathan Corbet