|
|
Log in / Subscribe / Register

4.4 Merge window, part 1

By Jonathan Corbet
November 11, 2015
As of this writing, over 10,800 non-merge changesets have been pulled into the mainline kernel repository during the 4.4 merge window. This would thus appear to be another busy development cycle. We are still somewhat short of the 12,274 changes pulled during the 4.2 merge window, but there is still time to get there. It's worth noting that over 2,400 of the changes pulled this time came via the staging tree — a lot even for the staging tree's standards.

Some of the interesting, user-visible changes pulled for 4.4 include:

  • The mlock2() system call has been added. This version of mlock() supports a flags argument, which has been used to add the new VM_LOCKONFAULT feature. When this flag is set, pages in the indicated range will be locked into memory once they are faulted in, but the range will not be populated when the mlock2() call is made.

  • The output of the stat file found in each process's /proc directory has changed. In particular, the wchan field (number 30!) used to contain the absolute kernel address where a process is blocked — a leak of important kernel information. It now returns zero for a running process, or one for a process that is blocked.

  • Loopback-mounted filesystems can now support direct and asynchronous I/O operations. A precursor to this work was covered here in 2013, but the code that was actually merged appears to be a much simpler implementation.

  • The block layer now supports I/O polling on high-performance devices. In some situations, polling can lead to significantly lower latencies and higher throughput.

  • The LightNVM patch set has been merged; this work adds support for solid-state storage devices that allow low-level access. With such devices, the kernel can do the low-level management work that is normally done by the flash translation layer, opening up the possibility of higher-performance and more reliable operation.

  • Journaled RAID5 support has been added to the MD subsystem; the addition of a journal to RAID5 can guarantee that inopportune power failures will not corrupt a RAID volume, even when it is running in the degraded mode. Eventually it should offer performance improvements as well, but that's for a future merge window.

  • The perf tool is now able to build and load eBPF programs for use with performance monitoring and event tracing. There have been many other changes to perf, as usual; see the merge commit for details.

  • BPF maps which, until now, were tied to the lifetime of their associated file descriptor, can now be made persistent. They are stored into a virtual filesystem at /sys/fs/bpf. See this commit for examples of how this feature can be used.

  • As described here in October, unprivileged users are now able to load eBPF programs to use as socket filters.

  • A new TCP packet-loss detection mechanism called RACK has been merged. In short: when a selective ACK for a packet is received, any non-acknowledged packets sent at least one round-trip time before the acked packet will be considered lost and retransmitted. This heuristic differs from current mechanisms in that it uses transmit times rather than packet sequence to decide what might be lost. Google has evidently been running this code for the last year with good results and plans to propose the algorithm as an IETF standard. Some information can be found in the changelogs for this commit and this commit.

  • The arm64 architecture can now run with a 16KB page size. Note that Linus has expressed some strong doubts about whether this feature is useful in the real world.

  • Virtualization with KVM can now take advantage of hardware interrupt bypass capabilities. This feature allows interrupts for a device assigned to a guest to be delivered directly to that guest without having to pass through the host first.

  • The NFS client now supports the NFSv4.2 CLONE operation (which makes a fast copy of a file) using an ioctl() that looks suspiciously like the Btrfs clone command.

  • New hardware support includes:

    • Systems and processors: Broadcom Northstar Plus systems-on-chip.

    • Audio: Dialog DA7219 audio codecs, Wolfson WM8998 codecs, Nuvoton NAU8825 audio codecs, Rockchip SPDIF transceivers, and Synopsis Designware AHB audio interfaces.

    • Cryptographic: ST Microelectronics HW random number generators.

    • Graphics: Broadcom VC4 GPUs (as seen on the Raspberry Pi).

    • Industrial I/O: Avago APDS9960 gesture/RGB/ALS/proximity sensors, PulsedLight LIDAR sensors, Memsic MXC4005XC 3-Axis accelerometers, Holt Integrated Circuits HI-8435 threshold detectors, TI HDC100x relative humidity and temperature sensors, SGX Sensortech MiCS VZ89X VOC sensors, UPISEMI us5182d light and proximity sensors, Microchip MCP45xx/MCP46xx digital potentiometers, and Measurement Specialties MS5637, HTU21, and TSYS01 sensors.

    • Input: Unisys visorhid input controllers, FocalTech FT6236 I2C touchscreen controllers, ROHM BU21023/24 dual touch resistive touchscreens, and Corsair Vengeance K90 keyboards.

    • Miscellaneous: Maxim MAX31790 fan RPM controllers, Sitronix ST7789V display controllers, Freescale i.MX23/28/6 one-time programmable memory controllers, Intel MIC Coprocessor State Management devices, Intel Trace Hub controllers, Altera SOCFPGA FPGA managers, Xilinx Zynq FPGA managers, X-POWERS AXP20X PMIC regulators, Qualcomm Switch-Mode battery chargers, TI TPS65217 battery chargers, Silicon Labs 514 programmable I2C clock generators, Broadcom BCM2835 SPI auxiliary controllers, Dialog Semiconductor DA9150 fuel gauges, Atmel flexible serial communication units, Altera PCIe host controllers, Hisilicon HiP05 PCIe-SAS system controllers, Micro Crystal RV8803 realtime clocks, Broadcom BCM7038 watchdogs, UniPhier I2C controllers, and Allwinner sunXi reduced serial bus controllers.

    • Networking: Texas Instruments DP83848 PHYs, Hisilicon Network Subsystem modules (and various network devices managed by it), Allwinner A10 CAN controllers, Broadcom Cygnus SoC internal PHYs, Broadcom NetXtreme-C/E 10/25/40/50 gigabit Ethernet cards, Microchip ENC424J600 ethernet chips, Mellanox Technologies Spectrum Ethernet switches, QLogic QED 25/40/100Gb Ethernet NICs, Realtek RTL8XXX wireless interfaces (new from-scratch driver), Intel Fields Peak NFC controllers, and Marvell NFC-over-I2C and NFC-over-SPI devices.

    • Pin control: Atmel PIO4 pin controllers, Allwinner a83t pin controllers, Marvell berlin4ct pin controllers, Renesas R8A7795 pin controllers, Intel Broxton pin and GPIO controllers, AMD Promontory GPIO controllers, and ACCES 104-IDIO-16 GPIO controllers.

    • USB: Mediatek USB3.0 PHYs and Broadcom Cygnus PCIe PHYs.

Changes visible to kernel developers include:

  • The new "userio" module allows an application to connect to the kernel and emulate serial I/O devices (laptop touchpads, for example). See Documentation/input/userio.txt for details.

  • The arm64 architecture has gained support for the KASan debugging tool.

  • The block layer now supports "persistent reservations," whereby a portion of a shared storage device can be reserved to a specific system. See Documentation/block/pr.txt for details.

  • The memory-management improvements described here in September have been merged. They simplify memory management in a number of ways, change the way pages are reserved for high-priority allocators, and rework the handling of memory-allocation flags.

  • Trace instances have a new sysfs file (set_event_pid) that can be used to restrict trace events to those associated with a specific process. Most trace options are now configurable on a per-instance basis as well.

  • The graphics documentation has been renamed from drm.tmpl to gpu.tmpl to reflect the fact that, at this point, it covers a lot more than just direct rendering.

  • There is a new x86-specific configuration option (CONFIG_DEBUG_WX) that puts out warnings for memory sections that are mapped as being both writable and executable. Unfortunately, it triggers frequently, and its output is relatively hard to use, so the option is being disabled by default for now.

  • The make xconfig graphical configuration utility has been forward-ported to Qt5; builds with Qt3 are no longer supported.

  • The media subsystem now has support for software-defined radio transmitters (receiver support has been present for some time).

If the usual schedule holds, the 4.4 merge window can be expected to close on November 15, though Linus has been known to bring things to a close early at times.

Index entries for this article
KernelReleases/4.4


to post comments

Journaled RAID5 support

Posted Nov 16, 2015 10:50 UTC (Mon) by xav (guest, #18536) [Link]

Interesting, does anyone have a pointer to this journaled RAID5 support ?

4.4 Merge window, part 1

Posted Nov 16, 2015 15:04 UTC (Mon) by nix (subscriber, #2304) [Link]

Note: the stat change does not break procps, because procps uses /proc/$pid/wchan instead. It only breaks attackers trying to get ASLR info, and kernel hackers trying to see where their test program is blocked (who can just revert this commit, or use perf).

Another potential-but-not-actual ABI change under /proc: the Vmalloc* fields are gone from /proc/meminfo (well, set to zero), because they consume noticeable time to compute and are redundant to /proc/vmallocinfo. The latter can only be read by root, but if a non-root program cares about stuff like this, something very strange is going on...


Copyright © 2015, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds