
The rest of the 6.2 merge window

By Jonathan Corbet
December 27, 2022
The world got a special Christmas present from Linus Torvalds this year in the form of the 6.2-rc1 kernel prepatch. By the time the merge window closed, 13,687 non-merge changesets had been pulled into the mainline for the 6.2 release. This was the busiest merge window since 5.13 in mid-2021 (which brought in 14,231 changesets), and quite a bit busier than 6.1 was, though comparable to the late 5.x releases. Just under 4,000 of those changesets were pulled after the first-half summary was written; there were quite a few significant changes to be found in those late-arriving patches.

The most significant changes pulled in the latter part of the 6.2 merge window include:

Architecture-specific

  • The kernel can now perform return stack buffer stuffing to mitigate the Retbleed speculative-execution vulnerability on some Intel processor generations with a much lower performance cost.
  • The x86 architecture has also gained support for a control-flow integrity mechanism called FineIBT.
  • There is a new qspinlock implementation for the PowerPC architecture; it should provide improved performance and fix some lockup problems seen in extreme cases.
  • LoongArch has gained support for ftrace, suspend, hibernation, and stack protection.

Core kernel

  • The zram device can now recompress data streams to achieve better compression rates; see this documentation commit for details.
  • Shared anonymous memory areas can now be named; this capability extends the current memory-naming feature, which was previously limited to private memory.
  • The new trace_trigger= command-line option can enable a tracing trigger at boot time.

Filesystems and block I/O

  • There is a new set of sysfs knobs that can be used to fine-tune how much of the system's page cache can be used by pages destined to be written back to a specific device. See the documentation commits for strict_limit, max_bytes, min_bytes, max_ratio_fine, and min_ratio_fine for details.
  • The F2FS filesystem has gained an "atomic replace" ioctl() operation that can write data to a file and truncate it in a single atomic operation. F2FS has also gained a block-based extent cache that can be used to determine which data is hot (in active use) or cold; this commit contains a little information.
  • The ntfs3 filesystem has a few new mount options, starting with the undocumented nocase, which appears to control case-sensitive lookups. The windows_name option will prevent the creation of file names that Windows would not allow, and hide_dot_files controls whether files whose names start with "." are marked as being hidden.
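
As a rough illustration of how the per-device writeback knobs mentioned above might be set from user space, here is a minimal sketch. The knob names come from the list above; the /sys/class/bdi/<id>/ directory layout, the "8:16" device identifier, and the helper name set_bdi_knob() are assumptions made for the example, so check the documentation commits before relying on any of it.

```python
from pathlib import Path

# Knob names as listed in the article; the directory layout is an assumption.
BDI_KNOBS = {
    "strict_limit",
    "max_bytes",
    "min_bytes",
    "max_ratio_fine",
    "min_ratio_fine",
}


def set_bdi_knob(bdi: str, knob: str, value: int,
                 root: str = "/sys/class/bdi") -> Path:
    """Write an integer to one per-device writeback knob; return its path.

    `bdi` is the backing-device directory name (hypothetical example: "8:16").
    `root` is a parameter so the function can be pointed at a scratch
    directory instead of the real sysfs tree.
    """
    if knob not in BDI_KNOBS:
        raise ValueError(f"unknown knob: {knob}")
    if value < 0:
        raise ValueError("value must be non-negative")
    path = Path(root) / bdi / knob
    # sysfs attributes accept a short ASCII string; a trailing newline is
    # conventional and harmless.
    path.write_text(f"{value}\n")
    return path
```

Run as root, a call like set_bdi_knob("8:16", "max_bytes", 4 * 1024 * 1024) would ask for a 4MB cap on that device's share of dirty pages, assuming the device and attribute exist on the running kernel.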

Hardware support

  • Industrial I/O: Analog Devices MAX11410 and AD4130 analog-to-digital converters, MediaTek MT6370 analog-to-digital converters, Kionix KX022A tri-axis digital accelerometers, Maxim MAX30208 digital temperature sensors, Analog Devices AD74115H I/O controllers, and Analog Devices ADF4377 microwave wideband synthesizers.
  • Miscellaneous: Microsoft Azure network adapters, Baikal-T1 PCIe controllers, Ampere Computing SMPro error monitors, Lattice sysCONFIG SPI FPGA managers, Advantech embedded controller watchdog timers, Renesas R-Car S4-8 Ethernet SERDES PHYs, TI TPS65219 power management ICs, and Xilinx R5 remote processors.
  • Also: "iommufd" is a new user-space API for the control of I/O memory-management units; see Documentation/userspace-api/iommufd.rst for details.

Miscellaneous

Security-related

  • The kernel can now place an upper limit (10,000 by default) on the number of times the system can oops or warn before it just panics and reboots.
  • The TIOCSTI ioctl() operation will push data into a terminal device; that data will then be read as if it were input typed by the user. As one might imagine, attackers find this operation useful. It seems that almost nobody else does, though. In 6.2, the kernel has gained a configuration option and sysctl knob that can disable this functionality entirely.
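
To see whether a running kernel exposes these security knobs, one could read them from /proc/sys. The sketch below is hedged accordingly: the sysctl names kernel.oops_limit, kernel.warn_limit, and dev.tty.legacy_tiocsti are not given in the text above and are stated here as assumptions about how the features are exposed; read_sysctl() is a hypothetical helper, not an existing API.

```python
from pathlib import Path
from typing import Optional

# Assumed sysctl names for the features described above; verify them
# against the kernel's sysctl documentation before depending on them.
SECURITY_SYSCTLS = (
    "kernel.oops_limit",
    "kernel.warn_limit",
    "dev.tty.legacy_tiocsti",
)


def read_sysctl(name: str, proc_root: str = "/proc/sys") -> Optional[str]:
    """Return the value of a dotted sysctl name, or None if it is absent.

    A dotted name such as "kernel.oops_limit" maps to the file
    <proc_root>/kernel/oops_limit.
    """
    path = Path(proc_root).joinpath(*name.split("."))
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Older kernels, or kernels built without the option, lack the file.
        return None


def report(proc_root: str = "/proc/sys") -> dict:
    """Map each assumed sysctl name to its value, or None when missing."""
    return {name: read_sysctl(name, proc_root) for name in SECURITY_SYSCTLS}
```

Calling report() on a pre-6.2 kernel would be expected to return None for each entry, which is a quick way to tell whether the new limits are available at all.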

Internal kernel changes

  • There is a new struct encoded_page type meant to encapsulate the idea of using the lower bits of a pointer value for related information. This type was created by Torvalds to increase type safety and prevent the accidental dereferencing of an augmented pointer without stripping out the extra bits first. There is no documentation, but this commit is easy enough to read.
  • The venerable container_of() macro has a new sibling called container_of_const() that preserves the const quality of the passed-in pointer. In the merge message, Greg Kroah-Hartman explains this macro this way:

    The driver for all of this have been discussions with the Rust kernel developers as to how to properly mark driver core, and kobject, objects as being "non-mutable". The changes to the kobject and driver core in this pull request are the result of that, as there are lots of paths where kobjects and device pointers are not modified at all, so marking them as "const" allows the compiler to enforce this.

    So, a nice side affect of the Rust development effort has been already to clean up the driver core code to be more obvious about object rules.

  • The minimum version of binutils needed to build the kernel has been raised to 2.25.
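
The idea behind struct encoded_page, keeping flag bits in the low bits of an aligned pointer and refusing to hand back an address until they are stripped, can be modeled in a few lines. This is only a user-space model using plain integers as stand-in addresses; the EncodedPtr class, TAG_BITS, and the method names are hypothetical and do not reproduce the kernel's actual definitions.

```python
TAG_BITS = 2                      # hypothetical: assumes 4-byte-aligned addresses
TAG_MASK = (1 << TAG_BITS) - 1    # 0b11


class EncodedPtr:
    """An address with flag bits packed into its low, always-zero bits."""

    __slots__ = ("_raw",)

    def __init__(self, addr: int, flags: int = 0):
        if addr < 0 or addr & TAG_MASK:
            raise ValueError("address is not sufficiently aligned")
        if flags & ~TAG_MASK:
            raise ValueError("flags do not fit in the tag bits")
        # The combined value is kept private so that callers cannot use it
        # as an address by accident.
        self._raw = addr | flags

    def flags(self) -> int:
        """Return only the extra bits stored alongside the address."""
        return self._raw & TAG_MASK

    def ptr(self) -> int:
        """Return the address with the tag bits stripped out."""
        return self._raw & ~TAG_MASK
```

The design point is that the tagged value has its own type: the only route back to a usable address goes through ptr(), which masks off the extra bits first.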

One thing that didn't make it this time around is support for linear address masking, an Intel feature that allows storing extra data in pointer values. Torvalds complained about how the feature was implemented and refused to pull the patches. So this feature, it seems, will have to wait at least another cycle before landing in the mainline.

Meanwhile, the "extensive changelog" award must certainly go to Christian Brauner, for this patch, which features 520 lines of explanation (not including the stack trace) for a one-line fix.

Normally, the 6.2 development cycle would be expected to come to a close on February 12 or 19. Torvalds suggested that the holidays might slow down this release cycle slightly; time will tell. Meanwhile, it is time to start finding and fixing bugs — once the kernel developers finish celebrating the holidays, of course.

Index entries for this article
KernelReleases/6.2



The rest of the 6.2 merge window

Posted Dec 27, 2022 18:27 UTC (Tue) by andy_shev (subscriber, #75870) [Link] (2 responses)

Not sure whether there is a typo here: "Lattice sysCONFIG SPI FPGA managers". I can't parse it...

The rest of the 6.2 merge window

Posted Dec 27, 2022 19:50 UTC (Tue) by mpr22 (subscriber, #60784) [Link] (1 responses)

things that manage Lattice brand FPGAs using the sysCONFIG feature over an SPI interface.

The rest of the 6.2 merge window

Posted Jan 19, 2023 16:03 UTC (Thu) by andy_shev (subscriber, #75870) [Link]

Thanks! Maybe the article can be amended with this?

The rest of the 6.2 merge window

Posted Dec 28, 2022 7:04 UTC (Wed) by xecycle (subscriber, #140261) [Link] (7 responses)

Oh, a per-device limit on the writeback cache! Is this the perfect solution to the decades-old problem that writing to a slow USB disk slows the entire system to a crawl?

The rest of the 6.2 merge window

Posted Dec 28, 2022 17:47 UTC (Wed) by iabervon (subscriber, #722) [Link] (6 responses)

I thought the worst performance aspects of that had been taken care of a while back with separate queues at a different layer and possibly proportional limits or reservations. But that still leaves the issue that the kernel will commit to a long period of storing data rather than having programs that write to slow disks take a long time such that you can interrupt them.

The rest of the 6.2 merge window

Posted Dec 28, 2022 22:31 UTC (Wed) by tux3 (subscriber, #101245) [Link] (5 responses)

Empirically, just a couple days ago I was copying data to a pathologically slow USB drive. Listing a folder or deleting a file on this key while the copy was in progress could stall for 2-15s. Other programs were very noticeably slowed, occasionally stalling for a few seconds. (On the same day, I also copied files from Win10 to the key without noticing similar stalls in other programs.)

I suppose I should try the prepatch and see if I can still reproduce. (Maybe going as far as taking a perf record and poking around a bit, if I feel particularly overconfident!)

The rest of the 6.2 merge window

Posted Dec 28, 2022 22:39 UTC (Wed) by iabervon (subscriber, #722) [Link]

Huh. I would have expected access to the key and other files on it to be slow, but anything that only uses other filesystems to be fast. When I copy things to a slow USB device it doesn't interfere with anything else, but my USB device is also not very large, so I may not be able to write enough to it to get significant memory (or disk cache) pressure on my system.

The rest of the 6.2 merge window

Posted Dec 28, 2022 23:00 UTC (Wed) by linusw (subscriber, #40300) [Link]

I would definitely try the BFQ scheduler on that device if you have this problem. It could be that this is locked because small read/writes to the device get to wait for long reads/writes that are very slow. BFQ could alleviate the situation by identifying the small reads/writes as necessary for interactivity. E.g. Fedora will use BFQ by default on USB drives.

The rest of the 6.2 merge window

Posted Dec 30, 2022 13:18 UTC (Fri) by MarcB (guest, #101804) [Link] (2 responses)

Windows disables most write-buffering by default on removable devices. This alone should prevent the issues.

In Linux, this should be achievable by setting "/sys/devices/virtual/bdi/<your USB block device>/max_ratio" to a low value.

Not sure if there are any smart udev rules to do this automatically, for potentially slow devices only.

The rest of the 6.2 merge window

Posted Jan 6, 2023 7:15 UTC (Fri) by dtardon (subscriber, #53317) [Link] (1 responses)

> Windows disables most write-buffering by default on removable devices. This alone should prevent the issues.

> In Linux, this should be achievable by setting "/sys/devices/virtual/bdi/<your USB block device>/max_ratio" to a low value.

> Not sure if there are any smart udev rules to do this automatically, for potentially slow devices only.

The rule wouldn't even have to be smart. All that's needed is a rule that matches for presence of a property (e.g., ID_SLOW_USB_DISK) and sets the max_ratio attribute. The property itself could then be attached to known-problematic devices via HWDB. For extra points, the max_ratio could be adjustable via a property as well.

The rest of the 6.2 merge window

Posted Jan 6, 2023 11:56 UTC (Fri) by farnz (subscriber, #17727) [Link]

One enhancement that would help is to be able to change max_ratio from a percentage to a byte counter (like how the system-wide part has both dirty_bytes and dirty_ratio). Then, the property could be based on the expected transfer rate for the device to get you something like 250 ms of data in flight - enough to avoid small transfers getting stalled because there's no write caching at all, not so much that it's a pain when you've finished doing work and want to eject the device.
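
The arithmetic behind that suggestion is simple enough to sketch: a byte budget is the device's expected transfer rate multiplied by the amount of time one is willing to have in flight. The function name and the sample rates below are made up for illustration; the 250 ms default is the figure from the comment above.

```python
def writeback_budget_bytes(rate_bytes_per_s: int, window_ms: int = 250) -> int:
    """Return how many dirty bytes correspond to `window_ms` of writeback.

    rate_bytes_per_s is the expected sustained write rate of the device;
    window_ms is how much in-flight data, measured in time, is acceptable.
    """
    if rate_bytes_per_s < 0 or window_ms < 0:
        raise ValueError("rate and window must be non-negative")
    # Integer arithmetic: bytes/second * milliseconds / 1000.
    return rate_bytes_per_s * window_ms // 1000
```

For a hypothetical USB stick that sustains 10,000,000 bytes per second, writeback_budget_bytes(10_000_000) gives 2,500,000 bytes; a value like that could then be fed to a byte-based per-device limit such as the max_bytes knob listed in the article.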

The rest of the 6.2 merge window

Posted Dec 29, 2022 12:06 UTC (Thu) by jezuch (subscriber, #52988) [Link] (2 responses)

> Meanwhile, the "extensive changelog" award must certainly go to Christian Brauner, for this patch, which features 520 lines of explanation (not including the stack trace) for a one-line fix.

Congrats! From someone who occasionally gets complaints that my commit messages are way longer than the change itself :) (One person said that when they see a long commit message like this, they tend to think that it's something difficult, and so they avoid reviewing it.)

The rest of the 6.2 merge window

Posted Dec 29, 2022 13:44 UTC (Thu) by pm215 (subscriber, #98099) [Link] (1 responses)

I see the commit message includes "First, this is a clever but __worringly__ underdocumented algorithm. There isn't a single detailed comment to be found in next_group(), propagate_one() or anywhere else in that file for that matter.", which somewhat suggests that some of the explanation in the commit message could usefully be turned into comments for the benefit of future readers...

The rest of the 6.2 merge window

Posted Dec 31, 2022 7:44 UTC (Sat) by zev (subscriber, #88455) [Link]

True, though then people also have to be vigilant for the comments to remain up to date with the code. An outdated or otherwise inaccurate comment can sometimes be worse than no comment at all -- commit messages are a bit more effort to find, but they have the advantage that you know the exact point in time (and state of the code) for which they were written.

The rest of the 6.2 merge window

Posted Jan 1, 2023 10:02 UTC (Sun) by amarao (guest, #87073) [Link] (2 responses)

I tried to read the copyleft-next licence and I'm kinda... worried. What is proprietary relicence for a fee? What is copyleft sunset?

Can you do a thorough analysis of it?

The rest of the 6.2 merge window

Posted Jan 1, 2023 11:17 UTC (Sun) by Wol (subscriber, #4433) [Link] (1 responses)

To answer your two questions quickly ...

"What is proprietary relicence for a fee?" - look at the ghostscript project. They own all the copyright on their project, which means they can (and do) sell proprietary licences over and above the GPL.

"What is copyleft sunset?" - the licence contains a "suicide clause". So I haven't read the licence, but I guess it contains something like "after 15 years any code covered by this licence is placed into the public domain". Look at the TrollTech/Qt dual licence - it explicitly states that if TT stop releasing GPL versions, the most recent available version changes its licence to MIT.

There are also various "bugs" in GPL2 which I know GPL3 addressed, and I guess "copyleft-next" does too. The "if you fix copyright violations immediately you know about them". The "you don't have to force recipients to accept the source with the binary". And probably more.

Cheers,
Wol

The rest of the 6.2 merge window

Posted Jan 1, 2023 11:31 UTC (Sun) by Wol (subscriber, #4433) [Link]

Hmm.

Having read it, I like it. Copyleft sunset - clause 2 says "you can distribute this code". After 15 years, clauses 3-5 are void - the copyleft clauses requiring "share and share alike". So yes, after 15 years this licence becomes an MIT-style licence.

Not sure about clause 5. It does fix the "forcing recipients to accept the source" problem, but I'm not sure I like the way they've done it. Never mind, they've done it, which is the important thing.

As the original article said, it's just like a "simple GPL". Nice.

Cheers,
Wol


Copyright © 2022, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds