
Distributions quote of the week

Wondering when it is the best time to test the kernel to prevent Linux Kernel regressions from hitting Arch Linux, Fedora Linux, or openSUSE Tumbleweed?

It's now, as the first pre-release of #Linux 6.19 is out – which leaves plenty of time to find, report, debug, and fix any problems that those distros otherwise will encounter when they switch to 6.19.y in about eight to ten weeks. And testing is not even hard, as easy-to-install packages with pre-built mainline kernels exist for all three distros.

In case you want to play it a bit safer, delay testing by one week till -rc2 is out – bugs that lead to data loss introduced before -rc1 are extremely rare but will almost certainly have been found and fixed by then.

Anything up to 6.19-rc6 (five weeks from now) is still okayish, but less ideal.

The sixth -rc is your last good chance to test. Linus by then wants all regressions that have become known since the beginning of the 6.19 cycle fixed – but in case some were missed or not reported yet, there is still enough time to report, debug, and fix them before they reach those distros.

Testing any later is often too late: most bugs then can't be fixed anymore before those distros will switch to the 6.19.y series, which will happen within one or two weeks (in the case of Arch and Tumbleweed) or three to four (Fedora) after 6.19 is released.

Thorsten Leemhuis




How to test a kernel?

Posted Dec 18, 2025 2:17 UTC (Thu) by aphedges (subscriber, #171718) [Link] (31 responses)

Does anyone have suggestions on how to actually test a kernel for regressions?

The only tool I'm aware of is Vagrant, which is non-FOSS, runs the target environment in a VM, and really only tests that deployment succeeds. I feel a good tool for this would run some sort of test suite with real (or realistic) workloads.

In addition, I can imagine much kernel testing would be to find bugs in hardware drivers, but it's not clear to me how to test that without a bunch of spare hardware.

How to test a kernel?

Posted Dec 18, 2025 5:00 UTC (Thu) by knurd (subscriber, #113424) [Link] (30 responses)

> Does anyone have suggestions on how to actually test a kernel for regressions?

Just use it on machines you regularly use where you'll notice any problems.

Because a lot of automatic testing in VMs is already performed by various CI systems. Those find a lot of stuff in, say, the MM layer or file systems, which is why "data-eating bugs" are rare with pre-releases. But those VMs won't notice many "bugs in hardware drivers", as you mentioned, unless of course they are in one of the few drivers for hardware components VMs emulate; and as most VMs use some variant of a few similar set-ups, the coverage is quite limited here.

As about half of the kernel consists of drivers, way more testing is needed. And those drivers require the right hardware, and there is plenty out there in various combinations. For most drivers, there are also no test suites. So in the end, "we need a lot of people that regularly run latest mainline kernels on their regular machines and see what breaks" is really what's needed.

How to test a kernel?

Posted Dec 18, 2025 9:21 UTC (Thu) by taladar (subscriber, #68407) [Link] (26 responses)

Is there a particular reason why most hardware drivers don't have test suites? I would imagine that, especially with hardware, running a test suite against a mock version of the hardware would be a lot faster than setting up specific failure conditions and general states on the hardware itself during development.

I mean that might not work so well with super-complicated hardware like GPUs but most hardware is a lot simpler than that in terms of size of the interface between driver and hardware.

How to test a kernel?

Posted Dec 18, 2025 10:28 UTC (Thu) by knurd (subscriber, #113424) [Link] (16 responses)

> Is there a particular reason why most hardware drivers don't have test suites

Not sure, but I have a few ideas:

> against a mock version of the hardware

Writing drivers that support everything is already hard, and a lot of drivers leave a lot of room for improvement already – writing a mock version of the hardware and tests for it would make things a lot harder, I assume, so the bar might simply be too high in many cases.

And maybe companies have mock hw and test with it internally but haven't open-sourced it.

> interface between driver and hardware.

Well, hardware is tricky and has bugs (and its firmware, too) – so way more than testing the interface is needed.

And a lot of hw problems only show up under very specific conditions. Linus, for example, once faced an i915 GPU driver regression with an XPS 13 that I had not encountered with the exact same model (9360 or the one before it, iirc), as he had one with the HiDPI display, while I only had one with a FullHD screen.

There are more reasons, but this already gives an idea, I'd say.

In the end it boils down to: even with test suites and mock hw we'd still really need a lot of field-testing.

How to test a kernel?

Posted Dec 18, 2025 10:54 UTC (Thu) by farnz (subscriber, #17727) [Link] (15 responses)

There's also the challenge of getting people to write the mock when they can just test on real hardware instead, and get the full range of real behaviours.

For example, take this commit for I2C over DP AUX channel: the behaviour before and after the commit is correct per the DP spec, so a mock is likely to work in both cases. But, in fact, that change fixed some devices and broke others.

A mock would neither require 16-byte transfers (Bizlink DVI-D dual link) nor fail on 16-byte transfers (Apple VGA adapter), but would correctly handle all sizes of I2C over DP AUX, as many devices do. To make a mock useful, each of the problem behaviours identified would need to be added to the mock - but that means multiple mocks.
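The point above can be illustrated with a toy sketch: a spec-compliant mock accepts every legal transfer size, so it passes the driver both before and after a change in chunking strategy, while the two quirky devices each reject one of the strategies. The class names, size limits, and `driver_send` helper are all invented for illustration; this is not the actual i2c-over-aux code.

```python
# Toy illustration: a spec-compliant mock cannot distinguish a driver
# that always sends 16-byte chunks from one that splits into smaller
# chunks -- yet real adapters disagreed on exactly that point.

class SpecCompliantAux:
    """Mock that follows the spec: transfers of 1..16 bytes succeed."""
    def transfer(self, data: bytes) -> bool:
        return 1 <= len(data) <= 16

class BizlinkLikeAux(SpecCompliantAux):
    """Quirk: only full 16-byte transfers work (invented approximation)."""
    def transfer(self, data: bytes) -> bool:
        return len(data) == 16

class AppleLikeAux(SpecCompliantAux):
    """Quirk: 16-byte transfers fail, shorter ones work (invented)."""
    def transfer(self, data: bytes) -> bool:
        return 1 <= len(data) < 16

def driver_send(dev, payload: bytes, chunk: int) -> bool:
    """Split payload into chunk-sized transfers, as a driver might."""
    return all(dev.transfer(payload[i:i + chunk])
               for i in range(0, len(payload), chunk))

payload = bytes(32)
# Both chunking strategies pass against the spec-compliant mock...
assert driver_send(SpecCompliantAux(), payload, 16)
assert driver_send(SpecCompliantAux(), payload, 8)
# ...but each fails against one of the quirky devices.
assert not driver_send(BizlinkLikeAux(), payload, 8)
assert not driver_send(AppleLikeAux(), payload, 16)
```

Either quirky mock alone would still miss the other device's failure mode, which is the "multiple mocks" problem.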

How to test a kernel?

Posted Dec 19, 2025 10:03 UTC (Fri) by taladar (subscriber, #68407) [Link] (1 responses)

Obviously the mock needs to be as close to the observed behavior of the hardware as possible and if you have multiple devices with different observed behavior you would need multiple.

What I don't see is how testing every software change in the driver against every real device the driver is supposed to support could be any easier than adjusting the mock once to reflect newly discovered behavior of a specific hardware device. After that, you could just run the test suite for all hardware devices in seconds instead of connecting potentially dozens of devices, getting them into the required state, and testing them once - something that is not really feasible after every change to the driver source code.

As I said, obviously that won't work for the really complex hardware but most hardware, especially the kind where one driver supports devices from dozens of vendors, isn't that complex.

How to test a kernel?

Posted Dec 19, 2025 10:15 UTC (Fri) by farnz (subscriber, #17727) [Link]

I think you're underestimating the complexity of a useful mock here. It takes a lot of work to go from "no mock" to "useful mock", because you not only need the mock to work when the driver works, but also to fail in all known cases where the driver would fail, and that's hard.

And many software changes don't need testing against the external hardware - you can show them correct from the code, instead of needing to test them, or at least show that the internal changes are tested, and the external interface is unchanged. Plus, there's a strong argument that if no-one is running the driver against a piece of hardware, it doesn't actually need testing, and if someone does use a given piece of hardware, you can delegate the testing to that user (thus scaling out the testing faster than development - if you have N pieces of hardware to test, you have O(N) people doing the testing, whereas a mock is O(1) people doing the work).

How to test a kernel?

Posted Dec 19, 2025 21:19 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (12 responses)

> A mock would neither require 16 byte transfers (Bizlink DVI-D dual link) nor fail on 16 byte transfers (Apple VGA adapter)

A better mock can do both :) And this kind of behavior can usually be extracted into a decorator and re-used across different devices.

It's a huge amount of work, but the end result might be pretty good. It's also something that can be done using generative AI, actually.

How to test a kernel?

Posted Dec 20, 2025 11:10 UTC (Sat) by farnz (subscriber, #17727) [Link] (11 responses)

I'm curious: how does a single mock simultaneously succeed only with 16 byte transfers, while also failing on 16 byte transfers and only succeeding with smaller transfers?

I'd probably implement this as two instances of the same code (just with different limits in there), but I can see no way for one mock to both fail and succeed simultaneously - at best, you'd have to change behaviour on external stimulus, changing the mock to represent different hardware when it's told to.

How to test a kernel?

Posted Dec 20, 2025 20:12 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

You do two runs, one with the failing behavior and one with the usual one. It's a very common approach with testing. The test harness can even automate the configuration for these kinds of things. We even have a quirks database in the kernel itself.
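The two-run approach might be sketched as a small quirk table driving the same test body once per configuration. The `QUIRKS` entries, `FakeDevice`, and `pick_chunk_size` below are invented for illustration and are not the kernel's actual quirks database:

```python
# Sketch: one test body, run once per quirk configuration, so the
# "failing" and "usual" behaviours are exercised in separate runs.

QUIRKS = {
    "generic":       {"min_transfer": 1,  "max_transfer": 16},
    "needs_full_16": {"min_transfer": 16, "max_transfer": 16},
    "rejects_16":    {"min_transfer": 1,  "max_transfer": 15},
}

class FakeDevice:
    def __init__(self, quirk: str):
        self.cfg = QUIRKS[quirk]
    def transfer(self, n: int) -> bool:
        return self.cfg["min_transfer"] <= n <= self.cfg["max_transfer"]

def pick_chunk_size(dev) -> int:
    """Driver-side probe: find a transfer size the device accepts."""
    for size in (16, 8, 4, 1):
        if dev.transfer(size):
            return size
    raise RuntimeError("no usable transfer size")

# The harness loops over all configurations automatically.
results = {q: pick_chunk_size(FakeDevice(q)) for q in QUIRKS}
assert results["generic"] == 16
assert results["needs_full_16"] == 16
assert results["rejects_16"] == 8
```

A real harness (pytest parametrization, for example) would report each configuration as its own test case rather than a dict of results.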

How to test a kernel?

Posted Dec 22, 2025 19:27 UTC (Mon) by farnz (subscriber, #17727) [Link]

I would call that two separate mocks, though, not one mock - because the two runs have deliberately different behaviour, and that different behaviour means that the test results gathered with one mock represent the likely behaviour against a different subset of hardware to test results gathered with the other mock.

It's only "the same mock" if the intent is that results gathered with one mock are intended to be a total substitute for results gathered with the other - if I must do at least two runs of the tests for a given class of hardware against different mocks to get a full set of results, then they're different mocks, even if they share 99.99% of code, and are compiled into the same binary from the same sources.

How to test a kernel?

Posted Dec 20, 2025 20:15 UTC (Sat) by Cyberax (✭ supporter ✭, #52523) [Link]

Another thing, you can also do experimental permutations by adding quirky behaviors to existing devices to see if they break. This can work because such weird behaviors often come from silicon bugs in components that are reused throughout the industry.

The total search space is gigantic, of course, so some kind of heuristics will be needed.

How to test a kernel?

Posted Dec 20, 2025 20:18 UTC (Sat) by Wol (subscriber, #4433) [Link] (7 responses)

For what is - allegedly - a single piece of hardware I'd have a single mock, with the different behaviours selected in code by an enum, and on the command line by an argument.

Which then prints as its first output to the log file "enum X of Y hardware-version".

Okay, you can't guarantee everybody will check, maybe they'll just run "--hardware-version myhardware", but hopefully at least some will do a broader test and check, and if they're expecting to see "enum 3 of 5 myhardware" and suddenly see "enum 3 of 6", they'll do a double-take and investigate.

Cheers,
Wol
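The enum-plus-log-line scheme described above might look roughly like this sketch, where the first output announces "enum X of Y hardware-version" so a reader can notice when the number of known variants grows. The variant names are invented for illustration:

```python
from enum import Enum

class HwVariant(Enum):
    """Known behaviour variants of one 'single' piece of hardware
    (names invented for illustration)."""
    GENERIC = 1
    BIZLINK_DVI = 2
    APPLE_VGA = 3

def run_mock(variant: HwVariant) -> str:
    """First log line: 'enum X of Y hardware-version'. If someone adds
    a fourth variant, 'of 3' becomes 'of 4' and a careful reader of the
    logs will do a double-take."""
    return f"enum {variant.value} of {len(HwVariant)} {variant.name.lower()}"

assert run_mock(HwVariant.APPLE_VGA) == "enum 3 of 3 apple_vga"
```

As the thread below notes, this only helps people who actually read the log line.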

How to test a kernel?

Posted Dec 22, 2025 19:32 UTC (Mon) by farnz (subscriber, #17727) [Link] (6 responses)

So you'd have multiple mocks, where you select which mock you're using today based on a command line argument, but you'd call it "one mock" and hope that people remember that they need to run 6 different mocks, not just one?

That, IME, is asking for people to forget to run some of the mocks - after all, I've tested active DP to DVI converter behaviour in a mock, even though I didn't actually test the BizLink or Apple behaviours.

It's the same rationale as not saying that the Apple DP to VGA converter is "the same hardware" as the BizLink DP to VGA converter - they're both active converters, but they have different silicon resulting in different behaviour.

Now, because there's a lot of commonality between them, I might well want to share a lot of code - but the same applies to (say) FreeBSD and OpenBSD, or Debian and SteamOS, and nobody would call those "the same OS", because they are different in ways that matter.

How to test a kernel?

Posted Dec 22, 2025 22:52 UTC (Mon) by Wol (subscriber, #4433) [Link] (5 responses)

> So you'd have multiple mocks, where you select which mock you're using today based on a command line argument, but you'd call it "one mock" and hope that people remember that they need to run 6 different mocks, not just one?

So you've missed the

> > Which then prints as its first output to the log file "enum X of Y hardware-version".

and also the fact that somebody else might have added other mocks of which you weren't aware. If you bother to check the log, that tells you how many different known variants there are of this supposedly single piece of hardware, and which particular variant you are testing.

If you don't check the logs, nothing will help you :-)

Cheers,
Wol

How to test a kernel?

Posted Dec 24, 2025 16:41 UTC (Wed) by farnz (subscriber, #17727) [Link] (4 responses)

If the tests pass, why would I look at the log?

Fundamentally, the purpose of the mock is to speed up the feedback loop between making a mistake and failing, by interposing a fully automated step between "compile the change" and "test on real hardware". I don't get to avoid testing on real hardware - that's a hard requirement for considering the change tested - and so a mock that requires me to check logs on success is worthless, since I'll just skip that step, and test on real hardware.

The power of mocks is in being able to run the test on my laptop directly, rather than having to find the real hardware - and, IME, it's at its most powerful when the mock converts non-deterministic failures to deterministic ones (for example, "this deadlock only happens if the previous transmit attempt has lost arbitration more than 3 times, but not reached its arbitration limit before you add this transmit attempt to the queue").
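Converting that kind of race into a deterministic test might be sketched like this: the mock is told exactly how many arbitration losses to inject, so the suspect window can be hit on every run instead of by chance. `FakeBus`, the limit value, and the status strings are all invented for illustration:

```python
class FakeBus:
    """Mock bus that loses arbitration a fixed number of times,
    turning a racy real-world condition into a repeatable input."""
    ARBITRATION_LIMIT = 5

    def __init__(self, losses_to_inject: int):
        self.losses_left = losses_to_inject
        self.attempts = 0

    def transmit(self) -> str:
        self.attempts += 1
        if self.attempts > self.ARBITRATION_LIMIT:
            return "abort"          # driver gives up at the limit
        if self.losses_left > 0:
            self.losses_left -= 1
            return "arb_lost"       # injected deterministically
        return "ok"

def send(bus: FakeBus) -> str:
    """Retry on arbitration loss until success or abort."""
    while True:
        status = bus.transmit()
        if status != "arb_lost":
            return status

# "More than 3 losses, but not yet at the limit" is now one line of
# test setup rather than a rare timing coincidence:
assert send(FakeBus(losses_to_inject=4)) == "ok"
assert send(FakeBus(losses_to_inject=6)) == "abort"
```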

How to test a kernel?

Posted Dec 24, 2025 17:47 UTC (Wed) by Wol (subscriber, #4433) [Link] (3 responses)

If you're testing the hardware by using a mock, because you don't have the real hardware, then why are you testing the hardware? Presumably to make sure it works on all variations? So the idea is to check the logs so you can make sure you HAVE tested everything.

If you're just making sure it works on your hardware, then fair enough. If you want to make sure it works on all variants, how are you going to know you've tested EVERYthing?

Cheers,
Wol

How to test a kernel?

Posted Dec 24, 2025 17:54 UTC (Wed) by farnz (subscriber, #17727) [Link] (2 responses)

Again, if the existing test passes, why would I check the logs to ensure that I'm testing everything?

If I'm writing a new mock, sure, I'll check the logs to make sure that the test suite uses my new mock where appropriate - but if I'm writing a fix for a bug, I'll not bother checking the logs for older tests if the test suite passes - I'm just going to check the logs for the tests I've added, and check that they're testing the mocks I expect.

And if the result of my fix is a regression, that's on the people who added tests that were incomplete.

How to test a kernel?

Posted Dec 29, 2025 11:27 UTC (Mon) by taladar (subscriber, #68407) [Link] (1 responses)

Just add a test that asserts that the list of hardware variants the mock supports is still identical to the list of hardware variants your test uses in testing. When that test fails you know you need to add support for a new hardware variant (or remove support for one no longer supported by the mock I suppose).
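Such a coverage check might be sketched as a simple set comparison; the variant names and helper below are invented for illustration:

```python
# Sketch: fail loudly when the set of variants the mock knows about
# drifts out of sync with the set the test suite actually exercises.

SUPPORTED_VARIANTS = {"generic", "bizlink_dvi_dual", "apple_vga"}  # mock side
TESTED_VARIANTS = {"generic", "bizlink_dvi_dual", "apple_vga"}     # test side

def check_variant_coverage(supported: set, tested: set) -> list:
    """Return human-readable problems; empty list means in sync."""
    problems = []
    for v in sorted(supported - tested):
        problems.append(f"mock variant {v!r} has no test")
    for v in sorted(tested - supported):
        problems.append(f"test references unknown variant {v!r}")
    return problems

assert check_variant_coverage(SUPPORTED_VARIANTS, TESTED_VARIANTS) == []
# Adding a new quirky variant to the mock alone now fails the check:
assert check_variant_coverage(SUPPORTED_VARIANTS | {"new_quirk"},
                              TESTED_VARIANTS) == \
    ["mock variant 'new_quirk' has no test"]
```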

How to test a kernel?

Posted Dec 30, 2025 11:39 UTC (Tue) by farnz (subscriber, #17727) [Link]

That then puts pressure on people to not add new mocks that differ from an existing mock in one small detail, because doing so results in you having to update all the tests that use a mock to confirm that yes, this mock is not needed in this test.

It also encourages people to put in a list of mocks that they don't actually test against, simply to meet the rule, which leads to a false sense of security - if your test only looks at the behaviour of the BizLink adapter and the generic "meets the spec" adapter, but you list the Apple adapter because your visual inspection shows that it's the same as the generic adapter at the moment, a future developer can get confused by the list of hardware variants tested.

Note, too, that you don't want to run the combinatorial explosion of devices - if you're doing this seriously, that results in a test suite that takes months to run. Rather, you want to run the generic "meets the spec" device, and known quirky devices only, skipping the ones that are believed to behave the same way as other devices.

How to test a kernel?

Posted Dec 18, 2025 13:41 UTC (Thu) by pizza (subscriber, #46) [Link] (7 responses)

> I would imagine especially with hardware running a test suite against a mock version of the hardware would be a lot faster than setting up specific failure conditions and in general states on the hardware itself during development?

In order for that to be even moderately useful, that mock version would have to be bug-for-bug compatible with *real hardware*, not just the "specification" of said hardware.

As an extreme example of this: a couple of Mitsubishi printers were reported to sometimes misbehave on earlier Raspberry Pi boards (<=RPi3). I was unable to recreate this problem... until I switched to using wifi instead of hard-wired ethernet. After some trial and error a workaround was devised --- plug *anything else* into one of the USB ports (e.g. a flash dongle) and the problem went away.

A low-level USB sniffer provided the explanation. When only the printer was plugged in, the USB controller was effectively DoSing the printer with flow control packets slightly faster than the USB spec technically allowed (we're talking on the order of a few microseconds). When something else was plugged in, the rate of traffic to the printer was effectively halved and everything proceeded swimmingly.

How do you mock out a test for something like this? It's not enough to even be bug-for-bug compatible, you also have to be *timing* compatible. That is nearly impossible unless you have a ludicrously detailed spec or your mockup is effectively a cycle-accurate simulator running the same verilog as went into the real hardware. Oh, and also accurately simulating the other components of the full system. Sure, not everything requires this level of detail/accuracy, but what constitutes "good enough" varies _very_ widely.

(FYI, current $dayjob is writing models of hardware to enable whole-system virtualization. Rarely does a week go by that doesn't have us exposing bugs in either the HW specifications or "battle tested" production software that only ever worked accidentally)

How to test a kernel?

Posted Dec 18, 2025 14:33 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

A mock can still be useful for catching some bugs. It can't catch all of them (the system I worked on with a full mock had a bug that required signals to couple via RF waves from one board to another, which the mock did not simulate), but it reduces bugs.

The bigger problem is that the mock also needs maintaining, as you discover more relevant information about the hardware - and the mock will inevitably end up having to diverge to simulate two variants of the "same" piece of hardware. As the mock gets more complicated, it becomes harder and harder to verify that the mock works the same way as the hardware - and you can end up with the Linux kernel being the only thing that can drive the mock, because it's too closely coupled to Linux kernel behaviour for other OS drivers to work.

How to test a kernel?

Posted Dec 18, 2025 14:54 UTC (Thu) by pizza (subscriber, #46) [Link]

> The bigger problem is that the mock also needs maintaining. [...] As the mock gets more complicated, it becomes harder and harder to verify that the mock works the same way as the hardware

Absolutely.

"All models are wrong, some are useful"

How to test a kernel?

Posted Dec 18, 2025 15:49 UTC (Thu) by pm215 (subscriber, #98099) [Link] (4 responses)

It's pretty rare for hardware data sheets to even be detailed enough to confidently implement a functionally matching software model, let alone a timing accurate one. It's extremely common for them to only document the "happy path" and say nothing at all about behaviour of the hardware if software does something wrong.

How to test a kernel?

Posted Dec 18, 2025 16:28 UTC (Thu) by farnz (subscriber, #17727) [Link] (3 responses)

When I've done this, the mock implements the happy path. Then, every single time we hit misbehaviour from the hardware, we update the mock to match the hardware's actual behaviour before fixing our code.

Eventually, you end up with the mock forming a sort of "code as documentation" telling you how the device actually behaves - and refactoring the mock to remove duplicate code occasionally gives useful insights into what's probably going on in the device.

How to test a kernel?

Posted Dec 18, 2025 16:39 UTC (Thu) by pm215 (subscriber, #98099) [Link] (2 responses)

Most of the models I have to deal with I don't have convenient access to the real hardware to confirm against, unfortunately.

How to test a kernel?

Posted Dec 18, 2025 16:54 UTC (Thu) by farnz (subscriber, #17727) [Link] (1 responses)

How do you test the real driver if you don't have access to either a mock that works well enough to prove the happy path, or the real hardware?

When I've used mock hardware, we have it precisely because we don't have convenient access to the real hardware - and one of your tasks when you fix a bug is to change the mock so that the bug reproduces with the unfixed code, but not with the fixed code.

This does mean that the mock is not guaranteed to simulate real hardware (since we may have taught it to reproduce the symptoms but not the underlying real reason), but it's better than trying to bugfix with no hardware and no mock.

How to test a kernel?

Posted Dec 18, 2025 17:25 UTC (Thu) by pm215 (subscriber, #98099) [Link]

I'm not writing the drivers, I just have to bug fix QEMU device models :) Typically people come along and say either "this real OS doesn't work on the model but is fine on the hardware" or else "if you do this sequence of fuzzer-generated actions the model asserts/hangs/overruns a buffer". In either case I tend to have an underinformative data sheet, possibly the sources to a driver, and no hardware.

How to test a kernel?

Posted Jan 2, 2026 22:05 UTC (Fri) by aphedges (subscriber, #171718) [Link]

Ever since I read Driver regression testing with roadtest [LWN.net] a couple of years ago, I've been wanting to implement such mocks for the (non-GPU) hardware I own to prevent regressions, but as someone with no kernel development experience, it turned out to be far harder than I initially anticipated!

I feel a larger effort to implement mocks might help catch some bugs, but I know that even determining whether mocking affects the defect rate would be pretty difficult. Plus, I'm not sure where any of the funding to implement such mocks would come from.

How to test a kernel?

Posted Jan 2, 2026 21:51 UTC (Fri) by aphedges (subscriber, #171718) [Link] (2 responses)

So is your suggestion just to run in "production"? I don't maintain any critical systems, but I'm pretty sure my coworkers would dislike it if testing with a new kernel took down our compute cluster. Even my own personal NAS would be pretty inconvenient to have unavailable due to a new kernel bug.

I feel this kind of problem is less of an issue when you have a large fleet with a gradual rollout, but I'm working at orders of magnitude smaller scale than a large tech company.

How to test a kernel?

Posted Jan 5, 2026 17:42 UTC (Mon) by knurd (subscriber, #113424) [Link] (1 responses)

> So is your suggestion just to run in "production"?

Yes, definitely. But that doesn't mean that you should roll it out to all your coworkers, the whole compute cluster, or your NAS. Instead, choose where you can use it in production without causing too much trouble, either day-to-day or in case of problems.

I, for example, run mainline on my main work machine (except during merge windows) – but not on my home server which is also my NAS.

Maybe you have a few coworkers that are tech-savvy, are willing to help, have a good backup strategy, and are able to quickly switch to the latest stable kernel in case of severe problems? Great, ask them if they are willing to regularly test mainline in relevant use cases for you (like a particular machine you have a few dozen of). Depending on your compute cluster, it might even make sense to run it on a few machines there to, for example, detect performance problems (or benefits!).

The latter is what Meta does, afaik, to detect problems early, which is the "large fleet" you talked about. Because, yes, that scale makes some things easier; some things, on the other hand, are easier to realize on a smaller scale. Maybe that allows you to test mainline in some areas where it does not cost you much to help prevent situations that might cost you a lot of money and trouble down the road.

How to test a kernel?

Posted Jan 12, 2026 16:16 UTC (Mon) by aphedges (subscriber, #171718) [Link]

Thanks for the advice!

I was definitely thinking of what I read about what Meta does when thinking about large-scale testing. I just wish there was tooling for (orders of magnitude) smaller compute clusters to make it easier.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds