
A final stable kernel update for 5.4

Greg Kroah-Hartman has announced the release of the 5.4.302 stable kernel:

This is the LAST 5.4.y release. It is now end-of-life and should not be used by anyone, anymore. As of this point in time, there are 1539 documented unfixed CVEs for this kernel branch, and that number will only increase over time as more CVEs get assigned for kernel bugs.

For the curious, Kroah-Hartman has also provided a list of the unfixed CVEs for 5.4.302.




Questions

Posted Dec 3, 2025 17:17 UTC (Wed) by alspnost (guest, #2763) [Link] (31 responses)

This raises plenty of questions, however: if 5.4 LTS was still being "supported" until now, then how does it have over 1,500 CVEs? What about the other newer LTS kernels, like 6.1 and 6.6 - do they also have a growing list of CVEs, despite being supported with regular patch releases? How valuable is the LTS status in practice?

In related news, 6.18 is now confirmed as an LTS, as expected, and is now listed on the kernel.org releases page as such.

Questions

Posted Dec 3, 2025 17:26 UTC (Wed) by corbet (editor, #1) [Link] (2 responses)

As the kernels age, backporting fixes becomes increasingly hard, and finding people willing to do that work also becomes harder. It is not surprising that more and more problems fall through the cracks as a kernel gets older; fixing that would require significant funding to pay people to do that work.

Questions

Posted Dec 4, 2025 8:09 UTC (Thu) by alspnost (guest, #2763) [Link]

Sure - and I appreciate that is a whole separate and longstanding issue, both for the kernel and the wider ecosystem. I'm sure Greg would welcome some additional help, and it would be interesting to know what moves are afoot on that front. Nobody's getting any younger, and nor is the kernel itself!

Questions

Posted Dec 4, 2025 12:11 UTC (Thu) by skandigraun (guest, #136756) [Link]

I disagree with this statement. It probably plays a role in the high CVE number too, but I believe it is not the main reason.

Many CVE fixes are trivial in terms of complexity; backporting could even be automated for a big portion of them. They are low-hanging fruit for anyone who wants to get their name into the kernel's commit list.

I think about a year ago I was looking at the unfixed list, picked one randomly, and thought that I would backport and submit the fix. Beforehand I checked if maybe someone else did it already: sure enough, I found it on LKML from a few months before. The patch was rejected with some reason along the lines of "most probably it can't be exploited on this version" (paraphrased) - however the given kernel version was marked as vulnerable for that CVE. (Unfortunately I won't be able to recall which CVE it was, so you need to take my word for it. Or not.)

After this, the old "the kernel is making a joke of the CVE system" sentiment flashed through my mind for a second, and I dropped the backporting idea.

Of course this is anecdotal, and I might have been unlucky, and selected that one exceptional CVE-patch that was rejected. But it left a sour taste in my mouth.

Questions

Posted Dec 3, 2025 17:28 UTC (Wed) by pizza (subscriber, #46) [Link] (2 responses)

> This raises plenty of questions, however: if 5.4 LTS was still being "supported" until now, then how does it have over 1,500 CVEs?

"supported" doesn't mean "we're legally bound to address every possible/reported problem."

Questions

Posted Dec 6, 2025 17:39 UTC (Sat) by ATLief (subscriber, #166135) [Link] (1 response)

That's true, but compared to "unsupported" there is *some* additional commitment. What's being promised by that additional commitment?

Questions

Posted Dec 6, 2025 19:19 UTC (Sat) by pizza (subscriber, #46) [Link]

>What's being promised by that additional commitment?

Some variation of "a best-effort to backport fixes for ~= 2 years" (as opposed to the typical "best-effort for ~3 months" that "normal" releases get)

The key words being "best-effort".

Actual commitments cost actual money. Everyone else gets the "as-is, no warranties whatsoever" sign tap.

(Worth pointing out that the 5.4.x EOL comes over *six years* after 5.4.0 was first released.)

Questions

Posted Dec 3, 2025 19:18 UTC (Wed) by bluca (subscriber, #118303) [Link]

> This raises plenty of questions, however: if 5.4 LTS was still being "supported" until now, then how does it have over 1,500 CVEs? What about the other newer LTS kernels, like 6.1 and 6.6 - do they also have a growing list of CVEs, despite being supported with regular patch releases? How valuable is the LTS status in practice?

The LTS branches are very valuable, it's the CVEs assigned by the kernel CNA that are not, as it's well-known to be an attempt to "burn down the system": https://www.youtube.com/watch?v=HeeoTE9jLjM&t=1686s
That's why anybody who is serious about tracking security issues just ignores everything that comes out of that CNA, as it's not really useful, and they are (or at least were at the time of that talk) very open and honest about its real purpose. Everybody involved knows this.

You can just open a random one from that list to see what this means. It's just a random collection of bugs - no explanation, no impact assessment, no reproducer, nothing at all. It's intended to be that way. It's on a similar level of "quality" as reports written with AI slop that I get through a bug bounty program and that gets instantly closed and the reporter blocked.

Questions

Posted Dec 3, 2025 19:23 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (23 responses)

Because the kernel uses CVE to mean "if you squint a bit the patch looks sus". Anything that says "deadlock" in the commit message is a CVE. So is anything that says "a malformed inode could..." even though many filesystem maintainers don't consider kernel code to be hardened against malicious filesystems.

In some cases it's frankly ridiculous, I am the maintainer and cannot for the love of anything understand why https://marc.info/?l=linux-cve-announce&m=17629841572... is supposedly a vulnerability.

Questions

Posted Dec 3, 2025 20:04 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (22 responses)

Thinking more about it, it's probably because a WARN becomes a panic if one enables panic_on_warn, and in Linux CNA land a panic is an availability violation... even if you need to take special steps to enable normally-disabled panic conditions.

But Greg will tell you it's a vulnerability.

Questions

Posted Dec 3, 2025 20:11 UTC (Wed) by mb (subscriber, #50428) [Link] (21 responses)

You're almost there.
Only the user can tell if it is a vulnerability or not.
Get it?

Questions

Posted Dec 3, 2025 20:39 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (14 responses)

Because panic_on_warn is not the default, the user that sets it has explicitly said that availability is not a problem to him, let alone a security problem. Therefore, a WARN is not a vulnerability.

Besides, a host-controlled availability issue in a VM is *also* not a vulnerability because the host admin can just decide not to run the VM on a whim. Either way it simply makes no sense.

Questions

Posted Dec 3, 2025 20:50 UTC (Wed) by mb (subscriber, #50428) [Link] (13 responses)

What if the user wants both: Availability and security (not continuing on detected violations)?
That's a very common thing.

The user enables panic-on-oops for security.
The user wants updates, for availability, if there is a bug that oopses.

Only the user can tell, therefore upstream has to report everything.
Sorry, but you are wrong.

Questions

Posted Dec 3, 2025 21:09 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (12 responses)

> Availability and security (not continuing on detected violations)?

Then he's not treating availability as a security problem.

Besides, panic on oops is very different from panic on warn. An oops is already a condition where execution went down the wrong path, a WARN is a condition that is unexpected but that can be handled safely.

Panicking on oops makes sense, whereas panicking on WARN is an extreme setup.

> Only the user can tell, therefore upstream has to report everything.

Sure, there's "git log" for that.

Questions

Posted Dec 4, 2025 1:38 UTC (Thu) by sashal (✭ supporter ✭, #81842) [Link] (11 responses)

1. users set panic_on_warn to mitigate the risks involved in letting systems continue running in an incorrect state. Panic "mitigates" many instances of a much more severe issue by turning them into a denial of service.

See this thread for quite a few examples: https://lwn.net/ml/linux-kernel/20211027233215.306111-1-a...

Saying that "WARN is a condition that is unexpected but can be handled safely" is incorrect.

2. panic_on_warn does not suggest the user doesn't care about availability.

3. Most Linux instances in the world currently run with panic_on_warn. Calling it extreme is a bit... extreme :)

Should they move away from doing that? probably.

Can we just close our eyes and pretend they don't exist? no.

4. And no, this isn't an attempt to "burn it down". The kernel CNA isn't even the most active CNA out there.

Questions

Posted Dec 4, 2025 6:33 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (10 responses)

Linus says the same in that thread: panic_on_warn is for two situations:
>
> - kernel testing (pretty much universally done in a virtual machine, or simply just checking 'dmesg' afterwards)
>
> - hyperscalers like google etc that just want to take any suspect machines offline asap

In neither case is availability a *security* problem.

The purpose of WARN is to report an unexpected state that *can* be handled, with some degradation, but a WARN should also guarantee kernel integrity. If you detect a situation that will certainly lead to a use-after-free later, that's probably the best use of BUG. But, for example, KVM simply blocks all further ioctls on a file descriptor when it finds a WARN-worthy situation.

Questions

Posted Dec 4, 2025 9:00 UTC (Thu) by gregkh (subscriber, #8) [Link] (9 responses)

> In neither case availability is a *security* problem.

But it is, it meets the definition of "vulnerability" as required by cve.org. If a user can cause a machine to reboot, that affects the overall system, and so, if a user can cause a WARN() to be triggered on a Linux system, if panic-on-warn is enabled, then a reboot happens, which is a denial of service for the system.

And as Sasha said, billions of Linux systems out there run with panic-on-warn enabled, so these are CVE-worthy issues for those configurations. If you do not have panic-on-warn enabled, wonderful, probably way over half of all kernel CVEs will not ever be a real issue and you can just filter them away.

But as we do not know how Linux is used, i.e. we do not dictate use, we have to mark everything in the codebase that is fixed that could be resolving a vulnerability for any type of kernel configuration possible. It's just the "joy" of being an open source project, closed source projects could just declare "panic-on-warn should never be used in a real system, do so and your warranty is void" and be done with it, but we can't.

And no, we are not "burning down the system", this is explicitly what cve.org wants us to do. If they wanted to cause their own system to be "burned down", they could do many other things than force a few thousand Linux kernel CVEs to be assigned :)

Questions

Posted Dec 4, 2025 9:53 UTC (Thu) by pbonzini (subscriber, #60935) [Link] (8 responses)

> It's just the "joy" of being an open source project, closed source projects could just declare "panic-on-warn should never be used in a real system, do so and your warranty is void" and be done with it, but we can't.

That's *categorically false*. You are free to describe what you consider to be within the boundaries of your security/CVE policy. For example https://qemu-project.gitlab.io/qemu/system/security.html#... says "Bugs affecting the non-virtualization use case are not considered security bugs at this time. Users with non-virtualization use cases must not rely on QEMU to provide guest isolation or any security guarantees".

So, it takes a sentence in the manual to say "panic_on_warn=1 signals that you do not consider availability to be security sensitive" (I guess you could express it more clearly, but that's the idea).

Likewise, VMs by definition do not guarantee availability. If KVM can trigger a WARN in the guest, that's not security sensitive even with panic_on_warn because the host could just kill -9 the process for the VM.

> this is explicitly what cve.org wants us to do

It's not. I don't know any other open source project that creates a CVE for each assertion failure.

Questions

Posted Dec 4, 2025 10:21 UTC (Thu) by gregkh (subscriber, #8) [Link] (7 responses)

Yes, as an open source project you are free to say fun things like "do not hold it this way", but as has been discussed many times in the past in many different places, that does not prevent a CVE from being issued against that software when someone "holds it wrong" (see the recent oss-security discussions about this very issue for specific examples, some quite extreme.)

For Linux, we are _barely_ getting away with saying "issues arising from specifically corrupted filesystem images that can be caught by the userspace fsck program before mounting the filesystem are not a security issue." And even then, we get push-back all the time to classify them as CVEs, as some "user friendly" distros love to auto-mount anything a user plugs in, no matter what.

When being a CNA, you have to follow the rules of cve.org and its definition of what a vulnerability is, and as there are billions of Linux systems out there running with panic-on-warn enabled (i.e. the huge majority of Linux overall), having a user be able to trigger a WARN() message, which causes the kernel to reboot the system, is considered a denial-of-service and meets the definition of having to have a CVE assigned to it.

And a "reboot" is not an "assertion failure", at the level of the kernel, it is a denial of service, which is why Linux has to mark these as CVEs.

Seriously, I wish this wasn't the case, but it's been discussed many times in the past, and unless the cve.org board comes to us and says "it's ok for you to not mark these types of things as CVEs now", we have to keep doing it.

Everyone seems to be ignoring the fact that there are many major bugfixes that are NOT given a CVE because they do not meet the cve.org vulnerability definition, as data loss or corruption is not considered a vulnerability. That's where I worry the most, as those things are not being tracked anywhere at the moment, and might cause problems going forward with the implementation of the CRA and other such laws in other countries.

Then there is the issue where some companies only like to give "high" security bugs a CVE, and ignore the "low" and "middle" ones and just roll those out into normal bugfixes. That causes the overall numbers to be skewed away from those CNAs so just looking at raw quantity of CVEs issued does not actually give a good picture overall of how well a project is doing or not.

In other words, pure numbers of CVEs don't matter, what matters is your use case, and the code that you actually use, and if any of the reported CVEs are actually relevant for that. It's up to the end-user of Linux to determine this, it's not anything that us as a CNA can do.

Questions

Posted Dec 4, 2025 10:35 UTC (Thu) by epa (subscriber, #39769) [Link] (2 responses)

You are right. Pure numbers of CVEs don't matter. But then why say in the release announcement "there are 1539 documented unfixed CVEs for this kernel branch" as if that were a meaningful statement?

Questions

Posted Dec 4, 2025 14:23 UTC (Thu) by gregkh (subscriber, #8) [Link] (1 responses)

Because that might be meaningful to some people, as I have been asked for this type of information many times in the past.

Questions

Posted Dec 4, 2025 14:31 UTC (Thu) by daroc (editor, #160859) [Link]

If nothing else, it is the sort of information one might provide to one's management to justify spending time upgrading to a more recent kernel version.

Questions

Posted Dec 4, 2025 11:54 UTC (Thu) by pbonzini (subscriber, #60935) [Link]

> When being a CNA, you have to follow the rules of cve.org and its definition of what a vulnerability is, and as there are billions of Linux systems out there running with panic-on-warn enabled (i.e. the huge majority of Linux overall), having a user be able to trigger a WARN() message, which causes the kernel to reboot the system, is considered a denial-of-service and meets the definition of having to have a CVE assigned to it.

panic_on_warn explicitly says that I don't care. I *want* to lose availability in order to not compromise on confidentiality or integrity, and I'm okay with false positives. The linked thread from 2021, for example, explains (https://lwn.net/ml/linux-kernel/22828e84-b34f-7132-c9e9-b...) panicking as a safe condition, where the watchdog can bring the system back to a safe state. panic_on_warn does the opposite of adding security issues; it gives peace of mind that you're safe *even if a developer incorrectly programmed the recovery path from a WARN*.

Of these two statements:

1) "It's not Linux that has a problem if a corrupted filesystem corrupts kernel memory, it's whatever system allowed the filesystem to be mounted without a previous fsck. Therefore it's out of the Linux CNA scope".

2) "It's not Linux that has a problem if a WARN triggers, it's whatever system embeds Linux *and* sets panic_on_warn=1 *and* cannot recover from it. Therefore it's out of the Linux CNA scope".

The second is much more defensible. In fact if anything it's the first that is absolutely *in*defensible and IMO doesn't follow the rules of cve.org.

We went from "all bugs are security bugs, therefore I don't care about marking bugs as security-sensitive" to "all bugs are security bugs, therefore I shall inundate the world with pointless and unusable reports".

> And a "reboot" is not an "assertion failure", at the level of the kernel, it is a denial of service, which is why Linux has to mark these as CVEs.

Anything that talks to someone else can be a source of denial of service. If a database goes down, whatever service it talks to won't be able to run. Cloudflare's recent screwup was an assertion failure, and it definitely caused denial of service. The kernel is not special in terms of availability (it is special in terms of confidentiality and integrity, which is why I have no problem with keyword-tagging use after free and oopses as CVEs).

People having to use stuff in ways it wasn’t meant for

Posted Dec 6, 2025 23:58 UTC (Sat) by DemiMarie (subscriber, #164188) [Link] (2 responses)

ChromeOS’s security team will treat any potentially exploitable bugs in ext4 as vulnerabilities, even if they require a malicious filesystem image to trigger. I believe Android has the same policy with f2fs.

In both cases, such an exploit could be used by an attacker to maintain persistence across reboots. The only way to avoid this problem is to use FUSE for the writable partition, which currently has horrible performance.

Running a full fsck at each boot is not an option for rather obvious performance reasons.

Perhaps Google ought to be taking charge of maintaining these filesystems, or at least triaging and fixing syzbot reports?

In the future, the solution is probably to (a) make FUSE fast enough and (b) use a better language to write filesystems. Unfortunately, even the patches to make FUSE fast only help filesystems that don’t use encryption, checksumming, or compression. Android and I believe ChromeOS use fscrypt and blk-crypto, and software encryption would be impossible because the kernel does not have the keys.

I don’t have anything I can contribute to fixing the problem, but this is a case where distros don’t have a decent alternative to what they are being told not to do.

People having to use stuff in ways it wasn’t meant for

Posted Dec 7, 2025 0:18 UTC (Sun) by gregkh (subscriber, #8) [Link] (1 response)

Last I checked, Android does not have the same policy as it enforces filesystem integrity in a different way (signed partition images, if the signature does not pass, the filesystem is not mounted.) So those few billion devices should be safe :)

dm-verity doesn’t work for writable partitions

Posted Dec 7, 2025 2:16 UTC (Sun) by DemiMarie (subscriber, #164188) [Link]

Android and ChromeOS do use dm-verity for the OS partition, with a root hash that is either signed or passed to the kernel via a trusted path. This uses a Merkle tree to validate data blocks as they are read.

However, dm-verity is read only, so it can’t be used for the writable user data partition. An attacker who has gained root privilege can thus tamper with the filesystem metadata on the writable filesystem.

Questions

Posted Dec 3, 2025 20:41 UTC (Wed) by bluca (subscriber, #118303) [Link] (5 responses)

It's a "vulnerability" if there's a reproducer attached that demonstrates a security boundary being violated. Otherwise it's just a sparkling bug.

Questions

Posted Dec 3, 2025 20:49 UTC (Wed) by pbonzini (subscriber, #60935) [Link] (1 responses)

I don't necessarily want a reproducer. If something says use after free, or even easily-triggered-BUG_ON, then fine it's a vulnerability. But many CVEs from Linux don't even clear that bar.

Questions

Posted Dec 4, 2025 3:56 UTC (Thu) by pabs (subscriber, #43278) [Link]

See also bluca's comment above:

https://lwn.net/Articles/1049140/

Questions

Posted Dec 4, 2025 10:22 UTC (Thu) by kleptog (subscriber, #1183) [Link] (2 responses)

That doesn't make a lot of sense. You're saying that because there's no reproducer it's not a vulnerability, until someone smart finds a way to exploit it, and then it's a zero-day?

Ideally we'd like a way to be able to (easily) prove some bug was not exploitable, but if we had that we could just use that to remove all the bugs from the code base.

Questions

Posted Dec 4, 2025 10:38 UTC (Thu) by bluca (subscriber, #118303) [Link] (1 responses)

How do you know your future "zero day" has actually been fixed if you can't even reproduce it? Are you sure it's not still a "zero day" in waiting and you simply weren't smart enough to really fix it, given you weren't smart enough to reproduce it in the first place?

No reproducer -> no vulnerability. It's a simple rule and every bug bounty program I have seen follows it, otherwise anybody can just show up and claim money out of thin air, taking resources away from real security research. But of course we all know these are all just attempts at post-facto justifications, and the real goal is to "burn down the system" https://www.youtube.com/watch?v=HeeoTE9jLjM&t=1686s

Reproducers

Posted Dec 6, 2025 23:45 UTC (Sat) by DemiMarie (subscriber, #164188) [Link]

I have gotten bug bounties for problems that I found solely via source review. In both cases, reproducing the bug would have been hard: I didn’t have the affected hardware, and it might well have needed to be chained with another exploit.

Monolithic kernel were a mistake

Posted Dec 7, 2025 0:02 UTC (Sun) by DemiMarie (subscriber, #164188) [Link] (1 responses)

What this tells me is that monolithic kernels, especially in unsafe languages, were never a good idea to begin with.

There is simply too much code running with kernel privilege.

Breaking up Linux into a microkernel and userspace servers, both either written in safe languages or proven free of undefined behavior in other ways, is the only way I can see to stem the flood.

Monolithic kernel were a mistake

Posted Dec 8, 2025 10:45 UTC (Mon) by farnz (subscriber, #17727) [Link]

It's an engineering tradeoff; monolithic kernels have lower overheads (since they do fewer context switches), but the downside is that because it's all in a single context, bugs are of equal potential severity whether they occur in a rarely used part, or in the core of the kernel.

Microkernels are a different tradeoff; on the one hand, a bug in the userspace server for your webcam can't take out your filesystem, but on the other hand, you have higher overheads, because you're context switching more frequently (e.g. a write comes from my process, then you have to get it across to the filesystem server, then across to the block device server, then to the SSD).

That said, I'm curious as to whether an io_uring-like syscall interface (where all syscalls are done as IPC via a command queue and response queue) would ameliorate some of the overhead on modern multi-core machines, by allowing a process to queue work, while another core is clearing the queue. This'd be a major rethink of a lot of things, but it's at least plausible that the net effect would be to allow you to hide the microkernel overheads behind a queue, trading off latency.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds