A turning point for CVE numbers

Posted Feb 14, 2024 21:44 UTC (Wed) by mfuzzey (subscriber, #57966)
In reply to: A turning point for CVE numbers by bluca
Parent article: A turning point for CVE numbers

>The point of the CVE system should in theory be that it allows to quickly decide whether it's worth to drop everything on the floor..

But that does not and cannot work for something that is as wide scoped as the Linux kernel.
Even assuming the CVE does refer to a real vulnerability the impact is very usecase dependent.

A local privilege escalation on a server providing shell accounts is probably a big deal, whereas for the same vulnerability on an embedded device that is only running the intended software (maybe already as root), who cares?

The problem is that CVE numbers are conceptually simple and hide a lot of the real complexity and nuances that need to go into their sensible interpretation and you end up with stupid policies that say "you have to fix all CVEs" (of course that's not directly the fault of the CVE numbers themselves but the way people try to use them).

Giving managers something they think they can understand and make decisions on when the realities are much more complicated is generally a bad idea.

Also in my experience updates in the same stable kernel series very rarely cause issues and those that occasionally do slip through can be mitigated with reasonable testing. Updating to a new kernel release does need a bit more care and testing though. I think the risk of *not* updating is higher than that of updating, providing you do test to some extent.

> companies will just stop using Linux in their products, starting with anything to do with government contracts,

Unlikely I think. At this point there are few viable alternatives to Linux for vast swathes of applications. The alternatives generally either aren't open source and involve per instance license fees (making them either insufficiently flexible or too expensive) or lack the breadth of hardware support that Linux enjoys (making them unusable for many, particularly in embedded).

A turning point for CVE numbers

Posted Feb 15, 2024 0:50 UTC (Thu) by bluca (subscriber, #118303) [Link] (41 responses)

> But that does not and cannot work for something that is as wide scoped as the Linux kernel.

Of course it can and does, this is just the usual kernel developers' misplaced exceptionalism and sense of grandeur. It's just some piece of software like many others.

> Even assuming the CVE does refer to a real vulnerability the impact is very usecase dependent.

And that's what the impact assessment and other data are used for, you are stating the obvious. "Does this exploit apply to our product" is the standard minimum assessment that everyone does.

> Also in my experience updates in the same stable kernel series very rarely cause issues

They break apart all the time, as soon as they involve anything that is not exercised on a couple dozen kernel developers' laptops or desktops, and sometimes even there, like the disk corruption bug of a couple of months ago. New major releases are even worse, with userspace interfaces being intentionally broken left and right.

> At this point there are few viable alternatives to Linux for vast swathes of applications.

I'm sure the developers of all past software that was once widespread and then faded into obscurity thought the same at some point or another. It just needs to stop making economic sense to use it, and that's exactly what will happen - back to being a toy for hobbyists. We live in a capitalist society, and all those companies that are directly or indirectly sponsoring the vast, vast majority of development feel no attachment nor loyalty to anything but their share prices and profit margins.

A turning point for CVE numbers

Posted Feb 15, 2024 0:54 UTC (Thu) by pizza (subscriber, #46) [Link] (27 responses)

> New major releases are even worse, with userspace interfaces being intentionally broken left and right.

[citation needed]

(Especially given this flies against a _very_ longstanding "don't break userspace" rule that's kept all manner of crappy interfaces around)

A turning point for CVE numbers

Posted Feb 15, 2024 1:06 UTC (Thu) by bluca (subscriber, #118303) [Link] (26 responses)

That rule is a fantasy from a world that never existed. There is the syscall ABI stability, and that's about it. I still remember the scramble to fix pretty much every single udev rule in existence when the sequence of uevents was changed in an incompatible way. Or having to throw away a bunch of stuff and start over when overlayfs was made incompatible with SELinux. Or countless changes in the netlink protocol. And so on and so forth. If you really think "don't break userspace" really means anything, then either we have fundamentally different ideas of what userspace actually means, or you haven't been paying much attention.

A turning point for CVE numbers

Posted Feb 15, 2024 1:19 UTC (Thu) by pizza (subscriber, #46) [Link] (3 responses)

> That rule is a fantasy from a world that never existed.

Then why are you so grumpy about not getting something that was never promised to begin with?

Seriously, write and/or maintain your own kernel/system if that sort of stability matters so much to you.

("But waaah, that's too much work!" you exclaim. So if you're not willing to do it, why do you expect others to do it for you, for free?)

A turning point for CVE numbers

Posted Feb 15, 2024 1:28 UTC (Thu) by bluca (subscriber, #118303) [Link] (2 responses)

So is it or is it not a "long standing rule"? You are the one who claimed it was, not me

A turning point for CVE numbers

Posted Feb 15, 2024 8:15 UTC (Thu) by Wol (subscriber, #4433) [Link] (1 responses)

> So is it or is it not a "long standing rule"?

Depends what you're talking about. As far as I know udev is (developer wise) absolutely nothing to do with the kernel.

"Do not break user space" is the rule Linus applies to the linux kernel. And that is a big part of the reason linux is so successful. Who knows what rules the udev guys apply to udev...

Cheers,
Wol

A turning point for CVE numbers

Posted Feb 15, 2024 10:35 UTC (Thu) by bluca (subscriber, #118303) [Link]

udev acts on uevents, which are a userspace interface exposed by the kernel. Some time ago the sequence of events for existing devices was changed in a way that broke pretty much every userspace configuration in existence, so userspace was very much broken. The answer was "shrug, deal with it".

And that's a legitimate answer to give of course, it's their kernel after all. The problem is taking that approach _and_ then going around proudly proclaiming "we do not break userspace".
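To make the uevent point concrete, here is a minimal sketch of a consumer that acts only on the actions it explicitly knows about and ignores everything else, so that an action it has never seen is not mistaken for a familiar one. It assumes the payload is a NUL-separated list of KEY=VALUE fields after a leading "action@devpath" header; the names `parse_uevent` and `dispatch` and the handler table are hypothetical, and this is not how udev itself is implemented.

```python
def parse_uevent(payload: bytes) -> dict:
    """Split a raw uevent payload into a dict of its KEY=VALUE fields.

    Assumption: fields are NUL-separated and the first chunk is an
    "action@devpath" header, which is skipped here because it normally
    contains no "=" sign.
    """
    fields = {}
    for chunk in payload.split(b"\0"):
        key, sep, value = chunk.decode("utf-8", "replace").partition("=")
        if sep:
            fields[key] = value
    return fields


def dispatch(payload: bytes, handlers: dict):
    """Call the handler registered for the event's ACTION, if there is one.

    The handler table doubles as an allow-list: an action with no
    registered handler is ignored and None is returned.
    """
    event = parse_uevent(payload)
    handler = handlers.get(event.get("ACTION"))
    if handler is None:
        return None
    return handler(event)
```

The design choice is simply to fail closed: a consumer written as "anything that is not a remove must be an add" changes behaviour whenever a new action shows up, while an explicit table does not.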

A turning point for CVE numbers

Posted Feb 15, 2024 21:51 UTC (Thu) by fw (subscriber, #26023) [Link] (21 responses)

On x86-64, some previously supported system call interfaces no longer work. Try running a CentOS 6 chroot on a Debian 11 system, for example.

A turning point for CVE numbers

Posted Feb 16, 2024 0:21 UTC (Fri) by bluca (subscriber, #118303) [Link] (20 responses)

That's preposterous. Don't you know that the kernel "never breaks userspace"?

Never break userspace

Posted Feb 16, 2024 10:03 UTC (Fri) by timon (subscriber, #152974) [Link] (19 responses)

Linus et al. approach the “never break userspace” rule pragmatically, with a kind of “If a tree falls in a forest and no one is around to hear it, does it make a sound?” thinking: If the kernel devs know of no actual userspace code depending on some kernel API/ABI, and if nobody complains when something is changed or deprecated, then this case is not in scope for the “never break userspace” rule; in the same way, if after the fact someone complains, the kernel devs are usually quick to roll back any change or deprecation.

Never break userspace

Posted Feb 16, 2024 12:12 UTC (Fri) by bluca (subscriber, #118303) [Link] (18 responses)

Except that is also most definitely not true. Complaints are usually met with "shrug, deal with it".

Never break userspace

Posted Feb 16, 2024 14:28 UTC (Fri) by corbet (editor, #1) [Link] (17 responses)

Examples? Preferably with pointers to the discussion around the issue? I don't doubt there are places where the kernel community is failing to live up to its goals, but it's hard to make things better without some clarity about where the problem exists.

Never break userspace

Posted Feb 16, 2024 15:36 UTC (Fri) by bluca (subscriber, #118303) [Link] (16 responses)

Sure, here's one:

https://lists.freedesktop.org/archives/systemd-devel/2022...
https://lists.freedesktop.org/archives/systemd-devel/2022...

The other one that hit me personally was when overlayfs was made incompatible with selinux. And then there's all the times that netlink changed. And all the times uevents changed. And all the times sysfs changed. And so on and so forth. The reality is that "we don't break userspace" is a nice story that kernel developers like to go around telling anybody who's willing to listen, but it's just that, a story. They barely care about syscall ABI stability, and even that gets broken from time to time as already pointed out by another comment.

Never break userspace

Posted Feb 16, 2024 16:01 UTC (Fri) by mb (subscriber, #50428) [Link] (2 responses)

Well. Technically you are right. The kernel frequently breaks interfaces.
But in reality, systemd and udev are parts of the operating system.
So I don't care that much, if the OS breaks itself. That will get fixed eventually.

What I and most users care about is whether actual user applications break.
And that extremely rarely happens.
I run decades old binaries that work just fine.

It doesn't affect my application, because the OS as a whole still works as before, after porting systemd/udev to the new interfaces. A combination of updated kernel and incompatible systemd/udev would never hit stable distributions.

Therefore, do you have examples of real user applications breaking, that are not part of the OS?

Never break userspace

Posted Feb 16, 2024 16:08 UTC (Fri) by bluca (subscriber, #118303) [Link] (1 responses)

> Therefore, do you have examples of real user applications breaking, that are not part of the OS?

Need some help with all that goal post moving? Must be exhausting, going that far

Never break userspace

Posted Feb 16, 2024 16:54 UTC (Fri) by mb (subscriber, #50428) [Link]

I have said that you were right. You have said nothing wrong.
There is no such thing as a general interface stability guarantee.

I've just set things into perspective. That is no goal post moving.

Never break userspace

Posted Feb 16, 2024 16:06 UTC (Fri) by corbet (editor, #1) [Link] (12 responses)

Ah yes, the BIND/UNBIND thing was a big enough deal that I wrote about it at the time. What I suggested there might still seem to make sense: rather than sniping at the kernel community from the sideline, work with them to improve the situation. Let Thorsten know about regressions, preferably early enough to keep them from making it into a release. Things can be improved.

I have to say, Luca, that I would expect a systemd developer to understand how this kind of constant badmouthing from outside can make an environment toxic; systemd has certainly suffered its share of that. Why continue with that pattern? A more constructive approach might work wonders.

Never break userspace

Posted Feb 16, 2024 16:49 UTC (Fri) by bluca (subscriber, #118303) [Link] (10 responses)

> Ah yes, the BIND/UNBIND thing was a big enough deal that I wrote about it at the time. What I suggested there might still seem to make sense: rather than sniping at the kernel community from the sideline, work with them to improve the situation. Let Thorsten know about regressions, preferably early enough to keep them from making it into a release. Things can be improved.
>
> I have to say, Luca, that I would expect a systemd developer to understand how this kind of constant badmouthing from outside can make an environment toxic; systemd has certainly suffered its share of that. Why continue with that pattern? A more constructive approach might work wonders.

These things do get reported, and they get ignored/shrugged away if you are lucky, and if not you get taken for a long ride. For this case it's explained in the link above as well. For the overlayfs case Google even went as far as sending 20 revisions of a patchset to try and restore backward compatibility, albeit optionally, and it was stonewalled: https://lore.kernel.org/lkml/20211117015806.2192263-1-dva...
Apparently breaking userspace can just be waved through, while fixing it requires "building a security model" and other extremely high bars to be met. In the meantime anybody using selinux needs to completely open up the security policy to make it work at all, of course, which I guess makes for a very interesting "security model". I could go on, but can't be bothered to look up yet more references.

So yeah, trying and dispelling this myth that "the kernel doesn't break userspace" is pretty much all that's left. Reading blatantly false statements being made irks me really badly, especially when used to justify some potentially damaging process changes as it happened here.

Never break userspace

Posted Feb 16, 2024 17:01 UTC (Fri) by mb (subscriber, #50428) [Link] (9 responses)

> These things do get reported, and they get ignored/shrugged away if you are lucky

Quite honestly, systemd and udev also broke lots and lots of things about how the Linux operating system works.
Administrators had to change tons of scripts, because some things suddenly worked differently after the distribution updated from classic init to systemd.

But the correct answer to users complaining often enough is to "ignore it" or "shrug it away".
Things work differently now. Get used to it. That's the correct answer surprisingly often.
That's true for systemd and it's also true for parts of the kernel.

What should be avoided is breaking changes that don't have positive sides. Changes just for the sake of changing and breaking things. That is bad and must be avoided. And it should always be considered, if a non-breaking change is possible.

But if a change breaks things and at the same time brings big benefits (relative to the breakage)?
*shrug*

Never break userspace

Posted Feb 16, 2024 18:05 UTC (Fri) by bluca (subscriber, #118303) [Link] (8 responses)

> Quite honestly, systemd and udev also broke lots and lots of things about how the Linux operating system works.

Or in other words, different software works differently. Or, yet again, you are moving the goal posts. Because nobody ever said "systemd works in exactly the same way as your 1980s garden variety collection of shell scripts", in fact the idea was very much the opposite. Some compat layers for the main interfaces were provided, which were always clearly documented as sub-optimal and wonky and intended for transition purposes, and after 20 years or so we'll remove them too, with ample advance notice. But nobody ever claimed that every single workflow in existence would continue unchanged after switching.

In fact, we don't even make absolute claims such as "we never break compatibility, period". From time to time we do breaking changes, and we try to announce them in advance, and for really impactful ones we try to get consensus on the mailing list first, and in rare cases we even try to help distributions migrate ahead of time to ensure the impact is nominal only - see when we dropped support for unmerged-usr last year for example - this happened in v255, and nobody noticed.
Sometimes things break accidentally, and sometimes they get fixed and sometimes they don't.

But what we most certainly don't do, is going around claiming "we never break compatibility", and I certainly don't use such a claim to start firing a bogus CVE for each commit that I backport to every stable branch I maintain.

See where the difference is?

Never break userspace

Posted Feb 16, 2024 18:59 UTC (Fri) by mb (subscriber, #50428) [Link] (7 responses)

>Sometimes things break accidentally, and sometimes they get fixed and sometimes they don't.

>See where the difference is?

Nope.
It's exactly the same thing. Things change. Deal with it like everybody has to deal with systemd.

Never break userspace

Posted Feb 16, 2024 19:03 UTC (Fri) by rahulsundaram (subscriber, #21946) [Link] (6 responses)

> It's exactly the same thing. Things change. Deal with it like everybody has to deal with systemd.

The difference is that kernel developers have publicly committed to never breaking userspace. Systemd developers haven't. It is the disconnect between the public messaging and reality that's causing the contention. Not the changes themselves necessarily.

Never break userspace

Posted Feb 16, 2024 19:10 UTC (Fri) by mb (subscriber, #50428) [Link] (3 responses)

>The difference is that kernel developers have publicly committed to never breaking userspace

Things like uevents, tracepoints, sysfs files, etc... were pretty much never part of that claim.
Devs try hard to not make unnecessary breakages, but if a sysfs file disappears/changes or a uevent changes, programs have to deal with it.
Has always been like that.

> It is the disconnect between the public messaging and reality that's causing the contention.

The disconnect between the expectation and the reality is causing the contention.

Never break userspace

Posted Feb 16, 2024 19:21 UTC (Fri) by bluca (subscriber, #118303) [Link] (2 responses)

> Things like uevents, tracepoints, sysfs files, etc... were pretty much never part of that claim.

Citation needed. That is very much not evident from any claim anybody has ever made that I have seen.

Never break userspace

Posted Feb 16, 2024 19:41 UTC (Fri) by mb (subscriber, #50428) [Link] (1 responses)

>Citation needed

Even syscalls have been removed in the past, breaking applications.
BUT these applications always were very limited in count and usually part of the OS itself.

Citation: Look at the sources.

There has never been a thing like a general stability guarantee.
It always has been a matter of common sense.
If a change only breaks udev or systemd and nothing else, it might make sense to do it.

Never break userspace

Posted Feb 16, 2024 20:07 UTC (Fri) by bluca (subscriber, #118303) [Link]

Excellent, you have now demonstrated that my original post, claiming that there is no such thing as 'kernel never breaks userspace', was indeed correct, and thus 'just update to the latest' cannot be suggested as a solution to security issues since public interfaces will be broken left and right with no regard for backward compatibility. Congrats!

> If a change only breaks udev or systemd and nothing else, it might make sense to do it.

I beg to differ

Never break userspace

Posted Feb 16, 2024 19:25 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

> The difference is that kernel developers have publicly committed to never breaking userspace.

Where?

Okay, I know Linus says "never break user-space", and he is very strict about it. But at the end of the day, shit happens.

And there's plenty of kernel developers who *haven't* signed up to it. They just know that trying to get it past Linus is not a battle worth fighting most of the time.

There's one big example I can think of, that had a rather nasty fall-out, in the raid world. So bad, in fact, that kernels were modified to have an explicit "fail to boot" config, iirc!

Something to do with the fact that raid layout was accidentally changed. So you have pre-change kernels that will trash post-change arrays, pre-discovery kernels that will trash pre-change arrays, and post-discovery kernels that will refuse to access arrays without a "this is a pre/post-layout flag".

Sometimes that's all you can do :-(

Cheers,
Wol

Never break userspace

Posted Feb 16, 2024 20:04 UTC (Fri) by bluca (subscriber, #118303) [Link]

In this very thread you can see such statements

Never break userspace

Posted Feb 18, 2024 5:41 UTC (Sun) by ras (subscriber, #33059) [Link]

I smiled at the popcorn reference, thinking it was a joke. My bad. It was advice. Good advice, as it turned out.

A turning point for CVE numbers

Posted Feb 15, 2024 1:08 UTC (Thu) by pizza (subscriber, #46) [Link] (12 responses)

> I'm sure the developers of all past software that was once widespread and then faded into obscurity thought the same at some point or another. It just needs to stop making economic sense to use it,

It's _vastly_ cheaper to keep using it than to replace it with something else. By multiple orders of magnitude.

...Any replacement will necessarily need to be roughly equivalent in features and complexity, and even if you completely discount the initial development costs, you're going to still end up with a similar ongoing maintenance burden.

A turning point for CVE numbers

Posted Feb 15, 2024 10:39 UTC (Thu) by bluca (subscriber, #118303) [Link] (6 responses)

It is now, but there is no law of physics that says it will remain cheaper forever, no matter what happens. Intentionally breaking the security tracking system so that it's unusable sounds like a great way to make it more expensive than it was before.

A turning point for CVE numbers

Posted Feb 16, 2024 13:17 UTC (Fri) by hkario (subscriber, #94864) [Link] (5 responses)

It's not breaking anything.

If you have a policy that says you need to ship fixes for all CVEs, then that's a stupid policy. It just conditions vendors to refuse each and every CVE until it goes through arbitration (something proprietary vendors already do).

What consumers of CVEs need to do is be selective, evaluate if the CVE is relevant, what are the effects of exploiting it, etc. and only then backport it to the product they ship that uses the kernel or other CVEs. Same for end users, if the bug is in an API that's not used by any software that is running, then, no, you don't have to install updates.
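As a rough illustration of that kind of selective evaluation, here is a small sketch that filters CVE records against what a given product actually enables and exposes. Everything in it is an assumption made for the example: the record fields (`subsystem`, `needs_local_access`), the product description, and the two relevance rules are hypothetical and do not correspond to real CVE metadata or to anyone's actual tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CveRecord:
    cve_id: str
    subsystem: str            # hypothetical: kernel area the fix touches
    needs_local_access: bool  # hypothetical: exploit requires a local account


@dataclass(frozen=True)
class Product:
    enabled_subsystems: frozenset  # what this product's kernel config builds in
    untrusted_local_users: bool    # e.g. a shell server vs. a sealed appliance


def is_relevant(cve: CveRecord, product: Product) -> bool:
    """Decide whether a CVE is worth backporting for this product."""
    # Code that is not built or not reachable cannot be exploited here.
    if cve.subsystem not in product.enabled_subsystems:
        return False
    # A local-only issue matters little when there are no untrusted local users.
    if cve.needs_local_access and not product.untrusted_local_users:
        return False
    return True


def triage(cves, product):
    """Return the IDs of the CVEs that pass the relevance check, in order."""
    return [cve.cve_id for cve in cves if is_relevant(cve, product)]
```

The filter itself is trivial; the actual work is filling in those fields correctly for each CVE and each product, which is the part nobody wants to pay for.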

The problem is that all of it requires actual work, not blind adherence to the policy, and it's for security, so the business also doesn't want to spend money for it.

It's a complex problem and there are no simple solutions.

A turning point for CVE numbers

Posted Feb 16, 2024 13:35 UTC (Fri) by bluca (subscriber, #118303) [Link] (4 responses)

> If you have a policy that says you need to ship fixes for all CVEs, then that's a stupid policy.

Nobody I know of has such a policy, so that sounds like yet another of those made-up strawmen that the kernel people pushing for this have conjured out of thin air.

We rely on CVE metadata&al to decide whether we need to pick a fix or not. If the metadata is bogus, because the kernel maintainers just flood the system with bogus CVEs, then we can't do that sensibly anymore, and the process is broken.

A turning point for CVE numbers

Posted Feb 16, 2024 13:43 UTC (Fri) by pizza (subscriber, #46) [Link] (3 responses)

> Nobody I know of has such a policy, so that sounds like yet another of those made-up strawman that the kernel people pushing for this have conjured out of thin air.

I worked for a company that had such a policy.

Respectfully, you need to STFU about stuff that is outside your realm of expertise and experience.

A turning point for CVE numbers

Posted Feb 16, 2024 14:09 UTC (Fri) by bluca (subscriber, #118303) [Link] (2 responses)

> I worked for a company that had such a policy.

Sounds like a problem in that company then, why should that justify breaking everything for everybody else?

> Respectfully, you need to STFU about stuff that is outside your realm of expertise and experience.

Respectfully, you need to STFU about my expertise and experience, because you have no idea about either (just like I don't about yours)

A turning point for CVE numbers

Posted Feb 16, 2024 15:32 UTC (Fri) by pizza (subscriber, #46) [Link]

> Sounds like a problem in that company then, why should that justify breaking everything for everybody else?

*shrug* You made an assertion such organizations do not exist (because you didn't know any) and used that to accuse others of making things up or otherwise speaking in bad faith. You were incorrect on both fronts.

You're free to argue that the current status quo has problems (or not). You're free to talk about *your* experiences, and how proposed actions by others will have ill effects on you or third parties.

But you don't get to claim that other people's direct experiences are wrong, incorrect, or irrelevant, and accuse them of bad faith for taking steps to improve the messes they are dealing with, "because you have no idea about either".

A turning point for CVE numbers

Posted Feb 16, 2024 15:49 UTC (Fri) by pizza (subscriber, #46) [Link]

> Sounds like a problem in that company then

Incidentally, that company was that way because *EU regulations required them to be*.

(They laid off my research team on the tail end of a major process/policy revamp brought about by new regulations soon to come into effect. I was made to endure many training sessions about how those new/updated regulations affected every part of the overall product lifecycle, from early design to manufacturing to label placement/content to post-sales support to how end-of-life would be handled)

So it's not "that company's problem" so much as "the problem of any company operating in a regulated space"

A turning point for CVE numbers

Posted Feb 15, 2024 11:43 UTC (Thu) by Wol (subscriber, #4433) [Link] (2 responses)

> ...Any replacement will necessarily need to be roughly equivalent in features and complexity, and even if you completely discount the initial development costs, you're going to still end up with a similar ongoing maintenance burden.

I would beg to differ. How much software - that was perfectly good at doing its job - has been replaced by a FAR inferior solution because one bunch of suits with a big marketing budget schmoozed another bunch of suits with a big spending budget. (Or other things like underhand shenanigans etc etc.)

Okay, there's loads of counter-pressures in place certainly as regards linux, but there's nothing stopping massively inferior solutions driving out far better ones.

Cheers,
Wol

A turning point for CVE numbers

Posted Feb 15, 2024 14:00 UTC (Thu) by pizza (subscriber, #46) [Link] (1 responses)

> Okay, there's loads of counter-pressures in place certainly as regards linux, but there's nothing stopping massively inferior solutions driving out far better ones.

The problem is that in the real world, _hardware_ (and its configurations, and the expectations of the software running on top of it) is more complex than ever, and that requires ever-more-complex software to sanely manage it. Decry that reality all you want, but at the end of the day, reality doesn't care about feelings.

So I stand by my point. You want "simpler/inferior" operating systems? They already exist [1], and it turns out nobody wants to use them, or invest the (considerable!) effort needed to adapt/maintain them for their own needs.

[1] Or rather, existed, having never grown beyond the "academic toy" status or long since confined to the dustbins of history.

A turning point for CVE numbers

Posted Feb 15, 2024 17:24 UTC (Thu) by Wol (subscriber, #4433) [Link]

> [1] Or rather, existed, having never grown beyond the "academic toy" status or long since confined to the dustbins of history.

And how much software, having started out as an "academic toy", is now mainstream despite being unfit for purpose precisely because it's all the CS grads know?

Pretty much all the software I swear BY, was designed and then built. Pretty much all the software I swear AT, was cobbled together and the cracks papered over. Unfortunately, properly designed software is a rarity :-( It's also usually older software which imho is still better in many cases than its modern replacements, which just aren't "fit for purpose".

Even if it's only in the programmer's head, a truth table of all possible options leads to a far better program than a programmer responding "oh I didn't think of that" when faced with an end user pointing out the bleedin' obvious! (And no, I don't expect the first programmer to *implement* all possible options, just the fact that they were considered in the design results in a far better design.)

Cheers,
Wol

A turning point for CVE numbers

Posted Mar 9, 2024 0:50 UTC (Sat) by DanilaBerezin (guest, #168271) [Link] (1 responses)

> It's _vastly_ cheaper to keep using it than to replace it with something else. By multiple orders of magnitude.

This is a pretty large blanket statement that definitely isn't always true. If it were as true as you claim it was, things wouldn't fade into obscurity or ever be replaced. There are plenty of conditions where replacing something is cheaper than continuing to use it. X is a great recent example.

A turning point for CVE numbers

Posted Mar 9, 2024 1:21 UTC (Sat) by pizza (subscriber, #46) [Link]

> This is a pretty large blanket statement that definitely isn't always true.

I didn't claim it was true in a general sense; I only claimed it was true for the Linux kernel.