Much ado about SBAT

By Jonathan Corbet
July 20, 2023

Sometimes, the shortest patches lead to the longest threads; for a case in point, see this three-line change posted by Emanuele Giuseppe Esposito. The purpose of this change is to improve the security of locked-down systems by adding a "revocation number" to the kernel image. But, as the discussion revealed, both the cost and the value of this feature are seen differently across the kernel-development community.

The patch in question adds three lines to the kernel's x86 makefile, the end result of which is to add a new ELF section to the kernel executable, called ".sbat", containing these two lines of text:

    sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
    linux,1,The Linux Developers,linux,$(KERNELVERSION),https://linux.org

Where $(KERNELVERSION) is replaced with the current kernel version number. The first line describes the section as conforming to version 1 of the UEFI Secure Boot Advanced Targeting (SBAT) specification. The second line is the one that matters; it identifies the executable component as "linux", with version number 1 (also called the "generation number"); the rest of the line is just documentation.
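To make the format concrete, the six comma-separated fields of each line can be read with a few lines of Python. This is only a sketch: the field names follow the layout described in the shim project's SBAT.md as we read it, the names `SbatEntry` and `parse_sbat` are invented for illustration, and `6.5.0` is an assumed stand-in for $(KERNELVERSION).

```python
import csv
from typing import List, NamedTuple


class SbatEntry(NamedTuple):
    """One line of a .sbat section (field names are our reading of SBAT.md)."""
    component: str
    generation: int
    vendor_name: str
    vendor_package: str
    vendor_version: str
    vendor_url: str


def parse_sbat(text: str) -> List[SbatEntry]:
    """Parse the CSV text of a .sbat section into entries (hypothetical helper)."""
    entries = []
    for row in csv.reader(text.strip().splitlines()):
        if not row:
            continue
        # Pad short rows so that the documentation-only fields may be omitted.
        row = [field.strip() for field in row] + [""] * (6 - len(row))
        entries.append(SbatEntry(row[0], int(row[1]), *row[2:6]))
    return entries


# The section from the patch, with an assumed kernel version filled in.
SECTION = """\
sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
linux,1,The Linux Developers,linux,6.5.0,https://linux.org
"""

entries = parse_sbat(SECTION)
for entry in entries:
    print(entry.component, entry.generation)
```

Only the first two fields of each entry, the component name and its generation, take part in any boot decision; the remaining four are carried along as documentation.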

The purpose of the SBAT entry is to give the bootloader a way to know whether the given executable is safe to run. Since we are dealing with secure boot, that executable should already be signed by a key known to the system, but it is possible that a security vulnerability has been found in that binary since it was signed. To protect against that possibility, the bootloader will contain a minimum acceptable generation number; if the generation found in the SBAT section is below that number, the bootloader will refuse to boot it. Whenever a vulnerability that will lead to a secure-boot compromise is fixed, the minimum generation will be incremented, with the effect of immediately blocking the loading of any vulnerable versions of the code.
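The comparison itself is simple, which is part of the appeal. The following Python sketch shows the idea under stated assumptions: `may_boot` is a hypothetical name, the dictionaries stand in for the parsed .sbat section and the bootloader's policy, and the choice to ignore components that the image does not contain is an assumption made for the example rather than a description of any real bootloader's behavior.

```python
def may_boot(image_generations, minimum_generations):
    """Return True if no component in the image is below its minimum generation.

    Both arguments map component names to integers (hypothetical layout).
    """
    for component, minimum in minimum_generations.items():
        found = image_generations.get(component)
        # Assumption: a policy entry only constrains components the image carries.
        if found is not None and found < minimum:
            return False
    return True


old_kernel = {"sbat": 1, "linux": 1}
fixed_kernel = {"sbat": 1, "linux": 2}

# Policy after one secure-boot-relevant fix: "linux" must be at generation 2.
policy = {"sbat": 1, "linux": 2}

print(may_boot(old_kernel, policy))    # False
print(may_boot(fixed_kernel, policy))  # True
```

Raising a single number in the policy blocks every older image at once, no matter how many distinct binaries were signed at the lower generation.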

The SBAT idea is not new; it has been supported by the GRUB bootloader since the 2.06 release. The developers of the unified kernel image concept are implementing ways to load a kernel directly, without using a separate bootloader; as a part of this work, they want to add SBAT checks for the kernel image itself. The addition of an SBAT version number to the build is a necessary part of this scheme.

The secure boot mechanism already contains ways to prevent the loading of old, vulnerable images through explicit revocation. There is just one little problem: given the diversity of the software used in the Linux world — and the large number of releases of each — the revocation lists quickly grow larger than the low-level firmware implementing secure boot can handle. SBAT is an attempt to overcome that problem by creating a single number that can be used to determine whether a given image is safe to boot.

Generation-number management

Much of the above was not really explained in the patch posting, which led to a number of questions on the list. Other signs of inattention (the inclusion of a linux.org URL, for example) also raised red flags. It is fair to say that this idea did not get the reception that its backers were hoping for.

At its core, the disagreement was over the management of that version number: how often would it change, and who would be responsible for deciding when it should be changed? Greg Kroah-Hartman suggested that the kernel's version number, which is already included in the image, could be used instead, but Luca Boccassi replied that the version number would not work, since there are too many of them for the system to track. In other words, kernel-version numbers present the same problem that revocation lists do — the problem that SBAT was developed to work around.

Kroah-Hartman later asked a number of questions about how the decision to update the generation number would be made, who would make it, who would change the number in the kernel, and so on. He suggested looking at past kernel history to come up with an idea of how many times that number would have changed over time. Boccassi's answers were not seen as entirely satisfactory; he said that the decision would be made by: "most likely those who understand the problem space". He suggested that the kernel project makes "3 releases a year", and that generation-number changes would be no more frequent than that. Kroah-Hartman replied that he does "a release or two a week across multiple stable kernel versions", rather more than three per year, and repeated his process questions.

Boccassi called the stable releases "irrelevant for the case at hand" and said that the process questions didn't matter: "the question here was about mechanism and storage. And it already works btw, it's just the kernel that's lagging behind, as usual". Kroah-Hartman was not moved:

To think that "let's add a security canary to the kernel image" is anything other than a process issue shows a lack of understanding about exactly how the kernel is released, how the existing kernel security response team works, and who does any of this work. To ignore that means that there is no way in the world this can ever be accepted.

Daniel P. Berrangé did make an attempt to address the process-related questions, saying that generation-number increments would be tied to CVE numbers and would likely be infrequent. He also acknowledged that there are some open questions with regard to how backports to the stable releases would be handled. Kroah-Hartman responded that most security-relevant kernel bugs are never assigned CVE numbers, and that he knows of many examples of bugs that could be used to break secure boot. He also admonished: "as the person running the stable releases, you BETTER be working with me to try to figure this all out".

One might not think that the management of a simple number would be so hard. But the question of when it should be incremented is not trivial. As Kroah-Hartman and others pointed out, the kernel project is fixing bugs that may have security implications almost every day. Nobody knows how often exactly, because monitoring the patch stream for possible security issues is a task that nobody has the resources to keep up with for any extended period of time. It is probably safe to say, though, that almost every mainline kernel release has at least one fix for a bug that could be used to attack secure boot. So perhaps the generation number would simply need to increment for each release.

There is a worse problem, though, in that almost nobody runs mainline releases; instead, most users are running kernels derived from the stable updates. It is far from clear how the mainline generation-number updates should be backported to the stable releases, which happen much more frequently than mainline releases. Each stable release may have a subset of the fixes that were identified as needing generation-number increments in the mainline; how should the generation number be calculated in such cases? If a given fix is not applicable to a specific kernel release, should that number be incremented anyway — thus causing older binaries to fail to boot, even though they lack the vulnerability in question?
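A small Python sketch may help show why a single integer fits poorly here. It is purely illustrative: the fix names are invented, and the rule it applies (a build may only claim a generation if it carries every fix up to that point) is one possible policy chosen for the example, not something proposed in the thread.

```python
# Hypothetical fixes, each of which bumped the mainline generation by one,
# taking it from 1 to 4.
MAINLINE_BUMPS = ["fix-A", "fix-B", "fix-C"]


def honest_generation(applied_fixes):
    """Highest generation whose fixes are ALL present in this build."""
    generation = 1
    for fix in MAINLINE_BUMPS:
        if fix not in applied_fixes:
            break
        generation += 1
    return generation


mainline_build = {"fix-A", "fix-B", "fix-C"}
# In this stable branch, fix-B was not applicable because the buggy code
# was never there, so it was never backported.
stable_build = {"fix-A", "fix-C"}

print(honest_generation(mainline_build))  # 4
print(honest_generation(stable_build))    # 2
```

Under this rule the stable build is stuck at generation 2 and would be refused by a policy demanding 4, even though it is not vulnerable to anything; the alternative, claiming generation 4 anyway, makes the number mean something different on every branch.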

Letting distributors do it

For these reasons and more, it was occasionally suggested that, if such a generation number is to be a part of a kernel build, it should be created and managed by the distributors who are building the kernels. As Ard Biesheuvel put it:

Therefore, I don't think it makes sense for the upstream kernel source to carry a revocation index. It is ultimately up to the owner of the signing key to decide which value gets signed along with the image, and this is fundamentally part of the configure/build/release workflow. No distro builds and signs the upstream sources unmodified, so each signed release is a fork anyway, making an upstream revocation index almost meaningless.

Boccassi's reply, after describing the linux-kernel list as "an open sewer", dismissed this idea as unworkable:

The 'owner of the signing key' is not good enough, because there are many of those - as you know, the kernel is signed by each distro. But the key here is that the revocation is _global_ (again: global means it applies to everyone using shim signed by 3rd party CA), so each distro storing their own id defeats the purpose of that.

If this global number is not stored in the mainline kernel source, he said, somebody would have to maintain an external registry to somehow map generation numbers to points in the kernel's development history.

Paolo Bonzini, though, thought that even a distributor-managed generation number is unworkable: "I'm quite positive that a revocation index attached to the kernel image cannot really work as a concept, not even if it is managed by the distro". That led to another missive from Boccassi stating that the mechanism has been shown to work elsewhere, and that "the kernel is not special in any way":

The only thing that matters is if, given a bug, somebody either observed it being used as a secure boot bypass by bad actors in the wild, or was bothered enough to write down a self-contained, real and fully working proof of concept for secure boot bypass. If yes, then somebody will send the one-liner to bump the generation id, and a new sbat policy will be deployed. If no, then most likely nobody will care, and that's fine, and I expect that's what will happen most of the time.

It is not clear that this approach will satisfy the developers who see the whole mechanism as a sort of security theater.

As of this writing, the discussion appears to be at an impasse, with little mutual understanding between the participants. The proponents of the SBAT mechanism see a way of addressing their revocation problems that only needs an occasional one-line kernel patch to maintain. Longtime kernel developers, though, see a raft of unresolved process issues and strongly doubt that a single integer value can describe the security status of the huge variety of kernels in the wild. The kernel is more complicated than that, as is the security environment it operates in; any sort of global revocation mechanism may have to be as well.


Much ado about SBAT

Posted Jul 20, 2023 15:44 UTC (Thu) by rwmj (subscriber, #5474) [Link] (17 responses)

I just want to emphasize that Emanuele (who works on my team) clearly signposted this as an RFC, both in the heading and in the first paragraph, where he wrote:

*Important*: this is just an RFC, as I am not expert in this area and I don't know what's the best way to achieve this.

So I don't think he deserved the disparaging comments that he got in that thread.

Much ado about SBAT

Posted Jul 20, 2023 16:29 UTC (Thu) by bluca (subscriber, #118303) [Link] (6 responses)

Indeed, the behavior of certain kernel maintainers, who ought to know better, was simply appalling and uncalled for.

Much ado about SBAT

Posted Jul 20, 2023 21:34 UTC (Thu) by Paf (subscriber, #91811) [Link] (5 responses)

This wasn't even half baked, and the article quotes several messages where those proposing it basically say "it's fine, why do you care?". They care because as proposed it's obviously just going to become security theatre. The proposal entirely ignores *the hard parts* of having a validation/invalidation/vulnerability number like this, the things that would have to be done *to make this actually matter rather than be a checkbox feature*. Showing up and getting mad that people care about that ... well, yeah. Dunno what else to say.

The patch as is absolutely trivial. And absolutely without value. The process is what would make this useful as something more than words.

Much ado about SBAT

Posted Jul 20, 2023 23:50 UTC (Thu) by bluca (subscriber, #118303) [Link] (4 responses)

> This wasn't even half baked

And that's why it was titled as 'RFC' and start with: "*Important*: this is just an RFC, as I am not expert in this area and I don't know what's the best way to achieve this."

> They care because as proposed it's obviously just going to become security theatre.

If you think secure boot is 'security theatre' you can do that, you have every right to be wrong at your own expenses. However, this doesn't affect you nor anybody else who doesn't care about the use case, as it's just one line in a makefile that somebody else will look after.

> The proposal entirely ignores *the hard parts* of having a validation/invalidation/vulnerability number like this, the things that would have to be done *to make this actually matter rather than be a checkbox feature*. Showing up and getting mad that people care about that ... well, yeah. Dunno what else to say.

Except of course the 'proposal' already exists and functions for other boot components, along with a process to manage it, and it evidently works, so the other thing to say would have been "maybe I should do the bare minimum research and inform myself before commenting on things I do not know"

Please.

Posted Jul 21, 2023 0:30 UTC (Fri) by corbet (editor, #1) [Link]

Please stop here; surely you can find a way to make your points without engaging in personal attacks against the people who disagree with you?

Or, to put it another way: if you truly dislike the "open sewer" of the linux-kernel mailing list (to use your words), please refrain from peeing in the LWN comment stream.

Much ado about SBAT

Posted Jul 21, 2023 20:37 UTC (Fri) by rgmoore (✭ supporter ✭, #75) [Link] (2 responses)

> And that's why it was titled as 'RFC' and start with: "*Important*: this is just an RFC, as I am not expert in this area and I don't know what's the best way to achieve this."

Sure, but remember what RFC stands for: request for comments. If someone starts out by admitting they're not an expert and asks for help in achieving their goals, it behooves them to listen to the people who are experts when they point out problems with the proposal. That doesn't seem to be what happened here. Instead, the people asking for opinions on their proposal decided to argue with the experts whose opinions they had solicited.

Much ado about SBAT

Posted Jul 22, 2023 10:37 UTC (Sat) by bluca (subscriber, #118303) [Link] (1 responses)

Replying to a thread after a moderator has said "stop here" is not really good practice

Much ado about SBAT

Posted Jul 25, 2023 22:00 UTC (Tue) by SLi (subscriber, #53131) [Link]

I read that comment as saying it to you, not to everyone. And perhaps also as saying it only about you communicating in a certain way.

Much ado about SBAT

Posted Jul 20, 2023 18:09 UTC (Thu) by willy (subscriber, #9762) [Link] (7 responses)

Why didn't you review it internally before sending out a poor quality RFC?

Much ado about SBAT

Posted Jul 20, 2023 18:56 UTC (Thu) by bluca (subscriber, #118303) [Link] (5 responses)

There was nothing 'poor quality' about it

Much ado about SBAT

Posted Jul 20, 2023 18:59 UTC (Thu) by willy (subscriber, #9762) [Link] (4 responses)

Given that both Vitaly and Emanuele have described it as such, you should probably stop defending it as not-such.

eg https://lwn.net/ml/linux-kernel/87wmz33j36.fsf@redhat.com/

Annoying the people who have to apply your patch is not a good start.

Much ado about SBAT

Posted Jul 20, 2023 19:16 UTC (Thu) by bluca (subscriber, #118303) [Link] (2 responses)

There is no such description in the mail you quoted. The reality is that the lkml once again lived up to its standards of being an open sewer. There are people there who are lost causes, but there are also those who ought to know better.

Much ado about SBAT

Posted Jul 20, 2023 19:33 UTC (Thu) by willy (subscriber, #9762) [Link]

Once again you critique the mote in their brother's eye without noticing the beam in your own.

Much ado about SBAT

Posted Jul 28, 2023 15:22 UTC (Fri) by jschrod (subscriber, #1646) [Link]

Then, please stop using the same kind of communication here. Your tone is unwanted.

It doesn't help soliciting favorable sympathies either.

Much ado about SBAT

Posted Jul 20, 2023 19:39 UTC (Thu) by Lionel_Debroux (subscriber, #30014) [Link]

FWIW, I did not feel annoyed by the explicit "(This is not yet ready)" which starts the first message of https://github.com/memtest86plus/memtest86plus/pull/34 , and it looks pretty clear that the other maintainers weren't either.

Yes, I'm aware that memtest86+ isn't a general-purpose OS, and that as such, SB bypasses should be very few and very far between, and therefore, SBAT update handling - if ever needed - shall be a non-issue. The Linux kernel is in a different situation, I fully understand it.
However... there is significant contrast between the way this memtest86+ PR went, and the way this LKML thread (which I did not read entirely) went. Enough that one could argue that there may have been a nicer way to handle the situation on LKML, perhaps ?

Much ado about SBAT

Posted Jul 21, 2023 6:07 UTC (Fri) by pbonzini (subscriber, #60935) [Link]

To be honest it's much more annoying when you get poor quality patches with three internal Reviewed-by.

I don't think the disconnect between the kernel developers and the proposers of the SBAT mechanism could have been bridged by internal discussions.

Much ado about SBAT

Posted Jul 29, 2023 1:06 UTC (Sat) by milesrout (subscriber, #126894) [Link] (1 responses)

I am not going to spend the time to read through the thread. But it is clear even from the small snippets quoted in the article that he approached the issue with a very bad attitude.

>Boccassi called the stable releases "irrelevant for the case at hand" and said that the process questions didn't matter: "the question here was about mechanism and storage. And it already works btw, it's just the kernel that's lagging behind, as usual".

>Boccassi's answers were not seen as entirely satisfactory; he said that the decision would be made by: "most likely those who understand the problem space".

You can't openly display a dismissive and rude attitude to a project and expect to be received with smiles and praises.

Much ado about SBAT

Posted Jul 31, 2023 7:16 UTC (Mon) by idrys (subscriber, #4347) [Link]

Please note that you're mixing up two different people here: Emanuele, who sent the initial RFC, and Luca, whom you're quoting here. The comments from the latter do not imply anything about the attitude of the former...

How does SBAT get updated?

Posted Jul 20, 2023 15:59 UTC (Thu) by nickodell (subscriber, #125165) [Link] (8 responses)

How does the list of revoked versions get updated? To clarify, I'm not asking a process question here. When a vendor decides to revoke e.g. version 10, how does that get sent out to each device's revocation database? Presumably it can't be a part of the signed image itself, because the old versions of each signed image won't have the new revocations. Is it a trust-on-first-use approach?

How does SBAT get updated?

Posted Jul 20, 2023 16:12 UTC (Thu) by danpb (subscriber, #4831) [Link] (7 responses)

According to the walkthrough example in:

https://github.com/rhboot/shim/blob/main/SBAT.example.md

the UEFI CA has the responsibility of updating the global SBAT revocation list used for policy enforcement. The SBAT mechanism is intended as a way to replace the current global DBX revocation list (which contains hashes or certs), and updates of DBX are issued from the UEFI CA. The DBX updates can get onto actual machines in a variety of ways - either when the UEFI firmware update is delivered, or separately via fwupd simply updating the DBX data on its own. Conceptually SBAT would be delivered the same way in the long term.

So while it isn't stated explicitly, my interpretation is that the vendors update the SBAT record in their signed binaries as needed, and inform the UEFI CA of the issues fixed, and the UEFI CA can then periodically decide to issue an updated SBAT revocation list. AFAICT though, none of this has actually happened in practice yet. I see Fedora grub EFI binaries getting updated SBAT, but no sign of revocations of the old versions, though it's possible I'm just not looking hard enough.

How does SBAT get updated?

Posted Jul 21, 2023 7:39 UTC (Fri) by juliank (guest, #45896) [Link] (6 responses)

Revocations are hard-coded in the shim, there isn't a mechanism to distribute SBAT revocations out-of-band yet, which is very much in progress. There are two levels of revocations, latest and previous (which probably should be renamed policy, default or something). The default revocations revoke components two levels behind basically, such that if the new boot loader fails, you can still recover with the old one (then after you verified it works, you can `mokutil --set-sbat-policy latest` to force the latest SBAT).

How does SBAT get updated?

Posted Jul 21, 2023 8:58 UTC (Fri) by james (subscriber, #1325) [Link] (5 responses)

Is there a process to prevent roll-back of the shim?

How does SBAT get updated?

Posted Jul 21, 2023 9:09 UTC (Fri) by bluca (subscriber, #118303) [Link] (4 responses)

DBX - and yes the UEFI spec should support all of this natively, but that moves at glacier speed

How does SBAT get updated?

Posted Jul 28, 2023 3:54 UTC (Fri) by jacmet (subscriber, #19734) [Link] (3 responses)

So does this mean that an effective SBAT update requires:

1: bugfix in kernel and new release with incremented SBAT number
2: updated distribution kernel release based on that with updated SBAT number
3: updated SBAT rollback info in shim and new shim release
4: updated shim distribution binary signed by Microsoft
5: updates installed on end machines

That seems like a fairly heavy/slow process to me.

How does SBAT get updated?

Posted Jul 28, 2023 3:58 UTC (Fri) by jacmet (subscriber, #19734) [Link] (1 responses)

Plus the DBX update to blacklist the old shim. This presumably means there will be a lot of shim updates, doesn't that lead to the same DBX capacity problems as blacklisting the kernels directly?

How does SBAT get updated?

Posted Jul 28, 2023 8:24 UTC (Fri) by mjg59 (subscriber, #23239) [Link]

The number of signed shims is *way* lower than the number of kernels, but it's also viable for sbat updates to be delivered out of band and for shim to pay attention to those (eg, just store the SBAT in a boot services-only variable, provide a MOK method for an authenticated physically present user to delete them in order to restore to factory state and permit rollback, have the components in the boot chain compare the next component against whatever's present, have updates to that gated through shim and ensure that they're only moving things forward and (if desired) are signed by the distro who owns that shim)
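As an illustration only (this is not code from shim, and not from the comment's author), the "only moving things forward" rule for out-of-band updates can be sketched in Python. The name `apply_update` and the dictionary layout are hypothetical, and the signature check and the physically-present-user reset path mentioned above are deliberately left out.

```python
def apply_update(current, proposed):
    """Merge a proposed minimum-generation policy, refusing any rollback.

    Both arguments map component names to minimum generations
    (hypothetical layout). Signature verification is not modeled here.
    """
    for component, generation in proposed.items():
        if generation < current.get(component, 0):
            raise ValueError(f"rollback refused for {component!r}")
    merged = dict(current)
    merged.update(proposed)
    return merged


stored = {"shim": 2, "grub": 3, "linux": 1}

# A forward-only update is accepted and merged into the stored policy.
stored = apply_update(stored, {"linux": 2})
print(stored)

# An update that would lower a minimum is rejected outright.
try:
    apply_update(stored, {"grub": 2})
except ValueError as err:
    print(err)
```

Checking the whole proposal before merging anything means that a partially valid update cannot leave the stored policy half-applied.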

How does SBAT get updated?

Posted Jul 28, 2023 16:59 UTC (Fri) by wtarreau (subscriber, #51152) [Link]

plus adding to that that sometimes complex fixes for such issues will not flow as fast into older kernels, you will end up with an unclear period where you need to support older branches with their older unfixed version and newer ones as well. That's where it derived into "maybe we could have one number per branch" and then this becomes even more complicated.

Much ado about SBAT

Posted Jul 20, 2023 16:25 UTC (Thu) by gray_-_wolf (subscriber, #131074) [Link] (29 responses)

One thing (well, out of many, but it is the one I want to ask about) is how does the single number in the upstream kernel (let's pretend stables are not a thing now) play together with distributions doing their own builds with their own set of configurations. Would, for example, a Debian kernel that has some CONFIG_FOO=n be invalidated and unbootable if the upstream kernel increased the SBAT number due to a vulnerability only present when CONFIG_FOO=y? My understanding (based on skimming the thread) is yes; is that a correct assertion? It does not seem to make much sense.

Much ado about SBAT

Posted Jul 20, 2023 16:30 UTC (Thu) by bluca (subscriber, #118303) [Link] (28 responses)

This is explicitly covered in the protocol and described in the example:

https://github.com/rhboot/shim/blob/main/SBAT.example.md

Much ado about SBAT

Posted Jul 20, 2023 16:49 UTC (Thu) by gray_-_wolf (subscriber, #131074) [Link] (27 responses)

So if I get it right, fedora would produce kernels with something like:

sbat,1
linux,1
linux.fedora,1

And if there is a problem with CONFIG_FOO=y, linux would bump the number to 2, so fedora would produce

sbat,1
linux,2
linux.fedora,2

While debian

sbat,1
linux,2
linux.debian,1

And `the UEFI CA issues an update to SBAT` would be:

sbat,1
linux,2
linux.debian,1
linux.fedora,2

Is that correct? That seems like it will be... fun to keep up to date, I guess?

Much ado about SBAT

Posted Jul 20, 2023 16:58 UTC (Thu) by WolfWings (subscriber, #56790) [Link] (20 responses)

linux.fedora wouldn't update their number unless they made changes to their patches or configuration settings.

If linux.fedora's changes were the cause of a vulnerability, then linux.fedora's number would increment entirely independently of linux in your example case.

When you carry forward upstream's SBAT lines you can benefit from their generation epochs.

Much ado about SBAT

Posted Jul 20, 2023 23:01 UTC (Thu) by cquike (subscriber, #107549) [Link] (11 responses)

What happens if the distribution option CONFIG_FOO=n actually "disables" the vulnerability? The distribution kernel in that case is not affected but still the SBAT data in the firmware would say that that linux,1, for instance, is vulnerable and the system will not boot, which IMHO is the wrong behaviour.
Is that understanding correct?

Much ado about SBAT

Posted Jul 20, 2023 23:52 UTC (Thu) by bluca (subscriber, #118303) [Link] (10 responses)

Sure, there's a number of hypothetical corner cases that one might imagine where a revocation that doesn't affect everybody might go out. Likelihood is low enough that it's not really worth worrying about. There are more important aspects to optimize for.

Much ado about SBAT

Posted Jul 21, 2023 10:19 UTC (Fri) by dvdeug (guest, #10998) [Link] (9 responses)

That there is a security hole in some experimental or niche code not included in most distros' kernels seems a nigh certainty, not hypothetical.

Much ado about SBAT

Posted Jul 21, 2023 15:39 UTC (Fri) by bluca (subscriber, #118303) [Link] (8 responses)

If it's certain, can you point to any occasion where any of that has been used to exploit a system?

Much ado about SBAT

Posted Jul 21, 2023 20:45 UTC (Fri) by dvdeug (guest, #10998) [Link] (6 responses)

Eleven days ago, CVE-2023-32250 and CVE-2023-32254 were issued on bugs in the Linux kernel's ksmbd, an in-kernel SMB server. At the end of June, CVE-2023-3338 was issued on a bug in the DECNet implementation. Actively exploited, I don't know, but I'm sure there are.

Much ado about SBAT

Posted Jul 21, 2023 20:53 UTC (Fri) by bluca (subscriber, #118303) [Link] (5 responses)

How are those related to UEFI?

Much ado about SBAT

Posted Jul 21, 2023 21:23 UTC (Fri) by mjg59 (subscriber, #23239) [Link] (3 responses)

If you're able to trigger them locally you could overwrite kernel state and kexec into a backdoored kernel (or even Windows) without violating any secure boot controls.

Much ado about SBAT

Posted Jul 22, 2023 9:00 UTC (Sat) by WolfWings (subscriber, #56790) [Link] (2 responses)

I feel like that's a core disconnect here honestly.

The RFC was expecting that "once you hit /init it's no longer relevant" which would indeed exclude the supermajority of CVEs from ever incrementing the counter.

But the reality is this: Secure Boot carries forward long past the boot phase into userspace, and if that has a vulnerability that allows sufficient escalation then to the user it is functionally identical to any other secure boot compromise:

Suddenly the OS isn't the one you signed keys and did all the security hoopla for.

Which yes, means that the SBAT scope for "the ENTIRE kernel" includes parts WAY outside of the "boot" portion of things. It also ends up including everything callable from userspace including second-tier stuff like NFS that requires networking and filesystems first, etc.

Maybe a more narrowly defined initial record for 'linux-pre-init' as a component name or something instead that's just the UEFI-stub-to-/init codepath fits what the initial submission expected for a rarely-changing codepath.

But the full 'linux' meta-component would have to update its counter rather frequently so X-1 fallback would be woefully insufficient and result in 'functionally bricked' systems for various degrees of brickage.

Much ado about SBAT

Posted Jul 22, 2023 10:39 UTC (Sat) by bluca (subscriber, #118303) [Link] (1 responses)

There is already such a boundary, and it's what we are largely focusing on right now: ExitBootServices. It provides a clear boundary, and what comes before is clearly and unequivocally 'high value', so in my view it's a great starting point for this.

Much ado about SBAT

Posted Jul 27, 2023 4:37 UTC (Thu) by WolfWings (subscriber, #56790) [Link]

And reading the threads on LKML I think that was the disconnect at the time of this article at least.

Took me a while to realize that; the kernel devs were largely approaching it as "Okay, it's named linux, so you mean the whole kernel then?" thus my suggestion to make it more clearly just the early-boot region that the epoch is covering with a more verbose and narrowly defined name. I think it was causing cognitive overload at times in attempting to use the shortest possible component name.

Coming back with "Any security flaw in the codepath from initial execution up until UEFI ExitBootServices is called" possibly with the modifier "...that allows unsigned code to be executed." as a recurring phrase for "What increments the epoch?" instead of variants on "Whatever those in charge decide." might have had a better cooling effect, and kept the scope much more narrowly defined. Requiring it to only be of concern for signed code would also mean for example unsigned-module mode might be entirely out of scope, etc.

But this is my years of dealing with PCI-DSS/HIPAA/etc compliance audits coming to the fore: Defining the scope as narrowly as possible is a huge part of passing security audits in the real world, and that seems to apply to keeping the epoch number from skyrocketing and becoming real-world-worthless.

Much ado about SBAT

Posted Jul 24, 2023 8:41 UTC (Mon) by paulj (subscriber, #341) [Link]

If you can trigger a bug relatively reliably during or after boot - be it by network, by crafted filesystem exploit, by some startup script, etc. - you get control. It doesn't matter how much you've locked down the firmware.

I'm pretty sure we've had some articles recently containing discussions about the impossibility of securing filesystems from bugs and exploits, and the need to restrict which user-provided FSes the kernel will mount...

Much ado about SBAT

Posted Jul 22, 2023 0:36 UTC (Sat) by geofft (subscriber, #59789) [Link]

I think for the most part people aren't interested in building exploits for these because Secure Boot attacks are pretty low value - especially because, at least as of last time I checked, there's a signed Ubuntu shim/GRUB which is happy to load arbitrary kernels. (I actually have a laptop running an old version of Debian with Ubuntu's signed shim/GRUB...) But it would be pretty easy to automate the process of turning root -> ring 0 escalations (which are much more common than non-root -> ring 0 escalations!) into bypasses of kernel_lockdown(7), if that would be interesting to anyone.

Specifically - I think this conversation would be more productive if you could define what exactly counts as an exploit that SBAT should protect against / that would merit bumping the SBAT generation number. My understanding of Secure Boot is there are two ways to think about this:

1) An attacker should not be able to produce a bootable system image using other people's signed kernels/bootloaders/shims/etc. that gets them arbitrary code execution in ring 0 before ExitBootServices() has been called.

2) An attacker should not be able to produce a bootable system image using other people's signed kernels/bootloaders/shims/etc. that gets them arbitrary code execution in ring 0 at any time.

The second one is what kernel_lockdown(7) is meant to protect against. But that's a more stringent policy than I think is really possible to achieve at least in the MS UEFI ecosystem. As mentioned, there are the Ubuntu shim/GRUB binaries, and also Windows itself hasn't been treating root -> ring 0 as a security issue (see e.g. http://www.github.com/ionescu007/r0ak which is a worked example of a driver allowing kernel read/write).

If the second definition is what you're actually trying to solve (and I might be out of date here and maybe the Ubuntu shims are now revoked?), it would be worth clearly documenting that, and then people can start producing those exploits. The ksmbd vulnerability mentioned in a sibling comment could quite easily be weaponized. But also a whole lot of other things that have the benefit of starting from a privileged position, like root in the initramfs, are interesting too.

If you're going for the first definition, I think it's worth documenting that too, because there's relatively little code that runs before ExitBootServices() - there is some attacker-controlled stuff (command line parsing, initrd loading but not mounting), so an exploit isn't out of the question, but there's very little of it. And that might help the kernel folks get more comfortable with the idea of maintaining a generation number.

Or maybe you're going for some other definition which is not obvious to someone who does pay attention to Secure Boot, which means it'll be even less obvious to the average LKML reviewer.

(Or maybe the practical threat model here is that you want to make sure that attackers can't run code even in ring 3 as root, which means you can't trust the MS UEFI ecosystem and you have to sign all your own kernels/UKIs with your own key that you enroll yourself in DB and KEK... at which point you want to maintain the SBAT generation on your own and not trust anyone else to do it!)

To be clear - I am absolutely on board with criticizing the LKML crowd's refusal to acknowledge what sort of bugs are security bugs and its treatment of e.g. adding support for a new PlayStation controller's buttons as an "All users must upgrade." stable release that datacenter sysadmins need to care about just as much as they care about remote code execution in kernelspace. But I think that not clearly defining the security goals of SBAT / Secure Boot and saying you're only interested in vulnerabilities with known exploits risks making essentially the same mistake. It's silly to say that a vulnerability with a PoC that gets arbitrary code execution doesn't matter to you because nobody has constructed an exploit against your particular use case: they clearly could, and the only people who benefit from the obscurity are black hat attackers. A bug doesn't become unexploitable by refusing to mention the CVE in the commit message; neither does a bug become unexploitable by pointing out that there are no public exploits.

Much ado about SBAT

Posted Jul 21, 2023 7:43 UTC (Fri) by juliank (guest, #45896) [Link] (7 responses)

Notably even if a vendor was bumped before, the bump of the generic linux entry would cause the per-vendor revocations to go away as they become obsolete. Hence why it's not an ever growing list and works a lot better than dbx.

so if you had revocations

linux.fedora,2
linux.debian,2

and linux,1 was the latest SBAT

linux gets an update, we drop the linux.<anything> and only insert the linux one:

linux,2

Then fedora gets another issue and we have

linux,2
linux.fedora,3

Then linux gets another issue and we have

linux,3
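
The sequence above can be sketched in a few lines of Python. This is only an illustration of the pruning rule as described in this comment: the dict representation and the `bump` helper are hypothetical, not how shim actually stores or updates its revocation data.

```python
def bump(revocations, component, generation):
    """Return a new revocation list with `component` raised to `generation`.

    Assumption for illustration: bumping a parent entry such as "linux"
    drops every "linux.<vendor>" entry, because those become obsolete.
    """
    prefix = component + "."
    updated = {name: gen for name, gen in revocations.items()
               if not name.startswith(prefix)}
    updated[component] = generation
    return updated


revs = {"linux": 1, "linux.fedora": 2, "linux.debian": 2}
revs = bump(revs, "linux", 2)          # vendor entries are dropped
revs = bump(revs, "linux.fedora", 3)   # Fedora-specific issue
after_fedora = dict(revs)
revs = bump(revs, "linux", 3)          # another upstream issue
```

The list never holds more than one entry per component, which is the property that keeps it from growing the way dbx does.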

Much ado about SBAT

Posted Jul 21, 2023 10:13 UTC (Fri) by gray_-_wolf (subscriber, #131074) [Link] (6 responses)

What if fedora decides to fix a bug A as 3 while linux bug B as the same 3? They might very well do it by accident. How is this going to be prevented? Or, how (if ever) will the numbers be reconciled (there are many distributions, so I assume it will happen sooner or later)?

Much ado about SBAT

Posted Jul 21, 2023 10:36 UTC (Fri) by bluca (subscriber, #118303) [Link] (5 responses)

That's why revocations are centralized, and managing generation IDs also needs to be centralized, rather than have everyone do it on their own

Much ado about SBAT

Posted Jul 21, 2023 13:11 UTC (Fri) by zdzichu (subscriber, #17118) [Link] (4 responses)

How exactly would centralisation solve the problem presented by gray_-_wolf?

Much ado about SBAT

Posted Jul 21, 2023 15:39 UTC (Fri) by bluca (subscriber, #118303) [Link] (3 responses)

Because everybody will know what the upstream product gen ID is

Much ado about SBAT

Posted Jul 22, 2023 16:32 UTC (Sat) by zdzichu (subscriber, #17118) [Link] (2 responses)

Ok, so let's assume everybody is at gen 42. Now, two new vulnerabilities get published, CVE-2025-001 and CVE-2025-002.
Fedora backports the fix for 001 in their kernel. Arch backports the fix for 002. They need to get a new generation ID for their patched kernels, so they ask the central authority.

What gen numbers will they get in answer?

Much ado about SBAT

Posted Jul 22, 2023 16:53 UTC (Sat) by excors (subscriber, #95769) [Link] (1 responses)

That sounds easy: the central authority decides on a total order for vulnerabilities. If they pick 001<002, then they say you can sign as gen 43 if you're not vulnerable to 001, and you can sign as gen 44 if you're vulnerable to neither 001 nor 002. Arch will need to fix 001 (or determine they were never vulnerable to it, e.g. if it was a Fedora-specific issue) so they can sign as a new generation, before gen 42 gets revoked.
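
As a rough sketch of that rule in Python (the function name and data layout are made up for illustration; nothing here is taken from an actual SBAT process): a kernel may claim generation base + k only if it has fixes for the first k vulnerabilities in the authority's chosen order.

```python
def allowed_generation(base_gen, ordered_vulns, fixed):
    """Highest generation a kernel may claim.

    `ordered_vulns` is the central authority's total order (hypothetical).
    Generation base_gen + k requires fixes for the first k entries.
    """
    gen = base_gen
    for vuln in ordered_vulns:
        if vuln not in fixed:
            break  # a gap in the order stops the count here
        gen += 1
    return gen


order = ["CVE-2025-001", "CVE-2025-002"]
fedora = allowed_generation(42, order, {"CVE-2025-001"})
arch = allowed_generation(42, order, {"CVE-2025-002"})
both = allowed_generation(42, order, {"CVE-2025-001", "CVE-2025-002"})
```

With this ordering Fedora can sign as gen 43, while Arch stays at gen 42 until it also deals with 001, at which point it can go straight to gen 44.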

Much ado about SBAT

Posted Aug 4, 2023 8:43 UTC (Fri) by ghane (guest, #1805) [Link]

My understanding is that Google does this, releasing two checkpoints every month for certified Android.

Assume there are two sets of exploits, set A (which Google feels Handset OEMs can fix by themselves quickly) and set B (which need blobs from Qualcomm, etc). These are called (eg) 01 July 2023 and 05 July 2023 (the 1 and 5 are constant).

If you fix set A, you roll out the update, and claim "1 July 2023". You then have time to talk to upstream chip manufacturers, and at some time roll out "5 July 2023".

You cannot claim 1 Aug 2023, (or any future version) however, till 5 July 2023 has been included.

So from a user (and app developer) point of view, any "1" date means you are no more than 1 month behind that month, and a "5" means that you are current till that month.

Much ado about SBAT

Posted Jul 21, 2023 1:13 UTC (Fri) by NYKevin (subscriber, #129325) [Link] (5 responses)

It is entirely possible that Linus or somebody else explicitly decides that this is out of scope for Linux (which, after all, is "just" a kernel, not a full OS) and then we *only* have linux.fedora, linux.debian, etc. Each distro would then be responsible for bumping the number as applicable, regardless of whose "fault" the vulnerability is. They are already in a position to do that anyway, since most of them already track kernel vulnerabilities in order to provide timely security updates.

(I have no idea if the standard allows that, but if the standard is unwilling or unable to accommodate Linux, then any standard-conforming hardware will become effectively unsalable to datacenters and Android manufacturers, so I imagine they would find a way to make it work if it currently does not. Worst case, we would just have linux,1 forever and there would never be a linux,2.)

Much ado about SBAT

Posted Jul 21, 2023 8:35 UTC (Fri) by taladar (subscriber, #68407) [Link]

> Worst case, we would just have linux,1 forever and there would never be a linux,2

At that point we could certainly talk about security theatre though.

Much ado about SBAT

Posted Jul 21, 2023 9:12 UTC (Fri) by bluca (subscriber, #118303) [Link] (3 responses)

No, there would still be a global gen id, it would just be much more painful to maintain. I'll let you figure out how pleased distribution maintainers would be with upstream maintainers if that were to happen - not that I am under any illusion they care one bit.

Much ado about SBAT

Posted Jul 22, 2023 3:16 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (2 responses)

Well, in this hypothetical (which I should stress has not actually happened (yet)), it would be a discussion between the distros and whoever is pro-SBAT. Frankly, I'm having a hard time imagining, say, Debian wanting to spend a lot of time on something like this, so it's possible they give SBAT a "no" as well. I suppose it's possible that the SBAT people try to write off the "uncooperative" distros, but I should like to think that there is enough Debian in datacenters to give hardware manufacturers pause about that course of action.

Ultimately, the problem here is that somebody has to bell the cat, and at least historically in the FOSS community, things usually work out best if the person belling the cat is the person who actually wants the cat to wear a bell in the first place. The alternative just leads to the Nebraska problem,[1] which would be very bad in this case as it could leave a large amount of hardware suddenly unbootable.

[1]: https://xkcd.com/2347/

Much ado about SBAT

Posted Jul 22, 2023 4:59 UTC (Sat) by plugwash (subscriber, #29694) [Link]

I'm no expert, but my understanding is that the underlying issues are:

1. If we want our Linux distros to keep booting out of the box on new computers, then we have to play along with the secure boot "ecosystem", which is ultimately controlled by Microsoft.
2. UEFI has very limited space for storing revocations; a single revocation incident consumed around half the available revocation space. It's not at all clear to me what MS would do if revocation space ran out, but it may well not be nice for Linux.

So it was determined that a more efficient revocation scheme was needed, one that could basically say "any version of software package "x" that is older (in security fix terms) than "y" is vulnerable" rather than having to revoke images individually.
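
A minimal Python sketch of that kind of check, assuming the two-line `.sbat` format quoted in the article (component name in the first field, generation in the second). The helper names, the kernel version string, and the choice to ignore components missing from the image are assumptions for illustration, not shim's actual implementation.

```python
import csv


def parse_sbat(text):
    """Map component name -> generation from SBAT CSV text."""
    entries = {}
    for row in csv.reader(text.strip().splitlines()):
        if len(row) >= 2:
            entries[row[0]] = int(row[1])
    return entries


def may_boot(image_sbat, minimums):
    """Refuse the image if any listed component is below its minimum."""
    image = parse_sbat(image_sbat)
    return all(image[name] >= min_gen
               for name, min_gen in minimums.items()
               if name in image)


# Example section; the version "6.5.0" is a placeholder.
KERNEL_SBAT = """\
sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md
linux,1,The Linux Developers,linux,6.5.0,https://linux.org
"""
```

Here `may_boot(KERNEL_SBAT, {"linux": 1})` accepts the image, and raising the minimum to `{"linux": 2}` rejects it along with every other image still carrying `linux,1`, which is the single-entry revocation the scheme is after.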

Much ado about SBAT

Posted Jul 22, 2023 10:44 UTC (Sat) by bluca (subscriber, #118303) [Link]

> I'm having a hard time imagining, say, Debian wanting to spend a lot of time on something like this, so it's possible they give SBAT a "no" as well.

No, Debian is involved too. Not only myself, but the maintainers of the bootloader stack are not only fully aware but also actively working on this, both downstream and upstream in the Shim community.

Much ado about SBAT

Posted Jul 20, 2023 16:55 UTC (Thu) by smurf (subscriber, #17840) [Link] (15 responses)

The idea of "no single serial number can possibly work" has been around since the first distributed versioning system was invented, umpteen years ago.

Thus if I had to boil the kernel version down to a single integer, I'd use the number of minutes since 2023-07-01, or something along these lines.

Much ado about SBAT

Posted Jul 20, 2023 17:35 UTC (Thu) by NYKevin (subscriber, #129325) [Link] (9 responses)

A constantly-updating number would imply that every kernel gets revoked as soon as the next release happens (if not sooner). The folks running old, but supported kernels would probably not appreciate that.

Much ado about SBAT

Posted Jul 20, 2023 18:25 UTC (Thu) by excors (subscriber, #95769) [Link] (8 responses)

As far as I can tell, installing a new kernel would not automatically revoke older generations - revocations are applied by the system administrator (like https://support.microsoft.com/en-gb/topic/microsoft-guida...) or through Windows Update etc. (This differs from other anti-rollback mechanisms which simply refuse to install or boot an update if its generation number is lower than the one previously installed, like https://android.googlesource.com/platform/external/avb/+/...)

I presume the process is that someone identifies a serious bug that can be used to undermine Secure Boot, they make sure it's fixed on all interesting branches, they mark the current point in time, they get everyone who signs kernels to promise they're never going to sign any kernel without that bug fix, then they ask Microsoft to revoke all versions before that point. In that case, I don't think it makes a fundamental difference whether the point is identified by e.g. last CommitDate in minutes since epoch, or by a global generation number that was incremented at the same time as the bug was fixed on all branches. The latter just means the coordination between all the parties is recorded in the kernel repository, whereas the former means the kernel developers don't have any involvement and the coordination happens in some less-visible external forum and there's more chance of mistakes.

Much ado about SBAT

Posted Jul 20, 2023 19:41 UTC (Thu) by smurf (subscriber, #17840) [Link] (7 responses)

On the other hand, with a time-based scheme you can simply say "you can't boot kernels older than a month" and be done with it, no process required, formal or otherwise.

Also, my impression is that many of those fixes are applied more-or-less silently, so nobody is going to start a formal process that points a big fat red arrow at a particular hole.

Also², what happens when two of these formal processes run in parallel, or the resolution of one of them overtakes the other? Resolving these questions, plus then running the formal process itself whenever that's required (who decides?), is going to cost a nontrivial amount of work time = money.

Much ado about SBAT

Posted Jul 20, 2023 20:32 UTC (Thu) by bluca (subscriber, #118303) [Link] (3 responses)

That is obviously not really feasible. The timestamp is irrelevant, what matters is whether there is an observed vulnerability or not.

Much ado about SBAT

Posted Jul 21, 2023 6:05 UTC (Fri) by smurf (subscriber, #17840) [Link] (2 responses)

Of course it's feasible. Grab the top commit's timestamp, convert to minutes, add to SBAT record, compile, install, boot, verify it's working, tell your boot loader not to allow booting any kernels with a lower number. Done.
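
For what it's worth, the conversion step might look like this in Python. It is a sketch only: the 2023-07-01 epoch is the one suggested earlier in this thread, the function name is made up, and the input is assumed to be the committer timestamp in Unix seconds (for example what `git log -1 --format=%ct` prints).

```python
from datetime import datetime, timezone

# Epoch suggested earlier in the thread; any fixed date would do.
EPOCH = datetime(2023, 7, 1, tzinfo=timezone.utc)


def generation_from_commit_time(unix_seconds):
    """Whole minutes between EPOCH and the given commit timestamp."""
    commit = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return int((commit - EPOCH).total_seconds() // 60)
```

The result would then be written into the generation field of the `linux` SBAT line at build time.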

The question is whether you can reliably capture all relevant security violations with that. Well, no you can't, not reliably. I know that. The point is that there does not seem to be a way to reliably increment the number every time a problem is found *and* make sure that all those fixes are back- and crossported (otherwise the increment won't mean anything) *and* not point big fat red arrows at security problems, which would happen if the same commit increments the number in the SBAT. (And if it doesn't you wouldn't know what to backport in order to keep the guarantee.)

So if there's a choice between a zero-percent solution (not do the SBAT thing at all), a 100% solution that is costly and nobody is going to implement, and a 95% solution that doesn't really cost anything, I advocate for the latter.

Much ado about SBAT

Posted Jul 21, 2023 9:16 UTC (Fri) by bluca (subscriber, #118303) [Link] (1 responses)

No, it is not sensible, because the revocation would have to be incremented and deployed every month for no reason at all, which is just silly busywork. The stable backport issue is a made up strawman - it doesn't matter from which branch or which version number it has, either it's vulnerable to the identified issue or it isn't. If it's vulnerable it stops booting until fixed and bumped, it's as simple as that.

Much ado about SBAT

Posted Jul 23, 2023 21:03 UTC (Sun) by danielthompson (subscriber, #97243) [Link]

Where does the busy work come from?

"the revocation would have to be incremented" means updating the minimum accepted generation number (revoking everything older than *this* number), right? If so, there is no good reason to update the minimum accepted generation number simply because an upstream component increases a time-based generation number.

Much ado about SBAT

Posted Jul 22, 2023 3:27 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (2 responses)

To my understanding, no, you can't do that. Or at least, you can't do that, assuming that I'm understanding your proposal correctly:

1. Revoke all old kernels, scoped to some organization, so the IT department can just tell everyone "you have to use the latest release, no exceptions." This is what I think you are proposing, and it makes logical sense.
2. Revoke all old kernels, globally, and everyone using SBAT enforcement is forced to comply. This is how the technology is meant to be used (at least as far as I can tell), and it is ridiculous, so presumably not what you meant.

In short, these version numbers are not (apparently) intended for enforcing generic IT policies, and you cannot just "roll your own" SBAT policy (at least as far as I have been able to determine). They are intended as a more efficient means of global certificate/signature revocation, and are inherently unscoped. See https://github.com/rhboot/shim/blob/main/SBAT.md for more discussion of motivations etc.

Much ado about SBAT

Posted Jul 22, 2023 3:32 UTC (Sat) by NYKevin (subscriber, #129325) [Link] (1 responses)

> and you cannot just "roll your own" SBAT policy (at least as far as I have been able to determine).

To be more explicit: I believe that this is impossible, because I believe that your SBAT policy would have to be signed by one of the keys that your firmware trusts, and the keys that your firmware trusts are the same as the keys that everybody else's firmware trusts, so if you could just make up and sign an arbitrary policy, you could trivially bypass the whole scheme.

Much ado about SBAT

Posted Jul 22, 2023 6:10 UTC (Sat) by mjg59 (subscriber, #23239) [Link]

It's absolutely possible for enterprises to enroll their own keys (and to have their own keys enrolled at the VAR step of device provisioning), and any such enterprise could impose their own policy.

Much ado about SBAT

Posted Jul 20, 2023 20:00 UTC (Thu) by flussence (guest, #85566) [Link] (4 responses)

A Modest Proposal: simply grep the entire kernel and git log at build time for strings of the form "CVE-(\d+)-(\d+)", numify them ($0 * 100000 + $1), and use the maximum.
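
In Python, the proposal comes out roughly as below. This is a sketch of the same arithmetic, with a made-up function name; it assumes the text to scan (source plus `git log` output) has already been collected into one string. Note that a CVE sequence number of six or more digits would spill into the year part with this multiplier.

```python
import re

CVE_RE = re.compile(r"CVE-(\d+)-(\d+)")


def cve_generation(text):
    """Largest year * 100000 + sequence over all CVE IDs found, or 0."""
    return max((int(year) * 100000 + int(seq)
                for year, seq in CVE_RE.findall(text)),
               default=0)
```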

Much ado about SBAT

Posted Jul 20, 2023 20:12 UTC (Thu) by geofft (subscriber, #59789) [Link] (3 responses)

That would work a lot better if the kernel did not have an unwritten policy of deliberately not mentioning CVEs or exploitability in commit messages.

See, for instance, the commit message fixing StackRot, a vulnerability so significant it got a cutesy name and not just a CVE: https://git.kernel.org/linus/9471f1f2f502

Much ado about SBAT

Posted Jul 20, 2023 21:46 UTC (Thu) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

> That would work a lot better if the kernel did not have an unwritten policy of deliberately not mentioning CVEs or exploitability in commit messages.

Yes, this remains a troublesome approach. A lot of other security efforts in the kernel have improved including the reception of hardening patches and maybe Rust will help in the future. Hopefully at some point they stop actively hiding it when they are aware of the issue ahead of time.

Much ado about SBAT

Posted Jul 20, 2023 23:43 UTC (Thu) by bluca (subscriber, #118303) [Link]

It's downright irresponsible, and gives bad actors a head start.

Much ado about SBAT

Posted Jul 21, 2023 0:46 UTC (Fri) by willy (subscriber, #9762) [Link]

The decision about whether to name a vulnerability is the security researcher's decision, and does not reflect a finding of significance (who would have the authority to make such a finding, anyway?)

Much ado about SBAT

Posted Jul 20, 2023 18:06 UTC (Thu) by caliloo (subscriber, #50055) [Link] (2 responses)

What I see from what’s reported is a solution used in another context (that of proprietary OS editors I suppose) trying to be applied to the kernel development context. I don’t see how it would possibly work unless tackled at the distributor level, and even then… Only the final shipper of binary to a machine can take the responsibility of bricking a device because it doesn’t have the right version number.

To think that having a number maintained at the root of development is a solution is to be ignorant of how dev is done with Linux and Linux distros imho.

Software is a socio-technical construct, especially when it comes to security, and to only focus on the technical for such a thing is the patch author’s mistake. It’s doubtful that the socio side of this patch can be made to work with the proposed technical form. I understand the reaction of the maintainers, although perhaps not the form of their refusal.

Much ado about SBAT

Posted Jul 20, 2023 21:37 UTC (Thu) by Paf (subscriber, #91811) [Link]

This is well said. The social part - how and when this is updated, since there is clearly not a straightforward technical way to decide that - is what gives this actual value.

Much ado about SBAT

Posted Jul 21, 2023 8:43 UTC (Fri) by taladar (subscriber, #68407) [Link]

I would argue that it is even a technical problem.

It is a fact that a security vulnerability's affected versions are usually of the form

introduced in version x and fixed in version x+n

introduction backported to version x-1.k and fixed in version x-1.k+m

(repeat that second one as often as necessary)

How is that tree structure of development and release branches with LTS supposed to be mapped to a single monotonically increasing integer?

Not to mention downstream patches.

And then there is the reality that, out of all the possible combinations of CONFIG options, the vast majority of vulnerabilities only ever affect a subset.

Not sure how broken secure boot is exactly in terms of being able to defeat it by bugs in any random subsystem of the payload it boots but even if you remove all the combinations of CONFIG options that make no difference it would lead to a lot of false positives being blocked.

Much ado about SBAT

Posted Jul 21, 2023 0:56 UTC (Fri) by DemiMarie (subscriber, #164188) [Link] (6 responses)

To me, the only reasonable approach is to have a counter that is bumped by 1 at each kernel release. If the version in the kernel image is less than the version in the firmware, the firmware refuses to boot the kernel. Otherwise, the counter gets set to the version in the kernel image.
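
Spelled out as a small Python model (purely illustrative; the class and method names are invented and real firmware would keep the counter in protected non-volatile storage):

```python
class Firmware:
    """Toy model of a firmware-held rollback counter."""

    def __init__(self, counter=0):
        self.counter = counter

    def try_boot(self, kernel_version):
        # Refuse anything older than what has already been booted.
        if kernel_version < self.counter:
            return False
        # Otherwise ratchet the counter up to this kernel's version.
        self.counter = kernel_version
        return True
```

After `try_boot(5)` succeeds, `try_boot(4)` is refused, so merely booting a newer kernel once is enough to lock out every older one.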

Much ado about SBAT

Posted Jul 21, 2023 2:06 UTC (Fri) by geofft (subscriber, #59789) [Link]

This is basically what Chrome OS's rollback prevention does, with the side feature that a physically present user can boot into recovery mode (which allows recovering the device itself, but destroys user data encryption keys). It seems like a new firmware version can raise the minimum kernel version, but it's not automatically updated simply by booting a new kernel.

Android seems to use a manually-incremented number like the SBAT generation.

Much ado about SBAT

Posted Jul 21, 2023 7:56 UTC (Fri) by Lionel_Debroux (subscriber, #30014) [Link]

Given that most kernel releases contain known and explicit, known and concealed or unknown security fixes, and certainly nobody wants to go through the trouble of trying to find whether every patch fixes a vulnerability which specifically allows bypassing Secure Boot... yeah, a counter which is bumped by 1 at "every" set of synchronized stable kernel releases (say, excluding e.g. brown paper bag releases containing a single build error fix, as occurs once in a while) - for instance, the recent 6.4.2, 6.3.12, 6.1.38, and 5.15.120 releases, https://lwn.net/Articles/937400/ - seems like a straightforward, easy, and reasonably honest way to update the SBAT.
As for distributors who choose to maintain versions not based on a mainline LTS (Red Hat, etc.)... well, tough luck for them?

Incrementing the SBAT counter's value by 1 at "every" set of stable kernel releases will cause that value to grow to 3 digits after a couple years or so, but... one might argue that it's life when sets of stable kernel releases are being produced every few days, representing backports from a subset of 10-15K commits per ~70 day period, hundreds of which are probably security-relevant, and who knows how many of these specifically enable Secure Boot bypass.
By virtue of slower release cycles, the SBAT counters of Windows or macOS, even if those were also incremented upon every release, would have slower progression. A simplistic look such as "Oh, the Linux SBAT counter value is much higher than that of Windows, therefore Linux security is trash" is, well, simplistic, and I really hope that people blurting out such stupidities is not a worry of anyone participating in that or this thread.

Anyway, there was no need for the LKML thread to turn out this way, and help pile on the sewer reputation of part of this messaging conduit...

Much ado about SBAT

Posted Jul 21, 2023 8:45 UTC (Fri) by taladar (subscriber, #68407) [Link]

That sounds like a good way to make people stop applying firmware updates to avoid breaking their systems.

Much ado about SBAT

Posted Jul 21, 2023 9:19 UTC (Fri) by bluca (subscriber, #118303) [Link] (1 responses)

That requires a constant churn that is not good for anybody, nor serves any purpose.

Much ado about SBAT

Posted Jul 23, 2023 0:28 UTC (Sun) by DemiMarie (subscriber, #164188) [Link]

Updating whenever a new stable release comes out is the solution that is supported by upstream. The “constant churn” you are referring to is an inevitable consequence of upstream trying to support every Linux user out there. One can, of course, choose not to follow upstream’s recommendation, but then upstream is not responsible for any security problems one’s systems may have.

My understanding is that grsecurity does a good job of keeping pace with upstream kernel patches, but I suspect they release equally often. I am not aware of Red Hat or SUSE being able to keep up with the endless stream of security issues. Android has a simpler problem because their kernel configuration is smaller.

Much ado about SBAT

Posted Jul 21, 2023 12:10 UTC (Fri) by mathstuf (subscriber, #69389) [Link]

This doesn't work with the multiple release tracks and cadences of the kernel. Should every set of stable releases come with a "must apply" patch to bump the mainline integer (because mainline being behind stable's counter seems…wrong)?

There's also the unfortunate case of "commit X is not vuln, X+1 is because a bug is introduced, X+2 fixes it". If there are SBAT bumps in between, the old X commit is marked as vulnerable unnecessarily. In fact, I think this convinces me that the integer should be under the distributor control with some central registry of "please use at least X when fixing issue Y (and also fix all older issues too)" to help with coordination of things.

Much ado about SBAT

Posted Jul 21, 2023 3:03 UTC (Fri) by shironeko (subscriber, #159952) [Link]

if this feature truly is so great, why not just maintain the number out-of-tree? it's just a number right?

after all the distros pick up this magical number and are free of secure-boot rollbacks, I'm sure there will be a swarm of people pushing mainline to adopt it.

Much ado about SBAT

Posted Jul 21, 2023 5:30 UTC (Fri) by champtar (subscriber, #128673) [Link]

On server HW, can you roll back / bypass the revocation DB? If the motherboard dies and gets replaced by a new one that prevents the perfectly working OS from booting, is the only option to disable SB?

Much ado about SBAT

Posted Jul 21, 2023 8:57 UTC (Fri) by paulj (subscriber, #341) [Link] (22 responses)

So, this seems to be an attempt to:

1. Map every release of a bootable artifact to a monotonically increasing generation number, such that every increase of the generation indicates that prior generations have a serious security flaw

2. Introduce hierarchy into the artifact labels to which the generation numbers are associated, to ease assignment management.

The definition of a security issue seems to be unspecified, and the root of the argument here (?). Every Linux kernel release surely has security fixes, so surely every release would have its own generation number, rendering the whole scheme largely moot?

The requirements for being a "product" worthy of being signed and being able to make use of "Secure Boot" by default (not having to install own keys, etc.) are of course at the whim of Microsoft, as ever. Thanks again to everyone for literally giving Microsoft the keys to being able to boot Linux.

Much ado about SBAT

Posted Jul 21, 2023 9:25 UTC (Fri) by bluca (subscriber, #118303) [Link] (21 responses)

> Every Linux kernel release surely has security fixes, so surely every release would have its own generation number, rendering the whole scheme largely moot?

No, things that apply to this will be few and far between

> Thanks again to everyone for literally giving Microsoft the keys to being able to boot Linux.

Did you just step out of a time machine from roundabout 2010 or so?

Much ado about SBAT

Posted Jul 21, 2023 9:49 UTC (Fri) by dvdeug (guest, #10998) [Link] (10 responses)

That claim seems disputed by everyone in the know; potential security faults are regularly fixed. The policy of when to upgrade besides the extremes of every version or never is quite ill-defined and the core of the debate here.

Much ado about SBAT

Posted Jul 21, 2023 9:59 UTC (Fri) by bluca (subscriber, #118303) [Link] (9 responses)

How many kernel-driven active firmware security bypasses are you seeing in the wild right now?

Much ado about SBAT

Posted Jul 21, 2023 10:42 UTC (Fri) by pbonzini (subscriber, #60935) [Link] (8 responses)

Secure Boot purports to only let you run signed code.

Ergo, any kernel execution bug is a secure boot bypass.

The idea that it has to be "actively exploited" only makes it clear that it's security theater.

Much ado about SBAT

Posted Jul 21, 2023 11:02 UTC (Fri) by bluca (subscriber, #118303) [Link] (7 responses)

> Secure Boot purports to only let you run signed code.
> Ergo, any kernel execution bug is a secure boot bypass.

Most definitely not, it's not a code integrity mechanism. There are only a few real use cases of full code integrity deployed using Linux that I am aware of, and none of them even use UEFI.

> The idea that it has to be "actively exploited" only makes it clear that it's security theater.

Yep, 'boothole' and 'blacklotus' are definitely just 'theater', nothing to worry about

Much ado about SBAT

Posted Jul 21, 2023 13:11 UTC (Fri) by pbonzini (subscriber, #60935) [Link]

I don't know, if I want to write a rootkit I'll probably exploit a random vulnerability rather than try and thwart the signing. So if revocations are restricted to the latter, it will be security theater. That's my point. Maybe for bootloaders it's good enough, but not for something that receives as much untrusted code and data as a general purpose OS.

> Yep, 'boothole' and 'blacklotus' are definitely just 'theater', nothing to worry about

It's exactly because they're worrisome that the SBAT number should be bumped for any vulnerability, no matter if actively exploited or not.

Much ado about SBAT

Posted Jul 22, 2023 0:53 UTC (Sat) by geofft (subscriber, #59789) [Link] (5 responses)

> Yep, 'boothole' and 'blacklotus' are definitely just 'theater', nothing to worry about

I would, in fact, argue that anyone who cares about responding to boothole and batondrop but not ordinary privilege escalation vulnerabilities is performing security theater! What actual use cases consider either of these a danger that should not consider local privilege escalation a danger?

If your goal is to protect against in-the-wild exploits and not vulnerabilities (which lines up with mentioning blacklotus the exploit not batondrop the vulnerability), that's maybe reasonable - but you can't do that with SBAT, because whether an exploit exists isn't a property of the binary being executed, it's a property of the rest of the world. That's why everyone's asking about vulnerabilities, because those are a property of the binary being executed. If an exploit gets developed today for a bug in kernel 5.0 that's fixed in kernel 5.1, you can't go back in time and make the SBAT data of 5.0 and 5.1 different.

And there is already a mechanism for handling this, namely DBX. If that gets out of hand, then the only thing you can really do is to label every kernel version with some different metadata and have a mechanism for telling the firmware what the last unexploited kernel version is, which is a thing that not only has been proposed a few times in response to this proposal but is actually implemented in the real world for Chromebooks.
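
To make the point about generation numbers concrete, here is a minimal sketch in Python of the kind of comparison an SBAT-enforcing loader performs. It is not shim's or GRUB's actual code; the function names and the policy dictionary are hypothetical, and the entry format is taken from the two-line example in the article. It shows that two kernels carrying the same generation number are indistinguishable to the check, whichever of them actually contains the bug.

```python
import csv
import io

def parse_sbat(text):
    """Map each component name to its generation number (first two CSV fields)."""
    generations = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2:
            generations[row[0].strip()] = int(row[1])
    return generations

def is_revoked(sbat_text, minimum_generations):
    """True if any component in the image is below its required minimum.

    minimum_generations is a hypothetical policy table, e.g. {"linux": 2}.
    """
    image = parse_sbat(sbat_text)
    return any(
        generation < minimum_generations.get(component, 0)
        for component, generation in image.items()
    )

SBAT_HEADER = "sbat,1,SBAT Version,sbat,1,https://github.com/rhboot/shim/blob/main/SBAT.md\n"
# Both kernels were built with generation 1; only the version string differs.
KERNEL_5_0 = SBAT_HEADER + "linux,1,The Linux Developers,linux,5.0,https://linux.org\n"
KERNEL_5_1 = SBAT_HEADER + "linux,1,The Linux Developers,linux,5.1,https://linux.org\n"

policy = {"linux": 2}  # raised after an exploit for 5.0 appears
print(is_revoked(KERNEL_5_0, policy), is_revoked(KERNEL_5_1, policy))
```

Under these assumptions, raising the minimum to 2 revokes both 5.0 and 5.1, even though only 5.0 has the bug; the version string in the fifth field plays no part in the decision.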

Much ado about SBAT

Posted Jul 22, 2023 1:40 UTC (Sat) by mjg59 (subscriber, #23239) [Link] (3 responses)

One of the distinctions is ease of identification - the earlier in the boot chain you can compromise trust, the less obvious it's going to be to a user. It also increases your ability to subvert aspects of measured boot (although black lotus failed to do this, even though technically possible), and so I'd argue that there is a greater risk. I've got a bunch of thoughts here that I'll try to write up.

dbx is a limited resource, and flagging every single individual image as revoked isn't viable - the easy solution is to force everyone to migrate to new kernel signing keys and then revoke all the old shims, but even that's a lot of entries. So having a centralised way to say "We believe that kernels with this property should be considered insecure in this particular fashion" and then have agents that can be trusted to enforce that policy is extremely useful from a maintainability perspective, but I also take your point that in the current proposal a vulnerability that was fixed (perhaps inadvertently) between 6.0 and 6.1 and is then later actively exploited in 6.0 is going to mean either point releases for 6.1 for no reason other than bumping SBAT or revoking kernels that don't have the vulnerability. The goal of the "single number" approach is, to my understanding, largely to make it easier for vendors to make assertions that they're shipping backports that fix the vulnerability, but maybe there's another approach that doesn't require this. I'll think about it some.

Much ado about SBAT

Posted Jul 22, 2023 2:14 UTC (Sat) by geofft (subscriber, #59789) [Link] (2 responses)

The 2009 "Security impact ratings considered harmful" paper which I very slightly coauthored https://www.usenix.org/legacy/event/hotos09/tech/full_pap... has a little bit of an analysis of how long it takes between the commit fixing a bug and the assignment of a CVE. There is a sizable long tail of CVE assignment well after the bug fix - the paper reports 14% of vulnerabilities had more than eight weeks of "impact delay", and you can see on the graph several vulnerabilities with a year or more of impact delay. So I think it will not be unusual to discover that an issue fixed in a released kernel had security impact. (After all, security bugs are just normal bugs....)

It'd be interesting to see this data for newer kernel versions.

Re ease of identification - the thing I'm imagining isn't (just) attacking the user's running OS, it's setting up a bootkit that boots a normal Linux kernel with literally any lockdown-integrity vulnerability (i.e. any kernel code execution bug, even if only exploitable by root) with a tiny initrd that immediately exploits it, does whatever nefarious thing it wants to do, and chainloads the real OS. It's true that this is likely to break measured boot, but I think it's basically as undiscoverable as any other bootkit: kernels boot pretty fast when you're not interested in hardware or userspace.

Much ado about SBAT

Posted Jul 22, 2023 2:24 UTC (Sat) by mjg59 (subscriber, #23239) [Link] (1 responses)

kexec doesn't hand over enough state to make it easy to perform a seamless handoff of (eg) graphics, so you'd need to jump through some hoops to avoid graphical artifacts or interruption of the boot splash in the process. I agree that how obvious this is is going to vary. But as we move towards initramfs being part of the signed payload, the initramfs→kexec approach becomes harder and you need to actually leave additional filesystem artifacts around.

Much ado about SBAT

Posted Aug 3, 2023 23:15 UTC (Thu) by Kamilion (subscriber, #42576) [Link]

... I get graphical corruption of plymouth and knocked out of graphical boot to read two or three `quiet`-unsuppressed kernel warning messages already.

RETBleed: WARNING: Spectre V2 mitigation leaves CPU vulnerable to RETBleed attacks, data leaks possible!
[drm:vmw_host_printf [vmwgfx]] *ERROR* Failed to send host log message.
systemd[1]: Invalid DMI field header.

I don't get the vmwgfx error outside of virtualbox, obviously.
(And this is a Skylake Xeon-Scalable Gold!)

I don't think I have a single system that can boot from firmware to desktop *without* plymouth choking somewhere. My ryzens exhibit similar but not identical "unsuppressed messages breaking graphical boot while `quiet` is asserted" issues.

I honestly got so sick of it I dropped `quiet` and defaulted my ubuntu spin to `verbose`.

Much ado about SBAT

Posted Jul 22, 2023 7:52 UTC (Sat) by smurf (subscriber, #17840) [Link]

> If an exploit gets developed today for a bug in kernel 5.0 that's fixed in kernel 5.1, you can't go back in time and make the SBAT data of 5.0 and 5.1 different

except when the SBAT number is time- or release-based and thus incremented as a matter of fact.

Much ado about SBAT

Posted Jul 21, 2023 10:48 UTC (Fri) by paulj (subscriber, #341) [Link] (6 responses)

> No, things that apply to this will be few and far between

So what will it apply to then? As per my comment: "The definition of a security issue seems to be unspecified"

> Did you just step out of a time machine from roundabout 2010 or so?

It was a bad idea to give MS the keys to booting PCs in 2010. It's a bad idea still.

Much ado about SBAT

Posted Jul 21, 2023 11:15 UTC (Fri) by bluca (subscriber, #118303) [Link]

> So what will it apply to then? As per my comment: "The definition of a security issue seems to be unspecified"

To what the people managing the revocations want it to apply to. Exactly how DBX revocations are applied to binaries that people managing DBX revocations want it to apply to.

> It was a bad idea to give MS the keys to booting PCs in 2010. It's a bad idea still.

Except for the tiny problem that nobody else wanted to touch it with a barge pole, of course, because managing a CA is hard and expensive work, and being in charge of security for the ecosystem multiplies that cost by a few orders of magnitude, it turns out. But none of this is news in any way and has been well known for more than a decade.

Much ado about SBAT

Posted Jul 21, 2023 13:41 UTC (Fri) by gray_-_wolf (subscriber, #131074) [Link] (4 responses)

> It was a bad idea to give MS the keys to booting PCs in 2010. It's a bad idea still.

On all motherboards I ever had that supported secure boot, it was possible to enroll my own keys. Not sure however if that is usually possible or if I just had good luck while picking the HW.

Much ado about SBAT

Posted Jul 21, 2023 13:53 UTC (Fri) by bluca (subscriber, #118303) [Link] (3 responses)

It's a feature mandated by the Windows Logo certification. Because you know, Microsoft bad or something.

Much ado about SBAT

Posted Jul 21, 2023 14:01 UTC (Fri) by ballombe (subscriber, #9523) [Link] (2 responses)

Only on x86 hardware, not on ARM hardware sold by Microsoft.

Much ado about SBAT

Posted Jul 21, 2023 15:16 UTC (Fri) by Wol (subscriber, #4433) [Link] (1 responses)

Iirc, not on arm hardware supplied with Windows ...

So you can't replace Windows with linux unless you have a bootloader signed by MS ...

Cheers,
Wol

Much ado about SBAT

Posted Jul 21, 2023 15:33 UTC (Fri) by pizza (subscriber, #46) [Link]

> Iirc, not on arm hardware supplied with Windows ...

Most consumer-focused ARM hardware is heavily locked down. Windows/Microsoft makes up a tiny portion of the market.

Much ado about SBAT

Posted Jul 21, 2023 19:16 UTC (Fri) by nijhof (subscriber, #4034) [Link] (2 responses)

You seem to have an idea of what issues this should be used for. So could you please list the 5 most recent issues that would have required a version bump?

Much ado about SBAT

Posted Jul 21, 2023 20:11 UTC (Fri) by pjones (subscriber, #31722) [Link] (1 responses)

The two that spring to mind, where in the past we've had to rotate signing keys as a result, are CVE-2019-20908 and CVE-2020-15780. Both of them let you inject ACPI tables during boot, which in turn lets you run unsigned code in the kernel.

Kernel memory corruption is a secure boot bypass

Posted Jul 25, 2023 19:35 UTC (Tue) by DemiMarie (subscriber, #164188) [Link]

There are also many, many privilege escalation vulnerabilities with the same result. That’s why forbidding kernel downgrades within a stable release series really is the only answer that can be supported upstream. The attack surface of the upstream kernel is just too broad to be able to do otherwise. Downstreams can shrink the attack surface massively by disabling kernel features not in use and by preventing unsigned privileged userspace code from running, but upstream cannot do either.
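
A rough Python sketch of what "forbidding kernel downgrades within a stable release series" could look like follows. The rule used here (same major.minor, lower patch level) and both function names are assumptions made for illustration; nothing in the thread specifies where the last-booted version would be stored or who would enforce the check.

```python
def parse_version(version):
    """Split a stable version string such as '6.1.38' into three integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch

def is_downgrade(candidate, last_booted):
    """True if candidate is an older release in the same stable series.

    Assumption: a "series" is identified by major.minor, and only the
    patch level is compared within it.
    """
    cand = parse_version(candidate)
    last = parse_version(last_booted)
    return cand[:2] == last[:2] and cand[2] < last[2]

print(is_downgrade("6.1.37", "6.1.38"))  # older point release, same series
print(is_downgrade("6.4.1", "6.1.38"))   # different series, not covered by this rule
```

Note that this rule says nothing about moving between series; a real policy would have to decide separately whether booting an older series counts as a downgrade.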

Use git revisions

Posted Jul 21, 2023 12:51 UTC (Fri) by epa (subscriber, #39769) [Link] (2 responses)

Suppose for the sake of argument that a given vulnerability has a particular commit abcde which "fixes" it. This need not be the exact commit fixing the bug, as long as you can say that all kernel versions descended from it are free of the vulnerability. (That implies that the vuln hasn't been somehow reintroduced in a later revision; if that happened, you'd need to pick a newer commit as the "fix".)

Then to be sure of booting a non-vulnerable kernel you need to pick one which is a descendant of commit abcde. The kernel build process can list all git revisions which are ancestors of the head commit and bake them into a binary blob which goes alongside the kernel image, similar to what happens for an initial ramdisk (initrd). Like an initrd, the data is loaded into memory on first boot but freed later. Then the kernel command line can have "linux ancestor=abcde". If the kernel isn't descended from that commit it will refuse to boot.

You still need to decide a sensible 'ancestor', but that is a matter of policy and can be an individual decision, not something a Linux kernel maintainer has to declare or increment. For those who prefer a time-based approach there could be a kernel parameter to insist the current build is newer than a certain timestamp -- based on git revision time rather than build time. And you could specify multiple 'ancestor' commits if needed.

If you don't want to carry around a big binary blob listing all git revisions, it's possible a Bloom filter could be used to get a more compact representation, at the cost of some false positives. Personally I could live with that, but security people may not like the idea.
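
The ancestor check and its Bloom-filter variant can be sketched in a few lines of Python. This is only an illustration of the idea in this comment, not an existing kernel feature; the class, the may_boot() helper, the filter size, and the choice of SHA-256 for the hash positions are all assumptions.

```python
import hashlib

class BloomFilter:
    """Compact set representation: no false negatives, some false positives."""

    def __init__(self, size_bits=1024, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )

def may_boot(ancestors, required_commits):
    """True if every required commit appears among the kernel's ancestors.

    ancestors can be an exact set of commit IDs or a BloomFilter built from them.
    """
    return all(commit in ancestors for commit in required_commits)

# Built at kernel build time from the ancestors of the head commit.
ancestors = BloomFilter()
for commit in ["abcde", "f00d1", "12345"]:
    ancestors.add(commit)

print(may_boot(ancestors, ["abcde"]))  # the required fix is an ancestor
```

The trade-off mentioned above shows up directly: a false positive in the filter means a kernel that lacks the fix would be accepted as if it had it, so the filter would need to be sized so that this is acceptably unlikely. An exact set avoids the problem at the cost of a larger blob.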

Use git revisions

Posted Aug 1, 2023 17:50 UTC (Tue) by florianfainelli (subscriber, #61952) [Link] (1 responses)

> If the kernel isn't descended from that commit it will refuse to boot.

That decision cannot be left to the kernel however, otherwise you are breaking secure boot. Your proposal has merit but could also be somewhat difficult to manage. If said "fix" is applied to a stable branch, then the commit ID would be different (different ancestry), so then you have an explosion of commit IDs to track. Add to that whatever patches the distro may be applying and this becomes unmanageable. Hence the "unique number" which is simpler to track.

Use git revisions

Posted Aug 16, 2023 15:48 UTC (Wed) by epa (subscriber, #39769) [Link]

I imagine that all the usual kernel signing requirements would continue to apply. So you wouldn't be able to make a custom patched kernel that ignores the ancestor parameter. Nor would you be able to boot an older, signed kernel, since it would check that parameter when starting and refuse to continue. So I think my proposal preserves the security properties of secure boot.

You are right that you'd need a selection of commit ids to deal with cherry-picking. That is still more manageable and flexible than a global magic number.

Much ado about SBAT

Posted Jul 24, 2023 10:03 UTC (Mon) by wtarreau (subscriber, #51152) [Link]

What I found in this thread is that despite being marked "RFC", the proponents didn't seem to give any consideration to the maintainers' point of view. Each and every argument trying to explain why it didn't fit the model was simply dismissed or rejected. This is clearly the reason why the "RFC" aspect was quickly ignored, because in practice it didn't work like an RFC but as a forced push attempt, based sometimes on misinformed facts (e.g. "you do 3 releases a year"). In fact I think it got attention because it was marked RFC; otherwise it could have been completely ignored and it wouldn't have made that much noise.

Much ado about SBAT

Posted Jul 27, 2023 15:09 UTC (Thu) by kmeyer (subscriber, #50720) [Link] (1 responses)

There seem to be a number of difficult conceptual and process issues here but the biggest impediment by far seems to be the attitude and communication style of SBAT's advocate bluca (Luca Boccassi) in the comments here. They seem actively hostile to the kernel community and more broadly to anyone who has any question at all about how the scheme will work. This is unfortunate.

Much ado about SBAT

Posted Jul 27, 2023 15:37 UTC (Thu) by corbet (editor, #1) [Link]

Regardless of the applicability of this statement, I don't think that calling people out by name is going to help the situation or make the discussion more positive. I'd prefer if we could avoid doing that...?