LWN: Comments on "CHERI with a Linux on top"

CHERI plus/versus Rust

tialaramex — Mon, 06 Oct 2025 19:28:47 +0000

One nice property of Rust for CHERI is that Rust provides explicit provenance APIs. So you can see: OK, what's going on (unsafely) here is legal and ought to still work under CHERI; or, contrariwise, nope, this is not even supposed to work, so we must expect that these parts need to be rewritten or we can't target CHERI with this software.

All the safe Rust of course is in the first category, but also isn't reaping much benefit from CHERI.

core::ptr::without_provenance(addr) and core::ptr::with_exposed_provenance(addr) are largely documentation. The first says we offer no justification for why this can be a pointer (maybe it isn't; if we never dereference it then it's fine even in CHERI, AFAIU). The other offers the justification of exposure, a feature CHERI does not support, so it's effectively documenting why it won't work with CHERI.

On the other hand, some_pointer.with_addr(addr) promises to work under CHERI, because we're saying that if we keep the CHERI capability bits from some_pointer but use the address addr instead of the address from some_pointer, the result is also a valid pointer. On my x86-64 there are no capability bits, but on CHERI there are, and on both this works _if_ it would be correct to do this: Miri can check it, and CHERI can use it.
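
A minimal sketch of the distinction, assuming Rust 1.84 or later (where these provenance APIs are stable). The helper names `read_via_with_addr` and `address_only` are made up for illustration; only `addr`, `with_addr`, and `core::ptr::without_provenance` come from the standard library.

```rust
use core::ptr;

/// Hypothetical helper: reads the element at `index` by rebuilding a
/// pointer from `base`'s provenance and a freshly computed address.
fn read_via_with_addr(data: &[u8], index: usize) -> u8 {
    assert!(index < data.len());
    let base: *const u8 = data.as_ptr();
    // `addr()` yields the address only; it does not expose provenance.
    let addr: usize = base.addr();
    // Keep `base`'s provenance (the capability bits on CHERI) and
    // replace only the address.
    let elem: *const u8 = base.with_addr(addr + index);
    // SAFETY: `elem` is derived from `base` and is in bounds of `data`.
    unsafe { *elem }
}

/// Hypothetical helper: builds a pointer that carries an address but no
/// provenance. It may be stored and compared, but never dereferenced.
fn address_only(addr: usize) -> *const u8 {
    ptr::without_provenance(addr)
}

fn main() {
    let data = [10u8, 20, 30, 40];
    println!("{}", read_via_with_addr(&data, 2));
    println!("{:#x}", address_only(0x1000).addr());
}
```

Nothing here round-trips a pointer through an integer and back without a parent pointer, which is the pattern that `with_exposed_provenance` exists to document.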

CHERI plus/versus Rust

erithax — Mon, 06 Oct 2025 11:06:35 +0000

Does CHERI provide additional memory safety even to Rust programs?

> "what role do you see CHERI playing in an environment where a majority, even a vast majority, of all C code has been replaced with Rust?" Shaw said that he wonders how successful TRACTOR will be, given that AI techniques may fall short of being able to reliably translate C for all of the different programs needed. Meanwhile, though, he does not see CHERI and Rust as being in conflict at all; the two can work together and it is something the project is putting effort into. "There will be a CHERI Rust compiler."

This answer does not clear up much. My guess is it would only improve the safety of `unsafe` Rust, like running in Miri does, but with much less performance degradation than Miri, enabling its use in production. While that is a nice safety improvement, we should then weigh whatever performance penalty CHERI has against the small number of unsafe operations that it protects.

> While memory safety is definitely important, the compartmentalization afforded by CHERI is more interesting to him. "Being able to get least privilege in software is a real big step forward, I think." None of the current languages attack that problem, he said, so it would take "a further evolution of language in order to support this whole concept nicely".

I would agree that the non-memory-safety aspects of capability-based security are just as important. However, although language support for capabilities should definitely improve, just about every language contains the groundwork of capabilities through encapsulated objects and parameter passing. Additionally, language support is not required for stronger use of capabilities in an OS: see Redox OS.

Also, unlike CHERI, Rust can afford memory safety on all existing hardware, awaiting rustc_codegen_gcc support for exotic platforms, which, mind you, probably won't support CHERI.
Lastly, I think there's a reason that patching fluid software to fix mistakes in fixed hardware is the norm, and not the other way around. What does CHERI address that could not be fixed in software?

Capability Revocation and Indirection

jake — Thu, 02 Oct 2025 17:25:05 +0000

> Then you're using the wrong database server :-)

Please do not continue down this path, Wol. You have been asked before. Your favorite hobby horse is off-topic on this article (and many, many others).

thanks,

jake

Capability Revocation and Indirection

Wol — Thu, 02 Oct 2025 17:19:23 +0000

> I don't see it scaling well to large workloads either. Imagine a database server with hundreds of GB of memory mapped into the process, no way that you want to sweep through all the pointers in that either. And even if you do it in the background concurrently, you will eat a lot of memory bandwidth, and you risk falling behind.

Then you're using the wrong database server :-)

Cheers,
Wol

Capability Revocation and Indirection

Vorpal — Thu, 02 Oct 2025 13:03:49 +0000

> While CHERI avoids indirection when using a capability/pointer, a consequence is that capability revocation (e.g. free(3)) requires sweeping the process address space to invalidate capabilities.

Oof, that seems like a complete deal breaker to me. My main interests are in low latency hard realtime code, and that would completely kill any RT guarantees.

I don't see it scaling well to large workloads either. Imagine a database server with hundreds of GB of memory mapped into the process, no way that you want to sweep through all the pointers in that either. And even if you do it in the background concurrently, you will eat a lot of memory bandwidth, and you risk falling behind.

Which means we are left with a small niche: small systems with no RT guarantees.

> during the pendency of a concurrent background sweep, a CoW-like scheme temporarily traps all reads to sweep specific pages on demand, permitting forward progress before the concurrent sweep completes.

Isn't there a race condition in that: if you copy the capability around you may be able to copy it from a yet-to-be-swept page to an already swept page while the sweep is somewhere in between those pages? Or maybe I'm misunderstanding you.

Capability Revocation and Indirection

NYKevin — Sat, 27 Sep 2025 23:15:21 +0000

I can imagine an alternative scheme, which looks roughly as follows:

1. Every "regular" capability is really a double indirection (a pointer-to-a-pointer) in disguise. I will use the term "outer pointer" to refer to the first layer of indirection (exposed to user code) and "inner pointer" to refer to the second layer (the pointee of the outer pointer).
2. When an allocation is created, we create an inner pointer for it. When an allocation is deallocated, we mark its inner pointer as invalid.
3. Inner pointers live in a special region of address space. When it fills up with dead pointers, you unmap the whole region, and map a fresh one somewhere else. The region is not allowed to contain any object other than an inner pointer (no "real" allocations).
4. A region that has ever been mapped for inner pointers during the lifetime of a process can never again be remapped to contain inner pointers (but it can be remapped for any other purpose, so this is not a pervasive restriction and should not break anything else). malloc or its equivalent would be responsible for the necessary bookkeeping, which might involve mapping regions at some fixed or regular pattern of offsets to reduce the amount of data that you need to track.
5. When an outer pointer is dereferenced by user code, you first check the validity of the outer pointer, then check that it points to a region currently mapped for inner pointers, and finally check the validity of the inner pointer.
6. In principle, you could run out of address space doing this, but that ought to take a rather long time if we're using 64-bit addresses. If we really insist on reusing inner pointer regions, one option could be to give each mapping and each outer pointer a generation number, but CHERI pointers are already wider than standard pointers, and I'm not sure this is worth it. Besides, then you're just running out of generation numbers instead.
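
The steps above can be modeled in software roughly as follows. This is only a sketch under loud assumptions: `OuterPtr`, `InnerRegions`, and their methods are hypothetical names, region ids stand in for address ranges, and a real design would do the checks in hardware or in the allocator rather than in a hash map.

```rust
use std::collections::HashMap;

/// Hypothetical "outer pointer": names a slot in an inner-pointer region.
#[derive(Clone, Copy, Debug, PartialEq)]
struct OuterPtr {
    region: u64, // which inner-pointer region the slot lives in
    slot: usize, // index of the inner pointer inside that region
}

/// Hypothetical allocator bookkeeping for inner-pointer regions.
struct InnerRegions {
    next_region: u64, // region ids only grow, so none is ever reused (step 4)
    capacity: usize,  // slots per region, assumed to be greater than zero
    open: u64,        // region currently receiving new inner pointers
    // Some(addr) is a live inner pointer, None is a dead one.
    mapped: HashMap<u64, Vec<Option<usize>>>,
}

impl InnerRegions {
    fn new(capacity: usize) -> Self {
        let mut mapped = HashMap::new();
        mapped.insert(0, Vec::new());
        InnerRegions { next_region: 1, capacity, open: 0, mapped }
    }

    /// Step 2: creating an allocation creates an inner pointer for it.
    fn alloc(&mut self, addr: usize) -> OuterPtr {
        let full = self
            .mapped
            .get(&self.open)
            .map_or(true, |r| r.len() == self.capacity);
        if full {
            // Map a fresh region under a never-before-used id.
            self.open = self.next_region;
            self.next_region += 1;
            self.mapped.insert(self.open, Vec::new());
        }
        let region = self.mapped.get_mut(&self.open).unwrap();
        region.push(Some(addr));
        OuterPtr { region: self.open, slot: region.len() - 1 }
    }

    /// Step 2: freeing marks the inner pointer invalid.
    /// Step 3: a region that is full of dead pointers is unmapped.
    fn free(&mut self, p: OuterPtr) {
        let capacity = self.capacity;
        let mut unmap = false;
        if let Some(region) = self.mapped.get_mut(&p.region) {
            if let Some(slot) = region.get_mut(p.slot) {
                *slot = None;
            }
            unmap = region.len() == capacity && region.iter().all(|s| s.is_none());
        }
        if unmap {
            self.mapped.remove(&p.region);
        }
    }

    /// Step 5: check that the region is still mapped, then check the
    /// validity of the inner pointer.
    fn deref(&self, p: OuterPtr) -> Option<usize> {
        self.mapped.get(&p.region)?.get(p.slot).copied().flatten()
    }
}

fn main() {
    let mut table = InnerRegions::new(2);
    let a = table.alloc(0x1000);
    println!("{:?}", table.deref(a));
    table.free(a);
    println!("{:?}", table.deref(a));
}
```

In this model a stale outer pointer fails either at the region lookup or at the slot check, and no sweep of other memory is needed on `free`.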

I know that double indirection is significantly more expensive than single indirection... but sweeping address space not only seems like it should be similarly expensive, it gets slower the more memory we allocate (whereas double indirection is a fixed cost per dereference). How much memory do you have to allocate before you hit the break-even point?

The other obvious question is how much of this you can hardware-accelerate, and to what extent.

Capability Revocation and Indirection

wahern — Fri, 26 Sep 2025 12:24:59 +0000

The Cornucopia Reloaded paper I linked earlier has a decent summary and references. The most recent paper on the topic, also with a good summary and references, is A CHERI C Memory Model for Verified Temporal Safety, https://dl.acm.org/doi/pdf/10.1145/3703595.3705878. One of the earliest papers discussing revocation is the CHERIvoke paper, https://www.cl.cam.ac.uk/~tmj32/papers/docs/xia19-micro.pdf. Most CHERI-related papers are listed at https://www.cl.cam.ac.uk/research/security/ctsrd/cheri/ch...

Also worthwhile to read the core papers on CHERI, especially papers about and subsequent to the ARM Morello implementation. Once you understand the basic architecture, in particular the hidden 129th bit that tags a word (i.e. C pointer) in memory as a valid capability and which is copied along with the visible 128-bit value (e.g. in `char *b = *a;`), it's easy to understand the problem space regarding revocation. Most of the early work in CHERI was finding and verifying the minimum software and hardware requirements for guaranteed spatial safety that was also maximally performant in hardware and practical to incorporate into existing platforms (language standards, ABIs, kernels, etc). Temporal safety, especially performant revocation, didn't receive as much attention until later, after the shape of capability pointers (i.e. 129-bit compressed pointers) had already largely been settled. But it's still an active area of research and may yet result in some design changes or at least suggest additional hardware facilities for future implementations.
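
As a rough illustration of why the tag bit makes revocation hard, here is a toy software model of tagged memory. Everything in it (`Word`, `TaggedMemory`, the method names) is hypothetical and only mimics the behavior described above: capability-width copies carry the tag along, so valid capabilities can end up anywhere in memory.

```rust
/// Hypothetical model of one capability-sized memory word.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Word {
    bits: u128, // the visible 128-bit value
    tag: bool,  // the hidden out-of-band validity bit
}

/// Hypothetical model of tagged memory; not a real CHERI interface.
struct TaggedMemory {
    words: Vec<Word>,
}

impl TaggedMemory {
    fn new(len: usize) -> Self {
        TaggedMemory { words: vec![Word { bits: 0, tag: false }; len] }
    }

    /// Models storing a valid capability: the tag is set.
    fn store_capability(&mut self, index: usize, bits: u128) {
        self.words[index] = Word { bits, tag: true };
    }

    /// Models an ordinary data store: the tag is cleared, so the same
    /// bits can no longer be used as a capability.
    fn store_data(&mut self, index: usize, bits: u128) {
        self.words[index] = Word { bits, tag: false };
    }

    /// Models a capability-width copy such as `char *b = *a;`:
    /// the tag travels with the value.
    fn copy_word(&mut self, src: usize, dst: usize) {
        self.words[dst] = self.words[src];
    }

    fn is_valid_capability(&self, index: usize) -> bool {
        self.words[index].tag
    }
}

fn main() {
    let mut mem = TaggedMemory::new(4);
    mem.store_capability(0, 0xdead_beef);
    mem.copy_word(0, 1); // word 1 is now a valid capability too
    mem.store_data(0, 0xdead_beef); // word 0 keeps its bits, loses its tag
    println!(
        "{} {} {:#x}",
        mem.is_valid_capability(0),
        mem.is_valid_capability(1),
        mem.words[1].bits
    );
}
```

Clearing the tag on word 0 does nothing to the copy in word 1, which is why revoking a capability means finding every copy.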

Capability Revocation and Indirection

cpatulea — Fri, 26 Sep 2025 00:29:08 +0000

> capability revocation (e.g. free(3)) requires sweeping the process address space to invalidate capabilities

Any chance you might have a deeper reference for this?

Fantastic title!

kevinlyles — Thu, 25 Sep 2025 12:34:27 +0000

I haven't read the article yet, but the title made me chuckle.

Capability-Based Computer Systems

chmaynard — Thu, 25 Sep 2025 06:30:37 +0000

During the mid-1980s, I was looking for technical information about the IBM System/38 and ran across the book "Capability-Based Computer Systems" by Henry M. Levy, published by Digital Press.

From the Amazon.com summary:

"The book describes early descriptor architectures and explains the Burroughs B5000, Rice University Computer, and Basic Language Machine. The text also focuses on early capability architectures. Dennis and Van Horn's Supervisor; CAL-TSS System; MIT PDP-1 Timesharing System; and Chicago Magic Number Machine are discussed. The book then describes Plessey System 250, Cambridge CAP Computer, and Hydra System. The selection also discusses STAROS System and IBM System/38 ... The book highlights Intel iAPX 432, and then considers segment and objects, program execution, storage resources, and abstraction support."

Spectre mitigation overhead

wahern — Wed, 24 Sep 2025 23:02:38 +0000

> Are CHERI capabilities able to provide SPECTRE-resistant isolation between mutually distrustful privilege domains within a single address space?

Intrinsically, AFAIU, no. But hardware CHERI support, by requiring both bounds and (to varying extents) provenance information to accompany addresses, potentially makes it easier and more natural to avoid side-channels. And maybe more importantly, CHERI provides an opportunity to nail down ISA guarantees before widespread deployment. See Safe Speculation for CHERI, https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/202...

Capability Revocation and Indirection

wahern — Wed, 24 Sep 2025 22:49:16 +0000

While CHERI avoids indirection when using a capability/pointer, a consequence is that capability revocation (e.g. free(3)) requires sweeping the process address space to invalidate capabilities. In the simplest implementation it's stop-the-world during a linear word-by-word sweep of the address space. There have been several optimizations explored, including schemes to avoid reading memory unnecessarily by skipping words without the out-of-band capability tag bit, concurrent sweeping by leveraging page tables or memory coloring, etc. I think the latest upstreamed to CheriBSD is Cornucopia Reloaded[1]--during the pendency of a concurrent background sweep, a CoW-like scheme temporarily traps all reads to sweep specific pages on demand, permitting forward progress before the concurrent sweep completes.
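
To make the simplest variant concrete, here is a toy model of a stop-the-world sweep. It is a sketch under assumptions, not how CheriBSD implements it: `Slot` and `revoke` are hypothetical names, memory is a flat slice, and "revoking" just clears a modeled tag bit on any capability whose bounds overlap the freed range.

```rust
/// Hypothetical model of one capability-sized word in memory.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Slot {
    tag: bool,   // out-of-band validity tag
    base: usize, // lower bound of the region the capability grants
    len: usize,  // length of that region
}

/// Linear sweep: clears the tag of every capability whose bounds overlap
/// the freed range and returns how many were revoked.
fn revoke(memory: &mut [Slot], freed_base: usize, freed_len: usize) -> usize {
    let freed_end = freed_base + freed_len;
    let mut revoked = 0;
    for slot in memory.iter_mut() {
        // Untagged words cannot be capabilities, so their contents need
        // not be inspected at all.
        if !slot.tag {
            continue;
        }
        let end = slot.base + slot.len;
        if slot.base < freed_end && freed_base < end {
            slot.tag = false;
            revoked += 1;
        }
    }
    revoked
}

fn main() {
    let mut memory = vec![
        Slot { tag: true, base: 0x1000, len: 0x100 },  // points into the freed object
        Slot { tag: false, base: 0x1000, len: 0x100 }, // plain data, skipped
        Slot { tag: true, base: 0x2000, len: 0x100 },  // unrelated capability
    ];
    let count = revoke(&mut memory, 0x1000, 0x100);
    println!("revoked {count}");
}
```

The loop visits every word regardless of how many capabilities actually reference the freed object, so the cost grows with the size of the address space being swept.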

CHERI is great for spatial safety, but the cost of avoiding indirection means temporal safety requires more work. Perhaps the next evolution will be exploring how linear or affine typing in application languages such as Rust could be leveraged to minimize the sweeping work, e.g. by automatically clearing capabilities as they're copied through the application from malloc through free. Or evolving allocation APIs and page table permission schemes so memory that doesn't need to store a capability/pointer can be skipped from sweeping entirely.

[1] https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/202...

Spectre mitigation overhead

notriddle — Wed, 24 Sep 2025 22:32:42 +0000

> the compartmentalization afforded by CHERI is more interesting to him

Rust treats speculative execution as completely out of scope. That, as far as I'm concerned, is its biggest weakness and the main reason you still need hardware isolation.

A quick Google drops me onto at least one paper <https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/202...> that claims to address speculative execution in CHERI, but I don't know if that's been incorporated into real cores, if it's long obsoleted by more recent innovation, or if I'm completely barking up the wrong tree.

Are CHERI capabilities able to provide SPECTRE-resistant isolation between mutually distrustful privilege domains within a single address space?