LWN: Comments on "Linus and Dirk on succession, Rust, and more" https://lwn.net/Articles/990534/ This is a special feed containing comments posted to the individual LWN article titled "Linus and Dirk on succession, Rust, and more". Sun, 14 Sep 2025 09:39:40 +0000 "the right color of hair" https://lwn.net/Articles/997593/ https://lwn.net/Articles/997593/ carlosrodfern <div class="FormattedComment"> I didn't even know these comments were still going here. Sorry, that's right, I realized that later when I watched the video. I enjoyed watching your interviews, BTW.<br> <p> </div> Sat, 09 Nov 2024 05:08:47 +0000 Lemmings off a cliff https://lwn.net/Articles/996744/ https://lwn.net/Articles/996744/ soheil <div class="FormattedComment"> <span class="QuotedText">&gt; lemmings off a cliff</span><br> <p> What if the lemmings don't know they're following the other lemmings and think they're doing their own thing?<br> </div> Sat, 02 Nov 2024 04:53:41 +0000 "the right color of hair" https://lwn.net/Articles/992882/ https://lwn.net/Articles/992882/ farnz <p>That's the trouble with older generations - they're such sensitive snowflakes :-) <p>More seriously, this is the trouble with humour - it doesn't always translate well across media or cultural boundaries. Fri, 04 Oct 2024 09:54:35 +0000 Rust has provenance https://lwn.net/Articles/992880/ https://lwn.net/Articles/992880/ deltragon Since <a rel="nofollow" href="https://github.com/rust-lang/rfcs/blob/master/text/3559-rust-has-provenance.md">RFC #3559</a> (titled "Rust Has Provenance") was accepted in February, Rust certainly has provenance. It is true that the documentation has not quite caught up to this yet (and still refers to the Strict/Permissive Provenance APIs as "experimental"), but this is also <a rel="nofollow" href="https://github.com/rust-lang/rust/pull/130350">changing</a> along with the stabilisation of these APIs. 
Fri, 04 Oct 2024 08:45:04 +0000 "the right color of hair" https://lwn.net/Articles/992840/ https://lwn.net/Articles/992840/ mikebenden <div class="FormattedComment"> JFC, the fact that y'all felt obligated to go over this obviously self-deprecating joke with such a fine-toothed comb, for a total of 6 back-and-forth comments (and counting), is just... sad... :(<br> </div> Thu, 03 Oct 2024 19:08:05 +0000 "the right color of hair" https://lwn.net/Articles/992830/ https://lwn.net/Articles/992830/ dirkhh <div class="FormattedComment"> I believe it's pretty clear, if you watch the recording, that I was talking about the color of Linus' and especially my hair.<br> I used to be red-haired. I am decidedly not any more. So mostly this was a tongue-in-cheek comment about myself and Linus.<br> </div> Thu, 03 Oct 2024 18:05:20 +0000 C is not as simple as it seems https://lwn.net/Articles/992570/ https://lwn.net/Articles/992570/ alison <div class="FormattedComment"> Do you have a blog? The comment is the outline of a fine ACCU conference talk.<br> </div> Wed, 02 Oct 2024 03:01:09 +0000 C can't do provenance https://lwn.net/Articles/992416/ https://lwn.net/Articles/992416/ farnz <blockquote> The other, far simpler option is for CNext to state that provenance rules only apply to a given platform (or configuration) if the compiler's documentation explicitly says that it does. Frankly, that strikes me as the obvious way to deal with this, and then none of these discussions are even necessary at all. </blockquote> <p>The issue is that without provenance rules, alias analysis becomes intractable, since any integer could be cast to a pointer, including in other modules. And without alias analysis, you have to assume that any write through a pointer, including in other threads that have a suitable ordering with your thread, could write to any variable. 
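<p>A minimal sketch of the point above, in C. The function names are hypothetical and chosen only for illustration; nothing here comes from a real codebase or from the provenance proposals themselves. The variable <code>local</code> never has its address taken, so a compiler that reasons about where pointers come from may assume the store through <code>out</code> cannot modify it. Without any such rule, <code>out</code> could in principle have been built from an integer that happens to equal the address of <code>local</code>, and the value would have to be reloaded after the store.

```c
#include <assert.h>

/* Hypothetical illustration: `local` never has its address taken,
 * so no pointer with valid provenance can refer to it. */
int store_then_read(int *out, int n)
{
    int local = n;  /* address is never exposed to anyone */
    *out = 0;       /* with no alias information, this store might hit `local` */
    return local;   /* with it, the compiler may simply return n */
}

/* Small self-check: the store lands in `target` and `local` is untouched. */
void demo(void)
{
    int target = 7;
    int result = store_then_read(&target, 42);
    assert(result == 42);
    assert(target == 0);
}
```

Calling <code>demo()</code> exercises the well-defined case only. The sketch does not try to construct the aliasing pointer, since doing so would be undefined behaviour in current C.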
<p>Compiler writers handle this by assuming that certain things can't alias, even though the language standard doesn't prohibit that form of aliasing, and then hoping that their gut feelings work out when they write optimizations that assume that (e.g.) an integer and a pointer to a struct don't alias. This works most of the time, since compiler writers aren't evil, and their assumptions match those that programmers tend to make. <p>Unfortunately, some of the assumptions that compiler authors make contradict each other; each of them technically breaks the letter of the standard (so neither one is "right"), and each of them results in an optimization that improves code without surprising C programmers, but the combination of optimizations that assume different things results in a miscompilation. The intention behind formalizing provenance rules is to get to a place where the standard can be used to determine which of those optimizations is at fault when the combination surprises people. Tue, 01 Oct 2024 11:02:45 +0000 C can't do provenance https://lwn.net/Articles/992362/ https://lwn.net/Articles/992362/ Wol <div class="FormattedComment"> <span class="QuotedText">&gt; IMHO it is obvious that kernel emulation is slow, and it does not need to be measured (I don't even know if there are any operating systems that both support CHERI and implement exposed address emulation). </span><br> <p> The problem with common sense is that it is not common, and rarely makes sense.<br> <p> As such, a statement like "imho it is obvious" is almost certainly wrong. 
How often do we hear "it stands to reason", only to discover that said reasoning has missed the obvious and come to a conclusion diametrically opposed to empirical observation.<br> <p> Cheers,<br> Wol<br> </div> Mon, 30 Sep 2024 22:45:26 +0000 C can't do provenance https://lwn.net/Articles/992338/ https://lwn.net/Articles/992338/ NYKevin <div class="FormattedComment"> <span class="QuotedText">&gt; I don't think the 'escape hatch' is needed on CHERI, assuming the code uses the (special in CHERI-C) intptr_t type for any pointer&lt;=&gt;integer roundtrips.</span><br> <p> It is required for pointer &lt;=&gt; char[], which in C23 is legal for any type, including all pointers. It is not practically possible for the compiler to emit provenance-preserving operations for every byte that the program manipulates, so I would think it obvious that you have to draw a boundary there.<br> <p> The other problem is that intptr_t is a number. It is not an opaque object that is allowed to have magical properties. If any part of the program becomes aware of that number, by any means whatsoever, then it is allowed to reconstruct and dereference the pointer (provided the allocation still exists). That means you can flatten it into ASCII base 10 (or any other base), send it to a remote host as JSON (or any other format), receive it back from that same host or a different one, unpack it all back into a pointer, and dereference it. No hardware in the world will ever support tracking provenance across that sequence of operations.<br> <p> <span class="QuotedText">&gt; The escape hatch is needed for mainstream implementations where the HW does not carry around any provenance information, and the compiler is not tracking provenance once a pointer is 'exposed' (and in some cases, is fundamentally incapable to at compile time).</span><br> <p> Then you don't need to do anything special. UB means that the standard doesn't cover a situation. 
It does not mean that the compiler is forbidden from introducing an extension to make UB defined. The compiler can just say in its documentation "as an extension, on all architectures other than X, Y, and Z, provenance does not exist and all pointer &lt;=&gt; data conversions that were valid in C23 are still valid and the resulting pointers may still be dereferenced." Of course, this would be based on some hypothetical future version of the standard, since as I have explained, C23 has no reasonable support for provenance.<br> <p> The other, far simpler option is for CNext to state that provenance rules only apply to a given platform (or configuration) if the compiler's documentation explicitly says that it does. Frankly, that strikes me as the obvious way to deal with this, and then none of these discussions are even necessary at all.<br> <p> <span class="QuotedText">&gt; It would be nice to have some quantitative data backing that statement.</span><br> <p> That statement was specific to hardware that has provenance and requires kernel emulation of exposed pointer dereferences. IMHO it is obvious that kernel emulation is slow, and it does not need to be measured (I don't even know if there are any operating systems that both support CHERI and implement exposed address emulation). Your discussion of platforms that do not have provenance is frankly irrelevant to my statement.<br> <p> <span class="QuotedText">&gt; But I don't think that if any of the provenance proposals is adopted, that it would require implementations to somehow emulate CHERI on non-CHERI hw. </span><br> <p> Of course not, I was assuming that all readers were familiar with the definition of UB and the fact that the implementation is encouraged to do whatever makes sense (performance-wise) on a given platform when UB happens. 
I don't understand how you read this into my comment, when I so explicitly characterized provenance as a hardware feature and disclaimed its relevance to non-CHERI-like platforms.<br> </div> Mon, 30 Sep 2024 19:52:33 +0000 C can't do provenance https://lwn.net/Articles/992188/ https://lwn.net/Articles/992188/ joib <div class="FormattedComment"> I don't think the 'escape hatch' is needed on CHERI, assuming the code uses the (special in CHERI-C) intptr_t type for any pointer&lt;=&gt;integer roundtrips.<br> <p> The escape hatch is needed for mainstream implementations where the HW does not carry around any provenance information, and the compiler is not tracking provenance once a pointer is 'exposed' (and in some cases, is fundamentally incapable to at compile time).<br> <p> <span class="QuotedText">&gt; it leaves a ton of performance on the table</span><br> <p> It would be nice to have some quantitative data backing that statement.<br> <p> I don't have any quantitative data proving otherwise either, but AFAICS the 'escape hatch' is activated in situations like<br> <p> 1) Pointer a is exposed, creating an integer a1.<br> <p> 2) Some time later, a pointer b is created 'out of thin air'. Maybe b is created via a1, maybe not, the compiler doesn't know.<br> <p> Now the compiler must assume that a and b potentially alias, and thus some fancy optimizations cannot be done. But crucially, the compiler can still use provenance to reason about pointer c which has not been exposed, and do optimizations related to that pointer accordingly. Given that situations like the above are hopefully somewhat rare, I'm not buying the story about a major performance impact without benchmarks.<br> <p> I also don't understand what OS support would be needed? 
Or are you assuming that on mainstream HW the OS would emulate CHERI and "manually" keep track of provenance in the kernel, somehow?<br> <p> Perhaps this is where we disagree; I see the provenance proposals mainly as an effort to codify existing practices by optimizing compilers. I see CHERI as a separate effort trying to make computing safer by detecting violations at runtime, using provenance as a crucial tool to implement said detection. And there has been some collaboration between the CHERI folks and the ones writing the provenance proposals (which is very nice, I would very much like to see CHERI or something like it becoming mainstream, and it would be a bummer if C would go in an incompatible direction). But I don't think that if any of the provenance proposals is adopted, that it would require implementations to somehow emulate CHERI on non-CHERI hw. C-with-provenance on mainstream hardware would still be as dangerous and error-prone as it is today. Just hopefully with a bit less ambiguity whether something the compiler does is a miscompilation or perfectly allowed, once compilers implement the provenance rules.<br> </div> Mon, 30 Sep 2024 07:19:07 +0000 C can't do provenance https://lwn.net/Articles/992181/ https://lwn.net/Articles/992181/ joib <div class="FormattedComment"> <span class="QuotedText">&gt; C23 explicitly says that intptr_t is optional. If you don't provide it, then there is no line in the standard requiring pointer-to-integer conversions to be possible.</span><br> <p> I believe if the implementation provides an integer type large enough to hold a pointer, it must be possible to do such a pointer-to-integer conversion. 
So even without specifically intptr_t (which, in the history of C, is a recent-ish invention anyway as it was introduced only in C99) it can be done, and many C implementations throughout history have done so.<br> <p> That being said, I think you're correct in that there's nothing in the standard requiring an implementation to provide an integer type large enough to store a pointer. But that gets into the distinction between the standard and the fact that a lot of C code out there is written under the assumption that such an integer type exists.<br> <p> Now, CHERI C is a bit special in that they make intptr_t contain the bounds and capability tag, making it possible to do pointer&lt;=&gt;integer roundtrips only with that type. That's probably a good practical compromise between the purity of the capability model, standards conformance, and still allowing roundtripping with a modest porting effort.<br> <p> <span class="QuotedText">&gt; Please link to the proposal you are discussing, Google can't find anything by that name. </span><br> <p> It's a typo, I meant PNVI (*sigh*). I think the latest proposal is n3005 at https://open-std.org/JTC1/SC22/WG14/www/docs/n3005.pdf . That link doesn't work for me at the moment but you can find it in the Wayback Machine.<br> <p> <span class="QuotedText">&gt; As I have repeatedly explained throughout this thread, provenance is not an optimization. It is a hardware constraint. You can't simply turn it off in difficult cases, because the dereference will trap whether the compiler wants it to or not.</span><br> <p> Well, for CHERI it's a hardware constraint. But like it or not, non-CHERI hw will be the vast majority for the foreseeable future, and AFAIU there's no plan to make C-with-provenance (if that ever happens) non-implementable on such hardware. 
For mainstream environments, the practical effect of provenance is to provide compiler writers with guidance on what kinds of optimizations are allowed.<br> </div> Mon, 30 Sep 2024 06:27:45 +0000 C can't do provenance https://lwn.net/Articles/992177/ https://lwn.net/Articles/992177/ Cyberax <div class="FormattedComment"> <span class="QuotedText">&gt; Yes, where the pointer came from is the whole point of provenance. That's because if the pointer came "from nowhere," or from an unrelated pointer, then on CHERI it will not have the correct metadata (or any metadata), and so dereferencing it will trap. </span><br> <p> This can technically happen with "far" pointers on the 32-bit segmented x86 architecture. Simply reading or trying to create a pointer to an invalid segment can cause an exception.<br> </div> Sun, 29 Sep 2024 22:30:08 +0000 C can't do provenance https://lwn.net/Articles/992176/ https://lwn.net/Articles/992176/ NYKevin <div class="FormattedComment"> Just to further clarify: I am aware that provenance proposals usually do have an escape hatch for "exposed" addresses. But to my understanding, such an escape hatch does not exist on CHERI or other hardware, so any such escape hatch would need to be emulated by the kernel if it's going to work at all. 
While that is technically possible to implement, it leaves a ton of performance on the table and requires OS support.<br> </div> Sun, 29 Sep 2024 22:11:39 +0000 C can't do provenance https://lwn.net/Articles/992175/ https://lwn.net/Articles/992175/ NYKevin <div class="FormattedComment"> <span class="QuotedText">&gt; I'm 75% sure that comparing the pointer value of a freed pointer is already UB, although it's somewhat common in practice.</span><br> <p> Unfortunately, text in <a href="https://en.cppreference.com/w/c/memory/free">https://en.cppreference.com/w/c/memory/free</a> misled me into thinking the standard allowed this (and then I couldn't find language in the draft standard directly contradicting it).<br> <p> <span class="QuotedText">&gt; Eh, I don't think this will fly at all. Like it or not, pointer&lt;=&gt;integer conversions and roundtrips are a fact of life in the C world, and any proposal must continue to support them.</span><br> <p> C23 explicitly says that intptr_t is optional. If you don't provide it, then there is no line in the standard requiring pointer-to-integer conversions to be possible.<br> <p> <span class="QuotedText">&gt; No, why? In the PVNI proposals </span><br> <p> Please link to the proposal you are discussing, Google can't find anything by that name. It did find a link to something under open-std.org titled "A Provenance-aware Memory Object Model for C," which does not contain the word "PVNI" anywhere on the page, but the page is not loading for me, so I can't examine it to determine whether it has anything to do with what you are saying.<br> <p> <span class="QuotedText">&gt; there's no requirement for perfect knowledge by the compiler, which, as you point out, is intractable. 
It just means the compiler must treat a pointer constructed in such a way as potentially aliasing any escaped pointer (called "exposed" in the PVNI proposals but AFAICS this is more or less the same thing as what compiler people call an address or pointer escaping).</span><br> <p> As I have repeatedly explained throughout this thread, provenance is not an optimization. It is a hardware constraint. You can't simply turn it off in difficult cases, because the dereference will trap whether the compiler wants it to or not.<br> </div> Sun, 29 Sep 2024 22:04:49 +0000 C can't do provenance https://lwn.net/Articles/992174/ https://lwn.net/Articles/992174/ NYKevin <div class="FormattedComment"> Then someone needs to correct <a href="https://en.cppreference.com/w/c/memory/free">https://en.cppreference.com/w/c/memory/free</a>, which says the following:<br> <p> <span class="QuotedText">&gt; The behavior is undefined if after free() returns, an access is made through the pointer ptr (unless another allocation function happened to result in a pointer value equal to ptr).</span><br> </div> Sun, 29 Sep 2024 21:59:32 +0000 C can't do provenance https://lwn.net/Articles/992173/ https://lwn.net/Articles/992173/ NYKevin <div class="FormattedComment"> Yes, where the pointer came from is the whole point of provenance. That's because if the pointer came "from nowhere," or from an unrelated pointer, then on CHERI it will not have the correct metadata (or any metadata), and so dereferencing it will trap. The Rust provenance rules are intended to make it possible for a (currently hypothetical) CHERI backend to emit the necessary instructions to preserve every pointer's metadata.<br> <p> What I'm getting at is that provenance is not an optimization technique. 
It is a hardware constraint.<br> </div> Sun, 29 Sep 2024 21:55:58 +0000 This is why so many programs use extensions https://lwn.net/Articles/992170/ https://lwn.net/Articles/992170/ netbsduser <div class="FormattedComment"> Yes, the language the ISO standard defines is, to quote Dennis Ritchie himself, "an unreal language that no one can or will actually use". Needless to say if you can't rely on pointers at least mostly looking like and behaving in use as addresses, you can't write an operating system in the language. How would you ever deal with memory-mapped I/O, for instance?<br> </div> Sun, 29 Sep 2024 19:55:55 +0000 C can't do provenance https://lwn.net/Articles/992133/ https://lwn.net/Articles/992133/ SLi <p>I do think "where the pointer came from" is a big part of it necessarily. Here's how Rust Unsafe Code Guidelines define it (it admits that "The exact form of provenance in Rust is unclear"): <p><a href="https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#pointer-provenance">https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#pointer-provenance</a> Sun, 29 Sep 2024 10:43:51 +0000 C can't do provenance https://lwn.net/Articles/992125/ https://lwn.net/Articles/992125/ NYKevin <div class="FormattedComment"> As I explained, the compiler is required to allow the programmer to reconstitute an arbitrary valid pointer from non-pointer data in multiple ways. So the optimization you describe is only possible if the compiler can either prove that you have not done that in a particular case, or if it can somehow prove that the pointers must not alias despite the fact that they may not obey strict provenance rules. In other words, this optimization only works in "easy" cases, and does not cover all legal uses of pointers. The compiler is required to disable it if it is not provably correct in a given case.<br> <p> Provenance is not a matter of optimization. It is not "optional," and you cannot simply turn it off when it becomes inconvenient. 
It is a real feature of some hardware (e.g. CHERI) that makes it impossible to dereference a possibly-invalid pointer. On that hardware, manufacturing (and dereferencing) an arbitrary pointer out of non-pointer data traps. Walking off the end of an array traps. UAF and double free both trap. And there are probably several other things that trap, but I think you get the idea. All of these traps happen even if the pointer is numerically equal to a valid pointer that exists elsewhere in the program, and could be validly dereferenced at the same exact address. This is because, on that hardware, every pointer has associated metadata that tells the hardware whether it is valid, and over what range of addresses, so the hardware can check every dereference for validity (similar to Valgrind's memcheck tool, but much stricter because it is not required to be compatible with C). In the context of programming languages, "pointer provenance" generally means the set of restrictions that must be enforced in order for the language to be compatible with CHERI and similar architectures. I have not seen the term used to refer generically to knowing where some specific pointer came from - that's usually called escape analysis or pointer analysis.<br> </div> Sun, 29 Sep 2024 09:44:07 +0000 C can't do provenance https://lwn.net/Articles/992084/ https://lwn.net/Articles/992084/ foom <div class="FormattedComment"> <span class="QuotedText">&gt; comparing the pointer value of a freed pointer is already UB</span><br> <p> Correct.<br> <p> From C23 "6.2.4 Storage durations of objects" (earlier versions say effectively the same):<br> <span class="QuotedText">&gt; If a pointer value is used in an evaluation after the object the pointer points to (or just past) reaches the end of its lifetime, the behavior is undefined. 
The representation of a pointer object becomes indeterminate when the object the pointer points to (or just past) reaches the end of its lifetime.</span><br> </div> Sat, 28 Sep 2024 03:54:51 +0000 C can't do provenance https://lwn.net/Articles/992077/ https://lwn.net/Articles/992077/ SLi <div class="FormattedComment"> Yes, it's definitely possible to do alias analysis (or even remove the need for it) without pointer provenance, in general. I just don't believe it's possible for C or C++.<br> </div> Fri, 27 Sep 2024 22:36:09 +0000 C can't do provenance https://lwn.net/Articles/992074/ https://lwn.net/Articles/992074/ joib <div class="FormattedComment"> <span class="QuotedText">&gt; Neither Rust nor C(++) depend on provenance to do alias analysis.</span><br> <p> I don't think this is correct (I don't know enough about the Rust compiler to say much about its internal workings, so the following applies to C(++)). For a trivial example, for something like<br> <p> int *a = malloc(10);<br> int *b = malloc(10);<br> <p> the alias analysis can determine that a and b point to disjoint objects. They may not call it provenance, as that term became popular only relatively recently, but what is that if not making use of provenance to help alias analysis? The various provenance proposals are mostly about formalizing what compilers are already doing, and of course covering all the corner cases (well, more of them at least).<br> <p> Type-based alias analysis is another bit of data the compiler can use to implement alias analysis, but not the only one.<br> </div> Fri, 27 Sep 2024 21:34:21 +0000 C can't do provenance https://lwn.net/Articles/992073/ https://lwn.net/Articles/992073/ joib <div class="FormattedComment"> <span class="QuotedText">&gt; Most of the (1) do not realize how little performance there will be left without alias analysis, which cannot be done without pointer provenance.</span><br> <p> Do we have some quantitative data on this? I'm not aware of any. 
-fno-strict-aliasing disables only part of the alias analysis machinery (TBAA), but its impact seems moderate for most codebases.<br> </div> Fri, 27 Sep 2024 21:11:43 +0000 C can't do provenance https://lwn.net/Articles/992070/ https://lwn.net/Articles/992070/ joib <div class="FormattedComment"> It's been a while, but I'm not sure my recollection of the various provenance suggestions matches yours.<br> <p> <span class="QuotedText">&gt; * Make == compare pointers for provenance, which either requires hardware support (e.g. on CHERI or the like) or fat pointers with runtime allocator support (i.e. the allocator has to give every allocation a unique ID which is not reused on free and "never" overflows). This is necessary because the standard says that, if malloc does not fail, then the following code may not be UB on any code path (and the assert may not fire): int *x = malloc(sizeof(int)); free(x); int *y = malloc(sizeof(int)); if(x == y){ *x = 2; assert(*y == 2);}. The standard also says that == is transitive and so if we already knew that x == z, then you can dereference z instead of x, which effectively means that we can't give == a secret side effect that somehow propagates provenance from y to x (because z still would not have provenance).</span><br> <p> I'm 75% sure that comparing the pointer value of a freed pointer is already UB, although it's somewhat common in practice.<br> <p> <span class="QuotedText">&gt; document that your pointers cannot be represented by any integral type</span><br> <p> Eh, I don't think this will fly at all. Like it or not, pointer&lt;=&gt;integer conversions and roundtrips are a fact of life in the C world, and any proposal must continue to support them.<br> <p> <span class="QuotedText">&gt; But that's not good enough. A pointer is an object, and like any object, you may inspect the pointer's object representation through a char*. </span><br> <p> Yes. 
I think this makes the "PVI" provenance proposal intractable, since it's not feasible for a compiler to keep track of provenance via arbitrary integer manipulations (arithmetic, IO, IPC, etc.). Hence these various "PVNI" proposals that seem to be the ones the C committee is looking more seriously at.<br> <p> <span class="QuotedText">&gt; I can only think of one loophole that might be used to prohibit this, but it's a real stretch: You could declare that pointers produced in this manner are trap representations, and thus UB to create. </span><br> <p> No, why? In the PVNI proposals there's no requirement for perfect knowledge by the compiler, which, as you point out, is intractable. It just means the compiler must treat a pointer constructed in such a way as potentially aliasing any escaped pointer (called "exposed" in the PVNI proposals but AFAICS this is more or less the same thing as what compiler people call an address or pointer escaping).<br> </div> Fri, 27 Sep 2024 21:06:10 +0000 C can't do provenance https://lwn.net/Articles/992051/ https://lwn.net/Articles/992051/ NYKevin <div class="FormattedComment"> <span class="QuotedText">&gt; Most of the (1) do not realize how little performance there will be left without alias analysis, which cannot be done without pointer provenance.</span><br> <p> Neither Rust nor C(++) depend on provenance to do alias analysis. Rust uses lifetime-based alias analysis (i.e. if you have &amp;mut T, it may not alias anything, and if you have &amp;T, the pointee must be immutable or protected by an UnsafeCell) and C and C++ both use type-based alias analysis (i.e. if you have two pointers to distinct types, and neither type is char or a variation of char, then the pointers may not alias). In the case of Rust, it is difficult to uphold those invariants without some degree of provenance, but Rust handles this by splitting the language into safe and unsafe Rust. 
In safe Rust, borrow checking is far stricter than mere provenance, and in unsafe Rust, there is no such thing as provenance - you can manufacture whatever pointers or references you like, as long as any such references obey the aliasing and lifetime requirements (that is, a reference must always point at a valid allocation for the entire duration of the reference's lifetime, plus the two aliasing requirements mentioned before).<br> <p> Rust does have a provenance model documented in its ptr module, but it is non-normative and experimental (according to that very same documentation), and there's almost no information about it in the Rustonomicon. Based on a previous discussion we've had on this site, it is my understanding that some people take the view that it is wrong to claim that Rust has no provenance, because of the existence of this non-normative and experimental model. I disagree with that position but will mention it for completeness (and to save those very same people the trouble of telling me that I'm wrong in comment replies). 
What I think we can agree on, regardless, is the fact that Rust's aliasing analysis, as it is currently implemented in stable versions of the compiler, is not dependent on this (or any other) provenance model.<br> </div> Fri, 27 Sep 2024 17:24:51 +0000 C can't do provenance https://lwn.net/Articles/991946/ https://lwn.net/Articles/991946/ Wol <div class="FormattedComment"> Probably the only way round it is to declare a new --pointer-provenance flag - which cannot be the default - which then says any and all integer&lt;-&gt;pointer conversions that can be detected will be treated as pointer provenance errors by the compiler.<br> <p> Bit like the laws of the Medes and the Persians :-) You're not declaring such conversions illegal, you're just saying that you're not allowed to use them in combination with pointer provenance.<br> <p> Cheers,<br> Wol<br> </div> Fri, 27 Sep 2024 08:37:58 +0000 C can't do provenance https://lwn.net/Articles/991934/ https://lwn.net/Articles/991934/ SLi <div class="FormattedComment"> Fixed? But I don't think we have sane pointer provenance in C++ either, although it's in a better shape than C?<br> <p> This, I believe, is the problem. There are the classical two kinds of people:<br> <p> 1. Those who treat C as a high level assembler<br> 2. The abstract machine types.<br> <p> Most of the (1) do not realize how little performance there will be left without alias analysis, which cannot be done without pointer provenance.<br> <p> Many or most of (2) probably don't realize pointer provenance likely cannot be done in C or C++ as they exist.<br> <p> What compilers end up doing is assuming that we can do pointer provenance anyway, and hope that we don't run into impossible to debug absurdities because of the contradiction. 
I don't think that's a good way to do this, either...<br> </div> Fri, 27 Sep 2024 03:34:07 +0000 This is why so many programs use extensions https://lwn.net/Articles/991932/ https://lwn.net/Articles/991932/ DemiMarie <div class="FormattedComment"> In the Linux kernel, many of those undefined behaviors are actually well-defined, thanks to -fno-strict-aliasing, -fno-strict-overflow, -fno-delete-null-pointer-checks, and possibly some other compiler flags. Linux does not even try to conform to the C standard’s requirements for pointer arithmetic, and I agree with that decision.<br> </div> Fri, 27 Sep 2024 02:18:51 +0000 C can't do provenance https://lwn.net/Articles/991926/ https://lwn.net/Articles/991926/ NYKevin <div class="FormattedComment"> <span class="QuotedText">&gt; Just look at how convoluted it gets when they try to retrofit something like pointer provenance.</span><br> <p> After spending way too much time thinking about this, and a fair amount of time scrutinizing the draft standard, I do not believe provenance is feasible in C23. For a start, if you wanted to make a C implementation with full provenance, you would have to do at least the following:<br> <p> * Make == compare pointers for provenance, which either requires hardware support (e.g. on CHERI or the like) or fat pointers with runtime allocator support (i.e. the allocator has to give every allocation a unique ID which is not reused on free and "never" overflows). This is necessary because the standard says that, if malloc does not fail, then the following code may not be UB on any code path (and the assert may not fire): int *x = malloc(sizeof(int)); free(x); int *y = malloc(sizeof(int)); if(x == y){ *x = 2; assert(*y == 2);}. 
The standard also says that == is transitive and so if we already knew that x == z, then you can dereference z instead of x, which effectively means that we can't give == a secret side effect that somehow propagates provenance from y to x (because z still would not have provenance).<br> * Do not provide (u)intptr_t at all, and document that your pointers cannot be represented by any integral type (not even size_t or intmax_t). This makes all pointer-to-integer casts UB, which is necessary because the standard specifies that once such a cast has legally happened, the rest of the round-trip must produce a pointer that compares equal to the original (and therefore, as we just saw, it can be used to access the same allocation, effectively removing provenance altogether since you can reconstitute a pointer from non-pointer data). If you're feeling generous, you can emit an error on such casts (let's not get into the language-lawyering about whether or not you have to prove that all code paths reach the cast before you can refuse to compile it).<br> <p> But that's not good enough. A pointer is an object, and like any object, you may inspect the pointer's object representation through a char*. The standard also specifies that objects other than float NaNs must compare equal if their object representations are identical. So the enterprising programmer is allowed to convert a pointer into raw bytes and then reconstitute the pointer elsewhere from those bytes, and since the pointer is required to compare equal to the original, it also must be possible to dereference it and access the same object. You can even do something really crazy, like writing the object representation into some arbitrarily-complicated IPC mechanism, and then having another part of your program retrieve it. 
CHERI is obviously not going to support that, even if it could somehow track simple byte copies around your local address space.<br> <p> I can only think of one loophole that might be used to prohibit this, but it's a real stretch: You could declare that pointers produced in this manner are trap representations, and thus UB to create. The problem with that is that the standard explicitly specifies that trap representations are not valid object representations, so the programmer is within their rights to assume that a representation they got from a valid pointer (or any other properly initialized object) is not a trap representation. There does not appear to be any wiggle room in the standard for the same object representation sometimes being valid and sometimes being a trap, and I think it would be an absurd reading of the standard to allow such a thing.<br> <p> This could be fixed by amending the standard, most likely by stealing C++'s notion of a "trivially copyable type" and stating that pointers are trivially copyable unless the implementation specifies otherwise (and all other types are always trivially copyable, with the usual float NaN caveat). Of course, pointers are trivially copyable even in C++, so you'd presumably want to amend that standard as well, if you want provenance to exist in C++.<br> </div> Thu, 26 Sep 2024 22:47:11 +0000 C is not as simple as it seems https://lwn.net/Articles/991928/ https://lwn.net/Articles/991928/ Cyberax <div class="FormattedComment"> Technically, pointers can have attributes in C. 
Like the bad old FAR and NEAR pointers in x86-16.<br> </div> Thu, 26 Sep 2024 20:38:47 +0000 C is not as simple as it seems https://lwn.net/Articles/991924/ https://lwn.net/Articles/991924/ intelfx <div class="FormattedComment"> Correction, _potentially_ overlapping regions.<br> </div> Thu, 26 Sep 2024 20:23:10 +0000 C is not as simple as it seems https://lwn.net/Articles/991923/ https://lwn.net/Articles/991923/ intelfx <div class="FormattedComment"> <span class="QuotedText">&gt; What would be the use case for pointer arithmetic between things that are in 2 different memory allocations?</span><br> <p> Memmove between overlapping regions?<br> </div> Thu, 26 Sep 2024 20:22:35 +0000 C is not as simple as it seems https://lwn.net/Articles/991921/ https://lwn.net/Articles/991921/ NYKevin <div class="FormattedComment"> I'm not saying all (or even most) of these things have actual use cases. I'm just saying that pointers have a lot of complicated rules that are often not taught correctly or at all. Students are told that pointers are integers, and they are not.<br> </div> Thu, 26 Sep 2024 19:58:52 +0000 "the right color of hair" https://lwn.net/Articles/991915/ https://lwn.net/Articles/991915/ carlosrodfern <div class="FormattedComment"> It could also be just a sarcastic or humorous statement to introduce the topic. Dirk Hohndel has lots of gray hair too :D.<br> </div> Thu, 26 Sep 2024 19:04:44 +0000 "the right color of hair" https://lwn.net/Articles/991909/ https://lwn.net/Articles/991909/ jake <div class="FormattedComment"> <span class="QuotedText">&gt; Nevertheless, he started with such a statement, which in general reflects the sentiment of many in this industry.</span><br> <p> I personally think he was just commenting on the fact that the color of (and/or amount of) kernel maintainers' hair has changed over the years. 
I pretty strongly doubt he was making an ageist statement, anyway ...<br> <p> jake<br> </div> Thu, 26 Sep 2024 16:54:31 +0000 "the right color of hair" https://lwn.net/Articles/991901/ https://lwn.net/Articles/991901/ carlosrodfern <div class="FormattedComment"> I did get the point from the beginning. Linus corrected him, and then he clarified his point. Nevertheless, he started with such a statement, which in general reflects the sentiment of many in this industry.<br> </div> Thu, 26 Sep 2024 15:13:19 +0000 C is not as simple as it seems https://lwn.net/Articles/991821/ https://lwn.net/Articles/991821/ smurf <div class="FormattedComment"> On a Harvard architecture system (like embedded 8-bit Atmel CPUs) plain C doesn't know *anything* about which part of the architecture your pointer refers to. It's just a value in a register. Literally everything you do with pointers that refer to the "wrong" address space (e.g. read-only strings stored in Flash) requires compiler-specific extensions, macros, and/or special functions.<br> <p> Fortunately GCC has some extensions that warn you if you use the wrong kind of pointer, but for any new project on these things I'd use Rust instead.<br> </div> Thu, 26 Sep 2024 10:36:12 +0000 "the right color of hair" https://lwn.net/Articles/991820/ https://lwn.net/Articles/991820/ farnz <p>I think you're possibly missing the point, though. Having the 80/90 year olds around and helping is awesome - they've got a depth of experience; however, would you bet the survival of your company on the 80+-year-old engineer being around for the next 10 years? 20 years? 50 years? <p>At some point, the older generation is going to become unavailable to work on things. If the plan for the future is "I'll never retire, I'll never die, I'll never suffer a nasty disease like dementia", your project has a problem. If the plan is "well, we've got 3 employees under 50 who between them could replace the 80+ employee if necessary", you've got a future. 
Thu, 26 Sep 2024 10:31:31 +0000 C is not as simple as it seems https://lwn.net/Articles/991814/ https://lwn.net/Articles/991814/ joib <div class="FormattedComment"> Zero chance of this ever happening, but occasionally I think the ISO C committee should just give up on the idea of a somewhat high level abstract machine. Just say that memory is a big array, and pointers really are just integers. Get rid of TBAA. Etc. In essence, make C be the "portable macro-assembler" that many people stubbornly believe it is.<br> <p> Just look at how convoluted it gets when they try to retrofit something like pointer provenance.<br> <p> Of course for architectures with segmented memory, Harvard architectures, etc. you might have several of these big arrays representing the system memory, and there you need some implementation-defined logic for how comparing pointers to different segments would work. If you want a language with a more abstract view of the machine, choose another language designed with that in mind from the beginning. Like Rust?<br> </div> Thu, 26 Sep 2024 09:32:10 +0000