LWN: Comments on "Better types in C using sparse and smatch" https://lwn.net/Articles/696624/ This is a special feed containing comments posted to the individual LWN article titled "Better types in C using sparse and smatch". en-us Sat, 04 Oct 2025 03:12:27 +0000 Sat, 04 Oct 2025 03:12:27 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Better types in C using sparse and smatch https://lwn.net/Articles/698305/ https://lwn.net/Articles/698305/ damien.lespiau <div class="FormattedComment"> Something I always wanted: unit annotations, and verification that expressions are then homogeneous. Similarly, making sure we don't assign a value or pass a function argument in Hz when we expect kHz, ...<br> </div> Fri, 26 Aug 2016 09:44:27 +0000 Weird https://lwn.net/Articles/698083/ https://lwn.net/Articles/698083/ mathstuf <div class="FormattedComment"> Do you have links to back this up? It seems odd that there'd be a `std::process` module in the standard library[1] if using it with the standard library causes problems (at least without some kind of documentation).<br> <p> [1]<a href="https://doc.rust-lang.org/nightly/std/process/index.html">https://doc.rust-lang.org/nightly/std/process/index.html</a><br> </div> Tue, 23 Aug 2016 20:43:19 +0000 Weird https://lwn.net/Articles/697718/ https://lwn.net/Articles/697718/ ncm <div class="FormattedComment"> Sorry, that was supposed to be "decrypting, rendering, decompressing, deserializing... ".<br> </div> Fri, 19 Aug 2016 15:15:51 +0000 Weird https://lwn.net/Articles/697574/ https://lwn.net/Articles/697574/ andresfreund <div class="FormattedComment"> <font class="QuotedText">&gt; Or, trying to write my pet project in C++, I'm left wondering how on earth I interact with the hardware to the extent that I actually know what is going on at the hardware level?</font><br> <p> Huh? 
There's no difference between C and C++ on that end of things.<br> </div> Thu, 18 Aug 2016 16:37:05 +0000 Weird https://lwn.net/Articles/697573/ https://lwn.net/Articles/697573/ Wol <div class="FormattedComment"> <font class="QuotedText">&gt; There is no defensible reason for a programmer competent in C to choose it over C++ for a new program. </font><br> <p> Why then, looking at lilypond and libreoffice C++ code, do I think "what the hell is going on here", yet when I looked at the (C) code for mdadm, I felt at home straight away?<br> <p> Or, trying to write my pet project in C++, I'm left wondering how on earth I interact with the hardware to the extent that I actually know what is going on at the hardware level?<br> <p> Cheers,<br> Wol<br> </div> Thu, 18 Aug 2016 16:26:43 +0000 Weird https://lwn.net/Articles/697528/ https://lwn.net/Articles/697528/ tuna <div class="FormattedComment"> If you want to make libraries that are usable from many different languages it is probably easier to use C than C++.<br> </div> Thu, 18 Aug 2016 12:07:21 +0000 Weird https://lwn.net/Articles/697514/ https://lwn.net/Articles/697514/ micka <div class="FormattedComment"> All I found myself was this issue:<br> <a href="https://github.com/rust-lang/rust/issues/16799">https://github.com/rust-lang/rust/issues/16799</a><br> <p> Especially from comment<br> <a href="https://github.com/rust-lang/rust/issues/16799#issuecomment-171170041">https://github.com/rust-lang/rust/issues/16799#issuecomme...</a><br> <p> which states (if I understand correctly) that it was unsafe to fork when Rust used a runtime, but when the runtime was removed, the only problem left was the hashmap implementation using an RNG with a shared seed (the RNG being used to prevent DoS by hashmap collision).<br> </div> Thu, 18 Aug 2016 09:46:15 +0000 Weird https://lwn.net/Articles/697510/ https://lwn.net/Articles/697510/ farnz <p>I can't find any such limitation in versions of Rust post the decision to not use a green-threading 
model. In prerelease versions, the userspace thread manager could get confused by <tt>fork()</tt>, but the thread manager has gone away. Thu, 18 Aug 2016 08:55:05 +0000 Weird https://lwn.net/Articles/697493/ https://lwn.net/Articles/697493/ Cyberax <div class="FormattedComment"> <font class="QuotedText">&gt; Yet, standard library usage in forking programs seems to be considered undefined behaviour, including the loss of all memory-safety guarantees.</font><br> Really? How? Borrow checker is entirely compile time and after forking the new copy will go on independently.<br> <p> Standard library does NOT run any background threads and RNG duplication might be an expected outcome.<br> <p> I can't find any recent admonitions to not use libstd in forking programs and having actually used it, I kinda doubt that there are any serious issues.<br> </div> Thu, 18 Aug 2016 03:32:25 +0000 Weird https://lwn.net/Articles/697490/ https://lwn.net/Articles/697490/ lsl <div class="FormattedComment"> Yet, standard library usage in forking programs seems to be considered undefined behaviour, including the loss of all memory-safety guarantees. You're supposed to use #![no_std] and libcore only. The reason seems to be that libstd code might kick off threads (IO-related modules?) or get its RNG state duplicated on fork or a host of other things.<br> <p> So while the new Rust with mostly-excised runtime itself might be used in forking programs, touching the standard library is still considered to result in nasal demons by its developers.<br> </div> Thu, 18 Aug 2016 02:02:12 +0000 Weird https://lwn.net/Articles/697488/ https://lwn.net/Articles/697488/ Cyberax <div class="FormattedComment"> Standard library code is not dependent on any runtime except for jemalloc ( <a href="https://doc.rust-lang.org/book/custom-allocators.html">https://doc.rust-lang.org/book/custom-allocators.html</a> ). 
There is no "life before main()" of any kind and the runtime doesn't store any state at all if you disable unwinding.<br> </div> Wed, 17 Aug 2016 23:54:44 +0000 Weird https://lwn.net/Articles/697485/ https://lwn.net/Articles/697485/ lsl <div class="FormattedComment"> Only if you forego using any standard library code. If you want to fork, the stdlib is verboten.<br> </div> Wed, 17 Aug 2016 23:40:39 +0000 Weird https://lwn.net/Articles/697483/ https://lwn.net/Articles/697483/ Cyberax <div class="FormattedComment"> ?<br> <p> You can fork as much as you want with Rust. It doesn't create any threads behind the scenes.<br> <p> Of course, if you use TLS or create threads yourself then you're on your own.<br> </div> Wed, 17 Aug 2016 23:06:30 +0000 Weird https://lwn.net/Articles/697481/ https://lwn.net/Articles/697481/ lsl <div class="FormattedComment"> <font class="QuotedText">&gt; Rust also is really easy to expose to C or C++, it doesn't have any runtime with garbage collector or static initializers that live before main(). You really can treat it as a safe version of C.</font><br> <p> Not quite, as you must not fork a program linked to Rust code. You only get a spawn-like interface where the runtime takes care to safely fork the program, followed by an immediate call to exec (just like with Go).<br> <p> Did this really change recently? 
Not long ago, the Rust developers' position was something along the lines of "fork won't ever be safe to do in Rust".<br> </div> Wed, 17 Aug 2016 22:46:49 +0000 Weird https://lwn.net/Articles/697361/ https://lwn.net/Articles/697361/ flussence <div class="FormattedComment"> <font class="QuotedText">&gt; As for the wrapping part, there is value in leveraging existing libraries and C is the current lowest common denominator (try using a Python or Ruby library from any of C, JavaScript, D, or Perl).</font><br> <p> Easy enough: <a href="https://metacpan.org/search?size=20&amp;q=Inline%3A%3A&amp;search_type=modules">https://metacpan.org/search?size=20&amp;q=Inline%3A%3A&amp;...</a><br> <p> (Although given the general quality of Ruby code in the wild, it's probably safer for the internet if we don't try to take it out of its native environment of locked down containers…)<br> </div> Tue, 16 Aug 2016 19:29:01 +0000 Weird https://lwn.net/Articles/697296/ https://lwn.net/Articles/697296/ ncm <div class="FormattedComment"> The best choice of library to first code in Rust, to get immediate reward for the effort, is one that processes untrusted input -- decrypting, rendering, decrypting, deserializing, taking remote commands. Such plugins account for a majority of vulnerabilities in Firefox. Of course you still have to fuzz them, but failures are much easier to account for when you know they haven't corrupted random memory, and success is easier to trust.<br> </div> Tue, 16 Aug 2016 03:12:56 +0000 Weird https://lwn.net/Articles/697274/ https://lwn.net/Articles/697274/ halla <div class="FormattedComment"> Yes, indeed -- that's why I would like to start with one library, and maybe even just do a Qt-based interface wrapper around that. We've always kept our code split up nicely, so it should be possible. 
And I'm so sick and tired of ambiguous ownership...<br> </div> Mon, 15 Aug 2016 19:56:11 +0000 Weird https://lwn.net/Articles/697273/ https://lwn.net/Articles/697273/ Cyberax <div class="FormattedComment"> Rust can interface with C easily and there are code generators to create automatic bindings for C libraries.<br> <p> Rust also is really easy to expose to C or C++, it doesn't have any runtime with garbage collector or static initializers that live before main(). You really can treat it as a safe version of C.<br> <p> Of course, translating a huge codebase is not going to be easy. To take advantage of Rust you really need to encode Rust's notion of ownership into the interface with C/C++ and that's not always trivial.<br> </div> Mon, 15 Aug 2016 19:49:34 +0000 Weird https://lwn.net/Articles/697272/ https://lwn.net/Articles/697272/ halla <div class="FormattedComment"> This post really helps sell Rust to me... My particular problem is that I maintain a million-line application written in C++ that consists of a dozen or two libraries, a hundred or so plugins, and those libraries use C++ or C libraries. I would like to experiment with rewriting the most core library in something like Rust, but that still means that that library needs to:<br> <p> a) use a C library<br> b) handle file io and other standard stuff<br> c) provide a base for the C++ libraries to build onto<br> d) make it possible to write plugins in C++ or Python that this core library can load<br> <p> I'm sure that a) and b) are provided for -- but I cannot figure out whether c) and d) are possible.<br> </div> Mon, 15 Aug 2016 19:39:52 +0000 Weird https://lwn.net/Articles/697267/ https://lwn.net/Articles/697267/ excors <div class="FormattedComment"> I think one important feature of modern language design is a recognition of the substantially different high-level concepts that are all handled in C using pointers. E.g. a C pointer can represent:<br> <p> * Ownership of an object (i.e. 
you are responsible for freeing it eventually)<br> * A non-owning reference to an object (you mustn't free it)<br> * Same as above but for arrays instead of individual objects<br> * A non-owning reference to a range of elements within an array<br> * A non-owning reference to memory of unspecified type (e.g. for memcpy)<br> * An optionally-present value<br> * A return value from a function<br> * Any arbitrary pointer-sized number that you happen to store as a pointer type<br> * Various other stuff (pointers to struct members, polymorphic types, etc)<br> <p> In C it's too easy for a programmer to lose track of the meaning of each pointer, so you get memory leaks (forgetting that a particular pointer is meant to own a resource), double-frees (thinking a non-owning reference owns its resource), null pointer crashes (some code thinks a value is optional, other code thinks it's required), use of uninitialised data (mixing up function inputs and outputs), etc.<br> <p> Languages like Java try to solve the symptoms of those bugs, not the root cause: they remove the distinction between owning and non-owning references by having the garbage collector treat every reference as a potential owner, so it usually doesn't matter if the programmer loses track (except when it does matter because there are resources other than memory), and they let you catch and ignore null pointer dereferences, and they remove the ability to point inside an array, etc, so they can claim the language is safe.<br> <p> (From a brief inspection, it seems Go is nearly as poor as Java, except it adds array slices.)<br> <p> C++ adds features that can represent some of the concepts much more cleanly: RAII objects that enforce ownership (with lifetime determined by scope or by some parent object), "T&amp;" reference types for non-owning non-optional references, std::vector for arrays. 
C++11 adds unique_ptr for ownership with arbitrary lifetimes, shared_ptr for when you can't define a single owner, "T&amp;&amp;" for transfer of ownership, std::array, etc.<br> <p> I think it's generally possible for well-written C++11 code to almost entirely avoid raw pointers, which will make it easier to understand and much less prone to memory-safety errors. But since C++ evolved from C over decades, it's not a very clean or coherent design, and it's happy to push you back onto raw pointers when you want something it doesn't support. But at least it's going some way in the right direction.<br> <p> I'm less familiar with Rust but I get the impression that it's solving this much more successfully, because it's designed around these concepts. Every object has an unambiguous owner, ownership can be transferred, there are "&amp;T" non-owning reference types, array slices, std::option for optional values, raw pointers when you need to do something weird (limited to explicitly unsafe scopes), etc. It's flexible enough to do anything you could do with pointers in C, and efficient enough to compile them down into the same instructions - but those concepts are fundamental parts of the language design, so the compiler can verify you're using them correctly and the libraries are all designed to work nicely with them, which is a major benefit.<br> <p> That does seem to make Rust harder to start using: you have to clearly understand all those different concepts, and the syntax for them, and how your code intends to use them, before you can write code the compiler will accept, whereas C lets you hack everything together with simple pointers and not worry about the details of ownership etc until a user reports a memory leak. But they aren't *new* concepts in Rust, they're ones any C programmer should already understand intuitively even if they don't recognise it in those terms.<br> <p> <p> Going back to the original article here, I suppose I don't really see "safe (i.e. 
non-NULL) pointer" as a step in the right direction towards a memory-safe version of C. It doesn't correspond to any of those fundamental concepts behind pointers, it's just describing a minor part of their mechanics, so it's kind of a dead end. A good solution would need much more substantial changes to the language, and then it would be as uncomfortable to C programmers as C++ and Rust are.<br> </div> Mon, 15 Aug 2016 18:37:29 +0000 Weird https://lwn.net/Articles/697221/ https://lwn.net/Articles/697221/ ncm <div class="FormattedComment"> Making C better led directly to C++. There is no defensible reason for a programmer competent in C to choose it over C++ for a new program. All it takes to start is file names with a *.cc suffix, and the right compiler. If you don't like some feature in C++, you are not obliged to use it in your program. But the prospect of faster, more reliably correct programs written more quickly is a benefit you cannot rationally justify avoiding. Pottering about with hacks on C to help you catch problems that C++ already eliminated a decade ago is a tragic waste of your short time on Earth.<br> <p> Learning Rust would certainly slow you down, for a while. Rust is mostly an opportunity for the next generation of serious programmers, and those who will teach them. But before you know it, the most interesting programs will be coded in Rust, and you will need to know it to read them. <br> </div> Mon, 15 Aug 2016 02:52:00 +0000 Weird https://lwn.net/Articles/697203/ https://lwn.net/Articles/697203/ neilbrown <div class="FormattedComment"> <font class="QuotedText">&gt; you can't learn Rust without learning new insights about the craft and nature of programming.</font><br> <p> I have no doubt that you are correct, but these new insights do not come for free. Much as I love learning new things, I know that my capacity to do this is limited so I need to pace myself. 
Had I decided to write this project in Rust, I am quite confident that I would not have progressed as far as I have. Sometimes it makes sense to work with what you've got, even if that is "C".<br> <p> Also, you are making an assumption that is worth highlighting. You are assuming that if some language is problematic, then the solution is to use a different language. I understand the thinking behind that assumption because programming languages have always effectively been isolated silos. But the "replace" approach doesn't always work so well: witness Python 3.<br> <p> Maybe there is another way. A significant strength of the Linux kernel project is the incremental approach to improvements. Today's kernel is very different from Linux 1.0, but it is still "the same Linux". What if we could do that with a language? The C standards process does to an extent, and "C11" is still "C", even though it is very different to K&amp;R C. But there are limits to how much change can happen there.<br> It has always been possible for different projects to use different versions of C, thanks to the macro pre-processor. Having "list_for_each_entry" and similar in the kernel is a real boon.<br> Having pluggable semantic checks could be seen as just another step in that sort of approach. Why are you so sure that replacing C is a better approach than making C better?<br> I like the familiarity and universality of C, and the safety of Rust. Why should I not want both?<br> <p> <font class="QuotedText">&gt; It's hard to believe...</font><br> <p> I would suggest that the evidence is against you there. My own observations tell me that people are, in general, quite capable of believing whatever they want to believe.<br> So I think you are really saying "I don't want to believe...". 
I assure you that I completely support your right to believe whatever you choose, but know that I will likely make different choices.<br> <p> </div> Sun, 14 Aug 2016 04:31:08 +0000 Weird https://lwn.net/Articles/697198/ https://lwn.net/Articles/697198/ mathstuf <div class="FormattedComment"> Even C doesn't do that in its standard (though you'll want to clarify "full capability"). Sure, it has inline assembly, but that's more a compiler thing than a language thing. Steve Klabnik is working on a kernel in Rust to teach how to write kernels (he's writing a book alongside it), so you can use Rust to at least use the nitty gritty assembly code you need to do things like turn on 64-bit mode or talk to the VGA. In fact, Rust lets you build type-safe wrappers around these abstractions that hide the peeks and pokes instead of passing around arbitrary magic values (such as certain pointer values) everywhere. Sure, that's possible in C too, but C compilers don't help you nearly as much.<br> <p> As for the wrapping part, there is value in leveraging existing libraries and C is the current lowest common denominator (try using a Python or Ruby library from any of C, JavaScript, D, or Perl). Here, though, Rust has the benefit of being able to export a C ABI so that you can use it as a base instead of C.<br> <p> You might want to check out Corrode which is a Haskell program for converting C code into Rust code. Not exactly idiomatic Rust, but it gets you up the massive step of even starting such a project.<br> </div> Sun, 14 Aug 2016 01:26:30 +0000 Weird https://lwn.net/Articles/697196/ https://lwn.net/Articles/697196/ Cyberax <div class="FormattedComment"> Mostly through aliasing analysis. 
A typical C code compiler has to assume that most pointers are aliased and has to generate extra load/store operations.<br> </div> Sat, 13 Aug 2016 22:52:51 +0000 Weird https://lwn.net/Articles/697195/ https://lwn.net/Articles/697195/ hummassa <div class="FormattedComment"> Rust and C++ are far more optimizable than C. You can write pretty good, semi-optimal code in C, but Rust and C++ can express much more succinctly some code that is far easier for the compiler to optimize.<br> </div> Sat, 13 Aug 2016 22:43:50 +0000 Weird https://lwn.net/Articles/697187/ https://lwn.net/Articles/697187/ ballombe <div class="FormattedComment"> As long as C is the only language to give access to the full capability of the hardware, I will need to write C code.<br> Each time you are using bindings to a C library, remember that someone had to write C code.<br> </div> Sat, 13 Aug 2016 20:33:28 +0000 Weird https://lwn.net/Articles/697181/ https://lwn.net/Articles/697181/ tao <div class="FormattedComment"> Rust &amp; C++ faster than C? Through magic? Or are the existing C-compilers worse than the C++ and Rust compilers?<br> </div> Sat, 13 Aug 2016 17:47:45 +0000 Weird https://lwn.net/Articles/697158/ https://lwn.net/Articles/697158/ ncm <div class="FormattedComment"> I can see using programs to try to shore up the quality of existing C programs that people depend on and that you can't afford to replace. I simply cannot imagine writing 18,000 lines of new C code that must then be decorated and analyzed by extra-lingual tools just to get the most basic defined behavior. <br> <p> For new code, the way to get correct programs is to write them correctly in the first place, in a language that doesn't go out of its way to make it hard to do that. Even C++ has a safe subset that encourages code that is faster than C and overwhelmingly more pleasant to write and read. 
For a modern experience, Rust is maturing nicely, is fully as fast as C++ (and faster than C), is ready now for personal projects, and should be ready for industrial use in only 5-10 years. Unlike, say, Java, you can't learn Rust without learning new insights about the craft and nature of programming.<br> <p> It's hard to believe that the population still coding C has not self-selected for those not interested in cultivating new understanding.<br> </div> Sat, 13 Aug 2016 02:26:25 +0000 Better types in C using sparse and smatch https://lwn.net/Articles/697018/ https://lwn.net/Articles/697018/ k3ninho <div class="FormattedComment"> <font class="QuotedText">&gt;It may allow us to automatically catch a lot more errors and provide reliable API documentation, but it might -- as James Bottomley feared -- end up as "a lot of pain, for what gain?"</font><br> <p> Free and open source software has taught me about software engineering as a way of solving a problem you have -- scratch your own itch. If James Bottomley can't see a problem to which well-defined API details are the solution, that doesn't discount that there is value in this metadata about the Linux kernel. If there's a strong culture of versioning the API, then we can retire unsafe interfaces in a controlled way and spin off the retired interfaces to their own abstraction layer. <br> <p> 'Don't break userspace' is caveman talk when you might instead have a daemon reading the expected-kernel-version and can log that it's using old, deemed-unsafe interfaces before dropping it into an isolated cgroup running through an abstraction layer that filters possible exploits. 
Or the use-patterns that come with a set of interface designs can be retired for a collection of interfaces that give a better mental model of the workflow you're trying to get Linux to do for you, or a collection of interfaces that are faster to process your data, or have feature-set collections of interfaces which are incompatible together but which achieve e.g. throughput vs latency goals for different users of the interfaces. The core part of that is 'we know what we promised you would work in release X.Y.Z', which is currently buried in git history rather than published clearly.<br> <p> (Filed under 'ideas are cheap, show the code or shut up'.)<br> K3n.<br> </div> Thu, 11 Aug 2016 15:09:18 +0000