Maintainer opinions on Rust-for-Linux
Miguel Ojeda gave a keynote at FOSDEM 2025 about the history of the Rust-for-Linux project, and the current attitude of people in the kernel community toward the experiment. Unlike his usual talks, this talk didn't focus so much on the current state of the project, but rather on discussing history and predictions for the future. He ended up presenting quotes from more than 30 people involved in kernel development about what they thought of the project and expected going forward.
Background information
Ojeda began by explaining Rust-for-Linux for those audience members who may not have been familiar with it, defining it as an attempt to add support for the Rust language to the Linux kernel. The project is not only about adding Rust to drivers, he said, it's about eventually having first-class support for the language in the core of the kernel itself. The long-term goal is for kernel maintainers to be able to choose freely between C and Rust.
Despite the enormity of such a task, many people are already building drivers on top of the in-progress Rust bindings. Ojeda took a moment to discuss why that is. The Linux kernel is already extensible, and introducing a new language has a huge cost — so why would people go to so much effort just to be able to write drivers in Rust, specifically?
In answering that question, Ojeda said he wanted to give a more satisfying answer than the usual justification of "memory safety".
He put up a piece of example code in C, asking whether the function did what the comment describes:

```
/// Returns whether the integer pointed by `a`
/// is equal to the integer pointed by `b`.
bool f(int *a, int *b)
{
	return *a == 42;
}
```
The developers in the audience were all fairly certain that the presented code did not, in fact, follow the comment. Ojeda put up the obvious corrected version, and then showed the equivalent code in Rust:
```
// Corrected C
bool f(int *a, int *b)
{
	return *a == *b;
}
```

The equivalent Rust versions, before and after being fixed:

```
fn f(a: &i32, b: &i32) -> bool {
    *a == 42
}

fn f(a: &i32, b: &i32) -> bool {
    *a == *b
}
```
The Rust functions are nearly identical to the C functions. In fact, they compile to exactly the same machine code. Apart from how a function declaration is spelled, there's no obvious difference between C and Rust here. And neither language helps the programmer catch the mismatch between the comment and the behavior of the function. So why would anyone want to write the latter instead of the former?
Ojeda's answer is: confidence. As a maintainer, when somebody sends in a patch, he may or may not spot that it's broken. In an obvious case like this, hopefully he spots it. But more subtle logic bugs can and do slip by. Logic bugs are bad, but crashes (or compromises) of the kernel are worse. With the C version, some other code could be relying on this function to behave as documented in order to avoid overwriting some part of memory. That's equally true of the Rust version. The difference for a reviewer is that Rust splits things into "safe" and "unsafe" functions — and the reviewer can concentrate on the unsafe parts, focusing their limited time and attention on the part that could potentially have wide-ranging consequences. If a safe function uses the incorrect version of f, it can still be wrong, but it's not going to crash. This lets the reviewer be more confident in their review.
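To make that reviewing point concrete, here is a sketch of my own (not from the talk): if the same comparison is written over raw pointers instead of references, the function has to be marked unsafe, and every caller needs an explicit unsafe block, which is exactly where a reviewer would concentrate:

```
/// Returns whether the integer pointed to by `a` is equal to the
/// integer pointed to by `b`.
///
/// # Safety
/// Both `a` and `b` must be non-null and point to valid, initialized i32s.
unsafe fn f_raw(a: *const i32, b: *const i32) -> bool {
    // The dereferences are the operations a reviewer has to scrutinize.
    unsafe { *a == *b }
}

// The reference-taking version from the talk needs no unsafe at all.
fn f(a: &i32, b: &i32) -> bool {
    *a == *b
}

fn main() {
    let (x, y) = (42, 42);
    assert!(f(&x, &y));
    // A caller of the raw-pointer version must opt in explicitly,
    // flagging the spot that deserves extra review attention.
    assert!(unsafe { f_raw(&x, &y) });
}
```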
Ojeda "hopes this very simple example piques your curiosity, if you're a C
developer
", but it also answers why people want to write kernel components
in Rust. Kernel programming is already complex; having a little more assurance
that you aren't going to completely break the entire kernel is valuable.
The goals listed in the original Rust-for-Linux RFC do include other points, such as the hope that Rust code will reduce logic bugs and ease refactoring. But a core part of the vision of the project has always been to make it easier to contribute to the kernel, thereby helping to get people involved in kernel development who otherwise wouldn't feel comfortable.
History
With that context established, Ojeda then went into the history of the Rust-for-Linux project. The idea of using Rust with the Linux kernel is actually more than a decade old. The earliest example is rust.ko, a simple proof of concept written in 2013, before Rust had even reached its 1.0 version in 2015. The actual code that would become the base of the Rust-for-Linux project, linux-kernel-module-rust, was created in 2018 and maintained out of tree for several years.
Ojeda created the Rust-for-Linux GitHub organization in 2019, but it wasn't until 2020 that the project really got going. During the Linux Plumbers Conference that year, a number of people gave a collaborative talk about the project. It was "a pipe dream" at that point, but enough people were interested that the project started to pick up steam and a number of people joined.
Through the end of 2020 and the beginning of 2021, contributors put together the first patch set for in-tree Rust, set up the mailing list and Zulip chat instance, and got Rust infrastructure merged into linux-next. 2021 also saw the first Kangrejos, the Rust-for-Linux conference, run through LWN's BigBlueButton instance. In 2022, the set of Rust-for-Linux patches went from version 5 to version 10, before finally being merged in time for the Linux 6.1 long-term support release.
From that point on, the project (which had always been working with the upstream Rust project) began to collaborate with Rust language developers more closely. The project gained additional infrastructure, including a web site, and automatically rendered kernel documentation. Several contributors also began expanding the initial Rust bindings. Ojeda specifically called out work by Boqun Feng, Wedson Almeida Filho, Björn Roy Baron, Gary Guo, Benno Lossin, Andreas Hindborg, Alice Ryhl, Trevor Gross, and Danilo Krummrich, placing their contributions on a timeline to highlight the growth of the project over time.
The future
Ojeda ended his talk with a series of quotes that he had solicited from many different kernel developers and people associated with the Linux community about what they thought about the future of the project. People often form their impressions of what a community thinks of a topic based on what causes the most discussion — which has the effect of amplifying controversial opinions. To get a clear picture of what existing kernel maintainers think of the Rust-for-Linux experiment, Ojeda reached out to many people who had been involved in discussing the project so far — whether that was as a proponent or as a detractor. There were many more quotes than he could go through in the time remaining, but he did highlight a few as particularly important or insightful. Interested readers can find the full list of quotes in his slides.
Daniel Almeida, who has contributed several patches to the kernel, particularly around laying the groundwork for GPU drivers in Rust, said:
2025 will be the year of Rust GPU drivers. I am confident that a lot will be achieved once the main abstractions are in place, and so far, progress has been steady, with engineers from multiple companies joining together to tackle each piece of the puzzle. DRM maintainers have been very receptive too, as they see a clear path for Rust and C to coexist in a way that doesn't break their established maintainership processes. In fact, given all the buy-in from maintainers, companies and engineers, I'd say that Rust is definitely here to stay in this part of the kernel.
Ojeda agreed with Almeida's prediction, saying that he expected 2025 to be a big year for Rust in the graphics subsystem as well. Many Rust-for-Linux contributors were (unsurprisingly) also optimistic about the project's future. Even people who were less enthused about the language did generally agree that it was going to become a growing part of the kernel. Steven Rostedt, a kernel maintainer with contributions in several parts of the kernel, thought that the language would be hard for other kernel contributors to learn:
It requires thinking differently, and some of the syntax is a little counter intuitive. Especially the use of '!' for macros, but I did get used to it after a while.
Despite that, he thought Rust in the kernel was likely to continue growing. Even if Rust is eventually obsoleted by a safer subset of C for kernel programming, Rust would still have provided the push to develop things in that direction, Rostedt said. Wolfram Sang, who maintains many I2C drivers (an area the Rust-for-Linux project would like to expand into), also expressed concerns about the difficulty of learning Rust:
I am really open to including Rust and trying out what benefits it brings. Yet personally, I have zero bandwidth to learn it and no customer I have will pay me for learning it. I watched some high level talks about Rust in Linux and am positive about it. But I still have no experience with the language.
This left him with concerns around reviewing Rust code, saying that he "simply cannot review a driver written in Rust". He asked for help from someone who is able to do that, and hoped to learn Rust incrementally in the process.
Luis Chamberlain, who among other things maintains the kernel's loadable module support, thought that there were still some blocking requirements for the adoption of Rust, but was eager to see them tackled:
I can't yet write a single Rust program, yet I'll be considering it for anything new and serious for the kernel provided we get gcc support. I recently learned that the way in which we can leverage Coccinelle rules for APIs in the kernel for example are not needed for Rust -- the rules to follow APIs are provided by the compiler for us.
While both mild reservations and cautious optimism were common responses to Ojeda's request for comment, several respondents were less restrained. Josef Bacik, a maintainer for Btrfs and the block device I/O controller, said:
I've been working exclusively in Rust for the last 3 months, and I don't ever want to go back to C based development again.
Rust makes so many of the things I think about when writing C a non-issue. I spend way less time dealing with stupid bugs, I just have to get it to compile.
[...]
I wish Rust were more successful in the linux kernel, and it will be eventually. Unfortunately I do not have the patience to wait that long, I will be working on other projects where I can utilize Rust. I think Rust will make the whole system better and hopefully will attract more developers.
The quotes Ojeda gathered expressed a lot of thoughtful, nuanced opinions on the future of the Rust-for-Linux project. To crudely summarize: the majority of responses thought that the inclusion of Rust in the Linux kernel was a good thing; the vast majority thought that it was inevitable at this point, whether or not they approved. The main remaining obstacles that were cited were the difficulty of learning Rust, which may be difficult to change, and GCC support, which has been in progress for some time.
The responding kernel developers also thought that, even though Rust-for-Linux was clearly growing, there was a lot more work to be done. Overall, the expectation seems to be that it will take several more years of effort to have all of the current problems with Rust's integration addressed, but there is a willingness — or at least a tolerance — to see that work done.
[While LWN could not attend FOSDEM in person this year, and the video of Ojeda's talk is not yet available, I did watch the stream of the talk as it was happening in order to be able to report on it.]
| Index entries for this article | |
|---|---|
| Conference | FOSDEM/2025 |
Posted Feb 10, 2025 18:15 UTC (Mon)
by mb (subscriber, #50428)
[Link] (97 responses)
That's not correct. Crashing is safe. But it's really a bad idea to crash the kernel.
I think the more obvious advantage of the Rust version of "f" is that the references are guaranteed to point to valid memory. Which is not the case for the C variant.
Posted Feb 10, 2025 18:40 UTC (Mon)
by jengelh (guest, #33263)
[Link] (3 responses)
Posted Feb 11, 2025 20:31 UTC (Tue)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
* A C++ reference must be initialized to point at something, and cannot be NULL (or nullptr).
There is nothing which prevents you from destroying the pointee out from under the reference, which causes UB if you use the reference afterwards, so it is unsafe. It does prevent a few potentially dangerous patterns such as null pointers and pointers that get changed at runtime, but that is not the same thing as memory safety.
Posted Feb 13, 2025 2:14 UTC (Thu)
by milesrout (subscriber, #126894)
[Link] (1 responses)
Posted Feb 14, 2025 12:01 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link]
This will also happen if you use an operator; take AddAssign, the trait implementing the += operator:
description += " and then I woke up.";
... Is eventually just AddAssign::add_assign(&mut description, " and then I woke up.");
So if we lent out any reference to description that's still outstanding, we can't do this, likewise if we only have an immutable reference we can't use that to do this either. But the visible syntax shows no sign that this is the case.
In practice because humans are fallible, the most important thing is that this is checked by the compiler. It's valuable to know that my_function takes a mutable reference, when you're writing the software, when you're reading software other people wrote, and most definitely when re-reading your own software - but it's *most* valuable that the compiler checks because I might miss that, all three times.
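A small sketch of my own (not tialaramex's) of that desugaring, with the borrow-checker interaction shown in comments:

```
use std::ops::AddAssign;

fn main() {
    let mut description = String::from("I was dreaming");

    // The operator form...
    description += " and then I woke up.";
    // ...is just sugar for an explicit call taking `&mut description`:
    AddAssign::add_assign(&mut description, " Twice.");

    // If an immutable borrow were still live, both forms would be rejected:
    //
    //     let peek = &description;
    //     description += "!";   // error[E0502]: cannot borrow `description`
    //                           // as mutable because it is also borrowed
    //                           // as immutable
    //     println!("{peek}");

    println!("{description}");
}
```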
Posted Feb 10, 2025 18:43 UTC (Mon)
by ncultra (✭ supporter ✭, #121511)
[Link] (92 responses)
Posted Feb 10, 2025 19:05 UTC (Mon)
by daroc (editor, #160859)
[Link] (91 responses)
So yes, an incorrect value returned to the rest of the kernel could cause a crash later. And the logic can always be incorrect. But safe Rust is never going to cause a null pointer dereference, which there's no good way to annotate in C. There are a number of properties like that.
Posted Feb 10, 2025 19:18 UTC (Mon)
by mb (subscriber, #50428)
[Link] (32 responses)
That way, certain parts of the program can be guaranteed not to crash, because range checks have already been done at an earlier point and that assurance is passed downstream via the type system, for example.
But that needs a slightly more complex example to show.
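A minimal sketch of the kind of pattern being described (the type and names here are hypothetical): the range check happens once, in a constructor, and downstream code relies on the type instead of re-checking:

```
/// Hypothetical example: a value that is guaranteed to be in 0..=100.
/// The only way to obtain one is through the checking constructor.
#[derive(Clone, Copy)]
pub struct Percent(u8);

impl Percent {
    pub fn new(value: u8) -> Option<Self> {
        (value <= 100).then_some(Percent(value))
    }

    pub fn get(self) -> u8 {
        self.0
    }
}

/// Downstream code never needs to re-check the range: any `Percent` it
/// receives already passed the check above.
fn fill_progress_bar(p: Percent) -> usize {
    // Index into a 101-entry lookup table without a further range check.
    const TABLE: [usize; 101] = [0; 101];
    TABLE[p.get() as usize]
}

fn main() {
    if let Some(p) = Percent::new(42) {
        let _ = fill_progress_bar(p);
    }
}
```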
Posted Feb 11, 2025 17:18 UTC (Tue)
by acarno (subscriber, #123476)
[Link] (31 responses)
That said - the properties you encode in Ada aren't quite equivalent (to my understanding) to those in Rust. Ada's focus is more on mathematical correctness; Rust's focus is more on concurrency and memory correctness.
Posted Feb 11, 2025 18:28 UTC (Tue)
by mb (subscriber, #50428)
[Link] (30 responses)
For a simple example see https://doc.rust-lang.org/std/path/struct.Path.html
There are many much more complex examples in crates outside of the std library where certain operations cause objects to become objects of other types because the logical/mathematical properties change. This is possible due to the move-semantics consuming the original object, so that it's impossible to go on using it. And with the new object type it's impossible to do the "old things" of the previous type.
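A small illustration (hypothetical types, not from any real crate) of the pattern being described, where an operation consumes a value and returns a value of a different type:

```
/// Typestate sketch: a connection that must be authenticated before use.
struct Connection;
struct AuthenticatedConnection;

impl Connection {
    fn authenticate(self, _token: &str) -> AuthenticatedConnection {
        // Consumes `self`: the unauthenticated value can no longer be used.
        AuthenticatedConnection
    }
}

impl AuthenticatedConnection {
    fn send(&self, _data: &[u8]) {
        // Only reachable through a value that went through `authenticate`.
    }
}

fn main() {
    let conn = Connection;
    let conn = conn.authenticate("secret");
    conn.send(b"hello");
    // conn.authenticate("again");  // would not compile: no such method,
    //                              // and the old `Connection` was moved away
}
```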
Posted Feb 11, 2025 21:37 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 12, 2025 0:06 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (28 responses)
1. If some precondition holds, you either already have an instance of that type, or can easily get one.
To avoid repeating myself over and over again: Assume that all of these rules are qualified with "..., unless you use some kind of escape hatch, like a type cast, reflection, unsafe, etc."
(1) and (4) are doable in nearly every reasonable programming language. (2) is possible in most languages that have visibility specifiers like pub or private. (5) is impossible in most "managed" languages because they (usually) require objects to have some minimal runtime gunk for purposes such as garbage collection and dynamic dispatch, but systems languages should be able to do at least the zero-cost wrapper without any difficulty, and managed languages may have smarter optimizers than I'm giving them credit for.
(3) is the real sticking point for most languages. It is pretty hard to do (3) without some kind of substructural typing, and that is not something that appears in a lot of mainstream programming languages besides Rust.
Just to briefly explain what this means:
* A "normal" type may be used any number of times (in any given scope where a variable of that type exists). Most types, in most programming languages, are normal. Substructural typing refers to the situation where at least one type is not normal.
It can be argued that Rust's types are affine at the level of syntax, but ultimately desugar into linear types because of the drop glue (i.e. the code emitted to automatically drop any object that goes out of scope, as well as all of its constituent parts recursively). If there were an option to "opt out" of generating drop glue for a given type (and fail the compilation if the type is used in a way that would normally generate drop glue), then Rust would have true linear types, but there are a bunch of small details that need to be worked out before this can be done.
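A tiny example (names are mine) of what "affine at the level of syntax" plus drop glue means in practice:

```
fn consume(v: Vec<u8>) {
    // `v` is moved in; when this function returns, the compiler-generated
    // drop glue frees it.
    drop(v);
}

fn main() {
    let v = vec![1u8, 2, 3];
    consume(v);
    // consume(v);   // error[E0382]: use of moved value: `v`
    //               // (values can be used at most once: affine)

    let _w = vec![4u8, 5, 6];
    // `_w` is never moved or explicitly dropped, yet the program compiles:
    // the compiler silently emits drop glue at the end of this scope, which
    // is why Rust's types are not true linear ("use exactly once") types.
}
```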
Posted Feb 12, 2025 20:02 UTC (Wed)
by khim (subscriber, #9252)
[Link] (10 responses)
Ughm… before it would be usable you wanted to say? Isn't that trivial? Like this The big question is: what to do about
Posted Feb 12, 2025 20:14 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (9 responses)
Posted Feb 12, 2025 20:30 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link]
Posted Feb 12, 2025 20:30 UTC (Wed)
by khim (subscriber, #9252)
[Link] (7 responses)
Isn't that how linear types would work in any other language, too? Precisely because of the halting problem, such systems have to reject certain programs that are, in fact, valid. That's what Rust does with references, too. And as usual you may solve the issue with
If you would attempt to use this code you'll find out that the biggest issue is not with this purely theoretical problem, but with a much more practical issue: lots of crates assume that types are affine, and so many code fragments that "should be fine" are in reality generating drops on failure paths. Not in the sense of "the compiler has misunderstood something and decided to materialize a drop that's not needed there", but in the sense of "the only reason drop shouldn't run at runtime here is because of properties that the compiler doesn't even know about and couldn't verify". Unwinding `panic!` is the most common offender, but there are many others.
IOW: the problem with linear types in Rust is not finding better syntax for them, but deciding what to do with millions of lines of code that's already written… and that isn't really compatible with linear types.
Posted Feb 12, 2025 22:02 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (6 responses)
Posted Feb 12, 2025 22:18 UTC (Wed)
by khim (subscriber, #9252)
[Link] (5 responses)
But that's precisely what I'm talking about. If you would take this exact
Linear types are easy to achieve in Rust… but then you need to rewrite more-or-less all
I don't see how changes to the language may fix that. If you go with linking tricks then you can cheat a bit, but "real" linear types, if they were added to the language, would face the exact same issue that we see here: the compiler would need to prove that it doesn't need drop in the exact same places, and would face the exact same issues with that.
IOW: changing compiler to make "real" linear types and changing the compiler to make static_assert-based linear types work are not two different kinds of work, but, in fact, exactly the same work.
If someone is interested in bringing them to Rust, then taking the implementation that already exists and looking at the changes needed to support it would be much better than discussions about proper syntax.
Posted Feb 12, 2025 22:42 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (4 responses)
> IOW: changing compiler to make “real” linear types and changing the compiler to make static_assert-based linear types work are not two different kinds of work, but, in fact, exactly the same work.
Currently, the reference[1] says that this is allowed to fail at compile time:
if false {
    // A const block in dead code; the reference allows this to fail the build.
    const { panic!() };
}
I agree that, if a given type T is manifestly declared as linear, then the compiler does have to prove that <T as Drop>::drop is never invoked. But I tend to assume that no such guarantee will be provided for arbitrary T, because Rust compile times are already too slow as it is, so you will have to use whatever bespoke syntax they provide for doing that, or else it will continue to be just as brittle as it is now.
Speaking of bespoke syntax, the most "obvious" spelling would be impl !Drop for T{} (like Send and Sync). But if you impl !Drop, then type coherence means you can't impl Drop, and therefore can't put a const assert in its implementation. Maybe they will use a different spelling, to allow for (e.g.) drop-on-unwinding-panic to happen (as a pragmatic loophole to avoid breaking too much code), but then the const assert probably won't work either (because the compiler will not attempt to prove that the panic drop glue is never invoked).
[1]: https://doc.rust-lang.org/reference/expressions/block-exp...
Posted Feb 12, 2025 23:02 UTC (Wed)
by khim (subscriber, #9252)
[Link] (3 responses)
C++17 solved that with if constexpr, which has such guarantees. Why is it not a problem with C++, then?
No. There is no need for that. Unlike traits resolution
And yes, C++ has a rule that if something couldn't be instantiated then it's not a compile-time error, as long as that instantiation is not needed. That includes destructors. It was there since day one, that is, since C++98, and while MSVC was notoriously bad at following that rule, clang was always pretty precise. If some instantiations of that
To a large degree it's a chicken-and-egg issue: C++ has, essentially, built all its advanced techniques around SFINAE, and thus compilers learned to handle it well; in Rust very few developers even know or care that its analogue exists in the language, thus it's not handled correctly in many cases. But no, it's not a matter of complicated math or something that should slow down the compilation; on the contrary, it's something that's relatively easy to implement: C++ is an existence proof.
Yes, but, ironically enough, that would require a lot of work, because if you do that then you are elevating the whole thing to the level of types, errors are detected pre-monomorphization, and now it's no longer an issue of implementing things carefully but becomes a type-system problem.
Posted Feb 13, 2025 0:14 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
It has to be in the type system in some form, or else generic code cannot coherently interact with it (unless we want to throw away literally all code that drops anything and start over with a system where drop glue is not a thing - but that's so obviously a non-starter that I cannot imagine you could be seriously proposing it). We do not want to recreate the C++ situation where all the types match, but then something blows up in monomorphization.
To be more concrete: Currently, in Rust, you can (mostly) figure out whether a given generic specialization is valid by reading its trait bounds. While it is possible to write a const block that imposes additional constraints, this is usually used for things like FFI and other cases where the type system doesn't "know" enough to stop us from doing something dangerous. Outside of those special cases (which typically make heavy use of unsafe), the general expectation is that if a type matches a given set of trait bounds, then I can use it as such.
This is useful for pedagogical reasons, but it is also just a heck of a lot more convenient. When I'm writing concrete code, I don't have to grep for std::static_assert or read compiler errors to figure out what methods I'm allowed to call. I can just read the trait bounds in the rustdoc and match them against my concrete types. When I'm writing generic code, I don't have to instantiate a bunch of specializations to try and shake out compiler errors one type at a time. The compiler will check the bounds for me and error out if I write something incoherent, even before I write any tests for my code.
But trait bounds are more than just a convenient linting service. They are a semver promise. If I have a function foo<T: Send>(t: T), I am not allowed to change it to foo<T: Send + Sync>(t: T) without breaking backwards compatibility. If you wrote foo::<RefCell>(cell) somewhere in your code, I have promised that that will continue to work in future releases, even if I never specifically thought about RefCell. Droppability completely breaks this premise. If droppability is only determined at monomorphization, then I can write a function bar<T>(t: T) -> Wrapper<T> (for some kind of Wrapper type) that does not drop its argument, and then later release a new version that has one unusual code path where the argument does get dropped (by one of its transitive dependencies, just to make the whole thing harder to troubleshoot). Under Rust as it currently exists, that is not a compatibility break, and it would be very bad if it was. We would have to audit every change to every generic function for new drop glue, or else risk breaking users of undroppable types. Nobody is actually going to do that, so we're simply not going to be semver compliant in this hypothetical.
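A concrete sketch of the semver point (the crate and function names here are made up):

```
use std::cell::RefCell;

// Version 1.0 of a hypothetical crate promises only `T: Send`:
fn process<T: Send>(value: T) -> T {
    value
}

// A later release that tightened the bound to `T: Send + Sync` would be a
// breaking change, because callers like the one below already rely on
// `Send`-but-not-`Sync` types being accepted:
//
//     fn process<T: Send + Sync>(value: T) -> T { value }

fn main() {
    // RefCell<i32> is Send but not Sync, so this compiles only against the
    // original bound.
    let cell = process(RefCell::new(0));
    let _ = cell.borrow();
}
```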
Posted Feb 13, 2025 8:00 UTC (Thu)
by khim (subscriber, #9252)
[Link] (1 responses)
Well… it's like Greenspun's tenth rule. Rust didn't want to "recreate the C++ situation" and as a result it just made a bad, non-functional copy. As you saw, "something blows up in monomorphization" is already possible, only it's unreliable, has bad diagnostics, and is in all aspects worse than C++. Perfect is the enemy of good, and this story is a great illustration, IMNSHO.
Yeah. A great/awful property which works nicely, most of the time, but falls to pieces when you really try to push. It's convenient till you need to write 566 methods instead of 12. At that point it becomes both a PITA and compilation times skyrocket. But when these bounds are not there from the beginning they become more of a "semver rejection". That's why we still have no support for lending iterators in
At some point you have to accept that your language couldn't give you a perfect solution and can only give you a good one. Rust developers haven't accepted that yet and thus we don't have solutions for many issues at all. And when the choice is between "that's impossible" and "that's possible, but with caveats", practical people pick the latter option. Except, as you have already demonstrated, that's not true: this capability already exists in Rust, and "perfect solutions" don't exist… after 10 years of development. Maybe it's time to accept the fact that "perfect solutions" are not always feasible. That's already a reality; Rust is already like this. It's time to just accept that.
Posted Feb 13, 2025 18:19 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link]
This is exactly the same attitude that everyone had towards borrow checking before Rust existed.
Sure, *eventually* you probably do have to give up and stop trying to Change The World - a language has to be finished at some point. But Rust is very obviously not there yet. It would be a shame if they gave up on having nice, well-behaved abstractions just because some of the theory is inconveniently complicated. Traits and bounds are improving, slowly, along with most of the rest of the language. For example, the standard library already makes some use of specialization[1][2], a sorely missing feature that is currently in the process of being stabilized.
Rust is not saying "that's impossible." They're saying "we want to take the time to try and do that right." I say, let them. Even if they fail, we can learn from that failure and adapt. But if you never try, you can never succeed.
[1]: https://rust-lang.github.io/rfcs/1210-impl-specialization.html
Posted Feb 12, 2025 20:28 UTC (Wed)
by plugwash (subscriber, #29694)
[Link] (16 responses)
There are two problems though.
1. A type can be forgotten without being dropped, known as "leaking" the value. There was a discussion in the run up to rust 1.0 about whether this should be considered unsafe, which ultimately came down on the side of no.
Posted Feb 12, 2025 21:55 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (15 responses)
A weakly-undroppable type may not be dropped. It may be leaked, forgotten, put into an Arc/Rc cycle, or smuggled inside of ManuallyDrop. It may also be dropped on an unwinding panic (with the understanding that panics are bad, unwinding is worse, and some features just won't play nicely with them). It does not provide a safety guarantee, so unsafe code must assume that undroppable types may still get lost by other means.
A strongly-undroppable type may not be dropped, and additionally provides a safety guarantee that its drop glue is never called. It cannot be dropped on unwinding panic, so such panics are converted into aborts if they would drop the object (you can still use catch_unwind to manually handle the situation, if you really want to). Unsafe code may assume that the object is never dropped, but still may not make any other assumptions about the object's ultimate fate.
An unleakable type may not be leaked or forgotten. It must always be dropped or destructured. You may not put it into Rc, Arc, std::mem::forget, ManuallyDrop, MaybeUninit (which is just ManuallyDrop in a funny hat), Box::leak, or anything else that could cause it to become leaked (unsafe code may do some or all of these things, but must not actually leak the object). You also may not define a static of that type, because statics don't get dropped at exit and are functionally equivalent to leaking an instance on startup.
A strongly-linear type is both strongly-undroppable and unleakable. It cannot be dropped or leaked by any means, and is subject to all of the constraints listed above. It may only be destructured by code that has visibility into all of its fields.
Now my commentary:
* Weakly-undroppable types are already very useful as a form of linting. For example, the std::fs::File type currently does not have a close() method, because the type already closes itself on drop. But that means that it swallows errors when it is closed. The documentation recommends calling sync_all() (which is equivalent to fsync(2)) if you care about errors, but I imagine that some filesystem developers would have choice words about doing that in lieu of checking the error code from close(2). If File were weakly-undroppable, then it could provide a close() method that returns errors (e.g. as Result<()>) and fails the compilation if you forget to call it. This isn't a safety issue since the program won't perform UB if you forget to close a file, so we don't need a strong guarantee that it is impossible to do so. We just want a really strong lint to stop the user from making a silly mistake. It would also help with certain problems involving async destructors, but I don't pretend to understand async nearly well enough to explain that. On the downside, it would interact poorly with most generic code, and you'd probably end up copying the semantics of ?Sized to avoid massive backcompat headaches (i.e. every generic type would be droppable by default, and would need to be qualified as ?Drop to allow undroppable types).
Of course, there is another problem: We cannot guarantee that an arbitrary Turing-complete program makes forward progress. If the program drops into an infinite loop, deadlock, etc., then no existing object will ever get cleaned up, meaning that everything is de facto leaked whether our types allow for it or not. To some extent, this is fine, because a program stuck in an infinite loop will never execute unsafe code that makes assumptions about how objects are cleaned up. To some extent, it is not fine, because we can write this function (assuming that ?Leak means "a type that can be unleakable"):
fn really_forget<T: Send + 'static + ?Leak>(t: T){
...or some variation thereof, and there is probably no general way to forbid such functions from existing. So any type that is Send + 'static (i.e. has no lifetime parameters and can be moved between threads) should be implicitly Leak.
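One possible body for such a function, as a sketch of my own (the hypothetical ?Leak bound is omitted because it does not exist in today's Rust): move the value onto a thread that never finishes, so its destructor never runs:

```
use std::thread;

// Sketch: the value is moved into a thread that parks forever, so its drop
// glue is never executed and nothing in the rest of the program can tell.
fn really_forget<T: Send + 'static>(t: T) {
    thread::spawn(move || {
        let _t = t;
        loop {
            thread::park();
        }
    });
}

fn main() {
    really_forget(String::from("never dropped"));
}
```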
The "obvious" approach is to make 'static imply Leak, and require all unleakable (and maybe also undroppable) types to have an associated lifetime parameter, which describes the lifetime in which they are required to be cleaned up. More pragmatically, you might instead say that 'static + !Leak is allowed as a matter of type coherence, but provides no useful guarantees beyond 'static alone, and unsafe code must have a lifetime bound if it wants to depend on something not leaking. I'm not entirely sure how feasible that is in practice, but it is probably more theoretically sound than just having !Leak imply no leaks by itself, and unsafe code probably does want to have a lifetime bound anyway (it provides a more concrete and specific guarantee than "no leaks," since it allows you to assert that object A is cleaned up no later than object B).
Posted Feb 12, 2025 22:33 UTC (Wed)
by farnz (subscriber, #17727)
[Link] (5 responses)
Note that Rust could, with some effort, have a method fn close(self) -> io::Result<()>, without the weakly-undroppable property, so that developers who really care can get at the errors from closing a file. It'd be stronger if it'd fail the compilation if you forgot to call it, but it'd resolve the issue with those filesystem developers.
In practice, though, I'm struggling to think of a case where sync_all() (also known as fsync(2)) is the wrong thing, and checking returns from close(2) is the right thing. The problem is that close returning no error is a rather nebulous state - there's not really any guarantees about what this means, beyond Linux telling you that the FD is definitely closed (albeit this is non-standard - the FD state is "unspecified" on error by POSIX) - whereas fsync at least guarantees that this file's data and its metadata are fully written to the permanent storage device.
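As a sketch of what such a consuming close could look like today, outside of std (this is my illustration; it assumes the libc crate and is Unix-only):

```
use std::fs::File;
use std::io;
use std::os::fd::IntoRawFd;

/// Sketch of a consuming close that surfaces close(2) errors.
/// `File`'s own Drop impl ignores them; taking the fd out of the File
/// means that Drop impl never runs. Assumes the `libc` crate.
fn close_reporting(file: File) -> io::Result<()> {
    let fd = file.into_raw_fd(); // ownership of the fd leaves the File
    if unsafe { libc::close(fd) } == 0 {
        Ok(())
    } else {
        Err(io::Error::last_os_error())
    }
}

fn main() -> io::Result<()> {
    let file = File::open("/etc/hostname")?;
    close_reporting(file)
}
```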
Posted Feb 12, 2025 23:06 UTC (Wed)
by NYKevin (subscriber, #129325)
[Link] (4 responses)
High-performance file I/O is an exercise in optimism. By not calling fsync, you accept some probability of silent data loss in exchange for higher performance. But there's an even more performant way of doing that: You can just skip all the calls to File::write(), and for that matter, skip opening the file altogether, and just throw away the data now.
Presumably, then, it is not enough to just maximize performance. We also want to lower the probability of data loss as much as possible, without compromising performance. Given that this is an optimization problem, we can imagine various different points along the tradeoff curve:
* Always call fsync. Zero probability of silent data loss (ignoring hardware failures and other things beyond our reasonable control), but slower.
Really, it's the middle one that makes no sense, since one main memory cache miss is hardly worth writing home about in terms of performance. Maybe if you're writing a ton of small files really quickly, but then errno will be in cache and the performance cost becomes entirely unremarkable.
Posted Feb 13, 2025 10:13 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (3 responses)
This is the core of our disagreement - as far as I can find, the probability of silent data loss on Linux is about the same whether or not you check the close return code, with the exception of NFS. Because you can't do anything with the FD after close, all FSes but NFS seem to only return EINTR (if a signal interrupted the call) or EBADF (you supplied a bad file descriptor), and in either case, the FD is closed. NFS is slightly different, because it can return the server error associated with a previous write call, but it still closes the FD, so there is no way to recover.
Posted Feb 13, 2025 17:54 UTC (Thu)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
Of course error-on-close is recoverable, you just delete the file and start over. Or if that doesn't work, report the error to the user so that they know their data has not been saved (and can take whatever action they deem appropriate, such as saving the data to a different filesystem, copying the data into the system clipboard and pasting it somewhere to be preserved by other means, etc.).
Posted Feb 13, 2025 18:12 UTC (Thu)
by Wol (subscriber, #4433)
[Link]
Until you can't start over ... which is probably par for the course in most data entry applications ...
Cheers,
Posted Feb 14, 2025 10:49 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
Deleting the file is the worst possible thing to do with an error on close - two of the three are cases where the data has been saved, and it's an oddity of your code that resulted in the error being reported on close. The third is one where the file is on an NFS mount, the NFS server is set up to write data immediately upon receiving a command (rather than on fsync, since you won't get a delayed error for a write if the NFS server is itself caching) and you didn't fsync before close (required on NFS to guarantee that you get errors).
But even in the latter case, close is not enough to guarantee that you get a meaningful error that tells you that the data has not been saved - you need fsync, since the NFS server is permitted to return success to all writes and closes, and only error on fsync.
And just to be completely clear, I think this makes error on close useless, because all it means in most cases is either "your program has a bug" or "a signal happened at a funny moment". There's a rare edge case if you have a weird NFS setup where an error on close can mean "data lost", but if you're not in that edge case (which cannot be detected programmatically, since it depends on the NFS server's configuration), the two worst possible things you can do if there's an error on close are "delete the file (containing safe data) and start over" and "report to the user that you've saved their data, probably, so that they can take action just in case this is an edge case system".
On the other hand, fsync deterministically tells you either that the data is as safe as can reasonably be promised, or that it's lost, and you should take action.
Posted Feb 13, 2025 13:24 UTC (Thu)
by daroc (editor, #160859)
[Link] (8 responses)
You are of course right in general; the price that Idris and Agda pay for being able to say that some programs terminate is that the termination checker is not perfect, and will sometimes disallow perfectly okay programs. So I don't think it's necessarily a good idea for Rust to add termination checking to its type system, but it is technically a possibility.
Posted Feb 13, 2025 15:13 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (7 responses)
Can't Rust have several checkers? If any one of them returns "okay", then the proof has passed and the code is okay. The programmer could then also add hints, maybe saying "run these checkers in this order", or "don't bother with these checkers", or whatever. So long as the rule is "any positive result from a checker is okay", that could reduce the checking time considerably.
Oh - and I don't know how many other languages do this sort of thing, but DataBASIC decrements -1 to 0 to find the end of an array :-) It started out as a "feature", and then people came to rely on it so it's standard documented behaviour.
(I remember surprising a C tutor by adding a bunch of bools together - again totally normal DataBASIC behaviour, but it works in C as well because I believe TRUE is defined as 1 in the standard?)
Cheers,
Posted Feb 13, 2025 15:34 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (5 responses)
If we could ensure that "undecided" was small enough, we'd not have a problem - but the problem we have is that all known termination checkers reject programs that humans believe terminate.
Posted Feb 13, 2025 18:01 UTC (Thu)
by Wol (subscriber, #4433)
[Link] (4 responses)
It only takes one checker to return success, and we know that that Rust code is okay. So if we know (or suspect) which is the best checker to run, why can't we give Rust hints, to minimise the amount of checking Rust (has to) do. Which then means we can run more expensive checkers at less cost.
Cheers,
Posted Feb 13, 2025 18:31 UTC (Thu)
by daroc (editor, #160859)
[Link] (2 responses)
So the tradeoff will always be between not being able to check this property, or being able to check it but rejecting some programs that are probably fine.
That said, I do think there is a place for termination checking in some languages — the fact that Idris has it lets you do some really amazing things with dependent typing. Whether _Rust_ should accept that tradeoff is a matter of which things it will make harder and which things it will make easier, not just performance.
Posted Feb 14, 2025 15:12 UTC (Fri)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Feb 14, 2025 15:27 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
Idris then uses this to determine whether it can evaluate a function at compile time (total functions) or whether it must defer to runtime (partial functions). This becomes important because Idris is dependently typed, so you can write a type that depends on the outcome of evaluating a function; if that function is total, then the type can be fully checked at compile time, while if it's partial, it cannot.
Posted Feb 14, 2025 11:02 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
Posted Feb 13, 2025 16:06 UTC (Thu)
by adobriyan (subscriber, #30858)
[Link]
It works because "_Bool + _Bool" is upcasted to "int + int" first and then true's are implicitly upcasted to 1's.
Posted Feb 11, 2025 14:33 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (56 responses)
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3422.pdf>
There's a lot of work in making C great again (lol). There are certainly rough corners in C, and they should be fixed, but dumping the language and learning a completely new language, just to eventually find that the new language has different rough corners, is a bad idea. Let's fix the language instead.
Posted Feb 11, 2025 14:51 UTC (Tue)
by intelfx (subscriber, #130118)
[Link] (35 responses)
Not to come off as a zealot, but I'm really skeptical that nullability annotations can cover even a fraction of the convenience and safety benefits that pervasive use of Option<> brings to Rust (let alone ADTs in general, because there's so much more than just Option<>).
Reducing this to "C has rough corners, so what, Rust has different rough corners too" feels almost disingenuous.
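For what it's worth, a tiny sketch (my example) of the Option<> ergonomics being referred to:

```
// The "might be absent" case is part of the return type, and the compiler
// will not let the caller silently ignore it.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| *v % 2 == 0)
}

fn main() {
    match first_even(&[1, 3, 4, 7]) {
        Some(v) => println!("found {v}"),
        None => println!("no even value"), // must be handled explicitly
    }
}
```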
Posted Feb 11, 2025 15:12 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (34 responses)
Time will tell. If _Optional proves to not be enough, and there's something else needed in C to have strong safety against NULL pointers, we'll certainly research that. So far, _Optional looks promising.
Posted Feb 11, 2025 20:22 UTC (Tue)
by khim (subscriber, #9252)
[Link] (33 responses)
To understand why the addition of _Optional wouldn't be enough, just look at the signature of strstr: it takes two pointers to immutable strings… yet returns a pointer to a mutable string! WTH? What's the point? What kind of safety is that? The point is that this function was created before the introduction of const.
Both C++ and Rust solve the issue in the same way: instead of trying to decide whether the result is a mutable or immutable string
When you add new invariants to the type system, then to really fully benefit from them one needs to, essentially, rewrite everything from scratch… and if you plan to rewrite all the code anyway, then why not pick another, better and more modern language?
P.S. The real irony is, of course, that kernel developers understand that better than anyone. They pretty routinely do significant and complicated multi-year surgery to rebuild the whole thing on new foundations (how many years did it take to remove the BKL, remind me?), but when the proposal is not to replace the language… the opposition becomes religious and not technical, for some reason…
Posted Feb 11, 2025 22:23 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (32 responses)
You may be happy that C23 changed that.
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3220.pd...>
The prototype of strstr(3) is now
QChar *strstr(QChar *s1, const char *s2);
which means it returns a pointer that is const-qualified iff s1 is const-qualified. It is (most likely) implemented as a type-generic macro with _Generic().
The other string APIs have of course been improved in the same way.
So yes, the language evolves slowly, but it continuously evolves into a safer dialect.
---
> And everyone remembers the fate of noalias, isn't it?
We're discussing something regarding noalias. It's not easy to come up with something good enough, though. Don't expect it to be there tomorrow. But it's not been forgotten.
---
> Both C++ and Rust solve the issue in the same way: instead
Having researched string APIs for almost half of my workday for some years now, I find that returning a pointer is usually more useful. You just need to get the issue with qualifiers right. It needed _Generic(), but we've arrived there, finally.
Posted Feb 11, 2025 22:53 UTC (Tue)
by khim (subscriber, #9252)
[Link] (31 responses)
So it took 33 years to replace one function. Great. How quickly would it propagate through a project of Linux's size, at this rate? 330 or 3300 years?
Sure, but the evolution speed is so glacial that it only makes sense if you postulate, by fiat, that a full rewrite in another language is not an option. And don't forget that backers may, at some point, just throw in the towel, unable to deal with the excruciating inability to change anything of substance. Apple has Swift; Google is still picking between Rust and Carbon, but the eventual decision would be one or the other and, notably, not C or C++; Microsoft seems to think about Rust too… so who would be left to advance C and C++ after all the players that actually wanted to change it leave?
The question is not whether it's forgotten or not, but whether we can expect to see programs with most pointers either marked
And the simple answer, given the above example, is that one may spend maybe 10 or 20 years rewriting the Linux kernel in Rust (yes, that's big work, but a journey of a thousand miles begins with a single step)… or go with C – and then never achieve that. Simply because in 50 or 100 years, well before C would become ready to adopt such a paradigm, everything would be rewritten in something other than C anyway. Simply because it's hard to find anybody under 40 (let alone anyone under 30) who may even want to touch C if they have a choice.
No, it's not. It's only "more useful" if you insist on zero-terminated-string abominations. If your strings are proper slices (or standalone strings on the heap… only C conflates them; C++, Rust, and even such languages as Java and C# have separate types) then returning a pointer is not useful. You either need to have a generic type that returns a slice, or return an index. And returning an index is more flexible.
No, you also need to guarantee that C would continue to be used. That's a tall order. I wonder why no one ever made a proper C/C++ replacement (as in: a language that is designed to interoperate with C/C++ but is not built on top of a C/C++ core) before Rust… but now, when it's done, we may finally face the question of why we should continue to support the strange and broken C semantics with null-terminated strings… invent crazy schemes, CPU extensions – all to support something that shouldn't have existed in the first place.
That's not the question for the next 3-5 years, but in 10 years… when the world would separate into competing factions… it would be interesting to see if any of them would stay faithful to C/C++ and what they would pick instead to develop "sovereign software lands".
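A small Rust sketch of my own of the index-returning approach: str::find() hands back a byte offset, and the caller re-slices with whatever kind of reference it already holds, so the const/mut question from strstr() never arises:

```
fn after_needle<'a>(haystack: &'a str, needle: &str) -> Option<&'a str> {
    // The found position is returned as an index, not a pointer.
    let start = haystack.find(needle)?;
    Some(&haystack[start + needle.len()..])
}

fn main() {
    assert_eq!(after_needle("hello world", "hello "), Some("world"));
    assert_eq!(after_needle("hello world", "xyz"), None);
}
```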
Posted Feb 11, 2025 23:22 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (10 responses)
Not one. The entire libc has been updated in that direction.
> How quickly would it propagate through project of Linux size, at this rate?
The problem was devising the way to do it properly. Once we have that, the idea can be propagated more easily. It's not like you need a decade to update each one function.
The kernel can probably implement this for their internal APIs pretty easily. The kernel already supports C11, so all the pieces are there. The bottleneck is in committer and reviewer time.
> Simply because it's hard to find about under 40 (let alone anyone under 30) who may even want to touch C if they have a choice.
I'm 31. I hope to continue using C for many decades. :)
> It's only “more useful” if you insist on zero-string abominations.
I do enjoy NUL-terminated strings, yes, which is why I find returning pointers more useful. The problems with strings, the reason they have been blamed for so long, weren't really the fault of strings themselves, but of the language, which wasn't as expressive as it could be. That's changing.
Posted Feb 11, 2025 23:45 UTC (Tue)
by khim (subscriber, #9252)
[Link] (7 responses)
You certainly would be able to do that; after all, Cobol 2023 and Fortran 2023 both exist. The question is how many outside of "Enterprise" (where nothing is updated… ever, except when it breaks down and falls apart completely) would care.
The problem with strings is precisely the strings. It's not even the fact that NUL is forbidden inside (after all, Rust strings are guaranteed to be UTF-8 inside). The problem lies with the fact that something that should be easy and simple (just look in a register and you know the length) is incredibly hard with null-terminated strings. It breaks speculation, requires special instructions (like what SSE4.2 added, or the "no fault vector load" used by RISC-V), and plays badly with many algorithms (why does an operation that shouldn't change anything in memory at all – like splitting a string in two – ever change anything?). Null-terminated strings are not quite up there with the billion-dollar mistake, but they are a very solid contender for second place.
Try again. _Generic is C11 and tgmath is C99. The means were there for 12 or 24 years (depending on how you are counting); there was just no interest… till the 100%-guaranteed-job-security stance that C would never be replaced (simply because all prospective "C killers" were either built around a C core or were unable to support effective interop with C) was threatened by Rust and Swift. Then and only then did the wheels start moving… but I'm pretty sure they would pretty soon be clogged again… when it is realized that, on one side, only legacy projects are interested in using C anyway, and on the other side, the majority of legacy projects just don't care to change anything unless they are forced to do that.
Yeah, but that's precisely the issue: while existing kernel developers may want to perform such a change, they are already overworked and overstressed… and newcomers, normally, want nothing to do with C. I guess the fact that exceptions like you exist gives it a chance… but it would be interesting to see how it'll work. The kernel is one of the few projects that can actually pull that off.
Posted Feb 13, 2025 8:46 UTC (Thu)
by aragilar (subscriber, #122569)
[Link] (6 responses)
Posted Feb 13, 2025 9:59 UTC (Thu)
by taladar (subscriber, #68407)
[Link] (5 responses)
Posted Feb 13, 2025 10:28 UTC (Thu)
by khim (subscriber, #9252)
[Link] (4 responses)
How would backports hurt anyone? Sure, you can only use GCC 12 on RHEL 7, but that beast was released more than ten years ago, before first version of Rust, even! Sure, at some point backporting stops, but I don't think the hold ups are “enterprise distros” (at least not RHEL specifically): these, at least, provide some updated toolchains. GCC 12 was released in a year 2022, thus it's pretty modern, by C standards. “Community distros” don't bother, most of the time.
Posted Feb 14, 2025 14:26 UTC (Fri)
by taladar (subscriber, #68407)
[Link] (3 responses)
Posted Feb 14, 2025 14:29 UTC (Fri)
by khim (subscriber, #9252)
[Link] (2 responses)
Who even cares what they use for the development of the platform itself? Developers shouldn't even care about that; it's an internal implementation detail.
Posted Feb 17, 2025 8:49 UTC (Mon)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Feb 17, 2025 9:15 UTC (Mon)
by khim (subscriber, #9252)
[Link]
How is that relevant? The Linux kernel was all too happy to adopt features not implemented by clang, and patches needed to support clang – and clang, at that point, was already used by Android, the most popular Linux distribution, used by billions… why should RHEL be treated differently? Let RHEL developers decide what to do with their kernel: they can create a special kgcc package (like they already did years ago) or rework features in any way they like.
Posted Feb 12, 2025 6:14 UTC (Wed)
by interalia (subscriber, #26615)
[Link] (1 responses)
In theory the kernel could switch easily enough given review time as you say, but would doing this also require bumping the required compiler version for the kernel? If so I'm not sure if they would feel safe for doing so for quite a few years, and Rust would also advance in the meantime.
Posted Feb 12, 2025 8:41 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link]
Any compiler that supports C11 should be able to support these.
Here's an example of how to write such a const-generic API:
```
/* Dispatch on the type of s: the result is const-qualified iff s is. */
#define my_strchr(s, c)                                          \
	_Generic((s),                                            \
	         char *:       strchr((s), (c)),                 \
	         const char *: (const char *) strchr((s), (c)))
```
Posted Feb 12, 2025 11:17 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (19 responses)
Actually, I worked at a project that used counted strings (not terminated by a NUL, unless we needed to pass them to syscalls), and even there, functions returning a pointer were overwhelmingly more used than ones returning a count.
Consider the creation of a counted string:
```
Equivalent code that uses a count would be more complex (and thus more unsafe):
```
Posted Feb 12, 2025 12:04 UTC (Wed)
by excors (subscriber, #95769)
[Link] (17 responses)
sds s = sdsempty();
s = sdscatsds(s, s1);
s = sdscatsds(s, s2);
s = sdscatsds(s, s3);
(and in the unlikely event that you're doing a lot of concatenation and really care about minimising malloc calls, you can add `s = sdsMakeRoomFor(s, sdslen(s1) + sdslen(s2) + sdslen(s3));` near the top). That makes it both simpler and safer than the original code. You should never be directly manipulating the length field.
(Of course in almost all other languages the equivalent code would be `s = s1 + s2 + s3;` which is even more simpler and safer.)
Posted Feb 12, 2025 12:40 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (16 responses)
I disagree with the last sentence. It was true in the past, without powerful static analyzers. Managed memory within APIs hides information from the compiler (and static analyzer), and thus provides less safety overall, provided that you have a language expressive enough and a static analyzer powerful enough to verify the program.
Consider the implementation of mempcpy(3) as a macro around memcpy(3) (or an equivalent inline function that provides the same information to the compiler):
#define mempcpy(dst, src, n) (memcpy(dst, src, n) + n)
A compiler (which knows that memcpy(3) returns the input pointer unmodified; this could be expressed for arbitrary APIs with an attribute in the future, but for now the compiler knows memcpy(3) magically) can trace all offsets being applied to the pointer 'p', and thus enforce array bounds statically. You don't need dynamic verification of the code.
With a managed string like you propose, you're effectively blinding the compiler to all of those operations. You're blindly placing your trust in the string library. If the library has a bug, you'll suffer it. But also, if you misuse the library, you'll have no help from the compiler.
Posted Feb 12, 2025 12:49 UTC (Wed)
by khim (subscriber, #9252)
[Link] (15 responses)
Why? What's the difference? If everything is truly "static enough" then the managed string can be optimized away. That's not a theory: if you look at Rust's example, the temporary string is completely elided and removed from the generated code; a C compiler (which is, essentially, the exact same compiler) should be able to do the same.
So you would trust your ad-hoc code, but wouldn't trust a widely tested and reviewed library. Hasn't the history of Linux kernel fuzzing shown us that this approach simply doesn't work?
Posted Feb 12, 2025 13:04 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (14 responses)
I personally use NUL-terminated strings because they require less (almost none) ad-hoc code. I'm working on a hardened string library based on <string.h>, providing some higher-level abstractions that preclude the typical bugs.
<https://github.com/shadow-maint/shadow/tree/master/lib/st...>
> Why? What's the difference?
Complexity. Yes, you can write everything inline and let the compiler analyze it. But the smaller the APIs are, the less work you impose on the analyzer, and thus the more effective the analysis is (fewer false negatives and positives). You can't beat the simplicity of <string.h> in that regard.
Posted Feb 12, 2025 13:16 UTC (Wed)
by khim (subscriber, #9252)
[Link] (13 responses)
Nope. Things don't work like that. A smaller API may help a human to manually optimize things, because humans are awfully bad at keeping track of hundreds and thousands of independent variables, but really good at finding non-trivial dependencies between a few of them. A compiler optimizer is the exact opposite: it doesn't have the smarts to glean all possible optimizations from a tiny, narrow API, but it's extremely good at finding and eliminating redundant calculations across thousands of lines of code. Possibly. And if your goal is something extremely tiny (like code for the smallest possible microcontrollers) then this may be a good choice (people have successfully used Rust on microcontrollers, but usually without the standard library since it's too big for them). But using these for anything intended to be used on “big” CPUs with caches measured in megabytes? Why?
Posted Feb 12, 2025 13:27 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (12 responses)
I never cared about optimized code. I only care about correct code. C++ claims to be safer than C, among other things by providing (very-)high-level abstractions in the library. I think that's a fallacy.
There's a reason why -fanalyzer works reasonably well in C and not in C++. All of that complexity triggers many false positives and negatives. Not being able to run -fanalyzer in C++ makes it a less safe language, IMO.
The optimizer might be happy with abstractions, but the analyzer not so much. I care about the analyzer.
> But using these for anything intended to be used on “big” CPUs with caches measured in megabytes? Why?
Safety.
My string library has helped find and fix many classes of bugs (not just instances of bugs) from shadow-utils. It's a balance between not adding much complexity (not going too high-level), but going high enough that you get rid of the common classes of bugs, such as off-by-ones with strncpy(3), or passing an incorrect size to snprintf(3), with for example a macro that automagically calculates the size from the destination array.
You'd have a hard time introducing bugs with this library. Theoretically, it's still possible, but the library makes it quite difficult.
Posted Feb 12, 2025 13:53 UTC (Wed)
by khim (subscriber, #9252)
[Link] (2 responses)
Then why are you even using C, and why do we have this discussion? No, it's not. The fact that we have complicated things like browsers implemented in C++ but nothing similar was ever implemented in C is proof enough of that. C++ may not be as efficient as C (especially if we care about size and memory consumption) but it's definitely safer. But if you don't care about efficiency then any memory-safe language would do better! Even BASIC! Why do you care about the analyzer if the alternative is to use something that simply makes most things the analyzer can detect impossible? Or even something like WUFFS if you need extra assurances? But again: all these tricks are important if your goal is speed first, safety second. If your primary goal is safety then a huge range of languages from Ada to Haskell and even Scheme would be safer. These are all examples of bugs that any memory-safe language simply wouldn't allow. C++ would allow it, of course, but that's because C++ was designed to be “as fast as C but safer”… one may debate whether it achieved that or not, but if you don't target the “as fast as C” bucket then there are a bazillion languages that are safer.
Posted Feb 13, 2025 10:43 UTC (Thu)
by alx.manpages (subscriber, #145117)
[Link] (1 responses)
Posted Feb 13, 2025 19:05 UTC (Thu)
by Cyberax (✭ supporter ✭, #52523)
[Link]
No, you don't. No human can keep track of all of the C pitfalls in non-trivial code.
Even the most paranoid DJB code for qmail had root holes, and by today's standards it's not a large piece of software.
Posted Feb 14, 2025 23:30 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
Yes, I agree. However, IIRC, it is because its main author (David Malcolm) is vastly more familiar with C than C++. Clang also has something like it in some of its `clang-tidy` checks, but I agree that GCC's definitely has a different set of things it covers, so they can coexist nicely.
Posted Feb 15, 2025 0:12 UTC (Sat)
by mb (subscriber, #50428)
[Link] (7 responses)
Why not use an interpreted language then?
>My string library has helped find and fix many classes of bugs ...
Sure. Thanks for that.
>but the library makes it quite difficult.
Modern languages make it about impossible.
Posted Feb 15, 2025 0:24 UTC (Sat)
by alx.manpages (subscriber, #145117)
[Link] (6 responses)
Because C is my "mother tongue" regarding computers. I can write it much better than other languages, just like I can speak Valencian better than other --possibly easier-- languages.
Posted Feb 15, 2025 0:51 UTC (Sat)
by mb (subscriber, #50428)
[Link] (5 responses)
That explains your "reasoning" indeed.
Posted Feb 15, 2025 22:29 UTC (Sat)
by alx.manpages (subscriber, #145117)
[Link] (4 responses)
Posted Feb 15, 2025 22:40 UTC (Sat)
by mb (subscriber, #50428)
[Link] (3 responses)
Posted Feb 15, 2025 23:05 UTC (Sat)
by alx.manpages (subscriber, #145117)
[Link] (2 responses)
Why did you put it in quotes? Were you implying that my reasoning is inferior to yours? Isn't that offensive? Please reconsider your language.
> And because "I always did it like this" isn't a reasoning that helps in discussions.
It is, IMO. I'm not a neurologist. Are you? I'm not an expert in how people learn languages and how learning secondary languages isn't as easy as learning a mother tongue. But it is common knowledge that one can speak their mother tongue much better than languages learned after it. The burden should be on those who argue the opposite to justify it.
Or should I take at face value that I learnt the wrong language, and that somehow learning a different one will magically make me write better *without regressions*? What if it doesn't? And why should I trust you?
Posted Feb 15, 2025 23:12 UTC (Sat)
by mb (subscriber, #50428)
[Link] (1 responses)
I will from now on block you here on LWN and anywhere else.
Posted Feb 15, 2025 23:31 UTC (Sat)
by alx.manpages (subscriber, #145117)
[Link]
Okay. You don't need to. Just asking me to not talk to you would work just fine. I won't, from now on. I won't block you, though.
Posted Feb 12, 2025 12:15 UTC (Wed)
by khim (subscriber, #9252)
[Link]
But why would you need all that complexity? If you work with strings a lot… wouldn't you have convenience methods? In Rust you would write something like this: Sure, NUL-terminated strings are a bad design from beginning to end, but the only justification for that design is the need to produce something decent without an optimizing compiler and in 16KB (or were they up to 128KB by then?) of RAM. Today you have more RAM in your subway ticket and optimizing compilers exist, so why stick to all these manual manipulations where none are needed?
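The Rust snippet khim refers to ("something like this") was not preserved; a minimal sketch of the kind of convenience method presumably meant, using the standard String type (names invented):
```
fn concat3(s1: &str, s2: &str, s3: &str) -> String {
    // One allocation up front, then appends; String tracks its own length,
    // so there is no manual pointer or length bookkeeping.
    let mut s = String::with_capacity(s1.len() + s2.len() + s3.len());
    s.push_str(s1);
    s.push_str(s2);
    s.push_str(s3);
    s
}
```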
Posted Feb 11, 2025 20:13 UTC (Tue)
by roc (subscriber, #30627)
[Link] (17 responses)
And C has so many "rough edges". These aren't even the biggies. The complete lack of lifetime information in the type system, and the UB disaster, are much worse. Saying "well, Rust has rough edges too" and implying that that makes them kind of the same is misdirection.
Posted Feb 11, 2025 22:31 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (16 responses)
I suspect it's possible to add those guarantees in C with some new attribute that you could invent, or some other technique. There's an experimental compiler that did that (or so I heard). If someone adds such a feature to GCC or Clang, and proves that it makes C safer, I'm sure people will pick it up, and it will eventually be standardized.
Posted Feb 12, 2025 20:31 UTC (Wed)
by roc (subscriber, #30627)
[Link] (15 responses)
Posted Feb 12, 2025 20:53 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (14 responses)
<https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3423.pdf>
Some of the ideas might be adoptable in ISO C. The trade-offs, etc., I don't know them. I asked the author of the paper to propose standalone features that could be acceptable in ISO C, so that we can discuss them.
Who would adopt new C dialects that are safer? Programmers that want to keep writing C in the long term without having their programs replaced by rusty versions. I would.
Posted Feb 12, 2025 21:30 UTC (Wed)
by Cyberax (✭ supporter ✭, #52523)
[Link] (9 responses)
> TrapC memory management is automatic, cannot memory leak, with pointers lifetime-managed not garbage collected
That is impossible. You need to have something like a borrow-checker for that, and it requires heavy investment in the type system.
Without that, you're limited to region inference (like in Cyclone C), and it's not powerful enough for anything serious.
Posted Feb 17, 2025 18:02 UTC (Mon)
by anton (subscriber, #25547)
[Link] (8 responses)
The compiler would either prove that the code is safe or insert run-time checks, based on a sophisticated type system. I.e., one would get what rewriting in Rust gives, but one would need less effort, and could do it piecewise.
This work sounded promising, but there has not been the transfer from the research project into production. Instead, after the research project ended, even the results of the research project mostly vanished (many dead links). What I found is the Ivy package, Deputy and Heapsafe manual. But
Instead of adding such annotations to C, people started to rewrite stuff in Rust, which seems to be a more expensive proposition. My current guess is that it's a cultural thing: Many, too many C programmers think their code is correct, so there is no need to add annotations that may slow down the code. And those who think otherwise have not picked up the Ivy ideas, but instead switched to Rust when that was available.
Posted Feb 17, 2025 19:03 UTC (Mon)
by mbunkus (subscriber, #87248)
[Link] (2 responses)
Just look at most C/C++ code bases (other languages, too) & observe how many variables aren't marked "const" that easily could be. Or how many functions could be "static" but aren't. Or that the default is to copy pointers instead of moving them.
Rust has the huge advantage of having made the safe choices the default ones instead of the optional ones, and the compiler helps us remember when we forget. In C/C++ all the defaults are unsafe, and there's almost no help from the compiler.
Posted Feb 18, 2025 8:16 UTC (Tue)
by anton (subscriber, #25547)
[Link] (1 responses)
Posted Feb 18, 2025 16:11 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
Posted Feb 17, 2025 19:13 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (4 responses)
Ivy uses a garbage collector.
C just offloads the safety proof to the developer. Rust is really the first language that tries to _assist_ users with proving the lifetime correctness.
What's worse, there is no real way to make it much different from Rust. Any other attempt to implement lifetime analysis will end up looking very similar. We already see that with SPARK in Ada: https://blog.adacore.com/using-pointers-in-spark
Posted Feb 17, 2025 20:11 UTC (Mon)
by daroc (editor, #160859)
[Link] (3 responses)
There's one project I've had my eye on that essentially replaces garbage collection with incremental copying and linear references. It's definitely not ready for production use yet, and is arguably still a form of garbage collection even though there are no pauses or a separate garbage collector, but it's an interesting approach. Then there are languages like Vale that are experimenting with Rust-like approaches but with much better ergonomics.
None of which means you're wrong — your options right now are basically garbage collection, Rust, or manual memory management — but I do feel hopeful that in the future we'll see another academic breakthrough that gives us some additional options.
Posted Feb 17, 2025 22:13 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
I really doubt that the status quo (borrow checker or GC) is going to change. We probably will get more powerful primitives compatible with the borrow checker, though.
Posted Feb 17, 2025 22:48 UTC (Mon)
by daroc (editor, #160859)
[Link] (1 responses)
Posted Feb 18, 2025 8:40 UTC (Tue)
by anton (subscriber, #25547)
[Link]
Concerning porting from malloc()/free() to something that guarantees no dangling pointers, no double free() and maybe no leaking: The programmer who uses free() uses some reasoning why the usage in the program is correct. If the new memory-management paradigm allows expressing that reasoning, it should not be hard to port from malloc()/free() to the new memory-management paradigm. One problem here is that not free()ing malloc()ed memory is sometimes a bug (leakage) and sometimes fine; one can mark such allocations, but when the difference is only clear in usage of functions far away from the malloc(), that's hard.
Posted Feb 13, 2025 4:32 UTC (Thu)
by roc (subscriber, #30627)
[Link]
All that paper says about the TrapC compiler is that it is "in development".
That document makes the extraordinary claim that "TrapC memory management is automatic, cannot memory leak, with pointers lifetime-managed not garbage collected". It nowhere explains how this is done, not even by example. Strange for such an extraordinary and important achievement.
I can see why C advocates want to believe that a memory-safe extension of C is just around the corner. I'll believe it when I see it.
Posted Feb 14, 2025 23:28 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Posted Feb 14, 2025 23:39 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link] (1 responses)
There _are_ attempts to do that with C. I know of this one: https://github.com/pizlonator/llvm-project-deluge/blob/de...
Posted Feb 15, 2025 22:22 UTC (Sat)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 11, 2025 20:15 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link] (1 responses)
That's a lot of "might" for an unknowable future. That's a bad gamble. And it's predicated upon this irrational steady-state / zero-sum idea that, well, if Rust is better in some ways than C, that just means it's worse in other ways. Not so. Rust isn't _perfect_ but that doesn't preclude being better. Seven needn't be the largest possible number in order to be a bigger number than three. Rust can be a better choice than C while BOTH of these propositions remain true: C was a good idea in the 1970s (fifty years ago!); Rust is not perfect and will itself be replaced in time.
Posted Feb 11, 2025 22:47 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link]
There are features that were accepted into C2y a few months ago and will (most likely) be available in GCC 16. For example, there's the countof() operator.
<https://thephd.dev/the-big-array-size-survey-for-c-results>
My patch was actually ready around November, but I'm holding it due to lack of agreement in the name of the operator.
That's one of the features that will bring some safety to the language soon. And there are some obvious extensions that will make it even better: being able to reference the number of elements of an array even if it's an array parameter (and thus a pointer). That's not yet standardized, but we're working on getting this into GCC soon. There are several other features that will similarly make the language safer.
Features will arrive in GCC soon, even if there's no release of ISO C in the near future.
> It might then be implemented across major compilers like GCC
The author of the proposal already has a working implementation in Clang. It's not yet in mainline Clang, but I don't expect it would take much time to mainline it once it's been accepted into C2y. It might be way sooner than you expect.
> Rust isn't _perfect_ but that doesn't preclude being better.
I don't say it's not better. But most changes in the FOSS community are merged very slowly, precisely to prove that they are good. Pushing Rust in quickly is the opposite of that. Maybe Rust proves to be as good as it sounds, but only time will tell. Every change is suspected of being worse until it is proven beyond reasonable doubt that it isn't.
Posted Feb 11, 2025 22:42 UTC (Tue)
by ojeda (subscriber, #143370)
[Link]
What I was trying to showcase is that, with the same amount of syntax/effort here, Rust gives us extra benefits that C does not.
In the slides I placed a grid of 4 combinations: (correct, incorrect) on one axis, (safe, unsafe) on the other. (I recommend opening the slides to see it).
The idea is that those are orthogonal -- all 4 combinations are possible. So, for instance, a perfectly correct C function that takes a pointer and dereferences it, will still always be considered "unsafe" in Rust terms.
In the "safe quandrants", we know the functions and their code are "safe" just by looking at them -- there is no need to look at other code or their callers. This is a "property" of the "source code" of those functions -- it is not a property of the binary, or something that requires whole-system analysis to know.
And knowing that is already valuable, since as implementors we know we will not introduce UB from within the function. And, as callers, that we will not introduce UB by just calling it.
There are caveats to that, of course (e.g. if we already had UB elsewhere, we can fabricate invalid inputs), but it is a very powerful distinction. For instance, if we copy-paste those two functions (i.e. even the incorrect one) into a safe program, even replacing an existing correct function, we shouldn't be able to introduce UB.
And this helps across time, too. In C, even if today you have a perfect C program, it is very hard to make a change that keeps it perfect, even just in terms of not triggering UB.
I hope that clarifies a bit. The "explanation" above is of course very informal and hand-wavy -- the idea in the talk was not to explain it in detail, but rather it was meant to be revealing for C developers, since it shows a "difference" that "is not there" (after all, the binaries end up being the same, no?), i.e. it tries to hint at what the concepts of "safe function" and "safe code" are about and get C programmers to think "hmm... that sounds interesting, I will look it up".
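A rough sketch of that distinction, with invented functions (not from the slides):
```
/// Unsafe: callers must guarantee that `p` is non-null, aligned, and points
/// to a valid i32; getting that wrong is UB at the call site.
unsafe fn read_raw(p: *const i32) -> i32 {
    unsafe { *p }
}

/// Safe: no input a caller can pass through this signature causes UB, so a
/// reviewer only needs to look at the unsafe block inside the function.
fn read_ref(r: &i32) -> i32 {
    // SAFETY: a shared reference is always non-null, aligned, and valid to read.
    unsafe { read_raw(r as *const i32) }
}
```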
Posted Feb 10, 2025 18:56 UTC (Mon)
by roc (subscriber, #30627)
[Link] (6 responses)
Posted Feb 10, 2025 19:04 UTC (Mon)
by andreashappe (subscriber, #4810)
[Link] (5 responses)
reading through the thread (as well as through the lwn.net comments) was quite interesting. Lots of misquoting. I really liked the moderate takes (from both sides) and the calmed-down approach by dave airlie, etc.
Posted Feb 10, 2025 19:32 UTC (Mon)
by dralley (subscriber, #143766)
[Link] (2 responses)
It is extremely likely that he will merge the patch when it is submitted to him in the merge window, regardless of CH's NACK. But I personally wish he wouldn't just let the subject sit there and generate drama in the meantime. Even if the R4L devs themselves know what's up, it discourages everyone on the outside and leads to needless negativity.
Posted Feb 12, 2025 20:41 UTC (Wed)
by plugwash (subscriber, #29694)
[Link]
If he just quietly merges it as part of a big merge that is much harder to do.
Posted Feb 12, 2025 21:01 UTC (Wed)
by jmalcolm (subscriber, #8876)
[Link]
Linus has a unique super-power. He gets to decide what to merge and what not to (not in a sub-system but in the kernel). In the end, kernel maintainers' only authority comes from the fact that Linus takes submissions from them.
So, the ultimate statement from Linus will simply be if he merges the Rust changes or not.
If he does, that is not only a VERY strong statement but it destroys the entire argument being leveled against Rust here. Once it is in, it is in. The whole "I need to maintain absolute purity to prevent the cancer" line of reasoning evaporates almost completely.
If Linus rejects it, he has the choice of again making it a very strong statement or of being very clear about the technical reasons for the rejection. In which case, the whole thing becomes a technical debate, as it should be.
In the end, I think Linus is making the right decision to keep his powder dry for the real strike taken at merge time. I also think this supports his ability to take a hard-line against the "social" side of the issue now and then the technical side of the issue later when he merges the code.
Posted Feb 10, 2025 19:36 UTC (Mon)
by boqun (subscriber, #154327)
[Link] (1 responses)
I think that response was particularly addressing Hector Martin's
"If shaming on social media does not work, then tell me what does, because I'm out of ideas."
and not for the posted patchset or any Rust-for-Linux process.
Although I appreciate Hector's support for Rust-for-Linux project, I must say I don't fully agree with what he proposed there.
> reading through the thread (as well as through the lwn.net comments) was quite interesting. Lots of misquoting. I really liked the moderate takes (from both sides) and the calmed-down approach by dave airlie, etc.
Posted Feb 11, 2025 6:58 UTC (Tue)
by tchernobog (guest, #73595)
[Link]
But there is a double standard at play here, if shaming on a public kernel mailing list is accepted.
They might have different reach and the way you "post a message" might be different, but the result is the same.
This is why I am disappointed that Linus's response is only addressing one half of the issue.
Posted Feb 10, 2025 21:43 UTC (Mon)
by raven667 (subscriber, #5198)
[Link] (7 responses)
Posted Feb 10, 2025 21:51 UTC (Mon)
by intelfx (subscriber, #130118)
[Link] (5 responses)
Posted Feb 11, 2025 8:53 UTC (Tue)
by edomaur (subscriber, #14520)
[Link]
Posted Feb 11, 2025 17:52 UTC (Tue)
by raven667 (subscriber, #5198)
[Link] (3 responses)
Posted Feb 11, 2025 21:39 UTC (Tue)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
* It should assume deep familiarity with C in general, and kernel C in particular. Don't explain what a pointer is or why UAF is a bad thing.
Posted Feb 12, 2025 5:33 UTC (Wed)
by mcon147 (subscriber, #56569)
[Link]
Posted Feb 12, 2025 7:27 UTC (Wed)
by johill (subscriber, #25196)
[Link]
For maintainers, I'd really like to see more coverage along the lines of what you described over in the typestate comment just now, because I think that's critical for designing the correct APIs for use within the kernel (or any bigger application). Not that I could do it, I have barely enough exposure to the ideas (and rust code) to maybe think it'd be a good idea.
(However, one of the first things I was envisioning being able to ensure with rust in a kernel context would, I believe, require true linear types, so having all the information should probably come with a description of the boundaries too.)
Posted Feb 14, 2025 18:36 UTC (Fri)
by sarahn (subscriber, #154471)
[Link]
Since UCSC offers or has offered courses on both Linux device drivers and on rust, they could probably be convinced to offer a "rust for Linux device driver developers" course.
Posted Feb 11, 2025 9:35 UTC (Tue)
by josh (subscriber, #17465)
[Link] (32 responses)
Rust does catch this. And in fairness to C, C can catch unused variables too, if you turn on the right (non-default) warning options. This is the kind of thing C projects don't turn on (it isn't part of -Wall, only -Wextra), or that C projects explicitly avoid because they find it annoying. Rust has it on by default, for this exact sort of occasion. (Rust can also catch a few kinds of semantic mismatches in very specific circumstances: try writing an `impl Sub` that adds rather than subtracting.)
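For the Rust half, the default behaviour looks roughly like this (illustrative sketch, not josh's original snippet):
```
// Compiles, but rustc warns by default:
//   warning: unused variable: `b`
//   help: if this is intentional, prefix it with an underscore: `_b`
fn f(a: &i32, b: &i32) -> bool {
    *a == 42
}
```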
Posted Feb 11, 2025 10:12 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (31 responses)
Posted Feb 11, 2025 11:20 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (2 responses)
C compilers could choose not to warn about underscored names while C programmers are waiting for C23.
Posted Feb 11, 2025 11:28 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
Posted Feb 13, 2025 5:26 UTC (Thu)
by aaronmdjones (subscriber, #119973)
[Link]
Posted Feb 11, 2025 11:24 UTC (Tue)
by q3cpma (subscriber, #120859)
[Link] (2 responses)
Posted Feb 11, 2025 11:41 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
That leads to you wanting a simple and obvious way to suppress the warning when you're not yet using a parameter, and yet to still give it a name. The alternative (and what most C and C++ codebases I've seen do) is to simply not bother enabling the warning at all, because it's too noisy.
[1] First, you want the codebase to be warning free because that means that a new warning appearing is something to fix; you want warnings to tell you that something is wrong with the codebase, and not to be something that you routinely ignore because they're meaningless. Second, you want maintainers to be able to say "patches 1 through 5 are applied, but I don't like the way you frobnicate your filament in patch 6, so please redo patches 6 onwards with the following advice in mind". And third, if I see a function bool do_useful_thing(int *operations_table, int index, bit_mask), it's hard to work out what the last parameter will mean later. If I see bool do_useful_thing(int *operations_table, int index, bit_mask _ignored_cpus_mask), it's obvious that that last parameter will be an ignored CPUs bitmask when ignoring CPUs is implemented later.
Posted Feb 11, 2025 13:05 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
When I was programming in C regularly, it happened a lot, and it was a pain in the arse - the warnings swamped everything else.
As soon as you start using callbacks, passing functions to libraries, what have you (and it's libraries that are the problem, where the generic case needs the parameter but your case doesn't), you want to be able to say "I know this variable isn't used, it's there for a reason".
This was Microsoft C6 with warning level 4, and I just couldn't find a way to suppress a warning - everything I tried might suppress the warning I was trying to get rid of, but just triggered a different warning instead.
Fortunately, it was me that set the rules about errors, warnings etc, so the rule basically said "all warnings must be fixed or explained away".
Cheers,
Posted Feb 11, 2025 11:44 UTC (Tue)
by josh (subscriber, #17465)
[Link] (14 responses)
One of the most common cases for it: you need to pass a closure accepting an argument, and you want to ignore the argument: |_| do_thing()
Posted Feb 11, 2025 12:30 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (13 responses)
I understand the reasons for this, and why it's also a useful distinction to be able to make, but I've also seen people get caught out by it because they're trying to copy locking habits from other languages, and write let _ = structure.mutex.lock(); expecting that this means they hold the mutex - where mutex has type mutex: Mutex<()> to imitate a plain lock from another language.
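A minimal sketch of the footgun being described (types simplified):
```
use std::sync::Mutex;

fn critical_section(mutex: &Mutex<()>) {
    // The guard is a temporary that is dropped at the end of this statement,
    // so the mutex is unlocked again immediately; nothing below is protected.
    let _ = mutex.lock().unwrap();

    // What was intended: bind the guard to a name so it lives to end of scope.
    let _guard = mutex.lock().unwrap();
    // ... code that must run with the lock held ...
}   // `_guard` dropped here, releasing the lock.
```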
Posted Feb 11, 2025 12:48 UTC (Tue)
by heftig (subscriber, #73632)
[Link] (2 responses)
Posted Feb 11, 2025 14:05 UTC (Tue)
by khim (subscriber, #9252)
[Link]
Have you submitted a request? Clippy is not supposed to be “opinionated by default”, but it is OK for it to have “opinionated lints” (as long as they are not default) and that one sounds both useful (for some people) and easy to implement.
Posted Feb 11, 2025 14:22 UTC (Tue)
by farnz (subscriber, #17727)
[Link]
Posted Feb 11, 2025 13:56 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (9 responses)
Posted Feb 11, 2025 14:46 UTC (Tue)
by Wol (subscriber, #4433)
[Link]
If you give a crossbow-man a musket, of course he's going to try to shoot his foot off :-)
Cheers,
Posted Feb 11, 2025 15:36 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (7 responses)
And it's unhealthy to not talk about the footguns in a language you like - just because you like it doesn't mean it's completely perfect :-)
Posted Feb 11, 2025 15:45 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (6 responses)
Posted Feb 11, 2025 15:59 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
Posted Feb 11, 2025 20:42 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 12, 2025 3:58 UTC (Wed)
by geofft (subscriber, #59789)
[Link] (2 responses)
So if you did write let _ = structure.mutex.lock();, yes, you would unlock the mutex immediately, but you also wouldn't have the ability to access the data behind the mutex unless you gave a name to the variable. Because Rust prevents you from completely forgetting to lock the mutex and accessing the data without first locking it, it also prevents you from ineffectively locking the mutex and accessing the data after you unlocked it.
Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.
Posted Feb 12, 2025 7:09 UTC (Wed)
by mb (subscriber, #50428)
[Link]
That's true. It's not done like this in the vast majority of cases.
But misusing this wouldn't (and must not) cause UB.
Posted Feb 12, 2025 11:04 UTC (Wed)
by farnz (subscriber, #17727)
[Link]
There isn't such a pattern in idiomatic Rust, but it gets written when you're still thinking in terms of C++ std::mutex or similar facilities from other languages.
And that makes this a very important footgun to call out, since someone who learnt about concurrency using Rust won't even realise this is an issue, while someone who comes from another language will perceive it as Rust's promises around "fearless concurrency" being broken unless they've already been made aware of this risk - or ask a Rust expert to explain their bug.
Posted Feb 12, 2025 8:31 UTC (Wed)
by ralfj (subscriber, #172874)
[Link]
That said, I agree this is quite surprising, and I've been bitten by this myself in the past.
Posted Feb 11, 2025 12:22 UTC (Tue)
by TomH (subscriber, #56149)
[Link] (2 responses)
In C that's only technically allowed from C23 on but gcc won't complain unless -pedantic is used, though clang will.
Posted Feb 11, 2025 12:43 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (1 responses)
And yes, I can (and have) worked around this with documentation comments, but those have a bad habit of getting stale, such that the doc comment refers to a parameter called "ignored_cpu_mask" that's been removed and replaced by a "active_cpu_mask"…
Posted Feb 11, 2025 17:35 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
See this Rust issue as well for when `_arg` naming is unwelcome: https://github.com/rust-lang/rust/issues/91074
Posted Feb 11, 2025 14:37 UTC (Tue)
by gray_-_wolf (subscriber, #131074)
[Link]
int f(int *a, int *b) {
	(void) b;
	return *a == 42;
}
Sure, takes one extra line, but I find it more readable compared to the attributes.
Posted Feb 11, 2025 14:42 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (5 responses)
In the following program, I don't use argc. Simply don't name it.
```
#include <stdio.h>

int
main(int, char *argv[])
{
	puts(argv[0]);
}
```
Posted Feb 11, 2025 14:47 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (4 responses)
Posted Feb 11, 2025 15:15 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link] (3 responses)
Then you'll need what others have mentioned.
Either a comment:
int
f(int *a, int *b /* unused */);
or a cast to void:
int
f(int *a, int *b)
{
	(void) b;
	...
}
But Rust's _ is no documentation name either, so the C equivalent is indeed no name at all.
Posted Feb 11, 2025 15:18 UTC (Tue)
by farnz (subscriber, #17727)
[Link] (2 responses)
Posted Feb 11, 2025 15:30 UTC (Tue)
by alx.manpages (subscriber, #145117)
[Link]
Ahh, sorry, I missed that. How about this?
#define _ [[maybe_unused]]

int
f(int *a, _ int *b);
The _() function already exists (for internationalization of strings), which is why I wouldn't shadow it with this macro, but if you don't use _(), then you could define the underscore to be [[maybe_unused]]. Or you could find another name that serves you.
Posted Feb 11, 2025 21:33 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link]
Posted Feb 11, 2025 18:11 UTC (Tue)
by dowdle (subscriber, #659)
[Link] (2 responses)
Posted Feb 11, 2025 18:20 UTC (Tue)
by dowdle (subscriber, #659)
[Link] (1 responses)
Posted Feb 11, 2025 19:06 UTC (Tue)
by daroc (editor, #160859)
[Link]
Posted Feb 12, 2025 6:53 UTC (Wed)
by mirabilos (subscriber, #84359)
[Link] (18 responses)
Ahem. He presented an example of a logic bug in safe code just a few lines further above.
Perhaps Rust induces partial blindness?
Posted Feb 12, 2025 9:14 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (17 responses)
Buffer overflows in user space tend to be one of the most dangerous things, because other bugs rarely grant permissions to an attacker.
But in a kernel, or in a setuid root program, logic errors can be as bad as any UB. Toggling a conditional without triggering UB can similarly result in granting permissions. If jumping to a new language completely kills the memory classes of bugs (and while it may significantly reduce them, it cannot kill them all) but can reintroduce subtle logic errors, we're not much better off. Plus, all the churn makes it impossible to analyze where and why a bug was introduced.
On the other hand, with good APIs, one can write C code which is theoretically not memory-safe, but which is hardened enough that such bugs will be rare, and with these, the logic bugs will also be rare. Overall, I think C is still a safer language than Rust, even if it can theoretically have more buffer overflows.
I do think Rust is a good experiment, in order to test ideas that might be later introduced in C. I agree with other commenters that old C programmers were too reticent to improving the language in necessary ways (e.g., killing 0 as a null pointer constant), and maybe the Rust pressure has allowed us to improve C.
Posted Feb 12, 2025 15:33 UTC (Wed)
by taladar (subscriber, #68407)
[Link] (3 responses)
We literally have 50 years of proof that there is a wide gap between theory and practice. The "sufficiently disciplined/diligent/... programmer" model has just failed us and it is time to admit that maybe, just maybe, 20 years into the development of programming languages was not yet the time when the perfect language emerged.
Posted Feb 12, 2025 15:46 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (2 responses)
Plan9 was also better than Unix in many ways, and we still use a Unix clone today. We have adapted it with the ideas of Plan9 (e.g., proc(5)). Plan9 was a useful experiment, just like Rust is useful today to backport improvements into C.
Linux is far from being a Unix V7/BSD/SysV clone, let alone Unix V1, just like GNU C2y isn't K&R C anymore.
Posted Feb 12, 2025 20:20 UTC (Wed)
by jmalcolm (subscriber, #8876)
[Link] (1 responses)
Unless I misunderstand, C2y is a placeholder for a C standard that does not yet exist. So, it is a bit of a weird place to start an argument about C's maturity. I think C2y is expected to become C26 but it could easily be C27 or C28 (in 2028). It could be never. Being a "future" standard, the reference also supports the other side of the argument regarding the CURRENT state of the C language.
But, if that is what we mean by GNU C2y, I should point out that Clang has a C2y flag as well (-std=c2y). C2y is not a GNU standard. If anything, I would expect GNU C2y to refer to only the proposed changes to the current C standard that the GNU compiler has implemented.
How does limiting your reference to the proposed changes in C2y to the subset implemented in GNU software support your argument?
Too much kool-aid? Who are we referring to?
Posted Feb 12, 2025 20:49 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link]
While GCC has not yet merged some changes, there are patches for them which will likely be available in the next version of the compiler. Most of those changes will be backported to older dialects (so available if you specify -std=gnu17), so while they're not written in stone in the standard, and are not already available in GCC, they will be very soon available in all GNU dialects, as *stable* features.
Posted Feb 12, 2025 20:42 UTC (Wed)
by roc (subscriber, #30627)
[Link] (12 responses)
Is this bait? There is ample real-world evidence to the contrary, too much to even cite here, but you can start by comparing the CVE record for rustls vs OpenSSL.
Mass-rewriting of C into Rust might introduce a few logic errors. It is likely to eliminate more, since Rust provides powerful tools for avoiding logic errors (newtypes and the Result type, for example) that are infeasible in C. And if you choose Rust instead of C for new code (e.g. new kernel drivers), the logic-error advantage of Rust is clear.
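A rough illustration of the newtype-plus-Result point (the types below are invented for the example):
```
// Two ids that would both be plain integers in C become distinct types,
// so mixing them up is a compile error rather than a runtime logic bug.
struct UserId(u32);
struct GroupId(u32);

fn lookup_user(id: UserId) -> Result<String, String> {
    if id.0 == 0 { Err("reserved id".into()) } else { Ok(format!("user{}", id.0)) }
}

fn main() {
    // lookup_user(GroupId(42)); // rejected at compile time: expected `UserId`
    match lookup_user(UserId(42)) {
        Ok(name) => println!("{name}"),
        Err(e) => eprintln!("lookup failed: {e}"),
    }
}
```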
Posted Feb 12, 2025 21:12 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link] (11 responses)
I don't know what C dialect is used by OpenSSL, but I looked at a random file in their source code, and immediately identified some less-than-ideal patterns. Also, the language isn't enough; you also need to design non-error-prone APIs, which is something that no tool can help you with (at least not enough).
That Rust defaults to not allow certain things is nice. We should severely restrict what is allowed in modern dialects of C.
Here's the example of unsafe code I identified in OpenSSL:
<https://github.com/openssl/openssl/blob/6f3ada8a14233e76d...>
```
A better API (only implementable as a macro) would ask for a number of elements and a type:
```
Here's an example of such an API:
And here are the commits that added such an API, which document the kind of improvements it adds:
Posted Feb 12, 2025 21:43 UTC (Wed)
by mb (subscriber, #50428)
[Link] (1 responses)
Posted Feb 12, 2025 23:05 UTC (Wed)
by alx.manpages (subscriber, #145117)
[Link]
It is not, if you're asking that. I considered not responding, because your comment looked plausibly trolling, but I think I should respond, in case your comment was honest.
Posted Feb 13, 2025 4:48 UTC (Thu)
by roc (subscriber, #30627)
[Link] (8 responses)
Rust helps with this because you have to encode ownership and thread-safety information in the API. E.g.
fn new_blagh(s: &str) -> Blagh
In the Rust API, unlike the C API:
And in Rust you can go much much further towards making APIs difficult to misuse. For just one example see how rustls's API prevents certain kinds of configuration errors: https://docs.rs/rustls/latest/rustls/struct.ConfigBuilder...
Posted Feb 13, 2025 10:15 UTC (Thu)
by alx.manpages (subscriber, #145117)
[Link] (7 responses)
I only work with single-threaded projects at the moment, so that aspect is not very appealing to me.
> * The callee can be sure that it is not responsible for freeing `s`
Consistent use of the [[gnu::malloc(free)]] attribute can help with that too.
---
Look, Rust does have very good ideas. I don't claim it doesn't. But:
- You can improve your C code's safety significantly just by designing good APIs and self-limiting to a subset of the language.
- As a longer-term goal, you can probably add those Rust features to C.
In the end, a safe language is a language that prevents you from accidentally granting rights to an attacker.
You may call it luck, but I have been refactoring shadow-utils at a very high rate (including complaints by packagers who were worried that such a rate of churn would almost certainly introduce security vulnerabilities) for quite a few years already. So far, I have introduced 0 remote holes in a heck of a long time (at least, as far as we know). The code is now significantly safer than it was before I started. The more I refactor it, the safer I feel when doing so. You just need to follow some rules, and at least you'll have a hard time introducing a vulnerability. It's not impossible, but it's all a compromise.
I know the language so damn well that it offsets the theoretical benefits that Rust could give me. People speak their mother tongue better, even if it's significantly more complex than other languages, because they know it by heart. They say Norwegian is similar to English but simpler (and thus easier), but we speak English because we already know it. Would it be better if there was a big-bang change in the world to make Norwegian the global language? Maybe it would help learners in the long term, but we'll agree that it's not a good idea. The same holds for Rust and C, IMO.
Plus, for a setuid-root set of programs (which is what I'm mainly working on at the moment, apart from the Linux man-pages project), a logic error is as bad as a buffer overflow. If I toggle a conditional and grant root privileges to a random user in su(1), I've screwed up as badly as if I had caused the worst UB. That also diminishes the reasons for using Rust, _in my case_.
Posted Feb 14, 2025 23:33 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (6 responses)
I hope your projects survive you moving on. Projects using arcane knowledge to hold themselves up are at risk of becoming like one of the `roff` implementations (`nroff`?): inscrutable to even the other Unix prophets so as to be left alone after the original author's untimely end[1].
[1] At least if my memory of a BSD podcast which interviewed Bryan Cantrill where it was mentioned is accurate.
Posted Feb 14, 2025 23:45 UTC (Fri)
by alx.manpages (subscriber, #145117)
[Link]
That's a valid concern. I try to educate the other co-maintainers and regular contributors on those matters. But I should be more careful on that effort, just in case.
Posted Feb 15, 2025 23:40 UTC (Sat)
by mirabilos (subscriber, #84359)
[Link] (4 responses)
Jörg Schilling (RIP) also used to maintain a fork, similarly.
Posted Feb 17, 2025 7:32 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link] (3 responses)
Here's the source of my claim as well: https://www.youtube.com/watch?v=l6XQUciI-Sc&t=5315s
Posted Feb 17, 2025 21:23 UTC (Mon)
by mirabilos (subscriber, #84359)
[Link] (2 responses)
No, we’re talking about the 32V-based one.
Posted Feb 17, 2025 21:55 UTC (Mon)
by excors (subscriber, #95769)
[Link] (1 responses)
Posted Feb 17, 2025 22:42 UTC (Mon)
by mirabilos (subscriber, #84359)
[Link]
Posted Feb 13, 2025 2:10 UTC (Thu)
by milesrout (subscriber, #126894)
[Link] (3 responses)
>Ojeda's answer is: confidence. As a maintainer, when somebody sends in a patch, he may or may not spot that it's broken. In an obvious case like this, hopefully he spots it. But more subtle logic bugs can and do slip by. Logic bugs are bad, but crashes (or compromises) of the kernel are worse. With the C version, some other code could be relying on this function to behave as documented in order to avoid overwriting some part of memory. That's equally true of the Rust version. The difference for a reviewer is that Rust splits things into "safe" and "unsafe" functions — and the reviewer can concentrate on the unsafe parts, focusing their limited time and attention on the part that could potentially have wide-ranging consequences. If a safe function uses the incorrect version of f, it can still be wrong, but it's not going to crash. This lets the reviewer be more confident in their review.
The safety of a function marked "unsafe" depends on its correctness, which in turn depends on the correctness of any other code it depends on. That means that it is not true that you can focus your review of a patch purely on those functions marked "unsafe" and be assured that, if the code was safe before, it will be safe afterwards as long as "unsafe" code is untouched. The safety of an "unsafe" function typically depends on the maintenance of invariants by functions marked "safe".
The usual approach taken is apparently to restrict this to the "crate" scope by convention. That is, a crate's (package's) safety (if it contains "unsafe" code, which obviously any kernel Rust code must, right?) might depend on the correctness of other code within the crate, but should not depend on the correctness of code outside the crate to remain safe. For example, entrypoints to a crate should enforce invariants but internal functions may assume them.
When reviewing a crate as a whole it is true that if all the "unsafe"-marked code is safe, then the crate will be safe. But that review may require the inspection of "safe" code that operates on data structures, for example, whose invariants are relied on by the "unsafe"-marked code.
But when reviewing a patch to a crate, it cannot be assumed that you only need to check the correctness of the "unsafe" bits to know it is safe. You need to review the patch, even if it only affects "safe" functions, in the context of all the ways that code might be relied upon by unsafe code. So the review burden is not really any less in practice.
Posted Feb 13, 2025 2:28 UTC (Thu)
by intelfx (subscriber, #130118)
[Link] (2 responses)
That’s called “unsoundness” and it’s very much avoided.
Posted Feb 13, 2025 6:20 UTC (Thu)
by mb (subscriber, #50428)
[Link]
Unsafe code can depend on the behavior of safe code, but there are conventional boundaries to that.
But there are exceptions to that, too. Let me give you an example: It may be reasonable for some unsafe code in a crate to assume that the std implementation of Vec doesn't randomize the indexes internally. If the unsafe code assumes that [5] will give access to element number 5 of the Vec, then it is depending on the safe API of Vec to do the right thing. It would be possible to implement a memory-safe Vec that randomizes index accesses and never goes out of bounds.
What would not be Ok for example: Assume that random joe's crate implements PartialOrd correctly for some type. It's commonly understood as being unsound, if unsafe code depends on PartialOrd being implemented incorrectly.
Another more common example is that your unsafe code may depend on other crate-internal safe code. For example some safe initialization code often is required for unsafe code to access initialized memory.
However, the starting point for the safety analysis *always* is the unsafe-block.
Posted Feb 13, 2025 10:48 UTC (Thu)
by excors (subscriber, #95769)
[Link]
In this case I think you could choose to put the bounds check inside the `unsafe` block so the `unsafe` block has no preconditions, in which case a safety analysis would only have to consider the block (and anything called from within the block) and not the whole function. But in general you probably want to minimise the scope of `unsafe` blocks, so the compiler won't let you accidentally call other `unsafe` functions where you didn't intend to - the same motivation as https://doc.rust-lang.org/edition-guide/rust-2024/unsafe-... - which will result in `unsafe` blocks with preconditions that non-`unsafe` code must uphold.
That Rustonomicon page also gives an example where an `unsafe` block depends on the entire module upholding its preconditions. If you can't avoid module-level preconditions, I think it's important to minimise the size of your module to keep the safety analysis manageable. Reviewers will need to be careful about any patches to non-`unsafe` code in that module, and in any modules that it calls, but at least you can still be confident that patches to any other modules can't break it.
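A minimal sketch of the first option, putting the check inside the block (names invented):
```
fn get(v: &[u8], i: usize) -> u8 {
    unsafe {
        // The bounds check lives inside the unsafe block, so the block has no
        // precondition that surrounding safe code could silently break.
        assert!(i < v.len());
        *v.get_unchecked(i)
    }
}
```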
It can still crash
* A C++ reference points at the same thing for its entire lifetime - it cannot be changed to point at something else (like a const pointer, unlike a pointer-to-const).
* For that matter, there is no syntax to distinguish the reference from the pointee - it auto-dereferences where appropriate.
* For some reason, much of the documentation insists on avoiding the word "pointing" in relation to references, instead claiming that a reference is an "additional name" or "alias" for an object. But they are pointers, or at least there is no practical implementation other than as a pointer (on most reasonable architectures, in the general case, excluding cases where the optimizer manages to elide a whole object, etc.).
There are endless such possibilities.
That has nothing to do with "unsafe", though.
Ada does this too
It's quite common that the type system is used for logical correctness, too.
This type has nothing to do with safety, but it makes working with paths much less error prone than manually poking with strings. And it makes APIs better by clearly requiring a Path type instead of a random string.
If you do operations with such types the errors can't accidentally be ignored (and then lead to runtime crashes).
For example the compiler won't let you ignore the fact that paths can't always be converted to UTF-8 strings. It forces you to handle that logic error, unless you explicitly say in your code to crash the program if the string is not UTF-8.
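A small illustration of the path/UTF-8 point (sketch only):
```
use std::path::Path;

fn describe(p: &Path) {
    // to_str() returns Option<&str>: the non-UTF-8 case has to be handled
    // explicitly instead of being silently ignored.
    match p.to_str() {
        Some(s) => println!("valid UTF-8 path: {s}"),
        None => println!("path is not valid UTF-8"),
    }
}
```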
Ada does this too
2. You can't get an instance if the precondition fails to hold.
3. You can't invalidate the precondition while holding an instance (but you may be able to invalidate it and give up the instance in one operation). If you are allowed to obtain multiple instances that pertain to exactly the same precondition, then this rule is upheld with respect to all of those instances.
4. Functions can require you to have an instance of the type in order to call them.
5. The type is zero-size, or is a zero-cost wrapper around some other type (usually whatever object the precondition is all about), so that the compiler can completely remove it once your program has type-checked.
* An "affine" type may be used at most once (per instance of the type). In Rust, any type that does not implement Copy is affine when moved. Borrowing is not described by this formalism, but by happy coincidence, borrows simplify quite a few patterns that would otherwise require a fair amount of boilerplate to express.
*A "linear" type must be used exactly once. Linear types don't exist in Rust or most other languages, but Haskell is experimenting with them. Linear types enable a few typestate patterns that are difficult to express in terms of affine types (mostly of the form "invalidate some invariant, do some computation, then restore the invariant, and make sure we don't forget to restore it"). To some extent, this sort of limitation can be worked around with an API similar to Rust's std::thread::scope, but it would be annoying if you had to nest everything inside of closures all the time.
* "Ordered" types must be used exactly once each, in order of declaration. "Relevant" types must be used at least once each. These pretty much do not exist at all, at least as far as I can tell, but there is theory that explains how they would work if you wanted to implement them.
> there are a bunch of small details that need to be worked out before this can be done
Ada does this too
use std::mem::{transmute_copy, ManuallyDrop};

#[repr(transparent)]
struct Linear<T>(T);

impl<T> Drop for Linear<T> {
    fn drop(&mut self) {
        const { assert!(false) };
    }
}

impl<T> Linear<T> {
    fn unbox(self) -> T {
        // SAFETY: type Linear is #[repr(transparent)]
        let t: T = unsafe { transmute_copy(&self) };
        _ = ManuallyDrop::new(self);
        t
    }
}
panic! – not everyone likes to use -C panic=abort. And without -C panic=abort the compiler would complain about any code that may potentially even touch panic!
Ada does this too
> It does not promise that an unreachable const block is never evaluated (in the general case, that would require solving the halting problem).
Ada does this too
…ManuallyDrop and maybe some unsafe.
Ada does this too
…replace and put it into your crate without telling the compiler that it needs to do some special dance, then everything works. …unsafe code and standard library – which would, essentially, turn it into a different language. …static_assert-based linear types work are not two different kinds of work, but, in fact, exactly the same work.
Ada does this too
> Currently, the reference[1] says that this is allowed to fail at compile time:
// The panic may or may not occur when the program is built.
const { panic!(); }
}
Ada does this too
const evaluation happens after monomorphisation, not before. It's exactly like C++ templates and should work in similar fashion and require similar resources. Templates in C++ were handled decently by EDG 30 years ago, and computers were much slower back then. If static_assert in a linear type destructor fires when it shouldn't, it's a matter of tightening the specifications and implementations, not a question of doing lots of math and theorem-proving.
Ada does this too
> We do not want to recreate the C++ situation where all the types match, but then something blows up in monomorphization.
Ada does this too
…with for. And said lending iterator was a showcase of GATs four years ago.
[2]: https://doc.rust-lang.org/src/core/iter/traits/iterator.r...
Ada does this too
2. Values are dropped on panic, if the compiler can't prove your code won't panic then it will have to generate the drop glue.
Ada does this too
* I'm not sure that strongly-undroppable types provide much of a useful improvement over weakly-undroppable types, and it would be absurd to provide both features at once. But it could be argued that having one exceptional case where drop glue is invoked is a recipe for bugs, so it might be a cleaner implementation. OTOH, if you have chosen to have unwinding panics, you probably don't want them to magically transform into aborts just because some library in your project decided to use an undroppable somewhere. I currently think that weakly-undroppable is the more pragmatic choice, but I think there are valid arguments to the contrary.
* Unleakable types would allow the original API for std::thread::scope to be sound, and would probably enable some other specialized typestates. They're also pretty invasive, although probably not quite as much as undroppable types. They would not solve the File::close() problem.
* Strongly-linear types are just the combination of two of the above features. If either feature is too invasive to be practical, then so are strongly-linear types. But they would provide a strong guarantee that every instance is cleaned up in one and only one way.
	std::thread::spawn(move || { let _t = t; loop { std::thread::park(); } });
}
Errors on close
For example, the std::fs::File type currently does not have a close() method, because the type already closes itself on drop. But that means that it swallows errors when it is closed. The documentation recommends calling sync_all() (which is equivalent to fsync(2)) if you care about errors, but I imagine that some filesystem developers would have choice words about doing that in lieu of checking the error code from close(2). If File were weakly-undroppable, then it could provide a close() method that returns errors (e.g. as Result<()>) and fails the compilation if you forget to call it
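A rough sketch of what is possible today without undroppable types; the wrapper is hypothetical, and nothing forces a caller to actually invoke close():
```
use std::fs::File;
use std::io;

// Hypothetical wrapper: forgetting close() silently falls back to the
// error-swallowing drop; an undroppable File would make that a compile error.
struct CheckedFile(File);

impl CheckedFile {
    fn close(self) -> io::Result<()> {
        // sync_all() surfaces I/O errors; the fd itself is still closed by
        // File's Drop impl afterwards, whose return value is ignored.
        self.0.sync_all()
    }
}
```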
Errors on close
* Never call fsync and never check the close return code. Significant probability of silent data loss, but faster.
* Never call fsync, but do check the close return code. Presumably a lower probability of silent data loss, since you might catch some I/O errors, but almost as fast as not checking the error code (in the common case where there are no errors). In the worst case, this is a main memory load (cache miss) for the errno thread-local, followed by comparing with an immediate.
Errors on close
But, by the nature of Linux's close syscall, error on close means one of three things:
Running several checkers instead of one
It's not the runtime of the checker that's the problem; it's that we cannot write a checker that definitely accepts or rejects all reasonable programs. The underlying problem is that, thanks to Rice's Theorem (a generalisation of Turing's Halting Problem), a checker or combination of checkers can, at best, give you one of "undecided", "property proven to hold", or "property proven to not hold". There's two get-outs we use to make this tractable:
Running several checkers instead of one
Idris has both as part of its totality checker; a function can be partial (in which case it may never terminate or produce a value - it can crash or loop forever) or total (in which case it must either terminate for all possible inputs, or produce a prefix of a possibly infinite result for all possible inputs).
Uses of termination checking
If you run all the currently known termination-checker algorithms that actually come up with useful results (with the exception of "run until a timeout is hit; say it might not terminate if the timeout is hit, or that it terminates if it is not"), you're looking at a few seconds at most. The pain is not the time that the known algorithms take; it's that most of them return "undecided" on programs that humans can tell will terminate.
nullability annotations in C
To see why _Optional leads nowhere, one only needs to look at the fate of const. It was added when C was relatively young, and it was kinda-sorta adopted… but not really. A function like strchr() has to work with both mutable and immutable strings (there was a difference between them when it was introduced)… so the best they could do was to add this blatant const safety violation: it accepts immutable strings yet returns a pointer to a mutable string. And everyone remembers the fate of noalias, doesn't it? Instead of trying to decide whether the result is a mutable or an immutable string, find (both in C++ and in Rust) returns a position.
nullability annotations in C
> immutable strings… yet returns pointer to a mutable
> string! WTH? What's the point? What kind of safety it is?
> Instead of trying to decide whether the result is mutable or
> immutable string, find (both in C++ and in Rust) returns
> position.
You may be happy that C23 changed that.
nullability annotations in C
…const or noalias.
nullability annotations in C
> I'm 31. I hope to continue using C for many decades. :)
nullability annotations in C
```
alx@devuan:~/tmp$ cat strchr.c
const char *my_const_strchr(const char *s, int c);
char *my_nonconst_strchr(char *s, int c);

#define my_strchr(s, c) /* macro name assumed */ \
( \
	_Generic(s, \
		char *:       my_nonconst_strchr, \
		void *:       my_nonconst_strchr, \
		const char *: my_const_strchr, \
		const void *: my_const_strchr \
	)(s, c) \
)
alx@devuan:~/tmp$ gcc -Wall -Wextra -pedantic -S -std=c11 strchr.c
alx@devuan:~/tmp$
```
nullability annotations in C
>
> No, it's not. It's only “more useful” if you insist on zero-string abominations.
> If your strings are proper slices (or standalone strings on the heap…
> only C conflates them, C++, Rust and even such languages as Java and C# have separate types)
> then returning pointer is not useful.
> You either need to have generic type that returns slice or return index.
> And returning index is more flexible.
```
s.str = malloc(s1.len + s2.len + s3.len);
p = s.str;
p = mempcpy(p, s1.str, s1.len);
p = mempcpy(p, s2.str, s2.len);
p = mempcpy(p, s3.str, s3.len);
s.len = p - s.str;
```
```
s.str = malloc(s1.len + s2.len + s3.len);
s.len = 0;
s.len += foo(s.str + s.len, s1.str, s1.len);
s.len += foo(s.str + s.len, s2.str, s2.len);
s.len += foo(s.str + s.len, s3.str, s3.len);
```
nullability annotations in C
s = sdscatsds(s, s1);
s = sdscatsds(s, s2);
s = sdscatsds(s, s3);
sdsfree(s);
nullability annotations in C
> With a managed string like you propose, you're effectively blinding the compiler from all of those operations.
nullability annotations in C
> But the smaller the APIs are, the less work you impose on the analyzer, and thus the more effective the analysis is (less false negatives and positives).
nullability annotations in C
> I never cared about optimized code.
nullability annotations in C
Modern languages do that for free, though.
nullability annotations in C
And because "I always did it like this" isn't an argument that helps in discussions.
nullability annotations in C
> Please reconsider your language.
nullability annotations in C
[str1, str2, str3].concat().into_boxed_str()
And that's it. In a C-like language that doesn't use a “dot” to chain functions, it would be something like:
string_to_frozen_string(concat_strings(str1, str2, str3))
Or, maybe, even just
concat_strings(str1, str2, str3)
The string.h interface is awful, as a whole.
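For what it's worth, the Rust one-liner quoted above works as-is with today's standard library; a small self-contained version (the literal strings are placeholders invented for this sketch):

```
fn main() {
    let (str1, str2, str3) = ("foo", "bar", "baz");
    // concat() on a slice of &str builds one String; into_boxed_str()
    // then shrinks it into an exactly-sized, immutable Box<str>.
    let s: Box<str> = [str1, str2, str3].concat().into_boxed_str();
    assert_eq!(&*s, "foobarbaz");
}
```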
nullability annotations in C
> const only means "I won't modify the referent
> through this pointer", when what everyone wants most of
> the time is "nothing modifies the referent while I'm
> holding this pointer", i.e. what Rust gives you.
nullability annotations in C
There was a research project at Berkeley (George Necula et al.) across several years (including 2006), apparently called Ivy (although early presentations did not use that name). The idea was that existing C code could be made safe piecewise (which requires sticking with the ABI among other things, unlike C implementations with fat pointers) by adding annotations in some places.
nullability annotations in C
My understanding (from hearing a talk by George Necula) is that the Ivy tools would complain if they do not know anything about array bounds, and that you can turn off such complaints for parts of the source code that you have not yet enhanced with annotations, like Rust's unsafe.
nullability annotations in C
Note that Rust's unsafe does not turn off complaints; it gives you access to abilities that you can use unsoundly, but just adding unsafe to code that Rust rejects will not normally cause it to accept it.
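A small sketch of that distinction: unsafe unlocks specific extra operations (dereferencing a raw pointer, here), but everything else in the block is still checked as usual.

```
fn main() {
    let x: i32 = 42;
    let p: *const i32 = &x;
    // `unsafe` grants the ability to dereference a raw pointer...
    let y = unsafe { *p };
    println!("{y}");
    // ...but it does not disable the borrow checker or the type checker:
    // code that Rust rejects is still rejected inside an `unsafe` block.
}
```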
nullability annotations in C
Porting from malloc()/free() to garbage collection is easy: just delete the calls to free() (or define them as noops). There is one pathological case for conservative garbage collectors (a linked list that grows at the end where you move the root pointer along the list; any spurious pointer to some element will cause the list to leak), but it's a rare idiom.
nullability annotations in C
<https://thephd.dev/the-big-array-size-survey-for-c>
```
wchar_t *
wmemset(size_t n;
        wchar_t wcs[n], wchar_t wc, size_t n)
{
	for (size_t i = 0; i < _Countof(wcs); i++)
		wcs[i] = wc;

	return wcs;  /* wmemset() returns its first argument */
}
```
Rust tutorial via mailing list
* The "getting started" section should probably just be a copy or alias of https://docs.kernel.org/rust/quick-start.html. The rest of the tutorial should be written assuming that those steps have been done and that (e.g.) all R4L unstable features are enabled and the compiler is not going to complain about them.
* It should not demonstrate any API that would never be used in (most) kernel code. Do not demonstrate how to use std::boxed::Box, because std is not linked into R4L. Instead, if a Box is to be explained at all, it should show how to use kernel::alloc::kbox::Box, which has a different API.
* It should not hide behind the 'nomicon. Unsafe code should get a proper tutorial of its own, preferably as a continuation of the main tutorial. It should not be handwaved as "too hard" or "you won't need it," because anyone doing Rust-C FFI absolutely will need to use unsafe in certain places, and those are the exact places where maintainers will be expected to look when changing C APIs.
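As a minimal illustration of the last point, calling a C function from Rust is one of the places that must appear inside an unsafe block. This sketch uses libc's strlen purely as a familiar example and assumes the 2021 edition (where a plain extern block is accepted); it is not kernel code.

```
use std::ffi::CString;
use std::os::raw::c_char;

extern "C" {
    // The declared signature is a promise the programmer makes to the
    // compiler; it is exactly what a reviewer has to double-check.
    fn strlen(s: *const c_char) -> usize;
}

fn main() {
    let s = CString::new("hello").unwrap();
    // The call itself must be wrapped in `unsafe`: this is where review
    // attention concentrates when a C API changes.
    let n = unsafe { strlen(s.as_ptr()) };
    assert_eq!(n, 5);
}
```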
Catching the mismatch
> And neither language helps the programmer catch the mismatch between the comment and the behavior of the function.
warning: unused variable: `b`
--> src/main.rs:1:15
|
1 | fn f(a: &i32, b: &i32) -> bool {
| ^ help: if this is intentional, prefix it with an underscore: `_b`
|
= note: `#[warn(unused_variables)]` on by default
demo.c: In function ‘f’:
demo.c:3:21: warning: unused parameter ‘b’ [-Wunused-parameter]
3 | bool f(int *a, int *b) {
| ~~~~~^
One key usability difference here (and C compilers could adopt this): Rust lets you signal to the compiler that a parameter or variable is intentionally unused for now with a single-character prefix, _. Suppressing the equivalent warning in GCC requires an attribute, so you'd have to write something like bool f(int *a, [[gnu::unused]] int *b) or bool f(int *a, __attribute__((unused)) int *b) (I may have put the attribute in the wrong place).
Catching the mismatch
I just checked, because I recalled that the Standard reserves names beginning with an underscore; but that only applies to globally visible names, not local ones, so it'd be fine to use a name beginning with an underscore to suppress the warning.
Catching the mismatch
It makes a huge difference to the usability of the warning. Ideally, you want your codebase to be warning-free after each reviewed patch is applied[1], so you want to be able to quickly suppress an unused variable warning when it's only going to exist until the next patch in the series is applied. You also want the name to stay around, so that if the code is usable as-is, with only part of the series applied, anyone using it has a clue about what value to fill in.
Catching the mismatch
One "interesting" surprise in Rust, however, is that in let bindings, there is a significant difference between let _ = … and let _foo = …; in the former, whatever you put in place of … is dropped immediately, while in the latter, it's dropped when _foo goes out scope (just as it would be if you used foo instead of _foo).
Underscores in Rust
As useful as this can sometimes be, e.g. when you really don't care about a Result, I wish clippy would suggest using drop(…) instead of allowing let _ = ….
Underscores in Rust
There's a lint to help with that - #![warn(let_underscore_drop)] will catch all the footgun cases where you've used let _ =. It won't stop you using it completely, just in the cases where there's a risk of changed behaviour due to the rules around drop timing.
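A minimal sketch of the lint in action, assuming a toolchain recent enough to know let_underscore_drop, with a MutexGuard as the value whose drop has side effects:

```
#![warn(let_underscore_drop)]

use std::sync::Mutex;

fn main() {
    let m = Mutex::new(0);

    // Warns: the MutexGuard is dropped (and the lock released) immediately.
    let _ = m.lock().unwrap();

    // No warning: a named binding keeps the guard alive until end of scope.
    let _guard = m.lock().unwrap();
}
```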
Underscores in Rust
All usable languages have footguns - if they didn't, they'd also be blocking you from doing something useful.
Underscores in Rust
It's very specifically a special case where you name a let binding _; you can't read it (_ isn't a real variable), and it drops anything bound to it immediately. _foo and foo behave in exactly the same way, however.
Underscores in Rust
It's not quite as bad as you might think from the above example about a mutex, because a Rust-y mutex API (such as std::sync::Mutex) returns a guard object, which holds the mutex locked until it's dropped, and there's no way to get to the value protected by the mutex without having a guard object. (That is, if you want a mutex-protected structure, you write it as a mutex object that wraps the rest of the data, to make you deal with the mutex before getting to the data, as opposed to a structure with several members, one of which is the mutex that protects the other members, where you can easily bypass the mutex intentionally or unintentionally.) Usually this is implemented via the Deref/DerefMut traits, where you can treat the guard object as a smart pointer and do let mut guard = mutex.lock(); *guard += 1 or whatever, but you can also choose to design an API where the guard object has some sort of methods that return borrowed references to the data. The borrows cannot outlast the guard object, and the mutex is locked so long as the guard object remains in scope.
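A short illustration of the guard-based pattern described above, using std::sync::Mutex from the standard library:

```
use std::sync::Mutex;

fn main() {
    // The data lives *inside* the mutex; the only path to it is lock().
    let counter = Mutex::new(0_i32);

    {
        // The guard derefs to the protected data (Deref/DerefMut)...
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    } // ...and the mutex is unlocked when the guard is dropped, here.

    println!("counter = {}", counter.lock().unwrap());
}
```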
Underscores in Rust
But there are rare exceptions:
https://docs.rs/tokio/1.43.0/tokio/sync/struct.Semaphore....
Underscores in Rust
Or in other words, there usually isn't a pattern of "get this RAII object and keep it around for its side effect while doing other stuff". Either you get the RAII object and actually reference it in the stuff you're doing, or you're using some non-RAII API like raw bindings to explicit lock() and unlock() calls where automatic drop isn't relevant.
Catching the mismatch
The comment trick q3cpma points out would work, but I'd normally want to have a name in there because of the partially applied patch series problem, where I want people to be aware that this parameter does have a meaning.
Catching the mismatch
(void)b;
...
}
Unused parameters in C are easy
```
alx@devuan:~/tmp$ cat unused.c
#include <stdio.h>

int
main(int, char *argv[])
{
	puts(argv[0]);
	return 0;
}
alx@devuan:~/tmp$ gcc -Wall -Wextra unused.c
alx@devuan:~/tmp$ ./a.out
./a.out
```
OK, so how do I name the parameter for documentation purposes while marking it as unused?
Unused parameters in C are easy
main(int /*argc*/, char *argv[])
{...}
main(int argc, char *argv[])
{
(void) argc;
}
The cast-to-void is nice, when you want a name.
I'm sorry, you've confused me; why is Rust's use of _ignored_cpu_mask "no documentation name either"? All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own. And the C things so far are all much more work than just adding an `_` to the name to indicate that it's currently deliberately unused - my contention is that if you don't make it really simple, people will prefer to turn off the entire warning rather than use the convention for unused parameters.
Unused parameters in C are worse than in Rust
> All names starting with an underscore are flagged as explicitly unused, not just the underscore on its own.
main(_ int argc, char *argv[])
{...}
Video now available
Too much kool-aid
```
ret = OPENSSL_zalloc(sizeof(*ret))
```
```
ret = XXX_ZALLOC(1, OSSL_COMP_CERT);
```
<https://github.com/shadow-maint/shadow/blob/master/lib/al...>
<https://github.com/shadow-maint/shadow/commit/6e58c127525...>
<https://github.com/shadow-maint/shadow/commit/09775d3718d...>
Too much kool-aid
vs
Blagh new_blagh(char* s);
* The callee can be sure that `s` won't be modified by another thread while `foo` is running
* The callee can be sure that it is not responsible for freeing `s`
* The callee understands that if `Blagh` contains part of `s` then it needs to copy that data into `Blagh`
* The caller can be sure that the returned object does not depend on `s` being kept alive
* The caller can be sure that it retains ownership of `s`
* The caller can be sure that it has exclusive ownership of `Blagh`, i.e. nothing depends on it keeping `Blagh` alive
Of course Rust not only forces you to write these critical properties into the code, it also checks that the caller and callee maintain them.
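For illustration, a plausible Rust counterpart to the C declaration above; the exact signature from the original comparison is not shown in the comment, so the names and the choice of &str here are assumptions made for this sketch:

```
struct Blagh {
    // Owns its own copy of whatever it needs from `s`.
    text: String,
}

// `&str` is a shared borrow: the callee cannot free it, cannot keep a
// reference past the call (Blagh has no borrowed fields), and must copy
// anything it wants to keep. Returning Blagh by value hands the caller
// exclusive ownership.
fn new_blagh(s: &str) -> Blagh {
    Blagh { text: s.to_owned() }
}

fn main() {
    let s = String::from("hello");
    let b = new_blagh(&s); // the caller still owns `s` afterwards
    println!("{} / {}", s, b.text);
}
```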
C arcane knowledge, bus factor
https://mbsd.evolvis.org/cvs.cgi/src/usr.bin/oldroff/nroff/ still shows some of the horrors.
The safety of unsafe code can depend on the correctness of safe code
For example, that trust should not cross crates.
But it is reasonable to assume that the std library does the right thing instead.