A GPIO driver in Rust
[Side-by-side listing: C version | Rust version (code table not reproduced)]
Posted Jul 19, 2021 16:15 UTC (Mon)
by bredelings (subscriber, #53082)
[Link] (15 responses)
-BenRI
Posted Jul 19, 2021 16:18 UTC (Mon)
by NHO (subscriber, #104320)
[Link] (1 responses)
Posted Jul 19, 2021 17:10 UTC (Mon)
by pbonzini (subscriber, #60935)
[Link]
Posted Jul 19, 2021 16:55 UTC (Mon)
by logang (subscriber, #127618)
[Link] (9 responses)
Linux style requires the open brace of a function to be on its own line, a blank line after a function's variable declarations, and often a blank line before the return statement. The Rust code also uses 4-character indents and ignores the line-length limit, which lets a few lines avoid wrapping. Not to mention the Rust code mixes variable declarations into the body of the code, which saves a few more lines and is also against the style guide.
If the Rust code didn't ignore the style guide (or the C code was written in a similar compressed style) the line difference would be far less pronounced. IMO all this makes the rust code harder to read than the C code. I suspect the Rust developers would have less resistance if they followed kernel coding style more closely instead of assuming a new language allows them to define their own style.
The biggest line savings seem to be in the include list (where rust can put multiple includes on a single line) and the initialization of the gpiochip and irqchip methods (which Rust gets away without needing by adding an additional indent on the definition of those methods).
Posted Jul 19, 2021 18:04 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link]
In Rust, one can do `let` declarations at the top of the function, but I find the C pattern of `int ret; /* decls and setup code */; ret = computed_result;`, where `ret` cannot be `const`, *far* worse than `let result = { /* block which computes */ };`. Rust can even do the hoisted form *and* still have it be `const`, but… why do the hoist?
> Not to mention the Rust code mixes variable declarations in the body of the code which saves a few more lines and is also against the style guide.
In Rust, this can be *very* important if the value of a variable is not available until the middle of the function. Declaring it at the top of the function can make its lifetime "too long" to pass off to some narrower-lifetime API.
> I suspect the Rust developers would have less resistance if they followed kernel coding style more closely instead of assuming a new language allows them to define their own style.
Some of the rules come from C being C. Those are fine to let lie on the floor. Determining which is which, however, is where the flamewars then lie… Variable declarations needing to be hoisted (and the "Christmas tree" corollary) are such rules. Indentation, extra newlines, and brace placement? I like hoisted braces for vertical savings (which allows for more spacing between "sections" of code too). Indentation is something I've come to care less about as long as it's "enough" to find a matching section by eyeball. 2 is… fine, but not ideal. 4 is great. 8 is OK (though I find people's consistency in using spaces-for-alignment in tab-using source lacking). Tabs also break alignment in diffs (my main pet peeve with them).
Posted Jul 19, 2021 19:06 UTC (Mon)
by jezuch (subscriber, #52988)
[Link]
Rust is not ignoring the style guide, it's just using a different one (Rust's). In contrast to C and almost all languages before it, Rust has an official style guide. You deviate from it at your own peril!
Posted Jul 19, 2021 19:15 UTC (Mon)
by kay (guest, #1362)
[Link]
OTOH the Rust code always uses braces for if and else statements; the C code does not in the case of a single-instruction if or else.
Posted Jul 19, 2021 20:44 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
BTW it is remarkably ignorant to believe C99 and every other programming language combined declarations and initializations for pure "code style" reasons.
Posted Jul 19, 2021 20:48 UTC (Mon)
by marcH (subscriber, #57642)
[Link] (4 responses)
Plus 75 lines saved out of 425 is not the kind of difference that really matters in practice anyway, especially not considering all the other big differences that actually matter.
Posted Jul 19, 2021 22:27 UTC (Mon)
by Paf (subscriber, #91811)
[Link] (3 responses)
Posted Jul 19, 2021 22:40 UTC (Mon)
by marcH (subscriber, #57642)
[Link] (2 responses)
But even that could be deceiving because one language or its code style could be "too terse" requiring more comments which would make the code longer in the end.
> it’s interesting to ask why at least.
Comparing the lengths of two solutions in the same language has some value but for two totally different languages it's apples versus oranges because pretty much every language difference can affect the end total. These differences are worth discussing individually, not as an aggregate basket of disparate things.
If the main discussion topic that comes up when comparing equivalent C and Rust code is a 20% difference in the number of lines then maybe it was not very useful to have them side by side.
Posted Jul 19, 2021 23:49 UTC (Mon)
by tialaramex (subscriber, #21167)
[Link] (1 responses)
Or conversely, maybe it was useful because it turned out one of the main topics was "This one is a bit longer/ differently formatted" which suggests a Rust driver and C driver are not so very different after all.
I think it's much easier to convince programmers that a new language is utterly alien and terrifying if you present some random snippet they're unfamiliar with. Stuff whose equivalent they would never dream of writing at all, may convince them the language is ill-suited to their purpose. If we showed this device driver code to somebody wondering if they can rewrite their web app in Rust they're going to freak out. Interrupts? Register banging? What's going on? Well, it's a kernel device driver, those things are exactly to be expected, and contrariwise, this kernel code lacks string interpolation, asynchronous function calls, and so on you'd find in the web app.
Idioms don't translate 100% and I think the ensuing summit list discussion shows people feeling their way around and this LWN discussion likewise. I can imagine some kernel style being enforced for kernel Rust, I doubt Linus is going to demand that Rust macros yell louder, but he very well might insist on different indentation and line length rules, rustfmt should cheerfully fix that without any significant human labour.
So this example driver was a great idea, and I think the side-by-side layout was, though not essential, a useful contribution to understanding.
Posted Jul 20, 2021 0:44 UTC (Tue)
by marcH (subscriber, #57642)
[Link]
That's a much more optimistic perspective on "bikeshedding" and I genuinely appreciate it, thanks!
Posted Jul 20, 2021 8:46 UTC (Tue)
by mchehab (subscriber, #41156)
[Link]
That's not relevant at all, especially in the kernel. I'm a lot more interested in the output of `size` for both the C and Rust code (plus the extra size of the Rust-to-C bindings, if any), as this matters a lot, especially for embedded and IoT, where the constraints on RAM usage can be very tight.
Posted Jul 20, 2021 10:12 UTC (Tue)
by ras (subscriber, #33059)
[Link] (1 responses)
Using vim as a very rough tool to break the code into tokens yields: C = 2134 tokens, Rust = 2261 tokens.
I expected Rust to be higher than C. Putting all that type information on each line comes at a cost. I am somewhat surprised the cost is only 6%. It might even be within the error margin of my rough tool.
Posted Jul 20, 2021 10:54 UTC (Tue)
by pbonzini (subscriber, #60935)
[Link]
The thing that surprises me the most is how different it looks from C. For example I couldn't follow how the irqchip can be used from C code, all I could see is an implementation of irq::Chip.
That on one hand means that the bindings are idiomatic, on the other hand it means the Linux code base would be split in two even more than I expected.
Posted Jul 19, 2021 17:07 UTC (Mon)
by jgg (subscriber, #55211)
[Link] (15 responses)
... data: &Ref<DeviceData> ...
Quick! Tell me if this code is now in an atomic context? Should it use GFP_KERNEL or GFP_ATOMIC? Does rust check this kind of stuff at compile time?
This is not an improvement (again, what type is this, what do the values mean?):
declare_id_table! {
vs
static const struct amba_id pl061_ids[] = {
The CONFIG_PM memory consumption optimization seems to have been lost?
The trick with the drvdata is... interesting.. I wonder how that works with something like sysfs files that often will read the drvdata? The ordering of the set will be wrong. Could rust tell at compile time?
Posted Jul 19, 2021 17:16 UTC (Mon)
by pbonzini (subscriber, #60935)
[Link] (3 responses)
You trade that for not having to remember, for every spinlock, whether it's the irqsave kind or the "normal" one. It's probably possible to improve on both the Rust problem (too much type erasure) and the C problem.
> Does rust check this kind of stuff at compile time?
In some cases, it's probably possible to pass in a guard that ensures that a given function is called with disabled IRQ or with a given spinlock taken.
Posted Jul 19, 2021 18:36 UTC (Mon)
by jgg (subscriber, #55211)
[Link] (2 responses)
Collapsing everything to an irqsave is a simplification, but not necessarily an improvement..
Posted Jul 19, 2021 21:31 UTC (Mon)
by pbonzini (subscriber, #60935)
[Link]
> if you know you are not already in an irq disabled region you should be using just spin_lock_irq()
That really only matters in 1% of the calls, probably. It should be possible to implement it in Rust in such a way that the type system validates it, though.
Posted Jul 20, 2021 17:02 UTC (Tue)
by willy (subscriber, #9762)
[Link]
We could, in principle, make spin_lock_irq() behave like spin_lock_bh(). That is, it automatically takes care of nesting, can be called in all contexts, and doesn't need a flags argument passed to it (because it stores its state in current).
Would it really have worse performance than the current spin_lock_irq() and spin_lock_irqsave() pair? I'm a little busy right now so I haven't delved into it. But if anyone's looking for a project ...
Posted Jul 19, 2021 17:45 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
This declares a table of IDs (hence the macro name); it would be a simple change to make it:
declare_amba_id_table! {
if the Rust binding authors want you to, so that it's (a) clear this is an AMBA ID table, and (b) the id and mask elements are clearly demarcated. That's a stylistic choice that Linus et al can definitely weigh in on.
Posted Jul 20, 2021 12:59 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link] (9 responses)
So in this respect C and Rust are no different from each other, you get fancy high level "structure" types in the programming language but the machine code leaves no trace of those types, just whatever register sizes and arithmetic operations exist on that CPU.
I think what you're worried about might be Rust's type inference? This is optional, and I guess the kernel could insist on some more (or even _far_ more, but Linus seems too pragmatic to go down that road) verbosity.
That is, Rust doesn't mind whether you write:
let x: Foozle = something.getFoozle(); // x is a Foozle
or let the compiler infer from its type signature that getFoozle() returns a Foozle, so obviously:
let x = something.getFoozle(); // x is still a Foozle. Same type, just inferred by the compiler
The latter is more idiomatic, but if kernel style says you need to spell out that getFoozle results in a Foozle, so be it.
The same inference can be (and is) used elsewhere, for example:
let sizes: Vec<u32> = something.collect(); // sizes is a Vec of u32s, so the collect must make one of those
gets the same code as if you spell out to the compiler what sort of collect to use instead of what type the result must be:
let sizes = something.collect::<Vec<u32>>(); // Now we told the compiler which collect() to use, it can infer the type of sizes
In this case the former is the idiomatic example, but the latter illustrates a "turbofish" ::<> operator that lets you explicitly tell the compiler the type parameters of a generic function. A future kernel Rust style guide could insist on the turbofish everywhere, or even demand both the turbofish and the type of sizes, but I expect many Rust programmers would wonder why, the compiler can do this for us, why tell it what you both already know?
It's certainly useful to stop sometimes and contemplate in Rust, what *is* the type of some.thing(in).this().chain() ? You can teach Vim or Emacs and presumably shiny GUI editors to tell you what the type is of things you're looking at, since Rust is fairly well suited to this question, so it may be that quickly people writing more than one or two patches a year to Rust in the kernel have those tools and never experience concern. But it's also possible Linus decides the style guide needs to force them to use more variables and always spell out types as they go. Choosing never to do type inference should have zero impact on output code, though I can't say the same for the impact on source code readability.
Posted Jul 21, 2021 19:19 UTC (Wed)
by jezuch (subscriber, #52988)
[Link] (8 responses)
I only wanted to say that type inference in Rust feels like magic at times. This is not the best example of this, but yes, it's usually when `collect()` is involved. I sometimes think it could plausibly be called `dwim()` ;)
Posted Jul 22, 2021 10:38 UTC (Thu)
by tialaramex (subscriber, #21167)
[Link]
Stroustrup claims that innovations in C++ are often obliged to be very loud because suspicious C++ programmers worry they won't notice the new thing and will be surprised by its consequences. Such innovations can get through committee, but only when given some extra verbosity they didn't really need, to shake readers awake; so that's what happens. And you end up a few years later writing:
annoying verbosity thing = more annoying verbosity { cool_thing.cool [ more verbosity is obligatory here too ] };
... and now that everybody agrees it's awesome, they find the verbosity annoying, there's pressure to do more standardisation effort so eventually years later it can be:
annoy thing = more annoy { cool_thing.cool [ok ] };
... but Bjarne feels like if they just trusted that it's going to be awesome it need only ever have been
thing = { cool_thing.cool };
To some extent Rust editions give them more options to get in or out of syntactic decisions like this, since an edition is almost completely free to change how Rust is spelled, so long as what it means is unaffected.
But the concerns about quiet changes are real, and I expect the kernel folks to be more worried about them than the average Rust programmer, and thus, less happy to see magic. Requiring a turbofish for every collect() feels crazy to me, but it might not seem as crazy to kernel maintainers.
Posted Jul 22, 2021 18:45 UTC (Thu)
by jgg (subscriber, #55211)
[Link] (6 responses)
The blessing times are when the type is so convoluted and complex that a human can barely manage it. This situation comes up with meta programming quite a bit - ie have a function take these types as input and generate some wild type as output.
The curse is that nobody knows WTF is going on any more. `auto x`: is that an int? A class? How can I use this thing? Basically you are programming in Python at this point. At least in C++, auto didn't make things worse, as prior to auto the "how can I use this thing" would typically already have been obfuscated to death by the MPL environment.
But here we are going from a C project where everything is "simple" straight to "read the code and you still have no idea WTF is going on". I wonder if people will accept that. Are people going to upgrade their editors to have inspection?
Will someone figure out tooling to replace grep? (I would dearly love to have a fast semantic grep even in C)
Posted Jul 22, 2021 20:01 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link] (1 responses)
- the compiler will figure it out or it will error and I need to help it (not saying anything); or
In C++, `auto` can do quite a number of things including indicating a copy, move, lifetime extension, etc. depending on how one decorates the `auto`. There are lints from `clang-tidy` to at least decorate your `auto` with `const`, `&`, and `*` as appropriate so you at least know the *syntax* of how to use the variable in question and whether it is modifiable. But in all the Rust I've done, the types in the signatures (no `auto` allowed there) have been sufficient. Really, the same as in Haskell most of the time when I did work there. This is largely because of the better ability of these languages to express type combinations and such.
There are times you want to specify what *shape* your types are when you're working with generic code (usually via the `collect()` mentioned elsewhere in the thread, but `.into()` is also handy to decorate at times), but even there I let the compiler just handle what type is inside the container (e.g., via `.collect::<Vec<_>>()`). AFAIK, there's no placeholder in C++ to say "figure this part out" like `std::vector<auto> v = some_vector_returning_api();` and instead you are left with either naming the whole thing or obfuscating the whole thing.
Posted Jul 31, 2021 14:18 UTC (Sat)
by dlazerka (guest, #74781)
[Link]
I came from the Java world and it's a long-forgotten problem there. Your IDE knows the type, so it either shows it (in a small grey font) as if it were typed there but really isn't, or, if you don't like that, the IDE shows it when your cursor is on the variable name, and hotkeys like Ctrl-B will show you the type, maybe even with its documentation, in a popup.
I believe IntelliJ does that for Rust as well. But for C++ on large projects CLion couldn't pick it up, probably because C++ world doesn't have "default" builder and it's hard to support all the tricky ways people build C++.
Posted Jul 30, 2021 1:22 UTC (Fri)
by tialaramex (subscriber, #21167)
[Link] (2 responses)
As I see it, the main problem is that C++ chooses to deduce types when the programmer intent isn't at all clear.
Inference is used in Rust to let the programmer not state the obvious. Whenever what is meant isn't obvious, Rust prefers to have the programmer spell out what they mean. The language owners have frequently been conservative here, identifying narrow cases that are obvious, inferring those cases only. If the covered inference is too narrow they can iterate. Code which makes explicit something that subsequently became obvious and could now be inferred still compiles, more inference does not break anything.
But because C++ deduction is mandatory in some places, and can be invoked elsewhere with auto even when the type is not clear, C++ must have some rules to "deduce" a type even when it isn't at all clear to the programmer, which is where the surprises leak in.
I think a LOT of places where auto is surprising in C++ ought to be compile errors instead. "Nope, I don't know for sure what the intended type is, write it down so I can check we're both on the same page".
As to tooling, there already exist popular Rust tools that will tell you what types things are, e.g. if you hover over something, or in greyed "ghost" text in your editor next to the line with the inferred type, or by context clicking. If there isn't already a way to say "Find the places I use the type Vec<Foo<i32>>" then I wouldn't anticipate it being difficult to add. One of the things Rust dodged that C++ inherited from C is the tool-hostile macro system. Any non-trivial C++ codebase is riddled with macros which make it impossible for simple tools to understand the source code, but Rust's ordinary ("by example" or declarative) macros are at worst opaque to such tools (ie the tool doesn't understand some macro calls but can press on unfazed), while its "procedural" macros - which are effectively code running inside your compiler - at least respect the idea that the program consists of tokens, not just arbitrary text they should mangle. So a tool could definitely misunderstand what's going on in your source due to a proc macro in theory, but in practice that's rare.
Posted Jul 30, 2021 8:07 UTC (Fri)
by farnz (subscriber, #17727)
[Link] (1 responses)
Part of the difference between the two is that Rust permits partial declaration of an unclear type, while C++ does not. It would be easier to punt on the hard cases if C++ provided a way to express things like "it's definitely going to be a std::vector::iterator, but you can infer the type of the contents of the iterator". Something like std::vector<T=auto>::iterator would be currently free syntax that would make it easier for C++ to say "nope, can't infer that", and the change could be introduced in a epoch thingy (by saying that old epoch code has the old rules, but can use the new syntax, but the new epoch code gets a compile failure if the surprising rules are needed to find a type).
And while I'd agree that Rust's procmacros are better for syntax highlighting than CPP (especially if the procmacro author bothered with span information in their error cases), they do have their own issues that CPP doesn't share - the better-macro crate is designed to open a web page with a surprise in it when you use a "standard" macro, and because it's a procmacro, it'll run in your type-inferring tool.
Posted Jul 31, 2021 2:00 UTC (Sat)
by tialaramex (subscriber, #21167)
[Link]
Still, the proc macros that do this are on purpose. "That's scary, punt" is a reasonable strategy for them - what went in was tokens, and what comes out would have been tokens (modulo somebody forking the compiler, Mara again) so if you get scared give up and pass the un-processed macro through to your next phase. I feel like things get out of hand in the C pre-processor in a more gradual frog-boiling way with no individual step seeming outrageous and yet you can't understand your own code.
As to fixing things in C++ with Epochs, my impression is that all the big interesting changes proposed for Epochs turn out to get into difficulties with templates, and specifically template meta-programming. It may be that introducing some extra places where you can write "auto" is allowed without that particular problem, but I don't see it being more than a band-aid.
Posted Oct 8, 2021 13:45 UTC (Fri)
by zesterer (guest, #154652)
[Link]
This is rarely the case.
Rust's type system is very rich and it's common practice to wrap types to compose certain guarantees. For example, `Arc<T>` gives you a reference-counted, immutable `T` and `Mutex<T>` gives you a `T` for which every access is guarded by a mutex. You can compose these guarantees together like `Arc<Mutex<T>>` to make a reference-counted, thread-safe wrapper around a piece of data.
It is not uncommon to transform values to and from these representations, unwrapping and wrapping them as and when invariants need to be maintained. It's very common for the *semantics* of the underlying value to be more important than the actual type.
Don't fear: Rust will prevent you doing dumb stuff.
> Basically you are programming in python at this point
This conveys a fundamental misunderstanding of how type inference vs dynamic typing works. Type inference is sound and requires principal types in all cases, it's just able to be a little cleverer about deciding what type to give terms. It won't, however, infer types that you have not specified *at some point* in the context. These types are still all checked for compatibility at compile-time and the compiler will still tell you about any typing problems.
Python, on the other hand, has a dynamic type system that detects typing errors at runtime. This means that for any given variable, it's incredibly difficult to talk about what invariants pertain to it, how to enforce them, or even what shape the value has.
Python sacrifices compile-time guarantees for a more terse syntax.
Rust does no such thing: it only allows you to omit this information *when you've already made clear to the compiler elsewhere* what's going on.
> C project where everything is "simple"
C is not simple. Its syntax appears simple, but this is only because it makes exceedingly few meaningful guarantees about how values can be used, what invariants need enforcing, or how to safely use values. It is "simple" only in that it shifts all of the complexity on to the human writing the code and, as we all know, humans are fallible fleshy creatures that regularly make catastrophic mistakes.
Rust's explicit purpose is to hoist as much of this complexity on to the compiler as possible. Invariants that previously needed remembering or forgetting-and-later-debugging-after-an-expensive-or-dangerous-failure are now managed by the compiler, meaning that the limited processing power of the human mind can be used to focus on architectural concerns, program logic, etc.
> Are people going to upgrade their editors to have inspection?
No. I've been writing Rust for 6 years now with a run-of-the-mill text editor with syntax highlighting. Not an IDE in sight. It's been fine!
> Will someone figure out tooling to replace grep?
Check out ripgrep. It's written in Rust, is shockingly quick (significantly faster than most existing grep impls), and is very user-friendly.
Posted Jul 19, 2021 17:08 UTC (Mon)
by rvolgers (guest, #63218)
[Link] (1 responses)
I notice I get annoyed at the lack of explicit types on variables and things like ".into()" (which likewise doesn't offer much clue as to typing, just that a conversion is happening). In C, it's relatively important to be aware of types. On one hand due to various traps related to UB and integer arithmetic, and on the other hand because there are no guard rails on most APIs. Rust really encourages you to embed restrictions in the API, so that the use site can be a lot more terse.
Also with "C brain" engaged, some of the nested control flow like in `Chip::get_direction` seems weird. But objectively, I think the Rust version is quicker to read.
I do notice that the indentation is 4 spaces in the Rust version. Pretty sure that's because they're using Rust defaults for formatting and also pretty sure there has been a lot of discussion about this already and I really don't want to start that again. But just mentioning it since it does make nesting look a little worse in the C case.
Posted Jul 19, 2021 22:14 UTC (Mon)
by roc (subscriber, #30627)
[Link]
But I can see why the kernel might not want to assume code readers have such capabilities available.
Posted Jul 19, 2021 19:52 UTC (Mon)
by ikm (guest, #493)
[Link] (2 responses)
Posted Jul 19, 2021 20:51 UTC (Mon)
by marcH (subscriber, #57642)
[Link]
Posted Jul 20, 2021 20:15 UTC (Tue)
by atnot (guest, #124910)
[Link]
Posted Jul 19, 2021 21:31 UTC (Mon)
by pm215 (subscriber, #98099)
[Link]
Posted Jul 19, 2021 22:16 UTC (Mon)
by roc (subscriber, #30627)
[Link] (5 responses)
Posted Jul 19, 2021 22:47 UTC (Mon)
by mathstuf (subscriber, #69389)
[Link] (4 responses)
Err, I mean, the `ENXIO` is just hidden in the UB fog of a `NULL` `gc->gpiodev`, right? ;)
Posted Jul 20, 2021 0:14 UTC (Tue)
by tialaramex (subscriber, #21167)
[Link] (3 responses)
Whereas in Rust that other code cheerfully lets the device driver code run but when the driver asks for the physical hardware, it either gets told the hardware is gone and gives ENXIO or else it gets "working" access and no crash.
I think in both cases there is code elsewhere responsible for waiting until the driver code isn't running (in C) or isn't touching this hardware (in Rust) and when that happens removing the capability to access the hardware so that you'll get ENXIO next time.
Posted Jul 20, 2021 2:12 UTC (Tue)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
> 2. The extra check we have here is because of a feature that the C code doesn't
So it does seem to be "Rust has a few patterns factored out; C has other behaviors/assumptions factored out" kind of difference. It would seem that the C code is "hold it this way and you'll be fine"[2] whereas Rust is more "you have a reference to me, so I must keep myself safe".
[1]https://lore.kernel.org/ksummit/YPW%2FLNwxwEW4h0GM@google...
Posted Jul 20, 2021 3:45 UTC (Tue)
by roc (subscriber, #30627)
[Link]
Posted Jul 20, 2021 14:29 UTC (Tue)
by xav (guest, #18536)
[Link]
That would match my experience with C and Rust very nicely.
Posted Jul 20, 2021 9:46 UTC (Tue)
by stumbles (guest, #8796)
[Link]
Posted Jul 20, 2021 12:31 UTC (Tue)
by epa (subscriber, #39769)
[Link] (11 responses)
Posted Jul 20, 2021 12:45 UTC (Tue)
by HelloWorld (guest, #56129)
[Link] (2 responses)
Posted Jul 21, 2021 2:39 UTC (Wed)
by creemj (subscriber, #56061)
[Link] (1 responses)
Posted Jul 22, 2021 6:26 UTC (Thu)
by pbonzini (subscriber, #60935)
[Link]
Posted Jul 20, 2021 13:54 UTC (Tue)
by Karellen (subscriber, #67644)
[Link] (7 responses)
Old versions of C, and old C compilers, required that a named "const int" would only be defined in one file (e.g. "const int x = 3;"). This is the "One Definition Rule". All the other files would see a declaration (e.g. "extern const int x;"), typically from a header file, so they'd know the variable existed, but they wouldn't know what the value of that variable was. Compilers which worked a file at a time wouldn't be able to perform any optimisations based on the value of the variable. Using the value would require loading it, rather than just using an immediate value.
You could get around this by putting a static declaration in a header file (e.g. "static const int x = 3;") so that each file would get its own private declaration of the variable, which wouldn't clash with the same variable in other files. The value would typically get inlined everywhere, if the compiler did any optimisations at all. And maybe the variable itself would get optimised out if there were no other uses of it.
But you'd be guaranteed to get the actual value inlined everywhere and no left-over variables if you just used a #define (e.g. "#define x 3").
So using "#define" over "const int" is the traditional idiomatic style in C, because compilers were dumb. And even though most compilers aren't dumb any more, some are. And so, that's how people who write C, write C. Which is why Linux and other C codebases use it.
Posted Jul 20, 2021 14:25 UTC (Tue)
by epa (subscriber, #39769)
[Link] (5 responses)
Posted Jul 20, 2021 22:39 UTC (Tue)
by aliguori (subscriber, #30636)
[Link] (4 responses)
Posted Jul 21, 2021 8:55 UTC (Wed)
by chris_se (subscriber, #99706)
[Link]
Interesting. I work mostly with C++ nowadays, so I haven't followed newer C standards closely, but I would have expected that C11 or C17 would improve that. I just looked at the standard and you're correct though, in C they are always int.
In C++11 and newer you can create enums that have any integral type, even something like int64_t, as their backing type. I've been using these basically since C++11 compilers that supported them came out, so for roughly a decade; learning that C still restricts enums to int is completely counter to my current intuition about that language feature. And a good indication that C++ and C have diverged quite a bit nowadays.
Posted Jul 22, 2021 15:51 UTC (Thu)
by wtarreau (subscriber, #51152)
[Link] (2 responses)
That's exactly why I stopped using them to define bit values in masks! They were convenient for GDB at the beginning but not once they became too short :-)
Posted Jul 22, 2021 18:46 UTC (Thu)
by jgg (subscriber, #55211)
[Link] (1 responses)
Posted Jul 24, 2021 10:37 UTC (Sat)
by wtarreau (subscriber, #51152)
[Link]
[truncated example: an `enum { … }` defining large bit values and an `unsigned long long last()` accessor, followed by its disassembly at `<last>:` showing the silently truncated value]
With ~0U, the overflow is detected:
[truncated disassembly of `<last>:`]
Posted Jul 21, 2021 13:34 UTC (Wed)
by pm215 (subscriber, #98099)
[Link]
Posted Jul 21, 2021 23:20 UTC (Wed)
by david.a.wheeler (subscriber, #72896)
[Link] (1 responses)
There's now a lot of discussion about this example, and that's as it should be. You learn a lot by trying to make things actually work.
Posted Aug 23, 2021 4:23 UTC (Mon)
by carORcdr (guest, #141301)
[Link]
Donald E. Knuth, The Art of Computer Programming, Volume 1/Fundamental Algorithms (Second Edition, Addison-Wesley 1973); Ante, p. xvii.
Posted Jul 22, 2021 15:57 UTC (Thu)
by wtarreau (subscriber, #51152)
[Link] (13 responses)
Still I'm seeing that an O(log(N)) loop was routinely turned into O(N) in the interrupt handler, and such jokes can make a significant difference at larger scale if developers are not more careful.
I absolutely hate the principle of lazy unlocking, letting the compiler handle it by itself. This is a horrible practice in my opinion, because the person who wrote the code had a good reason to put a lock and knows exactly where it ought to be unlocked. By not placing the explicit unlock, it will hold till the end (if I understand right), which means that any extra code added to that function will be added under that lock even if not required at all. And often it's not the person who adds the extra code at the end who will take the decision to add an explicit unlock.
If the compiler is able to do that by itself, instead it should be configured to emit a loud warning for missing unlocks in the return paths.
To be honest that still doesn't reconcile me with this language but I'm very happy that the effort was made to offer the comparison, and yes, I think that for those who can parse it it should be manageable.
Posted Jul 22, 2021 18:14 UTC (Thu)
by ojeda (subscriber, #143370)
[Link] (9 responses)
If you are referring to the in-one-go `ffs`, yes, equivalent facilities should be provided by the Rust side too.
> By not placing the explicit unlock, it will hold till the end (if I understand right), which means that any extra code added to that function will be added under that lock even if not required at all.
Not exactly: the unlock happens at the end of the scope, not at the end of the function.
This is an important distinction, because it allows one to precisely control how long the lock is held. RAII would be unworkable otherwise!
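A minimal sketch of that distinction, using the standard library's `Mutex` rather than the kernel crate's lock types (the kernel API differs, but the scoping rule is the same):

```rust
use std::sync::Mutex;

// Increment under the lock, then read again after the guard is gone.
fn scoped_update(data: &Mutex<i32>) -> i32 {
    {
        let mut guard = data.lock().unwrap(); // lock taken here
        *guard += 1;
    } // guard dropped here: the unlock happens at the end of this
      // block, not at the end of the function

    // This line runs with the lock released; re-locking here would
    // self-deadlock if the guard above were still alive.
    *data.lock().unwrap()
}

fn main() {
    let data = Mutex::new(41);
    assert_eq!(scoped_update(&data), 42);
}
```

Without the inner braces, `guard` would live until the end of `scoped_update`, which is exactly the concern about later additions to a function silently landing under the lock.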
> If the compiler is able to do that by itself, instead it should be configured to emit a loud warning for missing unlocks in the return paths.
Yes, it could be done, and it would probably be a good idea (for cases where RAII is applicable). Another warning could be for double unlocks, unlocks in not-the-opposite-order, etc. At that point, one would be doing "explicit RAII".
Being explicit is good in many cases, of course. In the case of RAII, I think being implicit is a good compromise because it avoids a good amount of mechanically-written code while still being easy enough to figure out where the calls happen when needed.
A third, middle-ground option is going a bit meta: RAII as usual, but use tooling that renders the calls in your text editor without actually adding them to the code. This already exists for things like showing inferred types in rust-analyzer.
> To be honest that still doesn't reconcile me with this language but I'm very happy that the effort was made to offer the comparison, and yes, I think that for those who can parse it it should be manageable.
These words mean a lot to me since I know you dislike the language. Thanks for being open to discussion!
Posted Jul 23, 2021 2:59 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link] (8 responses)
OK that's better, but then I'd appreciate it if an explicit block was written in the function to show where the lock is expected to be held, because the way it's done right now implies it's held for the whole function. And my concern precisely is that any addition to that function will naturally be under the lock.
> In the case of RAII, I think being implicit is a good compromise because it avoids a good amount of mechanically-written code while still being easy enough to figure out where the calls happen when needed.
I'm not fundamentally against this provided that critical sections are well delimited. There are thousands of contributors to the kernel, not everyone has the same ease to grasp the subtleties of certain locks, and making such areas explicit is better. In such a case I'm fine without an explicit unlock if it ends at a block that leaves it possible to add anything at the end of the function, including a printk() that someone would need to better follow the code.
> > To be honest that still doesn't reconcile me with this language but I'm very happy that the effort was made to offer the comparison, and yes, I think that for those who can parse it it should be manageable.
That's the difference between rejecting things by religion and rejecting them by taste :-) I still find a lot of this totally disgusting but I know we don't all have the same taste. Flies also exist to remind us of this :-)
Posted Jul 23, 2021 12:43 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (3 responses)
Posted Jul 23, 2021 12:53 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link]
Posted Jul 25, 2021 12:18 UTC (Sun)
by intgr (subscriber, #39733)
[Link]
Posted Jul 28, 2021 13:11 UTC (Wed)
by hkalbasi (guest, #153472)
[Link]
Posted Jul 23, 2021 12:55 UTC (Fri)
by ojeda (subscriber, #143370)
[Link] (3 responses)
This could definitely be a reasonable coding guideline for some of the RAII types.
One more alternative is keeping unlocks explicit, but still preventing double unlocks and missing unlocks at compile-time.
Posted Jul 23, 2021 13:09 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link] (2 responses)
Double unlocks I know how to prevent (just consume the guard and make sure the guard type doesn't `impl Clone`). However, without linear types, there's no way to have the *compiler* catch missing unlocks, since `mem::forget` is a thing, and the next best thing is a `Drop` impl that detects that you forgot to do something, which is only a runtime check.
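A sketch of that consume-the-guard idea, using a hypothetical wrapper around the standard library's `MutexGuard` (the names here are made up for illustration; this is not the kernel's API):

```rust
use std::sync::{Mutex, MutexGuard};

// Hypothetical guard wrapper with an explicit, consuming unlock.
// It does not implement Clone, so unlocking twice is a move error.
struct Guard<'a>(MutexGuard<'a, i32>);

impl<'a> Guard<'a> {
    fn unlock(self) {
        // `self` (and the inner MutexGuard) is dropped here,
        // releasing the mutex exactly once.
    }
}

fn demo(m: &Mutex<i32>) -> i32 {
    let g = Guard(m.lock().unwrap());
    let v = *g.0;
    g.unlock();
    // g.unlock(); // error[E0382]: use of moved value: `g`
    v
}

fn main() {
    assert_eq!(demo(&Mutex::new(7)), 7);
}
```

As noted above, `mem::forget(g)` would still skip the unlock without any compile-time complaint, which is why full enforcement would need linear types.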
Posted Jul 23, 2021 13:39 UTC (Fri)
by ojeda (subscriber, #143370)
[Link] (1 responses)
Posted Jul 23, 2021 13:51 UTC (Fri)
by mathstuf (subscriber, #69389)
[Link]
As for the link error, I don't think there'd be any mechanism to say that "only `unlock_irq` can call `mem::forget` for type `IrqLockGuard`" (since to make it a link error, you'd have to *avoid* the `Drop` call at the use site(s)). I suspect ad-hoc tooling (akin to sparse or coccinelle) would be the best option here.
[1] Obviously Rust cannot save you from SIGKILL or power failure; nor can any language for that matter. I see this as more of a system architecture level thing where some languages help more in obtaining such resiliency (say, Erlang) than others, but is possible in even C (see sqlite).
Posted Jul 22, 2021 20:12 UTC (Thu)
by wedsonaf (guest, #151305)
[Link] (2 responses)
Posted Jul 23, 2021 3:04 UTC (Fri)
by wtarreau (subscriber, #51152)
[Link] (1 responses)
Perfect, thanks for doing this, at least now the two drivers ought to be equivalent.
I still find the rust one totally unreadable, but as long as there exist people on this planet with a strong enough parser for all the smileys, we should be safe :-)
Posted Jul 27, 2021 4:03 UTC (Tue)
by calumapplepie (guest, #143655)
[Link]
> should be safe :-)
Looks like a typo, there's an irqdisable_spinlock_init below.
[explanation why kernel style has more lines]
Nevertheless it's not the line count that counts ;-)
let _guard = data.lock();
(0x00041061, 0x000fffff),
}

{
    .id = 0x00041061,
    .mask = 0x000fffff,
},
> (0x00041061, 0x000fffff),
> }
(id = 0x00041061, mask = 0x000fffff),
}
type inference surely?
- I will specify the type and the compiler will tell me if there's a mistake (by specifying some or all of the type).
Machine code
> have: revocable resources. If we didn't want to have this, we could do say
> `data.base.writeb(...)` directly, but then we could have situations where `base`
> is used after the device was removed. By having these checks we guarantee that
> anyone can hold a reference to device state, but they can no longer use hw
> resources after the device is removed.
[2] The bad case being someone stuffing a pointer to the device and calling things directly rather than through the proper API channels.
From my view line count means diddly squat, unless it's horribly written code.
Defining constants rather than using preprocessor
enum {
	LASTN1 = ~0,
	LAST,
};

unsigned long long last()
{
	return LAST;
}

With ~0 (a plain int, i.e. -1), LAST silently wraps around to zero and last() just returns 0:

0000000000000000 <last>:
   0:	31 c0                	xor    %eax,%eax
   2:	c3                   	retq

With ~0U, the overflow is detected:

enum1.c:3:2: error: overflow in enumeration values
  LAST,
  ^

With 0xffffffffUL, it only builds on 64-bit (of course). With ~0ULL, the LAST value is always correct, even in 32-bit. The 64-bit build:

0000000000000000 <last>:
   0:	48 b8 00 00 00 00 01 	movabs $0x100000000,%rax
   7:	00 00 00
   a:	c3                   	retq

And the 32-bit build:

00000000 <last>:
   0:	31 c0                	xor    %eax,%eax
   2:	ba 01 00 00 00       	mov    $0x1,%edx
   7:	c3                   	ret
I'm delighted to see this
>
> Not exactly: the unlock happens at the end of the scope, not at the end of the function.
>
> This is an important distinction, because it allows one to precisely control how long the lock is held. RAII would be unworkable otherwise!
>
> These words mean a lot to me since I know you dislike the language. Thanks for being open to discussion!
It seems like you're arguing that any use of a lock guard type should match this pattern:
{ // IRQ locking
    let _guard = data.lock();
    // code under the IRQ lock
}
// Code not under the IRQ lock
Even if it is the first line in the function. This would add an extra level of indentation to any such function (which is non-trivial under normal kernel indentation rules), but makes it more visible than even C where different effect regions exist. I think this is reasonable for such low-level code where debugging may be useful, but not possible under every kind of lock.
But unlike C++, one can also drop the lock manually if needed:
let _guard = data.lock();
// Added debugging code
drop(_guard);
printk(…);
But forgetting that `drop` call seems like it'd be quite impactful, so maybe the explicit block is indeed better.
https://github.com/rust-lang/rust-clippy/issues/7500
Willy, I had to add a bunch of facilities to build this driver. Iterating over bits was one I didn't do because I didn't expect people to care about this when N=8 in a demonstration, but I should have known better. :)
For completeness, and to make it clear that there is no limitation to Rust that prevents it from doing simple loops like this, I added it to the repo, the commit is here.
And here's what irq_route looks like now:
fn irq_route(data: &Ref<DeviceData>, router: &gpio::IrqRouter) {
    if let Some(pl061) = data.resources() {
        let pending = pl061.base.readb(GPIOMIS);
        for offset in bits_iter(pending) {
            router.deliver(offset);
        }
    }
}
Here are the x86-64 instructions for a loop like the one above (compiler explorer link here):
.LBB0_2:
    bsfq    %rbx, %rcx
    movq    $-2, %r15
    rolq    %cl, %r15
    movl    %ecx, %edi
    callq   *%r14
    andq    %r15, %rbx
    jne     .LBB0_2
What I like about the Rust version is that it's type-checked and doesn't need special macros -- you iterate over the bits using the very same for-loop idiom you would to iterate over the nodes of a red-black tree. And yet, we get code that is as efficient as C.
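For readers curious what such a helper might look like, here is a rough userspace sketch of an iterator over set bits (`bits_iter` here is my own stand-in; the real helper in the repo may have a different signature):

```rust
// Iterate over the positions of the set bits in a word, lowest first.
fn bits_iter(mut word: u64) -> impl Iterator<Item = u32> {
    std::iter::from_fn(move || {
        if word == 0 {
            None
        } else {
            let bit = word.trailing_zeros(); // compiles to a bit scan (bsf/tzcnt)
            word &= word - 1;                // clear the lowest set bit
            Some(bit)
        }
    })
}

fn main() {
    // Bits 1 and 3 are set in 0b1010.
    let offsets: Vec<u32> = bits_iter(0b1010).collect();
    assert_eq!(offsets, vec![1, 3]);
}
```

The loop body boils down to a test, a bit scan, and a mask update, in the same spirit as the instruction listing above.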