Moving the kernel to modern C
Posted Feb 24, 2022 17:36 UTC (Thu) by iabervon (subscriber, #722)
Parent article: Moving the kernel to modern C
Posted Feb 24, 2022 17:59 UTC (Thu)
by Paf (subscriber, #91811)
[Link] (28 responses)
Posted Feb 24, 2022 20:09 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link] (5 responses)
Unfortunately this alone doesn't handle loops which exit early due to "break" or "goto". The "goto" case is unavoidable, but the "break" case can be dealt with by wrapping the macro in a second, trivial loop as shown in this example[0]. Note that the generated code (for gcc 5.1 with -O2) is *identical* between the version with the extra loop (traverse1) and the original version which does not set the iterator to NULL after the loop (traverse2). The initialization of the iterator to the flag state (-1), the condition for the outer loop, and the store of NULL to the iterator after the loop are all successfully eliminated.
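Since the linked example is not reproduced here, the following is only a rough sketch of the "second, trivial loop" idea, not the original code. The list types are minimal stand-ins rather than the real <linux/list.h>, and the names list_for_each_entry_nulled, ITER_FLAG, struct item, and search are hypothetical. It relies on the GNU __typeof__ extension, as the kernel macros do.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel list types; not the real <linux/list.h>. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical flag value meaning "the inner loop has not finished yet". */
#define ITER_FLAG ((void *)-1)

/*
 * Hypothetical macro. The outer loop runs the inner loop exactly once.
 * Whether the inner loop ends normally or through "break", the outer
 * loop's increment stores NULL in pos and its condition then fails.
 */
#define list_for_each_entry_nulled(pos, head, member) \
    for ((pos) = ITER_FLAG; (pos) == ITER_FLAG; (pos) = NULL) \
        for ((pos) = container_of((head)->next, __typeof__(*(pos)), member); \
             &(pos)->member != (head); \
             (pos) = container_of((pos)->member.next, __typeof__(*(pos)), member))

struct item {
    int value;
    struct list_head list;
};

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Sets *found when a match exists; returns the iterator as left by the loop. */
struct item *search(struct list_head *head, int wanted, int *found)
{
    struct item *pos;

    *found = 0;
    list_for_each_entry_nulled(pos, head, list) {
        if (pos->value == wanted) {
            *found = 1;
            break;              /* leaves only the inner loop */
        }
    }
    assert(pos == NULL);        /* holds on both exit paths */
    return pos;
}
```

A "goto" out of the body still skips the outer loop's increment, which is the unavoidable case mentioned above. Whether the extra loop is optimized away entirely depends on the compiler and flags.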
Posted Feb 24, 2022 20:48 UTC (Thu)
by iabervon (subscriber, #722)
[Link] (3 responses)
extern unsigned long list_iterator_live_after_loop;
and "|| ((pos = (void *) list_iterator_live_after_loop), 0)"
I didn't try changing the kernel macro that way, but my little test code doesn't link if the iterator is used after the loop, but does link and work if it's not used. As I recall, the kernel is already using that sort of trick to use compiler optimization to remove an error message only if the compiler can disprove it.
Posted Feb 25, 2022 22:20 UTC (Fri)
by NYKevin (subscriber, #129325)
[Link] (2 responses)
> The entire program may have zero or one external definition of every identifier with external linkage.
There's no exception for short-circuit operators. If you use it at compile time, for anything other than sizeof, then it has to exist (have storage allocated somewhere).
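As a small illustration of the sizeof exception (a sketch only; never_defined_anywhere and size_of_undefined are hypothetical names, and the symbol is deliberately given no definition):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol: declared here, defined nowhere in the program. */
extern unsigned long never_defined_anywhere;

/*
 * sizeof does not evaluate its (non-VLA) operand, so the program never
 * references the object itself and no definition is required.
 */
size_t size_of_undefined(void)
{
    size_t n = sizeof(never_defined_anywhere);

    assert(n == sizeof(unsigned long));
    return n;
}

/*
 * By contrast, the function below would use the identifier in an ordinary
 * expression. Per the rule quoted above it would need a definition even
 * though "0 &&" means the read never executes; whether the link actually
 * fails in practice depends on what the optimizer removes.
 *
 *     int uses_it(void) { return 0 && never_defined_anywhere; }
 */
```

This should link at any optimization level, because only the type of the operand is inspected.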
Posted Feb 26, 2022 0:56 UTC (Sat)
by khim (subscriber, #9252)
[Link] (1 responses)
It's UB according to the standard. But Linus is very rarely concerned with that: he tends to accept such stupidity only when there is no way to convince the compiler to stop breaking sane (from a Linux developer's POV!) code. It's one of the reasons why GCC is the only supported compiler, BTW. And GCC not only supports that feature, it even provides
Posted Feb 26, 2022 2:34 UTC (Sat)
by foom (subscriber, #14868)
[Link]
It would be a lot better if Linux used c++ constexpr functions and templates for compile time evaluation semantics, instead of abusing the optimizer to very poorly emulate them.
Posted Feb 25, 2022 1:19 UTC (Fri)
by ianloic (subscriber, #54050)
[Link]
Posted Feb 24, 2022 20:26 UTC (Thu)
by iabervon (subscriber, #722)
[Link] (21 responses)
Posted Feb 24, 2022 22:12 UTC (Thu)
by Paf (subscriber, #91811)
[Link] (20 responses)
God I’d sure love to get to a newer C standard though…
Posted Feb 25, 2022 8:54 UTC (Fri)
by ncm (guest, #165)
[Link] (19 responses)
There would, in any case, be no need to step outside Gcc, where in fact that was done long ago, with no disruption, but with massive benefits. Anybody spooked about C++ should understand that Gcc and Clang are both coded in C++, whatever the language you compile on them.
Similarly, anybody spooked by C++ "hidden code" should understand that Rust does literally all of the things they are spooked by; and all of its power comes from that.
Staying on ancient EOL'd language Standards does nobody any good.
Posted Feb 25, 2022 9:02 UTC (Fri)
by mpr22 (subscriber, #60784)
[Link] (1 responses)
Posted Feb 25, 2022 9:33 UTC (Fri)
by ncm (guest, #165)
[Link]
Posted Feb 25, 2022 9:30 UTC (Fri)
by Wol (subscriber, #4433)
[Link] (12 responses)
A further problem is the size / speed of the code. Yes C++ is *mostly* pretty good, but I suspect the compiler devs will barf on that word "mostly".
To what extent does kernel C currently drop out of C into assembler, and to what extent will C++ make that worse?
No I don't actually know the answers, I'm just predicting the devs' reactions.
Cheers,
Posted Feb 25, 2022 21:24 UTC (Fri)
by ncm (guest, #165)
[Link] (10 responses)
Posted Feb 26, 2022 13:46 UTC (Sat)
by Paf (subscriber, #91811)
[Link] (9 responses)
Posted Feb 26, 2022 22:30 UTC (Sat)
by camhusmj38 (subscriber, #99234)
[Link] (4 responses)
Posted Feb 27, 2022 0:09 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (3 responses)
So you're another idiot who thinks newer = better.
I'm not saying these new-fangled things don't work for you. And plenty of kernel devs use newer tools. But all these idiots going "ooh! new! shiny!" make life hell for people doing the work.
I'm on a kernel mailing list. And reading the emails, I feel like tearing my hair out sometimes. But moving to Rust seems a far better solution than C++.
Cheers,
Posted Feb 27, 2022 8:13 UTC (Sun)
by camhusmj38 (subscriber, #99234)
[Link] (2 responses)
Posted Feb 27, 2022 10:37 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (1 responses)
AND THERE ARE TOO MANY "OOH NEW SHINY" LEMMINGS...
Likewise moving away from email - the problem is a social problem - there AREN'T ENOUGH DEVELOPERS. I think a fair few subsystems ARE developed using solutions like github, gitlab, whatever. If it really worked, surely that model would spread rapidly. But it's not working, and it's not "not working" because the solution is better or worse, it's not working because there aren't enough people to make either solution work.
And actually, probably one of the biggest problems with C++, is that IT'S NOT TRANSPARENT. Developers don't have a clear model in their mind of HOW it works. I come from the days when tomorrow's weather forecast took a day to run on the most powerful computers if you were lucky! If you can't model performance down to the bare metal, you have no clue how long the program is going to take to run. That's one of the reasons the kernel has held back on compilers so long, the disconnect between engineering reality and theoretical correctness.
That's one of the reasons I rant about relational. With Pick the database is transparent - as an application developer I can REASON about performance right through to the OS. That's what's so hard with C++ - the devs can NOT reason through to the hardware. (Okay, it's getting harder even with C, but it's not obfuscated ...)
Cheers,
Posted Feb 27, 2022 16:51 UTC (Sun)
by corbet (editor, #1)
[Link]
Posted Mar 5, 2022 12:28 UTC (Sat)
by nix (subscriber, #2304)
[Link] (3 responses)
The thing ncm likely meant was that Wol's imagined kernel-developers' worries are not actually what they are documented as having worried about. C++ is not a security nightmare, not any more than C, anyway; it hasn't been worse than C optimization-wise for about twenty years; and there is no sudden extra need to drop into asm just because this is C++ (C++ is still nearly a superset of C, and this is just as true of the GNU variant).
Their stated worries are more that C++ has abstractions that enable things to magically happen behind your back with no immediate indication at the call site, and since the kernel developers are really looking for a portable assembler a lot of the time, where everything the machine does is obvious, this is *far* from what they want: they have their hands full coping with parallelism-induced complexities, memory model complexities, looking out for speculative execution gadgets etc etc, without worrying about the apparently same code doing wildly different things depending on what type they're operating on.
Many of C++'s transparently-do-things features are routinely used and almost essential to use anything resembling modern C++: references are the classic case (now function parameters' values can change in the caller without an & at the call site), but also C++ before std::move used to not make it terribly clear whether things were being copied or not (and it was at the very least wordy to enforce one alternative), and even now we have things like stringviews which seem to come with built-in footguns. (Of course, the kernel would never use such pieces, but code review would need to make sure they never crept in... and you only need to forget *once*). Many of the pieces that *don't* amount to 'do this invisibly albeit usually helpfully' are related to templates, and, uh... the kernel is nonswappable code in which size is at a premium, and having the compiler promiscuously generate code to monomorphize templates on the fly was anathema for a long time (though Rust does the same thing, and people seem to be complaining less: maybe RAM is just that much cheaper now and kernel code size is less important? but icache bloat still matters, and Linus has worried about it in public, and that was years ago and it's worse now).
And that's without even mentioning the really big painful problem, so big and painful that there are still compiler switches to disable the feature entirely, so big and painful that it took decades to figure out how to write code safely in the presence of these things and the last time I looked at it the safe code was extremely unobvious and if it was wrong you were unlikely to know for many years until things blew up, because there was no way to automatically check for safety: exceptions. Lovely idea, makes code's non-exceptional path much clearer, but the implementation explodes exceptional flow paths and *all of them are invisible* and many might be in what looks like the middle of an atomic, indivisible entity to someone not thinking "what if this were overloaded and threw?". If you use RAII for absolutely everything religiously you don't need to worry, but you only have to forget and do manual cleanup once and you're in trouble when you next get an exception passing through that region. The kernel would obviously never use exceptions in the first place, mind you. Of course that now makes it impossible for destructors to fail, which probably rules out *use* of destructors for anything nontrivial, which means you can't use RAII, which means you can't write anything resembling modern C++. You don't pay for what you don't use, but many of the bits require many of the other bits to use them non-clumsily, and then many of those bits are papering over design faults in the earlier bits -- std::move, again -- and the result of adding all those bits together is *ferociously* complex.
Posted Mar 5, 2022 13:37 UTC (Sat)
by Wol (subscriber, #4433)
[Link]
> Their stated worries are more that C++ has abstractions that enable things to magically happen behind your back with no immediate indication at the call site, and since the kernel developers are really looking for a portable assembler a lot of the time, where everything the machine does is obvious, this is *far* from what they want: they have their hands full coping with parallelism-induced complexities, memory model complexities, looking out for speculative execution gadgets etc etc, without worrying about the apparently same code doing wildly different things depending on what type they're operating on.
Actually, this is pretty much exactly what I was trying to say ... that C++ does things behind your back, and when you're trying to make sure that your code fits in L1 cache or whatever, code bloat is SERIOUS STUFF.
How often do you see kernel developers talking about "fast path"? Quite a lot. And it only takes C++ to do something you don't expect and the fast path will become orders of magnitude slower. WHOOPS!
Cheers,
Posted Mar 11, 2022 16:51 UTC (Fri)
by timon (subscriber, #152974)
[Link] (1 responses)
Posted Mar 17, 2022 16:27 UTC (Thu)
by nix (subscriber, #2304)
[Link]
Posted Feb 26, 2022 22:16 UTC (Sat)
by camhusmj38 (subscriber, #99234)
[Link]
Posted Feb 25, 2022 14:52 UTC (Fri)
by jd (guest, #26381)
[Link] (3 responses)
Is there markup for any of the static checkers beloved by kernel developers that could be used to improve the quality of the results? (And when was the last time Coverity checked the kernel?)
There must be plenty that could be done to improve the kernel code without a drastic change of language.
Posted Feb 25, 2022 18:04 UTC (Fri)
by davej (subscriber, #354)
[Link] (1 responses)
99% of the time, the answer to this question is the same as "when did Linus last cut an -rc/final".
Posted Feb 28, 2022 18:38 UTC (Mon)
by jd (guest, #26381)
[Link]
Posted Feb 25, 2022 21:33 UTC (Fri)
by ncm (guest, #165)
[Link]
Posted Feb 24, 2022 21:11 UTC (Thu)
by abatters (✭ supporter ✭, #6932)
[Link] (1 responses)
Posted Feb 24, 2022 21:55 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
Yes, and you also have macros like list_for_each_entry_continue() which depend on the value being left in the iterator. All of these would also break if the macro was changed to declare the iterator inside the `for` statement, C99-style. One way to work around the problem in your example would be to move the condition inside the loop, like this:
The compiler should be smart enough to avoid checking the end condition twice in each iteration. Of course this becomes much less convenient if there is more than one break statement.
Moving the kernel to modern C
>
> If an identifier with external linkage is used in any expression other than a non-VLA, (since C99) sizeof, or _Alignof (since C11), there must be one and only one external definition for that identifier somewhere in the entire program.
Moving the kernel to modern C
__attribute__((__error__(msg)))
extensions to make error messages more explicit. And glibc uses it to define the __errordecl
macro.
Moving the kernel to modern C
Wol
Moving the kernel to modern C
I suppose what I am saying is that experience outside the Kernel community suggests that using a language which enables the use of low or zero cost abstractions and automates resource management is a good idea. Trying to emulate these features in C89 (using macros!) because C89 is all you have is not a good solution. Persevering with C because it’s what you know is also not the best for a project that is used as a bedrock of modern computing.
Kernel mode C++ standards exist. They’re quite reasonable and not hard to implement or learn.
Moving the kernel to modern C
Wol
Moving the kernel to modern C
And as for Rust, I’m a big fan of it but there is no way that the Kernel is going to be rewritten in Rust. It’s much more viable to replace some data structures and paradigms with C++ alternatives. This is possible without rewriting everything or completely retraining all existing contributors.
And I don’t think calling people idiots is particularly helpful to what is a technical discussion.
Moving the kernel to modern C
Wol
Surely we can find a way to discuss things without calling each other idiots or lemmings, right? Please don't do this anymore.
Stop this please
Moving the kernel to modern C
Wol
Moving the kernel to modern C
I usually kick off a run the same day, failing that the following morning.
Moving the kernel to modern C
It would break code that does this:
list_for_each_entry(iterator, &foo_list, list) {
    if (do_something_with(iterator)) {
        break;
    }
}
if (list_entry_is_head(iterator, &foo_list, list)) {
    // iteration finished
} else {
    do_something_else_with(iterator);
}
All this "compare to head" nonsense is why I prefer regular NULL-terminated linked lists to the kernel's circular linked lists. Insert/delete may take more instructions but iteration is much easier.
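For comparison, a minimal sketch of the NULL-terminated style being described. The struct node type and find_value function are hypothetical and not taken from any kernel header; the point is only that the iterator itself says whether the walk ran off the end.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical singly linked, NULL-terminated node type for illustration. */
struct node {
    int value;
    struct node *next;
};

/*
 * Returns the first node whose value matches, or NULL if the walk reached
 * the end of the list. No comparison against a list head is needed.
 */
struct node *find_value(struct node *first, int wanted)
{
    struct node *it;

    for (it = first; it != NULL; it = it->next) {
        if (it->value == wanted)
            break;
    }
    return it;   /* NULL here means "iteration finished without a match" */
}
```

The trade-off mentioned above still applies: removing a node from such a list requires access to its predecessor, which the circular doubly linked list provides directly.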
Moving the kernel to modern C
list_for_each_entry(iterator, &foo_list, list) {
    // ...
    if (do_something_with(iterator)) {
        do_something_else_with(iterator);
        break;
    }
    // ...
    if (list_is_last(&iterator->list, &foo_list)) {
        // this is the last entry; iteration finished
    }
}
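To make the C99-style variant discussed in this comment concrete, here is a rough sketch of a macro that declares the iterator inside the `for` statement. It is an assumption-laden illustration, not a proposed kernel patch: the list types are minimal stand-ins, and list_for_each_entry_scoped, struct item, and find_item are hypothetical names. Because the iterator goes out of scope at the end of the loop, any entry needed afterwards has to be copied into a separate variable before the break.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins for the kernel list types; not the real <linux/list.h>. */
struct list_head {
    struct list_head *next, *prev;
};

#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

/*
 * Hypothetical C99-style variant: pos is declared in the for statement,
 * so it cannot be referenced once the loop has ended.
 */
#define list_for_each_entry_scoped(type, pos, head, member) \
    for (type *pos = container_of((head)->next, type, member); \
         &pos->member != (head); \
         pos = container_of(pos->member.next, type, member))

struct item {
    int value;
    struct list_head list;
};

void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

void list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Returns the first entry with the wanted value, or NULL if none matches. */
struct item *find_item(struct list_head *head, int wanted)
{
    struct item *found = NULL;  /* explicit result that outlives the loop */

    list_for_each_entry_scoped(struct item, it, head, list) {
        if (it->value == wanted) {
            found = it;         /* copy out before leaving the loop */
            break;
        }
    }
    return found;
}
```

With this shape, the "did the loop finish?" question becomes a plain NULL check on found, and the stale head-derived pointer is never visible to the caller.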