Building the kernel with Clang

By Jake Edge
September 19, 2017

Linux Plumbers Conference

Over the years, there has been a persistent effort to build the Linux kernel using the Clang C compiler that is part of the LLVM project. We last looked in on the effort in a report from the LLVM microconference at the 2015 Linux Plumbers Conference (LPC), but we have followed it before that as well. At this year's LPC, two Google kernel engineers, Greg Hackmann and Nick Desaulniers, came to the Android microconference to update the status; at this point, it is possible to build two long-term support kernels (4.4 and 4.9) with Clang.

[Nick Desaulniers]

Desaulniers began the presentation by answering the most commonly asked question: why build the kernel with Clang? To start with, the Android user space is all built with Clang these days, so Google would like to reduce the number of toolchains it needs to support. He acknowledged that it is really only a benefit to Google and is "not super useful" elsewhere. But there are other reasons that are beneficial to the wider community.

There are some common bugs that often pop up in kernel code, especially out-of-tree code like the third-party drivers that end up in Android devices. The developers are interested in using the static analysis available in Clang to spot those bugs, but the kernel needs to be built using Clang to do so. There are also a number of dynamic-analysis tools that can be used like the various sanitizers (e.g. AddressSanitizer or ASan) and their kernel equivalents (e.g. KernelAddressSanitizer or KASAN).

Clang provides a different set of warnings than GCC does; looking at those will result in higher quality code. It is clearly beneficial to all kernel users to have fewer bugs in it. There are some additional tools that are planned using Clang. One is a control-flow-analysis tool that could enumerate valid stack frames at compile time; those could be checked at run time to eliminate return-oriented programming (ROP) attacks. There is also work going on for link-time optimization (LTO) and profile-guided optimization (PGO) for Clang, which could provide better execution speed, especially for hot paths.

Building code with another compiler is a good way to shake out code that relies on undefined behaviors. Since the language specification does not define certain behaviors, compiler developers can choose whatever is convenient. That choice could change, so even a GCC upgrade might cause misbehavior if some kernel code is relying on undefined behavior. The hope, Desaulniers said, is that both the kernel and LLVM/Clang can improve their code bases from this effort. The kernel is a big project with a lot of code that can find bugs in the compiler; in fact, it already has.
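As a generic illustration of the kind of code that paragraph describes (this is not taken from the kernel or from the talk; the function names are made up for the example), consider a check that depends on signed integer overflow, which the C standard leaves undefined. A compiler is allowed to assume the overflow never happens, so the first function below may be optimized differently from one compiler or version to the next, while the second stays well-defined everywhere:

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Fragile: for x == INT_MAX, "x + 1" overflows a signed int, which is
 * undefined behavior. A compiler may assume that never happens and
 * reduce the whole function to "return false".
 */
bool will_overflow_fragile(int x)
{
	return x + 1 < x;
}

/*
 * Portable: compares against the limit before doing any arithmetic,
 * so no undefined behavior is involved.
 */
bool will_overflow_portable(int x)
{
	return x == INT_MAX;
}
```

Calling will_overflow_portable(INT_MAX) returns true with any conforming compiler; what will_overflow_fragile(INT_MAX) returns depends on the compiler and its optimization settings, which is exactly the sort of difference a second compiler tends to expose.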

Greg Kroah-Hartman said that "competition is good"; he was strongly in favor of the effort. Desaulniers was glad to hear that as he and others were worried that the tight coupling with GCC was being protected by the kernel developers. Kroah-Hartman said that there have been other compilers building the kernel along the way. Behan Webster also pointed to all of the new features that have come about in GCC over the past five years as a result of the competition with LLVM. Kroah-Hartman said that he wished there was a competitor to the Linux kernel.

[Greg Hackmann]

Hackmann related the state of the upstream kernel: "we are very close to having a kernel that can be built with Clang". It does require using a recent Clang that has some fixes, but the x86_64 and ARM64 kernels can be built, though each architecture has one out-of-tree patch that needs to be applied to do so. There is also one Android-specific Kbuild change that is needed, but only if the Android open-source project (AOSP) pre-built toolchain is being used.

As announced on the kernel mailing list, there are patches available for the 4.4 and 4.9 kernels. There are also experimental branches of the Android kernels for 4.4 and 4.9 available from AOSP. More details can be found in the slides [PDF]. Those branches had just been pushed a few days earlier, Hackmann said, and the HiKey boards were able to build and boot that code shortly thereafter.

There have been LLVM bugs found in the process, though most of them have been fixed at this point, Desaulniers said. The initial work was done with LLVM 4.0, but they have since updated to 5.0 and are also building with the current LLVM development tree (which will become 6.0). You can probably build the kernel with 4.0, he said, but it will be much slower than building with 5.0 or later.

There are still some outstanding issues. Variable-length arrays as non-terminal fields in structures are not supported by Clang, there is a GNU C extension for inline functions that is not supported, and the LLVM assembler cannot be used to build the kernel. Hackmann noted that the GNU assembler is too liberal in what it accepts.
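To illustrate the first of those issues, here is a minimal sketch of what a variable-length array in a structure looks like and one common way to avoid it. The field names and the helpers (align_up(), buf_layout()) are hypothetical and chosen only for this example; they are not kernel code. The GCC-only form appears in a comment because Clang rejects it; the replacement computes the offsets by hand so that a single flat buffer can hold both arrays:

```c
#include <stddef.h>

/*
 * GCC-only form, shown as a comment because Clang does not accept it:
 *
 *	struct {
 *		char key[key_len];	(variable length, not the last field)
 *		int  counts[n];
 *	} buf;
 */

/* Round "off" up to the next multiple of "align". */
static size_t align_up(size_t off, size_t align)
{
	return (off + align - 1) / align * align;
}

/* Where the second array starts and how big the whole buffer must be. */
struct layout {
	size_t counts_off;
	size_t total;
};

/* Compute by hand the layout the compiler would have chosen for the VLAIS. */
struct layout buf_layout(size_t key_len, size_t n)
{
	struct layout l;

	l.counts_off = align_up(key_len, _Alignof(int));
	l.total = l.counts_off + n * sizeof(int);
	return l;
}
```

A caller would allocate l.total bytes, use the start of the buffer for the key and (int *)(buffer + l.counts_off) for the counts. That is more verbose than the GCC extension, but it is standard C (C11 for _Alignof) that either compiler accepts.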

This work has shown that the FUD surrounding using a new toolchain for the kernel is unfounded, Desaulniers said. It is working now, but there are a few asterisks. Clang, the front end, can compile the kernel, but the assembler and the linker from GNU Binutils are needed to complete the build process.

Next up is figuring out how to do automated testing of LLVM and the kernel. Currently, the team is working with two specific LTS kernel branches and using specific LLVM versions. So he can't quite say that Clang will build any kernel, since there are so many different configuration options. A bot to check whether kernel patches will fail to build under Clang is in the works as well. An audience member noted that kernelci.org is looking at adding other compilers to its build-and-boot testing.

Hackmann and Desaulniers encouraged others to try building using Clang. All it takes is a simple "make CC=clang" on a properly equipped system. We are, it seems, quite close to having a two-compiler world for the Linux kernel.

[I would like to thank LWN's travel sponsor, The Linux Foundation, for assistance in traveling to Los Angeles for LPC.]


Building the kernel with clang

Posted Sep 19, 2017 18:13 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link] (5 responses)

Nice write up, thanks Jake! The slides from our talk can be found here.

Building the kernel with clang

Posted Sep 19, 2017 20:24 UTC (Tue) by nathanchance (subscriber, #118533) [Link] (4 responses)

Thank you for these, I'm going to give it a shot with my personal Android kernel to see what warnings and such come up. It's really nice to see all the work Google has been doing with the kernel as of late. I was particularly happy with the upstream bring up of the Pixel 1 for Oreo and the backporting of a lot of security features.

Building the kernel with clang

Posted Sep 19, 2017 21:22 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link] (1 responses)

Your welcome. Please CC me on any patches! :D

Building the kernel with clang

Posted Sep 19, 2017 23:57 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link]

s/Your welcome/You're welcome/g ;)

Building the kernel with clang

Posted Sep 20, 2017 4:55 UTC (Wed) by voltagex (guest, #86296) [Link] (1 responses)

Any hints on building Android kernels? How do you test? I'm interested in helping with PostmarketOS but I always end up losing a week to getting build VMs working.

Building the kernel with clang

Posted Sep 20, 2017 17:54 UTC (Wed) by nathanchance (subscriber, #118533) [Link]

I have a thread on XDA regarding this: https://forum.xda-developers.com/android/software-hacking...

The process is basically identical to building a normal desktop kernel (setup defconfig, customize as you need, build with a cross compiler, then install it). Use Google's stock toolchain (linked below). It's a little more difficult than a normal desktop kernel as the entire boot partition is traditionally compressed into an image so you need to pull it off your device, unpack that, add your kernel image and any other files (like modules), then reflash it (either with fastboot or a custom recovery like TWRP). I personally don't run any tests on the kernel after adding patches and building as I am an amateur and don't have that kind of time; runtime is my test environment lol. I just add all the Linux stable upstream patches and pull stuff in from CAF and kernel/common from Google.

https://android.googlesource.com/platform/prebuilts/gcc/l...
https://android.googlesource.com/platform/prebuilts/gcc/l...

Building the kernel with clang

Posted Sep 19, 2017 19:28 UTC (Tue) by sfeam (subscriber, #2841) [Link] (19 responses)

I am curious at what point you decide that if the emitted code can only be assembled by a "too liberal in what it accepts" assembler then it indicates a bug in the code or the compiler. What API or standard applies to the interface between compiler and assembler?

Building the kernel with clang

Posted Sep 19, 2017 20:08 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link] (9 responses)

Behan Webster was explaining to me at LPC that this was mostly due to ARM's "Unified Assembly Language" in that gcc will pass inline assembly through to the assembler without validation, while Clang will validate the inline assembly is valid by UAL rules. If the arch/arm[64] code was converted to all use UAL, then I think the major obstacle for LLVM would be removed. Though it might be worthwhile to disable that functionality in LLVM, if it were implemented.

Building the kernel with clang

Posted Sep 19, 2017 21:17 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link] (7 responses)

I just spoke with Greg more about UAL so that I might better understand the context around the issue. It seems that UAL is an issue for ARM (pre ARMv8, so 32b ARM). He likened the situation to: "imagine you have a code base written in AT&T syntax, and your assembler only works with GAS, or vice versa." I'm not sure what the situation is for ARMv8, so I'm going to give it a shot (disabling `-no-integrated-as`) and see what breaks.

Building the kernel with clang

Posted Sep 19, 2017 21:21 UTC (Tue) by ndesaulniers (subscriber, #110768) [Link] (1 responses)

Turns out it breaks pretty quick. Warnings about "DWARF2 only supporting one section per compilation unit," and errors around unexpected tokens and instruction mnemonics. Ok, bugs for another day.

Building the kernel with clang

Posted Sep 25, 2017 14:58 UTC (Mon) by mwsealey (subscriber, #71282) [Link]

Try building with DWARF4.. :)

Building the kernel with clang

Posted Sep 19, 2017 23:05 UTC (Tue) by Sesse (subscriber, #53779) [Link] (1 responses)

You mean Intel and not GAS in your metaphor? (The gas default x86 syntax is AT&T.)

Building the kernel with clang

Posted Sep 20, 2017 0:00 UTC (Wed) by ndesaulniers (subscriber, #110768) [Link]

Yes, sorry! objdump -M intel -d <...>

Building the kernel with clang

Posted Sep 20, 2017 7:04 UTC (Wed) by alison (subscriber, #63752) [Link] (2 responses)

I take it then that there is no hope for building ARM32 with clang? Increasingly it feels like ARM32 users should upgrade to ARM64 in order to get not only clang support but also eBPF.

Building the kernel with clang

Posted Sep 20, 2017 21:00 UTC (Wed) by ndesaulniers (subscriber, #110768) [Link]

Personally, my professional work is on arm64 devices, and for a hobby I have x86-64 machines. All other architectures are important to support, but I don't have the bandwidth to support them all. My hope is that work we do on arm64 and x86-64 encourages others to step up and help support their favorite architectures.

Building the kernel with clang

Posted Sep 21, 2017 16:38 UTC (Thu) by mkaehlcke (guest, #61834) [Link]

I have no first hand experience with supporting arm32 builds with clang, but given that the kernel now has the generic/kbuild bits for supporting clang and that there are older patch sets for arm32 out there I imagine it shouldn't be that hard. The stack shared by Arnd Bergmann in April could be a starting point: https://git.kernel.org/pub/scm/linux/kernel/git/arnd/play... / https://www.spinics.net/lists/linux-fsdevel/msg109823.html

Building the kernel with clang

Posted Sep 21, 2017 21:28 UTC (Thu) by behanw (guest, #90443) [Link]

The issue, last I checked, was that there was a lot of pre-UAL assembly in the ARM arch. The gas specific non-UAL way of doing things is unlikely to be supported by any other assembler (including the Integrated Assembler in clang) because it's non-standard, and not completely documented. Essentially "fixing clang" isn't an option here; instead the offending ASM in the kernel needs to be rewritten to follow UAL. This by itself was a controversial idea in the past. No idea whether people would be open to it today.

Building the kernel with clang

Posted Sep 19, 2017 20:50 UTC (Tue) by WolfWings (subscriber, #56790) [Link] (7 responses)

I think it's more that the sheer quantity of hand-written assembly in various bits of the kernel has been written to a "be overly tolerant with input, and overly strict with output" mantra, which includes what assembly (inline or standalone files) the GNU assembler accepts versus what the Clang project will accept.

Building the kernel with clang

Posted Sep 20, 2017 0:51 UTC (Wed) by ncm (guest, #165) [Link] (5 responses)

Being over-tolerant of bad input ends up doing the world no favors. I wonder how many CVEs we got from that policy, as applied to internet services. (Hint: lots!) Now it holds us back from building Linux with other toolchains, and for what? Did it really make writing the asm code noticeably easier?

Any sequence that is rejected by the current version of a tool is a candidate for syntax to mean something useful later on.

Being hard-ass about input grammar is good for everybody.

Building the kernel with clang

Posted Sep 20, 2017 7:18 UTC (Wed) by joib (subscriber, #8541) [Link] (1 responses)

> Being over-tolerant of bad input ends up doing the world no favors.

https://tools.ietf.org/html/draft-thomson-postel-was-wron... lays it out in more detail.

Building the kernel with clang

Posted Sep 21, 2017 22:04 UTC (Thu) by jani (subscriber, #74547) [Link]

I like to call this the Crapustness Principle: The software design guideline based on the illusion of ability to happily process any crap that you're given as if it was actually meaningful. This is when GIGO and the robustness principle collide.

Building the kernel with clang

Posted Sep 20, 2017 14:33 UTC (Wed) by aaron (guest, #282) [Link] (2 responses)

"Be pedantic in what you accept, and arbitrarily brutal in what you send" -- Malcolm Ray

Building the kernel with clang

Posted Sep 21, 2017 1:25 UTC (Thu) by marcH (subscriber, #57642) [Link] (1 responses)

> "Be pedantic in what you accept, and arbitrarily brutal in what you send"

Or... not :-(

http://www.wall.org/~larry/natural.html
https://youtu.be/ju1IMxGSuNE?t=165

Building the kernel with clang

Posted Sep 21, 2017 18:01 UTC (Thu) by niner (subscriber, #26151) [Link]

What does this have to do with aaron's post?

Building the kernel with clang

Posted Sep 20, 2017 18:01 UTC (Wed) by valarauca (guest, #109490) [Link]

I believe you are correct.

By default Clang/LLVM tracks _closely_ to GAS, but with less magic https://llvm.org/docs/LangRef.html#inline-assembler-expre...

Small things like `add` on x64 requires a suffix to state if its an `addw`, `addq`, etc. for example https://clang.llvm.org/compatibility.html#inline-asm
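A minimal sketch of the suffix point, assuming an x86-64 target and GCC-style extended inline assembly (the function name add_four() is made up for this example, and a plain C fallback is used on other architectures). With an immediate and a memory operand, nothing in a bare `add` says how wide the operation is, so the suffix is written out:

```c
#include <stdint.h>

/* Add 4 to the 32-bit value that p points to. */
void add_four(uint32_t *p)
{
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
	/*
	 * "addl" spells out the 32-bit operand size. A bare "add $4, <mem>"
	 * has no register operand to infer the size from, which is the
	 * ambiguity described above.
	 */
	__asm__("addl $4, %0" : "+m"(*p) : : "cc");
#else
	/* Fallback for other architectures (assumption: same semantics). */
	*p += 4;
#endif
}
```

Starting from a value of 10, a call to add_four() leaves 14 in place.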

Building the kernel with clang

Posted Sep 21, 2017 22:42 UTC (Thu) by codewiz (subscriber, #63050) [Link]

> I am curious at what point you decide that if the emitted code can only be assembled by a "too liberal in what it accepts" assembler then it indicates a bug in the code or the compiler. What API or standard applies to the interface between compiler and assembler?

I think they refer to hand-written inline assembly containing borderline invalid syntax. Suppose, for example, that someone wrote x86 assembly containing "test %eax", with the second operand missing, and gas liberally interpreted it as "test %eax, #0" or "test %eax, %eax".

C, still?

Posted Sep 20, 2017 3:48 UTC (Wed) by ncm (guest, #165) [Link] (38 responses)

Call again when you can build it with clang++. It's getting embarrassing, depending on a C kernel in this millennium. PDP-11s were pretty cool, once, but it comes time to move on.

C, still?

Posted Sep 20, 2017 4:18 UTC (Wed) by eru (subscriber, #2753) [Link] (19 responses)

It seems to me a kernel is the one place where you don't want the compiler doing things behind your back. You need to stay in control. The things C++ adds to C are all about hidden magic.

C, still?

Posted Sep 20, 2017 5:34 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

Kernel is nothing special. Most of the code in Linux is bog-standard C code, except that it uses a somewhat unusual "standard library". Rewriting it in something more expressive would easily allow reducing the kernel size by 2-3 times, if not more.

Truly low-level code is a very small percentage of Linux.

But anyway, Linux is not going to be rewritten in anything any time soon.

C, still?

Posted Sep 20, 2017 7:30 UTC (Wed) by smurf (subscriber, #17840) [Link] (1 responses)

I assume you're talking about size of the source code, not the resulting objects.

C, still?

Posted Sep 20, 2017 7:32 UTC (Wed) by Cyberax (✭ supporter ✭, #52523) [Link]

Sure, but text size also shouldn't be _too_ large.

C, still?

Posted Sep 20, 2017 9:21 UTC (Wed) by NAR (subscriber, #1313) [Link]

On the other hand compiling C is fast but compiling C++ is slooow. In my experience most compilation times were in the coffee-break range, but if a common header was modified (fix documentation comments, add private fields), it was lunch-break time. Pre-compiled headers might help, but I don't know how much. And although modern C++ with all those move semantics can be really effective, it can also be really complicated...

C, still?

Posted Sep 20, 2017 16:43 UTC (Wed) by ncm (guest, #165) [Link] (14 responses)

> It seems to me a kernel is the one place where you don't want
> the compiler doing things behind your back. You need to stay
> in control. The things C++ adds to C are all about hidden magic.

There was a time you could say this, and the greybeards would nod silently and go back to scraping barnacles. But they're dead now, and the greybeards we have today know better.

We know that you aren't obliged to hide what is better not hidden. We know the magic isn't about hiding essential details, it's about whole categories of mistakes made impossible, and the attention spared from watching out for those available for better things. In all programming, no less in kernel programming, by far the scarcest commodity is attention. More productively-applied attention means better code: faster (yes, good C++ code is routinely faster), and doing more of the right things, and fewer of the wrong things. Compilation may be slower (although not compilation of the C subset -- guess what, Gcc uses the same code for both!), but with fewer trivial mistakes that have big consequences, you come out far ahead.

Nobody seriously suggests rewriting Linux in C++, just as nobody suggests rewriting Gcc (although somebody wrote Clang). But an increasing fraction of Gcc is good C++, and is visibly better for it. (Who hasn't noticed Gcc getting better, faster? It's not just competition from Clang.) Linux is coded in C, but C is bad C++, and new code could be good C++.

A totally new kernel in a modern language might be better than a mixed C and C++ Linux, but Linux is what we can have, and Linux can be made better than what we do have, with overwhelmingly less work. Over time, it will be noticed that the overwhelming majority of the bugs, by proportion, are in the old C code, and the quality standard will rise.

C, still?

Posted Sep 21, 2017 9:19 UTC (Thu) by NAR (subscriber, #1313) [Link] (8 responses)

"Compilation may be slower (although not compilation of the C subset -- guess what, Gcc uses the same code for both!)"

If you're only using the C subset of C++ then why are you compiling with the C++ compiler in the first place? The slowness starts with C++-specific stuff, each usage of e.g. std::map<std::string, std::string> includes so much code eventually that it really slows down compilation. Once I read a story about a C++ project that took 2 hours to compile. One developer was bored and copied all code of the project into a single source file: it was compiled in 7 minutes. This is the problem that the C++ ecosystem needs to solve.

C, still?

Posted Sep 21, 2017 11:09 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

There are proposals for modules in the standards committee. The problem with it that I see is that it will require massive overhauls to all of the build tools for proper support. Unless you can determine the module output just from the filename or a basic scan of a file's contents, you need to compile it to see what modules a file will create or consume, so you need a pass before everything else that determines this before you start your real compilation. For example, ninja currently has no support for this (see Brad King's PRs for Fortran support; it's the same logic necessary).

C, still?

Posted Oct 1, 2017 0:42 UTC (Sun) by philomelus (guest, #96366) [Link] (6 responses)

There are ways to fix this that are easy and straightforward. The fact that none of the common c/c++ libraries implement them is a shame.

For an example:

Have you seen something like this in a header?

#ifndef __FOO_H__
#define __FOO_H__
...
#endif // __FOO_H__

That in itself isn't enough. The point of the macro is to prevent reloading a file. If the entire file has to be read in order to exclude it (e.g. compiler has to find the matched #endif because it could be a #else or something), you've gained very little. What saves compilation time is NOT reading the files to begin with.

My projects have used the following structure for more than 20 years (yes, since cfront days):

In source, or header if you like:

#ifndef __FOO_H__
#include "foo.h"
#endif

Then in header, at line 1:

#ifndef __FOO_H__
#define __FOO_H__
// Header comments, (c) notice, license notice, etc.

// other stuff as usual
#endif

Doing the above makes the top of the source files a bit "ugly" in some folks' opinion, but the compile time savings are well worth it. With modern c++ template meta programming, one can even make this happen without much involvement of the end user (the programmer, that is).

C, still?

Posted Oct 1, 2017 2:43 UTC (Sun) by viro (subscriber, #7872) [Link] (1 responses)

Compiler can trivially recognize that file is guarded that way (i.e. that having a macro defined guarantees that everything inside will be ifdef'ed out) and skip running a tokenizer over that thing. I've never checked if gcc and clang do that, but sparse sure as hell does and implementation is nothing tricky. See already_tokenized() and handle_ifndef() in pre-process.c; it's really not hard to do. Comments in commit eac4d539b83e ([PATCH] fixed stream->protect handling) describe the logic of the current implementation. If any compiler out there doesn't handle that without rereading/retokenizing the files - just fix it...

C, still?

Posted Oct 17, 2017 21:39 UTC (Tue) by nix (subscriber, #2304) [Link]

> Compiler can trivially recognize that file is guarded that way (i.e. that having a macro defined guarantees that everything inside will be ifdef'ed out) and skip running a tokenizer over that thing. I've never checked if gcc and clang do that

GCC certainly does. See the first few conditionals in libcpp/files.c:should_stack_file(). This is even documented (in the node 'Once-Only Headers' in the cpp info doc).

C, still?

Posted Oct 1, 2017 7:17 UTC (Sun) by NAR (subscriber, #1313) [Link]

The problem is that for every single compilation unit that uses std::map<std::string, std::string> the compiler actually has to read the relevant headers. If there are 20 C++ source files in the directory using the std::map<std::string, std::string> type, the headers will be read and compiled 20 times. ifdef's don't help at all. What would help is using a single file instead of 20 files but that has other problems.

C, still?

Posted Oct 1, 2017 18:24 UTC (Sun) by madscientist (subscriber, #16861) [Link] (1 responses)

Every compiler I've used in the last 5 years supports the "#pragma once" facility. This is much cleaner than ifdefs and just as fast as the "ifdefs around the #include" method you are suggesting.

It isn't standard of course, so if you require maximum portability you can't use it.

C, still?

Posted Oct 5, 2017 9:25 UTC (Thu) by mathstuf (subscriber, #69389) [Link]

Doesn't that have issues with copies of the header floating around though? Though I guess you have "which was first?" questions anyways in similar situations using guards, so maybe it isn't so bad.

include guard

Posted Oct 1, 2017 19:44 UTC (Sun) by Jandar (subscriber, #85683) [Link]

For more than a decade, gcc has recognized an include guard and doesn't open the file a second time.

https://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros....

C, still?

Posted Sep 21, 2017 15:19 UTC (Thu) by nix (subscriber, #2304) [Link]

> Compilation may be slower (although not compilation of the C subset -- guess what, Gcc uses the same code for both!)

Um, no it doesn't. The frontends are separate, the parser is distinct, there is no way you can say it uses the same code for both except insofar as it also uses the same code for Ada, Objective C and Fortran.

C, still?

Posted Sep 22, 2017 20:34 UTC (Fri) by vomlehn (guest, #45588) [Link] (3 responses)

This greybeard is really tired of this argument. It would be nice to have a good object-oriented language for writing kernels. Linux has a fair amount of code that is logically object oriented but which is limited to being expressed with some pretty horrible C syntax. However, C++ is not the solution. I don't know of a solution and I don't have the time to take this detour. Apparently nobody else does, either. So, C it is.

C, still?

Posted Sep 24, 2017 8:18 UTC (Sun) by ncm (guest, #165) [Link] (2 responses)

There is little of interest in "object-oriented" languages; very little of the benefit of coding in C++ (vs C) arises from object support in the language, and very little from syntactic convenience. If that were all, there would be hardly any point to switching.

It is about putting the type system to work performing logic at compile time, to generate code that is correct by construction. This is not something exotic; it is daily life for a C++ programmer.

C, still?

Posted Sep 24, 2017 21:36 UTC (Sun) by peter-b (guest, #66996) [Link] (1 responses)

> It is about putting the type system to work performing logic at compile time, to generate code that is correct by construction. This is not something exotic; it is daily life for a C++ programmer.

Alas, this paradigm of C++ programming is a relatively modern concept, which relies heavily on language features introduced in C++11 and more recently. Most C++ projects I've worked on in my career use C++ as "C plus classes", and the use of types-as-compile-time-assertions was a controversially innovative suggestion. It must be nice to work in an environment where this sort of "types for compile-time logic" approach is commonplace. :-)

C, still?

Posted Sep 25, 2017 2:26 UTC (Mon) by ncm (guest, #165) [Link]

It really is.

MongoDB's core server engineering organization is very well-run. (And is hiring.) We are using C++14 for the current release, probably C++17 in the next.

C, still?

Posted Sep 20, 2017 10:20 UTC (Wed) by k3ninho (subscriber, #50375) [Link] (5 responses)

Hey, can you let me know what you think of Some Were Meant for C: The Endurance of an Unmanageable Language (PDF), by Stephen Kell? (Via: http://chneukirchen.org/trivium/)

K3n.

C, still?

Posted Sep 20, 2017 19:46 UTC (Wed) by ncm (guest, #165) [Link] (1 responses)

Thank you for asking.

The paper cited is one in a long line of apologia. There was a time when writing apologia was a respected activity, and apologia were widely persuasive. Although reading old apologia offers a precious glimpse into a lost world, they have ceased to persuade. As well-written as they often were (and remain), too many of the facts they cited have turned out to be falsehoods, and too many of the truisms have turned out, in the fullness of time, to be mere truthiness. (This last term seems quaint now; only a year ago, truth was something even liars pretended to.) Too many of the merits claimed were, in fact, harms, or are merits we may claim without accepting the argument.

In this case, essentially all of its valid arguments apply equally well to C and C++; the author contrasts them with "managed languages". However, he presents as a truism that C is faster than other languages, where we know that well-written C++ is routinely faster than C. In general, though, the arguments are basically irrelevant to the topic of upgrading from C to C++ . Arguments over the merits of "safe" languages, in general, are suspect; the real question is where we expect to get correctness. Testing is nice, and checkers, and validators, but the place to get the pure stuff is by construction. When your language is powerful enough to present facilities (i.e., libraries) that admit only valid operations, without compromising performance, worries over invalid operations creeping in vanish.

C, still?

Posted Sep 21, 2017 5:04 UTC (Thu) by eru (subscriber, #2753) [Link]

> When your language is powerful enough to present facilities (i.e., libraries) that admit only valid operations, without compromising performance, worries over invalid operations creeping in vanish.

This I can agree with. But C++ is not that language. My biggest gripe with it is that it cannot protect its abstractions. Where I work, this turns up every time the g++ compiler is upgraded, despite having warnings to the max in mandatory compiler options. The programmers have used the language or its library in a way that happened to work in the old version, but either does not compile in the new version, or crashes. (A contributing problem is also the horrid complexity of modern C++).

C, still?

Posted Sep 22, 2017 9:47 UTC (Fri) by tdz (subscriber, #58733) [Link] (2 responses)

IMHO the author is correct about the benefits of C for any kind of low-level I/O where access to raw data is required or at least beneficial. C++ is in a similar position because for raw-data I/O, it's really just a beefed-up variant of the equivalent C code. But as soon as the programming language favors or requires more I/O abstractions, it loses the advantage that 'managed' and/or 'safe' usually provide.

C, still?

Posted Sep 22, 2017 16:11 UTC (Fri) by ncm (guest, #165) [Link] (1 responses)

> usually provide

Always promise, never provide.

In every case, as noted in the cited article, the language offers some sort of escape hatch to do "unsafe" operations. In this detail they are equivalent to the "safe subset" promoted for C++, that a program steps out of at need.

The relevant difference between languages, for systems programming, is how effectively they can package user-defined abstractions to make it unnecessary for users to step outside the (safe) abstraction. Commonly, certain necessary abstractions can't be expressed as libraries, and so have to be built into the core language, and then are promoted as features "missing" from other languages.

In an otherwise powerful language like Haskell, for example, we see its weakness in resource management papered over with built-in garbage collection, causing the familiar integration problems. When you cannot abstract resource management, abstractions that need to manage resources other than memory necessarily leak, and in integration even memory management leaks.

The Rust project has chosen to provide expressive power, and in many cases better defaults than C++, while making it harder to accidentally do many (but not all) unsafe operations. In ten or twenty years, if it matures well, it may be a good choice for implementing a successor to Linux; but C++ isn't standing still, so the bar is rising.

There is really no way forward for the Linux kernel other than C++. At some point the choice to build as C++ or not will amount to choosing whether to keep or abandon relevance. It's not there yet. It would be better for the project to make the switch before that point.

C, still?

Posted Nov 17, 2017 19:39 UTC (Fri) by marcH (subscriber, #57642) [Link]

> At some point the choice to build as C++ or not will amount to choosing whether to keep or abandon relevance. It's not there yet.

Indeed the competition for the Linux kernel is unfortunately nowhere near yet. In fact I haven't really seen any at all. I bet Rust will pass C++ long before there's any credible one.

C, still?

Posted Sep 20, 2017 10:43 UTC (Wed) by error27 (subscriber, #8346) [Link] (4 responses)

C++ is basically a superset of C. The kernel is moving more and more in the opposite way to use a subset of C. For example, in C you can have an assignment inside an if statement but in the kernel that's forbidden by checkpatch.pl.

It's already pretty common to format code to please static analysis tools. For example, Sparse is limited in how it understands locking so people work around it by making their locking very simple. In twenty years, the kernel will still be written in C but it will mostly be a subset of C that static analysis tools can understand. We're still developing the tools and figuring out what that looks like.

C, still?

Posted Sep 20, 2017 11:53 UTC (Wed) by cpitrat (subscriber, #116459) [Link]

Clang is a good way to avoid the drawback of limiting yourself _only_ to please some static analysis tool (the assignment in if example is more to avoid bugs).

C, still?

Posted Sep 20, 2017 12:31 UTC (Wed) by eru (subscriber, #2753) [Link] (2 responses)

Avoiding an assignment inside a condition expression is a good idea also for better human readability.

C, still?

Posted Sep 20, 2017 13:10 UTC (Wed) by karkhaz (subscriber, #99844) [Link]

Also prevents
if ((options == (__WCLONE|__WALL)) && (current->uid = 0))
        retval = -EINVAL;
circa 2003: https://freedom-to-tinker.com/2013/10/09/the-linux-backdoor-attempt-of-2003/
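A minimal sketch of why that line is dangerous, assuming a hypothetical `task` struct in place of the kernel's real credentials (the names `task`, `check_buggy` and `check_fixed` are made up for illustration). The `=` stores 0 into `uid`, and because the assigned value is 0 the condition is false, so the error path is never taken and nothing looks wrong at run time:

```cpp
// Hypothetical stand-in for a task's credentials; not real kernel code.
struct task {
    int uid;
};

// Assignment inside the condition: uid becomes 0 (root) as a side effect.
// The value of the assignment is 0, so the branch body never runs.
inline int check_buggy(task &t, bool flags_match) {
    if (flags_match && (t.uid = 0))
        return -22; // -EINVAL on Linux; unreachable here
    return 0;
}

// Comparison inside the condition: uid is only read, never modified.
inline int check_fixed(task &t, bool flags_match) {
    if (flags_match && (t.uid == 0))
        return -22; // -EINVAL on Linux
    return 0;
}
```

The extra parentheses around the assignment are what typically silence the compiler's "assignment used as truth value" warning, which is one reason a style rule enforced by checkpatch.pl is a useful second line of defense.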

C, still?

Posted Sep 21, 2017 3:54 UTC (Thu) by jreiser (subscriber, #11027) [Link]

If there is only one test (no && or ||) and only one group of assignments (a=b=c but no , [sequential comma]) then assignment inside an if can be OK.

C, still?

Posted Sep 20, 2017 14:38 UTC (Wed) by aaron (guest, #282) [Link] (1 responses)

Redox is over there. -->

C, still?

Posted Sep 21, 2017 13:41 UTC (Thu) by adobriyan (subscriber, #30858) [Link]

kernel.h, math64.h and slab.h are good examples of why even basic function overloading would move the kernel to the next level.

I never realized how stupid the decision to ban struct comparisons while allowing struct assignments is until trying to compile the kernel with a C++ compiler.

Say, it is possible to annotate non-null pointers by switching to references and lose nothing.

But no, we continue to do it the hard way because kernel programming is supposed to be hard, right?

Right?
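To make the overloading and reference points concrete, here is a rough sketch. `div_demo`, `counter` and `bump` are hypothetical names invented for this example; the kernel's math64.h spells out a separate name for each type combination (`div_u64()`, `div_s64()`, `div64_u64()` and so on), whereas overloading lets the compiler pick by argument type:

```cpp
#include <cstdint>

// Hypothetical overload set: one name, selected by argument types.
inline std::uint64_t div_demo(std::uint64_t dividend, std::uint32_t divisor) {
    return dividend / divisor;
}

inline std::int64_t div_demo(std::int64_t dividend, std::int32_t divisor) {
    return dividend / divisor;
}

// Hypothetical structure used only to show the reference idea.
struct counter {
    int value;
};

// Taking a reference documents "never null" in the signature itself;
// the callee needs no NULL check.
inline void bump(counter &c) {
    ++c.value;
}
```

This is only an illustration of the language features being discussed, not a proposal for how the kernel headers would actually be converted.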

C, still?

Posted Sep 21, 2017 13:22 UTC (Thu) by adobriyan (subscriber, #30858) [Link] (4 responses)

> Call again when you can build it with clang++.

I actually started doing it and even found 1.5 bugs in the process:

2c13ce8f6b2f6fd9ba2f9261b1939fc0f62d1307 posix_cpu_timer: Exit early when process has been reaped

50755bc1c305340660bbfa65fdae3ed113d8fe0e seqlock: fix raw_read_seqcount_latch() (+ followup fix)

It is doable to compile with clang++ (but not g++, see C99 initializers) while maintaining source compatibility with certain exceptions like SYSTEM_CALL macro wrappers and some relocation thingy clang doesn't support. But then one needs to win holy war against allocators returning "void *" (recent kvmalloc() enthusiasm doesn't help), pointer arithmetic, "new", "private" etc.

It was quite refreshing to type something like "char min(char x, char y) = delete;" or overload "==" or "enum class" for type safety.

In fact your C++ posts here encouraged me to turn to the dark side. :-)
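For readers who have not used these features, a small sketch of the three things mentioned above: a deleted overload, an overloaded `==`, and `enum class`. All names here (`min_of`, `gfp`, `dev_id`) are hypothetical and chosen only for illustration; they are not the kernel's own definitions:

```cpp
// Hypothetical helper, not the kernel's min() macro.
constexpr int min_of(int x, int y) { return y < x ? y : x; }

// Deleted overload: calling min_of() with two chars is a compile-time
// error instead of a silent promotion to int.
char min_of(char x, char y) = delete;

// Scoped enumeration: no implicit conversion to or from integers, so a
// plain number cannot be passed where a flag of this type is expected.
enum class gfp : unsigned { kernel = 1, atomic = 2 };

// Hypothetical structure with a user-defined equality operator, giving
// the struct comparison that C does not provide.
struct dev_id {
    int major;
    int minor;
};

constexpr bool operator==(const dev_id &a, const dev_id &b) {
    return a.major == b.major && a.minor == b.minor;
}
```

With these definitions, `min_of(3, 7)` compiles, while `min_of('a', 'b')` is rejected by the compiler, and a `gfp` value needs an explicit `static_cast` before it can be used as a number.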

C, still?

Posted Sep 21, 2017 15:22 UTC (Thu) by khim (subscriber, #9252) [Link] (1 responses)

> but not g++, see C99 initializers

C++20 got them, so hopefully GCC will implement them soon. I guess when that happens there could, finally, be some discussion about switching to a C++ compiler...

C, still?

Posted Sep 21, 2017 16:42 UTC (Thu) by adobriyan (subscriber, #30858) [Link]

> C++20 got them thus hopefully GCC would implement them soon.

Good. I recalled what clang++ didn't support: .label subtractions in alternatives calculations. This is not a problem for compile-time checking but is a showstopper for runtime. Hopefully g++ won't have problems with them.

C, still?

Posted Sep 21, 2017 15:23 UTC (Thu) by excors (subscriber, #95769) [Link] (1 responses)

"But then one needs to win holy war against allocators returning "void *""

If you mean the problem is just that you need to add explicit casts in a million places, maybe you could avoid that relatively cleanly with:

class autocast {
public:
  autocast(void *p) : ptr(p) { }
  template<typename T> operator T*() {
    return static_cast<T*>(ptr);
  }
private:
  void *ptr;
};

#define kmalloc(size, flags) autocast(kmalloc(size, flags))

so it can be implicitly cast to any pointer type (with zero runtime cost).

(Hmm, I wonder if you could then extend it to something like:

template<size_t size>
class checked_autocast {
public:
  checked_autocast(void *p) : ptr(p) { }
  template<typename T> operator T*() {
    static_assert(size >= sizeof(T), "allocated size smaller than return type");
    return static_cast<T*>(ptr);
  }
private:
  void *ptr;
};

#define kmalloc(size, flags) \
  __builtin_choose_expr( \
    __builtin_constant_p(size), \
    checked_autocast<size>(kmalloc(size, flags)), \
    autocast(kmalloc(size, flags)))

to detect some bugs.)
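A user-space sketch of the first (unchecked) `autocast` idea, so it can be tried outside the kernel. It assumes `std::malloc()` as a stand-in for `kmalloc()`; `demo_alloc`, `point` and `autocast_demo` are hypothetical names added for this example, and the constructor is marked `explicit` here as a small variation on the class above:

```cpp
#include <cstdlib>

// Wrapper that converts implicitly to any object pointer type.
class autocast {
public:
    explicit autocast(void *p) : ptr(p) {}

    // Conversion function template: T is deduced from the target type.
    template <typename T>
    operator T *() const {
        return static_cast<T *>(ptr);
    }

private:
    void *ptr;
};

// Hypothetical stand-in for kmalloc(); uses the C library allocator.
inline autocast demo_alloc(std::size_t size) {
    return autocast(std::malloc(size));
}

struct point {
    int x;
    int y;
};

// Allocates a point without an explicit cast, uses it, and frees it.
// Returns -1 if the allocation fails.
inline int autocast_demo() {
    point *p = demo_alloc(sizeof(point)); // no cast needed at the call site
    if (!p)
        return -1;
    p->x = 3;
    p->y = 4;
    int sum = p->x + p->y;
    std::free(p);
    return sum;
}
```

The conversion happens entirely at compile time, which is consistent with the "zero runtime cost" claim above, though that is an expectation about optimized builds rather than something this sketch measures.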

C, still?

Posted Sep 21, 2017 16:26 UTC (Thu) by adobriyan (subscriber, #30858) [Link]

I'm doing

#define lmalloc(T, gfp) ((T*)_kmalloc(sizeof(T), (gfp)))

and friends currently. There were even minor bugs in the kernel because of these kinds of type mismatches.
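The same idea can also be written as a function template rather than a macro. This is a hedged user-space sketch, with `std::malloc()` standing in for the kernel allocator; `alloc_one`, `request` and `alloc_one_demo` are hypothetical names for illustration only:

```cpp
#include <cstdlib>

// Typed allocation: the size and the returned pointer type both come
// from T, so they cannot disagree.
template <typename T>
T *alloc_one() {
    return static_cast<T *>(std::malloc(sizeof(T)));
}

// Hypothetical structure used only for the demonstration.
struct request {
    long id;
    char tag[8];
};

// Allocates one request, stores an id, reads it back, and frees it.
// Returns -1 if the allocation fails.
inline long alloc_one_demo() {
    request *r = alloc_one<request>();
    if (!r)
        return -1;
    r->id = 42;
    long id = r->id;
    std::free(r);
    return id;
}
```

Compared with the `autocast` approach, this requires touching every call site, but it removes the possibility of allocating `sizeof(A)` bytes and assigning the result to a `B *`.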

Building the kernel with clang

Posted Sep 20, 2017 11:55 UTC (Wed) by cpitrat (subscriber, #116459) [Link] (13 responses)

"Kroah-Hartman said that he wished there was a competitor to the Linux kernel."

Good, because the Hurd is coming to take over the world!

Building the kernel with clang

Posted Sep 20, 2017 12:11 UTC (Wed) by laarmen (subscriber, #63948) [Link] (1 responses)

Sure, now that they have PulseAudio support, nothing will stand in their way!

Building the kernel with clang

Posted Sep 20, 2017 14:20 UTC (Wed) by mageta (subscriber, #89696) [Link]

But PulseAudio is so yesterday already. Pipewire it is now.

Building the kernel with clang

Posted Sep 20, 2017 14:37 UTC (Wed) by lkurusa (guest, #97704) [Link] (1 responses)

Something makes me think Zircon kernel (for Fuchsia by Google) may be that competitor.

Building the kernel with clang

Posted Sep 20, 2017 15:30 UTC (Wed) by cornelio (guest, #117499) [Link]

And FreeBSD ... which already builds just fine with clang.

Building the kernel with clang

Posted Sep 20, 2017 15:29 UTC (Wed) by rvfh (guest, #31018) [Link] (4 responses)

Or did you mean Magenta?

Building the kernel with clang

Posted Sep 20, 2017 16:08 UTC (Wed) by lkurusa (guest, #97704) [Link] (3 responses)

The Magenta kernel was recently renamed to Zircon to stand in line with the nesosilicate naming scheme.

Building the kernel with clang

Posted Sep 20, 2017 16:53 UTC (Wed) by ncm (guest, #165) [Link] (2 responses)

Zircon appears to be in transition C -> C++.

It's kind of dumb to name the files ".cpp", instead of ".cc", though. For a long time, portable C++ code had to be in ".cpp" files because MSVC insisted on that, but Zircon doesn't need to be built with MSVC (and anyway, nowadays MSVC can compile ".cc" files).

Building the kernel with clang

Posted Sep 21, 2017 14:34 UTC (Thu) by jond (subscriber, #37669) [Link] (1 responses)

Zircon?

New law: Over a long enough time frame, every IRC client's name will be re-used for something else.

Building the kernel with clang

Posted Sep 24, 2017 7:21 UTC (Sun) by magfr (subscriber, #16052) [Link]

I think your observation is needlessly limited. It should read:
Over a long enough time frame every software project name will be re-used for something else.

Building the kernel with clang

Posted Sep 20, 2017 21:02 UTC (Wed) by ndesaulniers (subscriber, #110768) [Link] (3 responses)

This was a typo, gkh@ said he was hoping for more competition in the toolchain space, as the kernel would benefit as a result. He did not state he was hoping for a competitor to the Linux kernel.

Building the kernel with clang

Posted Sep 21, 2017 1:18 UTC (Thu) by marcH (subscriber, #57642) [Link]

> He did not state he was hoping for a competitor to the Linux kernel.

NIMBY!

Building the kernel with clang

Posted Sep 21, 2017 5:58 UTC (Thu) by gregkh (subscriber, #8) [Link] (1 responses)

Sorry, but you might have missed my follow-on comment which was, "I wish we had a viable competitor to the Linux kernel as well". Jake correctly quoted me here.

Building the kernel with clang

Posted Sep 25, 2017 18:14 UTC (Mon) by leoc (guest, #39773) [Link]

Wish granted. :)

binary diff ?

Posted Sep 20, 2017 12:23 UTC (Wed) by johnjones (guest, #5462) [Link] (1 responses)

What's the difference in the kernel binary, and why? That would be very interesting...

Performance diff?

Posted Sep 20, 2017 15:29 UTC (Wed) by mchouque (subscriber, #62087) [Link]

And performance differences based on the same benchmark, to see if one compiler could "learn" from the other. Though working on that must be a time-consuming effort.

Building the kernel with clang

Posted Oct 24, 2017 1:47 UTC (Tue) by zhiqiu (guest, #119236) [Link]

Hi, I tried to build android-4.4-llvm with clang, by doing:
make ranchu64_defconfig
export ARCH=arm64
export CROSS_COMPILE=aarch64-linux-android-
export CLANG_TRIPLE=aarch64-linux-gnu-
make CC=clang HOSTCC=clang

The build completed successfully, but after running lunch aosp_arm64-eng and then:
emulator -kernel kernel/common/arch/arm64/boot/Image
the emulator does not boot.
When I use the prebuilt qemu-Image, the emulator works. I don't know how to use the Image built by clang; could you please explain in detail?

Any help will be really appreciated. Thank you!

Building the kernel with clang

Posted Jan 3, 2018 23:47 UTC (Wed) by ylluminate (guest, #120848) [Link]

Extremely exciting. Thanks for this update. LLVM has a lot to offer in reality.

Also, regarding the lamentation of having another option to Linux itself, as far as I'm seeing illumos is really pushing to be that alternative. Frankly, with its memory management and such, it would be a welcome change if they can get something figured out for wide driver support.


Copyright © 2017, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds