Distributions looking at LLVM
The LLVM compiler infrastructure project and its Clang C front-end have been making strides over the last few years, to the point where some distributions are looking into using these tools more widely. We have already seen efforts to build the Linux kernel with Clang, but members of the Fedora and Debian communities (at least) are discussing going beyond that and building the entire distribution with an LLVM-based toolchain. While there are obvious benefits to trying that, it will likely be a ways off—if ever—before the benefits outweigh the costs of such a move.
The LLVM project started in 2000, but it was its adoption by Apple in 2005 that spurred much of its growth. The notoriously GPL-averse company has wanted to move away from using GCC for some time, and the BSD-ish licensed LLVM is the path that it has chosen. It is more than just a licensing issue, though, as LLVM has some technical advantages as well. But it is thought that GCC's move to GPLv3, with its more explicit patent provisions and anti-Tivo-ization language, has made the GCC to LLVM move that much more important to Apple. In any case, Xcode 4, the most recent version of the tool set shipped for building Mac OS X and iOS applications, now includes LLVM rather than GCC.
That change led "jonathan" to post a query to the fedora-devel mailing list about the status of using LLVM for building Linux packages. The query was a bit cryptic, but it spurred an interesting discussion about where LLVM is, and why (or why not) a distribution like Fedora might be interested in heading down that path. Matthew Garrett pointed out that LLVM is already used by the hfsplus-tools package (which provides utilities for the HFS+ filesystem) and the Mesa software rasterizer (llvmpipe), but beyond that:
In essence, Garrett is outlining the requirements for any new feature to be adopted into Fedora. Certainly performance is one of the advantages that is touted for LLVM; Apple claims that it compiles twice as fast as GCC while producing faster code. Undoubtedly, there are some programs where that is the case, but Adam Jackson has done some testing with the X server, and didn't find any huge performance increases for the LLVM-generated code:
This isn't especially surprising. Both llvm and gcc have a robust set of high quality optimization passes. Changing compiler is in this sense little different from changing CFLAGS. [...] The performance problems in Linux - in software in general - are almost always algorithmic, and no compiler is going to magically fix broken algorithms.
(Jackson's footnote says that he should re-run the tests and file a GCC bug.)
In addition to Jackson's tests, Vladimir Makarov has done some benchmarking of GCC vs. LLVM (see the links in the lower left) that doesn't seem to show any major performance advantages for LLVM; in fact, GCC looks like it more than holds its own. In addition, as he notes, the compilation speed comparison is often done with both compilers using the -O2 optimization level, but it is fairer to compare GCC's -O1 with LLVM's -O2 in terms of generated code quality. When doing so, the 2x speed increase touted by Apple disappeared in his tests.
Debian developer Sylvestre Ledru has run an experiment rebuilding the Debian archive with LLVM. His focus was not on benchmarks, rather it was looking at how easily LLVM could handle the large diversity of C/C++ code in the archive. He was surprised to find relatively few problems: "I was expecting many issues and bugs caused by clang but I have been surprised to notice that most of the issues are either difference in C standard supported, difference of interpretation or corner cases."
Of the 15,600+ packages built, nearly 1,400 failed, and Ledru documented the kinds of problems he found. Some of the failures were for things like warnings that Clang emitted that GCC didn't, resulting in compilation failure because of the -Werror flag (which turns all warnings into errors). In addition, he pointed out that the problems building packages are being fixed rapidly. The rebuild with Clang 2.9 in September 2011 failed on 14.5% of packages, while the rebuild with the 3.0 release in January failed on only 8.8%.
Those warnings produced by Clang are actually part of what Ledru was after with his test. One of the aims of LLVM is to generate better warnings and diagnostic information, which will be useful even for packages and distributions that never intend to use LLVM for production. In the end, Ledru concluded:
Whether that happens remains to be seen, of course, and in any event, it's not likely to happen overnight. But LLVM is definitely improving, and some of the BSDs are working on switching permanently (FreeBSD, for example; OpenBSD still seems most interested in pcc). Not having a BSD-licensed compiler that can build the kernel and user space has always been somewhat controversial in BSD-land, so switching to LLVM, which has a non-copyleft license (the University of Illinois/NCSA Open Source License), would be a step in the right direction.
There are definitely still plenty of hurdles to clear, especially for a distribution like Debian that supports lots of different architectures. GCC has a multi-year head start on supporting various CPU architectures, so LLVM may not be available for all of Debian's needs. Jackson pointed out some other deficiencies with LLVM, at least from his perspective:
One thing that the rise of LLVM has done is to provide competition for GCC, which has certainly been helpful in pushing GCC in new and interesting directions. As Jackson put it:
While the Linux kernel has been mostly built using LLVM, it is still a work in progress. GCC is still required to build some parts and there are changes needed to LLVM before that can change. Jeff Garzik said that he has been working on it, but "LLVM still needs several obscure compiler changes before we can even boot a no-op kernel".
There is little question that experiments and tests with LLVM are a good thing. Various bugs, in packages and compilers, will be ferreted out and both GCC and LLVM will improve. Some are concerned that Apple will rule the project in ways that could be detrimental to other projects (a la CUPS, which was mentioned by several fedora-devel posters), but there is little evidence of that occurring. It is free software, so, if that happens, there will be no barriers to continuing development. A bigger problem could be patents, which are not addressed by the LLVM license—some day a contributing patent-holder could potentially start suing. But that isn't a problem that is confined to LLVM.
It will be interesting to see how the adoption of LLVM goes. It seems likely to start with FreeBSD, but Linux distributions may eventually try it out as well. There will need to be compelling reasons to do so, but with the progress the compiler suite has made, those reasons may not be all that long in coming. Until correctly working kernels can be built, distributions obviously can't switch over completely. But a complete switch is not necessarily a requirement, and there is nothing stopping interested distributions from building some of their packages with LLVM today.
Posted Mar 22, 2012 7:55 UTC (Thu)
by halla (subscriber, #14185)
[Link] (2 responses)
OpenGTL in turn is used by Krita (but there are also bindings for Gegl) to provide the basic building blocks for the HDR colorspaces, as well as filters and pixel generators. It works really, really well, and makes developing new color models or filters really easy. The OpenCTL and OpenShiva "scripts" are compiled to native code once and then execute just as fast as native C++ filter implementations.
Posted Mar 22, 2012 10:43 UTC (Thu)
by danieldk (guest, #27876)
[Link] (1 responses)
Posted Mar 22, 2012 11:47 UTC (Thu)
by halla (subscriber, #14185)
[Link]
Posted Mar 22, 2012 13:13 UTC (Thu)
by jwakely (subscriber, #60262)
[Link] (3 responses)
Switching a distro to use clang-with-libstdc++ doesn't remove the dependency on GCC but adds a dependency on a combination that isn't properly supported by the vendor of clang or by libstdc++ upstream. That combination is likely to be fragile without an influx of libstdc++ contributors who care about clang, so any distro making the switch would need to do the work themselves, rather than expecting upstream GCC maintainers to help a non-copyleft compiler steal our lunch.
Posted Mar 29, 2012 12:47 UTC (Thu)
by gowen (guest, #23914)
[Link] (2 responses)
Posted Mar 29, 2012 20:15 UTC (Thu)
by mathstuf (subscriber, #69389)
[Link]
Posted Mar 30, 2012 12:17 UTC (Fri)
by jwakely (subscriber, #60262)
[Link]
Some time ago Fedora shipped a version of clang (2.8 IIRC) that was completely incompatible with the system GCC headers, which was silly - they should have shipped an older libstdc++ (alongside the system one) just for use by clang. Clang 3.0 and later have far fewer issues and work well with all but the very bleeding-edgiest libstdc++ headers.
Posted Mar 22, 2012 13:44 UTC (Thu)
by tshow (subscriber, #6411)
[Link] (5 responses)
One of the things I keep hearing about llvm is that the error reporting is supposed to be top notch; it's supposed to be really good at pinpointing the actual error instead of pointing to wherever the compiler finally gave up.
Sadly, in practice, llvm is the worst compiler I've worked with for errors since... well, the early 90s, at least. I've had a typo in a variable name in a single location generate a cascade of hundreds of errors, *none* of which referenced the actual syntax error. Leave out a semicolon by accident and you'll see an error cascade. Put 1,0f instead of 1.0f, error cascade.
Somehow none of these trip up gcc at all.
Debugging was a lot more reliable under gcc as well. Since the llvm switch, two times out of three trying to watch a variable brings XCode down in flames.
Code generation seems no better (our games take the same amount of time to build, roughly, and we've seen no measurable performance difference in the games), and we haven't noticed much else different except for the occasional llvm crash.
Perhaps it's better when it isn't running on OSX, but personally I'm filing llvm under "lots of potential, needs to mature".
Posted Mar 22, 2012 21:34 UTC (Thu)
by Yorick (guest, #19241)
[Link] (4 responses)
I maintain a medium-sized proprietary application, a few million lines of C and C++ code. We use GCC and are mainly happy with it, but do build with Clang from time to time, just to see if it catches something that GCC didn't - and it often does. I also sometimes use Clang for particularly messy C++ work because of its clearer diagnostics.
Interestingly, Clang builds our code base slower than GCC. This is most likely because our compile times are dominated by a few very large (>100000 lines) machine-generated C files with very large functions, and apparently Clang doesn't handle this quite as well as GCC. For more reasonably-sized files, Clang is faster.
The quality of the generated code for our purposes (branchy integer code, very little FP, nothing vectorisable) is comparable between the compilers - the difference is usually not significant.
Posted Mar 23, 2012 8:54 UTC (Fri)
by khim (subscriber, #9252)
[Link] (2 responses)
This is biased selection. Both GCC and Clang have cases where one compiler produces garbage and another gives you a nice and clean message, but if you only run Clang when GCC produced garbage then you are missing cases where GCC gives clean messages and Clang blows up. Again: YMMV. Often Clang is faster, but in our codebase there is a file which GCC compiles in 50 seconds with full optimization while Clang needs 9 minutes - that's a 10x slowdown (MSVC is two times worse than Clang). LOL. Our case is similar, too. Clang and GCC produce code of similar speed and size in the end even if one needs 10x more time than the other.
Posted Mar 23, 2012 14:57 UTC (Fri)
by Yorick (guest, #19241)
[Link] (1 responses)
Certainly — swap GCC and Clang, and my statement would have been equally valid. I'm happy that we have not just one but two free compilers of very high quality, that implement most of the same language extensions and even take the same command-line options.
We do run both GCC and Clang with -Wall -Werror, by the way, forcing ourselves to fix even minor complaints from either compiler. This has proven very effective.
Posted Mar 23, 2012 19:33 UTC (Fri)
by oak (guest, #2786)
[Link]
I've used GCC v4.4 and Clang v1.1 / LLVM v2.7 in Debian stable to compile largish C programs and with these old versions GCC provides much superior detection of issues (while the errors Clang reports are more readable).
Which versions of GCC and Clang/LLVM were you using?
Btw. I would recommend adding quite a few extra warning options to GCC & Clang as -Wall misses quite a few things that can go wrong. For example: -Wextra -Wmissing-prototypes -Wstrict-prototypes -Wold-style-definition -Wcast-qual -Wbad-function-cast -Wpointer-arith -Wwrite-strings -Wformat-security -Wshadow.
Posted Mar 30, 2012 16:35 UTC (Fri)
by mlopezibanez (guest, #66088)
[Link]
Posted Mar 22, 2012 19:49 UTC (Thu)
by jhhaller (guest, #56103)
[Link] (3 responses)
I recently saw a case where clang generated incorrect code when a novice programmer used something like "for(d=0.0; d < 5.0; d++)", and d had the wrong value in the loop. Now, I would never use a double as a loop index, but I would rather see an error than wrong code. If all that is done to validate a compiler is to be sure it compiles things properly, this kind of problem won't be found until someone tries to use the software with incorrect code generation. That's not to say one wouldn't have the same issue moving from one release of gcc to another, just that any change in compilers can have unexpected consequences.
Posted Mar 22, 2012 21:47 UTC (Thu)
by NAR (subscriber, #1313)
[Link] (2 responses)
Posted Mar 22, 2012 21:53 UTC (Thu)
by jhhaller (guest, #56103)
[Link] (1 responses)
Posted Mar 23, 2012 14:48 UTC (Fri)
by jezuch (subscriber, #52988)
[Link]
Posted Mar 23, 2012 4:56 UTC (Fri)
by pflugstad (subscriber, #224)
[Link]
Posted Mar 23, 2012 15:06 UTC (Fri)
by walex (guest, #69836)
[Link]
While for some people like the BSD distributions the non-GPL license of LLVM is an advantage, for people like me it is the opposite: non-GPL/non-copyleft licensed software is something I would rather not invest my own time on. Indeed I would be probably using one of the BSD based distributions if they were GPL licensed, but I use GNU/Linux in large part because it is GPL-licensed, and it makes me unhappy that there is no practical GPL licensed alternative to the X Window System. BTW there is another compiler suite that is BSD licensed, the British TenDRA suite, which generates ANDF.
Posted Mar 23, 2012 20:30 UTC (Fri)
by scientes (guest, #83068)
[Link] (3 responses)
So Apple is behind the FUD and misinformation.
Posted Mar 23, 2012 21:37 UTC (Fri)
by khim (subscriber, #9252)
[Link] (2 responses)
Well, kinda. Apple does not lie; the problem with Apple's “facts” is a tiny GPLv3-related wrinkle: they consider GPLv3 so poisonous they don't ever touch GCC 4.3+. And of course when they compare Apple's LLVM they compare it with Apple's GCC. This means all these comparisons are with a five-year-old version of GCC! Of course applefans don't know (or don't care) about that fact - that's where the FUD comes from. It is conceivable that eventually Clang will win a comparison with GCC even in apples-to-apples style - but this battle will not be easy or fast.
Posted Mar 23, 2012 22:06 UTC (Fri)
by scientes (guest, #83068)
[Link] (1 responses)
Posted Mar 26, 2012 11:10 UTC (Mon)
by jwakely (subscriber, #60262)
[Link]
Posted Mar 23, 2012 23:31 UTC (Fri)
by PaXTeam (guest, #24616)
[Link] (8 responses)
i don't know what he's been doing, but i've been compiling linux with clang for almost 2 years now and since clang v3.0+few commits it can be done without any clang/llvm patches, only the linux side needs patching.
Posted Mar 23, 2012 23:43 UTC (Fri)
by mstefani (guest, #31644)
[Link] (5 responses)
Posted Mar 24, 2012 8:41 UTC (Sat)
by PaXTeam (guest, #24616)
[Link] (4 responses)
Posted Mar 24, 2012 18:54 UTC (Sat)
by rahulsundaram (subscriber, #21946)
[Link] (3 responses)
Posted Mar 24, 2012 21:45 UTC (Sat)
by PaXTeam (guest, #24616)
[Link] (2 responses)
Posted Mar 25, 2012 1:09 UTC (Sun)
by rahulsundaram (subscriber, #21946)
[Link] (1 responses)
Posted Mar 25, 2012 15:12 UTC (Sun)
by PaXTeam (guest, #24616)
[Link]
Posted Mar 25, 2012 20:28 UTC (Sun)
by rgmoore (✭ supporter ✭, #75)
[Link] (1 responses)
That may represent a difference in emphasis. If your main goal is to get the compile working, you may be willing to patch whatever gets you there with the least effort, whether it's the compiler or the kernel source. But if your main interest is in improving the compiler so it can compile anything that GCC can, you have to stick with a vanilla kernel source and keep patching the compiler until it works.
Posted Mar 25, 2012 22:53 UTC (Sun)
by PaXTeam (guest, #24616)
[Link]
clang developers do not want to achieve this, see http://clang.llvm.org/compatibility.html and http://clang.llvm.org/docs/UsersManual.html for some of the details (linux gets bitten by VLAs in structures and nested functions, among others).
My own experience is quite the opposite - Clang's diagnostics are both more useful and precise, and there are also more warnings about potential mistakes. Perhaps you used an old version?
I would encourage you and everybody else to report bugs for GCC diagnostics that you find to be worse than Clang's. GCC diagnostics have improved a LOT in the last few releases. In the order of hundreds of patches and probably close to a hundred bugs fixed per release.
Fortunately, many issues are quite trivial, and they can be fixed by changing one line. Unfortunately, the entry barrier for submitting a one-line patch is so huge that very few external contributors ever suggest such patches.
Unfortunately x 2, some issues are not so easy to solve and there are some known limitations of GCC diagnostics that make them look worse than Clang's. Overcoming these limitations does not seem to be a priority to GCC maintainers. It is true that despite the bitching in blogs and forums about how awful GCC diagnostics are, the developers don't actually see that many reports about it. But I think if enough people reported the problems that they find, something would eventually be done about it.
Building code with multiple compilers
Some people prefer investing in GPL licensed software instead
especially when they are probably running on hardware that wasn't manufactured 5 years ago, and in that case LLVM probably compiles for CPU features and architecture optimizations that the five-year-old GCC 4.2 couldn't possibly know about.
> several obscure compiler changes before we can even boot a no-op kernel".
