Cranelift code generation comes to Rust
Cranelift is an Apache-2.0-licensed code-generation backend being developed as part of the Wasmtime runtime for WebAssembly. In October 2023, the Rust project made Cranelift available as an optional component in its nightly toolchain. Rust users can now choose Cranelift as the code-generation backend for debug builds, making it an opportune time to look at what makes Cranelift different. Cranelift is designed to compete with existing compilers by generating code more quickly than they can, thanks to a stripped-down design that prioritizes only the most important optimizations.
Fast compile times are one of the many things that users want from their programming languages. Compile times have been a source of complaints about Rust (and other languages that use LLVM) for some time, despite continuing steady progress by the Rust and LLVM projects. Additionally, a compiler that produces code quickly enough is potentially viable in applications where it currently makes more sense to use an interpreter. All of these factors are cause to think that a compiler that focuses on speed of compilation, rather than the speed of the produced code, could be valuable.
Cranelift's first use was as the backend of Wasmtime's just-in-time (JIT) compiler. Many languages now come equipped with JIT compilers, which often use specialized tricks to quickly compile isolated functions. For example, Python recently added a copy-and-patch JIT that works by taking pre-compiled sections of code for each Python bytecode and stitching them together in memory. JIT compilers often use techniques, such as speculative optimizations, that make it difficult to reuse the compiler outside its original context, since they encode so many assumptions about the specific language for which they were designed.
The developers of Cranelift chose a more generic architecture, which means that Cranelift is usable outside of the confines of WebAssembly. The project was originally designed with use in Wasmtime, Rust, and Firefox's SpiderMonkey JavaScript engine in mind. The SpiderMonkey project has since decided against using Cranelift for now, but the Cranelift project still has a design intended for easy incorporation into other programs.
Cranelift takes in a custom intermediate representation called CLIF, and directly emits machine code for the target architecture. Unlike many other JIT compilers, Cranelift does not generate code that relies on being able to fall back to using an interpreter in case an assumption is invalidated. That makes it suitable for adopting into non-WebAssembly-related projects.
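As a concrete illustration, a function that adds one to its 32-bit argument might look like this in CLIF's textual form (a hand-written example for this article, not taken from Cranelift's test suite, so treat the details as approximate):

    function %add_one(i32) -> i32 {
    block0(v0: i32):
        v1 = iconst.i32 1
        v2 = iadd v0, v1
        return v2
    }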
Cranelift's optimizations
Despite its focus on fast code generation, Cranelift does optimize the code it generates in several ways. Cranelift's optimization pipeline is based on equality graphs (or E-graphs), a data structure for representing sets of equivalent intermediate representations efficiently. In a traditional compiler, the optimizer works by taking the representation of the program produced by parsing and then applying a series of passes to it to produce an optimized version. The order in which optimization passes are performed can have a large impact on the quality of code produced, since some passes require simplifications made by other passes in order to apply. Choosing the correct order in which to apply optimizations is called the phase-ordering problem, and has been the source of a considerable amount of academic research.
In Cranelift, the part of each optimization that recognizes a simpler or faster way to represent a particular construct is separated from the part that chooses which representation should ultimately be used. Each optimization works by finding a particular pattern in the internal representation and annotating it as being equivalent to some simplified version. The E-graph data structure represents this efficiently, by allowing two copies of the internal representation to share the nodes that they have in common, and by letting nodes refer to equivalence classes of other nodes instead of to specific nodes. This produces a dense structure in which adding an alternate form of a particular section of the program is cheap.
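The core idea can be sketched in a few lines of Rust. These are hypothetical, simplified types for illustration, not Cranelift's actual implementation:

    /// Identifier for an equivalence class (e-class) of expressions.
    #[derive(Clone, Copy, PartialEq, Eq, Debug)]
    struct EClassId(u32);

    /// An operation whose operands are e-classes rather than concrete
    /// nodes, so a single node stands for every combination of
    /// equivalent operands at once.
    #[derive(Clone, PartialEq, Debug)]
    enum Node {
        Const(i64),
        Add(EClassId, EClassId),
        Mul(EClassId, EClassId),
    }

    /// Each e-class collects every node known to compute the same value.
    struct EGraph {
        classes: Vec<Vec<Node>>,
    }

    impl EGraph {
        /// Record that `node` is another way to compute class `id`'s
        /// value. This is the cheap "annotation" step: the new node
        /// shares all of its operands with nodes already in the graph.
        /// Returns true if the node was genuinely new.
        fn annotate(&mut self, id: EClassId, node: Node) -> bool {
            let class = &mut self.classes[id.0 as usize];
            if class.contains(&node) {
                false
            } else {
                class.push(node);
                true
            }
        }
    }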
Because optimizations run on an E-graph only add information in the form of new annotations, the order of the optimizations does not change the result. As long as the compiler continues running optimizations until they no longer have any new matches (a process known as equality saturation), the E-graph will contain the representation that would have been produced by the optimal ordering of an equivalent sequence of traditional optimization passes — along with many less efficient representations. E-graphs are more efficient than directly storing every possible alternative (taking O(log n) space on average), but they still take more memory than a traditional intermediate representation. Depending on the program in question and the set of optimizations employed, a fully saturated E-graph could be arbitrarily large. In practice, Cranelift sets a limit on how many operations are performed on the graph to prevent it from becoming too large.
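Continuing that sketch, a saturation loop with an operation budget could look roughly like the following. The Rule type is a stand-in invented here; Cranelift's real pass is considerably more sophisticated:

    /// A rewrite rule, reduced to a function that scans the e-graph and
    /// reports newly discovered equivalences as (class, node) pairs.
    type Rule = fn(&EGraph) -> Vec<(EClassId, Node)>;

    /// Apply rules until a fixed point (saturation) or until the budget
    /// runs out. Because rules only add annotations, the order in which
    /// they run cannot change the final result.
    fn saturate(egraph: &mut EGraph, rules: &[Rule], mut budget: usize) {
        loop {
            let mut changed = false;
            for rule in rules {
                for (class, node) in rule(egraph) {
                    if budget == 0 {
                        return; // cap the graph's growth, as Cranelift does
                    }
                    if egraph.annotate(class, node) {
                        changed = true;
                        budget -= 1;
                    }
                }
            }
            if !changed {
                return; // saturated: no rule has anything new to add
            }
        }
    }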
E-graphs pay for this simplicity and optimality when it comes time to extract the final representation from the E-graph to use for code generation. Extracting the fastest representation from an E-graph is an NP-complete problem. Cranelift uses a set of heuristics to quickly extract a good-enough representation.
Trading one NP-complete problem (selecting the best order for a set of passes) for another may not seem like a large benefit, but it does make sense for a smaller project. The order of optimization passes is largely set by the programmers who write the optimizations, because it requires domain knowledge to pick a reasonable sequence. Extracting an efficient representation from an E-graph, on the other hand, is a generic search problem that can have as much or as little computer time applied to it as the application permits. Cranelift's heuristics don't extract the most efficient representation, but they do a good job of quickly extracting a decent one.
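One simple heuristic of this kind, sketched here on the toy types from above (Cranelift's actual heuristics differ in their details), is to assign every node a cost and repeatedly record the cheapest known cost per class until nothing improves:

    /// Give every e-class the cost of its cheapest member, iterating to
    /// a fixed point. This greedy approach is a simplification: optimal
    /// extraction is NP-complete, so real extractors settle for
    /// good-enough answers like this one.
    fn best_costs(egraph: &EGraph) -> Vec<Option<u64>> {
        let mut best: Vec<Option<u64>> = vec![None; egraph.classes.len()];
        let mut changed = true;
        while changed {
            changed = false;
            for (i, class) in egraph.classes.iter().enumerate() {
                for node in class {
                    // A node can be costed once all of its operand
                    // classes have a known cost.
                    let cost = match node {
                        Node::Const(_) => Some(1),
                        Node::Add(a, b) | Node::Mul(a, b) => {
                            match (best[a.0 as usize], best[b.0 as usize]) {
                                (Some(x), Some(y)) => Some(1 + x + y),
                                _ => None,
                            }
                        }
                    };
                    if let Some(c) = cost {
                        if best[i].map_or(true, |old| c < old) {
                            best[i] = Some(c);
                            changed = true;
                        }
                    }
                }
            }
        }
        best
    }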
Representing optimizations in this way also makes it easier for Cranelift maintainers to understand and debug existing optimizations and their effects, and makes writing new optimizations somewhat simpler. Cranelift has a custom domain-specific language (ISLE) that is used internally to specify optimizations.
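For a taste of the language, a simplification rule in ISLE has roughly this shape (adapted from the general pattern of Cranelift's algebraic rules; the exact extractor names may differ):

    ;; x + 0 == x: annotate the addition as equivalent to its other operand.
    (rule (simplify (iadd ty x (iconst ty (u64_from_imm64 0))))
          x)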
While Cranelift does not organize its optimizations in phases, it does have ten different sets of related optimizations defined in their own ISLE files, which allows for a rough comparison with GCC and LLVM. LLVM lists 96 optimization passes in its documentation, while GCC has 372. The optimizations that Cranelift does have include constant propagation, bit operation simplifications, vectorization, floating-point operation optimizations, and normalization of comparisons. Dead-code elimination is done implicitly by extracting a representation from the E-graph.
A paper from 2020 showed that Cranelift was an order of magnitude faster than LLVM, while producing code that was approximately twice as slow on some benchmarks. Cranelift was still slower than the paper's authors' custom copy-and-patch JIT compiler, however.
Cranelift for Rust
Cranelift may have been designed with the aim of being an alternate backend for Rust, but actually making it usable has taken significant effort. The Rust compiler has an internal representation called the mid-level IR (MIR) that it uses to represent type-checked programs. Normally, the compiler converts MIR to LLVM IR before sending it to the LLVM code-generation backend. In order to use Cranelift, the compiler needed another library that takes MIR and emits CLIF.
That library was largely written by "bjorn3", a Rust compiler team member who contributed more than 3,000 of the approximately 4,000 commits to Rust's Cranelift backend. He wrote a series of progress reports detailing his work. Development began in 2018, and kept pace with Rust's own rapid development. In 2023, the backend was considered stable enough to ship as part of Rust nightly as an optional toolchain component.
People can now try the Cranelift backend using rustup and cargo:
$ rustup component add rustc-codegen-cranelift-preview --toolchain nightly
$ export CARGO_PROFILE_DEV_CODEGEN_BACKEND=cranelift
$ cargo +nightly build -Zcodegen-backend
The given rustup command adds the Cranelift backend's dynamic library to the set of toolchain components to download and keep up to date locally. Setting the CARGO_PROFILE_DEV_CODEGEN_BACKEND environment variable instructs cargo to use Cranelift for debug builds, and the final cargo invocation builds whatever Rust project lives in the current directory with the alternate code-generation backend feature turned on. The latest progress report from bjorn3 includes additional details on how to configure Cargo to use the new backend by default, without an elaborate command-line dance.
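For example, Cargo's unstable codegen-backend feature can also be enabled per-project. Based on Cargo's documentation of that feature, a .cargo/config.toml along these lines should make debug builds use Cranelift by default (verify the exact keys against the progress report):

    [unstable]
    codegen-backend = true

    [profile.dev]
    codegen-backend = "cranelift"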
Cranelift is itself written in Rust, making it possible to use Cranelift as a benchmark to compare the two backends. A full debug build of Cranelift using the Cranelift backend took 29.6 seconds on my computer, compared to 37.5 seconds with LLVM (a reduction in wall-clock time of about 20%). Those wall-clock times don't tell the full story, however, because of parallelism in the build system. Compiling with Cranelift took 125 CPU-seconds, whereas LLVM took 211 CPU-seconds, a difference of about 40%. Incremental builds — rebuilding only Cranelift itself, and none of its dependencies — were faster with both backends: 66ms of CPU time with Cranelift compared to 90ms with LLVM.
Whether Cranelift will ameliorate users' concerns about slow compile times in Rust remains to be seen, but the initial signs are promising. In any case, Cranelift is an interesting showcase of a different approach to compiler design.
Posted Mar 15, 2024 22:29 UTC (Fri)
by willy (subscriber, #9762)
[Link] (1 responses)
Yes, I'm being facetious, but the metric that is important to me as a programmer is "How many seconds am I waiting for the test to run" (edit/compile/debug cycle). I don't really care how many CPU-seconds are consumed. Amdahl's Law applies, of course, but if a compiler wants to take advantage of 15 extra cores to improve the quality of the code, I have no problem with this.
Posted Mar 16, 2024 10:51 UTC (Sat)
by HadrienG (subscriber, #126763)
[Link]
In a fixed pass order design like LLVM, there is an inherent sequential dependency chain, where each pass must run to completion before another pass can start. Each pass can, in principle, use parallelism internally, but usually parallelizing tiny workloads with running times in milliseconds leads to disappointing results due to task spawning and synchronization overheads dwarfing all the benefits of extra parallelism.
In Cranelift's E-graph based design, on the other hand, it is in principle possible to repeatedly run all passes in parallel on the current E-graph[1] until a fixed point is reached. This will not use CPU time as efficiently as running them sequentially, because each optimization pass will see fewer new E-graph nodes on each run, and more pass runs will be needed to reach the fixed point, which will increase the costs associated with starting/ending passes. But if you are latency bound (no other compilation unit is being built concurrently), using CPUs inefficiently can be better than not using them at all.
Assuming this pass parallelization scheme works well, the running time would eventually be bottlenecked by the final step of selecting the fastest representation from the E-graph; but if that step is parallelizable too (and it is if it works by assigning each node a score and searching for the lowest score, or by comparing nodes with each other), that's not an issue.
Ultimately, assuming sufficient L3 cache capacity, larger builds will probably get better overall performance by using coarser-grained compilation unit based parallelism. I wonder how well build systems will cope with the mixing of different levels of parallelism that combining multiple compilation units with parallel compilation of individual compilation units produces.
---
[1] Fine-grained E-graph write synchronization can be used to ensure that each pass sees as many E-graph nodes from other passes as possible on startup, reducing the number of times each pass needs to run and the number of global joins (wait for all passes to finish before moving on) at the expense of a more complex synchronization protocol that will slow down individual accesses to the E-graph.
Posted Mar 16, 2024 1:01 UTC (Sat)
by jalla (guest, #101175)
[Link] (7 responses)
Posted Mar 16, 2024 1:43 UTC (Sat)
by mattdm (subscriber, #18)
[Link] (2 responses)
Posted Mar 16, 2024 2:05 UTC (Sat)
by jalla (guest, #101175)
[Link] (1 responses)
Posted Mar 16, 2024 16:03 UTC (Sat)
by ghodgkins (subscriber, #157257)
[Link]
Posted Mar 16, 2024 12:34 UTC (Sat)
by swig-flail-tricky-sterling (guest, #170190)
[Link] (2 responses)
Posted Mar 17, 2024 0:57 UTC (Sun)
by intelfx (subscriber, #130118)
[Link] (1 responses)
I mean, this is a subscription-only article that the readers are _paying_ to see…
Posted Mar 21, 2024 7:23 UTC (Thu)
by andreashappe (subscriber, #4810)
[Link]
> LWN subscribers at the "professional hacker" level and above can disable ads by going into the account management area and selecting "Customization."
Posted Mar 16, 2024 16:24 UTC (Sat)
by zdzichu (subscriber, #17118)
[Link]
Posted Mar 16, 2024 16:03 UTC (Sat)
by Curan (subscriber, #66186)
[Link] (42 responses)
LLVM is just not a stable platform you can develop against. So many of my patches for various projects, including Mesa, are just fixups to make the builds work again with recent LLVM versions (internal/private stuff is even crazier, but I can't show that, of course). I can accept that major versions break things, but if you need to break core concepts this often, you probably made a lot of mistakes in the past (and seeing how this kind of breakage is not slowing down, the project doesn't seem to be wising up either). A lot of LLVM feels like it is a test environment to try out new things for the compiler space (which is great, don't get me wrong), but then it shouldn't be the basis of anything else. The one thing I'll never understand is how so many parts of the Khronos/Mesa ecosystem (and others, including Rust and WebAssembly) can depend on such an unstable platform.
And before anybody says "just stick with some stable LLVM version": you really don't want to be stuck on an old LLVM version, because it will almost always hold your workload back. Which means you have to update. I really do see major upgrades in performance here; most of it is backend work that would be possible without the breakage.
Long story short: as a non-compiler developer I hate LLVM with a vengeance. On the other hand, I do understand what LLVM is offering. clang sometimes produces better binaries than GCC and – depending on what you want – it is easier to experiment with LLVM than with GCC. GCC is a classic GNU project with all the baggage that entails (I really wish they would abandon at least their GNU Make system and move to something sensible like CMake), but at least you don't have to worry about breakage. Even with libgccjit.
Posted Mar 16, 2024 21:13 UTC (Sat)
by Wol (subscriber, #4433)
[Link] (2 responses)
Cheers,
Wol
Posted Mar 17, 2024 10:01 UTC (Sun)
by khim (subscriber, #9252)
[Link] (1 responses)
> if you need to break core concepts this often, you probably made a lot of mistakes in the past

It's simply a research project acting like a research project. As you were advocating in another discussion, that means everyone who uses it just has to get their act together and fork it (or write something from scratch).
Posted Mar 17, 2024 18:57 UTC (Sun)
by Wol (subscriber, #4433)
[Link]
Drop the "and fork", please. Okay, you can if you want, but the BEST approach (which is what the Rust guys did, I assume) is to *branch* it, fix it, and send pull requests upstream.
If upstream ignores them, then you have to decide what you want to do about it, but what you do NOT do is launch a self-entitled petulant shit-storm at upstream because their priorities are different from yours. If it ends with a full fork, then that's sad, but then you have two - hopefully friendly - projects sharing code, but with different priorities and aims. So be it ...
Cheers,
Wol
Posted Mar 17, 2024 9:59 UTC (Sun)
by khim (subscriber, #9252)
[Link] (27 responses)
> LLVM is just not a stable platform you can develop against.

Nope. Linux kernel breaks internal interfaces pretty often, too. The only problem of LLVM is that it's advertised as something of a separate project while in reality it's only half of many projects. If all these projects would have lived in one repo and people would have changed everything in sync it wouldn't have even been visible. That's the core issue: it was never designed as such. Clang/LLVM developers even explicitly said that you shouldn't try to use it as a stable platform. But lots of companies wanted a stable compiler platform and they decreed that LLVM is it, against the developers' wishes and insistence.

> A lot of LLVM feels like it is a test environment to try out new things for the compiler space

Which is precisely what LLVM was designed for. Just open Wikipedia and read: "LLVM was originally developed as a research infrastructure to investigate dynamic compilation techniques for static and dynamic programming languages." From what you are saying, LLVM works and acts like it was designed to work and act, so why is that an issue?

> but then it shouldn't be the basis of anything else

Build a “better basis for anything else”, isn't that the right solution? Maybe as an LLVM fork, or written from scratch. I was told in no uncertain terms in a somewhat tangentially related discussion just over there that you have zero right to complain since LLVM is free.

> The one thing I'll never understand is how so many parts of the Khronos/Mesa ecosystem (and others, including Rust and WebAssembly) can depend on such an unstable platform.

License. Writing compilers is a hard and time-consuming process. Thus there are, realistically, only two choices: LLVM and gcc (via libgccjit). And pointy-haired bosses out there don't like the GPL, so LLVM was chosen. Initially they even mandated the use of bitcode, which produced many stillborn projects (pNaCl, RenderScript, and bitcode iOS apps, to name a few); after they realized that developers weren't joking and they couldn't force them to do what they never promised to do, bitcode use was abandoned, but since no replacement was available LLVM use continued.
Posted Mar 17, 2024 16:03 UTC (Sun)
by jem (subscriber, #24231)
[Link] (1 responses)
Note the word "originally". That was 21 years ago, and the sentence does not imply it still is nothing more than a research project. On the official LLVM web site we can read "LLVM began as a research project[...] Since then LLVM has grown to be an umbrella project consisting of a number of subprojects, many of which are being used in production by a wide variety of commercial and open source projects[...]"
Posted Mar 17, 2024 16:20 UTC (Sun)
by khim (subscriber, #9252)
[Link]
Beyond a certain size it's incredibly hard to change the nature of a project. Like Windows 11, which includes design decisions that may be traced back to decisions made more than half a century ago when TOPS-10 was designed, many things in LLVM are still in the shape needed for a research project.
Posted Mar 17, 2024 16:32 UTC (Sun)
by tialaramex (subscriber, #21167)
[Link] (3 responses)
I think "We assumed C++" is a problem for a research project too. Lots of interesting new work from the last few decades can't happen if you're just "assuming C++" everywhere, what you get out is "Oh well, apparently it's impossible to do better than C++" because you've assumed that's all that's possible.
Posted Mar 17, 2024 17:49 UTC (Sun)
by farnz (subscriber, #17727)
[Link] (2 responses)
And even when they didn't assume C++, there's often been bugs that boil down to "Clang doesn't use this much, therefore it's not routinely tested and there's lot of lurking bugs"; see the fun Rust has had trying to use noalias on references, where because the matching Clang feature (the C99 restrict type qualifier) is rarely used correctly, miscompilations by LLVM traceable to noalias in LLVM IR kept blocking Rust from using it for Rust references (which definitionally can't alias each other).
Posted Mar 17, 2024 19:09 UTC (Sun)
by Wol (subscriber, #4433)
[Link] (1 responses)
:-))
Cheers,
Wol
Posted Mar 17, 2024 19:53 UTC (Sun)
by farnz (subscriber, #17727)
[Link]
Implementation language isn't at the root of this; the underlying issue is that LLVM IR's semantics aren't (yet) formally defined, but instead rely on informal reasoning throughout LLVM. As a consequence, it's intractable to verify that LLVM actually implements the claimed semantics, and it's not reasonable to write test cases that validate the LLVM IR semantics are met in edge cases, since we don't actually know what the edge cases are.
There's efforts afoot to fully define LLVM IR semantics formally, and one of the biggest outputs those efforts are having (at the moment) is finding bugs in existing LLVM functionality, where existing LLVM code assumes opposing meanings (that both fit the informally defined semantics) for the same LLVM IR construct in different places.
Posted Mar 17, 2024 17:06 UTC (Sun)
by willy (subscriber, #9762)
[Link] (18 responses)
What is hard is creating a thriving project that has many people who are dedicated to finding & fixing the glass jaws. There's also a question of how much optimisation you really need; TCC takes that to an extreme, but maybe it's a useful extreme.
Posted Mar 17, 2024 19:56 UTC (Sun)
by roc (subscriber, #30627)
[Link] (6 responses)
Posted Mar 17, 2024 20:25 UTC (Sun)
by tialaramex (subscriber, #21167)
[Link] (5 responses)
One compiler application which feels intuitively useful to me (but I'm not a language designer) would be to have a _non-optimising_ compiler which can translate from a suitable language to dependable constant time machine code for some N architectures where N > 1.

The purpose would be to wean ourselves off machine code for writing core cryptographic libraries. It would be nice if the sort of people who enter NIST competitions could write this rather than C, but it's not crucial.
In this application we actually don't want ordinary optimisation, so I suspect some (many?) optimisation strategies are invalid and it may be faster to begin from almost nothing.
Posted Mar 17, 2024 22:07 UTC (Sun)
by khim (subscriber, #9252)
[Link] (4 responses)
> One compiler application which feels intuitively useful to me (but I'm not a language designer) would be to have a _non-optimising_ compiler which can translate from a suitable language to dependable constant time machine code for some N architectures where N > 1

You do realize that for modern CPUs “architecture”, here, would include not just CPU vendor, but stepping, version of microcode, etc? One trivial example: when Intel implemented the BMI instructions in 2013 they had nice, constant execution time, but when AMD implemented them four years later they turned into a nice “let's leak all your data to everyone” version, and every microcode update (on both AMD and Intel) may do the same to any instruction — to patch some other vulnerability.

Before you may even begin attempting something like this you would need to define what you want in the end. Given the fact that, given enough samples, you may even distinguish between xor %eax,%eax and mov $1,%eax (they affect flags, and one is 2 bytes while the other is 5 bytes), first you would need to define some metric which would say whether timings are “sufficiently similar” or not.

The whole thing looks like an incredible waste of manpower: instead of trying to achieve something that's not possible to, realistically, achieve on modern CPUs, we should ensure that non-ephemeral keys are generated on a dedicated core. Adding a tiny ARM core (Cell-style) would be much easier and more robust than attempting to create such a compiler.
Posted Mar 18, 2024 7:05 UTC (Mon)
by DemiMarie (subscriber, #164188)
[Link] (1 responses)
Hardware crypto engines are nice, but they are not at all a substitute for constant time guarantees for software operations.
Posted Mar 18, 2024 8:55 UTC (Mon)
by khim (subscriber, #9252)
[Link]
> Hardware crypto engines are nice, but they are not at all a substitute for constant time guarantees for software operations.

Oh, sure. Hardware works. “Constant time guarantees” are snake oil you may lucratively sell. Completely different products with different properties and target audiences. So you can't even change the apps, yet, somehow, pretend that they are not leaking your precious key in some other way except for operations being of different speeds depending on the source? Your keys are not leaking (or maybe they are leaking and you just don't know it) because nobody targets you. It's as simple as that.
Posted Mar 18, 2024 9:01 UTC (Mon)
by pm215 (subscriber, #98099)
[Link] (1 responses)
Posted Mar 18, 2024 9:08 UTC (Mon)
by khim (subscriber, #9252)
[Link]
> Modern CPUs, at least for Intel and Arm, have an architecturally defined data independent timing mode that you can enable in a status register bit when you want to execute this kind of crypto code, and which then guarantees that execution timing of a specified subset of instructions is not dependent on the data they are operating on.

They would still depend on the alignment of your data and code, on the speculative properties of the code which was executed before and after you call that “well crafted” code, and so on. Just look at the continuous struggle to guarantee that SGX is useful for something — with another vulnerability revealed less than a week ago. Ultimately the solution would be the same as with memory security in C: the solution that was obvious on day one will be applied… but only after everything else has been unsuccessfully tried.
Posted Mar 18, 2024 15:18 UTC (Mon)
by paulj (subscriber, #341)
[Link] (10 responses)
Posted Mar 18, 2024 16:28 UTC (Mon)
by willy (subscriber, #9762)
[Link] (5 responses)
https://student.cs.uwaterloo.ca/~cs444/
Team of four students builds a compiler in three months.
Posted Mar 18, 2024 18:15 UTC (Mon)
by paulj (subscriber, #341)
[Link] (4 responses)
I was just pointing out the humour in the irony of making that point via an example written by an author who has a (prodigious) habit of solving difficult problems. ;)
I agree though that, even if a basic compiler is simple, there is a /lot/ more to making a _good_ C/C++ compiler.
Posted Mar 18, 2024 23:48 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (1 responses)
Fun fact: If your professor is sufficiently insane, it is possible that you will end up having to write an interpreter for the (untyped) lambda calculus. So count yourself lucky that you got a language that actually looked vaguely modern.
OTOH, I must admit that the lambda calculus is much, *much* easier to implement than most real languages. It only has 2½ rules, or 1½ if you use De Bruijn indexing. But I would've liked to do a real language, or at least something resembling a real language. I often feel that the most difficult courses were the only ones that actually taught me anything useful.
Posted Mar 20, 2024 20:14 UTC (Wed)
by ringerc (subscriber, #3071)
[Link]
I had a comp sci course on concurrency proofs and theory. The tool they used for it sucked so I updated it from the ancient RH4 target it required and replaced the build system. Then fixed some bugs and memory issues. Improved the error messages and generally made the tool nicer to use.
Posted Mar 19, 2024 4:42 UTC (Tue)
by buck (subscriber, #55985)
[Link] (1 responses)
Well, I'm with you: a compiler written by Fabrice Bellard is not your run-of-the-mill hobby project.
But, I'm really more just pointing out that this still blows me away:
https://bellard.org/jslinux/index.html
which i think saves this comment from being (rightly) criticized for being OT, since it's more Linux-y than the article, and this is LWN after all, dang it.
Well, to get this right back on topic, i can actually just point out that JSLinux features tcc and gcc but not clang:
localhost:~# cat readme.txt
Some tests:
- Compile hello.c with gcc (or tcc):
  gcc hello.c -o hello
  ./hello
- Run QuickJS:
  qjs hello.js
- Run python:
  python3 bench.py
localhost:~#
b/c the performance win:
[`time gcc -o hello -c hello.c -O0` output elided to spare my old laptop's feelings]
Posted Mar 19, 2024 4:47 UTC (Tue)
by buck (subscriber, #55985)
[Link]
Quoth https://bellard.org/jslinux/news.html:
2020-07-05:
Added the Alpine Linux distribution. Many packages are included such as gcc, Clang, Python 2.7 and 3.8, Node.js, Ruby, PHP, ... The more adventurous (and patient) people can also try to run Wine or Firefox.
Added SSE2 support to the x86 emulator
Added dynamic resizing of the terminal
Posted Mar 18, 2024 18:02 UTC (Mon)
by farnz (subscriber, #17727)
[Link]
I've written more than one compiler, and I would not count myself as a Fabrice Bellard level developer. You should be able to write a simple optimizing C compiler following a book like this in about 3 months full-time effort if you're a competent developer (less if you're willing to reuse tools like gas and ld rather than doing everything yourself).
Posted Mar 18, 2024 19:57 UTC (Mon)
by Cyberax (✭ supporter ✭, #52523)
[Link] (2 responses)
It really is not a hard problem. Annoying and somewhat long, but not hard.
Posted Mar 19, 2024 5:40 UTC (Tue)
by adobriyan (subscriber, #30858)
[Link] (1 responses)
Posted Mar 19, 2024 5:56 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
Posted Mar 20, 2024 22:44 UTC (Wed)
by Curan (subscriber, #66186)
[Link] (1 responses)
> Nope. Linux kernel breaks internal interfaces pretty often, too. The only problem of LLVM is that it's advertised as something of a separate project while in reality it's only half of many projects.

I do not care about internal interfaces. I do care about what they break in their official interfaces. Especially in their C interface (which gets worse with every release, it seems, and forces you to use the even worse C++ interface).

> If all these projects would have lived in one repo and people would have changed everything in sync it wouldn't have even been visible.

And offering a libllvm/libclang means you have to take some responsibility. At least make the minor versions work across the board all the time. I still get some failures there, when some piece of software bundles its own LLVM and the system has a different one.

>> a lot of LLVM feels like it is a test environment to try out new things for the compiler space

> Which is precisely what LLVM was designed for.

First of all – as others pointed out too – that is not what LLVM claims to be these days. (NB: LLVM/clang is the standard compiler for e.g. Mac OS via Xcode.) So either there needs to be a big fat warning at the top of all of LLVM that says "don't use me for production, I am a test environment" or LLVM needs to play ball.

>> but then it shouldn't be the basis of anything else

> Build a “better basis for anything else”, isn't that the right solution? Maybe as an LLVM fork, or written from scratch.

I have no issue with somebody attempting to build something better (no matter the language or the licensing model). What I do have an issue with is my stuff breaking because of some library. glibc makes sure my oldest programs still work (even though there will not be much of a chance to get a new version of them for me). I want that commitment from LLVM. I do not care what they do internally. But their interfaces need to be stable enough.
Posted Mar 21, 2024 10:06 UTC (Thu)
by khim (subscriber, #9252)
[Link]
> At least make the minor versions work across the board all the time.

Why should they do that, and was there ever such a promise mentioned on their web site? Yes, LLVM is designed around the idea that it would be bundled with a frontend. If you have other ideas then it's your responsibility to support them. Yes. And that pair has stable outer interfaces, AFAIK. It does do that if you use it according to its design. For a long time LLVM wasn't even designed to be used as a shared library, but at some point Apple decided to change that. And they did. Now it's easier to embed LLVM into external projects as a shared library, but there are still no promises beyond that. If you want something more than that then it's your responsibility to offer such a solution. But who would do the work to ensure that? That's a non-trivial amount of work and AFAIK no one ever volunteered.
Posted Mar 18, 2024 6:55 UTC (Mon)
by DemiMarie (subscriber, #164188)
[Link] (6 responses)
I know that Rust has a fork of LLVM. I believe this is partly because sometimes they need to fix miscompiles and can’t wait for upstream to take the patch.
Posted Mar 18, 2024 12:43 UTC (Mon)
by intelfx (subscriber, #130118)
[Link] (4 responses)
Ah yes. I believe it is time for a revised version of the "layers of abstraction" maxim: every releng problem may be solved by pinning and bundling, except for the problem of too much pinning and bundling.
Posted Mar 18, 2024 23:49 UTC (Mon)
by NYKevin (subscriber, #129325)
[Link] (3 responses)
Posted Mar 19, 2024 12:24 UTC (Tue)
by intelfx (subscriber, #130118)
[Link] (1 responses)
I think it's safe to say that the real world is very far from achieving either scenario.
Posted Mar 19, 2024 18:11 UTC (Tue)
by Cyberax (✭ supporter ✭, #52523)
[Link]
That's actually how Nix works (compiler flags and environment are also a part of the content's hash). Some newer languages like Go also have this baked into the module system.
Posted Mar 19, 2024 12:28 UTC (Tue)
by paulj (subscriber, #341)
[Link]
Posted Mar 20, 2024 22:24 UTC (Wed)
by Curan (subscriber, #66186)
[Link]
If they want to offer a libllvm (and all the other wonderful libraries like libclang), with a SONAME that indicates it is stable, they need to get their act together. Or make it impossible (or at least a manual downstream patch) to do dynamic linking (which they do not). Apart from that: even static linking caused issues in the past. When I linked LLVM statically, I still ended up having issues with executed workloads that brought their own (sometimes statically linked) versions of LLVM. I have to admit that I never went for the deep dive into these issues, since I am not given time for that during my day job. But I do have to keep the workloads running. So far that has meant either manually "fixing" the LLVM builds (I will not claim these to be universally applicable fixes) or pinning the deployed LLVM version until we could work something out with our various upstreams.
(Side note: building LLVM across platforms is not fun. LLVM constantly breaks one build or another. I am sure Sylvestre (who's behind apt.llvm.org) and others beyond myself could attest to that. So even your "just bundle it" solution falls flat on its face right away.)
Posted Mar 23, 2024 23:08 UTC (Sat)
by donio (guest, #94)
[Link] (3 responses)

I think there might be a very slowly growing momentum of projects moving away from LLVM. Mesa's ACO shader compiler is the first example that comes to mind; I believe this was motivated primarily by compilation performance. Zig is also working on a new backend that won't require LLVM.
Posted Mar 25, 2024 7:34 UTC (Mon)
by Curan (subscriber, #66186)
[Link] (2 responses)
Posted Apr 11, 2024 7:24 UTC (Thu)
by daenzer (subscriber, #7050)
[Link] (1 responses)
And that's just one driver, Mesa uses LLVM in many other places, the elephant in the room being llvmpipe.
Posted Apr 11, 2024 7:39 UTC (Thu)
by Curan (subscriber, #66186)
[Link]
llvmpipe I do not mind. That is a very distinct driver and allows testing for extensions, and it would be easy not to build it in a production environment with hardware acceleration around. llvmpipe was also very important for virtual machines (less so these days). But the core Khronos and/or hardware driver stuff with LLVM dependencies I really wish I could avoid.
Posted Mar 16, 2024 20:17 UTC (Sat)
by willy (subscriber, #9762)
[Link]
https://kobzol.github.io/rust/rustc/2024/03/15/rustc-what...
Posted Mar 17, 2024 17:39 UTC (Sun)
by rvolgers (guest, #63218)
[Link]
Posted Mar 18, 2024 21:01 UTC (Mon)
by proski (subscriber, #104)
[Link]
I tried to compile alacritty (a terminal program) using Cranelift. It compiles much faster. It really makes a difference for development, when I want to test my changes quickly.

The resulting binary would panic on startup. The backtrace points to some unsafe code in `fontconfig-rs`. It could actually be a bug in that code. It's nice to have another tool that can find issues in unsafe code. Miri is a tool specifically for that purpose, but it has too many limitations.
Posted Mar 21, 2024 16:44 UTC (Thu)
by sdumitriu (subscriber, #56869)
[Link] (2 responses)
Unrelated to that, it's ambiguous when saying that Cranelift has "ten sets of related optimizations", compared to "96 optimization passes" for LLVM. What is a set? How big are those sets? Are these 10 sets each made of 10 different optimizations, or are there like 20 (or merely the 5 listed in the article) total optimizations, but grouped in 10 different ways?
Posted Mar 21, 2024 17:35 UTC (Thu)
by daroc (editor, #160859)
[Link] (1 responses)
If you are curious, you can read Cranelift's optimizations here[0], and judge for yourself how similar one of their ISLE files is to one of LLVM's optimization passes.
[0]: https://github.com/bytecodealliance/wasmtime/tree/main/cr...
Posted Mar 21, 2024 18:32 UTC (Thu)
by sdumitriu (subscriber, #56869)
[Link]