GCC gets a new Optimizer Framework
GCC is used as the system compiler for GNU/Linux and many other operating systems. The system compiler is one of the components of an operating system that has a massive impact on the performance of the system as a whole. From the kernel to productivity applications, from the C library to even the compiler itself, almost all executable binaries are compiled with the system compiler, so it has to be stable and produce good code. It is therefore not surprising that major changes to the internals of a stable compiler almost never happen. But computer architectures change, so at some point an aging compiler will have to undergo big surgery or risk becoming irrelevant. And GCC is aging.
While GCC produces reasonably good code for a large number of architectures, even its most recent version essentially builds on the compiler framework started by Richard Stallman in the early 1980's. In this framework, code improving transformations are performed on an intermediate representation called Register Transfer Language (RTL), an architecture independent, lisp-like assembly language. Older versions of GCC used this framework mostly for local optimizations, but such limited optimizations are insufficient for modern architectures with RISC-like properties and a significant difference between the speed of the chip and of memory access.
So, with the release of GCC 3.0, a number of global optimizations acting on RTL were introduced. Unfortunately, for many code transformations, RTL is not a suitable and effective representation because it is too close to the actual machine language. This hinders several of the high-level analyses performed by modern compilers. It has become more and more obvious that a new, high-level intermediate representation needs to be added to GCC. The Tree SSA project has been started to address this need.
The goal of the Tree SSA project is to build a completely machine-independent optimization framework based on the Static Single Assignment (SSA) form. SSA is an intermediate representation (IR) that is becoming increasingly popular because it allows efficient implementations of data flow analysis and optimizing transformations.
In SSA form, every temporary variable is only assigned a value once. Actual programs are seldom in SSA form initially, because variables tend to be assigned multiple times, not just once. An SSA-based compiler modifies the program representation so that every time a variable is assigned in the original program, a new version of the variable is created. Different versions of the same variable are distinguished by subscripting the variable name with its version number.
Variables used in the right-hand side of expressions are renamed so that their version number matches that of the most recent assignment. It is not always possible to statically determine what is the most recent assignment for a given use. These ambiguities are the result of branches and loops in the program's flow of control. To solve them, the SSA form introduces a new type of operation called PHI functions, these merge multiple incoming assignments to generate a new definition; they are placed at points in the program where the flow of control causes more than one assignment to be available.
![]() |
![]() |
| Figure 1 | Figure 2 |
For example, consider the code fragment in Figure 1, where it may not be known at compile time which of the branches will execute. The USE-DEF chains for 'x' are drawn in the figure. In the second 'switch', the compiler has to assume that any of the assignments to 'x' in the first switch may have been executed. In this case, the SSA conversion process will introduce a PHI function for 'x' to create the needed unique definition, as shown in figure 2.
Notice that PHI functions are an artifact used internally by the SSA form and are never emitted in the final code. The PHI function that defines 'x_4' in the previous example simply means that 'x_4' can take the value of 'x_1', 'x_2', or 'x_3' at run time.
Once the program is in SSA form, flow of control and USE-DEF chains are explicitly represented in the intermediate representation, giving almost instantaneous information to passes like constant propagation and folding. The properties of the SSA form greatly simplify data flow analysis, and indeed many traditional compiler optimizations, such as constant and copy propagation and also some forms of common subexpression elimination, are relatively straightforward and fast on functions and even whole programs represented in SSA form.
Before work on these optimizations could start, a whole new optimization framework had to be implemented:
- A new intermediate representation. GCC already constructed each function as an abstract syntax tree (AST), but there was no single AST representation in GCC. Instead, each language defined its own trees which were translated piecewise to RTL and then optimized in the old framework. With Tree SSA, two new language independent representations have been added to resolve this issue.
- Analyses for rewriting the GIMPLE representation in SSA form. In the old framework, no optimizations were performed on the AST. This meant that there was no need for a control flow graph, or for data flow analysis to be performed. All of this is now necessary before a representation can be rewritten into SSA form.
- Passes for performing the actual code optimizations. Passes that have already been implemented include sparse conditional constant propagation, partial redundancy elimination, dead code and dead store elimination, and scalar replacement of aggregates. Also, a lot of dominator tree based optimizations and some conditional execution conversions have been implemented. Many of these passes replace equivalent passes that work on RTL.
All the language front-ends now emit a very high-level IR called GENERIC. Each function is handed over to the language independent parts of the compiler as a tree in GENERIC form. Next, this tree is lowered to GIMPLE form, another new IR derived from the SIMPLE representation proposed by the McCAT project out of McGill University.
The GIMPLE representation looks like three-address code. All side effects are explicit so that a function in GIMPLE form is ready for analysis. Most of the existing front-ends have been modified to emit GENERIC so that they can be optimized using Tree SSA. The next release will also include a Fortran 95 front-end, which is the first front-end built directly to emit GENERIC.
To avoid unnecessary code duplication, a lot of effort was spent on rewriting the old framework so that it was possible to share many of the basic control flow graph manipulations between the old and the new framework. Data flow analyses had to be implemented from scratch.
One particularly interesting analysis is alias analysis. GCC now implements several types of alias analysis: type-based flow-insensitive analysis, flow-insensitive points-to analysis, and flow-sensitive points-to analysis. Most analyses are currently intra-procedural, although some inter-procedural analyses are partially implemented or planned.
All the new parts together account for about 100,000 lines of new code, not including the many changes to existing parts of the compiler. The framework implemented as part of the Tree SSA project adds a whole new path to the compilation process, while no RTL passes have been disabled yet.
Still, a compiler with the Tree SSA passes enabled is not significantly
slower than the recently released GCC 3.4.0, and a number of very
expensive passes in the RTL framework have already been subsumed by
Tree SSA passes.
Once these RTL passes have been disabled and removed, the resulting
compiler will be a lot faster than GCC 3.4.0, while the generated code
is at least as good, and often better.
| Index entries for this article | |
|---|---|
| GuestArticles | Bosscher, Steven |


