By Nathan Willis
March 20, 2013
The GNU Compiler Collection (GCC)
is nearing the release of version 4.8.0, approximately one year after
the release of 4.7.0. The new release is the first to be implemented in C++, but for most
developers the new optimizations and language support improvements are
of greater interest. Jakub Jelinek announced the first release candidate
builds of GCC 4.8.0 on March 16, noting that if all goes well the
final release could land in less than a week's time.
Chunks, dwarfs, and other optimizations
The new release
merges in some important changes to the Graphite memory-optimization
framework, updating it to work with the upstream Chunky Loop Generator (CLooG) and Integer Set Library (ISL) libraries
(where it had previously used internal implementations), and
implementing the PLUTO algorithm as a
polyhedral optimizer. This work moves Graphite significantly closer
to being able to provide a generic polyhedral
interface, though there is still work remaining (such as Static
Control Part detection). Polyhedral loop optimization is a technique
in which nested loop iterations are mapped out in two or more dimensions (one per loop), forming
lattice-like graphs to which various geometric transformations (such
as skews) can be applied in an attempt to generate an equivalent
structure that exhibits better performance.
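To make the idea of a skew concrete, here is a minimal hand-written sketch; it is not Graphite output, and the function names (fill_original, fill_skewed) are invented for illustration. The first loop nest walks the iteration space row by row; the second walks the same points along anti-diagonals t = i + j. Every cell on one anti-diagonal depends only on cells from the previous one, so the inner loop of the skewed version carries no dependence.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using Grid = std::vector<std::vector<long>>;

// Original nest: each cell needs its upper and left neighbors, so both
// the i loop and the j loop carry a dependence.
Grid fill_original(std::size_t n) {
    Grid a(n, std::vector<long>(n, 1));
    for (std::size_t i = 1; i < n; ++i)
        for (std::size_t j = 1; j < n; ++j)
            a[i][j] = a[i - 1][j] + a[i][j - 1];
    return a;
}

// Skewed nest: the outer loop walks anti-diagonals t = i + j. Both inputs
// of a cell on diagonal t lie on diagonal t - 1, so the iterations of the
// inner loop are independent of one another.
Grid fill_skewed(std::size_t n) {
    Grid a(n, std::vector<long>(n, 1));
    if (n < 2) return a;  // nothing to compute, and avoids n - 1 underflow
    for (std::size_t t = 2; t <= 2 * (n - 1); ++t) {
        std::size_t lo = (t > n - 1) ? t - (n - 1) : 1;  // keeps j <= n - 1
        std::size_t hi = std::min(t - 1, n - 1);         // keeps j >= 1
        for (std::size_t i = lo; i <= hi; ++i) {
            std::size_t j = t - i;
            a[i][j] = a[i - 1][j] + a[i][j - 1];
        }
    }
    return a;
}
```

Both functions produce the same grid; only the order in which the points are visited differs, which is the kind of equivalence a polyhedral optimizer has to prove before it applies a transformation.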
There is a new general-purpose optimization level available in GCC
4.8.0 with the -Og switch, which should provide fast
compilation and a good debugging experience while still delivering better runtime performance than the "straightforward" -O0.
The -ftree-partial-pre switch has also been added, which
makes the partial
redundancy elimination (PRE) optimization more aggressive. PRE eliminates
expressions and values that are redundant in some execution paths,
even if they are not redundant in every path. In
addition, there is a new, more aggressive analysis used by default in
4.8.0 to determine upper bounds on the number of loop iterations. The
analysis relies on constraints imposed by language standards, but this
may cause problems for non-conforming programs which had worked
previously. Consequently, GCC has added a new
-fno-aggressive-loop-optimizations switch to turn off the new
analysis. Although breaking the constraints of the language standard
is frowned upon, there are some notable real-world examples that do
so, such as the SPEC CPU
2006 benchmarking suite.
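As a rough illustration of the kind of loop the new analysis can trip over, consider the sketch below. The non-conforming form appears only in a comment; the function name sum_abs and the 16-element array are assumptions made for this example, not code taken from any benchmark.

```cpp
#include <cstdlib>

// Non-conforming pattern (comment only, do not use):
//
//     for (dd = d[k = 0]; k < 16; dd = d[++k])
//         total += abs(dd);
//
// The increment expression reads d[16], one element past the end, before
// the k < 16 test stops the loop. A compiler that assumes no out-of-bounds
// access may conclude that k never reaches 16 and drop the exit test.

// Conforming rewrite: every index is checked before the element is read.
int sum_abs(const int (&d)[16]) {
    int total = 0;
    for (int k = 0; k < 16; ++k)
        total += std::abs(d[k]);
    return total;
}
```

Code that cannot be fixed right away can still be built with -fno-aggressive-loop-optimizations, but rewriting the loop is the more durable option.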
Several other improvements to optimization arrive in the new
release, including a rewritten link-time optimizer (LTO) and a new symbol
table implementation. Together they should improve performance by
catching more unusual symbol situations (such as aliases) that result
in unreachable code—which can be safely cut out by the LTO. GCC
has also updated its support for the DWARF debugging format from
DWARF2 to DWARF4,
which brings it up to speed with newer versions of GDB and Valgrind.
Two other new features debuting in GCC 4.8.0 are AddressSanitizer
and ThreadSanitizer.
The first is a memory-error detector that is reportedly fast at
finding dangling pointers as well as heap-, stack-, and global-buffer
overflows. The second is a data-race detector, which spots conditions
where two threads try to access the same memory location—and at
least one of them is attempting a write. ThreadSanitizer offers a hybrid
algorithm not found in competing race detectors like Helgrind. Both
new additions are actively being developed at Google.
Language support
The release
notes accompanying 4.8.0 highlight a number of improvements in C,
C++, and Fortran support. The C improvements are all of a diagnostic
nature, such as -Wsizeof-pointer-memaccess, which is a new
option to issue a warning when the length parameters passed to certain
string and memory functions are "suspicious"—namely when the
parameter uses sizeof foo in a situation where an explicit
length is more likely the intent. This option can also suggest
possible fixes.
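A small sketch of the mistake this warning targets follows. The Packet type and clear_packet function are invented for the example, and the suspicious call is left in a comment so that the code itself stays correct.

```cpp
#include <cstring>

struct Packet {
    char header[16];
    char payload[64];
};

// Suspicious (comment only):
//
//     std::memset(p, 0, sizeof p);
//
// Here sizeof p is the size of the pointer, typically 4 or 8 bytes, so
// only the start of the structure would be cleared.

// Intended: the length is the size of the object the pointer refers to.
void clear_packet(Packet* p) {
    std::memset(p, 0, sizeof *p);
}
```

The difference is a single asterisk, which is why a compiler diagnostic is more reliable here than code review.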
All diagnostic messages now print the offending source
line and place a caret (^) underneath the appropriate column, to
(hopefully) guide the eye right to the error in question. A
similarly debugging-friendly option that displays the macro expansion
stack in diagnostic messages (-ftrack-macro-expansion=2) is
now enabled by default. In addition, -pedantic has been
deprecated (in favor of -Wpedantic), and -Wshadow has
been fixed. -Wshadow now permits a common use case that certain
kernel developers have long complained was
erroneously flagged as invalid.
C++11 support has been improved, with the addition of the
thread_local keyword, C++11's attribute syntax, and
constructor inheritance. There is also a -std=c++1y flag which
allows developers to experiment with features proposed for the
next revision of the C++ standard (although at the moment GCC
only supports one proposed feature, return
type deduction for normal functions). The libstdc++ library now
provides improved experimental C++11 support as well, plus several improvements
to <random>.
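The three C++11 additions can be seen together in a short sketch. All of the names here (Base, Derived, fail, next_id) are made up for illustration; the code should build with -std=c++11 on a compiler that supports these features.

```cpp
#include <stdexcept>
#include <string>
#include <utility>

struct Base {
    explicit Base(int v) : value(v) {}
    Base(int v, std::string n) : value(v), name(std::move(n)) {}
    int value;
    std::string name;
};

// Constructor inheritance: Derived gains both Base constructors without
// having to repeat them.
struct Derived : Base {
    using Base::Base;
};

// C++11 attribute syntax: tells the compiler this function never returns
// normally.
[[noreturn]] void fail(const std::string& msg) {
    throw std::runtime_error(msg);
}

// thread_local: each thread gets its own counter, so no locking is needed.
int next_id() {
    thread_local int counter = 0;
    return ++counter;
}
```

With these definitions, Derived d(7, "x") constructs through the inherited two-argument constructor, and successive calls to next_id() on one thread return 1, 2, 3, and so on, independently of any other thread.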
Fortran fans have quite a bit to look forward to, including the
addition of the BACKTRACE
subroutine, support for expressing floating point numbers using "q" as
the exponential notation (e.g., 2.0q31), and Fortran 2003's unlimited
polymorphic variables, which allow dynamic typing. There are also
several new warning flags that can report (among other things) when
variables are not C interoperable, when a pointer may outlive its
target, and when an expression compares REAL and COMPLEX data for
equality or inequality.
However, GCC 4.8.0 will also introduce some potential compatibility
dangers: the ABI changes some internal names (for procedure pointers
and deferred-length character strings), and the version number of
module files (.mod) has been incremented. Recompiling any
modules should allow them to work with any code compiled using GCC
4.8.0.
Targets
Finally, GCC 4.8.0 will introduce quite a few improvements for the
various architecture targets supported. In ARM land, AArch64 support
is brand new, initially supporting just the Cortex-A53 and Cortex-A57
CPUs. The (separate) 32-bit ARM support has added initial support for
the AArch32 extensions in ARMv8. There is also improved support for
Cortex-A7 and Cortex-A15 processors, and initial support for the
Marvell PJ4 CPU. There are also improvements to auto-vectorization
and to the scheduler; the latter can now account for the number of
live registers available (potentially improving execution performance
for large functions).
In the x86 world, GCC gains support for the "Broadwell" processor
family from Intel, the "Steamroller" and "Jaguar" cores from AMD, as
well as several new Intel instruction sets. There are also two new
built-in functions: __builtin_cpu_is is designed to detect
the runtime CPU type, and __builtin_cpu_supports is designed to
detect if the CPU supports specified ISA features. GCC now supports
function
multiversioning for x86, in which one can create multiple versions
of a function—for example, with each one optimized for a
different class of processor.
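The sketch below shows one way __builtin_cpu_supports might be used for manual run-time dispatch. It does not use function multiversioning itself, which lets the compiler generate the dispatch automatically. The helper names (cpu_has_avx2, sum_generic, sum_wide, sum) are assumptions for this example, and sum_wide is only a stand-in for a routine tuned for a newer processor. The x86 check is wrapped in preprocessor guards so that the code still builds elsewhere.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Baseline version that is valid on any CPU.
long sum_generic(const std::vector<int>& v) {
    return std::accumulate(v.begin(), v.end(), 0L);
}

// Stand-in for a tuned version: four independent accumulators, a layout
// that a vectorizing compiler can map onto wide registers.
long sum_wide(const std::vector<int>& v) {
    long acc[4] = {0, 0, 0, 0};
    std::size_t i = 0;
    for (; i + 4 <= v.size(); i += 4)
        for (std::size_t lane = 0; lane < 4; ++lane)
            acc[lane] += v[i + lane];
    long total = acc[0] + acc[1] + acc[2] + acc[3];
    for (; i < v.size(); ++i)  // leftover elements
        total += v[i];
    return total;
}

// Ask the running CPU whether it supports AVX2 (x86 with GCC-compatible
// compilers only; every other target reports false).
bool cpu_has_avx2() {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("avx2") != 0;
#else
    return false;
#endif
}

// Pick an implementation at run time.
long sum(const std::vector<int>& v) {
    return cpu_has_avx2() ? sum_wide(v) : sum_generic(v);
}
```

Both paths return the same result; the point of the dispatch is only to choose the faster one for the machine at hand. __builtin_cpu_is can be used the same way when the decision depends on the processor model rather than on a single feature.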
But the less popular architectures get their share of attention as
well; support has been added for several new MIPS chips (R4700,
Broadcom XLP, and MIPS 34kn), IBM's zEC12 processor for System z, and
the Renesas Electronics V850. There is a lengthy set of improvements
for the SuperH architecture, including multiple new instructions
and improved integer arithmetic. There are also improvements for several
existing architectures: optimized instruction scheduling for SPARC's
Niagara4, miscellaneous new features for PowerPC chips running AIX,
and several features targeting AVR microcontrollers.
Over the years, GCC has maintained a steady pace of new stable
releases, which is especially noteworthy when one stops to consider
how many languages and target architectures it now supports. In
recent years, the project has still managed to introduce interesting
new features, such as the Graphite work. There is
still a long list of to-dos, but 4.8.0 is poised to be yet another
dependable release with its share of improvements covering a wide variety
of processors and language features.