LWN.net Logo

GCC Explorer - an interactive take on compilation

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 17:06 UTC (Thu) by dgm (subscriber, #49227)
In reply to: GCC Explorer - an interactive take on compilation by Richard_J_Neill
Parent article: GCC Explorer - an interactive take on compilation

One could argue that in the initial case gcc was even being "too clever". C is a language where there's no mandatory runtime. You can write perfectly valid programs that do not use the C run time library at all. So, assuming specific semantics for printf() is a bit... hairy.


(Log in to post comments)

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 17:29 UTC (Thu) by Richard_J_Neill (subscriber, #23093) [Link]

Well...surely if a function takes only constant arguments, and produces deterministic output, then the compiler ought to be able to "execute" that function at compile-time. In the trivial case:

int add (int a, int b){
return (a + b);
}

printf ("Result is %d\n", add (5, 6));

then I'd expect GCC to do what a human could, and optimise to:
puts ("Result is 11\n");

Experimentally, GCC is optimising out the add() call entirely, but it isn't optimising the printf. Why one, not the other?

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 19:48 UTC (Thu) by nybble41 (subscriber, #55106) [Link]

> Experimentally, GCC is optimising out the add() call entirely, but it isn't optimising the printf. Why one, not the other?

Well, for one thing, the definition of add() is visible in the same source file, while printf() is a library function which is only defined at runtime. For the optimization into puts(), the compiler is assuming you will link with a libc providing standards-compliant printf() and puts() functions.

I'd prefer a proper preprocessor capable of performing this sort of optimization, rather than hard-coding functions like printf() into the compiler. Common Lisp macros can work this way; Calls to (format) which are decipherable at compile-time can be reduced by a macro into a series of low-level output routines, skipping the format string entirely. Unfortunately, for that you need conditionals and loops or recursion; the C preprocessor isn't up to the task. One could possibly do something similar with C++ templates, which are apparently Turing-complete, but it would require more boredom and/or masochism than I'm ever likely to experience.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 19:58 UTC (Thu) by Richard_J_Neill (subscriber, #23093) [Link]

That makes some sense as to why it is this way - but still doesn't really explain why it can't be better. Assuming that we do a one-step compile (i.e. gcc knows what's being linked), then the compiler should be able to do better. For that matter, printf() isn't even explicitly linked; it's just defined by "#include stdio.h", without a corresponding "-lstdio".

Runtime linking is a wonderful thing, that reduces the size of code, and allows for shared libraries without dependency hell. But I still contend that if I ask the compiler to do -O3, then it should be able to apply the combined results of thousands of man-years of experience to generate better code than I could. So, in this instance, why not have gcc inline the printf() function, and then optimise it.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 20:27 UTC (Thu) by khim (subscriber, #9252) [Link]

But I still contend that if I ask the compiler to do -O3, then it should be able to apply the combined results of thousands of man-years of experience to generate better code than I could.

Sorry, but no. When human spends a week or month to generate code this is accepted. When compiler spends measly 1.5 hours people are becoming mightly antsy.

So, in this instance, why not have gcc inline the printf() function, and then optimise it.

Try to take a look on printf implementation one of these days. That's... quite an experience. Computed gotos and function pointers, bazillion tables and locales, I'm not sure even human can optimize it to a single string. No, I take it back: I know human can't do that because it's not possible. Output of %d depends on locale!

Compiler is clever beast, but there are limits for what it can do.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 20:49 UTC (Thu) by Richard_J_Neill (subscriber, #23093) [Link]

Wow - that's a really complicated beast! But still, the computer is capable of executing the compiled printf function. Forgetting for a moment the problem of locale, then if I write:
printf ("This is: %d\n", 42);
the computer will, at runtime, eventually have the string
"This is: 42\n"
in a buffer somewhere, which then gets sent to stdout.
So, why shouldn't that runtime execution be anticipated at compiletime?
GCC doesn't have to understand printf(), it simply needs to invoke it.

In fact, the preprocessor can do it (via stringify), but that requires significant extra work for the programmer.

[Of course, your point about locales may explain why this doesn't happen... I might never care about my program being run in locations with a different decimal format, nor wish to sacrifice performance for that, but it means that the presumed-deterministic output of printf() is actually dependent on the environment variables.]

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 8:48 UTC (Fri) by ekj (guest, #1524) [Link]

It's not enough to invoke it and notice that it returns "blablah 42\n". To be able to optimise it away, gcc would have to be certain that it -always- returns precisely that. (not just that it did, this time)

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 15:20 UTC (Fri) by jimparis (subscriber, #38647) [Link]

Even more to the point, printf doesn't actually return "blablah 42\n" at all -- it writes it to stdout as a side-effect. So you can't even say "simulate execution and see the result" without knowing that a particular small piece of inline assembly deep inside the libc stdio that was called near the end of printf() execution happens to trigger kernel magic. It's really just not feasible to do this type of optimization in the general case. And it's just impossible in this specific case, as mentioned by khim.

GCC Explorer - an interactive take on compilation

Posted May 28, 2012 13:59 UTC (Mon) by etienne (subscriber, #25256) [Link]

> So, why shouldn't that runtime execution be anticipated at compiletime?

IMHO C++ pushes a lot of things from the .data section (i.e. initialised variables/structures) into runtime execution/initialisation (object constructors), when compared to C.
Moreover some obvious optimisations cannot be done because objects "constant during execution" are still not initialised at compilation time, so compiler cannot do anything about them.
It is as if we would need another step in producing an executable, after linker stage we would run all object constructors which are independant of any input, and copy the resulting memory dump into the ELF file.
Note that my last sentense involves so much complexity that I sometimes better like C, like on this software where I have a grand total of 96 Kbytes for everything, on a 32 bits processor...

GCC Explorer - an interactive take on compilation

Posted May 28, 2012 14:28 UTC (Mon) by jwakely (subscriber, #60262) [Link]

IMHO C++ pushes a lot of things from the .data section (i.e. initialised variables/structures) into runtime execution/initialisation (object constructors), when compared to C.
Do you have anything more convincing than opinion? IMHO C++ puts the same things in .data, when compared to C.

C++ allows you to initialize globals in more ways than C, and those things might need dynamic initialization, but they are things that must also be dynamically initialized in C. Unless you know some way to statically execute an object's constructor in C that can't be done in C++?

Moreover some obvious optimisations cannot be done because objects "constant during execution" are still not initialised at compilation time, so compiler cannot do anything about them.

Have you looked at the C++11 keyword constexpr? It's specifically designed to allow such initialization.

struct Data {
  constexpr Data(int i, int j) : i(i), j(j) { }
  int i;
  int j;
};
constexpr Data data(1, 2);  // statically initialized
This example is clearly pointless (an array of two ints would serve in this case, and be statically initialized in C or C++03) but there are many more things you can do with constexpr.

GCC Explorer - an interactive take on compilation

Posted May 28, 2012 15:56 UTC (Mon) by etienne (subscriber, #25256) [Link]

> IMHO C++ puts the same things in .data, when compared to C.

I do agree that given the same source, you get the same thing in .data.
But when I read other people source code, because they work and think in C++, they will use classes instead of structure/base-types for the most basic things, and define virtual desctructors just in case they need it at a later time, and work with "auto ptr" to manage the list of 80000 parameters in a sorted collection.
I know the C way to manage the parameter "loglevel" as an unsigned integer for debug and as a constant for costumer versions is not as flexible as the C++ way, that because the user is not using a "setter" checking for min/max values he may write incorrect values there, but sometimes I am thinking simpler is better (mostly for such debug parameter).
In short I am not thinking the problem is in the toolchain, but in the mentality of using features just because they exists, of generalising small problems until nothing is constant anymore. For instance writing the line (without even the const keyword) inside or outside a class:
unsigned int bits_per_byte = 8;

That was my humble opinion, I know there is some clever ways those things can be sorted in C++, I have no hope to see this 60 Mbytes source code software cleaned or booting quicker (because everything is constructed at boot).

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 6:59 UTC (Fri) by gughur (guest, #83256) [Link]

That's a good example where a JIT could significantly outperform
a traditional compiler. It could figure out that printf is
being called with the same locale and same format string
all the time and generate special case fast path code for that.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 10:50 UTC (Fri) by dgm (subscriber, #49227) [Link]

Except that it is not. The locale can change at anytime, so the jitter has to include a test for locale before the "fast" path. Pooof!, your improvement just (almost) went away and you added a whole lot of complexity to the machinery.

Beware of silver bullets.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 11:50 UTC (Fri) by gioele (subscriber, #61675) [Link]

> Except that it is not. The locale can change at anytime, so the jitter has to include a test for locale before the "fast" path. Pooof!, your improvement just (almost) went away and you added a whole lot of complexity to the machinery.

Except that it is still possible in most cases. For example the JRuby JIT + Java JIT will optimise various paths and functions and keep these optimised versions around until the "environment" changes. In that case it stops the world, throws away the old compiled units and create new ones taking into account the current situation. In practical terms this means that using `eval` works flawlessly but kills your performance.

GCC Explorer - an interactive take on compilation

Posted Jun 13, 2012 8:15 UTC (Wed) by massimiliano (subscriber, #3048) [Link]

The V8 Javascript VM does the same...

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 21:20 UTC (Thu) by gutschke (subscriber, #27910) [Link]

It is relatively easy for GCC and glibc to come up with a new extension. This extension would allow the compiler to cleanly detect whenever it is compiling code that eventually executes in a standard run-time environment.

All we need is a new function attribute to annotate the declaration of the printf() function. For example:

extern int printf(const char * restrict format, ...)
  __attribute__((__format__(__printf__, 1, 2),
                 __runtime_function__(__printf__)));

I do not believe, this is how it is done today. But it would be easy to add to both GCC and glibc. It would allow the compiler to do optimizations, whenever it finds a suitable function declaration in <stdio.h>. But if it builds for a non-standard run-time environment, it would automatically skip doing these optimizations.

Of course, that raises the question whether it should ever try to do these optimizations in the first place. And arguably, the answer is more difficult than a simple "yes" or "no".

For example, it is quite common to have code that uses the same format string multiple times just with different arguments. If GCC always expanded the arguments, it would lose the ability to coalesce the identical format strings. For some programs, the extra space requirements in the .data segment could very well be more painful than the minute cost of computing the output at run-time. In fact, I have a hard time envisioning any program where the cost of printf() is noticeable at all.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 23:32 UTC (Thu) by Richard_J_Neill (subscriber, #23093) [Link]

> In fact, I have a hard time envisioning any program where the cost of
> printf() is noticeable at all.

I have one. I'm doing data-capture and pre-processing on a relatively slow system (mini-itx, to keep the heat down so as not to cause optical turbulence in the lab). We acquire data at about 1 MHz on 4 channels, and need to do some simple preprocessing (offset and scale) then save to file. The file is a 4-column text-file as asciihex eg "0x1234 0x4567 0x3432 0x6666\n", i.e format string is "0x%x\t0x%x\t0x%x\t0x%x\n". So the printf cost is definitely not negligible, when we call it 1 million times a second.

(Yes, we could optimise further and keep the data in raw binary form, but it's not strictly necessary yet, would prevent us adding headers, and would make for a non-human-friendly file format. But I'd certainly welcome a fast printf for the simple cases)

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 11:13 UTC (Fri) by dgm (subscriber, #49227) [Link]

The think is, the solution in this case is simply to write your own formating function. It's really simple, much simpler than the kind of optimizations you were asking for.

An example:

char * to_hex(unsigned v){
    static char digits[16] = "0123456789ABCDEF";
    static char buffer[8]; /* assuming 32 bits */
    int i;
    for (i = 7; i >= 0; i--) {
        buffer[i] = digits[v & 0xF];
        v >>= 4;
    }
    return buffer;
}

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 14:04 UTC (Fri) by jwakely (subscriber, #60262) [Link]

But do you call printf with compile-time constants a million times a second?

GCC Explorer - an interactive take on compilation

Posted Jul 3, 2012 3:32 UTC (Tue) by Richard_J_Neill (subscriber, #23093) [Link]

Much more significantly, what about an embedded system. On an Arduino, the runtime and storage costs of printf are severe. (typically ~3% of ROM)

So for example, I want to write:

#define OSC_MHZ 25
#define CLOCK_NS (1000/OSC_MHZ)
#define STRINGIGFY(s) STRINGIFY2(s)
#define STRINGIFY2(s) #s

Serial.print ("Clock period is " STRINGIFY(CLOCK_NS) "ns\n");

...which should print: "Clock period is 40 ns\n"
but actually prints: "Clock period is (1000/25 ns\n")

This sort of thing can't be done in the preprocessor and it's too expensive to do at runtime. The only workaround seems to be to actually manually: #define CLOCK_NS_STR "40"
which is a bug just waiting to happen.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 13:34 UTC (Fri) by vonbrand (subscriber, #4458) [Link]

It is almost exactly how gcc handles printf-like functions... from the gcc info: extern int my_printf(void *my_object, const char *my_format, ...) __attribute__((format(printf, 2, 3))); tells the compiler this function is printf-like. You can also use the attribute const on a function to express that it only uses the values of the arguments to compute its return value, so the compiler knows it can stash away the value and not recompute it.

So yes, the compiler is pretty smart, and the relevant C standard also says that standard functions can't be just overriden by the user. That is how printf("Hello, world!\n"); gets replaced (legally) by puts("Hello, world!\n");. Sure, there are lots of other changes the compiler might legally make, but at some point the extra cost in compiling outweighs the benefits; and some "optimizations" might on average turn out costlier in runtime than just leaving the code as is. As for the next suggestion, "leave it to the programmer via some attribute/special pragma," it has been shown time and again (Jon Bentley's "Writing efficient programs," sadly out of print, has some delightful stories) that programmers are terrible at predicting how to optimize code.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 13:56 UTC (Fri) by jwakely (subscriber, #60262) [Link]

> I'd expect GCC to do what a human could

If performance actually matters (and isn't a pointless microbenchmark) then I'd expect the human to write puts("11") in the first place instead of printf("%d\n", add(5, 6))

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 18:07 UTC (Thu) by njs (guest, #40338) [Link]

Yes, but GCC knows whether you're using the C run time library. All these magic optimizations are dependent on which C standard you've asked it to conform to, whether you're running in strict mode, etc.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 23:36 UTC (Thu) by daglwn (subscriber, #65432) [Link]

It does not know whether you LD_PRELOADed something before execution.

prinf optimization is inherently unsafe.

GCC Explorer - an interactive take on compilation

Posted May 24, 2012 23:49 UTC (Thu) by njs (guest, #40338) [Link]

If the person writing the code was anticipating someone LD_PRELOADing a printf replacement, then yeah, they should use -fno-builtin-printf. If they weren't, then they might have just as well written puts() themselves anyway. And for that matter, it'd be totally legal for libc to make printf an inlineable function.

Not perfectly supporting an inherently hacky interface like LD_PRELOAD is really really different from being "inherently unsafe". By that measure I'm pretty sure everything about C is inherently unsafe.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 2:59 UTC (Fri) by daglwn (subscriber, #65432) [Link]

By "inherently unsafe," I mean the compiler would arguably make a transformation that causes wrong results. That's certainly not what the programmer expects.

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 12:47 UTC (Fri) by njs (guest, #40338) [Link]

printf's behavior is defined in the C standard, just like the rest of the language. Any optimization the compiler does to anything, ever, might arguably make a transformation that causes wrong results. Such things are just called "compiler bugs"...

GCC Explorer - an interactive take on compilation

Posted May 25, 2012 14:25 UTC (Fri) by BenHutchings (subscriber, #37955) [Link]

No, the use of a LD_PRELOAD library which wraps printf() and not puts() is the invalid transformation.

validity of optimizing printf

Posted May 25, 2012 5:26 UTC (Fri) by pjm (subscriber, #2080) [Link]

> C is a language where there's no mandatory runtime.

The C language (and gcc) make a distinction between a freestanding environment (where there is actually a small mandatory runtime component, but it doesn't include printf or puts) and a hosted one (which does require the language-specified behaviour of printf and puts and the rest of the standard library). gcc by default compiles in hosted mode, where it's valid (conforming) to apply optimizations to standard library functions like printf, strlen etc.

gcc also supports a -ffreestanding option for freestanding environments, in which case it won't make assumptions about the meaning of an external function that happens to be called printf or strlen.

(If you're writing a kernel that includes a normal strlen function, then you might compile with -ffreestanding but #define strlen(s_) __builtin_strlen(s_), so that you still get strlen optimizations without any assumptions made about functions that happen to be named printf or the like.)

See also http://gcc.gnu.org/onlinedocs/gcc/Standards.html, and the documentation of -fno-builtins and related options at http://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html .

validity of optimizing printf

Posted May 25, 2012 11:21 UTC (Fri) by dgm (subscriber, #49227) [Link]

The C standard requires that gcc provide a minimum set of libraries to call itself a "conforming freestanding implementation", but programmers are not required to use them.

GCC Explorer - an interactive take on compilation

Posted May 30, 2012 5:04 UTC (Wed) by jzbiciak (✭ supporter ✭, #5246) [Link]

GCC (and C compilers in general) can safely assume the existence of the standard library, since it forms part of the environment that strictly conforming programs must adhere to. See page 19:


5 A strictly conforming program shall use only those features of the language and library specified in this International Standard. It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.

6 The two forms of conforming implementation are hosted and freestanding. A conforming hosted implementation shall accept any strictly conforming program. A conforming freestanding implementation shall accept any strictly conforming program that does not use complex types and in which the use of the features specified in the library clause (clause 7) is confined to the contents of the standard headers <float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>. A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.

So, even in a freestanding implementation, there is some minimal amount of library implied (to the extent that the headers listed need under-the-hood support), although the vast majority of it disappears. In the more common hosted implementation (which nearly all GCC targets are, including the one that GCC Explorer compiles for), you've got the full standard C library there.

Nothing stops a conforming hosted compiler implementation from replacing calls to printf with something else (including inlined code), if the behavior of the result is indistinguishable to a correct program. Now, if you were expecting your own implementation of printf() to get called, that's non-standard.

Now GCC does offer the -ffreestanding flag, which would disable the sort of optimization shown up-thread. But in a hosted environment, I see absolutely nothing wrong with said optimization.

On a different note: As I recall, these sorts of printf() optimizations are among the optimizations that NullStone tests for.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds