
Not quite

Posted Mar 27, 2008 0:44 UTC (Thu) by ncm (subscriber, #165)
In reply to: whatever by elanthis
Parent article: Striking gold in binutils

As useful as I find C++, some of the above is not right.

There is no standard ABI for C++. G++ (in different versions) has two in common use, with a third coming soon; MSVC++ has others. (Other compilers tend to copy one or other of Gcc's or MSVC++'s, depending on target.) What is different now is that people have learned to include version numbers in the names of library files and library packages, so one rarely tries to link to a library built with the wrong compiler.

C++ code can be substantially faster than the best macro-obscurified C code, even without fancy template tricks. The reason is, believe it or don't, exceptions. Checking return status codes at each level in C (besides obscuring code logic!) is slower than leaving the stack to be unwound by compiler-generated code in the (unlikely) case of an exception.

Shitty programmers are more likely to code in C++ not because they're drawn to it, particularly, but because C++ is what everybody uses in Windows-land, and that's where most of them come from. That could be taken as a slight on typical Windows development habits, but it's really more a matter of the law of big numbers.

The only valid reason to consider C++ unsuitable for some particular "low-level" application is if the environment it must be linked/loaded into was built with a C compiler, and lacks the minimal support needed for, e.g., exception handling. An example is the Linux kernel. There's no reason Linux couldn't all be compiled and linked with G++ -- modulo some undisciplined use of C++ keywords as identifiers -- and then C++ drivers would be fine. However, it would be unwise to throw an exception in many contexts there.

Finally, the instability introduced in Gcc-4.x has a lot more to do with the optimizer than with changes to C++ or its implementation. That instability affected C programs (including the Linux kernel) as much as C++ programs.

None of these affect the conclusion, of course.


Not quite

Posted Mar 27, 2008 4:51 UTC (Thu) by wahern (subscriber, #37304) [Link]

Your theory about C++ exceptions being more performant than a comparable C pattern doesn't pan
out. It's similar to the argument the Java folk give: "Java *can* be faster, because you can do
code optimization on-the-fly".

The extra tooling that C++ must put into function prologs and epilogs--and which is mandated
by the various ABIs--for stack unwinding, as a practical matter, adds at least as much work,
and usually more. There are tables to index into--often from within another function which
must be called, and maybe using a pointer dereference. Any one of those can add up to several
register comparisons. I dunno how function inlining affects exception tooling, but I imagine
the relative losses only increase.

For the rare instance where you really need to fine tune a block or routine, both C and C++
suffice. I once shaved 20% runtime by changing a single line--loop to GCC built-in; it was in
C but would've applied equally to C++. In reality, C applications will be moderately faster.
But in most cases we're comparing apples to oranges because, for instance, many people prefer
exceptions. If they improve _your_ ability to engineer better solutions, and don't hinder
others, there is no other justification required. I don't understand why people try so hard to
prove that some feature "tastes better and is less fattening".

Imaginary losses

Posted Mar 27, 2008 6:16 UTC (Thu) by ncm (subscriber, #165) [Link]

Can you identify any of this "extra tooling" in assembly output from the compiler? Or are you just making it up? You can "imagine" all the "relative losses" you like, but that has nothing to do with the facts.

What is factual is that the extra code each programmer must insert in C code to return error codes, to check error codes, and to dispatch based on error codes compiles to actual instructions that must be executed on every function return. When errors are reported by exception, instead, none of those instructions are executed unless an error occurs. The difference has been measured as high as 15%. Now, 15% isn't very much in Moore's Law country, but it's not negligible. It's not a reason to choose one language over another, but it puts the lie to made-up claims that C++ code is slower than C.

Imaginary losses

Posted Mar 27, 2008 7:20 UTC (Thu) by alankila (guest, #47141) [Link]

I'm not sure what kind of code has been used to benchmark that, but assuming the C++ compiler
has to insert some low-level call such as malloc() into the generated code to handle the new
operator (or whatever), it will have to check the return code from malloc just the same as
the programmer using the C compiler.

In general, I suspect C code doesn't execute error paths a lot. In a malloc example there is
practically nothing to do but die if it fails. So you'd expect the C++ and C code to actually
perform pretty much the same instructions -- both would do the call, and both would test for
error, and in case of no error they move forward to the next user construct.

In case of error, the C program would do something the programmer wrote, the C++ would do
whatever magic is required to raise exception (hopefully without further memory allocations,
of course). After this point, things do diverge a lot, but I think in most cases there are no
errors to handle. 

Therefore, it would seem to me that both should perform identically, unless error returns are
a common, expected result, in which case you'd have to write dispatch logic in C to deal with
each error type (normally an O(log N) switch-case statement, I'd guess) while the C++ compiler
would probably generate code to figure out which exception handler should receive the
exception.

Somehow I do get the feeling that C should win in this comparison. After all, it's testing the
bits of one integer, while C++ has to test exception class hierarchies. In light of this, it
seems ludicrous to claim that C error handlers cost a lot of code that needs to be run all the
time, but somehow C++ exceptions are "free".

Imaginary losses

Posted Mar 27, 2008 7:56 UTC (Thu) by njs (guest, #40338) [Link]

malloc isn't the example to think of here, because yeah, usually you just abort. And the problem isn't that first if statement, where you detect the error in the first place. The problem is that in well-written C code, practically *every* function call has some sort of error checking wrapped around it, because errors in that function need to be detected and propagated back up the stack. It's the propagating that really hurts, because you have to do it with if statements, and if statements are expensive.

Compare C:

error_t foo() {
  char *blah;
  error_t e = bar(&blah);
  if (e)
    return e;
  e = baz();
  if (e)
    return e;
  /* ... */
}

versus C++:

void foo() {
  std::string blah = bar();
  /* ... */
}

One might think that the C++ code has "hidden" if statements; for old C++ compilers, that was true. Modern compilers, though, use Extreme Cleverness to avoid that sort of thing. (If you're curious for details, just g++ -S some simple programs and see.)

Imaginary losses

Posted Mar 27, 2008 20:40 UTC (Thu) by pphaneuf (subscriber, #23480) [Link]

You get a once-per-function setup and teardown that registers destructors (and is thus amortized better the longer the function gets). If there are no destructors, it can simply be left out. Even a "just crash" approach involves one test and branch per possible failure point.

On modern processors, branches are expensive, due to mis-prediction. I suspect that's one of the reasons profile-driven optimizers can be so good: they can figure out which side of a branch is more likely. In the case of error handling, which branch is more likely would be readily obvious to a human, but is harder for a compiler to see (hence the __builtin_expect annotations available in GCC, used by the Linux kernel's likely()/unlikely() macros).

The code size also increases with error-handling code, often with "dead spots" that get jumped over when there is no error, which on today's faster and faster machines means increased instruction cache usage, less locality and so on.

I don't doubt that when they happen, C++ exceptions might be more expensive, but the thing with exceptions is that they don't happen often, and thus that's the most interesting case.

Imaginary losses

Posted Mar 27, 2008 20:14 UTC (Thu) by wahern (subscriber, #37304) [Link]

Modern C++ exceptions might be conceptually zero-cost, but it is not less work than comparable C code. The difference is in how the stack is prepared to call the function. There is, evidently, a small fixed cost in every C++ function call which offsets the lack of a test+jump after the call. I admit I was unfamiliar w/ the details of modern exception handling, but I'm glad you forced my hand, because if anything we're cutting through some hyperbole.

Also, the error handling pattern in my C code doesn't duplicate as much code as the straw man examples posted here. I'm perfectly capable of using "goto" to jump to a common error handling block within a function, achieving something similar to the range table method of placing the error handling logic outside of the main execution flow. And I do this most of the time, because it just makes sense, and I get, IMO, better readability than exceptions, because there are fewer syntactic blocks to obscure my code. (I admit, that's highly subjective.)

Here's the example you requested. I used GCC--gcc version 4.0.1 (Apple Inc. build 5465)--with -O2 optimization. To compile: [cc|c++] -S -O2 -o ex.S ex.c -DCPLUSPLUS=[0|1]


#include <iostream>

void noargs(int i) {
        if (i > 1)
                throw i;

        return /* void */;
}

int main(int argc, char *argv[]) {
        try {
                noargs(argc);
        } catch (int e) {
                /* ... */
        }

        return 0;
}


#include <stdio.h>

int noargs(int i) {
        if (i > 1)
                return i;

        return 0;
}

int main(int argc, char *argv[]) {
        int e;

        if (0 != (e = noargs(argc))) {
                /* ... */
        }

        return 0;
}

Simple, straightforward code. Let us count the number of instructions from main() to our call to noargs(), and from the return from noargs() to leaving main().

C++ output:

.globl _main
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %esi
        subl    $20, %esp
        movl    8(%ebp), %eax
        movl    %eax, (%esp)
        call    __Z6noargsi
        addl    $20, %esp
        xorl    %eax, %eax
        popl    %esi
        popl    %ebp
        ret
On the "fast-path", we have 12 instructions for C++.

Now, plain C:

.globl _main
        pushl   %ebp
        movl    %esp, %ebp
        subl    $24, %esp
        movl    8(%ebp), %eax
        movl    %eax, (%esp)
        call    _noargs
        testl   %eax, %eax
        jne     L10
        xorl    %eax, %eax
        leave
        ret

And in C, we have... 11 instructions. Well, well! And I'm being charitable, because in fact there are additional instructions for noargs() which increase the disparity: 8 in C, 12 in C++. That makes the total count 19 to 24, but for simplicity's sake, I'm happy to keep things confined to the caller.

Explain to me how this is a poor example. I'm willing to entertain you, and I by no means believe that this little example is conclusive. But, it seems pretty telling to me. I admit, I'm surprised how close they are. Indeed, if anybody suggested to me that C++ exceptions introduced too much of a runtime cost, I'd set them straight. But if they looked me straight in the eye and told me unequivocally that they were faster, I'd show them the door.

Imaginary losses

Posted Mar 27, 2008 20:56 UTC (Thu) by pphaneuf (subscriber, #23480) [Link]

From my experience, the more common thing is not really try/catch, but letting the exception bubble up. Basically, you just want to clean up and tell your caller something went wrong.

We'll agree that if there is clean-up to do, it's probably equally there in C and in C++, right? The "big saving" in C++ is in the case where you just clean up and let the exception bubble up. If a function doesn't have any cleaning up to do, the unwinder doesn't even go into that function at all!

As they say, the fastest way to do something is to not do it.

Imaginary losses

Posted Mar 27, 2008 21:24 UTC (Thu) by wahern (subscriber, #37304) [Link]

Hmmm, good point. So, if you don't throw from an intermediate function, you compound the
savings.

Well... I guess I'll just call "uncle" at this point. I personally don't like exceptions,
specifically because in my experience letting errors "bubble up" usually means that much error
context is lost, and the programmer gets into the habit of not rigorously handling errors
(that's why, I guess, I didn't think about that pattern). But, in a discussion like this
that's inapplicable.

Imaginary losses

Posted Mar 27, 2008 22:15 UTC (Thu) by pphaneuf (subscriber, #23480) [Link]

My theory is that you do something about it where you can. If you can't think of something useful to work around the problem, then just let it bubble up, maybe someone who knows better will take care of it, and if not, it'll be the same as an assert.

That's clearly sensible in a lot of cases, because otherwise there would be no such thing as error statuses; everyone would just "handle the errors".

I also quite prefer the default failure mode, when a programmer fails to handle an error, to be a loud BANG rather than silently going forward...

Imaginary losses

Posted Mar 27, 2008 21:04 UTC (Thu) by wahern (subscriber, #37304) [Link]

I forgot to test multiple calls in the same try{} block. Indeed, for every additional
back-to-back call C needs an additional two instructions (test+jump). So, for moderately long
functions, w/ a single try{} block and lots of calls to some small set of functions, I can see
C++ being faster. The trick is that you don't want the fixed-costs to exceed the gains, of
course. In the above example, C++ pulls ahead at the 4th call to noargs().

It would be an interesting exercise to count the number of function definitions and function
calls in my code, and multiply by the respective differences between C and C++. But it seems
complicated by the treatment of blocks in C++. I can see how in some tests C++ came out 15%
ahead, though.

In any event, there is indeed a fixed cost to C++ exceptions. There might not be a prologue
cost, but the epilogue is invariably longer for functions and, apparently, for some blocks.

Not quite

Posted Mar 27, 2008 16:13 UTC (Thu) by BenHutchings (subscriber, #37955) [Link]

Most C++ implementations use range tables for exception handling today, so no extra code is
needed in the function prologue or the non-exception epilogue. The possibility of a callee
throwing can constrain optimisation of the caller, but so does explicit error checking.

Not quite

Posted Mar 27, 2008 20:35 UTC (Thu) by wahern (subscriber, #37304) [Link]

From my limited research, it seems the constraint is much greater in C++, because C++ must
preserve stack state (minimally, the mere existence of an activation record), whereas in C a
compiler can obliterate any evidence of a function call, no matter whether or how the return
value is used. Granted, I'm not aware of what kind of requirements the C++ standard mandates;
certainly I'd bet that in non-conforming mode a compiler could cheat in this respect. I'd like
to hear some analysis on this.

Inlining in general, though, is actually important, because in C one of the biggest fixed
costs you have to keep in mind is the function call. As shown in my example elsewhere in the
thread, there's comparatively quite a lot of work to maintain the stack. This is, of course, a
big deal in most other languages, too. If you've ever written much PerlXS (and peered behind
the macros), at some point it dawns on you how much work is being done to maintain the
stack--it's incredible! The fixed costs of maintaining call state in Perl dwarf most
everything else--excepting I/O or process control--including manipulation of dynamically typed
objects.

Not quite

Posted Mar 27, 2008 22:28 UTC (Thu) by ncm (subscriber, #165) [Link]

For the record, nothing about exceptions in the C++ standard or common implementation methods
interferes with inlining.  In practice, the body of an inlined function is just merged into
the body of whatever non-inline function it's expanded in.

The only place where exceptions interfere with optimization is in that the state of a function
context at a call site must be discoverable by the stack unwinder, so it can know which
objects' destructors have to run.  In practice this means that calls in short-circuited
expressions, e.g. "if (a() && b()) ...", sometimes also set a flag: "if (a() && ((_f=1),b()))
...".  This only happens if b() returns an object with a destructor, i.e. rarely.

Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds