"Dependent on" ???

Posted Feb 27, 2007 23:38 UTC (Tue) by nix (subscriber, #2304)
In reply to: "Dependent on" ??? by eklitzke
Parent article: Mitchell Baker and the Firefox Paradox (Inc)

> GCC. This is pretty much your only choice if you want a C/C++ compiler. Many years ago GCC was a healthy project with lots of progress and competition (remember EGCS?). Nowadays my impression is that there are only a handful of developers working on GCC.

I can't really imagine how you could get that impression. I've been watching the GCC lists since maybe 1998, and while people have left or hugely cut down their development effort (Zack Weinberg, Matt Austern...) other developers have piled in. A lot of them. Not random newbies, either: more years ago than I care to remember, Ken Zadeck invented the SSA representation that GCC is now using.

A quick grep with a vile lashup of shellery reveals that, in 1998 and 1999 combined, a rough total of 325-odd people appeared in the GCC changelogs. In 2006 *alone* the same number appear (actually 326). All these figures are thrown off by typos, UTF-8 canonicalization and Romanization differences, and so on, so it's hard to say what the actual numbers are, or how many people have arrived and left the project: but numbers of developers are definitely not declining, unless the people who are left are becoming much less capable of spelling their own names.

If we err in the other direction by ignoring unique names that appear less than ten times in the changelogs (some people with non-ASCII characters getting short shrift in the process), we get a total of 89 heavy contributors who committed at least ten changes in 1998/1999 (66 contributed in 1999 alone), and 121 who did the same in 2006. That's a rough doubling of the `heavy developer' base, and comm(1) shows only 25-odd heavy developer names in common. (Some surprising people get left out of the 1999 list, though, like Jeff Law, who goes through too many variations on his name for my ugly script to spot, with each iteration getting less than ten commits...)

(Vile scriptery available on request but it has raw tabs in it so I'm not sure I can paste it in here. Jon's git-grinder is doubtless much nicer.)
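
The counting described above can be approximated along these lines. This is a minimal Python sketch, not the script mentioned in the comment: it assumes GNU-style ChangeLog entry headers of the form `YYYY-MM-DD  Name  <email>`, and the names `count_entries` and `heavy` are made up for illustration. Like the grep described above, it makes no attempt to merge spelling variants of one person's name.

```python
import re
from collections import Counter

# Assumed header format: "2006-03-14  Jane Hacker  <jane@example.org>".
# Older entries that use "Tue Mar 14 12:00:00 1998"-style dates are NOT
# matched by this pattern and would need a second expression.
HEADER = re.compile(r"^(\d{4})-\d{2}-\d{2}\s+(.+?)\s+<[^>]*>\s*$")


def count_entries(lines, years):
    """Count ChangeLog entry headers per contributor name for the given years."""
    counts = Counter()
    for line in lines:
        m = HEADER.match(line)
        if m and int(m.group(1)) in years:
            # Collapse runs of whitespace inside the name; spelling
            # variants of the same person are still counted separately.
            name = " ".join(m.group(2).split())
            counts[name] += 1
    return counts


def heavy(counts, threshold=10):
    """Names with at least `threshold` entries (the 'heavy contributors')."""
    return {name for name, n in counts.items() if n >= threshold}
```

Given two such counts, say `old` for 1998/1999 and `new` for 2006, the overlap that comm(1) reports corresponds to the set intersection `heavy(old) & heavy(new)`.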

> The codebase is big and complicated, and not a lot of people have the expertise required to contribute. As a result, most of the changes to GCC in the past few years have been to make it more ANSI compliant,

Oh, boy, you're out of date. The huge push toward (mainly C++) standards-compliance was a 3.x thing. GCC 4.x added an entire new intermediate representation (GENERIC/GIMPLE) and dozens of optimization passes, on a much firmer theoretical base and more pleasant representation than the ad-hoc RTL stuff that was the only choice for non-frontend optimizations before that. Before this change, it was horribly difficult to pick algorithms out of the literature and use them in GCC. Afterwards, well, it's still effort, but it's at least an order of magnitude less.

> and there have been relatively few new features or speedups. In fact, compiling code takes longer than it used to.

You're going to get a nice surprise soon, then, because the tree-ssa optimizers have finally reached the stage where some of the horrible slow old RTL optimizers are being removed or drastically simplified, as the much faster tree-ssa optimizers can supplant them entirely, and take much less time to do much more work...

As for new features? Let's see, we have gfortran in GCC 4.0, a radically improved Java implementation in GCC 4.3 (thanks to using ecj as the frontend), and OpenMP in GCC 4.2... Or do new features not count unless they're in the C frontend? (Oh, btw, the C frontend had its parser completely rewritten by Joseph Myers in GCC 4.0.)

You probably won't see huge improvements in speed on targets like i386 without someone rewriting the register allocator and reload pass, probably the most hairy thing left and a creature of a different era. This requires vast fortitude and an immense ability to pick unstated assumptions out of pretty much every .md file in GCC. Several attempts have been made: none have yet succeeded.

I'd say the pace of development in GCC has never been higher. If anything, it's so high that better tools are needed: patches get lost unreviewed in the swirl of traffic on gcc-patches too often...


"Dependent on" ???

Posted Feb 27, 2007 23:40 UTC (Tue) by nix (subscriber, #2304)

> Posted Feb 27, 2007 17:38 UTC (Tue)

I'm not sure I trust that timestamp. It's 23:35:24 here in the UK, and
last time I looked out of my window it was winter so we're on UTC right
now.

Jon, what's wrong with the timestamps? Is it showing
the-timezone-west-of-New-York time and calling it UTC or something?

"Dependent on" ???

Posted Mar 6, 2007 23:34 UTC (Tue) by roelofs (guest, #2599)

> Jon, what's wrong with the timestamps? Is it showing the-timezone-west-of-New-York time and calling it UTC or something?

Michigan time, presumably. Six hours would be US/Central. (Fixed at some point, in any case.)

Greg

"Dependent on" ???

Posted Mar 7, 2007 21:49 UTC (Wed) by nix (subscriber, #2304)

Ah. The C stands for `Corbet'. ;)

(I suppose it *is* universal, too. Wherever you are, the time where Jon is
does not change.)

"Dependent on" ???

Posted Feb 27, 2007 23:47 UTC (Tue) by nix (subscriber, #2304)

(Bah. Typing faster than brain, SSA isn't a representation in that sense.
Still anyone who knows what SSA is will know what I mean.)

"Dependent on" ???

Posted Feb 28, 2007 12:19 UTC (Wed) by k8to (subscriber, #15413)

Thanks for all that, a fascinating read.

Here is an unfair nitpick:

> You probably won't see huge improvements in speed on targets like
> i386 without someone rewriting the register allocator and reload
> pass, probably the most hairy thing left and a creature of a
> different era.

Indeed, I'm busy noticing how much _slower_ gcc has gotten in generated-code speed on i386 over the last few cycles. I'm definitely behind the times here: I recently bothered to run a bunch of cross-release, cross-feature timings for various performance-sensitive code when I got a new amd64 box. The surprise was that gcc 3.x generated code was much faster (5-20%) across the board. Not what I was expecting at all.

"Dependent on" ???

Posted Feb 28, 2007 14:28 UTC (Wed) by nix (subscriber, #2304)

I'd expect this with -O3, as inlining tends to increase register pressure.

I've noticed that quite a few of the recent optimizations tend to do this on register-poor targets, actually; the problem seems to be that the optimizers have got no way of telling if they're causing excessive stack spills because allocation of non-pseudo-registers doesn't happen until right at the end (and that happens to RTL, far lower-level than the GIMPLE that most optimizers chew on). This is also waiting on some heroic figure rewriting the register allocator so that earlier passes can get decent feedback on whether they're going to spill to hell and back or not (and then the earlier passes would have to get updated to use this information...)

(A good few passes are turned off at -O<3 for this reason.)

There's still a lot to do: GCC is far from perfect: a lot of the problems it tries to solve (like, well, register allocation) are NP-complete anyway, but it could definitely do a better job than it does.

But it hasn't been stagnating.

(Again, I'm just an observer; if I'm spraying ignorant rubbish someone who does actual useful work like Joe Buck should point it out. :) )
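
To make the register-pressure point concrete, here is a toy Python sketch. It is not how GCC's allocator or reload works, and the live ranges are invented for illustration: each value is live over a half-open interval of instruction positions, values whose intervals overlap interfere, and a greedy pass hands out `k` registers and spills whatever does not fit. Lengthening the live ranges (roughly what aggressive inlining can do) makes more values live at once and forces spills on a small register file.

```python
def interference(live_ranges):
    """Build an interference graph from {name: (start, end)}, end exclusive."""
    graph = {v: set() for v in live_ranges}
    names = list(live_ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            (s1, e1), (s2, e2) = live_ranges[a], live_ranges[b]
            if s1 < e2 and s2 < e1:  # the two intervals overlap
                graph[a].add(b)
                graph[b].add(a)
    return graph


def allocate(live_ranges, k):
    """Greedily assign k registers; return (assignment, spilled)."""
    graph = interference(live_ranges)
    assignment, spilled = {}, []
    # Visit values in order of where their live range starts.
    for v in sorted(live_ranges, key=lambda n: live_ranges[n]):
        taken = {assignment[n] for n in graph[v] if n in assignment}
        free = [r for r in range(k) if r not in taken]
        if free:
            assignment[v] = free[0]
        else:
            spilled.append(v)  # no register left: this value goes to the stack
    return assignment, spilled


# Long, overlapping live ranges: four values are live at the same time.
wide = {"a": (0, 4), "b": (1, 5), "c": (2, 6), "d": (3, 7)}
# Short live ranges: at most two values are live at any point.
narrow = {"a": (0, 2), "b": (1, 3), "c": (2, 4), "d": (3, 5)}
```

With three registers, `allocate(wide, 3)` has to spill one value while `allocate(narrow, 2)` fits in two. A pass that lengthens live ranges without seeing this kind of information has no way to know it has just caused a spill.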

"Dependent on" ???

Posted Feb 28, 2007 15:41 UTC (Wed) by k8to (subscriber, #15413)

It also seemed true with -O2 and with no -O flags at all. It was also true on amd64, which is not so register-poor.

"Dependent on" ???

Posted Feb 28, 2007 17:50 UTC (Wed) by nix (subscriber, #2304)

With no -O flags at all I'd expect *huge* register pressure problems at all times! With -O2, well, even there the pressure is building up (although the problem is much less severe than with -O3).

I'm surprised to find problems on amd64. Raise a bug, maybe?

"Dependent on" ???

Posted Feb 28, 2007 17:57 UTC (Wed) by k8to (subscriber, #15413)

I suspect they know. It's getting better across 4.x as a whole. Also I'm really not excited about filing a bug "these 5 big applications that I don't really understand all do much better on gcc 3.x on i386 and amd64. Here's your several hundred thousand line testcase."

If I was the developer of them maybe, or if they were more manageably sized, or if I hadn't read through other bug entries where it's discussed that there are (supposedly) no automated performance regression tests at all.

I'm certainly not really intending my observations as complaints; it's just a matter-of-fact observation. I wish it wasn't so but it is, and I suspect half-assed bugs will only cost the project.

"Dependent on" ???

Posted Feb 28, 2007 22:13 UTC (Wed) by massimiliano (subscriber, #3048)

Want to do a big favor to the gcc developers?

Profile those applications, and find the hot spots that get significantly worse. Then, write small benchmarks with the same code, and test them to see that the slowdown is still there.

Finally, open the bug with the small benchmarks :-)

And if you do it, remember to send the samples to me as well (massi(at)ximian"dot"com)! I work on the Mono JIT, and am generally interested in performance tests...

"Dependent on" ???

Posted Feb 28, 2007 23:20 UTC (Wed) by nix (subscriber, #2304)

Hell, if they're free software at least say what they are so someone else
with more machine time than sense and heaps of old compiler versions
scattered around can do the profiling :)

"Dependent on" ???

Posted Mar 1, 2007 12:09 UTC (Thu) by k8to (subscriber, #15413)

Off the top, the n-queens "benchmark" shows a significant drop from 3.x to 4.x. http://www.arch.cs.titech.ac.jp/~kise/nq/index.htm

A larger one I care more about is UAE, the Amiga emulator, without JIT: both mainline and E-UAE from rcdrummond.net.

"Dependent on" ???

Posted Mar 9, 2007 19:49 UTC (Fri) by anton (guest, #25547)

> Want to do a big favor to the gcc developers?
>
> Profile those applications, and find the hot spots that get significantly worse. Then, write small benchmarks with the same code, and test them to see that the slowdown is still there.
>
> Finally, open the bug with the small benchmarks :-)

And see it closed as invalid after an hour (less time than it took to create the bug report).

Given this reaction, I don't think the gcc developers consider such bug reports as favors, and it certainly has not inspired me to report other gcc bugs.

"Dependent on" ???

Posted Mar 9, 2007 20:52 UTC (Fri) by massimiliano (subscriber, #3048)

> And see it closed as invalid after an hour (less time than it took to create the bug report).

Interesting discussion in that bug :-)

Anyway, I wrote my suggestion without knowing how gcc development actually works... I would welcome bugs like that opened for the Mono JIT!
At worst, I would mark it as "Wishlist", and keep it open that way...

"Dependent on" ???

Posted Mar 10, 2007 1:42 UTC (Sat) by nix (subscriber, #2304)

It seems that fixing this would be ridiculously difficult and not terribly
beneficial (how many programs have you seen that use computed goto? How
much effort is it worth going to to speed it up?)

I'd have kept it open, too, but there is so much low-hanging optimization
fruit in GCC that fixing fairly small one-platform optimization bugs in
rarely-used language extensions isn't going to get much attention at the
best of times.

GCC development actually works in much the same way everything else works
in the free software community: if you have a performance bug in a tiny
obscure feature it's probably not going to get fixed unless you fix it. A
noninvasive patch would probably have made it...

"Dependent on" ???

Posted Apr 7, 2007 20:29 UTC (Sat) by anton (guest, #25547)

>how many programs have you seen that use computed goto? How
>much effort is it worth going to to speed it up?

Many interpreters are using labels-as-values for a significant
speedup. And a huge number of programs use interpreters.

>one-platform optimization bugs

As far as I understand the reply to the bug report, this bug affects
every platform that does not have a conditional indirect branch, i.e.,
pretty much every platform except PPC and IA64.

>A noninvasive patch would probably have made it...

They consider the bug "invalid", so they probably would not have
accepted the patch, and I am glad that I did not waste my time on
trying to build one.

"Dependent on" ???

Posted Feb 28, 2007 22:30 UTC (Wed) by massimiliano (subscriber, #3048)

> This is also waiting on some heroic figure rewriting the register allocator so that earlier passes can get decent feedback on whether they're going to spill to hell and back or not (and then the earlier passes would have to get updated to use this information...)

How funny! This is exactly what I want to do in the Mono JIT :-)
Here's a link to a presentation about our medium-long term JIT plans: "http://www.go-mono.com/meeting06/MonoSummit2006-JIT.pdf".

Actually we will not go fully SSA so fast, and maybe never. But we will rewrite the register allocator, and in the SSA path I will make sure that the communication between the optimization passes and the regalloc will happen effectively!

If you ever want to "chat" about compiler internals, feel free to drop me a mail (massi(at)ximian"dot"com).

"Dependent on" ???

Posted Feb 28, 2007 23:19 UTC (Wed) by nix (subscriber, #2304)

Interesting stuff. Of course the Mono JIT is operating under more constraints than GCC in some respects (`compilation' must be *fast*) but it's more amenable to rewrites because you don't have to target a myriad decade-old backends rife with unstated assumptions and with widely-varying constraints on (e.g.) the register file.

Zack Weinberg put it well in an IRC conversation (later reprinted in the acknowledgements to _A Maintenance Programmer's View of GCC_ in the 2003 GCC Summit proceedings):

> Take an H. R. Giger painting, you know, with the perverse and insanely complicated biomechanical constructs.
>
> Now, instead of being all shiny and new, make it old and overgrown with weeds. Slimy weeds.

Much of GCC is better than that these days (thanks to tree-ssa obsoleting many of the nasty parts, and a lot of effort to clean up the problems Zack identified in that paper), but some parts (notably reload and the rest of the register allocator, and combine) are still deep in the slime.

The first part of Vlad Makarov's _Fighting register pressure in GCC_ (in the 2004 summit proceedings) has a good description of how the current allocator works, and the extreme constraints on changing it. That attempt at rewriting the allocator ran into the slimy weeds and got all tangled up; virtually every subsequent summit proceedings has the skeleton of another attempt in it.

Someday, someone will succeed... you're already talking about moving away from BURG, but one of the earlier rewrite attempts was I think trying to move to it. I'm fairly sure the Mono compiler's register allocator can do a better job at this intractable task than GCC's can...

"Dependent on" ???

Posted Feb 28, 2007 15:16 UTC (Wed) by nix (subscriber, #2304)

> You're going to get a nice surprise soon, then, because the tree-ssa optimizers have finally reached the stage where some of the horrible slow old RTL optimizers are being removed or drastically simplified, as the much faster tree-ssa optimizers can supplant them entirely, and take much less time to do much more work...

Note, `soon' here means `GCC 4.3'; the mem-ssa work reduces the size of the intermediate representation so much that the improvement to cache utilization alone causes significant compile-time speedups.

GCC 4.2 is indeed yet another slower-at-compiling release. (mem-ssa is far too large to backport...)

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds