Firefox 3.0.10 released
Posted Apr 28, 2009 23:42 UTC (Tue) by nix (subscriber, #2304)
In reply to: Firefox 3.0.10 released by njs
Parent article: Firefox 3.0.10 released
kernel by just souping up sparse'. People who actually know sparse and C
compilers, like viro and davem, have been jumping hard on this silly
idea. Even if you ignored *all* optimizers, you'd have a big pile of
backends to implement, and ignoring all optimizers is probably a bad idea:
the kernel surely needs at least decent submodel-specific scheduling (to
minimize stalls and keep pipelines full: not all hardware does it for you
like x86) and a decent register-pressure-sensitive reload, which means
live range analysis, which means... it's a good bit bigger a job than
they've been assuming.
Posted Apr 29, 2009 1:52 UTC (Wed)
by njs (subscriber, #40338)
When it's obvious that what the kernel needs is a Parrot interpreter?
Posted Apr 29, 2009 7:06 UTC (Wed)
by nix (subscriber, #2304)
[...] changes to percolate down that writing their own C compiler is obviously
better. Oh, and 'every other project of any size' runs into huge numbers
of GCC bugs too so this is not just them, only it *is* just them because
kernels are low-level and special.

I'm being snarky. The kernel's desynchronization from GCC, and its much
faster release cycles and insistence on working even with relatively old
compilers means that you need to wait about five years between hitting a
GCC bug and being able to assume that it's fixed: and shipping the
compiler and kernel together *would* solve this. Now you'd only have the
problem of bugs in the compiler. (Unless the in-kernel-tree compiler did a
three-stage bootstrap+compare I'm not sure I could trust it much, either,
and that would make kernel compilation times balloon unless it was *much*
faster than GCC. Mind you that's not unlikely either.)

The only other project of similar size that I've seen run into similar
numbers of compiler bugs is clamav. God knows what they do to trigger that
many. MPlayer is also known for this, but that really *does* play at evil
low levels, using inline asm with borderline constraints for a tiny extra
flicker of speed on some obscure platform and that sort of thing.

And kernels *are* low-level and special: you really *don't* care about a
lot of the optimizations that really do speed up real code. All the work
recently going into improved loop detection, currently used mostly to
improve autoparallelization, is useless for OS kernels, which can't allow
the compiler to go throwing things into other threads on the fly; a couple
of optimizations speed up giant switch statements entangled with computed
goto by turning them into jump tables and stuff like that, which is useful
for threaded interpreters, and the kernel doesn't have any (the ACPI
interpreter uses an opcode array instead).

The kernel people like to assume that these only speed up SPEC, but y'know
it just isn't true. I'm hard put to think of an on-by-default optimization
in GCC that doesn't speed up code I see on a nearly daily basis at least a
bit (I benchmarked this three years or so back, diking optimizations out
of -O2 one by one and benchmarking work flow rates through a pile of very
large financial apps, some CPU-bound, some RAM-bound, in a desperate and
failed attempt to see which of them was pessimizing it: it would be
interesting to try this sort of thing with the kernel, only it's a bit
harder to detect pessimization automatically)... some of it isn't going to
speed up high-quality code, but you just can't assume that all code you
see is going to be high-quality.

Some optimizations do seem to have minimal effect, until you try turning
them off and realise that later optimizations rely on them (these days,
many of these dependencies are even documented or asserted in the pass
manager: radical stuff for those of us used to the old grime-everywhere
GCC :) )

A few have minimal effect on one CPU but a lot on another: sometimes these
are turned on on both of them, which probably *is* a waste of time and
should be fixed. None of these seemed to be especially expensive
optimizations in my testing. (I should dig up the results and forward them
to the GCC list, really. But they're quite old now...)
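
For readers who haven't met the distinction drawn above between a big switch statement (which a compiler may lower to a jump table) and an opcode array, here is a minimal C sketch of the two dispatch styles. Everything in it is hypothetical: the toy accumulator machine, the opcode names and the function names are made up for illustration and are not taken from GCC, the kernel, or the ACPI interpreter. Whether the switch actually becomes a jump table is entirely up to the compiler and the flags in use.

```c
#include <stddef.h>

/* Hypothetical opcodes for a toy accumulator machine (illustration only). */
enum opcode { OP_ADD, OP_MUL, OP_NEG, OP_HALT, OP_COUNT };

struct insn {
    enum opcode op;
    long arg;               /* immediate operand; ignored by OP_NEG and OP_HALT */
};

/* Style 1: a switch inside a loop. With dense case values the compiler
 * may (or may not) lower this to a jump table; that is its decision. */
long run_switch(const struct insn *prog, size_t len)
{
    long acc = 0;
    for (size_t pc = 0; pc < len; pc++) {
        switch (prog[pc].op) {
        case OP_ADD:  acc += prog[pc].arg; break;
        case OP_MUL:  acc *= prog[pc].arg; break;
        case OP_NEG:  acc = -acc;          break;
        case OP_HALT: return acc;
        default:      return acc;   /* unknown opcode: stop */
        }
    }
    return acc;
}

/* Style 2: an explicit array of handlers indexed by opcode, so the
 * dispatch table is written by hand rather than left to the optimizer. */
typedef long (*handler_fn)(long acc, long arg);

static long do_add(long acc, long arg) { return acc + arg; }
static long do_mul(long acc, long arg) { return acc * arg; }
static long do_neg(long acc, long arg) { (void)arg; return -acc; }

static const handler_fn handlers[OP_COUNT] = {
    [OP_ADD] = do_add,
    [OP_MUL] = do_mul,
    [OP_NEG] = do_neg,
    /* OP_HALT is left NULL and handled in the loop below */
};

long run_table(const struct insn *prog, size_t len)
{
    long acc = 0;
    for (size_t pc = 0; pc < len; pc++) {
        enum opcode op = prog[pc].op;
        if ((unsigned)op >= (unsigned)OP_COUNT || handlers[op] == NULL)
            return acc;             /* OP_HALT or unknown opcode: stop */
        acc = handlers[op](acc, prog[pc].arg);
    }
    return acc;
}
```

Both functions compute the same result for the same program; the difference is only in who builds the dispatch table. A threaded interpreter using GCC's computed goto would be a third style, left out here because it relies on a GNU extension.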