In addition, while playing with GCC 4.3's -combine and -fwhole-program options on various code bases lately, I've found two things:
- GCC does an amazing job of poring over a complete 'program' and optimizing it when given the chance. Most programs (for perfectly valid reasons) are broken up into many source files for ease of maintenance, but this removes a large number of optimization opportunities, because the compiler normally sees only one file at a time. In the kernel, this means that the only functions that can ever be inlined across source files are those defined in header files. So in a subsystem or driver that consists of 20+ source files, where 50% of the functions in those files have only one or two call sites, those functions still don't get inlined if the call sites are in a different file.
- Allowing more aggressive optimization has actually found real bugs in some of the code bases I've been working on. Because the compiler can now see inside called functions, it can report useful things, such as use of uninitialized variables, that it could not detect before.
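
To make the second point concrete, here is a minimal sketch of the kind of bug I mean. All of the names (parse_digit, digit_buggy, digit_or_default) are made up for illustration and are not from any real code base. The two halves are shown in one listing, but imagine them living in separate source files, so that a normal per-file build never sees the body of parse_digit() while compiling its callers.

```c
/* --- imagine this part lives in parse.c (hypothetical file) --- */

/* Stores the digit's value in *out and returns 0 on success.
 * On failure it returns -1 and leaves *out untouched. */
int parse_digit(char c, int *out)
{
    if (c < '0' || c > '9')
        return -1;          /* failure path: *out is never written */
    *out = c - '0';
    return 0;
}

/* --- imagine this part lives in driver.c (hypothetical file) --- */

/* Buggy caller: ignores the return value. Compiled on its own, the
 * compiler must assume parse_digit() always writes through the
 * pointer, so it has nothing to complain about. Once it can see the
 * callee's body, it has enough information to notice that 'value'
 * may still be uninitialized on the failure path. */
int digit_buggy(char c)
{
    int value;
    parse_digit(c, &value);
    return value;           /* uninitialized if c is not a digit */
}

/* Fixed caller: checks the result before using 'value'. */
int digit_or_default(char c, int fallback)
{
    int value;
    if (parse_digit(c, &value) != 0)
        return fallback;
    return value;
}
```

With GCC 4.3 the two files would be handed to the compiler together, along the lines of "gcc -O2 -Wall -combine -fwhole-program parse.c driver.c main.c" (main.c being an equally hypothetical file that provides main()). Whether a warning actually fires for a case like this depends on the GCC version, the optimization level, and the inlining decisions it makes, so treat this as an illustration of the mechanism rather than a guaranteed diagnostic.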