Rethinking optimization for size
Posted Feb 11, 2013 19:54 UTC (Mon) by
khim (subscriber, #9252)
In reply to:
Rethinking optimization for size by daglwn
Parent article:
Rethinking optimization for size
Anything that involves analyzing loop inductions and stride patterns.
- Vectorization
- Loop interchange
- Cache blocking
- Strip mining
- Loop collapse
and about a dozen other transformations.
None of which are applicable at all if you have arrays of various complex objects. Sure, I can understand that in some tight loops where "vectorization", "loop interchange" and other cool things can be applied, indexes may be beneficial. But for typical high-level code (that is, for 90% if not 99% of the code in a given project) changing from indexes to pointers and back makes absolutely no difference: any function call tends to break all these nice techniques, and there are a lot of function calls.
Sure, for inner loops it may be interesting, but those are usually faster when implemented with the appropriate CPU intrinsics anyway (the changes in data structures needed to use them efficiently are impossible for the compiler), so in practice I rarely see any observable difference. Of course nowadays it's often better to use CUDA or OpenCL to push all that to the GPU, where different rules apply altogether and where traditional pointers make no sense at all, but these are specialized applications.
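A minimal C sketch of the point about indexes, pointers and function calls. All names here (`sum_indexed`, `sum_pointer`, `sum_transformed`, `transform_fn`, `add_offset`) are made up for illustration. The first two loops are the same traversal written both ways and return the same result. The third calls through a function pointer on every element, which is the kind of loop a compiler generally cannot vectorize or restructure unless it manages to inline the callee.

```c
#include <stddef.h>

/* Index form: the induction variable i and the unit stride are explicit. */
long sum_indexed(const int *a, size_t n) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Pointer form: the same traversal written with a moving pointer. */
long sum_pointer(const int *a, size_t n) {
    long total = 0;
    for (const int *p = a, *end = a + n; p != end; p++)
        total += *p;
    return total;
}

/* Hypothetical "high-level" loop: one opaque indirect call per element. */
typedef int (*transform_fn)(int value, void *user_data);

long sum_transformed(const int *a, size_t n, transform_fn fn, void *user_data) {
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += fn(a[i], user_data);
    return total;
}

/* Example callback: adds the int that user_data points to. */
int add_offset(int value, void *user_data) {
    return value + *(const int *)user_data;
}
```

Whether the first two compile to identical machine code depends on the compiler and flags; comparing the assembly output is the only way to know for a given build.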
This sounds like an ABI issue. Most sane ABIs pass small structs via registers.
Yup. Up to six integer registers in the x86-64 case (System V ABI). And if you have a couple of arrays plus some kind of "options" argument plus some callback_data... you've already used all of them. Add one more argument and a spill is inevitable.
You may say that code which calls a callback in a tight loop is hopeless in the first place, but that's the problem: quite often I cannot afford to do anything else. It's just too expensive to have one function for buttons, another for lists and so on: code gets pushed out of L1 (or sometimes even L2) cache, and all these benchmark-friendly optimizations actually slow the code down.
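To make the register count concrete, here is a small C sketch. It assumes the System V x86-64 calling convention, where integer and pointer arguments go in six registers and a struct of up to 16 bytes can be passed in two of them. The `slice` type, `WALK_NEGATE`, `visit_fn`, `walk_pairs` and `scaled_product` are hypothetical names invented for this example.

```c
#include <stddef.h>

/* Hypothetical array-as-struct: pointer + length, 16 bytes on x86-64,
 * so it would occupy two integer argument registers. */
struct slice {
    const int *data;
    size_t len;
};

/* Hypothetical option flag: negate the final result. */
#define WALK_NEGATE 1u

typedef int (*visit_fn)(int a, int b, void *callback_data);

/* Argument budget under the assumed ABI:
 *   x (2 regs) + y (2 regs) + options (1) + visit (1) = 6 registers,
 *   so callback_data is the seventh and would go on the stack. */
long walk_pairs(struct slice x, struct slice y, unsigned options,
                visit_fn visit, void *callback_data) {
    size_t n = x.len < y.len ? x.len : y.len;
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += visit(x.data[i], y.data[i], callback_data);
    return (options & WALK_NEGATE) ? -total : total;
}

/* Example callback: multiplies the pair and scales by *callback_data. */
int scaled_product(int a, int b, void *callback_data) {
    return a * b * *(const int *)callback_data;
}
```

Compiling this with something like `gcc -O2 -S` and reading the function prologue shows where each argument actually lands on a given compiler; other ABIs have different register budgets.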
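A rough C sketch of that trade-off: one shared loop driven by small per-kind callbacks, instead of a separate copy of the loop for buttons, lists and so on. The `widget` struct, `measure_button`, `measure_list` and `total_size` are hypothetical and only illustrate the shape of such code.

```c
#include <stddef.h>

enum widget_kind { WIDGET_BUTTON, WIDGET_LIST };

struct widget {
    enum widget_kind kind;
    int width;
    int rows; /* only meaningful for lists in this sketch */
};

typedef int (*measure_fn)(const struct widget *w);

/* Small per-kind callbacks: the only code that differs between kinds. */
int measure_button(const struct widget *w) { return w->width; }
int measure_list(const struct widget *w) { return w->width * w->rows; }

/* One shared traversal for every widget kind. The indirect call per element
 * blocks most loop optimizations, but only one copy of the loop is emitted. */
int total_size(const struct widget *widgets, size_t n, const measure_fn *by_kind) {
    int total = 0;
    for (size_t i = 0; i < n; i++)
        total += by_kind[widgets[i].kind](&widgets[i]);
    return total;
}
```

The specialized alternative would duplicate the loop once per kind so each copy can be inlined and optimized, at the price of more code competing for the instruction cache.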