An Introduction to GCC Compiler Intrinsics in Vector Processing (Linux Journal)
Posted Sep 21, 2012 21:34 UTC (Fri) by dashesy
Parent article: An Introduction to GCC Compiler Intrinsics in Vector Processing (Linux Journal)
A really nice article, one I wished for a few years ago. Although intrinsics save you from going all the way down to assembly, there are still some gotchas, like placing __builtin_ia32_emms here and there (after MMX code and before any x87 floating-point code) to avoid some floating-point corner cases. Also, in its simplest usage it requires a fixed vector length for the processing path. That is not the best approach, since new CPUs come with wider vector support; this emphasizes the need for dynamic dispatching based on the CPU (or, better, a mixed approach that also considers the GPU).
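To make the dynamic-dispatch point concrete, here is a rough sketch of what I mean, not anything taken from the article. It assumes GCC (or a compatible compiler) and uses the x86 builtins __builtin_cpu_init and __builtin_cpu_supports together with GCC's generic vector types. The names vec_add, pick_add, add_scalar, add_sse2 and add_avx are made up for illustration; on non-x86 targets the sketch simply falls back to the scalar loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* GCC generic vector types: 4 and 8 packed floats. */
typedef float v4sf __attribute__((vector_size(16)));
typedef float v8sf __attribute__((vector_size(32)));

/* Hypothetical signature shared by every implementation. */
typedef void (*add_fn)(float *dst, const float *a, const float *b, size_t n);

/* Portable fallback, also used for the leftover tail elements. */
static void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

#if defined(__x86_64__) || defined(__i386__)
/* 4 floats per step; compiled with SSE2 enabled for this function only. */
static void __attribute__((target("sse2")))
add_sse2(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        v4sf va, vb;
        memcpy(&va, a + i, sizeof va);   /* memcpy keeps unaligned input safe */
        memcpy(&vb, b + i, sizeof vb);
        va += vb;
        memcpy(dst + i, &va, sizeof va);
    }
    add_scalar(dst + i, a + i, b + i, n - i);
}

/* 8 floats per step; compiled with AVX enabled for this function only. */
static void __attribute__((target("avx")))
add_avx(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 8 <= n; i += 8) {
        v8sf va, vb;
        memcpy(&va, a + i, sizeof va);
        memcpy(&vb, b + i, sizeof vb);
        va += vb;
        memcpy(dst + i, &va, sizeof va);
    }
    add_scalar(dst + i, a + i, b + i, n - i);
}
#endif

/* Pick the widest implementation the running CPU reports support for. */
static add_fn pick_add(void)
{
#if defined(__x86_64__) || defined(__i386__)
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx"))
        return add_avx;
    if (__builtin_cpu_supports("sse2"))
        return add_sse2;
#endif
    return add_scalar;
}

/* Public entry point: resolves the implementation on first use.
 * Note: the lazy initialisation here is not synchronised between threads. */
void vec_add(float *dst, const float *a, const float *b, size_t n)
{
    static add_fn impl;
    assert(dst && a && b);
    if (!impl)
        impl = pick_add();
    impl(dst, a, b, n);
}
```

Callers only ever see vec_add(dst, a, b, n), so the vector width stays an implementation detail and a wider variant can be added later without touching them. A real library would probably resolve the pointer once at start-up rather than lazily on first call.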
Overall, the advice given in the paragraph before the summary seems to be the most practical:
Fourth, don't re-invent the wheel. Intel, Freescale and ARM all offer libraries and code samples to help you get the most from their processors. These include Intel's Integrated Performance Primitives, Freescale's libmotovec and ARM's OpenMAX.