tl;dr: the dispatch has overhead and the folks who need vector math either don't care about CPU portability or refuse to accept that overhead.
GCC dispatching is tied to shared libraries and has no overhead on top of that. Nothing. Exactly zero. Not one single cycle, not one single byte (except for the slow path, which is obviously not a big concern: it's called the slow path for a reason). Sure, shared libraries are slower for various reasons, yet somehow games use them, but they don't use dispatch. Where is the logic in your statement? Why would you create a fully separate engine when you could create a specialized version of just some core part?
IMNSHO it's not "refuse to accept", it's "refuse to consider because of ignorance". Ignorance is most definitely not bliss in this particular case.
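To make the point concrete, here is a minimal sketch of what I mean, assuming GCC's function multiversioning (the `target_clones` attribute) on x86-64 Linux with glibc. The function name `dot` and the `MULTIVERSION` macro are made up for illustration; on any other platform the macro expands to nothing and you just get one plain version.

```c
#include <stddef.h>

/*
 * Assumption: GCC (or a recent Clang) on x86-64 GNU/Linux, where ifunc is
 * available. The compiler then emits an AVX2 clone, a baseline clone, and
 * a resolver that picks one of them when the symbol is first bound.
 * Elsewhere the macro is empty and only the plain version is built.
 */
#if defined(__x86_64__) && defined(__gnu_linux__) && defined(__GNUC__)
#define MULTIVERSION __attribute__((target_clones("avx2", "default")))
#else
#define MULTIVERSION
#endif

/* Hypothetical "core part": the only function that gets specialized. */
MULTIVERSION
float dot(const float *a, const float *b, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i] * b[i];   /* same source, vectorized per clone */
    return sum;
}
```

Callers just write `dot(a, b, n)`. The resolver runs once, in the slow path, and after that the call goes through the same indirection any shared-library call already uses. The rest of the engine is compiled once, for the baseline.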
Copyright © 2018, Eklektix, Inc.