The approaches described here seem to completely ignore the fact that the increased amount of code would result in more instruction-cache misses, possibly penalizing the program more than benefiting it. Instead of calling the special-case function f(p) for p == 1, it might be faster to just call the generic one, simply because it is already in the cache. Cache is everything nowadays, so algorithms should be tailored for cache efficiency; having more code hardly helps here.

Furthermore, it seems unclear when to optimize, and for which data the time spent generating new code, the memory consumed to hold it, and, once again, the cache penalties of having it executed only once in a while would all be worth it. The whole approach feels like it boils down to 'having a compiler inside the running program to recompile parts of it at runtime', but somehow I fail to see many benefits in that.

To my understanding, if one wants to optimize, the better way to do it is to think more about cache/CPU interactions. The "network channels" approach is one nice example of that.
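For concreteness, here is a minimal C sketch of the trade-off I mean; the generic f() and the specialized f_p1() are hypothetical names of my own, not from the article:

    #include <stdio.h>
    #include <stddef.h>

    /* Generic version: handles any p. */
    static long f(long p, const long *data, size_t n)
    {
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += data[i] * p;   /* multiply on every iteration */
        return sum;
    }

    /* Specialized version for p == 1: the multiply disappears, but
     * this is a second copy of the loop, occupying additional
     * instruction-cache lines of its own. */
    static long f_p1(const long *data, size_t n)
    {
        long sum = 0;
        for (size_t i = 0; i < n; i++)
            sum += data[i];
        return sum;
    }

    /* Call site: dispatching to the rarely-used specialized copy may
     * lose to simply calling f(), which is likely cache-resident. */
    static long call(long p, const long *data, size_t n)
    {
        return (p == 1) ? f_p1(data, n) : f(p, data, n);
    }

    int main(void)
    {
        long data[] = { 1, 2, 3, 4 };
        printf("%ld %ld\n", call(1, data, 4), call(3, data, 4));
        return 0;
    }

The saved multiply per iteration is cheap next to even one extra i-cache miss, which is the point: whether the specialized copy pays off depends on how hot it is, not just on the instructions it removes.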