10 times faster on a simple, plain matrix multiplication. Amazing. Just by optimizing memory access. Before this series of articles, I did not realise that memory could play such a large role in optimisation. Thanks a lot, it's so interesting and well written.
Copyright © 2018, Eklektix, Inc.