On x86, `cachegrind' works well (of course it's simulating the caches on a
virtual CPU rather than measuring the real hardware, but it's still a good
estimate).
I'll admit that on other platforms I often show that cache thrashing is at
fault for the sloth of some algorithm by rewriting the algorithm; if the
new one has the same formal time complexity yet is much faster merely
because the access patterns are different, it was a cache problem :)
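
For a rough illustration of the "same complexity, different access pattern" test, here is a minimal sketch in Python. The function names, the matrix size and the use of a flat `array` are all my own assumptions for the example, not anything from a real workload. Both functions do exactly the same O(n^2) work and return the same sum; only the order in which memory is touched differs.

```python
import time
from array import array


def sum_row_major(data, n):
    """Sum an n x n matrix stored flat, walking memory sequentially."""
    total = 0
    for i in range(n):
        for j in range(n):
            total += data[i * n + j]
    return total


def sum_col_major(data, n):
    """Same sum, same operation count, but each step jumps n elements."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += data[i * n + j]
    return total


def timed(fn, data, n):
    """Return (result, elapsed seconds) for one call of fn."""
    start = time.perf_counter()
    result = fn(data, n)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    n = 1024  # hypothetical size; pick something larger than your cache
    data = array("q", range(n * n))  # contiguous 64-bit integers
    for fn in (sum_row_major, sum_col_major):
        result, elapsed = timed(fn, data, n)
        print(f"{fn.__name__}: sum={result} time={elapsed:.3f}s")
```

If the strided version is clearly slower on your machine, the access pattern is the only thing that changed, so the memory hierarchy is the likely culprit. Interpreter overhead in CPython mutes the gap considerably; the same experiment in C would probably show a much larger difference.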
A really crude approach is to turn off the L2 cache entirely (which many
motherboard/CPU combinations allow you to do) and see if your suspect
thingy gets vastly slower. :)