Memory part 2: CPU caches
Posted Oct 7, 2007 16:15 UTC (Sun) by dps
Parent article: Memory part 2: CPU caches
If you want something significantly beyond this, especially if SSE is of interest, then I suggest that you see what you can find about Emmerald (a fast SSE-using matrix-matrix multiplication code). I have a 2001 journal article which discusses maximising the use of L0 cache (aka registers), L1 cache and L2 cache. Minimising TLB misses is also discussed.
It also cites ATLAS, which probably merits further investigation if you want to know about cache-efficient dense matrix multiplication. As sparse matrices are my current (real work) interest, I have not investigated further.
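For anyone who has not met the idea, the core trick behind using registers, L1 and L2 well is blocking (tiling): work on small sub-blocks that stay resident in cache instead of streaming whole rows and columns. What follows is only a minimal sketch of that general technique, not the Emmerald or ATLAS code. The function name `matmul_blocked` and the tile size `TILE` are my own illustrative assumptions, and a real library would tune the tile size per cache level and add SSE on top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tile edge (an assumption, not a tuned value):
 * three 32x32 double tiles occupy 24 KiB. */
enum { TILE = 32 };

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C = A * B for square n x n row-major matrices, computed tile by tile
 * so that each tile of A, B and C is reused while it is still cached. */
void matmul_blocked(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    memset(c, 0, n * n * sizeof *c);

    for (size_t ii = 0; ii < n; ii += TILE) {
        size_t i_end = min_sz(ii + TILE, n);
        for (size_t kk = 0; kk < n; kk += TILE) {
            size_t k_end = min_sz(kk + TILE, n);
            for (size_t jj = 0; jj < n; jj += TILE) {
                size_t j_end = min_sz(jj + TILE, n);
                /* Multiply one tile of A by one tile of B. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t k = kk; k < k_end; k++) {
                        /* Loop-invariant scalar; the compiler can keep it
                         * in a register (the "L0 cache"). */
                        double aik = a[i * n + k];
                        for (size_t j = jj; j < j_end; j++)
                            c[i * n + j] += aik * b[k * n + j];
                    }
                }
            }
        }
    }
}
```

Calling `matmul_blocked(n, a, b, c)` overwrites `c` with the product. The innermost loop walks both `b` and `c` with unit stride, which is also the access pattern a vectorising compiler or hand-written SSE would want.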
Fortunately for me, Bodleian (library) reader's tickets are valid for life, and students get one :-) Many people have seen Duke Humphrey's library, which is part of the Bodleian library, albeit possibly not with that name attached.
Emmerald is available at http://csl.anu.edu.au/~daa/reserach.html. I do not know whether there are copies of articles there too. I got mine from an electronic journal. Anyone who does ask the author for a reprint should *not* say I sent them.
FYI some journals will sell you reprints for large amounts of money. You should only pursue this option as a last resort. Free copies for anyone are sometimes available at institutional websites or arXiv. You might also be able to buttonhole a friendly academic :-)
BTW journals get both the articles and the editorial work at no charge, so I wonder who ends up holding the money. Most journals have significant subscription charges, mostly paid for by institutions, and some have per-page charges for those publishing too.