User: Password:
|
|
Subscribe / Log in / New account

The Kernel Hacker's Bookshelf: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

The Kernel Hacker's Bookshelf: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

Posted Feb 21, 2008 18:06 UTC (Thu) by ikm (subscriber, #493)
Parent article: KHB: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

The approaches described here seem to completely ignore the fact that the increased amount of
code would result in increased cache misses, possibly penalizing the program more than
benefiting it. Instead of using the special-case function f(p) for p == 1, it might be faster
to just call the generic one, just because it's in the cache already. Cache is everything
nowadays, so the algorithms should be tailored for cache efficiency. Having more code hardly
helps here.

Furthermore, it seems to be unclear when to optimize and for what data — the time spent
creating new code, the memory consumed to hold it, and, once again, the cache penalties
resulting from having it executed once in a while should all be worth it. The whole approach
feels like it can be simplified to the idea of 'having a compiler inside of the running
program used to recompile its parts in runtime', but somehow I fail to see too many benefits
in that. To my understanding, if one wants to optimize, the better way to do is to think more
about caches/cpus interactions. The "network channels" apporach is one nice example of that.


(Log in to post comments)

The Kernel Hacker's Bookshelf: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

Posted Feb 21, 2008 19:10 UTC (Thu) by bronson (subscriber, #4806) [Link]

More code does not imply slower.  Think bubble sort vs. quicksort.

Besides, it sounds like the recompilations only happen at exceptional times.  I highly doubt
that they result in the sort of cache thrashing that you imply.

The Kernel Hacker's Bookshelf: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

Posted Feb 21, 2008 19:33 UTC (Thu) by ikm (subscriber, #493) [Link]

The bubble sort vs. quicksort would be an incorrect analogy. The idea here is to inject
several optimized versions of the same code. My arguing was that one generic version might
perform better than a multitude of specialized copies of the same code.

The Kernel Hacker's Bookshelf: Synthesis: An Efficient Implementation of Fundamental Operating Systems Services

Posted Feb 22, 2008 10:46 UTC (Fri) by mingo (subscriber, #31122) [Link]

Yes. Especially as today's CPUs move towards annotating cache lines and doing certain
optimizations of the generic functions by observing their usage and annotating specific
instructions. Creating _more_ code goes into the opposite direction.

Code size and cache thrash

Posted Feb 21, 2008 21:18 UTC (Thu) by vaurora (guest, #38407) [Link]

The issue of code size and instruction cache usage is a very important one - one I didn't have space to talk about in this article, but which is addressed in the original paper/dissertation/book. The executive summary is that if you only do inline code generation when it makes sense and share code otherwise, the code size is fine. In fact, for certain types of functions, the code size is reduced by inlining the code directly instead of setting up all the arguments and calling the shared function code. (gcc can do this statically at compile time, Synthesis can do it at run-time too.)

For more detail, see section 3.4.1. I feel like I read another section addressing this but I can't find it right now...


Copyright © 2018, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds