latency can be hidden with enough parallelism

Posted Nov 15, 2007 0:37 UTC (Thu) by stevenj (guest, #421)
Parent article: Memory part 8: Future technologies

Another perspective that several people have advocated (e.g. Burton Smith) is that, given enough fine-grained parallelism in the future, latency can be hidden: you execute one instruction from thread 1, then one instruction from thread 2, and so on; by the time you get back to thread 1, many cycles have passed and any memory operation will have completed. This was the underlying idea behind things like Intel's Hyper-Threading (which does it on a small scale) and the Tera MTA supercomputer (on a large scale).
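
To make the interleaving idea concrete, here is a toy simulation. It is only a sketch under simplifying assumptions of my own: every instruction is a memory operation with a fixed latency, the core issues at most one instruction per cycle, and a thread cannot issue again until its previous instruction has completed. The function name `utilization` and its parameters are made up for illustration and do not model any real processor.

```python
def utilization(num_threads, latency, cycles=10_000):
    """Fraction of cycles in which the core manages to issue an instruction.

    Toy model (assumption): every instruction is a memory operation that
    takes `latency` cycles, and a thread must wait for it before issuing again.
    """
    ready_at = [0] * num_threads  # cycle at which each thread may issue again
    issued = 0
    next_thread = 0               # round-robin pointer

    for cycle in range(cycles):
        # Scan round-robin for the first thread whose last operation is done.
        for i in range(num_threads):
            t = (next_thread + i) % num_threads
            if ready_at[t] <= cycle:
                ready_at[t] = cycle + latency
                issued += 1
                next_thread = (t + 1) % num_threads
                break
        # If no thread was ready, the core simply idles this cycle.

    return issued / cycles


if __name__ == "__main__":
    for n in (1, 10, 50, 100, 200):
        print(f"{n:4d} threads, latency 100: utilization {utilization(n, 100):.2f}")
```

In this model the core stays fully busy once the number of threads reaches the latency in cycles; with fewer threads it idles in proportion. That is the sense in which enough parallelism "hides" latency: the memory operations are no faster, the core just always has something else to do.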

One problem, of course, is that extracting that much parallelism from most programs is still rather hard. There are very few programs out there that are ready to use 100 or 1000 processors.

(Parallelism is actually a huge problem on the computer-science horizon. Processors aren't getting much faster any more, and the only way to get increased performance is by exploiting parallelism, but writing parallel programs is still the province of an elite few.)

Posted Nov 15, 2007 2:52 UTC (Thu) by elanthis (guest, #6227)

I imagine you could get quite a huge performance boost out of many applications just by
writing them to be less inefficient, which is one of the primary purposes of this series
of articles.  A lot of modern software blows through cycles completely needlessly.  This has
gone on because of the increase in processor speed: software could expand to fill the
available increase in speed and reduce its own complexity in the process.  Now
that the choice is between the complexity of efficient software designs and the complexity of
intense parallelism, maybe we'll see a shift back towards efficient code instead of
quickly-written code.

Then again, probably not - it seems like it might end up being easier to make simple tools for
parallel code than to educate developers to write efficient code. :/

Call me crazy, but I take pride in being able to write single-threaded daemons that outperform
heavily threaded implementations for even moderately high workloads. :p

Posted Nov 20, 2007 2:42 UTC (Tue) by nagual_sorcerer (guest, #49143)

You are not crazy. :) But when there are multiple CPU cores, do you still write
single-threaded daemons? Or just one thread per CPU core? I don't know whether that is
faster, or whether there is some other way of doing things faster.
