Memory part 2: CPU caches

Posted Oct 4, 2007 22:10 UTC (Thu) by jzbiciak (subscriber, #5246)
In reply to: Memory part 2: CPU caches by ncm
Parent article: Memory part 2: CPU caches

Actually, hyperthreading treats ALUs as an underutilized resource, and task scheduling latency as the metric to improve. That is, one task might be busy chasing pointer chains and taking branches and cache misses, and not making full use of the ALUs. (Think "most UI-type code.") Another task might be streaming data through a tight compute kernel, fetching its data ahead of time with prefetch instructions. It will have reasonable cache performance (due to the prefetches), and will happily use the bulk of the ALU bandwidth.
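To make the two kinds of task concrete, here is a minimal sketch in C, assuming GCC or Clang for `__builtin_prefetch`. The `struct node` layout, the function names and the `PREFETCH_DISTANCE` value are all made up for illustration and are not taken from the article.

```c
#include <stddef.h>

/* Hypothetical list node, used only for this illustration. */
struct node {
    struct node *next;
    long value;
};

/* Latency-bound task: each load depends on the previous one, so the
 * ALUs sit mostly idle while the core waits on cache misses. */
long chase(const struct node *n)
{
    long sum = 0;
    while (n != NULL) {
        sum += n->value;
        n = n->next;
    }
    return sum;
}

/* How far ahead to prefetch, in elements. This is a guess; the right
 * value depends on the machine and would need to be measured. */
#define PREFETCH_DISTANCE 64

/* Throughput-bound task: the addresses are known in advance, so data
 * can be requested early and the ALUs kept busy. */
double stream_dot(const double *a, const double *b, size_t len)
{
    double acc = 0.0;
    for (size_t i = 0; i < len; i++) {
        if (i + PREFETCH_DISTANCE < len) {
            /* Arguments: address, 0 = read, 0 = low temporal locality. */
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 0);
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 0);
        }
        acc += a[i] * b[i];
    }
    return acc;
}
```

A real kernel would more likely issue one prefetch per cache line rather than one per element; the per-element form is kept here only for brevity.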

In this scenario, the CPU hog will tend to use its entire timeslice. The other task, which is perhaps interacting with the user, may block, sleep and wake up to handle minor things like moving the mouse pointer around, blinking the cursor, handling keystrokes, etc. On a single-threaded machine, that interactive task would need to preempt the CPU hog directly, do its thing, and then let the hog back onto the CPU. In a hyperthreaded environment, there's potentially a shorter latency to waking the interactive task, and both can proceed in parallel.

That's at least one of the "ideal" cases. Another is when one CPU thread is blocked talking to slow hardware (e.g. direct CPU accesses to I/O registers and the like). The other can continue to make progress.

Granted, there are many workloads that don't look like these. Those that thrash the cache definitely fare worse on an HT machine, since the two hardware threads compete for the same cache.

Memory part 2: CPU caches

Posted Oct 10, 2007 0:48 UTC (Wed) by ncm (subscriber, #165)

I stand corrected. Now, if only processes could be rated according to how much of each fragment of the CPU they tend to use, they might be paired with others that favor using the corresponding other fragments.
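As a rough sketch of what such pairing might look like, assuming each task already carries a single hypothetical rating `alu_use` between 0.0 and 1.0 (no scheduler exposes such a number that I know of), one could sort by the rating and match the lightest ALU user with the heaviest. The `struct task` and `pair_tasks` names are invented for this example.

```c
#include <stdlib.h>

/* Hypothetical per-task rating: fraction of ALU capacity used, 0.0 to 1.0. */
struct task {
    int id;
    double alu_use;
};

/* qsort comparator: ascending by alu_use. */
static int by_alu_use(const void *x, const void *y)
{
    const struct task *a = x, *b = y;
    return (a->alu_use > b->alu_use) - (a->alu_use < b->alu_use);
}

/* Sorts tasks by ALU use, then pairs the lightest with the heaviest,
 * the second lightest with the second heaviest, and so on.
 * pairs[k][0] and pairs[k][1] receive the ids of the k-th pair.
 * Returns the number of pairs; with an odd count, the median task
 * is left unpaired. */
size_t pair_tasks(struct task *tasks, size_t n, int (*pairs)[2])
{
    qsort(tasks, n, sizeof tasks[0], by_alu_use);
    size_t npairs = n / 2;
    for (size_t k = 0; k < npairs; k++) {
        pairs[k][0] = tasks[k].id;
        pairs[k][1] = tasks[n - 1 - k].id;
    }
    return npairs;
}
```

A single number per task is a big simplification; a real rating would need one figure per execution resource, and it would go stale quickly.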

Unfortunately the mix changes radically from one millisecond to the next. For example, slack UI code may get very busy rendering outline fonts.

Still, I am now inspired to try turning on HT on my boxes and see how it goes.
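For anyone else trying this on Linux, one way to check whether HT is actually active after flipping the BIOS switch is to look at the sysfs topology file for a CPU. The sketch below assumes `/sys/devices/system/cpu/cpu0/topology/thread_siblings_list` is present and holds a list like `0,4` or `0-1`; the helper names `count_cpus` and `cpu0_thread_siblings` are my own.

```c
#include <stdio.h>
#include <stdlib.h>

/* Counts the CPUs named in a sysfs CPU list such as "0,4" or "0-1".
 * Returns -1 if the string cannot be parsed. */
int count_cpus(const char *list)
{
    int count = 0;
    const char *p = list;
    while (*p != '\0' && *p != '\n') {
        char *end;
        long lo = strtol(p, &end, 10);
        if (end == p)
            return -1;
        long hi = lo;
        if (*end == '-') {              /* a range such as "0-3" */
            p = end + 1;
            hi = strtol(p, &end, 10);
            if (end == p || hi < lo)
                return -1;
        }
        count += (int)(hi - lo + 1);
        p = end;
        if (*p == ',')
            p++;
    }
    return count;
}

/* Returns the number of hardware threads sharing a core with cpu0,
 * or -1 if the sysfs file is missing or unreadable. */
int cpu0_thread_siblings(void)
{
    char buf[256];
    FILE *f = fopen("/sys/devices/system/cpu/cpu0/topology/thread_siblings_list", "r");
    if (f == NULL)
        return -1;
    char *line = fgets(buf, sizeof buf, f);
    fclose(f);
    return line != NULL ? count_cpus(buf) : -1;
}
```

A result of 2 or more from `cpu0_thread_siblings()` would suggest that cpu0 shares its core with another hardware thread; a result of 1 suggests HT is off or unsupported.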
