CPU chip characteristics matter
Posted Jun 3, 2022 16:33 UTC (Fri) by jreiser (subscriber, #11027)
Parent article: Mazzoli: How fast are Linux pipes anyway?
Serious performance work should report grep -E 'cpu family|model name|model|stepping|microcode|cache size|siblings|cpu cores' /proc/cpuinfo, trimmed of redundancies.
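As a rough sketch of the "trimmed of redundancies" step, the same fields can be collected once each instead of once per logical CPU. The function name `summarize_cpuinfo` is made up for this illustration, and the field names assume the x86 layout of /proc/cpuinfo; other architectures report different keys.

```python
import os

# Same fields as the grep pattern above (x86 naming assumed).
FIELDS = ("cpu family", "model name", "model", "stepping",
          "microcode", "cache size", "siblings", "cpu cores")


def summarize_cpuinfo(text):
    """Return {field: sorted unique values} for the fields of interest.

    /proc/cpuinfo repeats one block per logical CPU, so collecting the
    values into sets removes the per-CPU redundancy.
    """
    summary = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in FIELDS:
            summary.setdefault(key, set()).add(value.strip())
    return {key: sorted(values) for key, values in summary.items()}


if __name__ == "__main__":
    path = "/proc/cpuinfo"
    if os.path.exists(path):  # Linux only
        with open(path) as f:
            for field, values in summarize_cpuinfo(f.read()).items():
                print(f"{field}: {', '.join(values)}")
```

On a machine with identical cores this prints one line per field; more than one value on a line would itself be worth reporting (for example hybrid chips with two core types).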
Cache architecture adds further details. For Intel Core chips of the last 12 years or so, each core has its own L1 (separate I and D, each 32kB) and its own L2 (256kB unified I and D). Then L3 (unified I and D) is on the non-core side of an internal bus, and is shared by all cores. (PCIe I/O devices also talk to L3.) Typical chips for non-server consumer machines have an L3 of 8MB for Core i7, 6MB for Core i5, 4MB for Core i3. Server chips have a much larger L3: up to 28MB or more. Finally, L3 talks to physical RAM.
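Rather than relying on typical figures, the cache hierarchy of the machine under test can be read from sysfs. The following is a minimal sketch, assuming the usual Linux layout under /sys/devices/system/cpu/cpuN/cache/indexM/; the helper name `read_caches` is hypothetical, and missing attributes are reported as "?".

```python
from pathlib import Path

ATTRS = ("level", "type", "size", "shared_cpu_list")


def read_caches(cache_dir):
    """Return one dict per cache (index0, index1, ...) found in cache_dir."""
    caches = []
    for index in sorted(Path(cache_dir).glob("index*")):
        attrs = {}
        for name in ATTRS:
            path = index / name
            attrs[name] = path.read_text().strip() if path.is_file() else "?"
        caches.append(attrs)
    return caches


if __name__ == "__main__":
    # cpu0 is used as a representative; other CPUs may differ.
    for c in read_caches("/sys/devices/system/cpu/cpu0/cache"):
        print(f"L{c['level']} {c['type']:<12} {c['size']:>8}  "
              f"shared with CPUs {c['shared_cpu_list']}")
```

The `shared_cpu_list` column shows which logical CPUs share each cache, which is the information the per-core versus shared-L3 distinction above depends on.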
In some ways the best situation for communication via cache is 2 hyperthreads in the same core, where sharing is guaranteed and the hardware resolves cache contention on a cycle-by-cycle basis. Separate non-hyper threads in the same core suffer CPU contention via software task switching. Separate cores force all Write operations to use L3, although the Reads can be short-circuited by cross-core L2 cache snooping on that shared bus.
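To compare these placements in a benchmark, the two ends of the pipe have to be pinned deliberately. A sketch of how that might look on Linux, assuming the sysfs topology file thread_siblings_list is present; `parse_cpu_list`, `thread_siblings` and `pin_to` are illustrative helper names, not part of any existing tool.

```python
import os


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-3,8-11" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        lo, sep, hi = part.partition("-")
        cpus.update(range(int(lo), int(hi if sep else lo) + 1))
    return cpus


def thread_siblings(cpu):
    """Logical CPUs sharing a physical core with `cpu` (including itself)."""
    path = f"/sys/devices/system/cpu/cpu{cpu}/topology/thread_siblings_list"
    with open(path) as f:
        return parse_cpu_list(f.read())


def pin_to(cpu):
    """Restrict the calling process to one logical CPU (Linux only)."""
    os.sched_setaffinity(0, {cpu})
```

A writer pinned with `pin_to(0)` and a reader pinned to another member of `thread_siblings(0)` would exercise the same-core hyperthread case; picking a CPU outside that set gives the separate-core case; pinning both to the same logical CPU gives the task-switching case.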
It would be interesting to see some measurements of the analogous use of io_uring.
