per-CPU counters and locking

Posted Feb 5, 2006 3:13 UTC (Sun) by giraffedata (subscriber, #1954)
In reply to: per-CPU counters and locking by man_ls
Parent article: The search for fast, scalable counters

just another example of why kernel programming sometimes looks so hard to outsiders.

This is a relatively recent addition to the things performance programmers have to worry about. Used to be, electronic memory was considered fast, and it was always the same speed. Typically, it took at most 4 cycles to access it. Now, there are 4 levels of electronic memory (including the registers) and the main pool is 200 cycles away. We think of looking at memory now the way we used to think of loading a file from a disk.

I'm not ashamed to say my personal leaning these days is toward single-CPU single-core machines tied together at the network adapter. My brain can handle only so much complexity.

At least it would spare the "expensive locking operations" mentioned in the main article, wouldn't it?

That locking operation is for updating a counter. For reading one, the only cost is moving the cache line from one CPU to another -- through main memory. Whoops, I just remembered: on fancy modern machines, there's a shortcut to move a cache line without going out on the memory bus. So it's cheaper than what I said earlier, but still more than folks are willing to pay.
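To make the update/read asymmetry concrete, here is a minimal user-space sketch in C11. It is not the kernel's per-CPU counter API; the names (percpu_counter_sketch, counter_add, counter_read), the slot count, and the 64-byte cache-line size are all assumptions made for illustration. It also swaps the lock for relaxed atomic adds on a per-CPU slot, which is a different technique from the locking the article describes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 4        /* hypothetical CPU count for this sketch */
#define CACHE_LINE 64   /* assumed cache-line size in bytes */

/* One slot per CPU, aligned so each slot occupies its own cache line.
 * An update on one CPU then never invalidates another CPU's slot. */
struct slot {
    alignas(CACHE_LINE) atomic_long value;
};

static_assert(sizeof(struct slot) == CACHE_LINE,
              "each slot should fill exactly one assumed cache line");

struct percpu_counter_sketch {
    struct slot slots[NSLOTS];
};

static void counter_init(struct percpu_counter_sketch *c)
{
    for (size_t i = 0; i < NSLOTS; i++)
        atomic_init(&c->slots[i].value, 0);
}

/* Update: touches only the caller's own slot, so the line can stay
 * in that CPU's cache. The cpu index is passed in by the caller. */
static void counter_add(struct percpu_counter_sketch *c, size_t cpu, long n)
{
    atomic_fetch_add_explicit(&c->slots[cpu % NSLOTS].value, n,
                              memory_order_relaxed);
}

/* Read: sums every slot, which means pulling each of the other CPUs'
 * cache lines over to the reader. The result is only approximate if
 * updates are running concurrently. */
static long counter_read(struct percpu_counter_sketch *c)
{
    long sum = 0;
    for (size_t i = 0; i < NSLOTS; i++)
        sum += atomic_load_explicit(&c->slots[i].value,
                                    memory_order_relaxed);
    return sum;
}
```

The point of the layout is that counter_add is cheap and local, while counter_read pays one cache-line transfer per slot, which is the cost being discussed above.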

per-CPU counters and locking

Posted Feb 9, 2006 4:57 UTC (Thu) by roelofs (guest, #2599)

Now, there are 4 levels of electronic memory (including the registers)

5 levels sometimes (at least in principle). There exist Intel chips with L3 caches (Xeon Gallatin, I think), and presumably each cache type is a different speed from the others and from main memory.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds