
KS2007: Scalability

Posted Sep 11, 2007 6:36 UTC (Tue) by sbsiddha (subscriber, #38593)
Parent article: KS2007: Scalability

On I/O scalability on a multi-CPU system: migrating I/O submission to the completion CPU is not actually just moving the cost around. If most of the submission work is moved to the completion CPU, it will help minimize access to remote cachelines (which happens in the timer, slab, and SCSI layers of the kernel), and most of the formerly remote accesses will become local. There will still be some remote cache references while migrating the I/O, but those will be relatively few per I/O. A simple and dumb I/O migration experiment gave good performance results on a heavily loaded system. Patches and results are at http://lkml.org/lkml/2007/7/27/414
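
As a rough illustration of the argument above, here is a toy model that counts remote cacheline accesses with and without migration. The function name, its parameters, and the numbers in the usage note are assumptions made up for illustration; they are not taken from the patches or from any measurement.

```python
def remote_accesses(num_ios, shared_lines, handoff_lines, migrate):
    """Count remote cacheline accesses in a toy model (hypothetical numbers).

    shared_lines:  lines touched at both submission and completion
                   (timer, slab, SCSI structures) for one I/O.
    handoff_lines: lines touched remotely to hand one I/O over to the
                   completion CPU.
    Assumes every I/O is issued on one CPU and completes on another.
    """
    if migrate:
        # Submission work runs on the completion CPU, so the shared lines
        # stay local; only the hand-off itself crosses CPUs.
        per_io = handoff_lines
    else:
        # Completion touches lines last written by the submitting CPU.
        per_io = shared_lines
    return num_ios * per_io
```

In this model, migration reduces remote traffic whenever `handoff_lines` is smaller than `shared_lines`. For example, with 12 shared lines and a 2-line hand-off per I/O (both invented values), the remote accesses per I/O drop from 12 to 2.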

It will be difficult, however, to make these patches generally acceptable without regressing performance for common workloads. Hopefully future I/O hardware will solve some of these issues, but we are looking to see whether there are simple enhancements and heuristics that we can exploit on current-generation hardware.



KS2007: Scalability

Posted Sep 11, 2007 18:54 UTC (Tue) by Nick (guest, #15060)

It was tricky to get into exact details of what was happening here.

One issue is data going over the interconnect on NUMA systems -- in
this case, obviously you cannot avoid actually sending the page
data over RAM. Basically we really have to make sure userspace does
the right thing.

Another issue is which CPU you should do the pagecache writeout
from. In this case you want to do it on the same node that
most of the pages are located on (rather than where the device is,
because it's a question of which would require touching more data
structures).
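
A minimal sketch of that node choice, assuming we already know which node each page to be written out lives on. `pick_writeout_node` and its input format are hypothetical, not a kernel interface.

```python
from collections import Counter


def pick_writeout_node(page_nodes):
    """Return the NUMA node holding the most pages, or None if empty.

    page_nodes: iterable of node ids, one per page to be written out.
    Ties go to the lowest node id so the result is deterministic.
    """
    counts = Counter(page_nodes)
    if not counts:
        return None
    # Sort key: most pages first, then lowest node id.
    return min(counts, key=lambda node: (-counts[node], node))
```

For pages spread over nodes as `[0, 1, 1, 2, 1]`, this picks node 1, regardless of which node the device is attached to.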

The problem you describe is different again, and it applies not only
to NUMA but also to SMP. Basically, I gather that what you are
doing is trying to hand over control of the block layer to the completing
CPU at the point that will result in the fewest cache misses. We
didn't really discuss this in detail, but yes, some of the points that
were raised included the upcoming hardware, and also the fact that networking
might have similar concerns, and it might be good to work on them together.

I still hope to see continued work on your ideas, and I don't think they
were shot down at all (if I remember correctly).

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds