Posted Sep 11, 2007 6:36 UTC (Tue) by sbsiddha
Parent article: KS2007: Scalability
On I/O scalability on a multi-CPU system: migrating I/O submission to the completion CPU does not just move the cost around. If most of the submission work is moved to the completion CPU, it helps minimize accesses to remote cachelines (which happen in the timer, slab, and SCSI layers of the kernel), and most of the formerly remote accesses become local. There will still be some remote cache references while migrating the I/O, but those will be relatively small per I/O. A simple and dumb I/O migration experiment gave good perf results on a heavily loaded system. Patches and results are at http://lkml.org/lkml/2007/7/27/414
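To make the argument above concrete, here is a toy cost model. It is only a sketch: the struct, the function names, and the idea of counting "shared" versus "handoff" cachelines are assumptions made for illustration, not anything taken from the posted patches or from kernel code.

```c
#include <assert.h>

/* Hypothetical per-I/O cacheline counts; placeholders, not measurements. */
struct io_cost {
    int shared_lines;   /* lines touched by both submission and completion work */
    int handoff_lines;  /* lines touched to hand a request to the completion CPU */
};

/*
 * Remote cacheline accesses for one I/O whose submitting CPU and
 * completion CPU differ.
 * migrate == 0: submission work runs on the submitting CPU, so the
 *               completion CPU later pulls every shared line across.
 * migrate != 0: submission work runs on the completion CPU, so only
 *               the handoff itself crosses CPUs.
 */
int remote_lines_per_io(struct io_cost c, int migrate)
{
    assert(c.shared_lines >= 0 && c.handoff_lines >= 0);
    return migrate ? c.handoff_lines : c.shared_lines;
}

/* In this model, migration pays off only when the handoff is cheaper. */
int migration_helps(struct io_cost c)
{
    return remote_lines_per_io(c, 1) < remote_lines_per_io(c, 0);
}
```

With made-up inputs of 12 shared lines and 2 handoff lines, the model counts 12 remote accesses per I/O without migration and 2 with it. When submitter and completion CPU already coincide there is nothing to gain, which is one way the common case could regress if migration adds overhead.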
It will be difficult, however, to make these patches generally acceptable without regressing perf for common workloads. Hopefully future I/O HW will solve some of these issues, but we are looking to see if there are simple enhancements and heuristics that we can exploit in current-generation HW.