The problem is that when you submit all this I/O, the system has no way of knowing whether you mean
"do all of these, and minimize the overall time"
or
"I need these to all make progress at the same time, even if it means taking longer overall."
Current algorithms tend to assume the second: they try to split the available I/O bandwidth among all the requests. That results in a lot of seeks, which hurts on traditional rotating media when there are massively parallel requests.
A small amount of parallelism helps by giving the drive something to do when it would otherwise be idle. Once you pass the saturation point, however, it hurts, because it adds seeks as the system jumps from one set of requests to the next.
This is the same sort of thing that makes hyperthreading anywhere from a noticeable benefit to a mild loss, depending on the workload.
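To make the seek argument concrete, here is a toy sketch in Python. It assumes a very simplified disk where the only cost is the distance the head travels between block numbers; the two streams, their positions, and the `head_travel` helper are made up for illustration and do not model any real drive or scheduler.

```python
# Toy model (assumption): cost of servicing requests = total head movement.
def head_travel(order, start=0):
    """Sum the distance the head moves while visiting blocks in order."""
    pos, total = start, 0
    for block in order:
        total += abs(block - pos)
        pos = block
    return total

# Two hypothetical streams of sequential reads, far apart on the disk.
stream_a = list(range(0, 100))
stream_b = list(range(100_000, 100_100))

# "Minimize the overall time": finish one stream, then the other.
batched = stream_a + stream_b

# "Everyone makes progress": switch streams on every single request.
interleaved = [blk for pair in zip(stream_a, stream_b) for blk in pair]

batched_cost = head_travel(batched)
interleaved_cost = head_travel(interleaved)
print("batched:", batched_cost, "interleaved:", interleaved_cost)
```

In this model the strictly fair ordering moves the head far more than the batched one, which is the effect described above once the drive is already saturated.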
Posted Aug 23, 2010 20:04 UTC (Mon) by axboe (subscriber, #904)
A scheduler like CFQ will attempt to provide a mix of what you describe, depending on how you submit it. If the submission is done from one process, it will assume that you want it to be done as fast as possible. It'll be sorted accordingly. If done from multiple processes or threads, it will attempt to provide equal progress while preserving overall throughput.
What you describe is true of classical work-conserving I/O schedulers, but it's not the case for the default Linux I/O scheduler.
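A rough sketch of that trade-off, in Python. This is not CFQ's actual algorithm; it is a toy dispatcher that assumes one sorted queue per submitting process and serves a fixed-size slice from each in turn. The names `dispatch`, `count_seeks`, and `slice_len` are hypothetical.

```python
from collections import deque

def dispatch(queues, slice_len):
    """Round-robin over per-process queues, serving up to slice_len
    requests from one queue before moving to the next (toy model)."""
    pending = [deque(sorted(q)) for q in queues]
    order = []
    while any(pending):  # an empty deque is falsy
        for q in pending:
            for _ in range(min(slice_len, len(q))):
                order.append(q.popleft())
    return order

def count_seeks(order):
    """Count jumps between consecutive requests that are not contiguous."""
    return sum(1 for a, b in zip(order, order[1:]) if b != a + 1)

region_a = list(range(0, 64))
region_b = list(range(50_000, 50_064))

# Same requests, submitted by one process or split across two.
one_process = [region_a + region_b]
two_processes = [region_a, region_b]

seeks_single = count_seeks(dispatch(one_process, slice_len=16))
seeks_strict = count_seeks(dispatch(two_processes, slice_len=1))
seeks_sliced = count_seeks(dispatch(two_processes, slice_len=16))
print(seeks_single, seeks_strict, seeks_sliced)
```

With a single submitter everything is simply sorted. With two submitters, longer slices still let both make progress but cut the number of seeks sharply compared with switching on every request.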
I/O scheduler performance
Posted Sep 8, 2010 11:15 UTC (Wed) by epa (subscriber, #39769)
So, then, the way to get fast I/O is to make asynchronous I/O calls from a single thread (so that the scheduler knows that fairness doesn't matter) rather than spawning multiple threads or processes.
Is there any way to fork subprocesses but still let CFQ know that they're all related and happy to share I/O bandwidth altruistically, so that it doesn't try to slice up I/O requests fairly among them at the expense of total throughput?