Jens Axboe's completely fair queueing (CFQ) I/O scheduler has been regarded
by many as the best available in the 2.6 kernel for a while. Said
scheduler has just been through another major upgrade which should
implement a higher degree of fairness while providing "excellent"
throughput for the system as a whole.
One of the big additions this time around is time sharing: processes now
get time slices during which they are able to dispatch I/O requests. The
scheduler will allow a drive to go idle - briefly - during a process's time
slice to give that process an opportunity to generate more I/O requests. In this
way, it behaves similarly to the anticipatory scheduler; it allows the
process to get the most out of its slice while, hopefully, taking advantage
of the locality of that process's requests. If, however, a process's
requests end up causing too much seeking, that process will temporarily
lose its right to hold the disk idle.
Tied in with the time sharing implementation is the notion of I/O
priorities. Each process has its own I/O priority, which, by default, is
derived from its CPU priority. Processes with higher priorities will
preempt lower-priority processes, while sharing the drive in a round-robin
fashion with equal-priority processes. There is also a realtime priority
level which does not do round-robin sharing, and an "idle" level which is
only allowed to dispatch requests when the drive has been idle for a
sufficiently long period.
There is a temporary priority boosting mechanism designed to avoid priority
inversion problems when a low-priority process holds important resources.
Two new system calls have been added for working with I/O priorities:
int ioprio_set(int which, int who, int priority);
int ioprio_get(int which, int who);
Here, which controls whether the call applies to a single process,
process group, or user, and who is the appropriate ID (usually the
process ID). A call to ioprio_set() will apply the new
priority (subject to the usual permissions checks) while
ioprio_get() returns the current value.
to post comments)