Improving kill_fasync()
[Posted December 23, 2003 by corbet]
Unix systems, and their variants, provide a number of ways for processes to
manage multiple I/O streams simultaneously. One of those is through the
use of I/O signals; a process can request to receive a SIGIO signal
whenever a given file descriptor becomes available for reading or writing.
Inside the kernel, this signalling is handled via a file-specific
fasync_struct structure and a couple of helper functions. One of
them, called fasync_helper(), simply helps the kernel (filesystem
or driver) code track which processes have requested notification for a
given file. The other, kill_fasync(), is invoked to actually
deliver a signal to interested processes when the time comes.
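From user space, the request for I/O signals might look something like the sketch below. It uses the standard fcntl() operations F_SETOWN and F_SETFL with O_ASYNC; the names request_sigio(), on_sigio() and sigio_count are made up for this illustration, and error handling is kept to a minimum.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical counter, incremented each time SIGIO arrives. */
static volatile sig_atomic_t sigio_count = 0;

/* Hypothetical handler: just note that the signal was delivered. */
static void on_sigio(int signo)
{
    (void)signo;
    sigio_count++;
}

/*
 * Ask the kernel to send SIGIO to this process when fd becomes ready.
 * Returns 0 on success, -1 on failure.
 */
static int request_sigio(int fd)
{
    struct sigaction sa;
    int flags;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigio;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGIO, &sa, NULL) < 0)
        return -1;

    /* Name this process as the recipient of I/O signals for fd. */
    if (fcntl(fd, F_SETOWN, getpid()) < 0)
        return -1;

    /* Turn on asynchronous notification, preserving the other flags. */
    flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    if (fcntl(fd, F_SETFL, flags | O_ASYNC) < 0)
        return -1;
    return 0;
}
```

On Linux, setting O_ASYNC is what leads the kernel to call the fasync() method of the underlying file, which is where a driver would typically call fasync_helper(). Once request_sigio() has succeeded on, say, the read end of a pipe, a write to the other end should cause the handler to run.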
The kernel uses a single reader/writer spinlock (fasync_lock) to
serialize all calls to either helper function. In some situations, this
lock appears to be starting to hurt performance: more types of devices
support I/O signalling than was once the case, and the increasing
number of calls to kill_fasync() is creating lock contention. So
Manfred Spraul did something about it, in the form of a
patch which switches the I/O signalling code over to the read-copy-update
mechanism for mutual exclusion. The result for his particular test load
was an 80% reduction in the time required to send out I/O signals.
Linus, having issues with how some of the locking was done, didn't much
like the patch. But he also had some ideas
for reworking the whole I/O signal mechanism to get rid of a lot of
unneeded code.
The key is in the understanding that the list of processes wanting I/O
signals is very similar to the list of processes simply waiting for the I/O
itself. Either way, it is a list of processes that needs to be notified
when data becomes available or the file descriptor becomes writable. There
is not a whole lot of difference between sending a SIGIO to the
process and simply waking it up.
During the 2.5 development process, the wait queue mechanism was
generalized somewhat; an article in the Driver Porting Series
describes some of the changes which were made. The kernel
function wake_up() (with several variants) is called to wake
processes which are waiting on a wait queue; in 2.4 and prior kernels, it
performed that wakeup directly. In 2.5, however, all wake_up()
really does is call a special wakeup function, a pointer to which is stored
in the wait queue entry. This indirection allows different processes to be
awakened in different ways.
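The indirection can be modeled in a few lines of ordinary user-space C. This is only a sketch of the idea, not the kernel's wait queue code; every name in it (wait_entry, wake_fn, default_wake(), signal_wake(), add_waiter(), wake_all()) is invented for the illustration.

```c
#include <stddef.h>

struct wait_entry;

/* Each entry carries its own wakeup function. */
typedef int (*wake_fn)(struct wait_entry *entry);

struct wait_entry {
    wake_fn func;             /* how this particular waiter is awakened */
    int woken;                /* set by the default wakeup function */
    int signalled;            /* set by the "send a signal" variant */
    struct wait_entry *next;
};

struct wait_queue {
    struct wait_entry *head;
};

/* Stand-in for the default behavior: make the process runnable. */
static int default_wake(struct wait_entry *entry)
{
    entry->woken = 1;
    return 1;
}

/* Stand-in for a special wakeup function that sends a signal instead. */
static int signal_wake(struct wait_entry *entry)
{
    entry->signalled = 1;
    return 1;
}

/* Put an entry on the queue with the wakeup function of its choice. */
static void add_waiter(struct wait_queue *q, struct wait_entry *entry,
                       wake_fn func)
{
    entry->func = func;
    entry->woken = 0;
    entry->signalled = 0;
    entry->next = q->head;
    q->head = entry;
}

/*
 * The "wake_up()" of this model: it does no waking itself, it only
 * calls each entry's function. Returns the number of entries handled.
 */
static int wake_all(struct wait_queue *q)
{
    int count = 0;
    struct wait_entry *entry;

    for (entry = q->head; entry != NULL; entry = entry->next)
        count += entry->func(entry);
    return count;
}
```

The point of the design is that wake_all() needs no knowledge of what "waking" means for any given entry; a waiter registered with signal_wake() is handled by the same loop as one registered with default_wake().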
So far, there are few cases where a non-default wakeup function is used.
But there is no real reason why, with a suitable wakeup function, wait
queues could not be used for any of a number of different process
signalling tasks. The whole I/O signalling mechanism and its
fasync_struct structure could really be replaced by a wait queue
with a special wakeup function.
The only problem with this nice, elegant idea is that it won't work.
kill_fasync() takes a "band" argument which eventually gets passed
through to the target process as signal data. There is currently no way to
pass that information to a wakeup function via wake_up(). Adding
a data parameter to wake_up() would fix that problem and, perhaps,
enable a number of other potential uses for wait queues. Such a change
appears likely to happen - but not until 2.7. Such changes really
shouldn't be made in 2.6, now that the 2.6.0 kernel has come out.
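What a data parameter would buy can be shown with another small user-space model. This is a guess at the general shape of such an interface, not a description of any actual or proposed kernel API; the names sig_waiter, keyed_wake_fn, record_band() and wake_all_key(), and the use of an opaque pointer for the data, are all assumptions made for the example.

```c
#include <stddef.h>

struct sig_waiter;

/* Hypothetical wakeup function type which also receives caller data. */
typedef void (*keyed_wake_fn)(struct sig_waiter *waiter, void *key);

struct sig_waiter {
    keyed_wake_fn func;
    long last_band;            /* the "band" most recently passed along */
    struct sig_waiter *next;
};

/*
 * Stand-in for a wakeup function that would send SIGIO: here it simply
 * records the band value it would have attached to the signal.
 */
static void record_band(struct sig_waiter *waiter, void *key)
{
    waiter->last_band = key ? *(long *)key : 0;
}

/* Model of a wake_up() variant with an added data parameter. */
static void wake_all_key(struct sig_waiter *head, void *key)
{
    for (; head != NULL; head = head->next)
        head->func(head, key);
}
```

With something along these lines, the caller that knows the band value can hand it to wake_all_key(), and each waiter's function decides what, if anything, to do with it. Waiters that only want to be awakened can ignore the extra argument.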