Toward non-blocking asynchronous I/O
To perform AIO, a program must set up an I/O context with io_setup(), fill in one or more iocb structures describing the operation(s) to be performed, then submit those structures with io_submit(). A call to io_getevents() can be made to learn about the status of outstanding I/O operations and, optionally, wait for them. All of those system calls should, with the exception of the last, be non-blocking. In the real world, things are more complicated. Memory allocations or lock contention can cause any AIO operation to block before it starts to move any data at all. And, even in the best-supported case (direct file I/O), the operation itself can block in a number of places.
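The sequence described above can be sketched in C using the raw system calls (via syscall(), since glibc does not wrap them; libaio provides thin wrappers around the same interface). For simplicity, this example writes through the page cache to a temporary file rather than using O_DIRECT:

```c
/* Minimal AIO round trip: io_setup(), io_submit(), io_getevents().
 * Illustrative only - error handling is reduced to asserts. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/aio_abi.h>

int aio_roundtrip(void)
{
	aio_context_t ctx = 0;
	/* io_setup(): create a context that can hold one in-flight request */
	assert(syscall(SYS_io_setup, 1, &ctx) == 0);

	int fd = open("/tmp/aio-demo", O_WRONLY | O_CREAT | O_TRUNC, 0600);
	assert(fd >= 0);

	static const char buf[] = "hello, aio";
	struct iocb cb;
	memset(&cb, 0, sizeof(cb));
	cb.aio_fildes = fd;
	cb.aio_lio_opcode = IOCB_CMD_PWRITE;
	cb.aio_buf = (unsigned long)buf;
	cb.aio_nbytes = sizeof(buf);
	cb.aio_offset = 0;

	/* io_submit(): queue the request described by the iocb */
	struct iocb *cbs[1] = { &cb };
	assert(syscall(SYS_io_submit, ctx, 1, cbs) == 1);

	/* io_getevents(): wait for completion and check the result */
	struct io_event ev;
	assert(syscall(SYS_io_getevents, ctx, 1, 1, &ev, NULL) == 1);
	assert(ev.res == sizeof(buf));

	syscall(SYS_io_destroy, ctx);
	close(fd);
	unlink("/tmp/aio-demo");
	return 0;
}
```

Because this write is buffered, the submission itself does most of the work; with O_DIRECT, completion would genuinely arrive later.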
The no-wait AIO patch set from Goldwyn Rodrigues seeks to improve this situation in a number of ways. It does not make AIO any more asynchronous, but it will cause AIO operations to fail with EAGAIN errors rather than block in a number of situations. If a program is prepared for such errors, it can opportunistically try to submit I/O in its main thread; it will then only need to fall back to a separate submission thread in cases where the operation would block.
If a program is designed to use no-wait AIO, it must indicate the fact by setting the new IOCB_RW_FLAG_NOWAIT flag in the iocb structure. That structure has a field (aio_flags) that is meant to hold just this type of flag, but there is a problem: the kernel does not currently check for unknown flags in that field. That makes it impossible to add a new flag, since a calling program can never know whether the kernel it is running on supports that flag or not. Fortunately, that structure contains a couple of reserved fields that are checked in current kernels; the field formerly known as aio_reserved1 is changed to aio_rw_flags in this patch set and used for the new flag.
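In iocb terms, an opportunistic no-wait submission with a fallback path might look like the following fragment. The field and flag names here follow the patch set (aio_rw_flags, IOCB_RW_FLAG_NOWAIT) and could change before merging; queue_for_submission_thread() and handle_error() are hypothetical helpers standing in for the application's fallback logic:

```c
/* Sketch: try to submit without blocking; on EAGAIN, hand the request
 * to a separate submission thread instead. */
struct iocb cb, *cbs[1] = { &cb };

memset(&cb, 0, sizeof(cb));
cb.aio_fildes = fd;
cb.aio_lio_opcode = IOCB_CMD_PWRITE;
cb.aio_buf = (unsigned long)buf;
cb.aio_nbytes = len;
cb.aio_rw_flags = IOCB_RW_FLAG_NOWAIT;	/* fail with EAGAIN rather than block */

if (syscall(SYS_io_submit, ctx, 1, cbs) < 0) {
	if (errno == EAGAIN)
		queue_for_submission_thread(&cb);	/* would have blocked */
	else
		handle_error();
}
```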
One place where an I/O request can block is when the operation triggers writeback; in that case, the request is held up until the writeback completes. This wait happens early in the submission process; in particular, it can happen before io_submit() completes its work and returns. Setting IOCB_RW_FLAG_NOWAIT will cause submission to fail with EAGAIN in this case.
Another common blocking point is I/O submission at the block level, where, in particular, a request can be stalled because the underlying block device is too busy. Avoiding that involves the creation of a new REQ_NOWAIT flag that can be set in the BIO structure used to describe block I/O requests. When that flag is present, I/O submission will, once again, fail with an EAGAIN error rather than block waiting for the level of block-device congestion to fall.
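Inside the kernel, the check might look roughly like this simplified sketch of a 4.x-era block-layer submission path; bdev_congested() is a placeholder for the real congestion test:

```c
/* Sketch: when the queue is congested and the bio carries REQ_NOWAIT,
 * complete it immediately with -EAGAIN instead of sleeping until
 * congestion clears. */
blk_qc_t submit_bio_checked(struct bio *bio)
{
	if ((bio->bi_opf & REQ_NOWAIT) && bdev_congested(bio)) {
		bio->bi_error = -EAGAIN;
		bio_endio(bio);		/* complete the request with the error */
		return BLK_QC_T_NONE;
	}
	return generic_make_request(bio);	/* normal, possibly blocking, path */
}
```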
Support is also needed at the filesystem level; each filesystem has its own places where execution can block on the way to submitting a request. The patch set includes support for Btrfs, ext4, and XFS. In each case, situations like the inability to obtain a lock on the relevant inode will cause a request to fail.
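The filesystem-side pattern is essentially "trylock or EAGAIN" (in the kernel, inode_trylock() plays this role). The control flow can be illustrated with a userspace analogy, using a pthread mutex to stand in for the inode lock; submit_locked() and inode_lock_analog are names invented for this sketch:

```c
#include <errno.h>
#include <pthread.h>

pthread_mutex_t inode_lock_analog = PTHREAD_MUTEX_INITIALIZER;

/* In no-wait mode, try the lock and report EAGAIN instead of sleeping;
 * otherwise take the lock normally, blocking if necessary. */
int submit_locked(int nowait)
{
	if (nowait) {
		if (pthread_mutex_trylock(&inode_lock_analog) != 0)
			return -EAGAIN;	/* would block: caller falls back */
	} else {
		pthread_mutex_lock(&inode_lock_analog);	/* may sleep */
	}
	/* ... do the submission work under the lock ... */
	pthread_mutex_unlock(&inode_lock_analog);
	return 0;
}
```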
All of this work can make AIO better, but only for a limited set of use cases. It only improves direct I/O, for example. Buffered I/O, which has always been a sort of second-class citizen in the AIO layer, is unchanged; there are simply too many places where things can block to try to deal with them all. Similarly, there is no support for network filesystems or for filesystems on MD or LVM volumes — though Rodrigues plans to fill some of those gaps at some future point.
In other words, AIO seems likely to remain useful only for the handful of applications that perform direct I/O to files. There have been a number of attempts to improve the situation in the past, including fibrils, threadlets, syslets, acall, and an AIO reimplementation based on kernel threads done by the original AIO author. None of those has ever reached the point of being seriously considered for merging into the mainline, though. There are a lot of tricky details to be handled in a complete solution, and nobody has ever found the goal important enough to justify the considerable work required. So the kernel will almost certainly continue to crawl forward with incremental improvements to AIO.
| Index entries for this article | |
|---|---|
| Kernel | Asynchronous I/O |
Would still be beneficial to have async fsync even with O_DIRECT
Posted May 31, 2017 6:07 UTC (Wed) by ringerc (subscriber, #3071) [Link] (1 responses)

Toward non-blocking asynchronous I/O
Posted May 31, 2017 20:20 UTC (Wed) by sitsofe (guest, #104576) [Link]

O_DIRECT implies that the I/O won't be left rolling around in the OS's cache, but it says nothing about whether the data is still sitting in the disk device's volatile cache. You could send all I/Os down with O_SYNC too, but speeds would plummet. Thus it's still desirable to be able to send down an fsync (and it would have been preferable if submitting it didn't have to block)...

Toward non-blocking asynchronous I/O
Posted Jun 1, 2017 16:22 UTC (Thu) by oever (guest, #987) [Link] (1 responses)

I appreciate the work on asynchronous I/O very much. There is a lot of potential to improve software with AIO. A good example I came across recently is the venerable find.

find is a single-threaded application that uses blocking I/O. For each directory it reads, it makes one or more getdents() calls, and those calls are made sequentially. find could be sped up by issuing many calls at the same time.

Consider the extreme case where each subsequent directory is on the opposite side of the disk. The disk head would travel to the other side of the disk for each directory. The kernel I/O scheduler cannot help, because it only knows about the next location.

If 100 parallel requests were made instead of one, the I/O scheduler would handle them in an efficient and quick manner: it would read the nearest entries first. It is possible to issue parallel getdents() requests with threads, but that requires one thread per outstanding request: quite an overhead. libuv does this with a thread pool; the approach can roughly double the speed of find on a cold cache.

If this were implemented with libaio, the thread overhead could be eliminated.

Toward non-blocking asynchronous I/O
Posted Jun 2, 2017 12:53 UTC (Fri) by oever (guest, #987) [Link]

Alas, this plan cannot be executed while there is no LIO_GETDENTS in aiocb.

Toward non-blocking asynchronous I/O
Posted Jun 2, 2017 6:19 UTC (Fri) by ssmith32 (subscriber, #72404) [Link] (1 responses)

Toward non-blocking asynchronous I/O
Posted Jun 2, 2017 7:06 UTC (Fri) by peter-b (guest, #66996) [Link]
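Until an LIO_GETDENTS-style operation exists, the thread-per-directory approach described in the find comment above can be sketched as follows; scan_dirs_parallel() is an illustrative name, and real code (like libuv's pool) would bound the number of threads rather than spawning one per directory:

```c
#include <dirent.h>
#include <pthread.h>

struct scan_arg { const char *path; long count; };

/* Worker: read one directory to completion, counting its entries. */
static void *scan_one(void *p)
{
	struct scan_arg *a = p;
	DIR *d = opendir(a->path);
	if (!d)
		return NULL;
	struct dirent *e;
	while ((e = readdir(d)) != NULL)
		a->count++;
	closedir(d);
	return NULL;
}

/* Issue one getdents() stream per directory, each on its own thread, so
 * the kernel sees many outstanding requests at once; returns the total
 * number of entries seen. */
long scan_dirs_parallel(const char **dirs, int n)
{
	pthread_t tid[n];
	struct scan_arg args[n];
	long total = 0;

	for (int i = 0; i < n; i++) {
		args[i].path = dirs[i];
		args[i].count = 0;
		pthread_create(&tid[i], NULL, scan_one, &args[i]);
	}
	for (int i = 0; i < n; i++) {
		pthread_join(tid[i], NULL);
		total += args[i].count;
	}
	return total;
}
```

Build with -pthread. With an asynchronous getdents, the same concurrency could come from a single thread submitting many iocbs.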
