Fibrils and asynchronous system calls
Zach Brown has decided to stir things up by asking a basic question: could it be that the way the kernel implements AIO is all wrong? The current approach adds a fair amount of complexity, requiring explicit AIO handling in every subsystem which supports it. IOCB structures have to be passed around, and kernel code must always check whether it is supposed to block on a given operation or return one of two "it's in the works" codes. It would be much nicer if most kernel operations could simply be invoked asynchronously without having to clutter them up with explicit support.
To that end, Zach has posted a preliminary patch set which simplifies asynchronous I/O support considerably, but doesn't stop there: it also makes any system call invokable in an asynchronous mode. The key is a new type of in-kernel lightweight thread known as a "fibril."
A fibril is an execution thread which only runs in kernel space. A process can have any number of fibrils active, but only one of them can actually execute in the processor(s) at any given time. Fibrils have their own stack, but otherwise they share all of the resources of their parent process. They are kept in a linked list attached to the task structure.
When a process makes an asynchronous system call, the kernel creates a new fibril and executes the call in that context. If the system call completes immediately, the fibril is destroyed and the result goes back to the calling process in the usual way. Should the fibril block, however, it gets queued and control returns to the submitting code, which can then return the "it's in progress" status code. The "main" process can then run in user space, submit more asynchronous operations, or do just about anything else.
Sooner or later, the operation upon which the fibril blocked will complete. The wait queue entry structure has been extended to include information on which fibril was blocked; the wakeup code will find that fibril and make it runnable by adding it to a special "run queue" linked list in the parent task structure. The kernel will then schedule the fibril for execution, perhaps displacing the "main" process. That fibril might make some progress and block again, or it may complete its work. In the latter case, the final exit code is saved and the fibril is destroyed.
By moving asynchronous operations into a separate thread, Zach's patch simplifies their implementation considerably - with few exceptions, kernel code need not be changed at all to support asynchronous calls. Fibrils are meant to be significantly cheaper to create and switch between than kernel threads or ordinary processes, and their one-at-a-time semantics help to minimize the concurrency issues which might otherwise come up.
The user-space interface starts with a structure like this:
```c
struct asys_input {
    int            syscall_nr;
    unsigned long  cookie;
    unsigned long  nr_args;
    unsigned long  *args;
};
```
The application is expected to put the desired system call number in syscall_nr; the arguments to that system call are described by args and nr_args. The cookie value will be given back to the process when the operation completes. User space can create an array of these structures and pass them to:
```c
long asys_submit(struct asys_input *requests, unsigned long nr_requests);
```
The kernel will then start each of the requests in a fibril and return to user space. When the process develops an interest in the outcome of its requests, it uses this interface:
```c
struct asys_completion {
    long           return_code;
    unsigned long  cookie;
};

long asys_await_completion(struct asys_completion *comp);
```
A call to asys_await_completion() will block until at least one asynchronous operation has completed, then return the result in the structure pointed to by comp. The cookie value given at submission time is returned as well.
Your editor notes that the current asys_await_completion() implementation does not check to see if any asynchronous operations are outstanding; if none are, the call is liable to wait for a long time. There are a number of other issues with the patch set, all acknowledged by its author. For example, little thought has been given to how fibrils should respond to signals. Zach's purpose was not to present a completed work; instead, he wants to get the idea out there and see what people think of it.
Linus likes the idea:
> I heartily approve, although I only gave the actual patches a very cursory glance. I think the approach is the proper one, but the devil is in the details. It might be that the stack allocation overhead or some other subtle fundamental problem ends up making this impractical in the end, but I would _really_ like for this to basically go in.
There are a lot of details - Linus noted that there is no limit on how many fibrils a process can create, for example - but this seems to be the way that he would like to see AIO implemented. He suggests that fibrils might be useful in the kevent code as well.
On the other hand, Ingo Molnar is opposed to the fibril approach; his argument is long but worth reading. In Ingo's view, there are only two solutions to any operating system problem which are of interest: (1) the one which is easiest to program with, and (2) the one that performs the best. In the I/O space, he claims, the easiest approach is synchronous I/O calls and user-space processes. The fastest approach will be "a pure, minimal state machine" optimized for the specific task; his Tux web server is given as an example.
According to Ingo, the fibril approach serves neither goal. He claims that Linux is already fast enough at switching between ordinary processes that the advantage offered by fibrils is minimal at best, and not worth their cost. Anybody wanting top performance will still have to face the full kernel AIO state machine. So, he says, there is no real advantage to fibrils that is worth the cost of complicating the scheduler and moving away from the 1:1 thread model.
These patches are in an early stage, and this story will clearly take some
time to play out. Even if a consensus develops in favor of the fibril
idea, the process of turning them into a proper, robust kernel feature
could make them too expensive to be worthwhile. But it's an interesting
idea which brings a much-needed fresh look at how the kernel does AIO; it's
hard to complain too much about that.
| Index entries for this article | |
|---|---|
| Kernel | Asynchronous I/O |
| Kernel | Fibrils |
