Kevents and review of new APIs
Posted Sep 1, 2006 12:20 UTC (Fri) by pphaneuf
In reply to: Kevents and review of new APIs
Parent article: Kevents and review of new APIs
I think you misunderstood what kevent is for. kevent isn't concerned with all sorts of events, but rather with the specific types of events that can wake a process up. Just about all of those are file descriptor events, owing to the Unix design ("everything is a file", or close enough).
The issue Ulrich's paper raised about epoll was the overhead of registering a file descriptor with epoll_ctl() before getting its events, and I was wondering what he would have instead. Just getting all the events, with no registration, would be highly inefficient.
To deal with your reply point by point: Linux-AIO uses a single file descriptor to get notifications on all the operations it does (reads and writes, on all other file descriptors, be they files, sockets or other). This file descriptor can most likely be put in an epoll interest set. An auditing system (such as already exists in the form of Dazuko, inotify and such) would most likely deliver its events on a file descriptor (which you can put in epoll to get notified when those events arrive). Web, news and IMAP servers could use Linux-AIO (covered above), but normal filesystem-based file descriptors are "always readable", even when a read would actually block, so you usually don't want to use them in an event mechanism like kevent or epoll (being always "ready", they make your application busy-spin, eating 100% CPU).
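The "always readable" behaviour is easy to see with a few lines of C. This is only a sketch: the helper names readable_now() and path_readable_now() are made up for illustration, and it uses poll() rather than epoll because, on Linux, epoll_ctl() refuses regular files outright with EPERM.

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper: 1 if poll() reports fd readable right now,
 * 0 if it does not, -1 on error. Never blocks (timeout of 0). */
static int readable_now(int fd)
{
    struct pollfd p;
    int n;

    assert(fd >= 0);
    p.fd = fd;
    p.events = POLLIN;
    p.revents = 0;

    n = poll(&p, 1, 0);
    if (n < 0)
        return -1;
    return (n == 1 && (p.revents & POLLIN)) ? 1 : 0;
}

/* Hypothetical helper: same check for a path, opened read-only. */
static int path_readable_now(const char *path)
{
    int fd = open(path, O_RDONLY);
    int r;

    if (fd < 0)
        return -1;
    r = readable_now(fd);
    close(fd);
    return r;
}
```

An empty pipe reports "not readable" until something is written to it, while a regular file reports "readable" no matter what, which is exactly why waiting on one tells you nothing.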
Processes communicating bulk data through shared memory often use a Unix domain socket to notify the other process that it should get the data. X11's MIT-SHM extension does this, for example, but simpler systems that just write a single byte (enough to make the file descriptor go "readable") are also seen. Unix domain sockets involve more copies for bulk data, but writing a single byte to wake the other process up is very cheap. If the notification is one-way only, a pipe is enough.
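The single-byte wake-up could look roughly like this. It is a minimal sketch of the one-way pipe case only: the shared memory side is left out, and watch_fd(), notify_peer() and wait_for_peer() are names invented here, not part of any existing API.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Consumer setup: put the read end of the pipe in the epoll interest set. */
static int watch_fd(int epfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Producer: after placing data in shared memory (not shown), write one
 * byte so the consumer's descriptor goes "readable". Returns 0 or -1. */
static int notify_peer(int write_fd)
{
    const char token = 1;
    ssize_t n;

    do {
        n = write(write_fd, &token, 1);
    } while (n < 0 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

/* Consumer: wait up to timeout_ms for the wake-up byte and consume it.
 * Returns 1 if woken, 0 on timeout, -1 on error. Assumes read_fd is the
 * only descriptor registered with epfd. */
static int wait_for_peer(int epfd, int read_fd, int timeout_ms)
{
    struct epoll_event ev;
    char token;
    int n;

    do {
        n = epoll_wait(epfd, &ev, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;

    assert(ev.data.fd == read_fd);
    return read(read_fd, &token, 1) == 1 ? 1 : -1;
}
```

The byte carries no information; it only exists to make the descriptor readable, so the consumer throws it away and then looks at the shared memory.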
You also missed a few other interesting cases: central processing of signals and of timeouts are two. Signals can also be handled with the "single byte written on a pipe" trick, done from the signal handler, deferring the real work to the other end of the pipe. Timeouts can be handled with, well, the timeout parameter of epoll_wait(), of course.
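Here is roughly how the signal and timeout cases might be combined. Treat it as a sketch under assumptions: the names sig_pipe, install_signal_pipe() and wait_signal_or_timeout() are invented for illustration, it supports a single pipe per process, and it assumes the signal pipe is the only descriptor in the epoll set.

```c
#define _POSIX_C_SOURCE 200809L  /* for sigaction() under strict -std modes */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

static int sig_pipe[2] = { -1, -1 };  /* [0] read end, [1] write end */

/* Signal handler: only async-signal-safe calls. The signal number is
 * written as one byte; if the pipe is full the byte is simply dropped. */
static void on_signal(int signo)
{
    int saved_errno = errno;
    unsigned char b = (unsigned char)signo;
    ssize_t r = write(sig_pipe[1], &b, 1);

    (void)r;
    errno = saved_errno;
}

/* Create the pipe on first use, add its read end to epfd, and route
 * signo through it. Returns 0 on success, -1 on error. */
static int install_signal_pipe(int epfd, int signo)
{
    struct sigaction sa;

    assert(signo > 0 && signo < 256);  /* must fit in the one-byte token */

    if (sig_pipe[0] < 0) {
        struct epoll_event ev;

        if (pipe(sig_pipe) < 0)
            return -1;
        for (int i = 0; i < 2; i++) {
            int fl = fcntl(sig_pipe[i], F_GETFL);
            if (fl < 0 || fcntl(sig_pipe[i], F_SETFL, fl | O_NONBLOCK) < 0)
                return -1;
        }
        memset(&ev, 0, sizeof ev);
        ev.events = EPOLLIN;
        ev.data.fd = sig_pipe[0];
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, sig_pipe[0], &ev) < 0)
            return -1;
    }

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Returns the signal number that arrived, 0 if timeout_ms expired first,
 * -1 on error. Retrying after EINTR restarts the full timeout, which is
 * good enough for a sketch. */
static int wait_signal_or_timeout(int epfd, int timeout_ms)
{
    struct epoll_event ev;
    unsigned char b;
    int n;

    do {
        n = epoll_wait(epfd, &ev, 1, timeout_ms);
    } while (n < 0 && errno == EINTR);
    if (n <= 0)
        return n;

    if (ev.data.fd == sig_pipe[0] && read(sig_pipe[0], &b, 1) == 1)
        return b;
    return -1;
}
```

The handler does nothing but write the byte; everything that is unsafe to do in signal context happens later, in ordinary code, once epoll_wait() reports the pipe as readable.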
The main problem I have with epoll is still that it doesn't centralise event dispatching for libraries. A new API should let code register a callback function to be called when events arrive, without needing cooperation between unrelated pieces of code. For example, if I write an asynchronous DNS resolver library, I should have a way to be notified when a file descriptor is ready or a timeout expires without having to cooperate with other code. Right now, code in a library has to provide a way to tell whatever code will be calling epoll_wait() that it has a specific timeout, or that an event on a certain file descriptor should be passed on to it.
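To make the idea concrete, here is what callback registration layered on epoll might look like. None of this is an existing kernel or glibc interface: ev_callback, struct ev_handler, ev_register() and ev_dispatch_once() are hypothetical names, and timeouts per handler are left out to keep the sketch short.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical callback type: called with the ready fd, the epoll event
 * mask, and the opaque pointer given at registration. */
typedef void (*ev_callback)(int fd, uint32_t events, void *user);

struct ev_handler {
    int fd;
    ev_callback cb;
    void *user;
};

/* Register a handler. The caller owns *h, which must stay valid for as
 * long as the fd remains in the epoll set. */
static int ev_register(int epfd, struct ev_handler *h, uint32_t events)
{
    struct epoll_event ev;

    assert(h != NULL && h->cb != NULL);
    memset(&ev, 0, sizeof ev);
    ev.events = events;
    ev.data.ptr = h;  /* epoll hands this pointer back with each event */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, h->fd, &ev);
}

/* One pass of the central loop: wait, then call whoever registered.
 * Returns the number of callbacks invoked, or -1 on error. */
static int ev_dispatch_once(int epfd, int timeout_ms)
{
    struct epoll_event evs[16];
    int n;

    do {
        n = epoll_wait(epfd, evs, 16, timeout_ms);
    } while (n < 0 && errno == EINTR);

    for (int i = 0; i < n; i++) {
        struct ev_handler *h = evs[i].data.ptr;
        h->cb(h->fd, evs[i].events, h->user);
    }
    return n;
}

/* Example callback a library might register: drain the descriptor and
 * count the wake-up in the int that `user` points to. */
static void count_wakeups(int fd, uint32_t events, void *user)
{
    char buf[64];

    if (events & EPOLLIN) {
        ssize_t r = read(fd, buf, sizeof buf);
        (void)r;
    }
    ++*(int *)user;
}
```

The code calling ev_dispatch_once() never needs to know which library owns which descriptor; the catch, of course, is that this only helps if every library in the process agrees on the same registration call.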
Some libraries, like Qt, libevent and such, can do that, but the big problem is that this is very basic functionality, and it's worthless if it's not standard (if my library registers its events with Qt, but the main program uses libevent, nothing happens and my library never gets its events). These libraries already do a good job, but the point here is to make one that is good enough to become the Linux event API and be integrated into glibc, so it can be relied upon.