The return of kevent?
Posted May 11, 2007 9:45 UTC (Fri) by pphaneuf
In reply to: The return of kevent?
Parent article: The return of kevent?
Of course, good old select/poll being O(N) in the number of file descriptors watched still applies. I used "load" in this context to mean "work to do", but I do use epoll for all servers on Linux (and kqueue on *BSD). I often end up needing a select/poll version as well, for portability to platforms not so well endowed.
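To make the O(N) point concrete, here is a minimal sketch in C, assuming Linux and a set of pipes standing in for sockets. The names N_PIPES, struct cost and measure() are made up for this illustration. It only counts the entries the application has to look at after the call returns; the kernel-side scan that poll() does on every call is not measured here.

```c
#include <assert.h>
#include <poll.h>
#include <sys/epoll.h>
#include <unistd.h>

#define N_PIPES 32   /* hypothetical number of watched descriptors */

/* Entries the application examines to find the one readable pipe. */
struct cost { int poll_entries_scanned; int epoll_entries_scanned; };

/* Watches N_PIPES pipes with both poll() and epoll, makes pipe `which`
 * readable, and records the scan cost of each interface.
 * Returns 0 on success, -1 on any failure. */
int measure(int which, struct cost *out)
{
    assert(which >= 0 && which < N_PIPES);
    int rd[N_PIPES], wr[N_PIPES], made = 0, ok = 1, rc = -1;
    struct pollfd pfd[N_PIPES];
    int ep = epoll_create(N_PIPES);
    if (ep < 0)
        return -1;

    while (ok && made < N_PIPES) {
        int p[2];
        if (pipe(p) != 0) { ok = 0; break; }
        rd[made] = p[0];
        wr[made] = p[1];
        pfd[made] = (struct pollfd){ .fd = p[0], .events = POLLIN };
        struct epoll_event ev = { .events = EPOLLIN, .data.fd = p[0] };
        made++;
        if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) != 0)
            ok = 0;
    }

    if (ok && write(wr[which], "x", 1) == 1) {
        struct epoll_event evs[N_PIPES];
        int n_poll = poll(pfd, N_PIPES, 0);
        int n_epoll = epoll_wait(ep, evs, N_PIPES, 0);
        if (n_poll == 1 && n_epoll == 1 && evs[0].data.fd == rd[which]) {
            /* poll(): walk the array until the entry with revents set. */
            int i = 0;
            while (!(pfd[i].revents & POLLIN))
                i++;
            out->poll_entries_scanned = i + 1;
            /* epoll: only ready descriptors are returned at all. */
            out->epoll_entries_scanned = n_epoll;
            rc = 0;
        }
    }

    for (int i = 0; i < made; i++) { close(rd[i]); close(wr[i]); }
    close(ep);
    return rc;
}
```

Calling measure(N_PIPES - 1, &c) is the worst case for the poll() side: the ready descriptor is the last entry in the array, while epoll_wait() hands back a single event.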
I also know that the ring buffer involves fewer copies, but I maintain my point: the kernel needs to know how many events the application has consumed in order not to overwrite unread events, and this is done with a system call. Whether you make a system call to get the events that arrived, or make a system call to tell the kernel that you have processed the events, at the end of the day it's a system call either way.
Also, in order to do edge-triggered event notification (which I find can be useful to spread the load over multiple threads), the kernel can't just "forget about it"; it keeps some information on the side in the file descriptor structure. The ring buffer does save a copy, but as event sizes go, struct epoll_event isn't so bad (12 bytes on x86), particularly compared with the work that will have to be done to process the events themselves.
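A small sketch of what edge-triggered means in practice, assuming Linux and using a pipe in place of a socket. count_reports() is a name invented for this example. It polls twice without reading the pending byte: level-triggered mode reports the descriptor both times, edge-triggered mode only once, which is exactly the bit of state the kernel has to remember per descriptor.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Returns how many times epoll_wait() reports the pipe as readable across
 * two consecutive non-blocking polls, without reading the data in between.
 * Pass EPOLLIN for level-triggered, EPOLLIN | EPOLLET for edge-triggered.
 * Returns -1 on setup failure. */
int count_reports(unsigned int flags)
{
    assert(flags & EPOLLIN);

    int p[2];
    if (pipe(p) != 0)
        return -1;

    int ep = epoll_create(1);
    if (ep < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    struct epoll_event ev = { .events = flags, .data.fd = p[0] };
    int reports = -1;

    if (epoll_ctl(ep, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], "x", 1) == 1) {
        struct epoll_event out;
        reports = 0;
        for (int i = 0; i < 2; i++) {
            /* Timeout 0: return immediately whether or not anything is ready. */
            int n = epoll_wait(ep, &out, 1, 0);
            if (n > 0)
                reports += n;
        }
    }

    close(ep);
    close(p[0]);
    close(p[1]);
    return reports;
}
```

With EPOLLIN alone the unread byte is reported on both polls; with EPOLLIN | EPOLLET the second poll comes back empty until new data arrives.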
I know that the ring buffer can be much bigger than the signal queues were, but the point is that it has a fixed size, and thus the overflow case has to be managed properly. epoll keeps the information in the file descriptor structures (where it has to be kept anyway, in addition to the event, as I described earlier), so there is no overflow case: if you could open the file descriptor in the first place, it's all good.
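The difference can be modelled in a few lines of user-space C. This is a toy model only, not the kevent or epoll implementation: RING_SLOTS, MAX_FDS, struct ring and the three functions are all hypothetical, and the "ring" is reduced to a bounded queue with no consumer. It shows why a fixed-size queue needs an overflow path while per-descriptor state is bounded by the number of descriptors.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SLOTS 4    /* hypothetical fixed ring capacity */
#define MAX_FDS    16   /* hypothetical number of open descriptors */

/* Bounded queue standing in for the event ring. */
struct ring { int fd[RING_SLOTS]; size_t used; };

/* Fixed-size queue: a push fails once every slot holds an unread event. */
static bool ring_push(struct ring *r, int fd)
{
    if (r->used == RING_SLOTS)
        return false;            /* overflow: the caller needs a fallback */
    r->fd[r->used++] = fd;
    return true;
}

/* Per-descriptor state: marking is idempotent and bounded by MAX_FDS. */
static void mark_ready(bool ready[MAX_FDS], int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    ready[fd] = true;
}

/* Delivers one event for each of n_fds descriptors with no consumer running.
 * Returns how many events the ring could not hold; *flagged receives how
 * many descriptors the per-descriptor table recorded. */
size_t deliver_burst(size_t n_fds, size_t *flagged)
{
    struct ring r = { .used = 0 };
    bool ready[MAX_FDS] = { false };
    size_t overflowed = 0;

    assert(n_fds <= MAX_FDS);
    for (size_t fd = 0; fd < n_fds; fd++) {
        if (!ring_push(&r, (int)fd))
            overflowed++;
        mark_ready(ready, (int)fd);
    }

    *flagged = 0;
    for (size_t fd = 0; fd < MAX_FDS; fd++)
        *flagged += ready[fd];
    return overflowed;
}
```

A burst of 10 events against 4 slots leaves 6 events with nowhere to go in the queue, while the per-descriptor table records all 10 without any special case.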
Note that among the other things punted to the application to manage, there's also the issue of closed file descriptors. If a file descriptor has an event, but is closed before the event is processed, and another connection is accepted (very likely getting the same file descriptor number), what happens?
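The "very likely to get the same number" part is easy to check, since POSIX hands out the lowest unused descriptor number. A minimal sketch, assuming a single-threaded process, with dup() standing in for accept() so that no network setup is needed; fd_number_is_reused() is a name made up for this example.

```c
#include <assert.h>
#include <unistd.h>

/* Closes a descriptor and immediately creates a new one.
 * Returns 1 if the new descriptor got the same number as the closed one,
 * 0 if it did not, and -1 on error. */
int fd_number_is_reused(void)
{
    int a[2];
    if (pipe(a) != 0)
        return -1;

    int stale = a[0];       /* pretend an event for this fd is still queued */
    close(stale);

    int fresh = dup(a[1]);  /* stands in for accept() returning a new connection */
    if (fresh < 0) {
        close(a[1]);
        return -1;
    }

    int reused = (fresh == stale);
    assert(fresh != a[1]);  /* a[1] is still open, so dup() cannot return it */

    close(fresh);
    close(a[1]);
    return reused;
}
```

An event that was queued for the old descriptor and only carries the number would now appear to belong to the new, unrelated connection.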
Not to mention that with the kevent ring buffer, it's tricky to spread the load between multiple threads (as described in Ulrich's post that you linked to), where epoll manages multiple threads going in epoll_wait() on the same epoll file descriptor nicely...