
The return of kevent?

Posted May 11, 2007 9:45 UTC (Fri) by pphaneuf (guest, #23480)
In reply to: The return of kevent? by intgr
Parent article: The return of kevent?

Of course, the good old point about select/poll being O(N) in the number of file descriptors watched still applies. I used "load" in this context to mean "work to do", but I do use epoll for all servers on Linux (and kqueue on *BSD). I often end up needing a select/poll version as well, for portability to platforms that are not so well endowed.

I also know that the ring buffer involves fewer copies, but I maintain my point: the kernel needs to know how many events have been consumed by the application in order not to overwrite unread events, and this is done with a system call. Whether you make a system call to get the events that arrive or make one to tell the kernel that you have processed them, at the end of the day it's a system call either way.

Also, in order to do edge-triggered event notification (which I find can be useful to spread the load over multiple threads), the kernel can't just "forget about it"; it keeps some information on the side in the file descriptor structure. The ring buffer does save a copy, but as event sizes go, struct epoll_event isn't so bad (12 bytes), particularly compared with the work that will have to be done to process the events themselves.
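To make the edge-triggered behaviour concrete, here is a minimal sketch, assuming Linux with epoll_create1() available. The function name edge_triggered_demo is made up for illustration. It registers a pipe's read end with EPOLLET, writes one byte, and calls epoll_wait() twice without reading anything. The first call should report the event and the second should report nothing, because the kernel remembers that this edge has already been delivered.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns 0 on success and stores how many events
 * each of two back-to-back epoll_wait() calls reported; -1 on error. */
int edge_triggered_demo(int *first, int *second)
{
    int pipefd[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event out;
    int epfd, rc = -1;

    assert(first && second);
    if (pipe(pipefd) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1)
        goto close_pipe;

    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto close_all;

    /* One byte arrives: a single "became readable" edge. */
    if (write(pipefd[1], "x", 1) != 1)
        goto close_all;

    /* Timeout 0: just poll the ready list, never block. */
    *first = epoll_wait(epfd, &out, 1, 0);
    /* The byte is still unread, but no new edge has occurred. */
    *second = epoll_wait(epfd, &out, 1, 0);
    rc = (*first == -1 || *second == -1) ? -1 : 0;

close_all:
    close(epfd);
close_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return rc;
}
```

With level-triggered registration (dropping EPOLLET), the second call would report the descriptor again, since the unread byte keeps it readable.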

I know that the ring buffer can be much bigger than the signal queues were, but the point is that it still has a fixed size, so the overflow case has to be handled properly. epoll keeps the information in the file descriptor structures (where it has to be kept anyway, in addition to the event, as I described earlier), so there is no overflow case: if you could open the file descriptor in the first place, it's all good.

Among the other things punted to the application to manage, there's also the issue of closed file descriptors. If a file descriptor has an event, but is closed before the event is processed, and another connection is accepted (very likely to get the same file descriptor number), what happens?
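The "very likely to get the same file descriptor number" part is easy to check. The sketch below is only an illustration (fd_number_is_reused is a made-up name, and dup() stands in for accept()): POSIX hands out the lowest free descriptor number, so in a single-threaded program the number just closed is the next one given out.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a descriptor number is handed out
 * again right after being closed, 0 if not, -1 on error. */
int fd_number_is_reused(void)
{
    int p[2];
    int old_fd, new_fd, reused;

    if (pipe(p) == -1)
        return -1;
    old_fd = p[0];
    assert(old_fd >= 0);

    /* "Connection closed" before its pending event was processed. */
    close(old_fd);

    /* "New connection accepted": dup() stands in for accept() here and
     * receives the lowest free number, which is the one just closed. */
    new_fd = dup(p[1]);
    reused = (new_fd == old_fd);

    if (new_fd != -1)
        close(new_fd);
    close(p[1]);
    return reused;
}
```

An event queued for the old descriptor and identified only by its number would now appear to refer to the new one.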

Not to mention that with the kevent ring buffer, it's tricky to spread the load between multiple threads (as described in Ulrich's post that you linked to), whereas epoll nicely handles multiple threads calling epoll_wait() on the same epoll file descriptor...



The return of kevent?

Posted May 12, 2007 20:58 UTC (Sat) by intgr (subscriber, #39733)

I concur with all of your points.

> Also, in order to do edge-triggered event notification [...] the kernel can't just "forget about it"

It can forget about the events; naturally, events have side effects, and the kernel will have to keep track of the state of its objects. (Or am I missing something?)

> If a file descriptor has an event, but is closed before the event is processed, and another connection is accepted

Both APIs have an "opaque pointer" field in their event structures. Applications are supposed to use this for identifying clients, not file descriptor numbers.
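For the epoll side, the opaque field is the data union in struct epoll_event. Here is a rough sketch of using data.ptr to identify a client, assuming Linux; struct client and client_id_from_event are hypothetical names invented for this example, not part of any API.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical per-connection state owned by the application. */
struct client {
    int fd;
    int id;
};

/* Registers a pipe's read end with a pointer to a struct client, makes
 * it readable, and returns the id recovered from the delivered event
 * (or -1 on any failure). */
int client_id_from_event(int id)
{
    int pipefd[2];
    struct client c;
    struct epoll_event ev = { .events = EPOLLIN };
    struct epoll_event out;
    int epfd, result = -1;

    if (pipe(pipefd) == -1)
        return -1;
    c.fd = pipefd[0];
    c.id = id;

    epfd = epoll_create1(0);
    if (epfd == -1)
        goto close_pipe;

    /* data is a union: storing ptr means data.fd is not available,
     * so the fd lives inside the client structure instead. */
    ev.data.ptr = &c;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, c.fd, &ev) == -1)
        goto close_all;
    if (write(pipefd[1], "x", 1) != 1)
        goto close_all;

    if (epoll_wait(epfd, &out, 1, 0) == 1) {
        /* The kernel hands back exactly the pointer we registered. */
        struct client *got = out.data.ptr;
        assert(got == &c);
        result = got->id;
    }

close_all:
    close(epfd);
close_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return result;
}
```

Note that the pointer is only as good as the application's bookkeeping: the kernel never interprets it, so freeing the structure while events may still be pending is the application's problem to avoid.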

The return of kevent?

Posted May 12, 2007 21:34 UTC (Sat) by pphaneuf (guest, #23480)

If a file descriptor becomes readable, then stops being readable, then becomes readable again, all without the event queue being looked at, should you get two events? With epoll, you get only one (you are only told again that the file descriptor is readable after you have picked up the previous notification).

Of course, it could go "on the cheap" and let userspace figure it out. But since it's so handy to just have this one bit in the file descriptor structure (which is really the "how many bytes in the appropriate buffer", which you really have to have, interpreted as a bool), why not?
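The "only one event" behaviour can be checked with a small sketch like the following, assuming Linux. The name events_after_two_edges is made up for illustration. It makes a pipe readable, drains it, makes it readable again, and only then asks epoll how many events are pending.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of events epoll reports after
 * a descriptor became readable twice before the queue was looked at,
 * or -1 on error. */
int events_after_two_edges(void)
{
    int pipefd[2];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    struct epoll_event out[8];
    char byte;
    int epfd, n = -1;

    if (pipe(pipefd) == -1)
        return -1;
    epfd = epoll_create1(0);
    if (epfd == -1)
        goto close_pipe;

    ev.data.fd = pipefd[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, pipefd[0], &ev) == -1)
        goto close_all;

    /* Readable ... */
    if (write(pipefd[1], "a", 1) != 1)
        goto close_all;
    /* ... not readable any more ... */
    if (read(pipefd[0], &byte, 1) != 1)
        goto close_all;
    assert(byte == 'a');
    /* ... and readable again, with no epoll_wait() in between. */
    if (write(pipefd[1], "b", 1) != 1)
        goto close_all;

    /* Room for 8 events, timeout 0: report whatever is queued now. */
    n = epoll_wait(epfd, out, 8, 0);

close_all:
    close(epfd);
close_pipe:
    close(pipefd[0]);
    close(pipefd[1]);
    return n;
}
```

Both edges collapse into a single ready entry for the descriptor, which is why there is nothing that can overflow.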

Applications don't really get told that they are supposed to use that. The file descriptor number really is the proper identifier, as far as the kernel is concerned. Note that all the other APIs can support that without a problem (none of select/poll/epoll ever gives you bad information like that). Having a pointer is just to be helpful (and it is, quite!).

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds