
They're just guessing


Posted Sep 2, 2006 9:30 UTC (Sat) by slamb (guest, #1070)
In reply to: Kevents and review of new APIs by pphaneuf
Parent article: Kevents and review of new APIs

The exact issue that was raised by epoll in Ulrich's paper was the overhead of registering a file descriptor with the epoll_ctl() call before getting the events, and I was wondering what would he have otherwise. Just getting all the events would be highly inefficient.

I haven't seen this paper (got a link?), but I'd say there are three options:

  1. make assumptions - like that because read() returned EWOULDBLOCK you want to know when it next becomes available for read
  2. abandon level-driven polling. Edge-driven polling should let you set your notification preferences to READ|WRITE and leave it there, even if it's available and you don't currently want to consume it.
  3. accept a list of changes at the same time as the blocking call. Of course, this is the BSD way, so the Linux people have to do something different.
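
A minimal sketch of option 2, assuming Linux and Python's select.epoll wrapper (the function name edge_triggered_demo is made up for illustration). The descriptor is registered once for READ|WRITE in edge-triggered mode and the registration is never modified afterwards:

```python
import select
import socket


def edge_triggered_demo():
    """Register once for read and write with EPOLLET; never call modify()."""
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    ep = select.epoll()
    observed = []
    try:
        # The only registration call in the whole example.
        ep.register(a.fileno(),
                    select.EPOLLIN | select.EPOLLOUT | select.EPOLLET)

        # First poll: the socket is writable, so that state is reported once.
        observed.append(dict(ep.poll(0)).get(a.fileno(), 0))

        # Second poll: still writable, but nothing changed, so no event.
        observed.append(dict(ep.poll(0)).get(a.fileno(), 0))

        # Data arriving from the peer is a new edge and is reported.
        b.send(b"x")
        observed.append(dict(ep.poll(0)).get(a.fileno(), 0))
    finally:
        ep.close()
        a.close()
        b.close()
    return observed
```

The second poll coming back empty is the point: the application can leave write interest set without being woken repeatedly for a socket it has nothing to write to.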

This article makes it painfully obvious to me that the Linux developers as a whole are just guessing. They're going back to a kevent-like system after mocking it when creating epoll. Well, now they're finding that the complexity of those other event types is worth it, and that their system call overhead is too high. Probably should have listened the first time. If the sheer number of system calls is the problem, it's obvious that in level-driven notification applications, the FreeBSD approach of passing in all your change notifications at the same time as blocking is better than the unnecessary system calls of epoll_ctl. (Do the Linux people only care about edge-driven stuff? Perhaps that's reasonable, but I don't see it stated anywhere.)
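
The system call arithmetic behind that comparison can be sketched as below. This is a counting model, not a benchmark, and the function name count_syscalls is made up. It assumes a level-driven application that pays one epoll_ctl() per interest change plus one epoll_wait() per blocking wait, and that all pending changes fit in the changelist of a single kevent() call:

```python
def count_syscalls(changes_before_each_wait):
    """Count system calls for a level-driven event loop.

    changes_before_each_wait: one integer per loop iteration, giving how
    many interest changes (add, modify, remove) the application makes
    before it blocks again.
    Returns (epoll_style, kevent_style).
    """
    # epoll: every change is its own epoll_ctl(), then one epoll_wait().
    epoll_style = sum(n + 1 for n in changes_before_each_wait)
    # kevent: the changes are passed in the same call that blocks
    # (assumption: they all fit in one changelist).
    kevent_style = len(changes_before_each_wait)
    return epoll_style, kevent_style
```

For example, count_syscalls([2, 0, 1]) gives (6, 3): three blocking waits either way, plus three extra epoll_ctl() calls in the epoll case.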

This bizarre extreme of trying to eliminate all system calls by using a ring buffer...well, I agree with your comment that it sounds exactly like the signal-based polling mistake, and your comment in an earlier thread that some sort of blocking call is clearly necessary. Maybe it is true that the copying of event buffers is significant, but I haven't seen benchmark numbers that demonstrate this is superior, so again it seems that they're just guessing. That's a poor reason for throwing out what someone has already done in favor of a much more convoluted and error-prone interface.

I'm glad to see Andrew Morton's voice of reason, both on needing a clear justification for going against the existing FreeBSD interface and on the documentation. The latter is a serious problem with Linux interfaces in general. Look at inotify - they have section 2 manual pages for the system calls but no section 4 manual page for the whole interface. That's worthless - the system calls are completely obvious; the section 4 manual page is needed to actually describe what the constants and structure elements mean, among other things.

if I write an asynchronous DNS resolver library, I should have a way to be notified when a file descriptor is ready or a timeout expires without having to cooperate with other code ... some libraries, like Qt, libevent and such can do that, but the big problem is that it's a very basic functionality, and it's worthless if it's not standard (if my library registers its events with Qt, but the main program uses libevent, nothing happens and my library never gets its events) ... the point here is to make one that will be good enough to be integrated as the Linux event API and be integrated in the glibc, so it can be relied upon.

I've always liked liboop for this purpose. It's more usable by libraries because it is general - you can plug it into Qt's event loop, glib's event loop, libevent, etc. You don't have to make the sort of assumptions you're talking about to use it. I'd strongly prefer a well-maintained, liboop-like library to one in glibc like you're talking about, largely because I would like my code to run on FreeBSD as well, and because it doesn't require waiting for the Qt and glib people to rebase their stuff on it. It doesn't even require other people to use it, though it'd sure be nice.
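
To make the adapter idea concrete, here is a rough sketch in Python. It is not liboop's actual API. The names EventSource, SelectorsSource and TinyResolver are all made up, and only read notifications are covered (no timeouts). The library is written against a small interface, and each event loop gets its own adapter:

```python
import selectors
import socket
from typing import Callable, Protocol


class EventSource(Protocol):
    """What a library needs from whatever event loop the program uses."""

    def on_readable(self, fd: int, callback: Callable[[int], None]) -> None: ...

    def cancel_readable(self, fd: int) -> None: ...


class SelectorsSource:
    """Adapter for the standard library's selectors module.

    A Qt, glib or libevent adapter would implement the same two methods.
    """

    def __init__(self):
        self._sel = selectors.DefaultSelector()

    def on_readable(self, fd, callback):
        self._sel.register(fd, selectors.EVENT_READ, callback)

    def cancel_readable(self, fd):
        self._sel.unregister(fd)

    def run_once(self, timeout=0):
        # Dispatch every ready descriptor to the callback stored with it.
        for key, _mask in self._sel.select(timeout):
            key.data(key.fd)

    def close(self):
        self._sel.close()


class TinyResolver:
    """Stand-in for a library (say, a DNS resolver) that owns a socket."""

    def __init__(self, source: EventSource, sock: socket.socket):
        self.replies = []
        self._source = source
        self._sock = sock
        source.on_readable(sock.fileno(), self._handle)

    def _handle(self, fd):
        self.replies.append(self._sock.recv(512))

    def close(self):
        self._source.cancel_readable(self._sock.fileno())
```

TinyResolver never learns which loop is running underneath it, so the main program can pick the loop without the library's cooperation.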


