
EPOLL_CTL_DISABLE and multithreaded applications


Posted Oct 27, 2012 14:16 UTC (Sat) by runciter (guest, #75370)
Parent article: EPOLL_CTL_DISABLE and multithreaded applications

This is nonsense. The deleting thread should just mark the cache data for that fd as "ready for deletion" and interrupt the epoll_wait (using a write to a pipe monitored by epoll, for example). The thread doing epoll_wait() can then synchronously release the resources. You'll need a mutex for the "ready-for-deletion" flag, but you need it for the "exists" or "ready" flags anyway. It's just a matter of checking the flags: the deleting thread checks "ready" before deleting; the epoll_wait() thread checks "ready for deletion" before updating "ready". With a mutex in place there is no race.
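A minimal sketch of the scheme described above, assuming a single thread calls epoll_wait() (here `select.epoll.poll()`). The names `Entry`, `Poller`, `request_delete` and `poll_once` are hypothetical, and refusing deletion while "ready" is set is one reading of "checks 'ready' before deleting", not something the comment spells out.

```python
import os
import select
import socket
import threading


class Entry:
    """Hypothetical userspace cache entry for one monitored fd."""

    def __init__(self, sock):
        self.sock = sock
        self.ready = False            # set by the epoll_wait() thread
        self.pending_delete = False   # the "ready for deletion" flag


class Poller:
    def __init__(self):
        self.ep = select.epoll()
        self.lock = threading.Lock()  # guards the table and every flag
        self.entries = {}             # fd -> Entry
        # Wakeup pipe monitored by epoll, as suggested in the comment.
        self.wake_r, self.wake_w = os.pipe()
        os.set_blocking(self.wake_r, False)
        self.ep.register(self.wake_r, select.EPOLLIN)

    def add(self, sock):
        with self.lock:
            self.entries[sock.fileno()] = Entry(sock)
        self.ep.register(sock.fileno(), select.EPOLLIN)

    def request_delete(self, fd):
        """Deleting thread: mark the entry and wake the poller, free nothing."""
        with self.lock:
            entry = self.entries.get(fd)
            if entry is None or entry.ready:
                return False          # unknown fd, or events still pending
            entry.pending_delete = True
        os.write(self.wake_w, b"x")   # interrupt the epoll_wait()
        return True

    def poll_once(self, timeout=1.0):
        """epoll_wait() thread: update flags, then release marked entries."""
        released = []
        events = self.ep.poll(timeout)
        with self.lock:
            for fd, _mask in events:
                if fd == self.wake_r:
                    try:
                        os.read(self.wake_r, 4096)   # drain the wakeup bytes
                    except BlockingIOError:
                        pass
                    continue
                entry = self.entries.get(fd)
                # Check "ready for deletion" before updating "ready".
                if entry is not None and not entry.pending_delete:
                    entry.ready = True
            for fd, entry in list(self.entries.items()):
                if entry.pending_delete:
                    self.ep.unregister(fd)
                    entry.sock.close()
                    del self.entries[fd]
                    released.append(fd)
        return released
```

With one polling thread, all resource release happens in `poll_once()`, so nothing is freed while that thread is between waking up and handling its events. The sketch does not cover several threads calling epoll_wait() on the same epoll instance.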

I don't get the point about losing data at all. You've decided to destroy the userspace cache entry *first*, before epoll_ctl() returned. Data will be lost either way.



EPOLL_CTL_DISABLE and multithreaded applications

Posted Oct 28, 2012 14:49 UTC (Sun) by kjp (subscriber, #39639)

It sounds like the issue is a timeout case. The diagram shows one thread seeing the file descriptor as not ready (no events) and deciding to delete it. But then an event for it suddenly comes in and starts processing on another thread. I don't see how your solution addresses that. Your pipe wakeup could happen at the same time as a 'real' socket wakeup event.
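To illustrate the last point, here is a small sketch (the function name and setup are hypothetical) in which a wakeup-pipe write and a real socket event are both pending when the poll call is made. A single wait then reports both, so the waking thread cannot assume it was woken only for the deletion request.

```python
import os
import select
import socket


def poll_with_both_pending():
    """Return True if one poll() call reports the pipe and the socket together."""
    ep = select.epoll()
    wake_r, wake_w = os.pipe()
    a, b = socket.socketpair()
    ep.register(wake_r, select.EPOLLIN)
    ep.register(a.fileno(), select.EPOLLIN)

    os.write(wake_w, b"x")   # the "please delete" wakeup
    b.send(b"data")          # a real event on the socket being deleted

    ready = {fd for fd, _mask in ep.poll(1.0)}
    both = ready == {wake_r, a.fileno()}

    for fd in (wake_r, wake_w):
        os.close(fd)
    a.close()
    b.close()
    ep.close()
    return both
```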

EPOLL_CTL_DISABLE and multithreaded applications

Posted Oct 28, 2012 17:59 UTC (Sun) by kjp (subscriber, #39639)

My comment was imprecise at best. I'll clarify what I think you are doing:

Thread 1 decides the fd is no longer needed, due to no events
Thread 2 gets a wakeup for a real event, but is then scheduled out and does not progress
Thread 1 deletes the socket from epoll, marks fd as needing deletion, and signals via a pipe.
Another thread (thread 3) then reads the pipe and deletes the fd

That does nothing to address the race with thread 2. There's still a race; all you've added is the essence of a sleep(), which delays things (the article mentioned this as the solution of adding an arbitrary delay).
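The interleaving above can be replayed deterministically in a single thread, which makes the stale reference visible without relying on scheduler timing. This is only a sketch: `replay_interleaving` and the dictionary-based cache are hypothetical, and the "threads" are just sequential steps.

```python
import select
import socket


def replay_interleaving():
    """Replay the four steps in order; return (fd, fds thread 2 finds stale)."""
    ep = select.epoll()
    a, b = socket.socketpair()
    fd = a.fileno()
    cache = {fd: {"sock": a, "pending_delete": False}}  # userspace cache
    ep.register(fd, select.EPOLLIN)

    # Thread 1 has seen no events and decides the fd is no longer needed.
    # Thread 2: epoll_wait() returns a real event, then it is scheduled out
    # while still holding the returned event list.
    b.send(b"late data")
    delivered = ep.poll(1.0)

    # Thread 1: remove the fd from epoll and mark it for deletion.
    ep.unregister(fd)
    cache[fd]["pending_delete"] = True

    # Thread 3: woken via the pipe, releases the cache entry.
    cache.pop(fd)["sock"].close()

    # Thread 2 resumes: its event refers to an entry that no longer exists.
    stale = [f for f, _mask in delivered if f not in cache]

    b.close()
    ep.close()
    return fd, stale
```

The event was handed to userspace before the fd was removed from the epoll set, so removing it afterwards cannot take it back.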

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds