
Re: [patch 14/22] pollfs: pollable futex

From:  Davide Libenzi
To:  Ulrich Drepper
Subject:  Re: [patch 14/22] pollfs: pollable futex
Date:  Thu, 3 May 2007 11:24:48 -0700 (PDT)
Cc:  Davi Arnaut, Eric Dumazet, Andrew Morton, Linus Torvalds, Linux Kernel Mailing List

I thought you were talking about the poll/epoll interface in general, and 
the approach to extending it for the very few cases that people ask for, 
but I see we're focusing on futexes ...

On Thu, 3 May 2007, Ulrich Drepper wrote:

> On 5/2/07, Davide Libenzi wrote:
> > 99% of the fds you'll find inside an event loop you care to scale about,
> > are *already* fd based.
> You are missing the point.  To get acceptable behavior of the wakeup
> it is necessary with this approach to open one descriptor _per thread_
> for a futex.  Otherwise all threads get woken upon FUTEX_WAKE.
> This also means you need individual epoll sets for each thread.  You
> cannot share them anymore among all the threads in the process.

I'm not sure if futexes are the best approach to do that, but a way for 
the user to signal an event into a main event loop is needed.

> > On top of that, those fds are very cheap in terms of memory
> They might be when they are counted in dozens.  But here we are
> talking about the possible need to use thousands of additional file
> descriptors.  If they are so cheap to allow thousands of descriptors
> with ease, why would the rlimit for files default to a small number
> (1024 on Fedora right now)?

Right now, people do that using pipes. That costs 2 file descriptors and at 
least 4KB of kernel data (plus an inode, a dentry and a file), just 
to have a way to signal an event loop dispatcher. The patches I posted 
a few weeks ago introduce an eventfd, which reduces the amount of kernel 
memory to basically a dentry and a file (plus it uses only one file 
descriptor), and it's 2-3 times faster than pipes. The cost of that is about 
200 lines of code in fs/eventfd.c.
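
To make the comparison concrete, here is a minimal sketch of signaling an 
epoll-based loop through a single eventfd instead of a pipe. It assumes the 
eventfd(2) interface as exposed by present-day glibc in <sys/eventfd.h>, 
which may differ from the patch set discussed here. The helper names 
loop_init(), loop_signal() and loop_wait() are hypothetical and used only 
for illustration.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create an epoll set that watches one eventfd.
 * Returns 0 on success, -1 on failure. */
static int loop_init(int *epfd, int *evfd)
{
    struct epoll_event ev = { .events = EPOLLIN };

    *evfd = eventfd(0, EFD_NONBLOCK);   /* one fd, counter starts at 0 */
    if (*evfd < 0)
        return -1;

    *epfd = epoll_create1(0);
    if (*epfd < 0) {
        close(*evfd);
        return -1;
    }

    ev.data.fd = *evfd;
    if (epoll_ctl(*epfd, EPOLL_CTL_ADD, *evfd, &ev) < 0) {
        close(*epfd);
        close(*evfd);
        return -1;
    }
    return 0;
}

/* Hypothetical helper: callable from any thread. Adds 1 to the eventfd
 * counter, which makes the descriptor readable for the event loop. */
static int loop_signal(int evfd)
{
    uint64_t one = 1;

    return write(evfd, &one, sizeof(one)) == (ssize_t)sizeof(one) ? 0 : -1;
}

/* Hypothetical helper: wait up to timeout_ms for a signal. Returns the
 * number of signals accumulated since the last read, 0 on timeout,
 * or -1 on error. Reading resets the counter to 0. */
static int64_t loop_wait(int epfd, int evfd, int timeout_ms)
{
    struct epoll_event ev;
    uint64_t count = 0;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);

    if (n < 0)
        return -1;
    if (n == 0)
        return 0;
    if (read(evfd, &count, sizeof(count)) != (ssize_t)sizeof(count))
        return -1;
    return (int64_t)count;
}
```

With a pipe, the same pattern needs two descriptors (a read end in the epoll 
set and a write end for the signaling side); here both roles share one 
descriptor, and several signals issued before the loop wakes up are folded 
into a single counter value.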

> > And this approach is not bound to a completely new and monolithic interface.
> So?  It's still additional, new code for an approach which will have to
> be superseded real soon.  That's just pure overhead to me.

IMO it is better to leave futexes alone. They are great for synchronizing 
MT apps, but do not properly fit an fd-based solution. For that, something 
like eventfd is enough.

- Davide

