Unioning file systems: Architecture, features, and design choices

Posted Mar 20, 2009 22:58 UTC (Fri) by nix (subscriber, #2304)
Parent article: Unioning file systems: Architecture, features, and design choices

Can someone explain *why* seekdir() is so intrinsically hard?

I mean, POSIX is quite happy for all files created after opendir() to not
be reflected in the output from that DIR handle... so why doesn't glibc
simply remember everything it's been given from getdents() until the
closedir()? Seeking on *that* would be trivial.

(The only downside is potentially unbounded userspace memory usage, but if
you're playing with gigabyte-sized directories other things will go wrong
first: e.g. there are a *lot* of apps out there that do things that scale
as O(n log n) or even worse in the size of a directory... and it's only a
user process doing it to itself anyway. Is it just that this memory usage
isn't worth it for a call as never-used as seekdir()?)



Posted Mar 21, 2009 18:25 UTC (Sat) by jbailey (subscriber, #16890)

Off the top of my head:

* To preserve the old syscall, we need to keep this functionality anyway.

* Retrieving directory entries could be expensive over WAN links and such;
taking that huge hit on opendir() might be a little much. (How many times
have I done ls on a large dir and found myself hammering on C-c?)

* opendir() sending me past my ulimit or available RAM would be an
interesting DoS attack. Too many files, and you can't ls to figure out
what you should start deleting. No globs for you either. =)

If the interface could change, it might be nice to have a time limit, and
throw an EINTR or some such on a seekdir() that amounted to "suck it up and
start again."

Posted Mar 21, 2009 19:34 UTC (Sat) by nix (subscriber, #2304)

Ah, no. I mis-explained. I expect opendir() to do just what it does now,
but readdir() will remember the entries it has returned. (This is fine,
because you can't seekdir() to somewhere that you haven't telldir()ed, and
you can't telldir() something you haven't readdir()ed already.)

There's no DoS problem, because the application can keep an eye on the
amount of readdir()ing it's done, and stop if need be. It makes seekdir(),
even over NFS, a doddle, and retrieving dirents is no more expensive than
it is now.
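
To illustrate the "stop if need be" part, here is a small sketch of an
application capping how many entries it is willing to read. The function
name read_capped() and its limit parameter are made up for this example;
it uses only the standard opendir()/readdir()/closedir() calls.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stddef.h>

/* Hypothetical helper: read at most `limit` entries from `path`.
 * Returns the number of entries read, or -1 if the directory
 * cannot be opened. */
long read_capped(const char *path, size_t limit)
{
    assert(path != NULL);

    DIR *d = opendir(path);
    if (d == NULL)
        return -1;

    size_t n = 0;
    /* Stop at the cap or at end of directory, whichever comes first. */
    while (n < limit && readdir(d) != NULL)
        n++;

    closedir(d);
    return (long)n;
}
```

With a caching readdir(), a cap on the entry count like this would also
bound the memory spent on remembered entries.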

I don't understand why glibc doesn't *already* implement this. Why on
earth is seekdir() the kernel's job?

(And, yes, we'd have to preserve the old syscall, but given the number of
users --- none on my system, two in the *entire* Debian source tree when I
counted it a few years ago --- I don't think anyone would care much, or
even notice, if it rotted gently into brokenness, or completely failed to
work on new filesystems.)