User: Password:
|
|
Subscribe / Log in / New account

Throwing one away

Throwing one away

Posted Sep 21, 2012 20:56 UTC (Fri) by neilbrown (subscriber, #359)
In reply to: Throwing one away by faramir
Parent article: Throwing one away

> are there any standards which speak to what one should see in the presence of directory changes?

http://pubs.opengroup.org/onlinepubs/009695399/functions/...

says

> If a file is removed from or added to the directory after the most recent call to opendir() or rewinddir(), whether a subsequent call to readdir_r() returns an entry for that file is unspecified.

Which presumably implies that if a file is not added or removed after opendir, then readdir will return it precisely once. That is certainly how I understand it.


(Log in to post comments)

Throwing one away

Posted Sep 21, 2012 23:06 UTC (Fri) by nix (subscriber, #2304) [Link]

I don't know. POSIX says that readdir() returns "an ordered sequence of all the directory entries in a particular directory", which surely implies that if a file is removed during readdir() you cannot skip other files -- but it never says, nor by my reading implies that you cannot repeat some of them in this situation. This means that if you're occasionally forced to restart a directory because you can't figure out where the requester was, so be it -- though it is probably best if that require multiple things to go wrong.

(e.g. in the scheme I sketched above, if a readdir requester's random-cookie/filename->DIR* entry was expired by the server and the name the readdir requester passed was missing from the directory, readdir would simply start passing back filenames from the start of the directory over again. This does mean that under extreme load, so the cookies kept expiring, and seriously extreme modification rates of a sufficiently huge directory, so that at least one name was deleted while being readdir()ed and while its cookie expired on every pass through the directory, readdir() might never come to an end -- but that's possible under extreme modification rates anyway, even on local filesystems, and this is a pathological case that's unlikely to occur in practice. To be honest, given that bugs in which filenames were persistently omitted if they were in the wrong place in the directory persisted in the BSDs for something like thirty years, it seems that programs' actual requirements of readdir() are rather less extreme than the guarantees!)

Throwing one away

Posted Sep 27, 2012 19:09 UTC (Thu) by cras (guest, #7000) [Link]

rename()s during readdir() can cause the renamed files to be returned twice or never. I'm not entirely sure if other non-renamed files can be returned twice. Deleting files (or at least rename()ing to another directory) can cause some files to be skipped with some filesystems (unfortunately I didn't keep more details). Dovecot's Maildir syncing code has some comments related to this. https://lkml.org/lkml/2004/10/24/230 also.

Throwing one away

Posted Oct 5, 2012 18:56 UTC (Fri) by foom (subscriber, #14868) [Link]

Ooh, according to that lkml thread, it *is* possible to get a consistent state of a directory, on linux. You just need to allocate a buffer big enough for the whole directory to fit in. There doesn't appear to be a way to figure out how big that needs to be ahead-of-time, but at least you could iterate, doubling the buffer each time:

>> 3. the application could be offered an interface for atomic directory
>> reads that requires the application to provide sufficient memory in a
>> single contiguous buffer (making it thread-safe in the same go).
>
>Actually, you can do this today, if you use the underlying
>sys_getdents64 system call. But the application would have to
>allocate potentially a very large amount of userspace memory.


Copyright © 2017, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds