One billion files on Linux
Posted Aug 19, 2010 17:08 UTC (Thu) by bcopeland
Parent article: One billion files on Linux
When trying to look at that many files, you need to avoid running stat() on every one of them or trying to sort the whole list.
Underlying this issue is that today's directories (for ext4 at least) are not set up to iterate in inode order. The consequence is that if you do a walk of the files in the order they are stored in the directory, and the inodes aren't in the cache, you have to seek all over the disk to get to the inode information. I remember reading once that the htree designers planned at some point to group the files in htree leaves into buckets based on inode; I wonder if anything ever came of that?
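The usual userspace workaround for this is to sort the directory entries by inode number before stat()ing them, since readdir() already hands back each entry's inode via d_ino. A minimal sketch of that idea, using Python's os.scandir() (which exposes d_ino as DirEntry.inode()); the function name is my own:

```python
import os

def walk_sorted_by_inode(path):
    """List a directory's files, stat()ing in ascending inode order.

    readdir()/scandir() returns each entry's inode number for free,
    so sorting by it first means the subsequent stat() calls touch
    the on-disk inode tables in roughly sequential order instead of
    seeking all over the disk.
    """
    entries = [(e.inode(), e.name) for e in os.scandir(path)]
    entries.sort()  # ascending inode number
    results = []
    for ino, name in entries:
        st = os.stat(os.path.join(path, name))
        results.append((name, st.st_size))
    return results
```

Tools that walk huge trees (e.g. backup programs) use this same trick; it only helps when the inodes aren't already cached, which is exactly the cold-cache case described above.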