One billion files on Linux
Posted Aug 19, 2010 10:57 UTC (Thu) by cesarb (subscriber, #6266)
In reply to: One billion files on Linux by liljencrantz
Parent article: One billion files on Linux
Things like squid cache directories, git object directories, ccache cache directories, that hidden thumbnails directory in your $HOME... They all have in common that the files are named by a hash or something similar. There is no logical grouping at all here; it is a completely flat namespace.
Most of these work around the resulting large number of files in a single directory by extracting some bits of the hash (usually 4 or 8) and using them as the name of a subdirectory (which works because the hashes used have an almost perfectly uniform distribution). Sometimes more than one level is used. If the filesystem can easily deal with a huge number of files in a single directory, this extra complexity is not needed.
There are also Maildir directories, which use one file per message, and where the only logical grouping is a "folder" or similar. If you have a million messages in a single "folder" (for instance one named "linux-kernel-mailing-list" which has all the messages you collected since 1999), you need a filesystem which can deal with a million files in a single directory. And here the names are not hashes, so the scheme above fails (and even if it worked, the result would not be a Maildir anymore).
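To make the workaround concrete, here is a minimal sketch in Python. It is not taken from squid, git or ccache; the function name `sharded_path`, its parameters, and the choice of SHA-1 are assumptions made for illustration. It takes the leading hex digits of the hash (two hex digits are 8 bits) as the subdirectory name, optionally over several levels, and uses the rest of the hash as the file name.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_path(root, data, levels=1, chars_per_level=2):
    """Build a path for `data` under `root`, fanned out by hash prefix.

    Hypothetical helper: each level consumes `chars_per_level` hex digits
    of the hash (2 hex digits = 8 bits, so up to 256 subdirectories).
    """
    digest = hashlib.sha1(data).hexdigest()
    cut = levels * chars_per_level
    # Leading hex digits become the nested subdirectory names.
    parts = [digest[i:i + chars_per_level]
             for i in range(0, cut, chars_per_level)]
    # The remaining digits become the file name.
    return PurePosixPath(root, *parts, digest[cut:])


if __name__ == "__main__":
    print(sharded_path("cache", b"hello"))
    print(sharded_path("cache", b"hello", levels=2))
```

With one level of 8 bits, a million files end up spread over 256 directories of roughly 4000 entries each, which is the whole point of the trick; a filesystem that copes well with huge directories would make the helper unnecessary.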
Posted Aug 19, 2010 18:34 UTC (Thu)
by liljencrantz (guest, #28458)