Wheeler: Fixing Unix/Linux/POSIX Filenames
Posted Mar 25, 2009 23:27 UTC (Wed) by jreiser (subscriber, #11027)
In reply to: Wheeler: Fixing Unix/Linux/POSIX Filenames by epa
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames
As nix says, the filename encodes a key to what the file contains. The encoding is radix-254 (every byte value except NUL and '/'). This fully utilizes the ASCII control characters [\x01-\x1f] and also byte sequences that UTF-8 disallows, such as subsets of [\xfc-\xff]*. Radix-254 is almost 2 bits per byte denser than the proposed radix-65 (26 upper case, 26 lower case, 10 digits, dot, hyphen, underscore): log2(254) is about 7.99 bits per byte, versus about 6.02 for log2(65). The OS imposes an upper bound on the length of a filename, and there are critical points at various shorter lengths where space*time costs jump. Radix-65 discards enough utility, compared with radix-254, that customers complain.
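The density figures can be checked with a short sketch. The 128-bit key size and the helper names below are assumptions chosen for illustration; they do not describe the actual product.

```python
import math

DENSE_RADIX = 254     # every byte value except NUL and '/'
PORTABLE_RADIX = 65   # A-Z, a-z, 0-9, dot, hyphen, underscore


def bits_per_byte(radix: int) -> float:
    """Information carried by one filename byte at the given radix."""
    return math.log2(radix)


def bytes_needed(key_bits: int, radix: int) -> int:
    """Smallest length n such that radix**n >= 2**key_bits (exact integer math)."""
    n = 0
    capacity = 1
    while capacity < (1 << key_bits):
        capacity *= radix
        n += 1
    return n


# Difference in density between the two alphabets, in bits per byte.
gain = bits_per_byte(DENSE_RADIX) - bits_per_byte(PORTABLE_RADIX)

# Hypothetical 128-bit key: filename bytes required under each alphabet.
dense_len = bytes_needed(128, DENSE_RADIX)
portable_len = bytes_needed(128, PORTABLE_RADIX)

if __name__ == "__main__":
    print(f"gain: {gain:.3f} bits per byte")
    print(f"128-bit key: {dense_len} bytes (radix-254), "
          f"{portable_len} bytes (radix-65)")
```

Under these assumptions the same key needs a noticeably longer name in radix-65, which is how a restricted alphabet can push names past the length thresholds mentioned above.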
