Wheeler: Fixing Unix/Linux/POSIX Filenames
Wheeler: Fixing Unix/Linux/POSIX Filenames
Posted Mar 25, 2009 17:46 UTC (Wed) by nix (subscriber, #2304)In reply to: Wheeler: Fixing Unix/Linux/POSIX Filenames by epa
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames
(I could equally use a directory full of stuff here, but it too would need a name that's hard to type. I pondered a \n-prepended filename because it's even harder to trip over by mistake, but decided that it would look too odd in directory listings of object directories when debugging. There's no danger of user code interpreting these names as options, because user code accesses files in this directory only via a shell-function API.)
And if I've done it, I guarantee you that lots and lots of other people have done it too.
David's proposed constraints on filenames are constraints which can never be imposed by default, at the very least. The semantics of Unix filesystems have been fixed de facto for many years: nobody expects files with odd characters to work on FAT, but nobody expects a Unix system to use a FAT filesystem as a primary datastore either.
Hardwired filename encodings are a good idea only if you can guarantee that this encoding has been the standard for the lifetime of the filesystem. You can't assume that for any existing filesystem: thus you have to decide what to do if filenames are not representable in the encoding scheme chosen. This also conflicts with 'no control characters' in that a good bunch of Unicode characters >127 can be considered 'control characters' of a sort, and there's no guarantee that more won't be added. How to exclude control characters which may be added in the future?
You also can't sensibly exclude shell metacharacters, because you don't know what they are, because they're shell-dependent, and some shells (like zsh) have globbing schemes so complex that ruling out all filenames that might be interpretable as a glob is wholly impractical.
But I agree that these rules all make sense for parts of the filesystem that users might manipulate with arbitrary programs, as opposed to those that are using part of the FS as a single-program-managed datastore. What I think we need is an analogue of pathconf() (setpathconf()?) combined with extra fs metadata, such that constraints of this sort can be imposed and relaxed for *specific directories* (inherited by subdirectories on mkdir(), presumably). *That* might stand a chance of not breaking the entire world in the name of fixing it.
