
Wheeler: Fixing Unix/Linux/POSIX Filenames

Posted Mar 25, 2009 23:27 UTC (Wed) by jreiser (subscriber, #11027)
In reply to: Wheeler: Fixing Unix/Linux/POSIX Filenames by epa
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames

As nix says, the filename encodes a key to what the file contains. The encoding is radix-254 (every byte value except NUL and '/'). This fully utilizes the ASCII control characters [\x01-\x1f] as well as byte sequences that UTF-8 disallows, such as those drawn from [\xfc-\xff]*. Radix-254 is almost 2 bits per byte denser than the proposed radix-65 (26 upper case, 26 lower case, 10 digits, dot, hyphen, underscore). The OS imposes an upper bound on the length of a filename, and there are critical points at various shorter lengths where space*time costs jump. Radix-65 discards enough utility, compared with radix-254, that customers complain.
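
The density comparison can be checked with a little arithmetic. What follows is only a minimal sketch, assuming nothing beyond the two alphabet sizes stated above; the function names and the 128-bit key size are illustrative choices, not anything from the application being described.

```python
import math


def bits_per_byte(radix: int) -> float:
    """Information carried by one filename byte drawn from `radix` symbols."""
    return math.log2(radix)


def bytes_needed(key_bits: int, radix: int) -> int:
    """Shortest filename length whose radix**length covers 2**key_bits keys."""
    length = 0
    capacity = 1
    # Integer arithmetic avoids floating-point rounding near the boundary.
    while capacity < (1 << key_bits):
        capacity *= radix
        length += 1
    return length


if __name__ == "__main__":
    # 128 bits is a hypothetical key size chosen only for illustration.
    for radix in (254, 65):
        print(radix, round(bits_per_byte(radix), 3), bytes_needed(128, radix))
```

Under these assumptions the gap is about 1.97 bits per byte, and a 128-bit key fits in 17 bytes at radix-254 but needs 22 bytes at radix-65.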



Wheeler: Fixing Unix/Linux/POSIX Filenames

Posted Mar 26, 2009 14:44 UTC (Thu) by dwheeler (guest, #1216)

I never proposed radix-65. Radix-65 (26 upper case, 26 lower case, 10 digits, dot, hyphen, underscore) is what the POSIX standard ALREADY says is all you can depend on; nothing else is portable by that spec.
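
For concreteness, the 65-character set can be written out and tested against. This is a rough sketch that checks only the character set described above; the names `PORTABLE_CHARS` and `is_portable_filename` are made up for illustration, and any further rules the spec places on portable names are deliberately left out.

```python
import string

# 26 upper case + 26 lower case + 10 digits + dot, hyphen, underscore = 65
PORTABLE_CHARS = frozenset(string.ascii_letters + string.digits + ".-_")


def is_portable_filename(name: str) -> bool:
    """True if `name` is non-empty and uses only the 65 portable characters."""
    return bool(name) and all(ch in PORTABLE_CHARS for ch in name)
```

For example, `is_portable_filename("notes_v2.txt")` holds, while a name containing a space or a non-ASCII letter is rejected.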

I want to be able to count on more than what the POSIX spec says; I want to be able to use the entire Unicode character set, minus the control chars and a few additional constraints to prevent lots of problems for the general-purpose user.
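
A rough idea of what such a check might look like, as a sketch only: it covers just the two conditions named above (valid UTF-8, no control characters) and leaves the "additional constraints" unspecified, since they are not spelled out here. The function name is hypothetical, and treating Unicode category `Cc` as "control chars" is an assumption.

```python
import unicodedata


def is_utf8_without_controls(raw_name: bytes) -> bool:
    """True if the raw filename bytes are valid UTF-8 with no control chars."""
    try:
        text = raw_name.decode("utf-8")  # strict decoding rejects invalid bytes
    except UnicodeDecodeError:
        return False
    # Assumption: "control chars" means Unicode general category Cc,
    # which covers U+0000-U+001F and U+007F-U+009F.
    return bool(text) and all(unicodedata.category(ch) != "Cc" for ch in text)
```

Under this sketch a name such as `"résumé.txt".encode("utf-8")` passes, while names containing a tab, `\x01`, or stray `\xff` bytes do not.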

