
Wheeler: Fixing Unix/Linux/POSIX Filenames

Posted Mar 28, 2009 1:00 UTC (Sat) by nix (subscriber, #2304)
In reply to: Wheeler: Fixing Unix/Linux/POSIX Filenames by mjthayer
Parent article: Wheeler: Fixing Unix/Linux/POSIX Filenames

German ß is problematic too. Whether 'SS' turns into ß or not on
downcasing is *context-dependent* and to a certain extent a matter of
controversy and thus taste (this wasn't always true, but successive waves
of largely-failed spelling reforms have introduced a nice steaming heap of
uncertainty into this part of the written language).
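The lossy round trip is easy to see with a short sketch using Python's built-in case mappings (my choice of language for illustration, not something from the parent comment). Uppercasing expands ß to SS, and lowercasing has no context to decide whether SS should go back to ß:

```python
# How Python's built-in Unicode case mappings treat the German sharp s.
word = "Maß"

upper = word.upper()        # full case mapping expands ß to SS
round_trip = upper.lower()  # lowercasing cannot tell whether SS came from ß

# Case folding sidesteps the question by mapping ß to ss on both sides.
folded_equal = word.casefold() == "MASS".casefold()

print(upper, round_trip, folded_equal)
```

So a case-insensitive filename comparison built on upper() or lower() alone would treat the result of the round trip as a different name from the original, while one built on casefold() would not.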



Wheeler: Fixing Unix/Linux/POSIX Filenames

Posted Apr 2, 2009 15:54 UTC (Thu) by forthy (guest, #1525)

It is actually not that bad. As a collating sequence, ß=ss (i.e. Mass and Maß sort to the same bin). The exception is Austrian telephone books, where ß follows ss but comes before st (though St. follows Sankt ;-).
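A minimal sketch of the two orderings, assuming nothing beyond what is described above. The function names `german_key` and `phonebook_key` are made up for illustration, and the phone-book variant only encodes "ß after ss, before st" as a tie-break, not any official rule set:

```python
def german_key(name: str) -> str:
    """Primary collation key: treat ß exactly like ss."""
    return name.lower().replace("ß", "ss")


def phonebook_key(name: str) -> tuple[str, str]:
    """Same primary key, with the original spelling as a tie-break.

    'ß' has a higher code point than 's', so among names with equal
    primary keys the ß spelling sorts after the ss spelling.
    """
    return (german_key(name), name.lower())


names = ["Mast", "Maße", "Masse"]
ordered = sorted(names, key=phonebook_key)
print(ordered)
```

With the plain key, Mass and Maß compare equal; with the tie-break, Masse comes before Maße and both come before Mast.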

However, there's a huge mess in the CJK part of UCS: short and long forms of the same character (sometimes even a special variant for the Japanese character). This should never have happened; the different forms of the same character should be encoded in fonts, not in UCS. So far, not even Mac OS X normalizes these characters, but it is obvious that a mainland China file called "中国" and a Taiwan file called "中國" not only mean the same thing, they are also the same word, and can be interchanged at will (see for example the Chinese Wikipedia entry: the lemma is the short form, the headline is the long form). And it is not easy to access both long and short forms with the usual input methods (mainland China: Pinyin; Canton: Cantonese Pinyin, which gives traditional characters, but you need to know Cantonese; etc.).
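To illustrate the point about normalization, here is a small sketch using Python's standard `unicodedata` module. The Unicode normalization forms leave the two spellings distinct, so any folding needs a separate mapping table. The one-entry `TO_TRADITIONAL` table and the `fold_variants` helper are hypothetical, made up for this example:

```python
import unicodedata

mainland = "中国"  # short (simplified) form
taiwan = "中國"    # long (traditional) form

# Even the most aggressive standard form, NFKC, keeps the two apart.
nfkc_equal = (
    unicodedata.normalize("NFKC", mainland)
    == unicodedata.normalize("NFKC", taiwan)
)

# Hypothetical one-entry table; a real one would be far larger.
TO_TRADITIONAL = {"国": "國"}


def fold_variants(name: str) -> str:
    """Map each character through the table, leaving others untouched."""
    return "".join(TO_TRADITIONAL.get(ch, ch) for ch in name)


print(nfkc_equal, fold_variants(mainland) == fold_variants(taiwan))
```

A per-character table like this is only a rough approximation, since some short forms correspond to more than one long form, so a filesystem could not apply it blindly.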

