The kernel and character set encodings
Posted Feb 21, 2004 7:49 UTC (Sat) by Cato
In reply to: The kernel and character set encodings
Parent article: The kernel and character set encodings
This problem needs to be addressed somewhere, though not necessarily in the kernel (perhaps in glibc or the GUI layer). Suppose two users create identical-looking filenames using Vietnamese accented characters (a letter plus two accents typed in a different order, three Unicode characters altogether). You now have two filenames that look identical, and you don't know how to type the 'right' one. Even if only one file is involved, without Unicode normalisation you can't rely on bash filename completion: you might type the accents in a different order from the one used in the filename, and nothing on screen would show your mistake.
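To make the Vietnamese case concrete, here is a minimal Python sketch. The choice of "e" with a circumflex and a dot below is my own example, and same_name is a hypothetical helper, not anything glibc or bash provides. It only shows that the two typing orders give different code point sequences which NFC maps to the same string.

```python
import unicodedata

# "e" + circumflex (U+0302) + dot below (U+0323), and the same two
# accents typed in the other order. Both render as the same glyph.
a = "e\u0302\u0323"
b = "e\u0323\u0302"

def same_name(x: str, y: str) -> bool:
    """Hypothetical helper: compare two filenames after NFC normalisation."""
    return unicodedata.normalize("NFC", x) == unicodedata.normalize("NFC", y)

print(a == b)           # False: the code point sequences differ
print(same_name(a, b))  # True: the strings are canonically equivalent
```

A plain byte-wise comparison, which is what the filesystem and filename completion effectively do today, treats these as two unrelated names.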
Given these issues, which affect command line tools as much as GUIs, it may be sensible to put NFC normalisation in glibc or the kernel, despite the complexity. Files created from another system on a Linux NFS filesystem would of course bypass glibc, so the alternatives there are batch renormalisation (always an option; convmv may do this) or putting NFC in the kernel.
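The batch renormalisation option could look roughly like the following Python sketch. This is not how convmv works; plan_nfc_renames and renormalise_tree are names I made up, and the code assumes filenames are valid UTF-8. Note that it has to deal with the collision case, where two existing names normalise to the same NFC form; here it just reports and skips them.

```python
import os
import unicodedata

def plan_nfc_renames(names):
    """For one directory's entry names, return (renames, collisions).

    renames:    (old, new) pairs that are safe to apply.
    collisions: (old, new) pairs where the NFC name is already taken.
    """
    taken = set(names)
    renames, collisions = [], []
    for name in sorted(names):
        nfc = unicodedata.normalize("NFC", name)
        if nfc == name:
            continue  # already normalised, nothing to do
        if nfc in taken:
            collisions.append((name, nfc))
        else:
            renames.append((name, nfc))
            taken.add(nfc)
    return renames, collisions

def renormalise_tree(root, dry_run=True):
    """Walk root bottom-up and rename entries to their NFC form."""
    # Bottom-up, so a directory is renamed only after its contents are done.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        renames, collisions = plan_nfc_renames(dirnames + filenames)
        for old, new in collisions:
            print(f"skipped, name collision: {os.path.join(dirpath, old)!r}")
        for old, new in renames:
            print(f"rename {old!r} -> {new!r} in {dirpath!r}")
            if not dry_run:
                os.rename(os.path.join(dirpath, old),
                          os.path.join(dirpath, new))
```

Calling renormalise_tree("/some/export") only prints what it would do; pass dry_run=False to actually rename. The skipped collisions are exactly the "two identical-looking filenames" case, which a batch job cannot resolve without a person deciding which file is which.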
It's not good enough to say 'case-insensitivity should not be in the kernel' - you need to address these use cases and say how and where you would solve them.