Control characters in file names
Posted Nov 30, 2010 1:39 UTC (Tue) by jamesh
In reply to: Control characters in file names
Parent article: Ghosts of Unix past, part 4: High-maintenance designs
As well as being a lot of work, using extended attributes introduces ambiguity. Some extra problems with that suggestion are:
- You could have two files in a directory with the same sequence of Unicode code points but different byte representations, due to being encoded differently.
- Applications might encounter paths like /latin1-part/utf8-part/sjis-part and need to check the encoding of each path component in order to display it to the user. Going the other way, resolving a Unicode path to a mixed-encoding byte path like this, would perhaps be more difficult.
- Extended attributes are associated with the file rather than the file name. What do you do if a file has two hard links with differently encoded file names?
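To illustrate the first point, here is a minimal sketch in Python. The directory listing is made up: each entry pairs the raw file name bytes with the encoding that a hypothetical extended attribute would declare for it. Nothing here touches a real file system.

```python
# Hypothetical directory listing: raw file name bytes paired with the
# encoding a per-file extended attribute might declare (an assumption
# made for illustration, not a real kernel or xattr interface).
entries = [
    (b"caf\xe9", "latin-1"),
    (b"caf\xc3\xa9", "utf-8"),
]


def decoded_names(entries):
    """Decode each raw name using its declared encoding."""
    return [raw.decode(enc) for raw, enc in entries]


names = decoded_names(entries)
raw_names = [raw for raw, _ in entries]

# The file system compares bytes, so both names can coexist...
distinct_on_disk = len(set(raw_names)) == 2
# ...but once decoded, the user sees the same name twice.
same_to_user = len(set(names)) == 1
```

Both names decode to the same code points, so an application that displays or looks up files by their Unicode name cannot tell them apart.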
Picking one encoding/normalisation is the only sane option, and it would be nice if the kernel would help enforce such a choice.
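As a rough idea of what such enforcement could check, here is a user-space sketch in Python. The choice of UTF-8 and NFC is my assumption for the example, and is_acceptable_name is a hypothetical helper, not an existing kernel or libc interface.

```python
import unicodedata


def is_acceptable_name(raw: bytes) -> bool:
    """Return True if raw is valid UTF-8 and already in NFC form.

    UTF-8 and NFC are assumed policy choices for this sketch.
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    # Reject names that normalisation would change.
    return unicodedata.normalize("NFC", text) == text
```

With this policy, b"caf\xc3\xa9" would be accepted, while the Latin-1 bytes b"caf\xe9" and the decomposed form b"cafe\xcc\x81" would both be rejected, so each visible name has exactly one byte representation.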