Control characters in file names

Posted Nov 30, 2010 1:39 UTC (Tue) by jamesh (guest, #1159)
In reply to: Control characters in file names by cmccabe
Parent article: Ghosts of Unix past, part 4: High-maintenance designs

As well as being a lot of work, using extended attributes to record the encoding of file names introduces ambiguity. Some extra problems with that suggestion are:

  • You could have two files in a directory with the same sequence of Unicode code points but different byte representations due to being encoded differently.
  • Applications might encounter paths like /latin1-part/utf8-part/sjis-part and need to check the encoding of each path component in order to display it to the user. Going the other way, resolving a Unicode path to a mixed-encoding path like this, would perhaps be even harder.
  • Extended attributes are associated with the file rather than the file name. What do you do if a file has two hard links with differently encoded file names?
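
To make the first point concrete, here is a minimal sketch in Python. The name "café" is just a made-up example, and Latin-1 and UTF-8 are stand-ins for any pair of encodings; the point is only that one sequence of code points can map to two different byte strings, which a byte-oriented file system would treat as two distinct names.

```python
# Hypothetical file name: four code points, the last one is U+00E9.
name = "caf\u00e9"

# The same code points, encoded two different ways.
utf8_bytes = name.encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = name.encode("latin-1")  # b'caf\xe9'

# Different bytes on disk, so both could exist in one directory...
same_bytes = utf8_bytes == latin1_bytes

# ...yet each decodes (with its own encoding) to the identical string.
same_text = utf8_bytes.decode("utf-8") == latin1_bytes.decode("latin-1")
```

Here same_bytes is False and same_text is True, so a user would see two entries that look exactly alike.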

Picking one encoding/normalisation is the only sane option, and it would be nice if the kernel would help enforce such a choice.
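
As a rough illustration of what "one encoding/normalisation" could mean, here is a userspace sketch that assumes the choice is UTF-8 in NFC form. That choice, and the helper names canonical_name and is_canonical, are my own assumptions for the example; nothing like this is an existing kernel interface.

```python
import unicodedata


def canonical_name(raw: bytes) -> bytes:
    """Return raw re-encoded as NFC-normalised UTF-8.

    Raises ValueError if raw is not valid UTF-8 (assumed policy).
    """
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("file name is not valid UTF-8") from exc
    return unicodedata.normalize("NFC", text).encode("utf-8")


def is_canonical(raw: bytes) -> bool:
    """True if raw is already in the single agreed-upon form."""
    try:
        return canonical_name(raw) == raw
    except ValueError:
        return False
```

With this policy, is_canonical(b"caf\xc3\xa9") holds, while the Latin-1 bytes b"caf\xe9" and the decomposed form b"cafe\xcc\x81" would both be rejected (or, for the latter, rewritten by canonical_name).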



Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds