Unicode bugs

Posted Feb 20, 2004 22:23 UTC (Fri) by spitzak (guest, #4593)
In reply to: Unicode bugs by simonl
Parent article: The kernel and character set encodings

Avoiding those bugs is one of the primary reasons why UTF-8 is a good
idea.

"../" in a UTF-8 filename means the *BYTES* for '.', '.', and '/' appear
next to each other. It is entirely irrelevant whether the UTF-8 string is
legal, or whether it contains a byte sequence that some broken software by
Microsoft will turn into a slash.

I don't know how many times this has to be stated. But if your program is
looking at a UTF-8 string and is doing anything other than drawing the
characters on the screen, YOU DO NOT NEED TO DECODE IT! Just look at the
bytes!
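
To make the point concrete, here is a minimal sketch of such a byte-level check in Python. The function name has_parent_ref and the sample filenames are made up for illustration; nothing here is taken from any real program. It assumes the filename arrives as raw bytes and is never decoded.

```python
def has_parent_ref(name: bytes) -> bool:
    """Return True if the raw bytes '.', '.', '/' appear next to each other."""
    # Plain byte search: no decoding and no UTF-8 validation.
    return b"../" in name


# Hypothetical sample filenames, all kept as raw bytes.
plain = b"a/../etc/passwd"

# 0xC0 0xAF is an illegal "overlong" encoding that a broken decoder may
# turn into '/'. At the byte level it is simply not a slash.
overlong = b"a/..\xc0\xafetc/passwd"

# A legal multi-byte character nearby does not change the answer.
multibyte = "caf\u00e9/../x".encode("utf-8")

for sample in (plain, overlong, multibyte):
    print(sample, has_parent_ref(sample))
```

The check works without decoding because every byte of a multi-byte UTF-8 sequence has its high bit set, so the ASCII bytes for '.' and '/' can only ever stand for '.' and '/'.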


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds