Working with UTF-8 in the kernel

Posted Mar 31, 2019 1:46 UTC (Sun) by foom (subscriber, #14868)
In reply to: Working with UTF-8 in the kernel by mirabilos
Parent article: Working with UTF-8 in the kernel

Hopefully the filesystem records what mapping it was created with, like NTFS does. Otherwise, some of your files may become inaccessible when a new mapping is switched to (which, iirc, did happen on HFS+ before. That's not good...)

Re: Turkish swears -- you can name your files either word just fine -- the filesystem does not change your chosen filename to the other name! Only if you try to make files named both, in the same directory, will you run into an issue. I still claim that is *highly* unlikely.
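As a rough illustration of that point (not how any real filesystem implements it), here is a toy directory that preserves the name as typed and only compares folded keys. The class `CaseInsensitiveDir` and its methods are hypothetical, and Python's `str.casefold` stands in for whatever folding table a filesystem would actually use.

```python
class CaseInsensitiveDir:
    """Toy directory: case-preserving names, case-insensitive comparison."""

    def __init__(self, fold=str.casefold):
        # Assumption: str.casefold stands in for the filesystem's folding rules.
        self._fold = fold
        self._entries = {}  # folded key -> name exactly as the user typed it

    def create(self, name):
        key = self._fold(name)
        if key in self._entries:
            # Only a second name that folds to the same key is rejected.
            raise FileExistsError(
                f"{name!r} collides with {self._entries[key]!r}"
            )
        self._entries[key] = name
        return name

    def lookup(self, name):
        # Returns the stored spelling, never a rewritten one.
        return self._entries[self._fold(name)]


d = CaseInsensitiveDir()
d.create("Report.TXT")
stored = d.lookup("report.txt")  # the name comes back as originally typed

try:
    d.create("REPORT.txt")  # second name with the same folded key
    collided = False
except FileExistsError:
    collided = True
```

A single name is stored untouched; the conflict only shows up when a second name with the same folded key is created in the same directory.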

If we treated oo and u as the same for filename comparison purposes, because that was a very common language's policy, I rather suspect that also wouldn't be a huge problem. (It'd be weird to have such behavior, as that isn't a common policy, however.)


Working with UTF-8 in the kernel

Posted Mar 31, 2019 19:17 UTC (Sun) by naptastic (guest, #60139)

> because that was a very common language's policy

Which one‽ I've never heard of this and I am dying to know! MY BRAIN IS HUNGRY

Working with UTF-8 in the kernel

Posted Apr 4, 2019 5:37 UTC (Thu) by rgmoore (✭ supporter ✭, #75)

> Hopefully the filesystem records what mapping it was created with, like NTFS does. Otherwise, some of your files may become inaccessible when a new mapping is switched to (which, iirc, did happen on HFS+ before. That's not good...)

This seems like the key to me. If the case folding rules can change, there's no way to guarantee that the same file will always be accessible the same way, and that's true whether the case folding happens in the kernel or in userspace.
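A minimal sketch of that failure mode, assuming a directory index keyed by folded names. The helpers `build_index` and `lookup` are made up for illustration, and `str.lower` and `str.casefold` merely stand in for an "old" and a "new" version of the folding rules; real filesystems use their own tables.

```python
def build_index(names, fold):
    """Map each folded key to the name as stored on disk."""
    return {fold(name): name for name in names}


def lookup(index, name, fold):
    """Return the stored name, or None if the folded key is not in the index."""
    return index.get(fold(name))


# Assumption: two versions of the folding rules that disagree on "ß".
fold_v1 = str.lower      # keeps "ß" as-is
fold_v2 = str.casefold   # folds "ß" to "ss"

# The directory was populated while fold_v1 was in effect.
index = build_index(["Straße.txt"], fold_v1)

# Same name, same index; only the folding rules differ.
before = lookup(index, "Straße.txt", fold_v1)
after = lookup(index, "Straße.txt", fold_v2)
```

Here `before` finds the file, while `after` is `None`: the key stored under the old rules is no longer the key the new rules compute. Recording which mapping the index was built with, and continuing to use it for lookups, avoids the mismatch.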

Working with UTF-8 in the kernel

Posted Apr 4, 2019 12:28 UTC (Thu) by bosyber (guest, #84963)

> If we treated oo and u as the same for filename comparison purposes, because that was a very common language's policy
Is it? I know that it might effectively be that way in German, but in Dutch it absolutely is not; they are completely different sounds (the German u sound is closer to the Dutch oe digraph, not to oo, which is a loong vowel in Dutch).

