Working with UTF-8 in the kernel

Posted Apr 8, 2019 23:30 UTC (Mon) by dvdeug (subscriber, #10998)
In reply to: Working with UTF-8 in the kernel by foom
Parent article: Working with UTF-8 in the kernel

What rules do they use?

In what way are the Unicode case-folding rules rather complex? For the most part they are fairly simple one-to-one mappings between characters, with a few exceptions that you just have to deal with. The German ß and the various titlecase characters exist in Unicode, and they are going to have to be dealt with.
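As a rough illustration of that point, here is a minimal sketch that counts how many code points fold to a single character versus several. It assumes Python's built-in `str.casefold()`, which applies Unicode full case folding from whatever Unicode version the interpreter ships with; the helper name `fold_summary` is made up for this example.

```python
import sys


def fold_summary():
    """Split code points that change under full case folding into
    one-to-one mappings and multi-character expansions."""
    one_to_one = 0
    multi = []
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        folded = ch.casefold()
        if folded == ch:
            continue  # unaffected by case folding
        if len(folded) == 1:
            one_to_one += 1
        else:
            multi.append(ch)  # e.g. ß, which folds to "ss"
    return one_to_one, multi


if __name__ == "__main__":
    simple, expanding = fold_summary()
    print("one-to-one foldings:", simple)
    print("multi-character foldings:", len(expanding))
    # The titlecase digraph U+01C5 folds to a single character, U+01C6.
    print(hex(ord("\u01c5".casefold())))
```

The exact counts depend on the Unicode version, so none are quoted here, but the multi-character list should come out much shorter than the one-to-one count.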



Working with UTF-8 in the kernel

Posted Apr 9, 2019 15:35 UTC (Tue) by foom (subscriber, #14868)

NTFS and exFAT only map a single UTF-16 code unit to another single UTF-16 code unit, via a lookup table written to disk during filesystem creation. There is no Unicode normalization, no multi-character equivalences, and no folding for any characters above U+FFFF.
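A minimal sketch of that style of comparison, to make the consequences concrete. The table here is built from Python's `str.upper()` purely as a stand-in; it is not the actual table that NTFS or exFAT writes to disk, and the names `build_upcase_table`, `utf16_units` and `names_equal` are hypothetical.

```python
import struct


def build_upcase_table():
    """Stand-in 65536-entry table: one UTF-16 code unit -> one code unit.
    Assumption: derived from str.upper(), not from a real on-disk table."""
    table = list(range(0x10000))
    for cu in range(0x10000):
        up = chr(cu).upper()
        # Keep only mappings that stay a single BMP code unit.
        if len(up) == 1 and ord(up) < 0x10000:
            table[cu] = ord(up)
    return table


UPCASE = build_upcase_table()


def utf16_units(name):
    """Return the UTF-16 code units of a name as a list of integers."""
    data = name.encode("utf-16-le", "surrogatepass")
    return list(struct.unpack("<%dH" % (len(data) // 2), data))


def names_equal(a, b):
    """Compare code unit by code unit through the table.
    No normalization, no multi-character folding."""
    ua, ub = utf16_units(a), utf16_units(b)
    if len(ua) != len(ub):
        return False
    return all(UPCASE[x] == UPCASE[y] for x, y in zip(ua, ub))


if __name__ == "__main__":
    print(names_equal("README.TXT", "readme.txt"))
    print(names_equal("straße", "STRASSE"))
    print(names_equal("\U00010400", "\U00010428"))
    print(names_equal("\u00e9", "e\u0301"))
```

Under these assumptions the first pair matches and the other three do not: ß has no single-unit uppercase, the Deseret letters above U+FFFF are surrogate pairs that the table leaves alone, and precomposed é differs from e followed by a combining accent because nothing is normalized.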

You say that other cases "have to be dealt with"... but we have widely used examples showing that this is not actually the case.

