
Working with UTF-8 in the kernel


Posted Mar 29, 2019 6:09 UTC (Fri) by khim (subscriber, #9252)
In reply to: Working with UTF-8 in the kernel by zlynx
Parent article: Working with UTF-8 in the kernel

Case normalization removes the need for the whole thing. To implement case-insensitive semantics in userspace, you must check whether SoMeFiLeNaMe.txt is there and then create SomeFilename.txt, and do both atomically. If the kernel is asked to create SomeFilename.txt and returns a reference to SoMeFiLeNaMe.txt, then this atomicity is handled in the kernel.
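To make the race concrete, here is a minimal sketch of the check-then-create pattern in userspace. The function names are hypothetical, and Python's `str.casefold()` only stands in for whatever case-folding rules a real filesystem would apply; it is not what the kernel patches do.

```python
import os
import tempfile


def find_case_insensitive(directory, name):
    """Return an existing entry whose case-folded name matches, else None."""
    wanted = name.casefold()  # assumption: casefold() approximates the FS rules
    for entry in os.listdir(directory):
        if entry.casefold() == wanted:
            return entry
    return None


def create_case_insensitive(directory, name):
    """Userspace emulation: check, then create. The two steps are NOT atomic."""
    existing = find_case_insensitive(directory, name)
    if existing is not None:
        return existing
    # Race window: another process may create "SoMeFiLeNaMe.txt" right here.
    # O_EXCL only guards against the exact same byte-for-byte name.
    path = os.path.join(directory, name)
    fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    os.close(fd)
    return name


def demo():
    """Create two names differing only in case and report what ends up on disk."""
    with tempfile.TemporaryDirectory() as d:
        first = create_case_insensitive(d, "SoMeFiLeNaMe.txt")
        second = create_case_insensitive(d, "SomeFilename.txt")
        return first, second, sorted(os.listdir(d))
```

In a single process the second call returns the existing SoMeFiLeNaMe.txt, but nothing prevents two processes from both passing the check and creating two differently-cased files on a case-sensitive filesystem. That gap is what doing the lookup in the kernel would close.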

P.S. I wonder if these tables (without code) could be exposed to userspace. Userspace guys ALSO often need to deal with Unicode, and if the kernel already has all these tables... why not use them?



Working with UTF-8 in the kernel

Posted Mar 29, 2019 6:35 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (3 responses)

The overhead of cross-address-space access will probably make it impractical for userspace.

Working with UTF-8 in the kernel

Posted Mar 29, 2019 8:26 UTC (Fri) by felix.s (guest, #104710) [Link] (2 responses)

It seems to work fine for vDSO, doesn't it?

Working with UTF-8 in the kernel

Posted Mar 29, 2019 8:28 UTC (Fri) by Cyberax (✭ supporter ✭, #52523) [Link] (1 responses)

That would work for basically static data. At this point a special file in /proc might work just as well.

Working with UTF-8 in the kernel

Posted Mar 29, 2019 9:31 UTC (Fri) by dezgeg (guest, #92243) [Link]

Having the data tables readable from /proc sounds unattractive due to this part from the article:

"The UTF-8 patches incorporate these rules by processing the provided files into a data structure in a C header file. A fair amount of space is then regained by removing the information for decomposing Hangul (Korean) code points into their base components, since this is a task that can be done algorithmically as well."

Exporting these non-standard tables to userspace would lock in this custom format, which is an implementation detail, forever.
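For reference, the algorithmic Hangul decomposition that the quoted passage alludes to can be sketched as follows. This follows the arithmetic given in the Unicode Standard for precomposed Hangul syllables; it is an illustration, not the kernel's code, and the function name is made up for the example.

```python
import unicodedata  # used only to cross-check the result against NFD

# Constants from the Unicode Standard's Hangul syllable decomposition.
S_BASE, L_BASE, V_BASE, T_BASE = 0xAC00, 0x1100, 0x1161, 0x11A7
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28
N_COUNT = V_COUNT * T_COUNT  # 588 syllables per leading consonant
S_COUNT = L_COUNT * N_COUNT  # 11172 precomposed syllables in total


def decompose_hangul(ch):
    """Decompose one precomposed Hangul syllable into its jamo.

    Characters outside the precomposed syllable block are returned unchanged.
    """
    s_index = ord(ch) - S_BASE
    if not 0 <= s_index < S_COUNT:
        return ch
    leading = L_BASE + s_index // N_COUNT
    vowel = V_BASE + (s_index % N_COUNT) // T_COUNT
    trailing = T_BASE + s_index % T_COUNT
    result = chr(leading) + chr(vowel)
    if trailing != T_BASE:  # T_BASE itself means "no trailing consonant"
        result += chr(trailing)
    return result


def matches_nfd(ch):
    """Check the arithmetic against Python's own NFD normalization."""
    return decompose_hangul(ch) == unicodedata.normalize("NFD", ch)
```

Because a few integer divisions recover every decomposition, a table can omit all 11172 syllable entries, which is presumably the space saving the article describes. It also shows why the resulting tables are incomplete on their own and awkward to treat as a stable userspace format.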


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds