Filesystems and case-insensitivity

Posted Nov 28, 2018 14:41 UTC (Wed) by mgedmin (guest, #34497)
Parent article: Filesystems and case-insensitivity

> He noted that European distributions typically use single-byte encodings

Am I living in a bubble? What are the European distributions that don't use UTF-8 by default in 2018?


Filesystems and case-insensitivity

Posted Nov 28, 2018 16:14 UTC (Wed) by niner (guest, #26151) [Link] (8 responses)

It's a bit disconcerting that someone working on text encoding support in the kernel has such grave misconceptions about encodings. I haven't seen anything but UTF-8 on a European Linux system in more than a decade. Chinese characters (I think he means CJK) are part of the Basic Multilingual Plane and thus are encoded in 3 bytes by UTF-8.

The prevalent GBK encoding uses 2 bytes for such characters, so we're talking about a ~50% increase in storage size. For text. I really wonder who cares about that in 2018. Even more, I wonder who'd care about the storage requirements for file names.
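
To make the arithmetic concrete, here is a quick Python sketch. The sample name is an arbitrary illustration (four common Han characters, roughly "filesystem"), not something taken from the article, and it only covers characters that both encodings can represent.

```python
# Illustrative sample only: four common Han characters, all in the BMP.
name = "文件系统"

utf8_len = len(name.encode("utf-8"))  # 3 bytes per BMP Han character
gbk_len = len(name.encode("gbk"))     # 2 bytes per character in GBK

# Relative growth when the same name is stored as UTF-8 instead of GBK.
increase = (utf8_len - gbk_len) / gbk_len
print(f"UTF-8: {utf8_len} bytes, GBK: {gbk_len} bytes, increase: {increase:.0%}")
```

For this sample the result is 12 bytes versus 8 bytes, which is the 50% figure above. Any ASCII in the name (an extension such as ".txt", digits, separators) costs one byte in both encodings and shrinks the gap.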

Filesystems and case-insensitivity

Posted Nov 28, 2018 16:58 UTC (Wed) by rahulsundaram (subscriber, #21946) [Link] (1 responses)

>It's a bit disconcerting that someone working on text encoding support in the kernel has such grave misconceptions about encodings

Careful there. This was not a comment from the developer working on text encoding support but from James Bottomley.

Filesystems and case-insensitivity

Posted Nov 28, 2018 17:02 UTC (Wed) by niner (guest, #26151) [Link]

Oh, thanks for the clarification! I misunderstood who the "He" in the sentence referred to.

Filesystems and case-insensitivity

Posted Nov 28, 2018 19:40 UTC (Wed) by roc (subscriber, #30627) [Link]

Also, how often would a filesystem have file names consisting *solely* of CJK?

For example, for Chinese Web pages UTF-8 is a win over UTF-16 because the majority of the text of a typical Chinese HTML document is actually ASCII.
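
A rough way to see this, as a sketch rather than a measurement: the tiny page below is made up for illustration, and real pages carry far more markup, which only tilts the result further toward UTF-8.

```python
# Hypothetical minimal page: ASCII markup wrapped around a little Chinese text.
page = (
    '<!DOCTYPE html><html lang="zh"><head><meta charset="utf-8">'
    '<title>文件系统</title></head>'
    '<body><p class="intro">你好，世界</p></body></html>'
)

utf8_len = len(page.encode("utf-8"))
utf16_len = len(page.encode("utf-16-le"))  # the -le variant avoids counting a BOM

# Count how much of the document is plain ASCII (tags, attributes, punctuation).
ascii_chars = sum(1 for ch in page if ord(ch) < 128)

print(f"{ascii_chars} of {len(page)} characters are ASCII")
print(f"UTF-8: {utf8_len} bytes, UTF-16: {utf16_len} bytes")
```

Each ASCII character costs 1 byte in UTF-8 and 2 in UTF-16, while each BMP Han character costs 3 and 2 respectively, so UTF-8 comes out ahead whenever ASCII characters outnumber the CJK ones.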

Filesystems and case-insensitivity

Posted Nov 28, 2018 19:41 UTC (Wed) by roc (subscriber, #30627) [Link] (4 responses)

I think "the Chinese hate UTF-8" is "citation needed".

Filesystems and case-insensitivity

Posted Nov 29, 2018 0:36 UTC (Thu) by willy (subscriber, #9762) [Link] (3 responses)

The "Han unification" part of Unicode appears to have been controversial. https://en.m.wikipedia.org/wiki/Han_unification

But I don't think UTF-8 per se is controversial in China. More so in Russia where it is an evil tool of US oppression.

Filesystems and case-insensitivity

Posted Nov 29, 2018 2:04 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link] (2 responses)

> More so in Russia where it is an evil tool of US oppression.

Uhm, no.

UTF-8 has finally solved the problem with the veritable zoo of commonly used Russian encodings (KOI-8, Win-1251, GOST, GOST-ALT, ISO).
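
A small Python sketch of what that zoo looked like in practice. The mapping from the names above to Python codec names is my assumption (KOI-8 as `koi8_r`, Win-1251 as `cp1251`, GOST-ALT as `cp866`, ISO as `iso8859_5`); the original GOST encoding has no standard Python codec, so it is left out.

```python
word = "Привет"  # "Hello"

# Assumed codec names for the legacy encodings; see the note above.
codecs_to_try = ["koi8_r", "cp1251", "cp866", "iso8859_5"]

# The same six letters come out as different bytes under each encoding.
encoded = {name: word.encode(name) for name in codecs_to_try}
for name, raw in encoded.items():
    print(f"{name:10} {raw.hex()}")

# Reading KOI8-R bytes as if they were Windows-1251 decodes without error
# but produces garbage, which is why guessing the encoding wrong was so painful.
misread = encoded["koi8_r"].decode("cp1251")
print(misread)

# With UTF-8 everywhere there is nothing to guess.
assert word.encode("utf-8").decode("utf-8") == word
```

None of these byte strings carries any marker saying which encoding produced it, so every program had to be told, or had to guess.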

Filesystems and case-insensitivity

Posted Nov 29, 2018 11:04 UTC (Thu) by andrewsh (subscriber, #71043) [Link] (1 responses)

Well, nobody (citation needed) ever used GOST or ISO encodings.

Filesystems and case-insensitivity

Posted Nov 29, 2018 11:07 UTC (Thu) by Cyberax (✭ supporter ✭, #52523) [Link]

GOST was used quite a lot in the pre-Internet era and surfaced periodically afterwards, in random places like receipt printer encodings.

ISO was sometimes used on the Internet. It was rare, but it existed.

Filesystems and case-insensitivity

Posted Nov 28, 2018 20:30 UTC (Wed) by HenrikH (subscriber, #31152) [Link]

Came here to ask the same thing; I have not seen anything other than UTF-8 here in Scandinavia for the last 10-15 years.
