Filesystems and case-insensitivity
Posted May 29, 2019 23:00 UTC (Wed) by Serentty (guest, #132335)
In reply to: Filesystems and case-insensitivity by chithanh
Parent article: Filesystems and case-insensitivity
This is not a deficiency in Unicode. Precomposed characters such as É were only ever encoded in Unicode for compatibility with legacy encodings, and would not have been included otherwise. They continue to be used because they save a few bytes, which you might as well take even if compression makes the difference moot in the end. Combining diacritics have always been the preferred method because they are much more flexible: they let users compose arbitrary characters without needing to constantly update their software or risk mojibake. Many scripts in Unicode work entirely through combining diacritics, and it works just fine; the Indic scripts are good examples. It should be noted that the legacy encodings for these scripts usually worked that way as well. Conformant implementations will treat composed and decomposed characters identically, so going down the rabbit hole of trying to provide every precomposed character anyone might ever want isn't worth it when composition works just as well. If you notice that combining diacritics aren't giving you the nice hand-tweaked glyphs that precomposed characters get, and the diacritic ends up looking all wrong, take it up with the developer of the text renderer or the font, because that's not how Unicode is supposed to work.
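
To make the point about composed and decomposed forms concrete, here is a minimal sketch using Python's standard `unicodedata` module. The helper name `canonically_equal` is my own, not part of any library, and comparing NFD forms is just one reasonable way to test canonical equivalence. It also shows the "few bytes" saved by the precomposed form in UTF-8.

```python
import unicodedata

# Two spellings of the same character.
precomposed = "\u00C9"   # É as a single code point (LATIN CAPITAL LETTER E WITH ACUTE)
decomposed = "E\u0301"   # E followed by COMBINING ACUTE ACCENT


def canonically_equal(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFD."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# A naive code point comparison sees two different strings.
print(precomposed == decomposed)                    # False

# After normalization they compare equal.
print(canonically_equal(precomposed, decomposed))   # True

# The precomposed form is one byte shorter in UTF-8.
print(len(precomposed.encode("utf-8")))             # 2
print(len(decomposed.encode("utf-8")))              # 3
```

A filesystem or any other software that compares names byte-for-byte without normalizing first will see these as two distinct strings, which is where the trouble in this thread comes from.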
