Rustaceans at the border
Posted Apr 22, 2022 10:24 UTC (Fri) by smurf (subscriber, #17840)
In reply to: Rustaceans at the border by ssokolow
Parent article: Rustaceans at the border
And even if there is, you could mark the offenders, e.g. by placing a combining grapheme joiner U+034F between them.
IMHO the real reason is that, at the time, font rendering engines were not clever enough to show alternate glyphs for composed characters where naïve superposition of their constituent parts simply doesn't work. (As in, all accented/umlauted/whatever'd capital letters.)
That, or the precedent of Latin-1 with its mountain of composed characters proved too strong and nobody even thought about solving the problem some other way until it was too late.
That, or the problem was deemed unfixable: instead of expanding Han-encoded texts by 50% (three-byte UTF-8 instead of two-byte words) you'd blow them up to at least 250% of their original size (two bytes for radical A, two for radical B, at least one for either marking the end of a glyph or a joiner; more if there's a radical C involved), which would not have been acceptable at the time. After all, at the time Weird Al chastised Microsoft that "in case you haven't noticed, four-gig drives don't grow on trees".
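To spell out that arithmetic, here is a quick Python sketch. Big5 merely stands in for "two-byte words", and the radical-based scheme is purely hypothetical: it is the one guessed at above (two bytes per radical plus a one-byte joiner or terminator), not any real encoding.

```python
# Storage cost of one Han character under three schemes.
ch = "中"  # U+4E2D, encodable in both Big5 and UTF-8

legacy_bytes = len(ch.encode("big5"))   # legacy double-byte encoding
utf8_bytes = len(ch.encode("utf-8"))    # UTF-8 as it exists today

# Hypothetical radical-based scheme (an assumption, not a real encoding):
# each radical takes two bytes, plus one byte for a joiner or end marker.
RADICAL_BYTES = 2
JOINER_BYTES = 1


def decomposed_bytes(radicals: int) -> int:
    """Bytes needed for one glyph built from `radicals` components."""
    return radicals * RADICAL_BYTES + JOINER_BYTES


schemes = [
    ("legacy two-byte", legacy_bytes),
    ("UTF-8", utf8_bytes),
    ("two radicals + joiner", decomposed_bytes(2)),
    ("three radicals + joiner", decomposed_bytes(3)),
]

for name, size in schemes:
    # Report each size relative to the legacy two-byte baseline.
    print(f"{name}: {size} bytes, {size / legacy_bytes:.0%} of legacy size")
```

A real scheme would probably need more than one byte of overhead per glyph (layout hints, for instance), so the hypothetical figures are lower bounds.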
