|
|
Subscribe / Log in / New account

UTF-16

UTF-16

Posted Mar 25, 2010 15:13 UTC (Thu) by marcH (subscriber, #57642)
In reply to: UTF-16 by wahern
Parent article: Resetting PHP 6

> ... because of perceived complexity viz-a-viz the limitations of modern computing hardware and/or inherited [racist] notions of cultural suitability.

The invention of alphabets was a major breakthrough - because they are inherently simpler than logographies. It's not just about computers: compare how much time a child typically needs to learn one versus the other.

> I think the answer is because programmers have this ingrained belief that what works for their source code editor works for everything else. Same models of text manipulation, same APIs.

Of course yes, what did you expect? This problem will be naturally solved when countries with complicated writing systems will stop waiting for the western world to solve problems only they have.

> It's an unjustifiable intransigence.

Yeah, software developers are racists since most of them do not bother about foreign languages...


to post comments

UTF-16

Posted Mar 25, 2010 16:17 UTC (Thu) by tialaramex (subscriber, #21167) [Link] (5 responses)

The Han characters aren't logograms*, but you're right that alphabetic writing systems are better and it isn't even hard to find native Chinese linguists who agree. Some believe that the future of Chinese as a spoken language (or several languages, depending on your politics) depends now on accepting that it will be written in an existing alphabet - probably the Latin alphabet - and the Han characters will in a few generations become a historical curiosity, like runes. For what it's worth your example of teaching children is very apt, I'm told that Chinese schools have begun using the Latin alphabet to teach (young) children in some places already.

* Nobody uses logograms. In a logographic system you have a 1:1 correspondence between graphemes and words. Invent a new word, and you need a new grapheme. Given how readily humans (everywhere) invent new words, this is quickly overwhelming. So, as with the ancient Egyptian system, the Chinese system is clearly influenced by logographic ideas, but it is not a logographic system, a native writer of Chinese can write down words of Chinese they have never seen, based on hearing them and inferring the correct "spelling", just as you might in English.

UTF-16

Posted Mar 25, 2010 19:24 UTC (Thu) by atai (subscriber, #10977) [Link] (4 responses)

As a Chinese, I can tell you that the Chinese characters are not going anywhere. The Chinese characters will stay and be used for Chinese writings, for the next 2000 years just as in the previous 2000 years.

The ideas that China is backwards because of the language and written characters should now go bankrupt.

UTF-16

Posted Mar 25, 2010 22:26 UTC (Thu) by nix (subscriber, #2304) [Link] (3 responses)

Well, after the singular invention of alphabetic writing systems by some
nameless Phoenicians, Mesopotamians and Egyptians 2500-odd years ago,
*everyone* else was backwards. It's an awesome piece of technology. (btw,
the Chinese characters have had numerous major revisions, simplifications
and complexifications over the last two millennia, the most recent being
the traditional/simplified split: any claim that the characters are
unchanged is laughable. They have certainly changed much more than the
Roman alphabet.)

UTF-16

Posted Mar 25, 2010 23:21 UTC (Thu) by atai (subscriber, #10977) [Link] (1 responses)

I don't know if alphabetic writing is forwards or backwards.

But if you say Chinese characters changed more than the Latin alphabet, then you are clearly wrong; the "traditional" Chinese characters certainly stayed mostly the same since 105 BC (What happened in Korea, Japan or Vietnam do not apply because these are not Chinese).

I can read Chinese writings from the 1st Century; can you use today's English spellings or words to read English writings from the 13th Century?

UTF-16

Posted Mar 26, 2010 11:16 UTC (Fri) by mpr22 (subscriber, #60784) [Link]

I can read Chinese writings from the 1st Century; can you use today's English spellings or words to read English writings from the 13th Century?

13th Century English (i.e. what linguists call "Middle English") should be readable-for-meaning by an educated speaker of Modern English with a few marginal glosses. Reading-for-sound is almost as easy (95% of it is covered by "Don't silence the silent-in-Modern-English consonants. Pronounce the vowels like Latin / Italian / Spanish instead of like Modern English").

My understanding is that the Greek of 2000 years ago is similarly readable to fluent Modern Greek users. (The phonological issues are a bit trickier in that case.)

In both cases - and, I'm sure, in the case of classical Chinese - it would take more than just knowing the words and grammar to receive the full meaning of the text. Metaphors and cultural assumptions are tricky things.

McLuhan

Posted Apr 15, 2010 9:27 UTC (Thu) by qu1j0t3 (guest, #25786) [Link]

Anyone who wants to explore the topic of comparative alphabets further may find McLuhan's works, such as The Gutenberg Galaxy, rewarding.

UTF-16

Posted Mar 25, 2010 16:21 UTC (Thu) by paulj (subscriber, #341) [Link] (8 responses)

As a data-point, I believe children in China are first taught pinyin (i.e.
roman alphabet encoding of mandarin), and learn hanzi logography buiding on
their knowledge of pinyin.

UTF-16

Posted Mar 25, 2010 19:33 UTC (Thu) by atai (subscriber, #10977) [Link] (3 responses)

But pinyin is not a writing system for Chinese. It helps with teaching pronunciation.

UTF-16

Posted Mar 26, 2010 2:49 UTC (Fri) by paulj (subscriber, #341) [Link] (2 responses)

I have a (mainland chinese) chinese dictionary here, intended for kids,
and it is indexed by pinyin. From what I have seen of (mainland) chinese,
pinyin appears to be their primary way of writing chinese (i.e. most writing
these days is done electronically, and pinyin is used as the input
encoding).

UTF-16

Posted Mar 26, 2010 15:37 UTC (Fri) by chuckles (guest, #41964) [Link] (1 responses)

I'm in China right now learning Mandarin so I can comment on this. Children learn pinyin at the same time as the characters. The Pinyin is printed over the characters and is used to help with pronunciation. While dictionaries targeted towards little children and foreigners are indexed by pinyin, normal dictionaries used by adults are not. Dictionaries used by adults are indexed by the radicals.
While pinyin is nice, there are no tone markers. So you have a 1 in 5 chance (4 tones plus neutral) of getting it right.
You are correct that pinyin is the input system on computers, cell phones, everything electronic, in mainland china. Taiwan has its own system. Also, Chinese are very proud people, Characters aren't going anywhere for a LONG time.

UTF-16

Posted Mar 26, 2010 21:24 UTC (Fri) by paulj (subscriber, #341) [Link]

Yes, I gather formal pinyin has accents to differentiate the tones, but on a
computer you just enter the roman chars and the computer gives you an
appropriate list of glyphs to pick (with arrow key or number).

And yes they are. Shame there's much misunderstanding (in both directions)
though. Anyway, OT.. ;)

UTF-16

Posted Mar 25, 2010 20:23 UTC (Thu) by khc (guest, #45209) [Link] (3 responses)

I was raised in Hong Kong and not in mainland China, but I do have relatives in China. I've never heard that kids learn pinyin before the characters.

UTF-16

Posted Mar 26, 2010 2:44 UTC (Fri) by paulj (subscriber, #341) [Link] (2 responses)

This is what someone who was raised in China has told me.

UTF-16

Posted Mar 27, 2010 22:39 UTC (Sat) by man_ls (guest, #15091) [Link] (1 responses)

China has 1,325,639,982 inhabitants, according to Google. That is more than the whole of Europe, Russia, US, Canada and Australia combined. Even if there is a central government, we can assume a certain cultural diversity.

UTF-16

Posted Mar 28, 2010 4:22 UTC (Sun) by paulj (subscriber, #341) [Link]

Good point. :)

This was a Han chinese person from north-eastern China, i.e. someone from
the dominant cultural group in China, from the more developed part of China.
I don't know how representative their education was, but I suspect there's
at least some standardisation and uniformity.

UTF-16

Posted Dec 27, 2010 2:01 UTC (Mon) by dvdeug (guest, #10998) [Link]

You're writing in a language with one of the most screwed up orthographies in existence. Convince English speakers to use a reasonable orthography, and then you can start complaining about the rest of the world.

Not only that, some of these scripts you're not supporting are wonders. Just because Arabic is always written in cursive and thus needs complex script support, doesn't mean that it's not an alphabet that's perfectly suited to its language, that is in fact easier to learn for children, then the English alphabet is for English speakers.

Supporting Chinese or Arabic is like any other feature. You can refuse to support it, but if your program is important, patches or forks are going to float around to fix. Since Debian and other distributions are committed to supporting those languages, the version of the program that will be in the distributions will be the forked version. If there is no fork, they may just not include it. That's the cost you'll have to pay for ignoring the features they want.


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds