Setting up international character support (Linux.com)
Posted Feb 7, 2006 11:41 UTC (Tue) by tialaramex
Parent article: Setting up international character support (Linux.com)
This is a pretty awful article. It gets a few facts right and a lot of things wrong, and it manages to make the whole thing sound so complicated that users (especially those with little interest in foreign languages and cultures) will be inclined to avoid it altogether.
It confuses a locale (information about the user's culture and language, e.g. paper sizes, currency, calendar) with a character set (which is just a bunch of characters, possibly an ordered set) and both of those with a character encoding (a way to turn characters into byte sequences). The last bit is understandable: even early IETF documents confuse character encodings with character sets, because back then the two were often synonymous.
en_GB is the name of a UK locale: English language, A4 paper, pound sterling (£) currency, a first month called "January", and decimals separated with a point rather than a comma.
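For anyone who wants to see what a locale actually carries, here is a rough Python sketch that asks the C library about a named locale. The helper name describe_locale and the locale name "en_GB.UTF-8" are my own assumptions; whether that locale is installed depends on your system, so the helper returns None when it is missing. It also relies on nl_langinfo, which is only available on Unix-like systems.

```python
import locale


def describe_locale(name):
    """Return a few conventions of the named locale, or None if it is not installed.

    describe_locale is a hypothetical helper written for this illustration.
    """
    saved = locale.setlocale(locale.LC_ALL)  # query the current setting
    try:
        locale.setlocale(locale.LC_ALL, name)
    except locale.Error:
        return None  # the locale is not available on this machine
    try:
        conv = locale.localeconv()
        return {
            "decimal_point": conv["decimal_point"],
            "currency_symbol": conv["currency_symbol"],
            # nl_langinfo is Unix-only; MON_1 is the name of the first month
            "first_month": locale.nl_langinfo(locale.MON_1),
        }
    finally:
        locale.setlocale(locale.LC_ALL, saved)  # restore the previous locale


print(describe_locale("en_GB.UTF-8"))
```

Note that nothing in that dictionary says how the characters are stored as bytes; that is the job of the encoding, not the locale.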
UTF-8 is a character encoding, and it encodes the character set ISO 10646, which is sometimes called "Unicode" since it's identical to the Unicode character set, although Unicode.org standardises many things beyond the ISO 10646 character set.
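To make the character set versus encoding distinction concrete, here is a small Python sketch. The function name encodings_of is just something I made up for illustration. It takes one character, reports its position in the character set (the code point), and then shows the different byte sequences that two encodings produce for that same character.

```python
def encodings_of(ch):
    """Show one character's code point and its bytes under two encodings.

    encodings_of is a hypothetical helper written for this illustration.
    """
    return {
        # Position in the ISO 10646 / Unicode character set
        "code_point": ord(ch),
        # UTF-8 can encode every character in that set
        "utf-8": ch.encode("utf-8"),
        # ISO-8859-1 can only encode the first 256 code points
        "iso-8859-1": ch.encode("iso-8859-1"),
    }


info = encodings_of("£")
print(hex(info["code_point"]))  # 0xa3
print(info["utf-8"])            # b'\xc2\xa3'
print(info["iso-8859-1"])       # b'\xa3'
```

Same character, same code point, two different byte sequences. A character outside the first 256 code points (the euro sign, say) would raise UnicodeEncodeError for ISO-8859-1 here, while UTF-8 handles it fine.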
A previous poster pointed out numerous additional errors. If you don't know anything about i18n, Unicode or UTF-8, steer clear of this article. If you do know something, be prepared to be annoyed by it. Next time we see an article which says Linux is "not for commercial use" or that the GPL is "untried, and probably illegal", we should consider this article and remember that incompetence is a more common explanation than maliciousness.