
UTF family


Posted Feb 11, 2011 4:01 UTC (Fri) by tialaramex (subscriber, #21167)
In reply to: Moving to Python 3 by marcH
Parent article: Moving to Python 3

I expect a lot of Java programs (and other programs) work fine with supplementary characters and myriad other things, so long as they leave anything clever to software written by someone else (or more likely a team of somebody elses) who actually knows lots about text.

What were you imagining they should be using java.lang.String.codePointCount() for? Text is hard, like I said, and a count of Unicode code points is rarely what you need.

Examples of things which are assigned one or more Unicode code points: a harmless, invisible and ignorable marker; an indication that subsequent neutral text is intended to be displayed right-to-left; the cedilla accent on a character; a lowercase x; a vertical tab; an indication that a non-fatal error occurred in some previous processing.
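
To make that concrete, here is a rough sketch of how the code point count can disagree with what a reader would call "a character". The class name CodePointDemo and its helper methods are made up for illustration, and the particular code points (U+0327 for the cedilla, U+1F600 as a sample supplementary character) are my own picks, not anything the discussion above specifies.

```java
public class CodePointDemo {
    // Build a String from a list of code points (helper invented for this sketch).
    static String fromCodePoints(int... codePoints) {
        return new String(codePoints, 0, codePoints.length);
    }

    // What String.length() reports: UTF-16 code units.
    static int codeUnits(String s) {
        return s.length();
    }

    // What String.codePointCount() reports: Unicode code points.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'c' followed by U+0327 COMBINING CEDILLA: one visible letter, two code points.
        String decomposed = fromCodePoints(0x63, 0x0327);
        // U+00E7, the precomposed form of the same letter: one code point.
        String precomposed = fromCodePoints(0x00E7);
        // U+1F600, a supplementary character: one code point, two UTF-16 code units.
        String supplementary = fromCodePoints(0x1F600);

        for (String s : new String[] { decomposed, precomposed, supplementary }) {
            System.out.println(codeUnits(s) + " code units, " + codePoints(s) + " code points");
        }
    }
}
```

The first two strings normally render identically yet give different counts, and neither number tells you how many columns the text occupies or where it is safe to cut it. That sort of question is better left to something like java.text.BreakIterator or a proper text library.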



