
8 byte characters?

Posted Aug 13, 2005 2:53 UTC (Sat) by hp (guest, #5220)
In reply to: 8 byte characters? by ringerc
Parent article: Our bloat problem

Unicode doesn't fit in 16 bits anymore; most apps using 16-bit encodings would be using UTF-16, which is variable-length just like UTF-8: any code point above U+FFFF takes two 16-bit units (a surrogate pair). If you pretend that each 16 bits is one character, then either you're using a broken encoding that can't handle all of Unicode (UCS-2, for instance), or you're using UTF-16 in a buggy way. To have one array element per code point you have to use a 32-bit encoding such as UTF-32.
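
To make the variable-length point concrete, here is a minimal Python sketch. The sample string and the variable names are my own choices for illustration; it just counts how many code units each encoding needs for one ASCII letter followed by one code point outside the 16-bit range.

```python
import struct

# 'a' followed by U+1D11E (MUSICAL SYMBOL G CLEF), which lies above U+FFFF.
s = "a\U0001D11E"

utf8 = s.encode("utf-8")
utf16 = s.encode("utf-16-le")   # little-endian, no byte order mark
utf32 = s.encode("utf-32-le")   # little-endian, no byte order mark

code_points = len(s)            # Python strings index by code point
utf8_units = len(utf8)          # 8-bit units
utf16_units = len(utf16) // 2   # 16-bit units
utf32_units = len(utf32) // 4   # 32-bit units

# Look at the individual 16-bit units: the second code point becomes
# a high surrogate followed by a low surrogate.
units = struct.unpack("<%dH" % utf16_units, utf16)

print(code_points, utf8_units, utf16_units, utf32_units)
print([hex(u) for u in units])
```

Only the 32-bit encoding gives a unit count equal to the number of code points; treating the UTF-16 array as one element per character would miscount this string.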

UTF-8 has the huge advantage that ASCII is a subset of it: any ASCII text is already valid UTF-8, byte for byte, which is why it is the usual choice on UNIX.
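
A small Python sketch of what the subset property buys you in practice. The sample strings are made up for illustration; the check is simply that ASCII text encodes to identical bytes in UTF-8, while a 16-bit encoding of the same text contains NUL bytes that byte-oriented C string code would trip over.

```python
# Plain ASCII text, such as a typical UNIX command line.
text = "ls -l /usr/bin"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16-le")

# ASCII text is unchanged when encoded as UTF-8.
same_as_ascii = ascii_bytes == utf8_bytes

# UTF-16 pads every ASCII character with a zero byte.
utf16_has_nul = b"\x00" in utf16_bytes
utf8_has_nul = b"\x00" in utf8_bytes

# Non-ASCII characters in UTF-8 use only bytes >= 0x80, so they can
# never be mistaken for ASCII bytes such as '/' or NUL.
multibyte = "\u00e9\u20ac".encode("utf-8")   # e-acute and the euro sign
all_high = all(b >= 0x80 for b in multibyte)

print(same_as_ascii, utf16_has_nul, utf8_has_nul, all_high)
```

This is why existing byte-oriented tools and APIs that only care about ASCII delimiters can pass UTF-8 through untouched.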




Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds