8 byte characters?
Posted Aug 20, 2005 6:24 UTC (Sat) by miallen (guest, #10195)
In reply to: 8 byte characters? by ringerc
Parent article: Our bloat problem

"Many apps use UCS-2 internally, because it's *MUCH* faster to work with for many things than UTF-8."

I don't know about that. First, it is a rare thing that you would say "I want 6 *characters*". The only case that I can actually think of would be if you were printing characters in a terminal which has a fixed number of positions for characters. In this case UCS-2 is easier to use, but even then I'm not convinced it's actually faster. If you're using Cyrillic, yeah, it will probably be faster, but if it's 90% ASCII I would have to test that.

Consider that for mostly-ASCII text UTF-8 occupies roughly half the space of UCS-2, and that CPU cache misses account for a LOT of overhead. If you have large collections of strings, say from a big XML file, the CPU will spend a lot more time waiting for data with UCS-2 than with UTF-8.

In truth the encoding of strings is an ant compared to the elephant of data structures and algorithms. If you design your code well and adapt interfaces so that modules can be reused, you can improve the efficiency of your code much more than with petty compiler options, changing character encodings, etc.
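
To put rough numbers on the space argument, here is a minimal Python sketch, not a benchmark. It assumes that the "utf-16-le" codec can stand in for UCS-2 (Python has no "ucs-2" codec, and the two produce the same bytes for characters in the Basic Multilingual Plane). The sample strings and the function names encoded_sizes and count_chars_utf8 are made up for illustration. It also shows that counting characters in UTF-8 only needs one pass that skips continuation bytes.

```python
# Compare the storage size of the same text in UTF-8 and in a fixed
# 2-byte-per-character encoding.
# Assumption: "utf-16-le" stands in for UCS-2, which is only valid for
# characters inside the Basic Multilingual Plane.

def encoded_sizes(text):
    """Return (utf8_bytes, ucs2_bytes) for text made of BMP characters."""
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))

def count_chars_utf8(data):
    """Count code points in UTF-8 bytes by skipping continuation bytes.

    Continuation bytes have the bit pattern 10xxxxxx; every other byte
    starts a new code point.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)

# Hypothetical samples: an XML-like ASCII string and a short Cyrillic string.
ascii_text = "<item id='42'>hello world</item>"
cyrillic_text = "Привет, мир"

for label, text in (("ascii", ascii_text), ("cyrillic", cyrillic_text)):
    utf8_len, ucs2_len = encoded_sizes(text)
    chars = count_chars_utf8(text.encode("utf-8"))
    print(label, "chars:", chars, "utf-8 bytes:", utf8_len, "ucs-2 bytes:", ucs2_len)
```

For pure ASCII the UTF-8 form is exactly half the size; for Cyrillic the two are nearly equal, since each Cyrillic letter takes two bytes either way. Whether the smaller footprint turns into fewer cache misses on real workloads would still have to be measured.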
Copyright © 2008, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds