
Unicode 15 released


Posted Sep 18, 2022 6:44 UTC (Sun) by jond (subscriber, #37669)
In reply to: Unicode 15 released by NYKevin
Parent article: Unicode 15 released

> That's nice, but String is still UTF-16 according to https://docs.oracle.com/javase/7/docs/api/java/lang/Strin...,

I don’t know whether this is fixed now or not, but that page won’t tell you, because it’s for Java SE 7, and OP was talking about the recently released Java SE 18.



Unicode 15 released

Posted Sep 18, 2022 8:50 UTC (Sun) by ABCD (subscriber, #53650) [Link]

The Java 18 docs for that class at https://docs.oracle.com/en/java/javase/18/docs/api/java.b... seem to indicate that this hasn't changed: it's still UTF-16.

Unicode 15 released

Posted Sep 18, 2022 8:55 UTC (Sun) by dtlin (subscriber, #36537) [Link]

char being a 16-bit value is hard-baked into the JVM, and thus anything that uses a char[] is inherently operating on UTF-16.

Java 9 did add the -XX:+CompactStrings option (JEP 254), which changed the internal representation of String from char[] to byte[], along with a bit determining whether that representation is Latin-1 or UTF-16, with the former taking up half the space. But there was no change to the user-visible API; it is only an implementation detail.
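To illustrate that the user-visible API still counts UTF-16 code units whatever the internal storage is, here is a minimal sketch. The class name Utf16ApiDemo and its helper methods are made up for this example; only the String and Character calls are from the standard library.

```java
public class Utf16ApiDemo {
    // Number of UTF-16 code units, which is what String.length() reports.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points in the whole string.
    static int codePointCount(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // 'a' followed by U+1F600, written as an explicit surrogate pair.
        String s = "a\uD83D\uDE00";
        System.out.println(codeUnits(s));                           // 3
        System.out.println(codePointCount(s));                      // 2
        // charAt() hands back individual surrogates, not code points.
        System.out.println(Character.isHighSurrogate(s.charAt(1))); // true
        System.out.println(Character.isLowSurrogate(s.charAt(2)));  // true
    }
}
```

The results should be the same with compact strings enabled or disabled, since the flag only affects how the bytes are stored.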

(Java 9 did add String#codePoints() returning an IntStream of code points, but it's unrelated, and you could have implemented that yourself with codePointAt() and offsetByCodePoints() anyway; it's just more convenient.)
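A rough sketch of the do-it-yourself version next to the stream-based one, assuming nothing beyond the standard library. The class name CodePointWalk and both method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

public class CodePointWalk {
    // Manual iteration: read the code point at index i, then advance i
    // by one code point (one or two chars) with offsetByCodePoints().
    static List<Integer> manualCodePoints(String s) {
        List<Integer> out = new ArrayList<>();
        int i = 0;
        while (i < s.length()) {
            out.add(s.codePointAt(i));
            i = s.offsetByCodePoints(i, 1);
        }
        return out;
    }

    // The same result via the IntStream returned by codePoints().
    static List<Integer> streamCodePoints(String s) {
        return s.codePoints().boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 'a', U+1F600 (as a surrogate pair), 'b'
        String s = "a\uD83D\uDE00b";
        for (int cp : manualCodePoints(s)) {
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase());
        }
        System.out.println(manualCodePoints(s).equals(streamCodePoints(s)));
    }
}
```

Both methods should agree on any input, including strings with unpaired surrogates, which each treats as a single code point.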

