
DeVault: Announcing the Hare programming language

Posted May 3, 2022 20:34 UTC (Tue) by hsivonen (subscriber, #91034)
In reply to: DeVault: Announcing the Hare programming language by tialaramex
Parent article: DeVault: Announcing the Hare programming language

Allowing all Unicode code points is what Python 3 does. The result is very bizarre: you can have both non-BMP characters as single units and surrogates—even as pairs.
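A minimal sketch of that behavior, assuming CPython 3 (the variable names and the can_encode_utf8 helper are just for illustration): the same str type holds U+1F600 as one code point and also the surrogate pair that UTF-16 would use for it, and the two are distinct strings.

```python
# A Python 3 str is a sequence of code points, surrogates included.
non_bmp = "\U0001F600"   # U+1F600 stored as a single code point
pair = "\ud83d\ude00"    # the UTF-16 surrogate pair for U+1F600, stored as two code points


def can_encode_utf8(s: str) -> bool:
    """Illustrative helper: report whether the strict UTF-8 codec accepts s."""
    try:
        s.encode("utf-8")
        return True
    except UnicodeEncodeError:
        return False


# The two spellings of the "same" character are different strings.
assert len(non_bmp) == 1
assert len(pair) == 2
assert non_bmp != pair

# Only the surrogate-free string is representable as UTF-8.
assert can_encode_utf8(non_bmp)
assert not can_encode_utf8(pair)
```

So a Python str can hold values that have no UTF-8 form at all, which is the incoherence in question.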

The logical value space of UTF-8 strings is sequences of Unicode scalar values. Rust’s char being restricted to a Unicode scalar value is coherent with this.
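For concreteness, a Unicode scalar value is any code point except the surrogates U+D800 through U+DFFF. A rough sketch of that predicate in Python (the function names are made up for illustration, not taken from any library):

```python
def is_scalar_value(cp: int) -> bool:
    """True if cp is a Unicode scalar value: a code point that is not a surrogate."""
    return 0 <= cp <= 0x10FFFF and not (0xD800 <= cp <= 0xDFFF)


def is_scalar_string(s: str) -> bool:
    """True if every code point in s is a scalar value, i.e. s has a UTF-8 form."""
    return all(is_scalar_value(ord(ch)) for ch in s)


assert is_scalar_value(0x1F600)        # non-BMP character
assert not is_scalar_value(0xD800)     # first surrogate
assert not is_scalar_value(0xDFFF)     # last surrogate
assert is_scalar_string("na\u00efve")
assert not is_scalar_string("\ud83d\ude00")
```

A char type restricted this way can only build strings that pass is_scalar_string, so every such string can be encoded as UTF-8.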

(“Unicode code points” is almost never the right answer between “Unicode scalar values” and “UTF-16 code units”.)

A programming language shouldn’t prohibit special scalar values like U+FFFE. CLDR collation uses U+FFFE as a merge separator: the concatenation of str1, U+FFFE, and str2 is guaranteed to collate equivalently to first collating on str1 and, if those are equal, then collating on str2, even for Canadian French.
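A small sketch of why the language has to let U+FFFE through, again in Python (merge_key and MERGE_SEPARATOR are hypothetical names; the actual comparison would need a CLDR-aware collator such as ICU, which is not shown here):

```python
MERGE_SEPARATOR = "\ufffe"  # a noncharacter, but still a Unicode scalar value


def merge_key(str1: str, str2: str) -> str:
    """Build the single string that would be handed to a CLDR-aware collator."""
    return str1 + MERGE_SEPARATOR + str2


key = merge_key("c\u00f4te", "cot\u00e9")

# U+FFFE has an ordinary three-byte UTF-8 encoding and round-trips unchanged.
assert MERGE_SEPARATOR.encode("utf-8") == b"\xef\xbf\xbe"
assert key.encode("utf-8").decode("utf-8") == key
```

If the string type rejected U+FFFE, this key could not be constructed in the first place.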




Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds