|
|
Log in / Subscribe / Register

DeVault: Announcing the Hare programming language

DeVault: Announcing the Hare programming language

Posted May 3, 2022 11:07 UTC (Tue) by mathstuf (subscriber, #69389)
In reply to: DeVault: Announcing the Hare programming language by peter-b
Parent article: DeVault: Announcing the Hare programming language

> it isn't UB for a char8_t* to point to a string buffer that doesn't contain UTF-8.

One reason that comes to my mind is that it would need to eschew `u8ptr++` because if it currently points at a code point encoded with multiple bytes, incrementing one byte would make it UB, no? Or would `++` inspect the byte encoded length of the current code point and jump an appropriate amount? That certainly seems novel too.


to post comments

DeVault: Announcing the Hare programming language

Posted May 5, 2022 0:50 UTC (Thu) by tialaramex (subscriber, #21167) [Link]

Because both of the increment operators in C++ can be overridden for a type, this isn't difficult to do, if C++ wanted to do it.

But I don't expect C++ to actually do that lifting for the same reason it still has both these silly operators in the first place. Backward compatibility trumps all other considerations.

Rust's str actually provides both iterations, if you want the underlying *bytes* you can have those, and of course individual bytes are just UTF-8 code units and on their own don't necessarily mean anything specifically, but if you want "characters" (Rust's char) you can iterate over those and under the hood it is indeed moving forward the appropriate number of bytes each time.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds