
Rustaceans at the border


Posted Apr 17, 2022 21:19 UTC (Sun) by bartoc (guest, #124262)
In reply to: Rustaceans at the border by tialaramex
Parent article: Rustaceans at the border

Well, the real issue is that getting a feature into C is quite involved, and I can understand nobody wanting to go through all that trouble (possibly including travel, with airfare, hotel costs and so on) just to standardize a one-line function.

The real problem with locales (besides them not really working with variable-width encodings, and being based on code units) is that programmers do NOT expect the behavior of many of these functions to change out from under them (printf is locale-sensitive!). This doesn't just trip up beginners either! When the C++ committee standardized formatting for dates and times (via std::format) they accidentally made it locale-sensitive by basically saying “interpret the format string as strftime would”. Whoops. (The std::format model is locale-invariant by default, with special specifiers to do locale things, and the ability to pass in a locale object if you want to use that instead of the global one.)
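To make the printf point concrete, here is a minimal sketch in standard C. The helper name format_in_locale is made up for illustration; it temporarily switches LC_NUMERIC, formats a double with snprintf, and restores the previous setting. Which locale names exist depends entirely on the system, so the function reports failure instead of assuming one is installed.

```c
#include <locale.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: format `value` with two decimals while LC_NUMERIC is
   temporarily set to `locale_name`. Returns 0 on success, -1 if the locale
   is not available or memory runs out. */
int format_in_locale(const char *locale_name, double value,
                     char *out, size_t out_size)
{
    /* setlocale(..., NULL) returns a string that a later setlocale call may
       overwrite, so take a private copy before changing anything. */
    const char *current = setlocale(LC_NUMERIC, NULL);
    if (current == NULL)
        return -1;
    char *saved = malloc(strlen(current) + 1);
    if (saved == NULL)
        return -1;
    strcpy(saved, current);

    int rc = -1;
    if (setlocale(LC_NUMERIC, locale_name) != NULL) {
        /* Same format string every time; the decimal separator in the
           output depends on the active LC_NUMERIC. */
        snprintf(out, out_size, "%.2f", value);
        rc = 0;
    }

    /* Put the global state back the way we found it. */
    setlocale(LC_NUMERIC, saved);
    free(saved);
    return rc;
}
```

Called with "C" this gives "1.50" for 1.5. Called with something like "de_DE.UTF-8", on a system where that locale happens to be installed, the same format string typically gives "1,50" instead, which is exactly the kind of change nobody parsing the output is expecting.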

C locales are so totally insufficient for actual internationalization that having everything be locale-sensitive basically only results in non-user-facing stuff being mangled. I hope you like your log analytics misclassifying output from all your machines in countries with a different date order than your developers! It's totally insane.
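For the log timestamp case, one way to sidestep the problem is to avoid the locale-dependent strftime conversions (%x, %c and friends) and spell out a fixed numeric layout. A minimal sketch, with format_log_timestamp being a hypothetical name, and assuming that plain %Y, %m, %d, %H, %M and %S always produce decimal digits (which is how the C standard describes them when no E or O modifier is used):

```c
#include <time.h>

/* Hypothetical helper: write `t` as an ISO 8601 UTC timestamp such as
   "2022-04-17T21:19:00Z". Returns 0 on success, -1 on failure.
   Note: gmtime() uses a static buffer, so this sketch is not thread-safe. */
int format_log_timestamp(time_t t, char *out, size_t out_size)
{
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return -1;

    /* Only numeric conversions and literal characters are used, so the
       field order cannot change with LC_TIME. */
    if (strftime(out, out_size, "%Y-%m-%dT%H:%M:%SZ", utc) == 0)
        return -1;
    return 0;
}
```

On a system where time_t counts seconds from the Unix epoch, passing 0 yields "1970-01-01T00:00:00Z" regardless of what the machine's locale is set to.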

Even if they were useful for localization, the actual specification is essentially “do whatever you want, unless it's the C locale”. It's really, really bad. And in practice implementers do just phone in locales, because they aren't really useful for anything anyway. They should be deprecated and removed (or “removed” by specifying that all locales are equivalent to the C locale).

Then there are the multiple attempts at standard C encoding conversion routines, all of which are broken.

Even if you stick to Unicode you can get into trouble with cursed/unexpected Unicode transformation formats. GB18030 is the worst (and the only really bad one in somewhat common usage).



Rustaceans at the border

Posted Apr 19, 2022 15:21 UTC (Tue) by tialaramex (subscriber, #21167)

We're diverting pretty far from the original topic, but as to it being difficult to get features into C, there isn't any actual requirement to go through the standard first.

Consider the BSD sockets API. Why is that commonplace? Did some JTC1 sub-committee sign off on it and then all our operating systems got the same API? No, it's just the right shape, so everybody adopted it, and any "standards" came afterwards and simply document what was de facto already the case about networking.


Copyright © 2026, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds