Debian switching to EGLIBC

Posted May 6, 2009 23:37 UTC (Wed) by rleigh (subscriber, #14622)
In reply to: Debian switching to EGLIBC by ajross
Parent article: Debian switching to EGLIBC

You might find the following thread interesting.

http://lists.debian.org/debian-policy/2009/04/msg00018.html

For the various reasons outlined in the linked message, we are
considering switching the C locale's codeset from US-ASCII to UTF-8.
This won't be done immediately; we will first create a C.UTF-8 locale
for testing before considering making UTF-8 the default.

This will give us native UTF-8 end-to-end from source code to
compiled binary to program output and subsequent terminal display.

Regards,
Roger


Debian switching to EGLIBC

Posted May 7, 2009 6:47 UTC (Thu) by nix (subscriber, #2304)

It'll be fascinating to see what that breaks when someone throws in a
character with the high bit set :) Stuff that relies upon the C locale
rarely makes a distinction between bytes and characters, even where it
should... Of course, one would hope that not much such software is left.
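
To make the byte/character distinction concrete, here is a minimal sketch in Python (not anything from the thread itself). The helper name byte_and_char_lengths and the sample string are purely illustrative assumptions; the point is only that a single non-ASCII character occupies more than one byte in UTF-8, and that every one of those bytes has the high bit set.

```python
def byte_and_char_lengths(text):
    """Return (number of UTF-8 bytes, number of characters) for text.

    Hypothetical helper, for illustration only.
    """
    encoded = text.encode("utf-8")
    return len(encoded), len(text)


# "na\u00efve" contains one non-ASCII character (U+00EF), written as an
# escape so the example does not depend on the source file's encoding.
sample = "na\u00efve"
n_bytes, n_chars = byte_and_char_lengths(sample)

# Collect the bytes whose high bit (0x80) is set; in UTF-8 these are
# exactly the bytes belonging to multi-byte sequences.
high_bit_bytes = [b for b in sample.encode("utf-8") if b & 0x80]

print(n_bytes, n_chars, len(high_bit_bytes))
```

Code that treats a length in bytes as a length in characters (or the reverse) gives the same answer for pure ASCII input and a different one as soon as such a character appears.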

Debian switching to EGLIBC

Posted May 8, 2009 2:02 UTC (Fri) by spitzak (guest, #4593)

Nothing will break when a byte has a high bit set, since it will just be copied to the output unchanged.

Don't panic about UTF-8. The biggest problem with it is people who do not understand it. Some of them are good enough programmers that they might write very damaging code that actually tries to interpret the UTF-8 encoding.

The only real bug in Unix with UTF-8 is a whole lot of documentation that says "character" where it should say "byte". There is nothing wrong with the current implementations.
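
As a rough illustration of what "copied unchanged" means in practice, the following Python sketch (an assumption-laden example, not code from any of the programs under discussion) copies a UTF-8 buffer byte by byte and then, for contrast, cuts the same buffer at a byte offset. The helper is_valid_utf8 and the sample data are hypothetical names chosen for this example.

```python
def is_valid_utf8(buf):
    """Return True if buf decodes cleanly as UTF-8 (illustrative helper)."""
    try:
        buf.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# "caf\u00e9\n" encodes to b'caf\xc3\xa9\n': the last letter takes two bytes.
data = "caf\u00e9\n".encode("utf-8")

# A byte-wise copy never looks inside the encoding, so the output is
# identical to the input and remains valid UTF-8.
copied = bytes(b for b in data)

# Cutting at a byte offset (here after 4 bytes, as a fixed-width field
# might) can land in the middle of a multi-byte sequence.
truncated = data[:4]

print(copied == data, is_valid_utf8(copied), is_valid_utf8(truncated))
```

Pure pass-through leaves the data intact; the trouble only starts when a program splits, counts, or pads at byte offsets while assuming one byte per character.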

Debian switching to EGLIBC

Posted May 8, 2009 13:57 UTC (Fri) by nix (subscriber, #2304)

I covered this 'nothing will care if you feed UTF-8 to a program expecting
a byte stream' canard in my other response. It's trivially wrong.

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds