GNU C Library 2.35 released

Posted Feb 3, 2022 21:07 UTC (Thu) by josh (subscriber, #17465)

I'm glad C.UTF-8 will become more universal. I currently have this, in my environment setup:

    if [ -d "/usr/lib/locale/C.UTF-8" ]; then
        LANG=C.UTF-8
    else
        case " $(echo $(LC_ALL=C locale -a)) " in
            *\ C.UTF-8\ *) LANG=C.UTF-8 ;;
            *\ C.utf8\ *) LANG=C.utf8 ;;
            *\ en_US.utf8\ *) LANG=en_US.utf8 ;;
            *\ en_US.UTF-8\ *) LANG=en_US.UTF-8 ;;
            *) LANG=C ;;
        esac
    fi
    LC_CTYPE="$LANG"
    LC_ADDRESS=C
    LC_COLLATE=C
    LC_IDENTIFICATION=C
    LC_MEASUREMENT=C
    LC_MESSAGES=C
    LC_MONETARY=C
    LC_NAME=C
    LC_NUMERIC=C
    LC_PAPER=C
    LC_TELEPHONE=C
    LC_TIME=C
    export LANG LC_ADDRESS LC_COLLATE LC_CTYPE LC_IDENTIFICATION LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER LC_TELEPHONE LC_TIME

I'd love to simplify away the detection to figure out a vaguely sensible UTF-8 locale (and, eventually, also throw away the fallback to LANG=C).

Posted Feb 4, 2022 0:25 UTC (Fri) by cortana (subscriber, #24596)

Interesting that locales in the Red Hat universe end in '.utf8', while in the Debian universe they end '.UTF-8'. More aggravating divergence!

Posted Feb 4, 2022 18:46 UTC (Fri) by madscientist (subscriber, #16861)

What is Red Hat doing here? UTF-8 is the standard codeset name assigned by IANA.

Posted Feb 4, 2022 23:16 UTC (Fri) by joib (subscriber, #8541)

I think(?) they are aliases, both should work. And it doesn't seem to be Red Hat specific. On an Ubuntu 20.04 machine 'locale -a' shows, among others:

    en_US.utf8

(and no en_US.UTF-8)

Posted Feb 5, 2022 1:48 UTC (Sat) by dbnichol (subscriber, #39622)

They are indeed aliases, although the aliasing only works in one direction and probably should have been defined as C.utf8 in the locale directory.

    $ LC_ALL=en_US.UTF-8 locale LC_IDENTIFICATION
    English locale for the USA
    Free Software Foundation, Inc.
    https://www.gnu.org/software/libc/
    bug-glibc-locales@gnu.org
    American English
    United States
    1.0
    2000-06-24
    i18n:2012;UTF-8;;;;;;;;;;;
    UTF-8
    $ LC_ALL=en_US.utf8 locale LC_IDENTIFICATION
    English locale for the USA
    Free Software Foundation, Inc.
    https://www.gnu.org/software/libc/
    bug-glibc-locales@gnu.org
    American English
    United States
    1.0
    2000-06-24
    i18n:2012;UTF-8;;;;;;;;;;;
    UTF-8
    $ LC_ALL=C.UTF-8 locale LC_IDENTIFICATION
    C locale
    aurel32@debian.org
    C
    1.6
    2016-08-08
    i18n:2012;UTF-8;;;;;;;;;;;
    UTF-8
    $ LC_ALL=C.utf8 locale LC_IDENTIFICATION
    locale: Cannot set LC_CTYPE to default locale: No such file or directory
    locale: Cannot set LC_MESSAGES to default locale: No such file or directory
    locale: Cannot set LC_ALL to default locale: No such file or directory
    ISO/IEC 14652 i18n FDCC-set
    ISO/IEC JTC1/SC22/WG20 - internationalization
    C/o Keld Simonsen, Skt. Jorgens Alle 8, DK-1615 Kobenhavn V
    Keld Simonsen
    keld@dkuug.dk
    +45 3122-6543
    +45 3325-6543
    ISO
    1.0
    1997-12-20
    i18n:1999;ANSI_X3.4-1968;;;;;;;;;;;
    ANSI_X3.4-1968

The last one falls back to C since the aliasing doesn't work.

Posted Feb 6, 2022 14:33 UTC (Sun) by cortana (subscriber, #24596)

Uuh, I spoke too soon. Normal locales end in ".utf8" on both the Debian and Fedora systems I have to hand. It's just C.UTF-8/C.utf8 (respectively) that differs between the two. (As someone else pointed out, C.UTF-8 is still usable as a value for LC_* on the Fedora system, but 'locale -a' shows C.utf8, the locale definition lives in /usr/lib/locale/C.utf8 etc.)
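
The one-way aliasing seen in the LC_IDENTIFICATION output in this thread can be illustrated with a small sketch. It assumes that glibc, when given a locale name, tries the codeset as written and then a normalized form of it (letters lowercased, punctuation dropped, "iso" prefixed to all-digit names), and that it never maps a normalized name back to the original spelling. The function names normalize_codeset and candidate_dirs are hypothetical, the sketch ignores @modifier suffixes, and it approximates the behaviour rather than reproducing glibc's code.

```shell
# normalize_codeset: approximate the codeset normalization described above.
# Keeps only ASCII letters and digits, lowercases the letters, and prefixes
# "iso" when the result consists solely of digits.
normalize_codeset() {
    norm=$(printf '%s' "$1" | LC_ALL=C tr -cd '[:alnum:]' | LC_ALL=C tr '[:upper:]' '[:lower:]')
    case "$norm" in
        *[!0-9]*) printf '%s\n' "$norm" ;;      # contains at least one letter
        ?*)       printf 'iso%s\n' "$norm" ;;   # digits only
        *)        printf '\n' ;;                # nothing left after filtering
    esac
}

# candidate_dirs: print, in order, the directory names a lookup would try
# for a locale name of the form base.codeset (assumption: no @modifier).
candidate_dirs() {
    case "$1" in
        *.*) ;;
        *) printf '%s\n' "$1"; return 0 ;;      # no codeset part at all
    esac
    base=${1%%.*}
    codeset=${1#*.}
    norm=$(normalize_codeset "$codeset")
    printf '%s.%s\n' "$base" "$codeset"         # the name exactly as given
    if [ "$norm" != "$codeset" ]; then
        printf '%s.%s\n' "$base" "$norm"        # the normalized spelling
    fi
}
```

Under these assumptions, candidate_dirs en_US.UTF-8 yields both en_US.UTF-8 and en_US.utf8, so either directory name satisfies the request. candidate_dirs C.utf8 yields only C.utf8, which would explain why that spelling fails on a system whose directory is named C.UTF-8.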
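
The detection logic from the first comment in this thread could be factored into a small function that takes the output of 'locale -a' on standard input, which makes it easy to try against the different spellings discussed here. This is only a sketch: the name pick_utf8_locale is hypothetical, the preference order is copied from the original case statement, and the shortcut test for the /usr/lib/locale/C.UTF-8 directory is left out.

```shell
# pick_utf8_locale: read locale names (one per line) on stdin and print the
# first preferred UTF-8 locale that is listed, or "C" if none of them is.
pick_utf8_locale() {
    available=$(cat)
    for candidate in C.UTF-8 C.utf8 en_US.utf8 en_US.UTF-8; do
        # -F: fixed string, -x: whole line must match, -q: no output
        if printf '%s\n' "$available" | grep -Fqx -- "$candidate"; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    printf 'C\n'
}
```

It would be used as LANG=$(LC_ALL=C locale -a | pick_utf8_locale), followed by the same LC_* assignments and export line as in the original snippet. Matching whole lines with grep -Fx avoids the space-padding trick that the case statement relies on.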
Copyright © 2022, Eklektix, Inc. Comments and public postings are copyrighted by their creators. Linux is a registered trademark of Linus Torvalds