Re: Unicode cheatsheet for Perl

[Posted February 28, 2012 by corbet]

From:		Christian Hansen <christian.hansen-AT-mac.com>
To:		Tom Christiansen <tchrist-AT-perl.com>
Subject:		Re: Unicode cheatsheet for Perl
Date:		Tue, 21 Feb 2012 01:21:30 +0100
Message-ID:		<D6FC467F-FAFF-4D48-88E6-84B340B85621@mac.com>
Cc:		Leon Timmermans <fawaka-AT-gmail.com>, Karl Williamson <public-AT-khwilliamson.com>, Perl5 Porters Mailing List <perl5-porters-AT-perl.org>, Jarkko Hietaniemi <jhi-AT-iki.fi>, chansen-AT-cpan.org


On 21 Feb 2012, at 00:58, Tom Christiansen wrote:

> Why does it take a new layer?  Why not just make the things
> that get fatalized by 
> 
>    use warnings FATAL => "utf8";
> 
> fatal without saying that?

I would love for this to happen. I have advocated it on #p5p several times, but there is always
the battle against the "backwards compatibility disease". About 10 months ago I reported a security
issue regarding the relaxed UTF-8 implementation (still undisclosed and still exploitable) to the
perl security mailing list.

What you describe above was the reason I implemented Unicode::UTF8, but it only decodes strings, not
I/O (good enough for me and my clients, as most of our data is small, a few MBytes).
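
To illustrate what strict decoding of an in-memory string looks like, here is a minimal sketch
using the core Encode module rather than Unicode::UTF8 itself. The helper name strict_decode and
the sample byte strings are assumptions made up for this example, not anything from the module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: decode octets as UTF-8 and die on malformed input
# instead of silently accepting or replacing it.
sub strict_decode {
    my ($octets) = @_;
    # 'UTF-8' (with the hyphen) selects Encode's strict decoder;
    # 'utf8' is the relaxed one. LEAVE_SRC stops Encode from
    # modifying $octets in place when a CHECK value is given.
    return Encode::decode('UTF-8', $octets,
        Encode::FB_CROAK | Encode::LEAVE_SRC);
}

# Well-formed input: "caf" followed by U+00E9 encoded as two octets.
my $ok = strict_decode("caf\xC3\xA9");

# Malformed input: 0xFF can never start a valid UTF-8 sequence,
# so the strict decoder croaks and eval returns undef.
my $bad = eval { strict_decode("caf\xFF") };

print "decoded length: ", length($ok), "\n";
print defined $bad ? "malformed input accepted\n" : "malformed input rejected\n";
```

This only covers strings already in memory; it does nothing for data read through an I/O layer.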

If there were consensus on this matter, I would happily devote time to seeing it implemented
and tested [1].

[1] I will not provide a UTF-EBCDIC implementation, as I believe that is an ancient encoding not
used or endorsed by vendors.

--
chansen