Unicode bugs

Posted Feb 19, 2004 13:19 UTC (Thu) by Cato (subscriber, #7643)
In reply to: Unicode bugs by simonl
Parent article: The kernel and character set encodings

Any new functionality can introduce security holes, and this applies whether Unicode is implemented in libraries or in the kernel. It's important to address Unicode's potential for such holes (overlong UTF-8 encodings, etc.), but mostly this is just good practice: you 'filter in' the characters you know are legal, rather than trying to 'filter out' characters that are illegal (it's very easy to miss just one).
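As a rough illustration of the 'filter in' approach, here is a minimal Python sketch. The allowlist and the function name decode_and_check are made up for this example, and it assumes the input is meant to be a simple file name. It relies on Python's strict UTF-8 decoder, which rejects overlong forms such as the two-byte encoding of '/' (0xC0 0xAF).

```python
import string

# Hypothetical allowlist for illustration: characters we know are legal
# in a simple file name. Everything else is rejected by default.
ALLOWED = frozenset(string.ascii_letters + string.digits + "._-")


def decode_and_check(raw: bytes) -> str:
    """Decode strictly as UTF-8, then accept only allowlisted characters."""
    # Strict decoding raises UnicodeDecodeError on malformed input,
    # including overlong encodings such as b"\xc0\xaf" for "/".
    text = raw.decode("utf-8", errors="strict")

    # 'Filter in': reject anything not explicitly permitted, rather than
    # trying to enumerate every dangerous character.
    rejected = sorted({ch for ch in text if ch not in ALLOWED})
    if rejected:
        raise ValueError(f"characters not allowed: {rejected!r}")
    return text
```

With this sketch, decode_and_check(b"readme.txt") returns the decoded name, while both b"..\xc0\xafetc" (overlong) and b"a/b" (valid UTF-8, but '/' is not on the list) are refused.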

I'm not sure Unicode needs to live in the kernel as long as there is good library support, but it's better for library or kernel maintainers to solve these problems once rather than have different buggy implementations in every application.

The specific IIS issues were related to Microsoft's non-standard %uNNNN encoding of 16-bit UCS-2 (Unicode) characters. That is a vendor-specific extension rather than a flaw in Unicode itself, so I don't think this is a reason to abandon Unicode.
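To show why a non-standard escape like %uNNNN is risky, here is a small Python sketch. It is not a reconstruction of how IIS worked; decode_percent_u is a hypothetical helper written for this example. The point is only that a standard percent-decoder (urllib.parse.unquote) leaves %uNNNN alone, so a check made after standard decoding sees different text from a component that also expands %uNNNN.

```python
import re
from urllib.parse import unquote

# Matches the non-standard %uNNNN escape (four hex digits).
_PERCENT_U = re.compile(r"%u([0-9a-fA-F]{4})")


def decode_percent_u(s: str) -> str:
    """Hypothetical decoder that expands %uNNNN as well as standard %XX."""
    # re.split with one capturing group alternates literal text (even
    # indexes) with the captured hex digits (odd indexes).
    parts = _PERCENT_U.split(s)
    out = []
    for i, part in enumerate(parts):
        if i % 2 == 0:
            out.append(unquote(part))        # standard %XX decoding
        else:
            out.append(chr(int(part, 16)))   # %uNNNN -> one code unit
    return "".join(out)


path = "..%u002fsecret"
standard = unquote(path)            # %u002f is not a valid %XX escape
extended = decode_percent_u(path)   # %u002f becomes "/"
```

Here a filter looking for '/' in the standard-decoded string finds nothing, while the extended decoder produces "../secret". Any check has to run on the fully decoded form that the rest of the system will actually use.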

