Unicode != UTF-8

Posted Dec 18, 2011 18:32 UTC (Sun) by khim (subscriber, #9252)
In reply to: Cracks in the Foundation (PHP Advent) by smurf
Parent article: Cracks in the Foundation (PHP Advent)

(Speaking of Unicode: LWN still uses ISO-8859-1 encoding … I tried to use ≠ here, but no luck.)

You mean this one: “≠”? Works fine for me. What am I doing wrong?


Unicode != UTF-8

Posted Dec 18, 2011 19:57 UTC (Sun) by tetromino (subscriber, #33846)

> You mean this one: “≠”? Works fine for me. What am I doing wrong?

In "plain text" comment format, Unicode input and character entities both fail for me (LWN displays them as escaped character entities).

In "HTML" comment format, Unicode input (e.g. Ctrl+Alt+u2260 in Firefox → ≠) as well as named (&ne; → ≠) and decimal (&#8800; → ≠) character entities work correctly.

However, the presence of a hex character entity (e.g. &#x2260;) in a comment breaks all Unicode parsing in that comment, leaving every Unicode character as an escaped character entity, as in plain text format. This seems to be a bug in the LWN comment engine.
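For anyone who wants to check the three entity forms outside of a browser, here is a minimal Python sketch. It only uses the standard library and is not a model of how the LWN comment engine works; the variable names are made up for illustration. It shows that the named, decimal and hex references all decode to U+2260, that the UTF-8 encoding of that code point is a separate thing (three bytes), and that ISO-8859-1 cannot represent it at all, so one plausible fallback is a numeric entity.

```python
import html

# Three spellings of U+2260 (NOT EQUAL TO) as HTML character references.
named = "&ne;"
decimal = "&#8800;"       # 8800 == 0x2260
hexadecimal = "&#x2260;"

# html.unescape handles named, decimal and hex references alike.
decoded = [html.unescape(s) for s in (named, decimal, hexadecimal)]

# The code point is not the same thing as its encoded bytes:
# in UTF-8, U+2260 takes three bytes.
utf8_bytes = "\u2260".encode("utf-8")

# ISO-8859-1 has no slot for U+2260, so a strict encode fails.
try:
    "\u2260".encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

# One possible fallback (an assumption, not LWN's documented behaviour):
# replace unencodable characters with decimal character references.
escaped = "\u2260".encode("iso-8859-1", errors="xmlcharrefreplace")

print(decoded, utf8_bytes, latin1_ok, escaped)
```

The `xmlcharrefreplace` fallback would be one way for a page served as ISO-8859-1 to still display “≠”, but whether LWN does anything like this is a guess.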

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds