Weird characters

Posted Apr 7, 2004 14:24 UTC (Wed) by corbet (editor, #1)
In reply to: X.Org Foundation releases X Window System X11R6.7 by s_cargo
Parent article: X.Org Foundation releases X Window System X11R6.7

Interesting. I had noticed the funky characters when feeding the PR into the system, but Firefox renders them as quotes, so I left them as they were. Maybe I should fix them up...


Weird characters

Posted Apr 7, 2004 15:17 UTC (Wed) by scglwn (subscriber, #1245) [Link]

The HTML source doesn't literally contain those characters; instead it contains the so-called decimal entity references &#8220; and &#8221;. I believe that's the correct way of doing such things (e.g. the euro symbol would be &#8364;).
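As a rough illustration of what a decimal reference is: it is just the character's Unicode code point written in base 10 between "&#" and ";". The sketch below uses Python's standard "xmlcharrefreplace" error handler to produce such references; the helper name to_decimal_refs is made up for this example.

```python
import html


def to_decimal_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal character reference."""
    # "xmlcharrefreplace" turns each unencodable character into &#NNNN;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


quoted = to_decimal_refs("\u201cquoted\u201d")  # left and right double curly quotes
euro = to_decimal_refs("\u20ac")                # euro sign

print(quoted)  # &#8220;quoted&#8221;
print(euro)    # &#8364;

# Decoding the references gives back the original characters.
assert html.unescape(quoted) == "\u201cquoted\u201d"
```

The output is pure ASCII, so it survives whatever charset the page happens to declare.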

Weird characters

Posted Apr 7, 2004 15:34 UTC (Wed) by pkturner (subscriber, #2809) [Link]

When my Mozilla browser downloads the HTML, it has the byte codes 0x93 and 0x94 in those locations. Those are Windows (Windows-1252) codes, not printable characters in the ISO-8859-1 charset.
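A quick way to see the difference, sketched with Python's built-in codecs ("cp1252" is Python's name for the Windows code page, "latin-1" for ISO-8859-1): the same two bytes decode to curly quotes under one and to C1 control characters under the other.

```python
import unicodedata

raw = bytes([0x93, 0x94])

as_cp1252 = raw.decode("cp1252")   # Windows code page
as_latin1 = raw.decode("latin-1")  # ISO-8859-1

print([f"U+{ord(c):04X}" for c in as_cp1252])  # ['U+201C', 'U+201D']
print([f"U+{ord(c):04X}" for c in as_latin1])  # ['U+0093', 'U+0094']

# Under ISO-8859-1 these are control characters ("Cc"), not punctuation.
print([unicodedata.category(c) for c in as_latin1])  # ['Cc', 'Cc']
```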

Weird characters

Posted Apr 8, 2004 7:56 UTC (Thu) by scglwn (subscriber, #1245) [Link]

You are right, sorry for the confusion. When saving the page to a file, Firefox silently replaces the (invalid) character codes with the correct decimal references. Yeeks!

Weird characters - decimals are correct.

Posted Apr 7, 2004 21:44 UTC (Wed) by dwheeler (guest, #1216) [Link]

Yes, the safest way to insert curly quotation marks is to use the decimal codes. See my paper on quotes in HTML, XML, and SGML for more information.

Weird characters

Posted Apr 7, 2004 18:43 UTC (Wed) by Ross (subscriber, #4065) [Link]

They are in Microsoft's own proprietary character set which is almost, but
not quite, the same as ISO Latin-1. Most Microsoft products silently
insert these proprietary quotes if you have the smart quotes option turned
on. It's much better to use HTML entities with Unicode numbers for curly
quotes or to just use plain ASCII quotes. There's an old script called the
Demoronizer which fixes up problems like these.

What's interesting is that the newer Mozilla releases have capitulated.
They now interpret those quotes "correctly" even if the page's character
set marks those characters as reserved. Kind of sad actually.
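For the curious, the kind of cleanup described above can be sketched in a few lines of Python. This is not the Demoronizer itself, just a minimal stand-in that assumes the input is really Windows-1252 mislabeled as Latin-1; the function name fix_windows_quotes is invented for the example.

```python
def fix_windows_quotes(raw: bytes) -> str:
    """Turn Windows-1252 bytes into ASCII text with decimal references.

    Windows-1252 agrees with ISO-8859-1 everywhere except 0x80-0x9F, so
    decoding as cp1252 only changes the bytes in the reserved range.
    """
    # The few bytes cp1252 leaves undefined become U+FFFD instead of raising.
    text = raw.decode("cp1252", errors="replace")
    # Emit plain ASCII, with everything else as &#NNNN; references.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(fix_windows_quotes(b"\x93smart quotes\x94"))  # &#8220;smart quotes&#8221;
```

Swapping the last step for a table that maps curly quotes to plain ASCII quotes would give the other fix mentioned above.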

&ldquo; &rdquo; &lsquo; &rsquo; work on everything I have here

Posted Apr 9, 2004 4:21 UTC (Fri) by leonbrooks (guest, #1494) [Link]

That's what OpenOffice's HTML editor uses, so I've taken to using them in my own HTML, and not found anything that blips yet. Konqueror seems to know enough to replace them with straight quotes if the charset doesn't support them. What does your browser show?

"straight quotes"

“double curlies”

'apostrophes'

‘single curlies’

&ldquo; &rdquo; &lsquo; &rsquo; work on everything I have here

Posted Apr 9, 2004 16:32 UTC (Fri) by Ross (subscriber, #4065) [Link]

Yes, all those show up correctly. However the original document didn't use
them -- it used raw 8-bit characters in the reserved range. But as for what
HTML entities to use, &#8220; is even more portable than &ldquo;.
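To make the named-versus-numeric point concrete: a named entity such as &ldquo; only works if the parser knows the name, while the numeric form &#8220; carries the code point itself. The sketch below uses the entity table in Python's standard library to convert one to the other; named_to_decimal is a hypothetical helper name.

```python
import html
from html.entities import name2codepoint


def named_to_decimal(name: str) -> str:
    """Return the decimal reference equivalent to the named entity &name;."""
    return f"&#{name2codepoint[name]};"


for name in ("ldquo", "rdquo", "lsquo", "rsquo"):
    decimal = named_to_decimal(name)
    # Both spellings decode to the same character.
    assert html.unescape(f"&{name};") == html.unescape(decimal)
    print(f"&{name};", "is the same character as", decimal)
```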

Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds