
New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)


Posted Feb 22, 2005 0:49 UTC (Tue) by Richard_J_Neill (subscriber, #23093)
Parent article: New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

What I don't understand is why the homographs can't all resolve to the same place?

E.g. the owner of www.bücher.com would automatically get www.bucher.com and www.buecher.com.

That would solve all the problems - no homograph collisions, and no confusion over how to spell an IDN. Furthermore, this could be done either by the browser, or by DNS.

The only downsides I can see are:

Many-to-one conversion means 3-10 times as many DNS records.
Reverse lookups would need some sort of canonicalisation.
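As a rough illustration of the many-to-one idea above, the following Python sketch enumerates the spellings that would all have to point at the same owner. The `VARIANTS` table and the function names are hypothetical and deliberately tiny; a real scheme would need an agreed, much larger table.

```python
# Hypothetical table of alternative spellings (assumption, not a standard).
VARIANTS = {"ü": ("u", "ue"), "ö": ("o", "oe"), "ä": ("a", "ae")}

def aliases(label):
    """Return every spelling of `label` that should resolve to the same place."""
    results = {""}
    for ch in label.lower():
        # Each character may stay as-is or be replaced by any listed variant.
        options = (ch,) + VARIANTS.get(ch, ())
        results = {prefix + opt for prefix in results for opt in options}
    return results

def alias_records(label):
    """Map each alias back to the registered label (a stand-in for DNS records)."""
    return {alias: label for alias in aliases(label)}

print(sorted(aliases("bücher")))  # the three names from the example above
```

Each character with variants multiplies the number of records, which is where the "3-10 times as many DNS records" estimate comes from; the `alias_records` mapping hints at the canonicalisation a reverse lookup would need.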



New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 3:52 UTC (Tue) by darthmdh (guest, #8032)

I don't understand why I bought a base model car, and the dealer didn't throw in air conditioning, leather seats, ABS, Momo steering wheel & gearstick, GPS, side-impact airbags and a 12-speaker +twin sub, 400W Dolby surround entertainment unit!

After all, it's not like they're only in it for the money, right?

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 8:16 UTC (Tue) by ekj (subscriber, #1524)

There are literally thousands of letters that look similar. For some of them, whether they look similar depends on the font used. Who is to decide what is "too similar"?

For longer domain names there are literally millions of different names that all look more or less the same. It would be rather complicated to have DNS handle that: it wouldn't be 2 or 3 registrations for a single domain, it would be 2 or 3 million.
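To make the arithmetic concrete, here is a small Python sketch. The per-letter counts in `LOOKALIKES` are invented purely for illustration (each count includes the original letter); real confusable tables differ and depend on the font.

```python
from math import prod

# Hypothetical number of visually similar code points per letter,
# counting the letter itself. These figures are assumptions.
LOOKALIKES = {"a": 3, "e": 3, "o": 4, "p": 2, "c": 2, "i": 4, "l": 4, "y": 2}

def variant_count(label):
    """Number of look-alike spellings: the product of per-letter choices."""
    return prod(LOOKALIKES.get(ch, 1) for ch in label)

print(variant_count("paypal"))  # 2*3*2*2*3*4 = 288 with the counts above
```

Because the counts multiply, the total grows exponentially with the length of the name, so even modest per-letter figures reach millions for longer domains.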

There's also the issue that some of the homographs are arguably useful. In lots of fonts it is very hard (or impossible) to see the difference between l (small L) and I (capital i): paypal vs. paypaI. Would *your* grandmother notice? Is it *reasonable* to assume people will notice such things and base security on that assumption?

The real solution has to be something different. With my bank (Skandiabanken.no), for example, such attacks are made much more difficult by the use of client-side certificates. The first time you use the bank you have to download a client-side certificate. This is installed in the browser and configured so that it will only be presented to the real skandiabanken.no site. This has multiple benefits:

  • A phisher who somehow *succeeded* in having you give up your account number and PIN code would still not have all the info needed to access your account, since he won't have the client-side certificate.
  • The real bank site uses the certificate to say "Hello Eivind Kjørstad" on the login page. A phishing site would have no way of knowing my name, thus adding another difference between the real and the fake site. (Not everyone would notice the change to "Hello dear customer", but some would.)
  • Firefox changes the colour of the security-key icon when you're authenticated *both* ways using SSL certificates. A green background means you're on a site which has presented a valid SSL certificate AND to which you've presented an SSL certificate that was accepted.

Avoid broken fonts

Posted Feb 22, 2005 9:36 UTC (Tue) by eru (subscriber, #2753)

In lots of fonts it is very hard (or impossible) to see the difference between l (small L) and I (capital i) paypal paypaI, would *your* grandmother notice ?

But such fonts are seriously broken, at least for all applications that require accurate information to be conveyed.

It would not be too hard to require that the URL entry field and the status bar must use only fonts where different letters of the alphabet are clearly distinguishable. That does not mean going to monospaced typewriter fonts. Most of the problem goes away by just avoiding sans serif fonts.

Avoid broken fonts

Posted Feb 22, 2005 16:47 UTC (Tue) by Max.Hyre (subscriber, #1054)

Most of the problem goes away by just avoiding sans serif fonts

I fear there speaks insufficient examination of other scripts.

Even in Latin serif fonts, we're still stuck with letter `O' vs. digit `0', and letter `l' vs. digit `1'. What is ``sans-serif'' in Hindi? Japanese?

Having no ability to read non-Latin languages, I've nonetheless seen a number of them (Hebrew, Arabic, Thai, Japanese, &c.) and I strongly suspect they have their own problems of this nature, and probably a number of different ones, to boot.

Avoid broken fonts

Posted Feb 23, 2005 8:45 UTC (Wed) by eru (subscriber, #2753)

I fear there speaks insufficient examination of other scripts.

Even in Latin, serif, fonts, we're still stuck with letter `O' vs. digit `0', and letter `l' vs. digit `1'.

Note I wrote "most of the problem", not "the entire problem"! Anyway, in the font I am now reading your message in (Times New Roman), letter `O' and digit `0' are clearly different. `l' and `1' could be confused, although there is some difference. Well, the font could be improved. I remember seeing old Telex machines where the font was specially designed to make `l' different, with a small hook at the bottom.

(Reminds me of my father's old mechanical typewriter that actually took advantage of the similarity of `l' and `1': there was no separate key for "one"!)

What is ``sans-serif'' in Hindi? Japanese?

I don't know. The concept probably makes no sense in scripts that belong to other cultures than Western European. However, there are different fonts being used for these scripts. Even not knowing Hindi or Japanese, I can sense the differences.

Some fonts for these languages are no doubt less prone to homographs than others, so a similar solution is likely to be applicable: Pick fonts where the characters are maximally different for applications where URLs are entered or displayed for the user's verification.

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 18:32 UTC (Tue) by iabervon (subscriber, #722)

It is not really necessary to have a client certificate; the server certificate is sufficient identification of the server. The real flaw is that browsers rely on certificate authorities to determine whether a certificate should be trusted. The browser should really store the certificates of sites you have a relationship with and tell you if you get a new certificate. If you go to an encrypted page whose certificate you haven't seen before, you should go through a process where the browser tells you that the certificate is new, tells you to be suspicious if you thought you'd been there before, lets you compare the fingerprint against a known one (your bank could send you the fingerprint of their certificate in the online banking literature or your statement), and warns you if there are similar domains you've got certificates from before.

Ideally, users could store the set of certificates with their ISPs, too, so that they wouldn't lose all of their recognized certificates when changing computers, to reduce the number of times that a particular individual will legitimately see the new certificate page for the same site.
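The certificate-memory scheme described in this comment could be sketched roughly as follows. This is only an illustration under assumptions: the class name, the status strings, and the 0.8 similarity threshold are all hypothetical, and the certificate is represented as raw bytes rather than fetched from a live connection.

```python
import hashlib
from difflib import SequenceMatcher

class CertificateMemory:
    """Remember one certificate fingerprint per host (hypothetical sketch)."""

    def __init__(self):
        self.known = {}  # hostname -> SHA-256 fingerprint (hex string)

    @staticmethod
    def fingerprint(cert_bytes):
        # Fingerprint the user could compare against one printed on a statement.
        return hashlib.sha256(cert_bytes).hexdigest()

    def similar_hosts(self, host, threshold=0.8):
        # Hosts already remembered whose names closely resemble the new one.
        return [h for h in self.known
                if h != host and SequenceMatcher(None, h, host).ratio() >= threshold]

    def check(self, host, cert_bytes):
        """Return (status, similar_known_hosts) without changing any state."""
        if host not in self.known:
            return ("new", self.similar_hosts(host))
        if self.known[host] == self.fingerprint(cert_bytes):
            return ("known", [])
        return ("changed", [])

    def accept(self, host, cert_bytes):
        # Called only after the user has confirmed the certificate.
        self.known[host] = self.fingerprint(cert_bytes)

memory = CertificateMemory()
memory.accept("skandiabanken.no", b"real-cert")
print(memory.check("skandlabanken.no", b"fake-cert"))
```

A "new" status accompanied by a non-empty list of similar hosts is the case where the browser would warn most loudly; keeping the `known` dictionary somewhere portable corresponds to the idea of storing the set of certificates outside a single computer.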

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 23, 2005 13:19 UTC (Wed) by forthy (guest, #1525)

"paypal vs. paypaI" (with capital "I" instead of lowercase "l"): That's "solved" by normalizing URLs to all-lowercase letters (paypaI->paypai).
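A minimal Python sketch of that normalization, with a hypothetical helper name. It also shows why the word "solved" deserves its quotes: lowercasing exposes the capital-I trick but does nothing for a look-alike letter from another script (the second example uses Cyrillic "а", U+0430).

```python
def normalize_host(host):
    """Lowercase a hostname before displaying or comparing it."""
    return host.lower()

spoof_ascii = "paypaI.com"           # capital I in place of lowercase l
spoof_cyrillic = "p\u0430ypal.com"   # Cyrillic a in place of Latin a

print(normalize_host(spoof_ascii))                       # paypai.com
print(normalize_host(spoof_cyrillic) == "paypal.com")    # False
```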

Punycode is a bad solution to a real problem. The real problem is that people want (and need) localized URLs. Not so much in the world with Latin letters, but the rest is at least as large. The solution simply is wrong: You don't want context-free localized URLs, i.e. you don't want Unicode.

My suggestion is to drop punycode, and create a stringent set of transformations into ASCII. If you want a Chinese domain (e.g. for xinhua, the Chinese news service), you get "xinhua.zn". You are allowed to enter that text in Chinese; the translation process makes sure that you can type something like 薪华.中 in your web browser and still get what you need (you have to agree on a particular transcription, though).

You could still even see what you need when there's a backmapping for the preferred rendering. This should be a DNS entry, i.e. if you buy "xinhua.zn", you can ask for such an entry. The entry has to follow the rules (i.e. it has to forward translate to "xinhua.zn"), and can probably also follow further rules (if it's a .zn domain, e.g. it should be Chinese).
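The forward transformation and the rule for the back-mapping entry might look roughly like this in Python. The `TRANSCRIPT` table is a hypothetical three-entry stand-in for whatever transcription would be agreed on, and the function names are made up for the sketch.

```python
# Hypothetical transcription table (assumption): covers only the example above.
TRANSCRIPT = {"薪": "xin", "华": "hua", "中": "zn"}

def to_ascii(name):
    """Forward-translate a typed name to its ASCII form; ASCII passes through."""
    return "".join(TRANSCRIPT.get(ch, ch) for ch in name)

def valid_display_entry(registered, display):
    """Accept a preferred rendering only if it translates to the registered name."""
    return to_ascii(display) == registered

print(to_ascii("薪华.中"))                          # xinhua.zn
print(valid_display_entry("xinhua.zn", "薪华.中"))  # True
print(valid_display_entry("xinhua.zn", "华薪.中"))  # False
```

Because the ASCII name is the only thing registered, two renderings can collide only if the transcription maps them to the same ASCII string, and in that case they reach the same site anyway.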

BTW LWN: I really wanted these &#xHEX; as above in my text, there's no fucking unescaped & in there. They would show up as unicode Chinese characters to prove my point :-(.

client-side certificate

Posted Feb 25, 2005 16:43 UTC (Fri) by giraffedata (subscriber, #1954)

The first time you use the bank you have to download a client-side certificate. This is installed in the browser

Does that mean you can't access your bank from another computer? If so, I'd switch banks if it were I.

What you've described doesn't seem to solve the homograph problem, though. The fake bank site would accept the certificate.

client-side certificate

Posted Feb 26, 2005 9:21 UTC (Sat) by Klavs (subscriber, #10563)

>Does that mean you can't access your bank from another computer? If so, I'd switch banks if it were I.

Only if you don't bring your client certificate along on, say, a USB token. This coincides with the good principle of "something you have". With all (AFAIK) banks in DK, incl. Skandiabanken, you have to have something you know and something you have. They are only lacking "something you are" ;) - much better than just something you know (i.e. a password).

>What you've described doesn't seem to solve the homograph problem, though. The fake bank site would accept the certificate.

The fake bank site would NEVER get the certificate (unless they'd done the good old DNS spoofing), as the browser can easily see that www.skandiabanken.no and www.skandsome-idn-abanken.no are NOT the same site.

client-side certificate

Posted Feb 26, 2005 18:27 UTC (Sat) by giraffedata (subscriber, #1954)

Yes, I was confused. You're talking about a scheme to make the stealing of a password unproductive (because the password isn't useful by itself), rather than to prevent someone from being fooled into thinking he is talking to his bank when he is not.

Improving on the security of the password is good, but for a whole bunch of other reasons, phishing itself needs to be dealt with too.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds