
New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

MozillaZine reports that IDN support will not be disabled. The details of the new short term solution are available. "Darin Fisher, network supremo, has pulled it out of the bag and come up with a less drastic short-term solution to the IDN problem. It has just been checked in for all three upcoming releases. Read about it over in bug 282270, but basically IDN will still work, but all occurrences of IDN domains in the browser UI (URL bar, security info etc.) will be the punycode form. There is a pref to re-enable full IDN - set "network.IDN_show_punycode" to false. As with the previous plan, this preference will be set to true in all official builds." Meanwhile the search for a long term solution continues.

IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 21, 2005 23:35 UTC (Mon) by ballombe (subscriber, #9523) [Link]

There will be no solution because the problem is elsewhere: the assumption
that a domain name or URL is a secure way to identify a site is flawed.
This has been shown countless times in the past (DNS spoofing, typo domains,
etc.). IDN makes things marginally worse.

IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 0:29 UTC (Tue) by clugstj (subscriber, #4020) [Link]

Not quite. IDN makes matters MUCH worse. Buying a domain name that looks just like a well-known one is easy; spoofing DNS effectively is much more difficult.

IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 16:42 UTC (Tue) by ballombe (subscriber, #9523) [Link]

Buying a domain name that looks just like a well-known one is easy even
without IDN. Just use a bit of social engineering when choosing your
domain name and introducing it to the user.

PetNames --- a solution to the Homograph URL and Phishing problem.

Posted Feb 22, 2005 23:05 UTC (Tue) by AnswerGuy (guest, #1256) [Link]

There are possible solutions. None will work for everyone. There are too many people who are way too gullible or lazy. However, there are technical means to mitigate the majority of the problem and give any reasonably competent and motivated person the level of protection they need to avoid being hooked by phishing scams.

I don't have time or space to cover the full range of this discussion. However, a couple of pointers may serve:

Sorry I don't have time to actually cover the details here. It would be a timely topic for one of LWN's feature articles (hint, hint!).

Basically, the short form of the Mozilla component of the solution is to have color coding and "Pet" icons next to those URL references that are "known good" (because they are among your personally configured list of "PetNames"). The details of how URLs get adopted as pets are the tricky part, just as key management is the hardest challenge of modern cryptography.

JimD

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 21, 2005 23:49 UTC (Mon) by bk (guest, #25617) [Link]

Definition of punycode: http://en.wikipedia.org/wiki/Punycode

Example:

Original unicode IDN: pàypal.com
punycode equivalent: xn--pypal-rqa.com
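
The conversion can be reproduced with the `punycode` and `idna` codecs that ship with Python. This is only an illustrative sketch; the variable names are made up for the example.

```python
# Sketch: reproduce the example with Python's built-in codecs.
label = "p\u00e0ypal"  # "pàypal", where U+00E0 is "a with grave"

# Raw Punycode of a single label (no "xn--" prefix yet).
raw = label.encode("punycode").decode("ascii")

# IDNA ToASCII conversion of the whole host name adds the "xn--" prefix.
host_ascii = (label + ".com").encode("idna").decode("ascii")

print(raw)         # pypal-rqa
print(host_ascii)  # xn--pypal-rqa.com
```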

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 0:43 UTC (Tue) by fjf33 (subscriber, #5768) [Link]

I like this solution a LOT better than some I have seen floating around. There is no collision between punycode and the regular ANSI I hope. But even then it is probably a marginal case anyway. I wonder if typing Unicode in the URL bar will be OK or whether people will have to write punycode. There are people who actually use IDN domains.

/And/ another thing...

Posted Feb 22, 2005 17:20 UTC (Tue) by Max.Hyre (subscriber, #1054) [Link]

There is no collision between punycode and the regular ANSI I hope.

Punycode is certainly ugly, but by definition it's in ASCII---see bk's example for pàypal.com, above.

So, when everyone's used to seeing funky URLs in Unicode, all you need do is register the domain `ax--qrmasdflkh-v9.com', and no one will give it a second glance.
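
A program can tell the two apart even if a person skimming the URL bar cannot: only labels starting with the reserved `xn--` prefix are Punycode-encoded. A minimal sketch follows; the helper name `is_ace_label` is invented for illustration.

```python
ACE_PREFIX = "xn--"  # prefix reserved for Punycode-encoded (ACE) labels

def is_ace_label(label):
    """Return True if a single DNS label carries the ACE prefix."""
    return label.lower().startswith(ACE_PREFIX)

print(is_ace_label("xn--pypal-rqa"))      # True
print(is_ace_label("ax--qrmasdflkh-v9"))  # False: plain ASCII that merely looks similar
```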

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 0:49 UTC (Tue) by Richard_J_Neill (subscriber, #23093) [Link]

What I don't understand is why the homographs can't all resolve to the same place?

Eg the owner of www.bücher.com would automatically get www.bucher.com and www.buecher.com

That would solve all the problems - no homograph collisions, and no confusion over how to spell an IDN. Furthermore, this could be done either by the browser, or by DNS.

The only downsides I can see are:

Many-to-one conversion means 3-10 times as many DNS records.
Reverse lookups would need some sort of canonicalisation.

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 3:52 UTC (Tue) by darthmdh (guest, #8032) [Link]

I don't understand why I bought a base model car, and the dealer didn't throw in air conditioning, leather seats, ABS, Momo steering wheel & gearstick, GPS, side-impact airbags and a 12-speaker +twin sub, 400W Dolby surround entertainment unit!

After all, it's not like they're only in it for the money, right?

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 8:16 UTC (Tue) by ekj (guest, #1524) [Link]

There are literally thousands of letters that look similar. For some of them, whether they look similar depends on the font used. Who is to decide what is "too similar"?

For longer domain names there are literally millions of different names that all look more or less the same. It'd be rather complicated to have DNS handle that: it wouldn't be 2 or 3 registrations for a single domain, it'd be 2 or 3 million.
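
A back-of-the-envelope sketch of that growth: if every character can be swapped independently, the number of spellings is the product of the choices per position. The figure of two lookalikes per letter is an assumption made up for the example, not taken from any real confusables table, and `variant_count` is a hypothetical helper.

```python
from math import prod

def variant_count(name, lookalikes):
    """Count spellings where each character may be replaced by a lookalike.

    `lookalikes` maps a character to how many confusable alternatives it has.
    """
    return prod(1 + lookalikes.get(ch, 0) for ch in name)

# Assumption for illustration only: every Latin letter has two lookalikes.
assumed = {ch: 2 for ch in "abcdefghijklmnopqrstuvwxyz"}

print(variant_count("paypal", assumed))         # 3**6  = 729
print(variant_count("skandiabanken", assumed))  # 3**13 = 1594323
```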

There's also the issue that some of the homographs are arguably useful. In lots of fonts it is very hard (or impossible) to see the difference between l (small L) and I (capital i) paypal paypaI, would *your* grandmother notice ? Is it *reasonable* to assume people will notice such and base security on that assumption ?

The real solution has to be something different. With my bank (Skandiabanken.no) for example such attacks are made very much more difficult by the use of client-side certificates. The first time you use the bank you have to download a client-side certificate. This is installed in the browser and so configured that it'll only be presented to the real skandiabanken.no site. This has multiple benefits:

  • A phisher that somehow *succeeded* in having you give up account-number and pin-code would still not have all the info needed to access your account, since he won't have the client-side certificate.
  • The real bank-site uses the certificate to say "Hello Eivind Kjørstad" on the login-page. A phisher site would have no way of knowing my name, thus adding another difference between real and fake site. (not everyone would notice the change to "Hello dear customer", but some would.)
  • Firefox changes the colour of the security-key-icon when you're *both* ways authenticated using SSL-certificates. A green background means you're on a site which has presented a valid SSL-certificate AND to which you've presented a SSL-certificate that was accepted.

Avoid broken fonts

Posted Feb 22, 2005 9:36 UTC (Tue) by eru (subscriber, #2753) [Link]

In lots of fonts it is very hard (or impossible) to see the difference between l (small L) and I (capital i) paypal paypaI, would *your* grandmother notice ?

But such fonts are seriously broken, at least for all applications that require accurate information to be conveyed.

It would not be too hard to require that the URL entry field and the status bar must use only fonts where different letters of the alphabet are clearly distinguishable. That does not mean going to monospaced typewriter fonts. Most of the problem goes away by just avoiding sans serif fonts.

Avoid broken fonts

Posted Feb 22, 2005 16:47 UTC (Tue) by Max.Hyre (subscriber, #1054) [Link]

Most of the problem goes away by just avoiding sans serif fonts

I fear there speaks insufficient examination of other scripts.

Even in Latin, serif, fonts, we're still stuck with letter `O' vs. digit `0', and letter `l' vs. digit `1'. What is ``sans-serif'' in Hindi? Japanese?

Having no ability to read non-Latin languages, I've nonetheless seen a number of them (Hebrew, Arabic, Thai, Japanese, &c.) and I strongly suspect they have their own problems of this nature, and probably a number of different ones, to boot.

Avoid broken fonts

Posted Feb 23, 2005 8:45 UTC (Wed) by eru (subscriber, #2753) [Link]

I fear there speaks insufficient examination of other scripts.

Even in Latin, serif, fonts, we're still stuck with letter `O' vs. digit `0', and letter `l' vs. digit `1'.

Note I wrote "most of the problem", not "the entire problem"! Anyway, in the font I am now reading your message in (Times New Roman), letter `O' and digit `0' are clearly different. `l' and `1' could be confused, although there is some difference. Well, the font could be improved. I remember seeing old Telex machines where the font was specially designed to make `l' different, with a small hook at the bottom.

(Reminds me of my father's old mechanical typewriter that actually took advantage of the similarity of `l' and `1': there was no separate key for "one"!)

What is ``sans-serif'' in Hindi? Japanese?

I don't know. The concept probably makes no sense in scripts that belong to other cultures than Western European. However, there are different fonts being used for these scripts. Even not knowing Hindi or Japanese, I can sense the differences.

Some fonts for these languages are no doubt less prone to homographs than others, so a similar solution is likely to be applicable: Pick fonts where the characters are maximally different for applications where URLs are entered or displayed for the user's verification.

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 18:32 UTC (Tue) by iabervon (subscriber, #722) [Link]

It is not really necessary to have a client certificate; the server certificate is sufficient identification of the server. The problem is that browsers use certificate authorities to determine whether a certificate should be trusted, which is the real flaw. The browser should really store the certificates of sites you have a relationship with and tell you if you get a new certificate. If you go to an encrypted page where you haven't seen the certificate before, you should go through a process where the browser tells you that the certificate is new, tells you to be suspicious if you thought you'd been there before, lets you compare the fingerprint against a known one (your bank could send you the fingerprint of their certificate in the online banking literature or your statement), and warns you if there are similar domains you've got certificates from before.
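
A minimal sketch of that "remember the certificate" idea, assuming a hypothetical `CertStore` helper and placeholder byte strings in place of real DER-encoded certificates. It covers only the new/match/changed decision, not the similar-domain warning or the user-facing dialog.

```python
import hashlib

class CertStore:
    """Remembers one certificate fingerprint per host (hypothetical helper)."""

    def __init__(self):
        self._known = {}  # host name -> hex SHA-256 fingerprint

    @staticmethod
    def fingerprint(der_bytes):
        return hashlib.sha256(der_bytes).hexdigest()

    def check(self, host, der_bytes):
        """Return 'new', 'match' or 'changed' for the presented certificate."""
        seen = self._known.get(host)
        if seen is None:
            return "new"
        return "match" if seen == self.fingerprint(der_bytes) else "changed"

    def trust(self, host, der_bytes):
        """Record the certificate once the user has verified the fingerprint."""
        self._known[host] = self.fingerprint(der_bytes)

store = CertStore()
real_cert = b"placeholder for the bank's DER certificate"
fake_cert = b"placeholder for a phisher's DER certificate"

print(store.check("bank.example", real_cert))        # new
store.trust("bank.example", real_cert)
print(store.check("bank.example", real_cert))        # match
print(store.check("bank.example", fake_cert))        # changed
print(store.check("bank-login.example", fake_cert))  # new
```

The last line is the case that matters for phishing: a lookalike host shows up as "new" even though the user believes they have been there before.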

Ideally, users could store the set of certificates with their ISPs, too, so that they wouldn't lose all of their recognized certificates when changing computers, to reduce the number of times that a particular individual will legitimately see the new certificate page for the same site.

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 23, 2005 13:19 UTC (Wed) by forthy (guest, #1525) [Link]

"paypal vs. paypaI" (with capital "I" instead of lowercase "l"): That's "solved" with normalizing URLs to all-lowercase letters (paypaI->paypai).

Punycode is a bad solution to a real problem. The real problem is that people want (and need) localized URLs. Not so much in the world with Latin letters, but the rest is at least as large. The solution simply is wrong: You don't want context-free localized URLs, i.e. you don't want Unicode.

My suggestion is to drop punycode, and create a stringent set of transformations into ASCII. If you want a Chinese domain (e.g. for xinhua, the Chinese news service), you get "xinhua.zn". You are allowed to enter that text in Chinese, the transition process makes sure that you can type something like 薪华.中 in your web-browser, and still get what you need (you have to agree on a particular transcript, though).

You could still even see what you need when there's a backmapping for the preferred rendering. This should be a DNS entry, i.e. if you buy "xinhua.zn", you can ask for such an entry. The entry has to follow the rules (i.e. it has to forward translate to "xinhua.zn"), and can probably also follow further rules (if it's a .zn domain, e.g. it should be Chinese).

BTW LWN: I really wanted these &#xHEX; as above in my text, there's no fucking unescaped & in there. They would show up as unicode Chinese characters to prove my point :-(.

client-side certificate

Posted Feb 25, 2005 16:43 UTC (Fri) by giraffedata (subscriber, #1954) [Link]

The first time you use the bank you have to download a client-side certificate. This is installed in the browser

Does that mean you can't access your bank from another computer? If so, I'd switch banks if it were I.

What you've described doesn't seem to solve the homograph problem, though. The fake bank site would accept the certificate.

client-side certificate

Posted Feb 26, 2005 9:21 UTC (Sat) by Klavs (subscriber, #10563) [Link]

>Does that mean you can't access your bank from another computer? If so, I'd switch banks if it were I.

Only if you don't bring your client certificate along on, say, a USB token. This coincides with the good principle of "something you have". With all (AFAIK) banks in DK incl. Skandiabanken, you have to have something you know and something you have. They are only lacking "something you are" ;) - much better than just something you know (i.e. a password).

>What you've described doesn't seem to solve the homograph problem, though. The fake bank site would accept the certificate.
The fake bank site would NEVER get the certificate (unless they'd done the good old DNS spoofing), as the browser can easily see that www.skandiabanken.no and www.skandsome-idn-abanken.no are NOT the same site.

client-side certificate

Posted Feb 26, 2005 18:27 UTC (Sat) by giraffedata (subscriber, #1954) [Link]

Yes, I was confused. You're talking about a scheme to make the stealing of a password unproductive (because the password isn't useful by itself), rather than to prevent someone from being fooled into thinking he is talking to his bank when he is not.

Improving on the security of the password is good, but for a whole bunch of other reasons, phishing itself needs to be dealt with too.

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 13:10 UTC (Tue) by danielos (subscriber, #6053) [Link]

And could both the punycode and the full IDN form be shown at the same time?

And how would one know? Isn't xn--pypal-rqa.com itself a legal domain?

Long term solution ... I think it simply does not exist

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 22, 2005 17:29 UTC (Tue) by man_ls (subscriber, #15091) [Link]

punycode may not be the panacea, but that does not mean that there is no long-term solution.

See, the main problem seems to appear when mixing characters from different sets, e.g. "amazon" with a Greek omicron, or Latin names with other non-Latin characters. The long-term solution might therefore be to divide characters into sets according to alphabet (Latin, Greek, Russian, Hindi...) and show punycode only when some characters are from different sets. Otherwise, all-Greek domain names should appear in Greek characters, all-Hindi in Hindi...
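
A rough sketch of such a rule. Python's standard library does not expose the Unicode Script property, so the script is approximated here by the first word of each character's Unicode name; the helper names `scripts_in` and `display_form` are invented for the example.

```python
import unicodedata

def scripts_in(label):
    """Approximate the set of scripts used by the letters in `label`.

    Uses the first word of the Unicode character name ('LATIN', 'GREEK',
    'CYRILLIC', ...). Digits and hyphens are ignored. This is a heuristic,
    not the real Unicode Script property.
    """
    return {unicodedata.name(ch, "UNKNOWN").split()[0]
            for ch in label if ch.isalpha()}

def display_form(label):
    """Show Unicode for single-script labels and the ACE form otherwise."""
    if len(scripts_in(label)) <= 1:
        return label
    return label.encode("idna").decode("ascii")

mixed = "amaz\u03bfn"                     # Latin letters plus a Greek omicron
greek = "\u03b5\u03bb\u03bb\u03ac\u03c2"  # an all-Greek label

print(sorted(scripts_in(mixed)))  # ['GREEK', 'LATIN']
print(display_form(mixed))        # an "xn--" form, because scripts are mixed
print(display_form(greek) == greek)  # True: single script, shown as typed
```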

New IDN Homograph Spoofing Response: IDN Will Not Be Disabled (MozillaZine)

Posted Feb 24, 2005 13:04 UTC (Thu) by ibukanov (subscriber, #3942) [Link]

> Otherwise, all-greek domain names should appear in Greek characters, all-hindi in Hindi...

It would not work in general. Depending on a font used you may or may not spot the difference between "paypal" and its equivalent written with only Cyrillic letters and digit 1, "раура1".
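
To make the point concrete, the sketch below spells the lookalike with explicit code points and applies the same kind of name-based script heuristic (an approximation, not the Unicode Script property). Every letter is Cyrillic, so a "single script" check would let it through.

```python
import unicodedata

latin = "paypal"
# Cyrillic er, a, u, er, a followed by the ASCII digit one.
cyrillic = "\u0440\u0430\u0443\u0440\u0430" + "1"

# List what each character really is.
for ch in cyrillic:
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")

# Scripts of the letters only; the digit is shared by all scripts.
letter_scripts = {unicodedata.name(ch).split()[0]
                  for ch in cyrillic if ch.isalpha()}

print(letter_scripts)     # {'CYRILLIC'}
print(cyrillic == latin)  # False, although the two may render almost alike
```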

P.S. To view the Cyrillic letter properly in the above please set the encoding for the page to UTF-8. LWN unfortunately still uses Latin-1 which is quite ironic for the pages that talk about Unicode.

Suggested Extension to Fix

Posted Feb 22, 2005 17:08 UTC (Tue) by Max.Hyre (subscriber, #1054) [Link]

Having skimmed the Punycode spec. (RFC 3492), I can see that the fix is moderately effective, at least for URLs already in ASCII. It's also a pain for someone reading, say, Hindi who wants to actually read the URL.

As an addition to the temporary fix, how about adding a button next to the URL display which, when pressed, will convert the URL to Unicode so it can actually be read? That sounds relatively doable to one who hasn't checked the source. :-) (By `URL display', I mean the URL-entry bar at the top, where one can see the URL currently visited, and may type in a new URL.)

In addition, it's desirable, and probably more important, to show the URL about to be clicked on, before the click. It's tougher, because a button won't work while hovering. Maybe, when hovering, convert the info line at the bottom into two lines, showing both the Punycode and the readable text?
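
Producing both forms for such a two-line display is straightforward. The sketch below uses Python's built-in `idna` codec; the function name `both_forms` is invented, and nothing here reflects how the browser itself implements it.

```python
def both_forms(host):
    """Return (ascii_form, unicode_form) for a host name given in either form."""
    ascii_form = host.encode("idna").decode("ascii")         # ToASCII
    unicode_form = ascii_form.encode("ascii").decode("idna")  # ToUnicode
    return ascii_form, unicode_form

print(both_forms("xn--pypal-rqa.com"))  # ('xn--pypal-rqa.com', 'pàypal.com')
print(both_forms("p\u00e0ypal.com"))    # same pair, starting from the Unicode form
```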

Suggested Extension to Fix

Posted Feb 24, 2005 13:55 UTC (Thu) by hummassa (subscriber, #307) [Link]

This sounds like a good idea -- maybe the tooltip to the url bar?

But... it would open another can of worms...

For instance, if you have two similar but different non-Latin glyphs, it seems the only way to be certain that you are on your bank's site is to know the punycode name.

Copyright © 2005, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds