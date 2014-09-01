Chrome - fixed in 59 (current stable is 57)
Firefox - no plans to change; you can adjust network.IDN_show_punycode in about:config
IE - immune
Safari - immune
With now over 1000 top-level domains, and however many homographic
matches among character sets, expecting people to register dozens of
matching domains seems unrealistic.
I think that, plus a "you have never visited this site before" kind of warning could go a long way towards combating these kinds of attacks.
I think the real devil is going to be in the UI. You don't want to make it overly scary (otherwise you penalize domains which use some Unicode characters correctly), but it can't be so unnoticeable that you won't be able to tell when it matters.
The thing is, why should an English-speaking person get a warning when they visit a Cyrillic URL, but a Russian-speaking person doesn't get a warning when visiting a URL with Latin characters? Why is apple.com assumed to be legitimate and аррІе.com is considered the fraud?
In fact I'm almost sure that browsers originally used to disable IDNs using some kind of scheme that relied on language preferences back when they first started being used. I suspect they eventually abandoned that approach for this very reason. It only seems like a good idea if you're English-speaking (or at least speak some other language written in the Latin script).
"You have never visited a site in this language/character set before"
More_Info. Cancel? Proceed?
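For what it's worth, here is a rough sketch of how a "never visited this character set before" check might work. Python's standard library doesn't expose the Unicode Script property, so this guesses the script from the first word of each character's Unicode name, which is only a heuristic. The names script_of, scripts_in and needs_warning are made up for illustration and are not any browser's actual logic, and the escaped look-alike string is my assumption about how such a domain would be spelled.

```python
import unicodedata

def script_of(ch):
    """Rough script guess: the first word of the Unicode character name."""
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def scripts_in(hostname):
    """Scripts used by the letters of a hostname (digits, dots, hyphens ignored)."""
    return {script_of(ch) for ch in hostname if ch.isalpha()}

def needs_warning(hostname, seen_scripts):
    """True if the hostname uses a script this user has not visited before."""
    return not scripts_in(hostname) <= seen_scripts

# Assumption: a user whose history so far contains only Latin-script sites.
seen = scripts_in("news.ycombinator.com")
# Assumption: a Cyrillic look-alike of "apple", written with escapes for clarity.
lookalike = "\u0430\u0440\u0440\u0406\u0435.com"

print(needs_warning("apple.com", seen))
print(needs_warning(lookalike, seen))
```

Because the check is keyed on the user's own history rather than on a fixed notion of the "normal" alphabet, a user who has only ever visited Cyrillic sites would get the same warning for apple.com.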
mike@blob:~$ telnet gmail-smtp-in.l.google.com 25
Trying 66.102.1.26...
Connected to gmail-smtp-in.l.google.com.
Escape character is '^]'.
220 mx.google.com ESMTP 19si14686133wmr.1 - gsmtp
EHLO whatever
250-mx.google.com at your service, [164.132.228.175]
250-SIZE 157286400
250-8BITMIME
250-STARTTLS
250-ENHANCEDSTATUSCODES
250-PIPELINING
250-CHUNKING
250 SMTPUTF8
MAIL FROM:<fakeaddress@ycombinator.com>
250 2.1.0 OK 19si14686133wmr.1 - gsmtp
RCPT TO:<*****@gmail.com>
250 2.1.5 OK 19si14686133wmr.1 - gsmtp
DATA
354 Go ahead 19si14686133wmr.1 - gsmtp
From: "Fake Address" <fakeaddress@ycombinator.com>
To: *****@gmail.com
Subject: This is a spoofed email
Spoof spoof spoof
--
Spoofy McSpoof
.
250 2.0.0 OK 1492497764 19si14686133wmr.1 - gsmtp
The only clue is that, in the web interface, Google displays a grey octagon with a red question mark inside it next to the sender address. When you hover over it, a tooltip says:
"Gmail couldn't verify that ycombinator.com actually sent this message (and not a spammer)"
So yeah. I would dispute "Spoofing isn't so easy for gmail and yahoo inboxes" - They're as shit as everyone else.
Still, most people are unable to confirm the origin of an email. The warning, if any, is likely to be ignored.
https://github.com/NebulousLabs/glyphcheck
(btw, Wikipedia notes that "The term homograph is sometimes used synonymously with homoglyph, but in the usual linguistic sense, homographs are words that are spelled the same but have different meanings, a property of words, not characters.")
With some work, it could be made language-agnostic, but that's more than I have time for right now. If comments aren't an issue, you can just grep through all your source files for the offending characters, which shouldn't take more than a simple bash script.
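As a sketch of the "just grep for the offending characters" approach, something like the following would flag every non-ASCII character in the files you pass it. This is not how glyphcheck works; find_non_ascii is a hypothetical helper, and it assumes your sources are UTF-8 and that any non-ASCII character is worth a look.

```python
import sys
import unicodedata

def find_non_ascii(text):
    """Yield (line_no, column, char, unicode_name) for each non-ASCII character."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                yield line_no, col, ch, unicodedata.name(ch, "UNKNOWN")

if __name__ == "__main__":
    # Usage (assumed): python scan.py file1.go file2.go ...
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8", errors="replace") as f:
            for line_no, col, ch, name in find_non_ascii(f.read()):
                print(f"{path}:{line_no}:{col}: {ch!r} {name}")
```

It will also flag legitimate non-ASCII text in comments and string literals, so it only works as-is if, as noted above, those aren't an issue for your codebase.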
However, a look-alike apple.com with a CC reset form could be a mighty easy way to scam a lot of people into giving up personal details, which could easily lead to full-blown identity theft.
Thankfully FF/Chrome are patching this
Someone else's example that looks like "app.com" (http://www.xn--80a6aa.com/) translates to the Cyrillic text, even in Pale Moon. I wonder if Apple's site is on a hard-coded blacklist in the browser, or if every update includes the top-1000 list, or something?
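If you want to see what an xn-- hostname turns into without trusting the address bar, Python's built-in idna codec can do the conversion. A minimal sketch, with the caveat that this codec implements the older IDNA 2003 rules and says nothing about each browser's own display policy; to_unicode and to_ascii are just helper names I picked.

```python
def to_unicode(hostname):
    """Decode an ASCII (xn--) hostname into its Unicode display form."""
    return hostname.encode("ascii").decode("idna")

def to_ascii(hostname):
    """Encode a Unicode hostname back into its ASCII (xn--) form."""
    return hostname.encode("idna").decode("ascii")

host = "xn--80a6aa.com"
shown = to_unicode(host)
label = shown.split(".")[0]

# Print the display form plus the code point of every character in the label.
print(shown, [f"U+{ord(ch):04X}" for ch in label])
```

For this hostname the label decodes to three Cyrillic letters (U+0430, U+0440, U+0440), which is why it renders like "app" in most fonts.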
I remember reading about issues with Unicode domains years ago, though. It surprises me that something hasn't been figured out by this point. One mitigation that I remember being discussed was coloring characters from different scripts in different colors, to make variant characters more obvious.
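The coloring idea is easy enough to prototype in a terminal. This is only a sketch: it guesses each character's script from the first word of its Unicode name (the standard library has no real Script property), the color table is arbitrary, and script_of, script_runs and colorize are hypothetical names.

```python
import unicodedata
from itertools import groupby

# Arbitrary ANSI colors per script; scripts not listed are left uncolored.
COLORS = {"LATIN": "\033[32m", "CYRILLIC": "\033[31m", "GREEK": "\033[34m"}
RESET = "\033[0m"

def script_of(ch):
    """Rough script guess; non-letters (dots, digits, hyphens) count as COMMON."""
    if not ch.isalpha():
        return "COMMON"
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def script_runs(hostname):
    """Split a hostname into consecutive (script, text) runs."""
    return [(script, "".join(chars))
            for script, chars in groupby(hostname, key=script_of)]

def colorize(hostname):
    """Wrap each run in the ANSI color assigned to its script."""
    return "".join(COLORS.get(script, "") + text + RESET
                   for script, text in script_runs(hostname))

# Assumption: a mixed-script name whose first letter is Cyrillic U+0430.
print(colorize("\u0430pple.com"))
```

A name written entirely in one script would come out in a single color, so this mostly helps with mixed-script labels or with a label whose color differs from the TLD.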