Hacker News

> Internationalized domain names [1]

Does anyone know how this works in general?

It seems like an invitation to severe cybersquatting to allow someone to register e.g. goógle.com. I know those marks mean something in non-English languages, but for me (and I suspect most English speakers) it's very easy to mistake the symbols that Europeans insist on putting over their letters for dust on the monitor. I mean symbols like "`" or "'" or [[cos(90), sin(90)], [-sin(90), cos(90)]] * ":" [2], or other weird symbols that I don't even know how to type because they aren't linear transformations of ASCII characters.

[1] http://www.winehq.org/announce/1.6

[2] http://xkcd.com/184/




-- disclaimer: this is old info and I never got too deep into it, so take it with a (big) grain of salt :) and please, correct me if I'm wrong somewhere!

From what I can remember of some of Chrome's and Firefox's battles with this, one of the end-states was something like this, using Punycode (the example here [1] seems more useful than the Wikipedia page [2]):

If system language matches domain name's language, display in localized characters. Otherwise, display punycode, to prevent homoglyph attacks.
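A rough sketch of that display rule, in Python. This is only one reading of it: the "language" check is approximated by comparing Unicode scripts, guessed from the first word of each character's Unicode name. The names script_of, to_punycode, display_form and the user_scripts parameter are made up for illustration and are not taken from any browser's code. Real IDNA processing also normalizes and validates labels, which this skips.

```python
import unicodedata

def script_of(ch):
    """Very rough script guess: the first word of the Unicode character name."""
    if ch.isascii():
        return "LATIN"  # assumption: treat ASCII letters, digits and hyphen as Latin
    return unicodedata.name(ch, "UNKNOWN").split()[0]

def to_punycode(domain):
    """Encode each non-ASCII label as xn--<punycode>; leave ASCII labels alone."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

def display_form(domain, user_scripts):
    """Show Unicode only if every character is in a script the user expects."""
    scripts = {script_of(ch) for ch in domain if ch != "."}
    if scripts <= set(user_scripts):
        return domain
    return to_punycode(domain)

# "g" + two Cyrillic "о" characters + "gle": looks like google.com
lookalike = "g\u043e\u043egle.com"
print(display_form(lookalike, {"LATIN"}))              # ASCII xn-- form
print(display_form(lookalike, {"LATIN", "CYRILLIC"}))  # shown as typed
```

Note that under this reading goógle.com would still be shown as-is to a Latin-script user, because "ó" is a Latin character. The rule only catches lookalikes borrowed from a script the user does not expect.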

"domain name's language" is of course a very vague definition. A better one might be "same Unicode script". And it's all hairy and a bit problematic, and I don't recall any conclusive, ideal solutions, and somewhat doubt one is possible. But handling the vast majority of legitimate uses in the best-possible way while preventing homoglyph attacks is pretty darned good.

And yeah, there will be sin/cosine/theta/pi.com attempts. But having an un-typable name is a choice made by the domain owner, just like any other. There's nothing preventing me from buying fdaocclasuro--ieja83q92e-jfksdl7a.com, which is at least as obtuse as any punycode URL, but that hasn't been a complaint in the past.

[1] http://mothereff.in/punycode

[2] http://en.wikipedia.org/wiki/Punycode


> And yeah, there will be sin/cosine/theta/pi.com attempts

I apologize for being thick-skulled, but I do not understand what you are referring to, and I did not understand the similar statement in the comment to which you replied. What am I missing?

Thanks in advance for explaining something to me that everyone else seems to have no problem understanding.


No problem :) I mean that people will buy / try to buy π.com, Θ.com, ☃.com, and even 🍤.com (pi, theta, snowman, and fried shrimp characters, if you can't see them). It's inevitable. Even though you can't really type them, people will buy them because they're weird / unique / well recognized. I'm merely saying it's not a problem.
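For what it's worth, each of those names still has a plain ASCII form that anyone can type. A minimal sketch using Python's built-in "punycode" codec; the helper names to_ascii and to_unicode are made up here, and the sketch skips the normalization and validation that real IDNA handling does:

```python
def to_ascii(domain):
    """Turn each non-ASCII label into its xn-- form."""
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)

def to_unicode(domain):
    """Reverse of to_ascii: decode any xn-- labels back to Unicode."""
    labels = []
    for label in domain.split("."):
        if label.startswith("xn--"):
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)

print(to_ascii("☃.com"))  # xn--n3h.com
```

So the snowman domain is reachable by typing xn--n3h.com, which is arguably no worse than any other random-looking name.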


Thank you for your reply. I was fixated on the equation (and xkcd image) from the original comment at the expense of overlooking "weird symbols that I don't even know how to type."


> Does anyone know how this works in general?

Here are obligatory Wikipedia links: http://en.wikipedia.org/wiki/Internationalized_domain_name http://en.wikipedia.org/wiki/IDN_homograph_attack

> It seems like an invitation to severe cybersquatting to allow someone to register e.g. goógle.com.

First, you're a couple years late on this.

Second, registries that enable IDN generally formulate anti-spoofing rules. It is unlikely that an entity that is not Google and does not control google.pl would be able to register goógle.pl.
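To give a feel for what such a rule could look like, here is a hypothetical sketch in Python. It is not any real registry's policy: skeleton, can_register and the registered dict are invented for illustration, and the check only folds away diacritics (real registries use their own character and variant tables, and this does nothing about lookalikes from other scripts).

```python
import unicodedata

def skeleton(label):
    """Case-fold, decompose, and drop combining marks: 'goógle' -> 'google'."""
    decomposed = unicodedata.normalize("NFD", label.casefold())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def can_register(label, applicant, registered):
    """Refuse a label whose skeleton collides with a name owned by someone else.

    `registered` maps existing labels to their owners (hypothetical data model).
    """
    wanted = skeleton(label)
    for existing, owner in registered.items():
        if skeleton(existing) == wanted and owner != applicant:
            return False
    return True

registered = {"google": "Google"}
print(can_register("goógle", "Mallory", registered))  # False
print(can_register("goógle", "Google", registered))   # True
```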

Third, any truly sensitive content should be behind HTTPS anyway.

Fourth, most of the world uses non-ASCII scripts and generally speaking couldn't give a damn about problems the scripts give the small minority stuck with ASCII in the 1980s ;)



