
Ask HN: Popularity of Unicode Versions of URLs? - supahfly_remix
When I look at web marketing data (e.g., Alexa), I only ever see results for Latin-alphabet URLs, even for regions with non-Latin alphabets.  Do users from places with non-Latin alphabets type in Latin website names, or do they use Unicode versions?<p>I know China likes to use numerical URLs, which avoids this problem, but what about other places?
======
porbelm
There's a solution, but it's awkward. We Scandinavians use æøå, but we rarely
use them in URIs. For example, the electronics chain "Elkjøp" uses "elkjop.no",
even though it is one of the few that has actually registered "elkjøp.no".

I have "kråke.re" myself but the DNS entry is really xn--krke-roa.re because
international character DNS is an ugly, ugly hack.
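
For anyone curious how the name above turns into its "xn--" form, here is a
minimal sketch using Python's built-in "idna" codec (which implements the older
IDNA 2003 rules). The helper names to_ascii and to_unicode are made up for
illustration, not part of any library.

```python
def to_ascii(hostname: str) -> str:
    """Convert a Unicode hostname to its ASCII (Punycode) DNS form."""
    # The "idna" codec encodes each dot-separated label and adds "xn--"
    # only to labels that contain non-ASCII characters.
    return hostname.encode("idna").decode("ascii")


def to_unicode(hostname: str) -> str:
    """Convert an ASCII "xn--" hostname back to its Unicode form."""
    return hostname.encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii("kråke.re"))
    print(to_unicode("xn--krke-roa.re"))
```

The browser does this conversion before the DNS lookup, so the resolver only
ever sees the ASCII form.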

~~~
supahfly_remix
Very interesting. Is it easy to type these special characters on your mobile?

That encoding is punycode, right?

~~~
Lorenz-Kraft
I think all modern mobile keyboards show the regularly used characters
depending on your language, so it's very easy.

I think Punycode is only used for the domain name, not for the rest of the
URL. The rest of the URL should be normal UTF-8 encoding. I wonder why they
made a distinction at all.
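
To make the split concrete, here is a rough sketch of how the two parts of a
URL end up encoded differently: the host goes through IDNA/Punycode, the path
is percent-encoded UTF-8. The function name to_wire_form and the example path
are assumptions for illustration, and the sketch deliberately ignores ports,
userinfo and query-string encoding.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def to_wire_form(url: str) -> str:
    """Return an ASCII-only version of a URL with a Unicode host and path."""
    parts = urlsplit(url)
    # Host: IDNA (Punycode) per label, e.g. "kråke" -> "xn--krke-roa".
    host = parts.hostname.encode("idna").decode("ascii")
    # Path: UTF-8 bytes, percent-encoded ("/" is left alone by default).
    path = quote(parts.path)
    # Simplification: port and userinfo are dropped, query is passed through.
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


if __name__ == "__main__":
    print(to_wire_form("https://kråke.re/kråke"))
```

The same word "kråke" comes out as "xn--krke-roa" in the host and as
"kr%C3%A5ke" in the path.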

------
Lorenz-Kraft
I once tried Unicode URLs and they are a pain in the a§§ because some
browsers (and mail clients) interpret them differently. Some search engines
and crawlers also interpret them differently (mostly they double-encode them).
This in turn results in "errors" where the developer has to decode and
re-encode the URL again server-side to serve the correct content for it.
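
A minimal sketch of what double encoding looks like and one way to undo it
server-side. The helper fully_unquote and the path "/käse" are made up for
illustration. It assumes your paths never contain a literal "%", otherwise it
would decode too much.

```python
from urllib.parse import quote, unquote


def fully_unquote(path: str, max_rounds: int = 3) -> str:
    """Percent-decode repeatedly until the path stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(path)
        if decoded == path:
            break  # nothing left to decode
        path = decoded
    return path


if __name__ == "__main__":
    once = quote("/käse")   # what a well-behaved client sends
    twice = quote(once)     # what a double-encoding crawler sends
    print(once, twice, fully_unquote(twice))
```

The second quote() call escapes the "%" signs themselves as "%25", which is
the typical signature of a double-encoded request.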

On the other hand, just "transliterating" a URL is super simple, and people
all over the world can at least read the URL (and probably memorise it). For
example: ä => ae, which everybody in Germany knows how to read and interpret.
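
A rough sketch of that kind of transliteration for German, assuming the usual
ä/ö/ü/ß replacements. The function name slugify and the mapping table are
illustrative only; other languages need their own tables.

```python
import re

# Conventional German replacements; extend the table for other languages.
GERMAN_MAP = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def slugify(text: str) -> str:
    """Turn a German phrase into an ASCII-only URL slug."""
    ascii_text = text.lower().translate(GERMAN_MAP)
    # Collapse everything that is not a-z or 0-9 into single hyphens.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")


if __name__ == "__main__":
    print(slugify("Müller Straße"))
```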

SEO-wise: no difference at all.

------
randomerr
I believe most people stay away from Unicode URLs because of the Cyrillic URL
spoofing from about 10-15 years ago. China's numeric URLs may be there so that
tracking URLs is easier, but that's just a guess.

