

The digital detective: Mikko Hypponen's war on malware is escalating - olegp
http://www.wired.co.uk/magazine/archive/2012/04/features/the-digital-detective?page=all

======
peteretep
The best thing about Mikko Hyppönen is that no one can spell his name the same
way twice (including him - the rock and roll umlaut comes and goes). This makes
him an absolutely awesome test case for detecting people by name in text
corpora. Once you've solved it for Mikko, most other Western names present
little in the way of issues.
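
A minimal sketch of the kind of folding that makes the umlaut and non-umlaut spellings compare equal, assuming accent-insensitive matching is good enough for the corpus. The function name `fold_name` and the sample variants are hypothetical, and this does not handle transliterations such as "oe" for "ö".

```python
import unicodedata


def fold_name(name: str) -> str:
    """Return a lowercase, diacritic-free key for loose name matching."""
    # NFKD splits "ö" into "o" plus a combining diaeresis (U+0308).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


# Hypothetical spellings as they might appear in a corpus.
variants = ["Mikko Hyppönen", "Mikko Hypponen", "MIKKO HYPPÖNEN"]
keys = {fold_name(v) for v in variants}
```

All three variants collapse to a single key, which can then be used for lookups or deduplication.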

~~~
kennu
Finnish people quite often find it easier to spell their name without the
umlauts when appearing online. The situation used to be much worse when email
headers and BBS usernames were still ASCII-only (before Unicode), but you
still occasionally run into problems.

Most recently, I've had trouble using a street address with umlauts for Amazon
AWS billing. Bad web apps tend to convert the umlauts into HTML entities and
then recursively escape those entities with &amp;s when editing data or
resubmitting forms.
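
The recursive escaping can be reproduced in a few lines. This is only a sketch of the failure mode, not how any particular site is implemented; the helper names and the sample address are made up for illustration.

```python
import html
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace non-ASCII characters with named HTML entities where one exists."""
    return "".join(
        f"&{codepoint2name[ord(ch)]};"
        if ord(ch) > 127 and ord(ch) in codepoint2name
        else ch
        for ch in text
    )


def buggy_resubmit(stored: str) -> str:
    """Simulate one edit/resubmit cycle that escapes already-escaped text."""
    return html.escape(stored, quote=False)


# Hypothetical address; the first save turns "ä" into "&auml;".
stored = to_entities("Hämeentie 1")
history = [stored]
for _ in range(2):
    stored = buggy_resubmit(stored)
    history.append(stored)

# Undoing the damage takes one html.unescape call per layer of escaping.
recovered = stored
for _ in range(3):
    recovered = html.unescape(recovered)
```

Each resubmission adds another `amp;` layer, so the stored value drifts further from the original every time the form is edited.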

~~~
bandy
E-mail headers, BBS stuff, and of course IRC. I can't remember if it was { for
ö or another special character.
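
For reference, a small sketch of the substitution, assuming the 7-bit Finnish/Swedish national variant of ISO 646, in which the bracket, backslash, brace, and pipe positions carried the Nordic letters. Under that assumption { stood for ä and | for ö. The table and function names are hypothetical.

```python
# Assumed mapping from the Finnish/Swedish ISO 646 variant:
# [ \ ] are the uppercase letters, { | } the lowercase ones.
SEVEN_BIT_TO_NORDIC = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def decode_7bit_finnish(text: str) -> str:
    """Map the 7-bit stand-in characters back to the Nordic letters."""
    return text.translate(SEVEN_BIT_TO_NORDIC)


decoded = decode_7bit_finnish("Hypp|nen")
```

This only makes sense for text known to use that convention, since it would also rewrite genuine brackets and braces.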

