
Some of this (Latin and Cyrillic A/a, for example) is relevant to issues such as Internationalized Domain Name homograph attacks (https://en.wikipedia.org/wiki/IDN_homograph_attack).
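
To make the Latin/Cyrillic point concrete, here is a rough sketch of flagging a label that mixes scripts. It assumes that the first word of a character's Unicode name ("LATIN", "CYRILLIC", ...) is a good enough stand-in for its script, which is only a heuristic; the function names are made up for illustration and this is not how any registrar or browser actually does it.

```python
import unicodedata

def scripts_used(label):
    """Collect a rough script tag for every letter in the label.

    Heuristic (an assumption of this sketch): use the first word of the
    Unicode character name, e.g. 'LATIN' or 'CYRILLIC'.
    """
    scripts = set()
    for ch in label:
        if ch.isalpha():
            # Fall back to "UNKNOWN" for characters without a name.
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def looks_mixed_script(label):
    """True if the letters in the label come from more than one script."""
    return len(scripts_used(label)) > 1

# "\u0430" is CYRILLIC SMALL LETTER A, which renders like Latin "a".
spoofed = "p\u0430ypal"
print(looks_mixed_script(spoofed))   # mixed Latin + Cyrillic
print(looks_mixed_script("paypal"))  # Latin only
```

A label written entirely in Cyrillic lookalikes would pass this check, so it only catches the mixed-script case.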

I once worked on what we called DYM ("Did You Mean") tools. The goal was to help native English speakers who were learning a second language look up words they had heard in that language's electronic dictionary. We knew, for example, that native speakers of English have difficulty distinguishing the dental and retroflex consonants of Hindi, so the DYM tolerated mismatches between them. An exact spelling match came at the top of the list returned from the dictionary, while a word reached through a misspelling (such as writing a dental for a retroflex) appeared a little lower in the list. We tailored each DYM to a particular target language (and always for native speakers of English). As you say, getting something like this to work for multiple languages at the same time would be difficult.
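
For anyone curious what that ranking looks like, here is a minimal sketch, not the code we actually used. It assumes a romanization where lowercase t/d/n are dental and uppercase T/D/N are retroflex, only compares words of equal length, and uses toy entries rather than real dictionary data; all names are hypothetical.

```python
# Hypothetical confusion pairs: lowercase = dental, uppercase = retroflex.
CONFUSABLE = {frozenset(pair) for pair in [("t", "T"), ("d", "D"), ("n", "N")]}

def mismatch_cost(query, entry):
    """Count dental/retroflex swaps between query and entry.

    Returns None if the words differ in any other way (including length),
    so only the tolerated confusions can produce a match.
    """
    if len(query) != len(entry):
        return None
    cost = 0
    for q, e in zip(query, entry):
        if q == e:
            continue
        if frozenset((q, e)) in CONFUSABLE:
            cost += 1
        else:
            return None
    return cost

def did_you_mean(query, dictionary):
    """Return matching entries, exact matches first, then by swap count."""
    hits = []
    for word in dictionary:
        cost = mismatch_cost(query, word)
        if cost is not None:
            hits.append((cost, word))
    return [word for cost, word in sorted(hits)]

toy_dictionary = ["dal", "Dal", "tal", "dil"]
print(did_you_mean("dal", toy_dictionary))  # ['dal', 'Dal']
```

The confusion table is the part that has to be tailored per target language (and per learner's native language), which is why a single table for many languages at once gets hard.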



