
As a non-native-English-speaker, the idea of having invisible bugs in code due to lookalike characters terrifies me.

The allowed set of characters would have to be vetted very carefully before I'd consider non-ASCII.

Edit: A small taste of what we can expect: https://i.imgur.com/k8S00sM.jpg




As a Finn, I don't see how the additional characters ä and ö (which in Finnish are separate letters of the alphabet, not umlauted a and o) could suddenly cause invisible bugs like that. But I can very easily see how ASCIIfying ä and ö to a and o can lead to real misunderstandings. Just as an example, väärä means "wrong" or "incorrect", but vaara means "danger" (also "esker", but that's less likely to cause confusion :P). Germans might be fine with transcribing ä and ö to ae and oe, but in Finnish that's not proper.
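
To make the väärä / vaara point concrete, here is a minimal Python sketch. The helper name `asciify` is made up for illustration, and stripping combining marks is just one common way such ASCIIfication gets done, not a claim about any particular tool:

```python
import unicodedata

def asciify(word: str) -> str:
    """Hypothetical helper: decompose (NFD) and drop combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

wrong = "väärä"   # "wrong" / "incorrect"
danger = "vaara"  # "danger"

# The two words are distinct until the dots are stripped,
# at which point they collapse into the same string.
print(wrong == danger)
print(asciify(wrong) == danger)
```

Any identifier scheme that folds ä to a would merge these two words the same way.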

Also, bugs caused by accidental homographs in identifier names are caught by any reasonable type system, just like any other typo: the lookalike is a different identifier that was never declared. As for your edit, you can't stop clever people from being clever without code reviews anyway, no matter what the language.


> the idea of having invisible bugs in code due to lookalike characters terrifies me.

Unicode maintains a list of these characters (it calls them "confusables", in Unicode Technical Standard #39), and Rust rejects or warns on them.
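
For a rough feel of what such a check looks for, here is a small Python sketch. It is a crude heuristic, not the UTS #39 algorithm and not what Rust does: it guesses each letter's script from the first word of its Unicode character name and flags identifiers that mix scripts. The function names are made up for illustration.

```python
import unicodedata

def letter_scripts(identifier: str) -> set[str]:
    """Rough script guess: first word of each letter's Unicode name."""
    return {
        unicodedata.name(ch, "UNKNOWN").split()[0]
        for ch in identifier
        if ch.isalpha()
    }

def looks_mixed_script(identifier: str) -> bool:
    """Flag identifiers whose letters appear to come from several scripts."""
    return len(letter_scripts(identifier)) > 1

latin = "vaara"
finnish = "väärä"        # ä is still a LATIN letter
spoofed = "v\u0430ara"   # second letter is CYRILLIC SMALL LETTER A

print(latin == spoofed)              # visually alike, but different strings
print(looks_mixed_script(latin))
print(looks_mixed_script(finnish))
print(looks_mixed_script(spoofed))
```

Note that plain ä and ö pass this check, since they are Latin letters; only the Cyrillic lookalike gets flagged.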


It's inventive, I'll give them that.



