"98% of web sites are encoded in UTF8"

And quite a few web sites claim to be encoded in UTF-8 but actually serve Latin-1. It is best to check, or at least to specify error handling on your decoder.
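For illustration, here is a minimal Python sketch of what "check, or at least specify error handling" could look like. The function names `decode_page` and `decode_lossy` are made up for this example, and falling back to Latin-1 is an assumption based on the mislabeling described above, not a general rule.

```python
def decode_page(raw, declared="utf-8"):
    """Return (text, encoding_used) for a page body given as bytes.

    Hypothetical helper: trusts the declared encoding only if the
    bytes actually decode under it.
    """
    try:
        # errors="strict" raises on the first invalid byte sequence
        return raw.decode(declared, errors="strict"), declared
    except (UnicodeDecodeError, LookupError):
        # Assumed fallback: Latin-1 maps every byte to a code point,
        # so this decode cannot fail (it can still be the wrong guess).
        return raw.decode("latin-1"), "latin-1"


def decode_lossy(raw, declared="utf-8"):
    """Never raise; invalid sequences become U+FFFD instead."""
    return raw.decode(declared, errors="replace")
```

With this sketch, a page labeled UTF-8 whose body is really Latin-1 comes back tagged as `"latin-1"` rather than raising, while `decode_lossy` keeps going and marks the bad bytes.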




My guess is that the reality is "98% of websites are valid UTF-8 documents". A large portion contain only ASCII, so they happen to be valid UTF-8 and are indistinguishable from truly UTF-8-encoded ones until they break.
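A rough Python sketch of that distinction, assuming you have the raw bytes of the page. The function name `classify` and its three labels are invented for this example.

```python
def classify(raw):
    """Label a byte string as 'ascii', 'utf8', or 'not-utf8'.

    Hypothetical helper: 'ascii' means the bytes are valid UTF-8 only
    because they never leave the 7-bit range, so they say nothing
    about which encoding the site really uses.
    """
    if raw.isascii():  # bytes.isascii() needs Python 3.7+
        return "ascii"
    try:
        raw.decode("utf-8")  # strict by default
    except UnicodeDecodeError:
        return "not-utf8"
    return "utf8"
```

An ASCII-only page decodes to the same text under UTF-8 and Latin-1, which is why it counts as "valid UTF-8" in a survey without proving anything. The difference only shows up once a non-ASCII byte appears.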



