
No. Once you screw up the encoding, the information is generally gone. It's not just a matter of munging; often you have to grovel over the entire file, by hand, correcting things.
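To make the "information is gone" point concrete, here is a minimal Python sketch of one common way it happens: encoding to a narrower character set with a lossy error handler. The sample strings and the `lossy` variable are my own illustration, not anything from a real program.

```python
# Illustrative sketch: a lossy encode collapses distinct characters into '?'.
original = "na\u00efve caf\u00e9"  # "naïve café"

# errors="replace" silently substitutes '?' for anything ASCII cannot hold.
lossy = original.encode("ascii", errors="replace")
print(lossy)  # b'na?ve caf?'

# Two different words now produce identical bytes, so no program can
# tell afterwards which one was meant.
with_acute = "caf\u00e9".encode("ascii", errors="replace")  # café
with_grave = "caf\u00e8".encode("ascii", errors="replace")  # cafè
print(with_acute == with_grave)  # True
```

Once only the `?` bytes are on disk, the original characters can only be put back by someone who already knows what the text was supposed to say.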

Programmers seem to love to think that encoding errors are a joke, but they aren't. The data is gone. That's a big deal. Why are you even writing a program in the first place if it's just going to output unrecoverable gibberish? So you can throw the onus on the user to figure it out?

And that's to say nothing of trying to recover the data.




It drives me bonkers. Use UTF-8. Use other encodings only when talking to systems that require them, and only at the point where you actually read or write the data. Translate to UTF-8 at the earliest opportunity, and translate from UTF-8 at the last possible moment, and only if you must.
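A rough Python sketch of that boundary discipline follows. The function names (`to_internal`, `to_utf8`, `to_legacy`) and the cp1252 sample are hypothetical, chosen only for illustration; the assumption is that you know which encoding the other system really uses.

```python
# Illustrative sketch: decode at the edge, keep text internally, emit UTF-8
# by default, and use a legacy encoding only at the final write.

def to_internal(raw: bytes, source_encoding: str) -> str:
    # Earliest opportunity: decode strictly so bad input fails loudly
    # instead of being replaced with gibberish.
    return raw.decode(source_encoding, errors="strict")

def to_utf8(text: str) -> bytes:
    # Default output path: everything leaves as UTF-8.
    return text.encode("utf-8")

def to_legacy(text: str, target_encoding: str) -> bytes:
    # Last possible moment, and only if the other system requires it.
    # Strict mode raises UnicodeEncodeError rather than dropping characters.
    return text.encode(target_encoding, errors="strict")

incoming = b"caf\xe9"                    # "café" as sent by a cp1252 system
text = to_internal(incoming, "cp1252")
stored = to_utf8(text)                   # b'caf\xc3\xa9'
outgoing = to_legacy(text, "cp1252")     # round-trips to the original bytes
```

The strict error handler is the design choice that matters here: a conversion that cannot be done losslessly raises an exception the program has to deal with, rather than quietly writing unrecoverable output for the user to find later.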

This isn't the 90s. This stuff is basically solved now, except people can't be bothered to use the solution.



