

The (Unfortunate) Truth About Unicode In Python  - edw519
http://www.cmlenz.net/archives/2008/07/the-truth-about-unicode-in-python

======
jrockway
Yeah, Unicode is really painful. On one hand, most languages have half-assed
support (Python, Ruby, PHP), which makes it hard to "pick up" good Unicode
habits. On the other hand, programmers using the languages that do have real
support (Perl, Java) don't use it correctly. The result is that any
correctly-displayed international characters in the data those programs
process are merely a coincidence. That's scary.

The problem is that reading UTF-8 from a web browser, then storing it in an
8-bit clean database (which is all of them), and then displaying that binary
data back to the web browser usually works OK. So people don't realize their
app is completely broken until it starts behaving in a weird way, and then
they blame whoever is nearby.
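
To make the "works by coincidence" point concrete, here is a minimal sketch in Python 3. The sample string and variable names are made up for illustration, and it assumes the browser sent UTF-8. Passing the bytes straight through looks fine; the trouble only shows up once the code treats bytes as if they were characters.

```python
# Bytes as they might arrive from a browser form (assumed to be UTF-8).
raw = "na\u00efve caf\u00e9".encode("utf-8")

# Pass-through: store the bytes and echo them back untouched. Looks fine.
stored = raw
roundtrip_ok = stored.decode("utf-8") == "na\u00efve caf\u00e9"

# Anything that treats bytes as characters gives the wrong answer.
byte_len = len(raw)                  # counts bytes
char_len = len(raw.decode("utf-8"))  # counts code points

# Truncating to 3 "characters" cuts a multi-byte sequence in half.
truncated = raw[:3]
try:
    truncated.decode("utf-8")
    truncation_ok = True
except UnicodeDecodeError:
    truncation_ok = False
```

The byte count and the character count disagree, and the truncated value is no longer valid UTF-8, which is the kind of "weird" behavior that shows up long after the app appeared to work.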

Anyway, I wish all the major languages would make misusing Unicode a fatal
error. That way things could never appear to work. (In Perl, you can do that
with "use encoding::warnings 'FATAL'", but nobody ever uses that... they just
cross their fingers and hope for the best.)

The really depressing part is that every time I answer someone's Unicode
question on a mailing list, 10 other people ask the same question a week or
two later. I'm not sure I understand why. It doesn't
happen with other classes of questions. I guess people don't realize that
other people could possibly have had the same problem, and don't Google it?
Dunno.

</rant>

~~~
gruseom
I haven't learned Unicode properly - but I will need to. What do you
recommend? Is there a good book you know of, or some other way to go?

~~~
jauco
<http://www.joelonsoftware.com/articles/Unicode.html>

By the way, Python 3000 will force every coder to at least make a conscious
decision about the character encoding and will have no non-Unicode strings
anymore.
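
A rough sketch of what that conscious decision looks like in Python 3 as it was eventually released (the sample string and variable names are just for illustration): text is `str`, raw data is `bytes`, and mixing the two raises an error instead of silently guessing an encoding.

```python
text = "caf\u00e9"          # str: a sequence of Unicode code points
data = text.encode("utf-8")  # bytes: the encoding is named explicitly

# Mixing text and bytes is an error, not an implicit conversion.
try:
    combined = text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

# Going back to text also requires naming the encoding.
decoded = data.decode("utf-8")

# The same text yields different bytes under a different encoding.
latin1 = text.encode("latin-1")
```

The point of the design is that the failure happens at the line where the types get mixed, rather than much later when mangled characters show up in output.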

[update] Speaking of the devil: [http://sayspy.blogspot.com/2008/07/trying-to-make-sense-of-u...](http://sayspy.blogspot.com/2008/07/trying-to-make-sense-of-unicode-terms.html)

