> *Statements like "first 256 code points in Unicode map to Latin-1" make li...

IgorPartola · on Feb 15, 2011

My point is that in Python land it is silly to call u"" strings "Unicode" strings. A Unicode string is a string in UTF-8/16/32 or one of a bunch of lesser-used encodings. For that matter, "" could be used as a Unicode string as long as it only holds ASCII. What the docs should be talking about is ASCII vs. UTF-16, not ASCII/Latin-1 vs. Unicode. This starts making a difference when someone asks "How much memory is consumed by this string?" or "What characters can I not store in Python?" In this light, Python 3 makes a big improvement: it has immutable byte arrays and it has encoded strings.
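
To make the memory question concrete, here is a rough Python 3 sketch. It is not from the docs; `encoded_sizes` and the sample string are made up for illustration. It shows that the same text takes a different number of bytes, or cannot be stored at all, depending on which encoding you pick:

```python
def encoded_sizes(text, codecs=("ascii", "latin-1", "utf-8", "utf-16-le", "utf-32-le")):
    """Return {codec: byte length}, or None where the codec cannot represent the text.

    Hypothetical helper for illustration only.
    """
    sizes = {}
    for codec in codecs:
        try:
            sizes[codec] = len(text.encode(codec))
        except UnicodeEncodeError:
            # The codec has no way to represent at least one character.
            sizes[codec] = None
    return sizes


# Seven code points: n, a, i-with-diaeresis, v, e, space, euro sign.
sample = "na\u00efve \u20ac"

print(len(sample))            # 7 (code points, not bytes)
print(encoded_sizes(sample))
# {'ascii': None, 'latin-1': None, 'utf-8': 10, 'utf-16-le': 14, 'utf-32-le': 28}
```

I used the `-le` codec names so the counts do not include a byte order mark; plain `"utf-16"` and `"utf-32"` prepend one when encoding. Note that these are sizes of the encoded bytes, which says nothing definite about how the interpreter stores the string internally.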