> Statements like "first 256 code points in Unicode map to Latin-1" make little sense.
That's not true. Latin-1 is both a character set and an encoding, so "the first 256 Unicode code points map to the corresponding Latin-1 characters" is a reasonable statement.
You can also say "the first 128 Unicode code points, when encoded in UTF-8, are byte-for-byte identical to the corresponding Latin-1 (and ASCII) encoding".
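Both statements can be checked directly in Python 3; a quick sketch (standard library only):

```python
# The first 256 Unicode code points map one-to-one onto Latin-1:
# decoding every possible byte value as Latin-1 yields code points 0..255.
assert bytes(range(256)).decode("latin-1") == "".join(chr(i) for i in range(256))

# For the first 128 code points, the UTF-8 encoding is byte-for-byte
# identical to the Latin-1 (and ASCII) encodings of the same text.
s = "".join(chr(i) for i in range(128))
assert s.encode("utf-8") == s.encode("latin-1") == s.encode("ascii")
```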
My point is that in Python land it is silly to call u"" strings "Unicode" strings. Unicode strings are strings in UTF-8/16/32 or one of the lesser-used encodings; for that matter, "" could serve as a Unicode string as long as it contains only ASCII. What the docs should be talking about is ASCII vs UTF-16, not ASCII/Latin-1 vs Unicode. This starts to matter when questions like "How much memory is consumed by this string?" or "What characters can I not store in Python?" are asked. In this light, Python 3 makes a big improvement: it has immutable byte arrays and it has encoded strings.
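To illustrate the distinction (and the memory question): in Python 3, bytes is the immutable byte array and str is the text type, whose per-character storage in CPython 3.3+ depends on the widest code point present (PEP 393). A sketch, assuming CPython since the exact sizes are implementation details:

```python
import sys

text = "naïve"                 # str: a sequence of Unicode code points
data = text.encode("utf-8")    # bytes: an immutable byte array
assert isinstance(data, bytes)
assert data.decode("utf-8") == text

# "How much memory does this string consume?" depends on the widest
# code point it contains (CPython's flexible string storage, PEP 393):
ascii_only = "a" * 100         # 1 byte per code point internally
wider      = "\u4e2d" * 100    # 2 bytes per code point internally
assert sys.getsizeof(wider) > sys.getsizeof(ascii_only)
```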