
A Python internals adventure (2014) - luu
https://flowerhack.dreamwidth.org/3594.html
======
mar77i
Python 3 does all of this Unicode checking in order to guarantee Unicode
support. I actually disliked the old half-string, half-binary approach, and
almost from the start I enjoyed the clear distinction between str and bytes
in Python 3.

That being said, the strings/bytes cleanup was also one of the few things that
really broke backward compatibility with 2.x.

------
tbodt
The convention with the Python C API is to return a non-NULL pointer to a
Python object on success, and to return NULL and set the global exception
variable on error. Yes, global variables are also alive and well.

    PyObject *fout = _PySys_GetObjectId(&PyId_stdout);
    stdout_encoding = _PyObject_GetAttrId(fout, &PyId_encoding);

The Python equivalent of this is `sys.stdout.encoding`. The StringIO object
was constructed without an encoding, so this is None.
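
To see this from the Python side, here is a minimal sketch. It assumes stdout
was swapped for an `io.StringIO`, as described above; the names `buffer` and
`encoding_inside` are made up for the example, and `redirect_stdout` is just
one convenient way to do the swap.

```python
import io
import sys
from contextlib import redirect_stdout

# Hypothetical stand-in for the StringIO that replaced stdout.
buffer = io.StringIO()

# While redirected, sys.stdout is the StringIO, which has no encoding.
with redirect_stdout(buffer):
    encoding_inside = sys.stdout.encoding

# Outside the block, sys.stdout is restored to whatever it was before.
print(repr(encoding_inside))
```

A StringIO holds text in memory and never encodes to bytes, so there is no
encoding for it to report.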

    stdout_encoding_str = PyUnicode_AsUTF8(stdout_encoding);

This tries to convert None to a C string, which fails.
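
The failure can be reproduced without writing any C, as a rough sketch. This
assumes CPython with `PyUnicode_AsUTF8` reachable through `ctypes.pythonapi`;
the names `as_utf8`, `ok` and `error` are made up for the example. ctypes
checks the exception state after each `pythonapi` call, so the
"return NULL and set an exception" convention shows up as an ordinary Python
exception.

```python
import ctypes

# Illustration only: call the C API function through ctypes (CPython assumed).
as_utf8 = ctypes.pythonapi.PyUnicode_AsUTF8
as_utf8.argtypes = [ctypes.py_object]  # pass the Python object itself
as_utf8.restype = ctypes.c_char_p      # C string result, copied into bytes

# A real str converts fine.
ok = as_utf8("utf-8")

# None is not a str: the function returns NULL with an exception set,
# which ctypes re-raises on the Python side.
try:
    as_utf8(None)
    error = None
except TypeError as exc:
    error = exc

print(ok, type(error).__name__)
```

In C there is no such automatic check, so code that uses the returned pointer
without testing it for NULL carries on with a bad pointer.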

------
jwilk
The bug has been fixed since then:

[https://bugs.python.org/issue8256](https://bugs.python.org/issue8256)

