Every Python program should be tested with Emoji characters, they're a real tort...

slavik81 · on Jan 14, 2020

Note that you need to test on every platform, as the default file encoding may vary. I missed that bug in part because it worked correctly on Linux.
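A minimal sketch of the kind of check this implies, assuming a hypothetical helper `roundtrip` (not from the thread) that writes text to a temporary file and reads it back. With an explicit `encoding="utf-8"` the result is the same on every platform; `cp1252` stands in here for a legacy Windows default and is only an illustrative choice.

```python
import locale
import os
import tempfile

# Sample text: one accented letter plus one emoji (U+1F600).
SAMPLE = "caf\u00e9 \U0001F600"


def roundtrip(text, encoding):
    """Hypothetical helper: write text to a temp file, then read it back."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(text)
        with open(path, "r", encoding=encoding) as f:
            return f.read()
    finally:
        os.remove(path)


# The default that open() would use when no encoding is passed; it varies by platform.
print("platform default:", locale.getpreferredencoding(False))

# An explicit UTF-8 round trip is platform independent.
assert roundtrip(SAMPLE, "utf-8") == SAMPLE

# A legacy single-byte code page cannot represent the emoji at all.
try:
    roundtrip(SAMPLE, "cp1252")
except UnicodeEncodeError as exc:
    print("cp1252 failed:", exc.reason)
```

Passing `encoding=` explicitly to `open()` is what removes the platform dependence; omitting it is what lets the bug hide on a machine whose default happens to be UTF-8.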

mark-r · on Jan 14, 2020

Good point. I do almost all of my Python on Windows where it's much easier to get an error.

WorldMaker · on Jan 14, 2020

Every program in general should be tested with Emoji characters at this point.

mark-r · on Jan 14, 2020

Not a bad idea, but I think Python is more likely to have hidden bugs that this will uncover. A language that accepts bytes as input and emits the same bytes on output will probably work fine on UTF-8, for example.

WorldMaker · on Jan 14, 2020

That's the Python 2 mentality, and a large part of this discussion was that, in hindsight, it didn't work: you can't just be "encoding oblivious", but the problem usually doesn't show up in an obvious way until you least expect it. Our input and output devices aren't always homogeneous in byte encoding (and quite possibly very rarely are; we have decades and decades of kludges around this), and testing every program with Emoji has become one of my favorite pastimes for finding failure cases.
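One way an "encoding oblivious" byte pipeline can go wrong, as a rough sketch: pure passthrough of UTF-8 bytes is fine, but any step that cuts or counts by bytes can split a multi-byte character. The 5-byte limit below is an arbitrary value picked for illustration, not something from the discussion.

```python
EMOJI = "\U0001F600"            # one code point, four bytes in UTF-8
text = "ok " + EMOJI
data = text.encode("utf-8")

# Untouched bytes survive a round trip.
assert data.decode("utf-8") == text

# Character count and byte count disagree as soon as an emoji is present.
assert len(text) == 4
assert len(data) == 7

# Truncating at a byte boundary (an arbitrary 5-byte limit here) cuts the
# emoji in half, and the result is no longer valid UTF-8.
truncated = data[:5]
try:
    truncated.decode("utf-8")
    decoded_ok = True
except UnicodeDecodeError:
    decoded_ok = False

assert decoded_ok is False
```

ASCII-only test data would pass every one of these steps, which is why the emoji case is the one that exposes the failure.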