
A source-transforming module loader for Python - aaronchall
http://gregoryszorc.com/blog/2017/03/13/from-__past__-import-bytes_literals/
======
x1798DE
I don't understand why it was less tedious to write a source-transforming
module than to just make the static code changes _once_ and be explicit about
it. `b''` is supported in Python 2.7, so the loader is just redundant. I think
people will be confused if they look at a nominally Python 3 codebase and find
that string literals are being encoded as bytes.

It's certainly a clever hack, but I think you're more likely to introduce
"subtle bugs" when you are doing something decidedly non-standard
_implicitly_.

~~~
klibertp
When you change the source code, you have to take care to test on both Python
2 and 3. With this solution, the transformation (I think) only happens on
Python 3, so there is no way to introduce bugs in Python 2 at all. Which is an
important thing if your code base is big enough.

While explicit is better than implicit, practicality still beats purity.

EDIT: yup, only happens on Py3:
[https://www.mercurial-scm.org/repo/hg/rev/1c22400db72d#l2.10](https://www.mercurial-scm.org/repo/hg/rev/1c22400db72d#l2.10)

~~~
x1798DE
But that makes things worse, because you have two different versions, one of
which you can't see directly. b'' is the same as '' in Python 2 in the same
way u'' is the same as '' in Python 3.

Since this transformation _only_ has any effect on Python 3, it doesn't matter
if you apply it in both 2 and 3.
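
A quick way to see the prefix behavior described above is to compare literal types. This is only a sketch meant to be run on Python 3; the variable names are made up for illustration, and the Python 2.7 behavior is noted in a comment rather than executed.

```python
# Python 3: the u'' prefix is a no-op, while b'' yields a different type.
text_plain = 'abc'
text_u = u'abc'
raw = b'abc'

same_text_type = type(text_plain) is type(text_u)   # both are str
bytes_differs = type(raw) is not type(text_plain)   # bytes vs. str

# On Python 2.7 the roles are swapped: 'abc' and b'abc' are both the
# byte-string type (str), and u'abc' is the separate unicode type.
print(same_text_type, bytes_differs)
```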

~~~
klibertp
By "this transformation" you mean manually going through the whole hg code
base and inserting "b" characters before most string literals.

Even if you are 100% sure that semantically `b""` and `""` in 2.7 (let's
forget about <2.7 for a moment) don't differ, you cannot be sure that the
transformation itself - i.e., editing the source files - won't break
something. You have two ways of doing this: either manually, where you risk
human error, or by writing a tool, where you risk missing certain cases. In
both cases, your perfectly good Py2 code base breaks for no good reason.
Moreover, even if you create a tool for applying the transformation (actually,
it looks like even this loader could be used for this) to the source, once you
run it and commit (imagine the commit diff and its review process!), you can't
run it again if you discover your tool missed something (so iterating and
improving the tool becomes harder).

With the solution described in the article, you have a 100% guarantee that
nothing breaks on Py2, and you can simultaneously keep improving the tool.
Once the hg developers are satisfied with the tool's performance and the Py3
porting is more advanced, they can use it to convert the source files. IMHO
that's a really sane strategy here.

------
tetraodonpuffer
This is pretty cool, but couldn't this have been leveraged to be just an
intermediate step of the actual source-level substitution?

i.e. you run this tool on the codebase, cache somewhere the post-tokenized-
and-rewritten output, then run a separate tool that does the '' -> b'' change
in the source code, rerun the tokenizer without the translation and verify the
two outputs are identical?

This way one would not need to do this at runtime every time, and you would
have a guarantee that your source code text changes are correct (because you
compare the tokenized output between the '' and the b'' versions of the code).
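
The verification step suggested here could look roughly like the sketch below, which uses the standard `tokenize` module. It is not the loader from the article: `token_stream` is a hypothetical helper, and the rule "prefix every string literal that starts directly with a quote" is a simplifying assumption (a real tool would likely need exceptions, e.g. for docstrings).

```python
import io
import tokenize


def token_stream(source, add_b_prefix=False):
    """Return the (type, string) pairs of the tokens in `source`.

    With add_b_prefix=True, every string literal that has no prefix at
    all (it starts directly with a quote) is emitted as a b'' literal.
    Positions are dropped on purpose, because adding a prefix shifts
    the columns of everything after it on the same line.
    """
    pairs = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        text = tok.string
        if (add_b_prefix and tok.type == tokenize.STRING
                and text[0] in ('"', "'")):
            text = 'b' + text
        pairs.append((tok.type, text))
    return pairs


# Hypothetical "before" and "after" versions of the same file.
original = "x = 'a'\ny = b'c'\n"
edited = "x = b'a'\ny = b'c'\n"

# The edit is accepted only if rewriting the original at the token level
# gives exactly the tokens of the hand-edited (or tool-edited) source.
edit_is_faithful = (token_stream(original, add_b_prefix=True)
                    == token_stream(edited))
print(edit_is_faithful)
```

Comparing only token types and strings ignores whitespace and column changes, so the check flags a source edit only when it differs from what the runtime rewrite would have produced.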

------
sevensor
Seems like a perfectly reasonable approach to hoisting Hg into the future. A
big ship turns slowly.

------
falsedan
Wow, I thought I'd seen the last of source filters in 2000 & Perl.

