
Python 3: Ten years later - speedplane
https://archive.fosdem.org/2018/schedule/event/python3/
======
the_mitsuhiko
I think Python 3 primarily showed that breaking backwards compatibility
across an entire ecosystem is worse than one could have predicted. Even today
Python 2 is far from dead.

On the other hand the core development team "outsourced" many of the issues
that people actually want to have solved to the community like packaging and
distribution.

Python's biggest downfall is a highly academic approach with slow release
cycles which makes large feature development hard and detached from the real
world. Python 3's unicode model is completely outdated but was supposed to be
the big feature of the version. While that was happening however the rest of
the world started moving over to UTF-8. There was no feedback loop that could
have enabled Python to be UTF-8 based. It was released and then decisions were
locked in.

Similar things happened in asyncio. New versions of it only land with every
major version of the language which meant that a lot of asyncio only became
really useful with 3.6 and 3.7 which was years after it was initially created.

I have very high hopes that the new leadership of the language will be able to
tackle some of these problems. Python 3 is way less backwards incompatible to
2.7 now than it was 10 years ago. The two versions moved closer to each other
and today it's pretty easy to support both from a single codebase. With faster
iteration we might have been able to cut this time short by 5 years or more.
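
To illustrate what supporting both versions from a single codebase tends to look like, here is a minimal sketch of a text/bytes shim. The names `PY2`, `text_type` and `to_text` are hypothetical and not taken from any particular library; real projects often rely on a compatibility package instead.

```python
import sys

# Hypothetical compatibility shim; these names are illustrative only.
PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
else:
    text_type = str


def to_text(value, encoding="utf-8"):
    """Return value as a text string on both Python 2 and Python 3."""
    if isinstance(value, text_type):
        return value
    # Anything else is assumed to be a byte string that needs decoding.
    return value.decode(encoding)
```

The idea is that the rest of the code only ever calls `to_text()` at its input boundaries and never checks the interpreter version itself.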

~~~
avian
> Python 3's unicode model is completely outdated but was supposed to be the
> big feature of the version. While that was happening however the rest of the
> world started moving over to UTF-8.

I’m not sure what you mean. Are you suggesting that Python use UTF-8 encoded
strings internally instead of decoded Unicode codepoints like it does
currently?

I’ve worked on Perl that uses that approach and it’s an absolute nightmare.
All string operations suddenly must be aware of details of UTF encoding, which
complicates even simple things immensely. You get hard-to-find bugs when some
code somewhere introduces invalid UTF to a string. The only real benefit you
get is slightly smaller memory footprint when dealing mostly with English
text.

I much prefer Python’s decode-process-encode approach.
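
A small sketch of the difference, assuming UTF-8 as the byte encoding: the same operation is safe on the decoded string and breaks on the encoded bytes. The helper name `reversed_bytes_decodable` is made up for the example.

```python
text = "na\u00efve"            # "naive" with a diaeresis on the i: five code points
data = text.encode("utf-8")    # six bytes, because U+00EF needs two bytes in UTF-8


def reversed_bytes_decodable(raw):
    """Report whether the byte-reversed input is still valid UTF-8."""
    try:
        raw[::-1].decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# Indexing and slicing the str works per code point.
third_char = text[2]           # the whole accented character
reversed_text = text[::-1]     # still a valid string

# Indexing the bytes yields half of a multi-byte sequence.
third_byte = data[2]           # 0xC3, the lead byte only
```

Reversing `data` puts the continuation byte before its lead byte, so `reversed_bytes_decodable(data)` reports that the result no longer decodes.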

~~~
rurban
The python nightmare is purely performance due to its use of the UCS-2 or 4
encoding (wchar_t without surrogate pairs). UTF-8 was ruled out for its
inability for direct indexing, but this proved wrong. 4x longer string buffers
trash the cache much more than the needed CPU cycles for utf-8 decoding.
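
For anyone who wants to check the buffer-size point on their own interpreter: since PEP 393 (CPython 3.3), CPython picks 1, 2, or 4 bytes per code point for each string based on its widest character, so the 4x cost applies to strings that contain characters outside the Basic Multilingual Plane. A rough sketch, CPython-specific, with `bytes_per_char` as a hypothetical helper:

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per code point for strings made of ch.

    Comparing two lengths of the same string kind cancels the fixed
    object header, leaving only the per-character cost.
    """
    return (sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch)) // n


ascii_cost = bytes_per_char("a")            # ASCII letter
bmp_cost = bytes_per_char("\u20ac")         # euro sign, inside the BMP
astral_cost = bytes_per_char("\U0001f600")  # emoji, outside the BMP
```

Other implementations may store strings differently, so the numbers are only meaningful for CPython.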

perl's problem is that they didn't make room for the original encoding, it's
only latin-1 or utf-8 from everything else. so the context needs to store it,
like IO, process. and for performance they silently lose the utf-8 flag for
128—255 codepoints.

but all this is still better than the libc/POSIX design failure to use ENV
vars, where the encoding is global, not localized. every single string API
needs to do a getenv, and its interpretation is action at a distance.

~~~
avian
> it's only latin-1 or utf-8

There’s no other way, is there? You basically have to have the complete string
library re-implemented for each supported multi-byte encoding.

~~~
rurban
No. Foreign encodings are internally converted to utf-8 via the Encoding
library and IO layers. The string library is only done for latin-1 and utf-8.

The only trouble is that you have to remember the original encoding, if you
need that information. It's stored nowhere. E.g. to write to a file in the
original encoding you have to open it with the correct IO layer. perl6 got
that fixed.

------
guitarbill
Python 3 is worth it just for Unicode handling. As someone using a language
that uses non-ASCII characters, thank you if you're using Python 3 and made
your software more internationally compatible.

~~~
saurik
...or you could just use the unicode type in Python 2; like: all of my Python
2 software handled languages in other character sets correctly (and yes: this
was stress tested often as I have had many many users from China and the
Middle East). There was absolutely nothing wrong with Python 2 and Unicode,
and if anything Python 3 just made a bunch of stuff impossible to do correctly
(in particular encodings for filenames) until years later (I think Python 3.4
or something is when they finally got their Unicode shit together; meanwhile,
Python 2 worked great).

~~~
Latty
Except doing the right thing by default is important, and the reality is that
Python 3 code is way more likely to handle Unicode correctly, which is what we
want.

People constantly complain, but honestly changes like Python 3 have to happen
- the alternative is just another completely new language taking over instead.

~~~
slavik81
I'm less enthused with the defaults than you are. How many people do you think
would correctly handle foreign text in this Python 3 exercise:

Write a Python 3 program that takes two arguments, reads a text file named
"input.txt", replaces all instances of the first argument in the text with the
second argument, and writes the result to "output.txt". For extra credit, add
an option to print to stdout.

My bet is very few people would write that program correctly.

~~~
avian
Ok, I'll bite.

This is the most straightforward way I can think of to implement the program
you described. I think this is more or less what someone would come up with
after glancing at the Python tutorial:

[https://gist.github.com/avian2/e3968bf933212ee561410b15495e6...](https://gist.github.com/avian2/e3968bf933212ee561410b15495e626f)

What is wrong with it? It works as expected with non-ASCII characters in both
command-line arguments and in input.txt (Python 3.5.3 on Linux,
LANG=en_US.UTF-8). Yes, it doesn't work if you have a file that's not
consistent with your locale, but you can't really blame Python for that.

~~~
slavik81
I saved this page from the English Wikipedia as input.txt:
[https://en.wikipedia.org/wiki/Shinz%C5%8D_Abe](https://en.wikipedia.org/wiki/Shinz%C5%8D_Abe)

    
    
        C:\Users\local\tmp>python --version
        Python 3.7.1

        C:\Users\local\tmp>python foo.py Diet Parliament
        Traceback (most recent call last):
          File "foo.py", line 7, in <module>
            text = f.read()
          File "C:\Users\local\AppData\Local\Programs\Python\Python37-32\lib\encodings\cp1252.py", line 23, in decode
            return codecs.charmap_decode(input,self.errors,decoding_table)[0]
        UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 107: character maps to <undefined>
    

Imagine I wasn't a developer who has spent dozens of hours on fixing encoding
problems in Python. Imagine I'm just some person using some software I've been
given and this is what I encounter. Can I actually complete my work? Is this a
good user experience?
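
The failure comes from `open()` falling back to the locale's preferred encoding (cp1252 on this machine) when no `encoding` argument is given. A minimal sketch of the exercise with the encoding passed explicitly follows. It is not the gist linked above; the function name `replace_in_file` and the assumption that the input is UTF-8 are mine.

```python
def replace_in_file(old, new, src="input.txt", dst="output.txt", encoding="utf-8"):
    """Replace every occurrence of old with new, reading src and writing dst.

    The encoding is an explicit parameter (assumed UTF-8 here) instead of
    being inherited from the locale. newline="" leaves line endings untouched.
    """
    with open(src, encoding=encoding, newline="") as f:
        text = f.read()
    result = text.replace(old, new)
    with open(dst, "w", encoding=encoding, newline="") as f:
        f.write(result)
    return result
```

A file that is not actually UTF-8 still raises `UnicodeDecodeError`, but the behaviour is then the same on every platform instead of depending on the user's locale.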

~~~
ubernostrum
The way I usually put it is that Python 3 shifted its priorities.

Previously, if you were a UNIX-y scripter writing UNIX-y scripts on your
UNIX-y OS, the fact that Python just kind of pretended encoding issues would
never exist was a help to you. It adopted the same "everything is ASCII, or at
most UTF-8 in the ASCII range, and if it isn't I'll break in cryptic ways"
approach as most other UNIX-y scripting things.

If you were doing anything other than UNIX-y scripting on your UNIX-y OS, this
easily became a huge nightmare in Python 2. Django went through a massive
rewrite early in its history precisely because of this, to ensure that
encoding/decoding happened at the boundaries and everything you'd work with
inside a Django app was already a Unicode string. And I remember what it was
like trying to work on the web before that approach, and what the work to fix
it was like.

Python 3 decided to make the UNIX-y scripters actually learn what a horrid
mess UNIX is with respect to locales and encodings and filesystem paths, in
order to free the rest of the Python community from the nightmares inflicted
by prioritizing the UNIX-y scripters to the exclusion of everyone else. So
yes, you have to do more work. Yes, you have to learn that a file path is
actually an opaque bag of bytes that may not be in any actual encoding and
thus can never properly decode to a string. Yes, you have to learn to use
fsencode() and the surrogateescape handler in order not to blow up your
scripts.

But I'm OK with that, because it puts the workload on you when you're using
such a system, rather than magically trying to fix it for you at the cost of
everyone else's sanity. It also means that you have to learn to write those
scripts correctly. Which is more work than what Python 2 required, but not the
world-ending apocalyptic horror it's usually presented as (and is, again,
mostly the fault of UNIX-y systems doing their old UNIX-y things, not the
fault of Python).

~~~
slavik81
So, write the program correctly. Show me.

> Python 3 decided to make the UNIX-y scripters actually learn what a horrid
> mess UNIX is with respect to locales and encodings and filesystem paths

Show me how much Python 3 improves this. To expand on the program before, make
it a directory named 'input' full of files, and a directory named 'output' to
put the processed files in. Print each file name as the corresponding file is
processed to indicate progress.

I would be applauding Python if it did make the difficulties with this
exercise obvious, but it absolutely does not. The file system APIs return
strings, but the strings they return may not be valid Unicode. PEP 383 turned
the Python 3 str type into a bag of bytes.

Python tries to sweep encodings under the rug. It makes the encoding a default
value all over the place and hides conversions everywhere.

I 100% agree that developers need to think about encodings and handle them in
their programs. That's exactly why I hate string handling in Python 3: because
rather than making you handle the corner cases, it pretends they don't exist,
until they're found one by one by your users.

Python 3 encourages developers to write broken string handling code.
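
To make the corner case concrete, here is a rough sketch of the directory version. It assumes a UTF-8 filesystem encoding and UTF-8 file contents; `display_name` and `process_directory` are hypothetical names, not standard library functions.

```python
import os


def display_name(name):
    """Return a printable form of a file name from os.listdir().

    Bytes that could not be decoded arrive as lone surrogates (PEP 383);
    they are turned back into bytes and shown as \\xNN escapes.
    """
    raw = name.encode("utf-8", errors="surrogateescape")
    return raw.decode("utf-8", errors="backslashreplace")


def process_directory(old, new, src_dir="input", dst_dir="output", encoding="utf-8"):
    """Copy each regular file from src_dir to dst_dir, replacing old with new."""
    os.makedirs(dst_dir, exist_ok=True)
    processed = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        if not os.path.isfile(src):
            continue
        shown = display_name(name)
        # Printing can still fail if stdout cannot encode the name.
        print(shown)
        with open(src, encoding=encoding, newline="") as f:
            text = f.read()
        with open(os.path.join(dst_dir, name), "w", encoding=encoding, newline="") as f:
            f.write(text.replace(old, new))
        processed.append(shown)
    return processed
```

Printing `name` directly is the step that tends to blow up, since a string carrying escaped bytes cannot be encoded strictly.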

~~~
ubernostrum
_So, write the program correctly. Show me._

You just want to fight someone because you're angry, and I don't do that. No
matter what someone writes you'll find a way to argue into it being wrong and
then prance around declaring "victory".

(I'd also bet that you probably couldn't do it if _I_ were the one who got to
set the evaluation criteria, and you also couldn't pass other "challenges"
like writing proper HTTP handling -- the person who gets to grade the
challenge always "wins", which is why you want to be the person who grades the
challenge)

 _PEP 383 turned the Python 3 str type into a bag of bytes_

PEP 383 provided a way to read certain things -- primarily filesystem paths
which can be basically _anything_ -- using an escape mechanism to replace
non-decodable bytes with surrogates when decoding to string, which in turn
allow losslessly transforming back to the original bag of bytes.

Which is necessary, because there are real filesystems out there that really
have paths and names that can never validly decode from any known text
encoding. It doesn't turn strings into "bags of bytes"; the resulting str
still is an iterable of actual valid Unicode code points.
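
A minimal sketch of that round trip, using the `surrogateescape` error handler directly on a made-up file name (the bytes and the helper `strict_encode_fails` are only for illustration):

```python
# Latin-1 bytes for an accented name; 0xE9 on its own is not valid UTF-8.
raw = b"caf\xe9.txt"

# Decoding maps the undecodable byte 0xE9 to the lone surrogate U+DCE9.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
restored = name.encode("utf-8", errors="surrogateescape")


def strict_encode_fails(s):
    """Report whether s is rejected by a strict UTF-8 encode."""
    try:
        s.encode("utf-8")
    except UnicodeEncodeError:
        return True
    return False
```

The escaped string survives normal `str` operations, but a strict encode refuses it, which is where code that prints or logs such names usually notices the problem.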

 _Python tries to sweep encodings under the rug._

As the saying goes, you can't reason someone out of a position they didn't
reason their way into, so I won't try here.

 _Python 3 encourages developers to write broken string handling code._

Python 3 no longer tries to cover for the random gibberish that's legal to put
in filesystem paths, and makes the developer handle it. Are there lots of
developers out there who don't _realize_ that filesystem paths can legally
contain undecodable garbage? Sure. That's not Python's problem to solve,
though; it gives you the surrogateescape handler, and the fsencode helper, and
keeps working on things like PEP 538 and PEP 540 to try to give you tools to
work around it. But Python can't magically fix the mess that is UNIX locales
and bag-of-bytes paths (nothing can, short of burning UNIX down and starting
over), and doesn't try to do it for you.

~~~
slavik81
You're right that I'm frustrated. I'm not even upset at anyone in this thread,
but from previous discussions. I apologise for carrying that baggage here.
Having my morality questioned when I legitimately want to make programs work
correctly for foreign languages has left me... emotional on this subject.

Unix has its own problems, but the program avian wrote works correctly on
Linux. Most encoding issues I encountered when working with Python 3 were on
Windows.

PEP 383 was making the best of a bad situation without breaking the API again.
The real mistake was having the functions return strings in 3.0. The operating
system APIs should have returned path objects that require an explicit
conversion to string with an explicit error handling mechanism.

Python gives you all the tools you need to do this right, but they're easy to
unknowingly use in ways that break on corner cases. A well-defined API should
guide you towards the correct solution and should make pitfalls obvious.

In any case, I should probably give it a rest. I work hard to make sure my
programs do this stuff right, and I suppose that's all I can really do.

------
iagooar
As an outsider who only uses Python when I need a specific tool that's only
available through pip: it's still one of my biggest nightmares.

Every single time I need to install something with pip on MacOS there is this
hour-long struggle with figuring out how to get the correct version of Python
(2.7 vs 3) working with the correct version of pip working with the correct
version of the library itself.

I don't know, I guess people who use Python regularly have figured it out, or
maybe it's just a MacOS thing. For me, it's just a major PITA even 10 years
after release.

~~~
Al-Khwarizmi
I use Ubuntu and it's the exact same thing for me, so I don't think it's
MacOS-specific. To be honest, the whole versioning quagmire is the main reason
why I don't use Python as one of my main languages, but just as an outsider
like you. I like the language but I hate wasting my time with that kind of
thing. Say what you will about Java, but you can just slap a jar file into a
folder and it will work with the latest version, with zero drama.

~~~
mehrdadn
I feel like the fundamental issue is that it simply does not make sense for
pip to be able to install or upgrade any Python packages managed by the system
package manager, any more than it makes sense for it to upgrade the system-
level Python interpreter itself. It should just outright not have the ability
to do this.

------
bakery2k
No, IMO. Don't get me wrong, Python 3 is a better language than Python 2. I
just don't think it's _that_ much better that it was worth breaking backwards
compatibility and stalling the language for several years.

Python 3 did enough to break everyone's existing code, but nowhere near enough
to make that worthwhile.

It's common for other language communities to consider the Python 2/3
transition as an example of "what not to do".

~~~
ravenstine
How hard is it to simply keep one's code up to date? I'm pretty sure there
weren't any radically different paradigms introduced in Python 3.

Somehow the world of JavaScript has been able to quickly adapt to a rapidly
changing language and the community is hardly fractured along the lines of
language. Adopting new language features and browser APIs is just a matter of
fact. What's different about Python?

~~~
h1d
JavaScript has been stuck with an ancient version because of browser adoption
like IE6.

There's a good reason people wanted to migrate to newer JS and I never heard
people say ES6 got something wrong over ES5.

------
dang
Please don't rewrite titles, especially to make them more baity. This is in
the site guidelines: "Please use the original title, unless it is misleading
or linkbait; don't editorialize."

Titles are by far the strongest influence on discussion, so this rule is much
more important than it seems.

[https://news.ycombinator.com/newsguidelines.html](https://news.ycombinator.com/newsguidelines.html)

(Submitted title was "Python 3 Is Now 10 Years Old Was It Worth It?".)

~~~
starpilot
In the submission form, have the title field auto populated from the URL.

~~~
dang
You don't have to look at many URLs to see how hard that would be.

Not to mention "unless it is misleading or linkbait", which is not something code
can take care of.

------
kmarc
Just one word: typing [1].

Makes life a lot easier.

[1]:
[https://docs.python.org/3/library/typing.html](https://docs.python.org/3/library/typing.html)

~~~
Rotareti
Two words: typing, mypy [1]

The editor support that I get from typing/mypy is fantastic! It feels like
writing in a statically typed language. To me this is by far the most
important feature of Python3. It's a godsend for larger/complicated code
bases.

[1]: [https://github.com/python/mypy](https://github.com/python/mypy)
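
For readers who have not seen it, a tiny sketch of what annotated code looks like. The function `lookup_port` is invented for the example; the comments describe what a checker such as mypy would be expected to flag, not its exact output.

```python
from typing import Dict, Optional


def lookup_port(services: Dict[str, int], name: str) -> Optional[int]:
    """Return the port for a service name, or None if it is unknown."""
    return services.get(name)


port = lookup_port({"http": 80}, "http")

# A type checker would be expected to complain about `port + 1` here,
# because port may be None. Narrowing with an explicit check satisfies it.
next_port = port + 1 if port is not None else None
```

The annotations are ignored at runtime, so the same file runs unchanged whether or not a checker is installed.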

~~~
kmarc
Yes, you are right. I'm also using mypy along with pylint, yapf and isort.
Helps keep my Python code clean.

------
j1elo
I think the conversion would have been faster, especially for people who don't
care, if the main distributions collaborated and stopped providing Python 2.x;
99% of software writers would just check out how to write their code in Python
3, or move their code to that version. The other 1% would be people who "knew
what they're doing" and custom install whatever version they needed.

Imagine if Ubuntu 14.04 (Trusty) didn't bring Python 2. All devs of relevant
stuff would have received a shower of upgrade requests. Maybe 2014 was too
early to force this, but I'm sure in 2018 it could have been done with Ubuntu
18.04.

I know this looks like work for the sake of upgrading, without any real
benefit for most projects. But I guess it's not wrong either; you just
couldn't expect that Qt 3.3 (circa 2004) was distributed 10 years later in
Ubuntu 14.04 (Qt is just a library, not a language, but it practically
transforms the way you write C++ so it's a valid comparison IMO); however here
we are in 2018, 10 years after the release of Python 3, and Python 2 is still
the default interpreter in Ubuntu 18.04.

------
randomsearch
Nope.

But it survived, which is the important thing. I thought 3 would kill Python,
but it has survived and that is a very good thing.

Anecdotally, non-nerds find Python easier to read than other popular
languages. It’s become the go-to language for a bunch of domains that reach
outside core nerding (data analysis, education, DSL like stuff) and I predict
Python will therefore be widely used in 2050, and probably be popular aged
100.

------
olliej
It's a good example of why JS hasn't had changes that break existing code, so
that's good

------
MR4D
Python 3 is the open source equivalent of Vista - too much potential, but too
much time to realize it, and the world doesn’t slow down in the meantime.

I love python. For me, it’s damn near perfect in many ways. But it’s looking
more and more like Perl 6 than what it should have been.

The sooner we get to Python 4 (whatever that ends up being), the better, as we
need to move past the v2 or v3 question for developers and users.

------
bespoken
Python almost became my main language, but I moved to another language when I
saw it was stuck at 2.7. Another reason was lambdas only allowing a single expression
instead of the flexible closures in so many other languages. And my
expectations for future improvements were very low as the core maintainer
looked quite stubborn to me.
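
For context, the limitation is that a Python `lambda` body is one expression. The usual workaround is a nested `def`, which is a full closure. A short sketch with made-up names:

```python
def make_counter(step=1):
    """Return a closure that keeps a running total across calls."""
    total = 0

    def bump():
        # Several statements are fine here, unlike in a lambda body.
        nonlocal total
        total += step
        return total

    return bump


# A lambda works only while the body stays a single expression.
double = lambda x: x * 2
```

Each call to `make_counter()` returns an independent counter, so the nested function covers what multi-line closures do in other languages, at the cost of needing a name.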

~~~
zmmmmm
Pretty much how I ended up doing most of my coding in Groovy. I essentially
wanted Python. But Python wasn't a good enough Python. (being honest, other
things attracted me to Groovy as well - Java ecosystem etc, but it's
interesting to me that in the larger picture, I'm sort of using Groovy as a
proxy for a Python without Python's flaws).

------
bayesian_horse
Yes. And I guess it could have gone worse. On the other hand, I don't expect
the community in a mood to embark on Python 4 any time soon, though it would
make sense, in terms of version numbers and time passed.

------
dethos
Definitely yes, it took some time and had some costs but nowadays python (3)
is a much better language. Perhaps it could have been done differently,
however it is in the past now.

------
dis-sys
It was worth it, not for the language itself but for teaching everyone a good
lesson - don't destroy backwards compatibility when experimenting with your
incremental ideas.

------
pacifika
Random scaremongers. Just look at pypi

------
cft
I wish they spent the effort on making Python 2 faster and removing GIL
instead.

~~~
monk_e_boy
Why? I teach Python and c++ to A level students. They pick up the basics in a
few weeks. C++ is pretty easy. We push our students through the c++ institute
exam with few issues.

Choose the right tool for the job. If you need speed, choose C.

~~~
Santosh83
C++ is pretty easy!? Wow, that's an opinion you often don't hear. :-)

~~~
Joeboy
As in "pretty easy to introduce catastrophic errors with".

~~~
klipt
If you use the right features of C++17 errors are easier to avoid than ever.

The problem is that this elegant subset of C++ is hidden inside ... the rest
of C++.

I guess that's the counter example to Python's 2/3 breaking change. C++ just
keeps adding new features while retaining all the old ones for backwards
compatibility, even if they're awkward in context of the new language.

~~~
pritambaral
I once read somewhere that C++ would be fine if one just used a sane subset of
C++, but the challenge was agreeing on which subset.

------
speedplane
No.

------
Michaelanjello
There are still idiots on Github using Python 2 for new projects. These are
usually academics who don't actually share any love for technology. Their code
will never be used by anyone.

~~~
upofadown
There has been a pledge by the largest group of Python developers not to mess
around with Python 2. So if you want your code to be able to just run forever
without any concern for all the new and exciting language ideas being loaded
into Python 3 then Python 2 is the logical choice.

~~~
bakery2k
I've heard of this - people considering Python 2.7 as an "LTS" version of the
language, receiving bug fixes but otherwise remaining unchanged for many
years.

I wonder, once 2.7 is EOL in a year's time, whether the language developers
will consider a "3.x LTS" release?

