Hacker News new | past | comments | ask | show | jobs | submit login

why on earth is it a good thing that people are trying to kill off Python2?

Because the community is fragmented and that weakens language adoption, productivity, and enjoyment.

The 2-3 schism in Python has been a pain to deal with for years. I use Python casually here and there, but I'm so sick of trying to do something quickly in Python and finding out that I'm on a machine that only has 2 but the module I need is only for 3 or having the bulk of my app written in 3 but the only version of a module I need is for 2.

Same goes for examples you find on the Internet to do a particular thing. Ooops, it's 2.x syntax/modules so you have to fix it up for 3. Ooops, it's 3 syntax/modules so you have to fix it up for 2.

For the good of any computer language, old versions need to eventually die off.

>For the good of any computer language, old versions need to eventually die off.

I would say instead: for any good computer language, new versions need to retain compatibility with old versions. Every single system I run has python 2.7 (including my brand new Macbook running latest OS X). Luckily I don't need numpy, and I can probably do without python at all if I have to.

Compare to perl: I can run old perl scripts on basically any system without having to worry about it breaking. I do all of my server scripting in perl for this reason.

Perl6 is incompatible with Perl5, so they're also going through a similarly painful transition.

Everyone waxes lyrical about how Python 2.x was "good enough and why would you change it", but there were several things in Python 2.x that were objectively awful (unicode was broken, iterator variables can leak to the outer scope, division of integers producing integers, xrange, raw_input, mixed indentation "working", etc). And while no single issue alone would justify Python 3.x, the combination of all of these issues as well as some other semantic problems justified the need for Python 3.
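A quick sketch, runnable under Python 3, of how a couple of the warts listed above now behave (comments note the old Python 2 behaviour):

```python
# In Python 2, the comprehension variable leaked into the enclosing
# scope: after [x*x for x in range(5)], x was 4. Python 3 gives
# comprehensions their own scope.
x = "outer"
squares = [x * x for x in range(5)]
assert squares == [0, 1, 4, 9, 16]
assert x == "outer"

# In Python 2, range() eagerly built a full list and xrange() was the
# lazy variant; Python 3's range() is lazy by default.
r = range(10**12)          # instant; no giant list is allocated
assert r[5] == 5
assert len(r) == 10**12
```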

Of course, that being said, Python 3 also had some issues that it took several versions to iron out.

I am wondering. Why is division of integers returning an integer an awful thing?


{Describing the old python-2 behavior:}

-------- Quote: -------

The classic division operator makes it hard to write numerical expressions that are supposed to give correct results from arbitrary numerical inputs. For all other operators, one can write down a formula such as x*y**2 + z, and the calculated result will be close to the mathematical result (within the limits of numerical accuracy, of course) for any numerical input type (int, long, float, or complex). But division poses a problem: if the expressions for both arguments happen to have an integral type, it implements floor division rather than true division.


To guarantee the correct mathematical behavior in python2, one would probably have to write:

    def true_math_div(a, b):
        if isinstance(a, (int, long)):
            a = float(a)
        if isinstance(b, (int, long)):
            b = float(b)
        return a / b
as a and b could be int, float, complex, or some object defining the method __div__.

What's wrong about just `float(a)/float(b)`?

not everything that can be divided can be reasonably cast to a float

Also, it's ugly as sin
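For instance (a Python 3 sketch, but the same objection applies to Python 2's types):

```python
from fractions import Fraction

# Fractions divide exactly...
assert Fraction(1, 3) / Fraction(1, 7) == Fraction(7, 3)

# ...but forcing them through float() throws away that exactness:
assert float(Fraction(1, 3)) != Fraction(1, 3)

# and complex numbers cannot be cast to float at all:
ok = False
try:
    float(1 + 2j)
except TypeError:
    ok = True   # "can't convert complex to float"
assert ok
```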

I wouldn't call it awful, but it is slightly annoying. In a dynamically typed language it's hard to know a priori whether a variable is an int or a float with a whole-number value, and you end up writing x/float(y) all over the place just to make sure your code does what you want it to do.

The new case of / always being float division and // always being integer division just makes everything more explicit.
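A small illustration of the Python 3 rules:

```python
# / is always true division, // is always floor division,
# whatever the operand types:
assert 7 / 2 == 3.5
assert 7.0 / 2 == 3.5
assert 7 // 2 == 3
assert 7.0 // 2 == 3.0     # // on floats still floors, returning a float
assert -7 // 2 == -4       # floor division rounds toward negative infinity
```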

This particular language quirk I don't think has anything to do with dynamic typing: it's equally annoying in C-style languages where 3/2 and 3/2.0 mean different things.

sure, but in C if you have a line that looks like

   z = 3/y
you'll know that y is either always a float or always an int depending on its type and thus you'll 'know' what z is.

Because it's a duck that doesn't quack.

Perl 6 is a different language. Last time I checked there was no plan to discontinue Perl 5.

Won't they have a naming/versioning problem?

No. Perl 6 has its own version scheme, and the number at the end of the language name is just a number.

I'm not a particular fan of that - they should have changed the name entirely, IMO, eg. to Rby - but Perl 5 is not getting discontinued.

They have it already. Perl 6 is about 10 years old and has had at least 4 virtual machines. When it was announced, people thought Perl 5 was obsolete. Around the same time came RoR. A few years ago (7 maybe?) some new developers took over Perl 5 and added some really interesting new features. That made some Perl 6 developers unhappy, since they wanted to focus the community around Perl 6. Perl 5 developers believe a completely new language should have a different name; some suggested renaming Perl 5 to Perl 7. Then there was a website, perl11.org, redirecting visitors to scala-lang.org. That was about 3 years ago. Since then - I don't care :) Perl 6 will have to find itself a niche. The one once occupied by Perl 5 - the duct tape of the web - is now taken by JS. So a long transition is better than the remote prospect of an unknown transition.

Good point, I didn't realise Perl6 was in a similar situation. However it seems that adoption of Perl6 is very low, so I suspect Perl5 isn't going to get killed off any time soon.

Very little new stuff is being developed in Perl, period. I don't know anyone working in Perl who is not maintaining existing codebases. I haven't heard Perl proposed for anything new in over a decade. That further reduces the motivation to move to Perl 6.

It depends what you're using it for. Certainly it wouldn't make sense to use it for web stuff. However I use perl exclusively for server-side "shell" scripting, and it excels at that. More powerful than bash, less compatibility worries than python (I've had issues even between 2.* versions). If I have perl5.* on a server my scripts work everywhere. I regularly start new server-side scripting projects using perl, including a recent webpage-to-video conversion batch script. To be honest there is no other platform that I would even consider using for this kind of thing. Is there even anything that comes close to perl's cross-platform, rapid development, stability and general compatibility?

I dunno, on Arch I've seen several times as Perl 5.x updates roll in, Perl scripts will break on stuff. Especially Perl 5.22 and 5.26.

That's because of binary API changes, and only affects Perl modules with C parts in them. Source compatibility has been preserved at least across the entire 5.x line.


The plan for Perl6 is to eventually run Perl5 code with the `use v5` statement. You just don't throw 30 years worth of CPAN into the garbage bin. Meanwhile, Perl5 is being actively developed, new minor versions are released at least once a year, people are actively improving the language via extensions and even the core is evolving slowly but steadily. I don't think it will be abandoned any time soon. Perl6 is being developed separately, mostly by other people.

Perl6 has nothing to do with perl5, it's a completely different language. I would call the transition fatal, not painful.

> I would call the transition fatal, not painful.

It's really sad. In retrospect they certainly should have named it something different. The Perl 5 community could have progressed, maybe even made a Perl 6, while the NGPerl skunkworks project continued independently for 15 years.

I never quite understood why the division of integers resulting in integers was a problem.

I get that for people with no knowledge of programming whatsoever it can be confusing, but it's standard behavior in nearly every typed language.

In C/C++, you divide an Int and you get an Int. If you want floating point division, you divide by a float. Problem Solved.

Almost EVERYTHING else could have been done with slow migration, or simply documenting the odd or quirky behavior.

Iterator variable leaking is just a result of poor programming.

Raw_input vs input could have been solved by slowly deprecating input. Or just left as is.

xrange is the same story.

It all seems like someone just decided to throw their hands up in a fit, throw their toys on the ground, and make a new Python. I enthusiastically jumped at Python 3 in the beginning, then ran into problems, and crawled back to my 2.7 and decided I could let others fight the stupid cult war that was coming over 2 vs 3. I'd rather use a tool that works than change all my code over someone's idea that newer is better.

> I never quite understood why the division of integers resulting in integers was a problem [...] but it's standard behavior in nearly every typed language.

And hence why it's a problem in Python. If you have a function like

    def foo(a, b):
      return a / b
What are the types of a and b? You don't know. Sure, you could check with isinstance, but now you've broken the illusion of duck typing. The argument is that Python should just do "the right thing". Just because 3/4==0 in most languages doesn't justify repeating that mistake. Not to mention that float(a)/float(b) -- aside from being incorrect in the general case -- is ugly as hell.
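To illustrate the point (Python 3; in Python 2 the first call returned 0):

```python
from fractions import Fraction

def foo(a, b):
    return a / b

# One definition "does the right thing" for every numeric type,
# with no isinstance checks and no float() casting:
assert foo(3, 4) == 0.75
assert foo(3.0, 4) == 0.75
assert foo(1 + 2j, 2) == 0.5 + 1j
assert foo(Fraction(3), Fraction(4)) == Fraction(3, 4)
```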

> Iterator variable leaking is just a result of poor programming.

I think it's a language bug. Why? Imagine if you could do this in C:

    for (int a = 1; a <= 3; a++)
        do_something(a);
    printf("%d\n", a);  /* uses a after the loop has ended */
People would think this is clearly a language bug because it breaks the scoping semantics (not to mention common sense). Now, you could argue that Python supports this sort of nonsense so it's "fine":

    # a is not defined yet
    if something:
        a = 1
    print(a)  # NameError if something was falsy
But the reason that works is because otherwise Python would need to have JS-like "var" syntax to declare a variable. Personally I think that'd be better, but the for-loop situation is different.
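Worth noting: Python 3 fixed the leak only for comprehensions; the plain for-loop variable still escapes. A small sketch:

```python
# The for-loop variable is still visible after the loop, even in Python 3:
for i in range(3):
    pass
assert i == 2

# Comprehension variables, by contrast, no longer leak:
_ = [j for j in range(3)]
leaked = True
try:
    j
except NameError:
    leaked = False   # j never escaped the comprehension
assert not leaked
```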

> raw_input vs input could have been solved by slowly deprecating input. Or just left as is. xrange is the same story.

And this is how PHP ended up having three different string-escaping functions for SQL statements. The first one (mysql_escape_string) was broken because it ignored the connection character set, so they added mysql_real_escape_string, and then the mysqli extension brought mysqli_real_escape_string.

Some people still use escape_string even though it's broken -- and the language devs cannot fix it because "that would break users".

Serious question, what was the complaint against xrange?

I don’t recall it ever surprising me or behaving poorly.

The issue is that "xrange" does the right thing, but the seemingly-more-correct (unless you happen to know about "xrange") "range" does the wrong thing (create a list rather than a generator, which is inefficient). Same thing with "raw_input" doing the right thing, but "input" being completely wrong (and an attack vector because it's just a glorified "exec").
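Python 2's input() was essentially eval(raw_input()), which is why it was an attack vector. A sketch of the equivalent behaviour, runnable under Python 3:

```python
import os

# What Python 2's input() effectively did to whatever the user typed:
typed = "__import__('os').getcwd()"   # imagine something nastier
result = eval(typed)                  # arbitrary code execution
assert result == os.getcwd()

# Python 3's input() just returns the raw string, like old raw_input().
```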

And instead of lazily creating the range behind the scenes, so you'd have a generator that reifies when it's modified and has a nice __str__ etc., they went and made incompatible changes; got it! ;)

I know rather too much about how CPython and other VMs work under the hood and am in no mood to try and save the day any more. I still use python, but the latest stuff around MyPy and async io just make me despair frankly. I think rust+go will probably pick up a lot of people who used to care deeply about python. So it is.

Perl is a weird language for you to cite when we're talking about languages that do major transitions well. Perl 6 was created because Perl 5 was an evolutionary dead end, took 15 years, has no adoption, and Perl 5 usage shrunk tremendously in the meantime. Not saying this glibly as I love Perl and I hope Perl 6 sees some adoption, but come on now. Also remember that Perl 5 itself was nearly a complete rewrite from 4.

Exactly my thought regarding the need to evolve as a language. Perl is mostly dead because it didn't keep up with the times and I say that as a person who spent a tremendous amount of time learning and using Perl. The Schwartzian transform was a beautiful thing to behold.

Perl was born to solve the portability problems and limitations of shells (csh, sh, ksh*), awk and sed. People who do not live on the command line cannot appreciate the power of perl.

> The Schwartzian transform was a beautiful thing to behold.

Meh, more like a crutch for systems where you couldn't install List::UtilsBy. (One of my favorite Perl modules of all time.)

There's 17 years between the introduction of the Schwartzian Transform (1994) and List::UtilsBy (2011 according to CPAN).

Huh. I guess it's one of those cases where once you have a thing, you wonder how you could have gone 20 years without it. :) Although, to be fair, I only started working with Perl (beyond one-liners on the shell) in 2012.

There's also no push, whatsoever, to "get with the parrot program". Larry Wall is not holding a gun to Perl 5's head.

Unlike Python 3's relation to Python 2, Perl 6 is genuinely a new language. It would make little sense for the community to discontinue Perl 5 and push for Perl 6.

In retrospect it probably would have been better for the community if they'd given Perl 6 a new name.

Yeah, I really don't get this Python mindset at all. I have out in the wild both Scheme code that's approaching 30 years old, and Perl code approaching 20 years old, running on commercial systems. There's no real need for constant version churn.

Perl example is a bad one given Perl 6 having near zero adoption, and being incompatible with 5. And this is hardly "constant" version churn. This was a major event for Python to fix some long-running issues that needed fixing, but could only be done with an incompatible version.

Over the course of 2014, the "pip" project alone went from 1.4.1 to 6.0.6(!) through 2 minor releases, 1 major release, and 9 point releases, 6 of which were on the same day as another point release (it's called a "release candidate", people). They included regressions such as "segfaults on Windows at every invocation", "freezes if the specific server pypi.org is not reachable", and dropping support for a python version that was less than 2 years old at that point. They also introduced a third package archive format into the ecosystem.

The library ecosystem is the problem, and it's what's driving the language churn.

The python devs learned their lesson and aren’t going to do a major breakage like 2 to 3 again.


It’s a balancing act. Breaking compatibility creates work, unfixed bugs and design flaws create work, which is more work varies. Retaining compatibility can be, in sum, a harmful decision.

Only if you are pretty lucky with what you write.

Back when I cared about Perl, doing "man perldelta" was a nice way to entertain myself.

That's a really poor reason to kill an old version. To be clear, I FEEL YOUR PAIN. However, new isn't legit just because it's new.

As a non-Python pro, I cannot say why one version over another. However, arguing that the old one is bad just because it is inconvenient isn't valid.

As someone who struggles with versions of Python on my Mac and on production servers (and with code that runs on 2.7 and 3.6+), as far as I'm concerned (speaking as a pragmatic, solutions-first person), I cannot discern the difference between ancient Python and new Python.

There are really few languages with such impact as Python. So we cannot blame the Python community for this situation as they probably really labored over their decisions regarding compatibility and versions. But in retrospect, I would have preferred they killed off the old version long ago. It would definitely have made life better for the users in the long run.

I think the argument here is that Py3 is better for many people (Unicode-by-default was generally the reason, but these days it's also the numerous language improvements, e.g. async/await). For those who find themselves on the fence with no particular personal reason to go either way, though, going where the others are is a legitimate way to choose.

What we often forget are the large numbers of people who just use Python (or any other tech tool) as a means to an end.

In this case I'm included. Python is not my first, second, or third love. But it is the most available and the simplest tool available to glue things together. CSVs, json, web apis (private and commercial), etc., are all so easy to do with Python.

So my guess is that there are a lot of users who may not even realize the benefits that Python3.x gives vs 2. "We" don't know or care about features we don't need. But we do feel the pain of modules that only work for one version.

In hindsight, I would have voted for a hard break from 2->3 perhaps 2 years ago. I suspect the ultimate human time effort would have been less than we waste now straddling or stumbling with two versions.

2 years ago is still very far into the 2->3 split. The 9 year anniversary of Python 3 is just over a fortnight from today.

I can feel the pain of devs that have to deal with Python 2 and 3. To me, the ugly Python 3.0 announcement was too off-putting from the beginning; it stopped my interest in Python and I went back to PHP and Java, and later also NodeJS and Go.

In comparison, PHP and Java always transitioned to new versions successfully, keeping backward compatibility and taking baby steps instead of making a big incompatible cut. PHP canceled the ill-fated PHP6 fork and went from PHP 5.2 up to 5.6, then jumped to PHP7 (several PHP6 books had been published about an alpha version).

Languages with rocky transitions (mostly due to incompatible syntax) include C#/.NET 1->2, Perl 5 -> 6, Lua 5.1 -> 5.3, Ruby 1 -> 2, Swift 1 -> 2 -> 3, Rust 0.x -> 1, and more.

And? So did PHP 5.x. So do all languages. That's what I mentioned with "baby steps".

You can run PHP3, PHP4 and PHP5 projects, code dating back to the 1990s, with little or no change at all on PHP7. If you cared a bit and adjusted your code over the years, most changes were announced many versions in advance and went through deprecation. E.g. the original MySQL API had been deprecated for a decade or so and only got removed with v7, yet it's easy to update the code to the newer APIs, as has been possible since the early 2000s, when the newer API was introduced; it has stayed unchanged since then. And you could use a shim too.

PHP and Java (and several other languages) have really kept an eye on backwards compatibility, you cannot deny that or paint it in another light.

I was responding to "keeping it backward-compatible", which it did not do. In any case, Python 3 is not exactly a brand new language, many codebases can be adjusted to run on both engines without much effort. I don't think the difference is as stark as you've painted it.

On the other hand, there’s still a lot of code that needs porting. See http://portingdb.xyz

> without much effort

It's an enormous amount of effort. It's been a decade and the end is nowhere in sight. Python 3 is a tragedy.

Sure, but the community could continue to reject Python3 instead as it had been doing for years. It seems like it is catching on a bit now, but I hadn't really seen any good reason for it other than the upcoming EOL.

Look at the statement from the numpy group:

The NumPy project has supported both Python 2 and Python 3 in parallel since 2010, and has found that supporting Python 2 is an increasing burden on our limited resources;

That's a real team saying that they just can't support 2 major versions of the language any longer.

That is like the biggest fake argument ever. There are plenty of resources, and Python 2 support is neither a burden nor is this burden in any way increasing. There are plenty of people ready to step up to continue Py2 support (even I would be glad to help).

This is a pure political decision based on ideology.

As the codebase grows, you need to maintain two growing codebases; how is that not an increasing burden?

Official support will be dropped by 2020, by then you will be relying on the community (who ?) to provide bug and security fixes. I'm not aware of anybody stepping up and declaring they will take over maintenance.

At this point, insisting on python 2 is the ideological "side". There's no practical or realistic reasoning behind it. Major parts of the community are moving to python 3 and dropping python 2.

You can stay with python 2 and maintain the language / libraries, but don't begrudge those that move on.

Can you be more precise with what you mean by "the community". It wasn't the community that said it was dropping support.

I plan to migrate away from py2 by 2020 too, just not to py3.

The resources aren't "code" but people and time. You have a limited set of folks who consistently contribute and become reviewers/committers. This is all based on volunteer time - no one is paying these folks to do it. So, asking these folks to divide their limited volunteered time between multiple versions of python is unfair. I think this is the right decision to take.

If you feel you have "plenty of resources" you can fork the python2 version of numpy and maintain it.

> There is plenty of resources and Python 2 support is neither a burden nor this burden in any way increasing.

Aside from what others have said, NumFOCUS is woefully underfunded. If you're interested in seeing continued development of NumPy and other amazing scientific Python packages, you should think about contributing!


(Not a NumFOCUS person although I occasionally volunteer with them and definitely donate on a recurring basis.)

You can fork numpy and maintain Python 2 support.

Python would have lost popularity and would have eventually died without Python 3. Being a scripting language with a relatively low barrier to entry has always been among its selling points. Text processing is a very common use-case for such languages. In a Unicode world, you can't really have a language that is supposed to be easy to use yet requires contortions and has major pitfalls in simply handling text.

Yep, I'm always surprised by the number of people here who dismiss the usefulness of unicode. Not dismissing it outright, but simply saying it's not a problem. I understand that we may have a lot of western/English people here, but unicode for me is a life saver. For example, I work on stuff for French-speaking people, and I need the euro sign. In that very simple case, unicode already solves many issues: I don't have to convert between ISO-8859-1, ISO-8859-15 and CP-1252 anymore; whatever my database uses, I just enforce unicode. Moreover, I can have French comments in source code (that's very important because sometimes we're talking about laws, and some of our technical terms in those laws are not translated into English).

(I understand that in this particular case, I could have enforced ISO-8859-15 as well, but the crux of the matter is that with unicode built in python3, I don't have to think about it anymore)

And now my customer is copy/pasting additional information from documents coming from other countries as well...
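The parent's point, as a sketch: once text is Unicode internally, the encoding only matters at the boundaries.

```python
# The euro sign round-trips through any encoding that can represent it;
# no manual conversion between legacy charsets is needed:
price = "10 €"
for enc in ("utf-8", "iso-8859-15", "cp1252"):
    assert price.encode(enc).decode(enc) == price

# ISO-8859-1, notably, has no euro sign at all:
ok = False
try:
    price.encode("iso-8859-1")
except UnicodeEncodeError:
    ok = True
assert ok
```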

You do realize unicode has been "built into Python" pretty much since the beginning (Python 2.0, 17 years ago), right?

The main difference is that unicode has a more efficient internal storage since Python 3.3+ (a neat technical detail), and that mixing bytestrings and unicode will fail with an explicit error since 3.0+ (a good idea IMO).

But that Python 2.7 didn't support unicode is simply FUD.

I have moved away from Python 2.7 because unicode support was not good enough.

Not good enough means I had to prefix every single string with "u" to make sure it was unicode. It was especially painful with code like this:

   logger.debug("Name of the person I'm debugging {}".format(name))
forgetting the "u" simply led to various crashes if the name had characters that could not be encoded in whatever the debug output was piped to. Always remembering the "u" was not something I had time for.

Just an example.

Well now you have to prefix b"" instead for bytestrings. The argument works both ways -- there's no magical improvement that happened in Python 3 (except that nice storage optimization in 3.3 that I mentioned above).

It's actually good practice to be explicit about the type, and write b"" and u"" always (easier on the person reading the code). The u'' literal prefix was re-introduced in 3.3 for this reason.
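In Python 3 both prefixes are legal, so code can stay explicit either way:

```python
# u'' is a no-op in Python 3 (re-allowed in 3.3, per PEP 414, to ease
# 2/3-compatible codebases); b'' marks a bytes literal:
assert u"abc" == "abc"
assert isinstance(b"abc", bytes)
assert b"abc" != "abc"     # bytes and str never compare equal
```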

> The argument works both ways

It doesn't because strings and text are a lot more common than bytes. Yours is a really weird line of argument - that the 3.x changes and what's in 2.7 are fundamentally equivalent and thus the changes are outright unnecessary and that the people who made them just got it wrong and did it for no apparent reason or benefit. I get that someone might not like what they did or how they did it but your take should give you pause just by its general improbability.

You're mixing up two things: separating string types in the language, and using prefix literals. Completely orthogonal concerns.

As I said, u'' literals were re-introduced by the Python developers themselves, in Python 3.3.

I don't think I am. In this thread you're repeatedly making the point that 2.7 supported Unicode and the difference is mostly technical details of things like internal representation and/or a matter of prefixes or whatnot. This just isn't true. The fundamental change is - in Python 2, strings are bags of bytes and in Python 3 strings are collections of Unicode codepoints and you need to go through an encoding to convert to and from bytes. This is a big (and necessary) deal. No amount of debating the finer points of implementation, feature retention or reintroduction, etc is going to make that difference not matter.
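That fundamental change, in one Python 3 snippet:

```python
# str is a sequence of Unicode code points, bytes is raw data,
# and an encoding is the explicit bridge between them.
s = "café"
assert len(s) == 4                    # four code points

b = s.encode("utf-8")
assert len(b) == 5                    # 'é' needs two bytes in UTF-8
assert b.decode("utf-8") == s

# Mixing them fails loudly instead of silently corrupting text:
ok = False
try:
    s + b
except TypeError:
    ok = True   # can't concatenate str and bytes
assert ok
```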

What I said:

1. BOTH Python 2 and Python 3 come with built-in support for BOTH bytestring and unicode (contrary to OP's claims I responded to)

2. That mixing bytestrings and unicode will fail with an explicit error since 3.0+ (a good idea IMO)

3. Unicode has a more efficient internal storage since Python 3.3+ (a neat technical detail)

4. It's good practice to be explicit about the type of literals, and write b"" and u"" always

5. That Python 2.7 doesn't support unicode is simply FUD.

Can you articulate which point you're actually contesting? I'll be happy to clarify, but I'm honestly unsure what you're responding to.

I think almost all of these are wrong.

The person you replied to didn't claim Python 2 doesn't support unicode. 'Bytestrings' neatly summarizes what is wrong with Python 2 in a single word (and this, incidentally, is a term the Python documentation avoids these days because it's bad). 3 is true but not really related to the topic at hand. 4 is, I think, outright wrong. As to 5, I'm not sure why you would even want to defend that. It's not what the poster said, and even if they had said it, they'd just be wrong - it's not 'FUD'. That is just you being grumpy and rude.

u"" was reintroduced to spare people from having to fix all their strings (see the PEP 414 rationale).

That was my point: by moving to Python 3, I removed all my "u" prefixes, the thing other developers didn't want (see PEP 414 again); I loved the "purity".

but removing u was tedious (at best).

In my use case, strings are "string" and binary are "bytes". Which I think is much safer.

I don't think anyone claimed it didn't support Unicode. Only that it allowed mixing bytes / strings and the default type most people used from the beginning was str. That's a trap that they'll regret the moment they actually need to handle something outside of Latin-1.

Lots of python 2 code out there fails because the default option was simple but bad. I know, because my name broke the CI pipelines in a few different projects.

Did you actually read the thread you're replying to, or are you on auto-pilot?

Let's be honest: a lot of people's experience with Python is restricted to North America.

For these folks encountering anything other than ASCII is pretty uncommon.

Personally, I've worked on a Python 2.x project deployed in heavy industry across the globe including Japan and the number of times we had Python 2 unicode nightmare issues was too many to mention.

> For these folks encountering anything other than ASCII is pretty uncommon.

Hmmm... 🤔
