Sure, but the community could continue to reject Python3 instead as it had been doing for years. It seems like it is catching on a bit now, but I hadn't really seen any good reason for it other than the upcoming EOL.

Look at the statement from the numpy group:

The NumPy project has supported both Python 2 and Python 3 in parallel since 2010, and has found that supporting Python 2 is an increasing burden on our limited resources;

That's a real team saying that they just can't support 2 major versions of the language any longer.

That is like the biggest fake argument ever. There is plenty of resources and Python 2 support is neither a burden nor this burden in any way increasing. There are plenty of people ready to step up to continue Py2 support (even I would be glad to help).

This is a pure political decision based on ideology.

As the codebase grows, you need to maintain 2 growing codebases, how is that not increasing the burden?

Official support will be dropped by 2020; by then you will be relying on the community (who?) to provide bug and security fixes. I'm not aware of anybody stepping up and declaring they will take over maintenance.

At this point, insisting on python 2 is the ideological "side". There's no practical or realistic reasoning behind it. Major parts of the community are moving to python 3 and dropping python 2.

You can stay with python 2 and maintain the language / libraries, but don't begrudge those that move on.

Can you be more precise about what you mean by "the community"? It wasn't the community that said it was dropping support.

I plan to migrate away from py2 by 2020 too, just not to py3.

The resources aren't "code" but people and time. You have a limited set of folks who consistently contribute and become reviewers/committers. This is all based on volunteer time - no one is paying these folks to do it. So, asking these folks to divide their limited volunteered time between multiple versions of python is unfair. I think this is the right decision to take.

If you feel you have "plenty of resources" you can fork the python2 version of numpy and maintain it.

> There is plenty of resources and Python 2 support is neither a burden nor this burden in any way increasing.

Aside from what others have said, NumFOCUS is woefully underfunded. If you're interested in seeing continued development of NumPy and other amazing scientific Python packages, you should think about contributing!


(Not a NumFOCUS person although I occasionally volunteer with them and definitely donate on a recurring basis.)

You can fork numpy and maintain Python 2 support.

Python would have lost popularity and would have eventually died without Python 3. Being a scripting language with a relatively low barrier to entry has always been among its selling points. Text processing is a very common use-case for such languages. In a Unicode world, you can't really have a language that is supposed to be easy to use yet requires contortions and has major pitfalls in simply handling text.

Yep, I'm always surprised by the number of people here who dismiss the usefulness of unicode. Not "dismissing" the hard way, but simply saying it's not a problem. I understand that we may have a lot of western/English-speaking people here, but unicode is a life saver for me. For example, I work on stuff for French-speaking people, and I need the euro sign. In that very simple case, unicode already solves many issues: I don't have to convert between ISO-8859-1, ISO-8859-15, and CP-1252 anymore; whatever my database uses, I just enforce unicode. Moreover, I can have French comments in source code (that's very important because sometimes we're talking about laws, and some of the technical terms in those laws are not translated into English).

(I understand that in this particular case I could have enforced ISO-8859-15 as well, but the crux of the matter is that with unicode built into python3, I don't have to think about it anymore.)

And now my customer is copy/pasting additional information from documents coming from other countries as well...
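To make the encoding juggling above concrete, here is a minimal Python 3 sketch of how the euro sign fares in each of the encodings mentioned. The variable names are illustrative only and do not come from the comment.

```python
# The euro sign maps to different bytes (or to nothing at all) depending on the codec.
euro = "\u20ac"  # the character "€"

encoded = {}
for codec in ("iso-8859-1", "iso-8859-15", "cp1252", "utf-8"):
    try:
        encoded[codec] = euro.encode(codec)
    except UnicodeEncodeError:
        # ISO-8859-1 has no euro sign, so encoding fails outright.
        encoded[codec] = None

for codec, raw in encoded.items():
    print(codec, raw)
```

The same character ends up as one byte in ISO-8859-15, a different single byte in CP-1252, three bytes in UTF-8, and cannot be represented in ISO-8859-1, which is why keeping text as unicode internally and encoding only at the edges saves the conversions.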

You do realize unicode has been "built into Python" pretty much since the beginning (Python 2.0, 17 years ago), right?

The main difference is that unicode has a more efficient internal storage since Python 3.3+ (a neat technical detail), and that mixing bytestrings and unicode will fail with an explicit error since 3.0+ (a good idea IMO).

But that Python 2.7 didn't support unicode is simply FUD.
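A short sketch of the "explicit error" behaviour mentioned above, assuming Python 3. Python 2 would instead try to decode the byte string as ASCII behind the scenes, so the same mix only failed once non-ASCII data showed up. The names below are made up for illustration.

```python
text = "caf\u00e9"          # str: a sequence of Unicode code points
raw = text.encode("utf-8")  # bytes: the UTF-8 encoding of that text

try:
    combined = raw + text   # Python 3 refuses to guess an encoding
except TypeError as exc:
    combined = None
    error = str(exc)

# Being explicit about the encoding works fine.
explicit = raw + text.encode("utf-8")
```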

I have moved away from Python 2.7 because unicode support was not good enough.

Not good enough means I had to prefix every single string with "u" to make sure it's unicode. It was especially painful with code like this :

   logger.debug("Name of the person I'm debugging {}".format(name))
Forgetting the "u" simply led to various crashes if the name had characters that could not be encoded in whatever the debug output was piped to. Always thinking about the "u" was not something I had time for.

Just an example.
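For comparison, a rough Python 3 sketch of the same formatting call. The name and the ASCII-only sink are assumptions made for illustration; the original comment does not say what the debug output was piped to.

```python
name = "Zo\u00eb"  # hypothetical name containing a non-ASCII character

# No u"" prefix needed: string literals are already text in Python 3.
message = "Name of the person I'm debugging {}".format(name)

# Simulate a sink that only accepts ASCII. The failure now happens at the
# encoding boundary, not somewhere inside the formatting call.
try:
    message.encode("ascii")
    ascii_ok = True
except UnicodeEncodeError:
    ascii_ok = False

# One way to keep such a sink from crashing: escape what cannot be encoded.
safe = message.encode("ascii", errors="backslashreplace").decode("ascii")
```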

Well now you have to prefix b"" instead for bytestrings. The argument works both ways -- there's no magical improvement that happened in Python 3 (except that nice storage optimization in 3.3 that I mentioned above).

It's actually good practice to be explicit about the type, and write b"" and u"" always (easier on the person reading the code). The u'' literal prefix was re-introduced in 3.3 for this reason.
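A small sketch of what the two prefixes mean in Python 3.3 or later, with illustrative variable names. The u prefix is accepted but changes nothing, while the b prefix produces a different type.

```python
text_plain = "abc"
text_prefixed = u"abc"  # accepted again since Python 3.3 (PEP 414)
raw = b"abc"

same_type = type(text_plain) is type(text_prefixed)  # both are str
equal_to_bytes = (text_plain == raw)                 # text never compares equal to bytes
```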

> The argument works both ways

It doesn't because strings and text are a lot more common than bytes. Yours is a really weird line of argument - that the 3.x changes and what's in 2.7 are fundamentally equivalent and thus the changes are outright unnecessary and that the people who made them just got it wrong and did it for no apparent reason or benefit. I get that someone might not like what they did or how they did it but your take should give you pause just by its general improbability.

You're mixing up two things: separating string types in the language, and using prefix literals. Completely orthogonal concerns.

As I said, u'' literals were re-introduced by the Python developers themselves, in Python 3.3.

I don't think I am. In this thread you're repeatedly making the point that 2.7 supported Unicode and the difference is mostly technical details of things like internal representation and/or a matter of prefixes or whatnot. This just isn't true. The fundamental change is - in Python 2, strings are bags of bytes and in Python 3 strings are collections of Unicode codepoints and you need to go through an encoding to convert to and from bytes. This is a big (and necessary) deal. No amount of debating the finer points of implementation, feature retention or reintroduction, etc is going to make that difference not matter.
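The code point versus byte distinction described above can be sketched in a few lines of Python 3. The sample word is arbitrary.

```python
word = "na\u00efve"              # "naïve": 5 code points
utf8 = word.encode("utf-8")      # 6 bytes: "ï" takes two bytes in UTF-8
latin1 = word.encode("latin-1")  # 5 bytes: "ï" takes one byte in Latin-1

lengths = (len(word), len(utf8), len(latin1))
round_trip = utf8.decode("utf-8") == word

# Decoding with the wrong codec does not fail here; it silently yields mojibake.
mojibake = utf8.decode("latin-1")
```

The length of the text is independent of how it is stored, and getting from bytes back to text always requires naming a codec. Picking the wrong one may not raise at all, which is the kind of silent corruption that an unmarked "bag of bytes" invites.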

What I said:

1. BOTH Python 2 and Python 3 come with built-in support for BOTH bytestring and unicode (contrary to OP's claims I responded to)

2. That mixing bytestrings and unicode will fail with an explicit error since 3.0+ (a good idea IMO)

3. Unicode has a more efficient internal storage since Python 3.3+ (a neat technical detail)

4. It's good practice to be explicit about the type of literals, and write b"" and u"" always

5. That Python 2.7 doesn't support unicode is simply FUD.

Can you articulate which point you're actually contesting? I'll be happy to clarify, but I'm honestly unsure what you're responding to.
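On point 3, the storage change in Python 3.3 is the flexible string representation from PEP 393. A hedged way to observe it, assuming CPython (other implementations may report sizes differently or not at all, and exact byte counts vary by version):

```python
import sys

n = 1000
ascii_text = "a" * n            # fits in 1 byte per code point
bmp_text = "\u20ac" * n         # "€" needs 2 bytes per code point
astral_text = "\U0001F914" * n  # a code point above U+FFFF needs 4 bytes

# Each string is stored using the narrowest width that fits its widest character.
sizes = [sys.getsizeof(s) for s in (ascii_text, bmp_text, astral_text)]
print(sizes)
```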

I think almost all of these are wrong.

The person you replied to didn't claim Python 2 doesn't support unicode. 'Bytestrings' has what is wrong with Python 2 neatly summarized in a single word (and this, incidentally, is a term the Python documentation avoids these days because it's bad). 3 is true but not really related to the topic at hand. 4 is, I think, outright wrong. As to 5, I'm not sure why you would even want to defend that. It's not what the poster said and even if they had said it, they'd be just wrong - it's not 'FUD'. That is just you being grumpy and rude.

u"" was reintroduced to spare people from having to fix all their strings (see the PEP 414 rationale).

That was my point: by moving to python 3, I removed all my "u" prefixes, the very thing other developers did not want to do (see PEP 414 again); I loved the "purity".

but removing u was tedious (at best).

In my use case, strings are "string" and binary are "bytes". Which I think is much safer.

I don't think anyone claimed it didn't support Unicode. Only that it allowed mixing bytes / strings and the default type most people used from the beginning was str. That's a trap that they'll regret the moment they actually need to handle something outside of Latin-1.

Lots of python 2 code out there fails because the default option was simple but bad. I know, because my name broke the CI pipelines in a few different projects.

Did you actually read the thread you're replying to, or are you on auto-pilot?

Let's be honest: a lot of people's experience with Python is restricted to North America.

For these folks encountering anything other than ASCII is pretty uncommon.

Personally, I've worked on a Python 2.x project deployed in heavy industry across the globe including Japan and the number of times we had Python 2 unicode nightmare issues was too many to mention.

> For these folks encountering anything other than ASCII is pretty uncommon.

Hmmm... 🤔
