New IPython release drops Python 2.7 support (readthedocs.io)
214 points by peterdemin 9 days ago | 126 comments

The 5.x series fixed a lot of my UX quibbles with the completion menu and added syntax highlighting, and now they're further improving the completion quality and the menu. I'm so excited!

And now they are dropping obsolete software like Py2.7. Pretty awesome! We have almost twenty years of Python 2 software; the day when the Python community no longer has to look back is coming.

You are welcome. We hope to get more contributions and bring you more awesomeness. Installing Jedi-dev already makes things faster and completes more edge cases!

And you just can't hide it.

Speaking of UX and hiding, you can hide the menu bar, so you don't have to lose control unless you like it:

  echo "jQuery('#header-container').hide();" >> ~/.jupyter/custom/custom.js

Wow, that's awesome. Didn't know I can do custom tweaking with Jupyter.

It's YOUR browser, for heaven's sake!

It's a shame that humorous comments are so consistently downvoted here. Are we all supposed to be post-laughter now or something?

Low-content jokes never did particularly well on HN. Like it or loathe it, it's been a cultural feature here for years.

I don't remember it being this bad maybe ~8 years ago.

yeah, the HN commenter majority are stuffy and humourless but quick on the downvote. just leave them to their playground.

It has nothing to do with that. The fear is not what's added, but what's removed. Experience tells us that jokes will quickly dominate the top of threads, leading to a loss for serious comments. Sure, there's enough room for as many comments as you like in a thread, but only the top x actually get read, usually.

FWIW, I thought the comment was pretty funny, and the first answer was even better.

I use Jupyter and IPython nearly every day. The new autocomplete looks extremely promising. If you are running Python 2.7, it may not be a big deal to just stay on IPython 5.x, though.

The two major uses I have are prototyping nearly any coding project that requires python and teaching myself data analysis. This has saved me hours if not days due to the fact that the feedback loop is so fast. When I code in other languages I desire the iPython interface.

>When I code in other languages I desire the iPython interface.

It's not specifically IPython, but Jupyter does support many other languages through kernels!


Also, instead of linking to the docs, you might want to read the announcement blog post: http://blog.jupyter.org/2017/04/19/release-of-ipython-6-0/

A comment from a beginner python user -- but I need to rant ...

I've been doing a bit of python the last few weeks for some image processing/computer vision tasks (using opencv and numpy).

I have to say, all together it's a pretty miserable developer experience.

Python is incredibly slow - forcing pretty much all computation no matter how trivial into contortions via numpy incantations so that the inner loops can run in native extensions -- these incantations rely on a lot of implicit, poorly documented magic. Miss some detail in the behavior of the magic and suddenly you have a major 10x slowdown -- but good luck finding where. I would kill for an easy to use tool like xcodes Time Profiler ...

API usage errors (even those where invariants are checked at runtime) are ridiculously uninformative -- opencv for example does quite a bit of runtime sanity checking on the shape and type of arguments to its methods -- but somehow even simple details such as which parameter caused the error don't get reported in the stack trace, severely increasing the cognitive load required to identify the mismatch -- not fun when multiple arguments to an API are the result of a chain of numpy magic data munging. This may be an opencv complaint more than a python one (aside: opencv is pretty terrible).

I'm not sure what I'm doing wrong with python but I find the majority of my code to be sort of menial data munging -- and I haven't figured out good patterns to organize this munging in any sensible way -- with a static language, DRY patterns to centralize such plumbing operations have the awesome effect of moving invariants into reasonable places -- in python, without any ability to organize guarantees, as the code base evolves I find myself needing to repeatedly check data shapes/types -- there doesn't seem to be an obviously useful way to organize verification of data types as the necessary invariants become apparent. These issues are compounded by the fact that refactoring is an enormous pain in the ass!

I feel like all my python code is throwaway code. Maybe that's what I'm missing -- I need to just accept that all the python numeric code I write is pure one-off junk, embrace copy paste and never try to reuse any of it ...

Sorry for the rant! I remember loving dynamic languages when I first discovered them - but right now, I really miss c++ (or even better swift).

I can't imagine the number of hours wasted because of these overly dynamic tools -- and there is simply no decreasing that lost time in the future -- as these languages grow, and as the house-of-cards ecosystems they sit atop grow and motivate more use, ever more developer hours will be lost to avoidable triviality ...

"I would kill for an easy to use tool like xcodes Time Profiler" -- maybe this(https://www.visualstudio.com/vs/debugging-and-diagnostics/) can help. "I really miss c++ (or even better swift)." -- maybe this(http://mypy-lang.org/) can help. I am just trying to help and have no intention of going into static vs dynamic typing but if you want to do data science stuff in a statically typed language, you should have a look at F# and its type providers(http://fsharp.org/guides/data-science/). I personally don't agree with a few of your criticisms of python, but you needed to rant so I'll just ignore them :)

Sorry, I forgot to give a link to Python Tools for Visual Studio, which includes the debugging and profiling utilities I linked to (https://www.visualstudio.com/vs/python/).

   forcing pretty much all computation no matter how trivial
   into contortions via numpy incantations so that the inner
   loops can run in native extensions
That's actually a big plus in my book.

Welcome to the development of Matlab-style code. It can be pretty disheartening to see a complicated hand-crafted algorithm replaced by, say, a matrix multiplication. Try to find Matlab/numpy implementations from paper authors. This programming style is a bit difficult to get used to initially. But it will become incredibly powerful as you start thinking in algebraic operations.

Also, don't do debug runs on the full data. Implement ideas with a subset that computes fast. Then simplify the code and make it fast. Then run it on the full dataset.

If you absolutely need to combine linear algebra operations with fast imperative code, have a go at the Julia language [1] or use a C++ library like Eigen [2] or Armadillo [3].

[1] https://julialang.org/

[2] http://eigen.tuxfamily.org/index.php?title=Main_Page

[3] http://arma.sourceforge.net/

As a longtime user of Python's scientific stack, I understand your pain.

In a practical sense, the libraries that make up the Python scientific stack -- numpy, pandas, matplotlib, sklearn, etc. -- are all akin to domain-specific languages, each with its own magic behavior, quirks, and recommended best practices that one must master. It takes a long while to become proficient in all of them. The good news is that there's a large, growing community of helpful people using these tools, so you can usually find the answer to any question with a single Google search.

For numpy specifically, if you are interested in getting the most out of it with the fewest possible headaches, I would recommend the following online book: http://www.labri.fr/perso/nrougier/from-python-to-numpy/

Otherwise, please make sure to read bmarkovic's spot-on comment in this thread: https://news.ycombinator.com/item?id=14158597

Man really? I personally love the Python OpenCv Wrapper.

From a Stack Overflow answer (https://stackoverflow.com/questions/13432800/does-performanc...): "I remember I read somewhere that performance penalty is <1%, don't remember where. A rough estimate with some basic functions in OpenCV shows a worst-case penalty of <4%".

I'm working on a project that downloads a 2 hour video, frame by frame. Then after, I run image detection on a photo or multiple photos against every single one of those frames.

My code was slow as shit at first. I decided to profile and realized the only speed hog was SIFT's detectAndCompute(). It was taking ~.5s for each frame passed to the algorithm.

So I ended up trading memory for speed and now create huge PyTables loaded with every single frame's KeyPoints and Descriptors. I do this when I first download the video's frames, and even though that takes a while, I can now run image detection against 6000 frames (100 minutes) with however many photos in about 24 minutes.

Point is, even just using the built-in cProfile helped a lot because there was only one function that was truly affecting my python. Other than that, I love the python translation. I'm surprised you have memory problems though. OpenCV always murdered my CPUs but was never really memory intensive.
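
For anyone who has not tried the built-in profiler mentioned above, here is a minimal sketch of using cProfile to spot a single hot function. The functions `slow_step`, `cheap_step` and `pipeline` are made-up stand-ins, not the SIFT code from the comment; only `cProfile`, `pstats` and `io` are real standard-library modules.

```python
import cProfile
import io
import pstats


def slow_step(n):
    # Deliberately expensive pure-Python loop (stand-in for the hot function).
    total = 0
    for i in range(n):
        total += i * i
    return total


def cheap_step(n):
    # Cheap helper, so the profile has something to compare against.
    return n + 1


def pipeline():
    # Hypothetical workload: one expensive call, many cheap ones.
    result = slow_step(200000)
    for i in range(100):
        result += cheap_step(i)
    return result


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return stream.getvalue()


if __name__ == "__main__":
    print(profile_report(pipeline))
```

Running a whole script with `python -m cProfile -s cumulative script.py` gives the same kind of table without touching the code.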

The idea of using Python for high performance numerical work is daft. The whole scientific Python thing revolves around the fact that Python is an excellent low boilerplate research and prototyping language with possible production uses where performance isn't critical. Unfortunately this led to a lot of software in this area being developed for it. You wouldn't be doing performance critical production stuff with R or Matlab? I'm afraid that for serious number crunching nothing can truly replace compiled, statically typed languages. Additionally, you can't rely on GPGPU abstractions without fully understanding the underlying mechanisms, and I'm pretty sure that the same is true for NumPy. Your problem is your expectations.

I'm doing some pretty heavy image recognition while stitching the output of a camera to image objects larger than the camera can see in one go from two different angles in real time. Python works fine for high performance numerical work, the only thing you have to keep an eye on is to use python as much as possible as the glue and the various libraries written in C or C++ to do the heavy lifting for you (and there are tons of such libraries).

The reason why python is the right tool for this job is because it has such libraries for just about everything, from serial communications to a bunch of hardware driving stepper motors, relays and reading inputs to imaging and number crunching and all the other bits that went into this. I'd have a much harder time achieving the same effect in any other language and likely it would have cost me a lot more time.

Python is not perfect (far from it) but it gets the job done.

You just switch critical sections to Cython. It knows how to handle NumPy objects and you get compiled, statically typed code when it is needed for performance.

Numba is a solid option as well.

Lots of good replies to this -- and some good links which I am going to follow. I appreciate the comments!

I appreciate this comment as well -- and indeed I will agree that my expectations are part of the problem.

I'm not actually doing 'performance critical' work -- but rather prototyping tasks -- the speed of my feedback cycle is the only thing that's performance sensitive within the context of my rant -- and this includes time to run the code and the time to debug it.

I should also acknowledge other (self-induced) problems including the need to run code on a remote server for extra hardware capability vs my pitiful laptop.

I should acknowledge that part of my frustration could come from my development process as from the inherent nature of the underlying tools themselves ... I've experimented with my process -- trying to find something that works for me and I've ended up with this hobbled kludge of tools:

- a script that re-runs rsync on file changes
- atom + hydrogen extension connected via ssh to a jupyter kernel_gateway
- some ssh terminals where I run longer running code from command line
- an sshfs mount of a remote directory on the server for viewing some output artifacts ...

Avoidable runtime errors after longish-running processing tasks are a very frustrating time sink ...

Now that I have a better idea of your dev setup I think I can be of a bit more help :)

1. If you are prototyping and prefer working on a remote server, have a look at https://notebooks.azure.com/. Memory is limited to 4 GB, but it's free, and Jupyter notebooks are great for prototyping.

2. For prototyping you should try REPL-driven development. What I mean is that you should be OK with just playing around with the library API before you write a longish-running processing task; that would reduce (not remove) the chances of a runtime error. Jupyter notebooks excel at this too, as you can just try out code in a cell, learn from it, rinse and repeat. You can also use your IDE to send code fragments to the REPL and get immediate feedback on them. This kind of iterative development keeps your feedback cycle fast. If python's feedback cycle seems slower to you than C++'s, you are definitely not using the REPL enough :)

3. If debugging is a pain point, definitely give an IDE a try. I prefer Visual Studio as I am on Windows. You can very well go with Pycharm or vscode (not technically an IDE, but it has a debugger, so that's that). I personally prefer Jupyter notebook or Emacs with my code and REPL in split windows for prototyping; different strokes for different folks.

I personally love working with python, I even write my blog posts in a Jupyter Notebook so although it isn't perfect, it doesn't have to be frustrating either :) Hope my suggestions can be of some help!

There's "not for performance-critical use", and then there's "why is it gobbling up all the RAM (~10GB) and making it impossible for me to do anything, when the entire data I fed it is several hundred MB."

Python frequently saunters into the second territory. Well, I guess there are tools available to profile memory usage and stuff, but if I had to spend that much effort tracking down memory issues, I might as well rewrite it in C++. (It doesn't help (or it helps?) that I'm much more comfortable with C++ than Python. YMMV.)
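
One low-effort option among the memory tools alluded to above is the standard library's `tracemalloc` (Python 3.4+). The sketch below measures the peak memory allocated through Python's allocator while a function runs; `build_rows` and `peak_memory_bytes` are hypothetical names made up for illustration.

```python
import tracemalloc


def build_rows(n):
    # Hypothetical workload: a list of small lists of Python floats,
    # which costs far more than the 16 bytes of raw data per row.
    return [[float(i), float(i) + 1.0] for i in range(n)]


def peak_memory_bytes(func, *args):
    """Return (result, peak bytes traced by tracemalloc while func ran)."""
    tracemalloc.start()
    try:
        result = func(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    rows, peak = peak_memory_bytes(build_rows, 100000)
    print("rows:", len(rows), "peak MiB:", round(peak / 2.0 ** 20, 1))
```

Comparing the reported peak with the size of the raw input is a quick way to see whether a data structure, rather than the data, is eating the RAM.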

There are some easy to avoid pathological cases such as concatenating numpy arrays in a loop.

Take this gem from a well known course:

   np.concatenate([x.next() for i in range(x.nb)])
That looks pretty innocent but it can eat up your memory in an eyeblink if the input is large enough. That's the sort of pitfall that a lot of python code suffers from: the abstractions are just nice enough to make you believe this will work without penalty, and without knowing how it is implemented under the hood you're suddenly out a few gigs of ram.

If np.concatenate accepts any iterable, removing the brackets is what you want to do there.

That won't help. Concatenate needs to know the total length up front. It will evaluate all the input items at once.
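
One common way around this, assuming every batch has the same shape and the number of batches is known up front, is to allocate the output once and copy batches into it as they arrive. `batches` below is a hypothetical generator standing in for the course's iterator; it is not the original code.

```python
import numpy as np


def batches(n_batches, batch_size, width):
    # Hypothetical batch source standing in for the iterator `x` above.
    for i in range(n_batches):
        yield np.full((batch_size, width), i, dtype=np.float32)


def collect_preallocated(n_batches, batch_size, width):
    """Fill one preallocated array so only a single batch is alive at a time."""
    out = np.empty((n_batches * batch_size, width), dtype=np.float32)
    for i, batch in enumerate(batches(n_batches, batch_size, width)):
        out[i * batch_size:(i + 1) * batch_size] = batch
    return out


if __name__ == "__main__":
    a = collect_preallocated(4, 3, 2)
    # Reference result: materialise every batch, then concatenate.
    b = np.concatenate(list(batches(4, 3, 2)))
    print(a.shape, np.array_equal(a, b))
```

The list-then-concatenate version briefly holds every batch plus the final array; the preallocated version holds the final array plus one batch.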

My experience is that Python is much better about this than many other languages in its class, Ruby in particular. It's going to be hard to beat manual pointer/memory management in terms of raw performance, but Python is pretty respectable for an untyped, memory-managed language.

I have many of the issues you have when I process data, but the issues are with the nature of data-processing, not python. Even in R, or MatLab, I struggle with formatting the data, making sure the size and order of tensors match up for vector calculus operations, and so on.

I think we need a better abstract language for checking matrix algebra operations.

I think we already have such languages. I would never be able to make better arguments than a Turing award winner, so I'll just leave this link (http://www.jsoftware.com/papers/tot.htm): "Notation as a Tool of Thought" by Kenneth E. Iverson.

I had some of the same frustration with Python, and eventually switched to julia-lang for this kind of stuff. You still don't get the static type analysis step of something like C++ or Haskell, but you can more easily control the types that flow through your computations (compared to python) by using julia's type system. You don't have to deal with numpy's magic, because julia has very nice linear algebra built-ins, and as long as you produce type-stable julia code it's usually plenty fast. There's also an opencv package for julia, although I've never used it.

I started out learning programming using Python and gradually came to prefer strongly typed languages. I'm aware of 1 1/2 options for doing this kind of work in a strongly typed language that isn't C++:

Option 1: F# or C# with the Deedle[1] library. Deedle provides series and dataframe classes, along with various stats functions. I believe that there are also some vis tools. Type providers in F# allow you to specify a data source, such as a CSV or database, and then not only infer the types but also give you intellisense autocompletion. See the F# guide for data science[2] for more info.

Option 1.5: I hesitate to recommend the following, because there's simply not much here yet, but there is a dataframe library[3] for Nim[4]. Nim is a strongly typed language with a Python-like syntax which compiles to C and is apparently quite fast. It has multiple options for garbage collection but also supports manual memory management. It offers lisp-like macros for implementing DSLs, which the dataframe library I mentioned uses quite a bit. The main problems with Nim are of course the lack of libraries and the need for a notebook-like environment such as Jupyter, which are certainly big problems indeed. But I think that Nim is something to look out for over the next few years.

As much as I like Deedle and F#'s features, I've personally decided to abandon the use of Microsoft technologies due to their many user-unfriendly actions regarding Windows 10 and privacy. I don't fault anyone else for using Deedle, though, because it is a nice tool. This is just a personal decision of mine.

[1] http://bluemountaincapital.github.io/Deedle/

[2] http://fsharp.org/guides/data-science/

[3] https://github.com/bluenote10/NimData

[4] https://nim-lang.org

Whilst Nim doesn't have a REPL/jupyter-like environment, I've found that it compiles reasonably quickly, and you can set up VS Code to compile and run on save, which can make testing things pretty fast and easy. :)

>>I started out learning programming using Python and gradually came to prefer strongly typed languages.

Python IS strongly typed. It is also dynamically typed.

D'oh, I meant statically typed. I blame only getting 3 hours of sleep... :)

I use python for build automation, custom build jobs, code generation for C++, glueing command-line tools together and for multi-platform shell-scripting. For that it is really great. I also had some tasks where vanilla python was simply too slow. In this case I found it easier to move the slow parts into a small cross-platform C++ command line tool and call that from python instead of trying to make the python code faster. YMMV of course :)

Python may just not be a good choice of language for your problem. Have you looked at Julia? It claims to be fast for the kinds of things you're doing, and it shares a lot of the Python ecosystem -- the Ju in Jupyter is from Julia, i.e., you can work in notebooks, and you can use existing Python (and C/Fortran) libraries directly. Also, Cython might be an option to speed up your code.

Debugging Python is a pain, but it's an amazingly expressive language and a joy to work with. However, once you find that the cost of debugging outweighs the benefits, you probably should switch to another language. Even more so if you are facing performance issues!

As for myself, the first serious language I taught myself was Python, so I'll always have a soft spot for it. Now that I've begun to realize Python's shortcomings, I've started to work more with statically typed languages; my favorites are Scala and C++, and my next goal is to work with Rust.

It's also important to check out other debugging tools for Python. If you're relying on the built-in debugger tools, it's a pretty miserable experience. I'm a big fan of pudb [0], which gives a much better experience.

[0] https://pypi.python.org/pypi/pudb

Why did you feel that you needed to start using Python in the first place? Do you have prior programming experience? Regarding data munging, that's sometimes a significant part of the CV/IP process.

Personally, I love Python. It's been working fairly well for our ML projects utilizing scikit-learn to process fairly large spatiotemporal datasets (temporally varying 2D and 3D datasets). I find numpy to be critical for keeping runtimes reasonable.

Regardless of the language, profiling may be necessary in order to obtain acceptable performance.

How would C++ help with data munging? No matter what tool you choose, you still have to go through the pain of getting your data into the shape it needs to be in to do the analysis, charting, etc.

Also, Python has plenty of abstraction capabilities, just like C++. There's no reason all your code should be throwaway. Make use of classes, magic methods, list comprehensions, hashes, the Pandas library (lots of powerful abstractions for managing data in there).

> I would kill for an easy to use tool like xcodes Time Profiler ...

PyCharm professional edition comes with a pretty nifty profiler.

vprof is also quite nice (it's free) - https://github.com/nvdv/vprof

Wow, I only knew of snakeviz: https://jiffyclub.github.io/snakeviz/

Now, I would kill for the ability to compare two profiles, but putting them on two different tabs will do for now ...

All the stuff you're complaining about is stuff that isn't really a python problem. They are definitely valid complaints, but they are with nonstandard things. Numpy is not really Python, it's magic on top of Python. Same goes for OpenCV, which is really just a bridge over to native stuff.

Python is very popular for the use cases you describe, but I have long felt it's akin to using a Honda Civic that has been souped up to be a specialized race car. You can definitely make a Honda fast around a racetrack, but that's not what they were designed to do. You're better off using a purpose built machine.

That being said, a purpose built racecar is harder to drive, doesn't have AC, needs more maintenance, has more expensive tires that wear out faster, etc...

Nothing in life is free, there is always a trade off.

You should try Go. It writes quickly like Python, but it's much simpler and more performant. It's also statically typed so you don't have to search all over to figure out the real type of a parameter, etc. Finally, everything compiles into a single static binary, so you don't have to worry about having the right version of an interpreter or libraries installed on your system--just pass around the binary and run it! I also came to it by way of C++ and found it nice that any type can be passed around by value or by reference, unlike Python, Java, etc where everything is a reference and it's hard to tell what (if anything) will be stack allocated. http://tour.golang.org

Grandparent is writing numerical code. I doubt that Go which has neither operator overloading nor built-in types such as vector/matrix will work here.

Neither operator overloading nor built-in numerical types are necessary (or arguably even useful) for scientific computing. Further, Python also lacks built-in vector/matrix types.

Python is a great language, but the numpy library is abominable.

With that in mind, you can try to program your numerics code hiding as much as possible the fact that you use numpy. I have found that this is the only barely bearable way to do numerics/image processing in python.

Numpy is not perfect but this is the first time I've heard it being referred to as abominable. What is it doing fundamentally wrong?

I actually find numpy to be fairly consistent and intuitive. Usually the code ends up being faster and more concise. I love the masking ability of the ndarrays as well as the support for broadcasting operations.
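
For readers who have not met the two features named above, a small sketch on a made-up 2x3 "image" (the values and the per-column `gains` are invented for illustration):

```python
import numpy as np

image = np.array([[10, 200, 30],
                  [250, 40, 220]], dtype=np.uint8)

# Boolean masking: a comparison yields a boolean array that can be used
# to select or overwrite exactly the matching elements.
bright = image > 128
clipped = image.copy()
clipped[bright] = 128

# Broadcasting: a length-3 vector of per-column gains is applied to
# both rows without an explicit loop or a tiled copy.
gains = np.array([1.0, 0.5, 2.0])
scaled = image * gains

print(bright.sum())
print(clipped)
print(scaled)
```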

Yeah, I'm scratching my head over that comment as well.

Any slightly complicated indexing of arrays in numpy is a pain. I've spent hours reading the numpy documentation for indexing. After that project I've completely forgotten how to do indexing in numpy. I expect I'll spend more hours reading the same documentation if I ever go back to numpy. Compared to Matlab, it is ridiculously complicated.

I don't see how this doesn't lead to a LOT of silently broken code ...

In [3]: x = np.array([1,2,3])

In [4]: x[0:500]
Out[4]: array([1, 2, 3])
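
For context, this is the same clamping rule that plain Python sequences use for slices; integer indexing is still bounds-checked. A minimal sketch, where `strict_slice` is a hypothetical helper (not part of numpy) for code that would rather fail loudly:

```python
import numpy as np

x = np.array([1, 2, 3])

# Slices are clamped to the array bounds, just like slices of a list.
print(x[0:500])
print([1, 2, 3][0:500])

# Integer indexing, by contrast, raises on an out-of-range index.
try:
    x[500]
except IndexError as exc:
    print("IndexError:", exc)


def strict_slice(arr, start, stop):
    """Hypothetical helper: return arr[start:stop], refusing out-of-range bounds."""
    if not (0 <= start <= stop <= len(arr)):
        raise IndexError("slice [%d:%d] outside array of length %d"
                         % (start, stop, len(arr)))
    return arr[start:stop]
```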

Maybe "abominable" is an exaggeration...

However, after translating some exercises from Octave to python/numpy, I have experienced a great deal of frustration. A simple three-line solution in Octave becomes a fifteen-line behemoth that does not even work consistently across different versions of the system.

Can you give a specific example?

I feel for you. Perhaps some of this may help:

It sounds like you're mostly having trouble with OpenCV, which has a horrid API; the python version especially adds another layer of confusion on top of it, and dumping things back and forth from opencv Mat objects to numpy arrays sucks and wastes time. The documentation is also fairly bad. The error messages, like you say, kinda suck. So I can't help you much with that, but know that the opencv python bindings fall WAY below the standard of what's considered good. Stuff like simplecv is a bit better but doesn't cover nearly as much as OpenCV does.

Numpy itself is great though, for how little code you often need for a complex, vectorized matrix operation. I've used stuff like Eigen (or even boost matrices or raw BLAS stuff) which I think is way more painful to deal with. For the annoying cruft that any of those bring, the minor speed boost over numpy isn't worth it for me most of the time.

The other thing is, when folks complain about the slowness of python when doing computationally heavy stuff like this, it's often the situation that they have a bunch of vectorized operations, and in the middle somewhere, they stick in something like a regular python for loop, which is orders of magnitude slower. The trick, especially for hot loops, is to just avoid using anything but vectorized operations. In something like C++ of course you're less likely to notice this happening because there is much less overhead, but there'd still be an impact of doing stuff the non-vectorized way anyway.
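
To make the hot-loop point concrete, here is a rough sketch that computes the same sum of squares twice: once with a plain Python `for` loop over a numpy array and once with a single vectorized call. The function names are made up, and the timings depend entirely on the machine, so none are quoted here.

```python
import timeit

import numpy as np


def sum_squares_loop(values):
    # Plain Python loop: every element passes through the interpreter.
    total = 0.0
    for value in values:
        total += value * value
    return total


def sum_squares_vectorized(values):
    # One vectorized call: the loop runs inside numpy's compiled code.
    return float(np.dot(values, values))


if __name__ == "__main__":
    data = np.random.rand(1000000)
    loop_s = timeit.timeit(lambda: sum_squares_loop(data), number=3)
    vec_s = timeit.timeit(lambda: sum_squares_vectorized(data), number=3)
    print("loop: %.3fs  vectorized: %.3fs" % (loop_s, vec_s))
```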

For profiling, take a look at this: https://www.huyng.com/posts/python-performance-analysis plus the built-in cProfile module that does call-graph profiling. It's nothing amazing like vtune or whatever, but it gets the job done.

As far as architecture goes - the relative lack of typing structure in python definitely makes life easier in many ways and cuts down on boilerplate, but it also makes it much easier to shoot yourself in the foot like you say. One thing to keep in mind is that when writing data heavy code there is an additional layer of concerns in play. Stuff like the "shape" of your data. It sounds like if you're running into this, what you want is to make a big diagram / graph of how data is flowing through your code, each stage of computation, and what the data looks like before and after. When you understand this well, it's much easier to reason about the code.
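
One way to turn such a data-flow diagram into code is to check shape and dtype at each stage boundary, so a mismatch fails early and names the offending argument. This is only a sketch: `require_array` is a hypothetical helper, and `to_grayscale` is an invented example stage, not anything from OpenCV or numpy.

```python
import numpy as np


def require_array(name, arr, ndim, dtype):
    """Hypothetical guard: fail early and name the offending argument."""
    if not isinstance(arr, np.ndarray):
        raise TypeError("%s: expected ndarray, got %s"
                        % (name, type(arr).__name__))
    if arr.ndim != ndim:
        raise ValueError("%s: expected %d dimensions, got shape %s"
                         % (name, ndim, arr.shape))
    if arr.dtype != np.dtype(dtype):
        raise ValueError("%s: expected dtype %s, got %s"
                         % (name, np.dtype(dtype), arr.dtype))
    return arr


def to_grayscale(image):
    # Stage boundary: H x W x 3 uint8 in, H x W float32 out.
    require_array("image", image, ndim=3, dtype=np.uint8)
    gray = image.astype(np.float32).mean(axis=2)
    return require_array("gray", gray, ndim=2, dtype=np.float32)


if __name__ == "__main__":
    print(to_grayscale(np.zeros((4, 5, 3), dtype=np.uint8)).shape)
```

Keeping the checks in one helper means the invariants live at the stage boundaries instead of being re-checked ad hoc throughout the code.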

One thing that helps to know is that the style of programming in the python world is a bit different. In C++ land, the instinct is to start with building architecture, and when you inevitably realize that you've misunderstood the problem domain, you do a massive refactor. This usually constitutes large changes in how your code is structured but C++ gives you tools, like strong types, to do this rather easily. With python, unless you're setting off on a massive project, you start with minimal architecture. Write code that does the thing you need. Then, as you find yourself repeating yourself or doing unnecessary things, or dealing with complexity you don't need to know, only then do you start giving structure to your code bit by bit, as you need it. It's a different philosophy that's the result of a different set of functionalities.

In the larger context, I mostly have the opposite experience from you - I can pull out C++ whenever I need it, but I almost never do. The set of tools I have for computation and analysis in python is so much richer. For example, imagine c++ code for reading an excel sheet, and scraping some data off the web, cleaning both, joining them together, running a complex mathematical calculation, and saving a few plots. And I can do this from scratch in half an hour, around 100 lines of code. I also have an interactive shell in which I can explore the data and the libraries I have, and I'm not stuck in the code-compile-code-compile loop. The productivity boost is amazing. In contrast, I don't even want to imagine what it would be like to attempt to do this in C++. Python is great glue. Admittedly, you're pushing at the edges of what python is useful for. It's definitely possible (I've written computer vision stuff with python before, and it worked OK), but if you're not making use of any of these benefits of using python, you might as well use C++ and enjoy the speed and strong typing.

Hope this helps your python experience be a bit smoother!

While a lot of your points are valid, I'll say I haven't found anything better in the "dynamic" space. It's still better than MATLAB...

Also, do you use pandas, and if not, why not?

Does scikit-image have the algorithms you need? If so, I'd highly recommend it for image processing - the API and documentation are well done.

Try Julia. It's fast and it combines a dynamic language and types very nicely.

Re: speed. Do you suppose pypy might help?

Sounds like you need a good mentor.

I understand the notebook machinery itself is running on Python 3, but can I still launch a Python 2.7 kernel? I am still working with Python 2-only libraries at the moment.

Yes you can. Kernels are completely independent from the notebook itself.

It'd be really nice if they followed python3, and named it ipython3. This is just going to confuse install processes if I now have to install "ipython<6".

Please read the blog post (http://blog.jupyter.org/2017/04/19/release-of-ipython-6-0/): we went to great lengths to make sure that `pip install ipython` just installs the right IPython. `pip install ipython` will install 5.x on Python 2, so you don't have to worry about which versions are incompatible.

The `ipython3` command already exists regardless! (You should try it.) On my particular configuration:

  $ ipython3
  Python 3.5.2 |Anaconda 4.2.0 (64-bit)| (default, Jul  2 2016, 17:53:06) 
  Type "copyright", "credits" or "license" for more information.

  IPython 5.1.0 -- An enhanced Interactive Python.
  ?         -> Introduction and overview of IPython's features.
  %quickref -> Quick reference.
  help      -> Python's own help system.
  object?   -> Details about 'object', use 'object??' for extra details.
  Enabling tab completion
  In [1]:

In 3 years python 2.7 will be EOL (it already is in maintenance mode).

There's no good reason to prefix it that way.

That's not necessarily a Python thing, but a Linux distribution thing or whichever package manager or installer you are using. I like all my Python 3+ installs just called python, personally.

It is a Python thing, see https://www.python.org/dev/peps/pep-0394/. And changing the definition of `python` is just going to cause issues again once Python 4 eventually comes around.

It's unnecessary. You don't need to jump through those hoops.

Pip can handle all sorts of requirements including Python version, OS, etc. pip install ipython currently grabs ipython-5.3.0 on Python 2.7.

It does! But make sure to update pip itself (pip install --upgrade pip), because it only does the right thing as of pip 9.0. Some of the IPython team actually worked on the infrastructure (pip & pypi) to make this possible, to minimise the pain for installing on Python 2.
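
For anyone wondering how that works: each release declares which Python versions it supports (the `python_requires` metadata), and a recent pip skips releases that exclude the running interpreter. The following is a toy sketch of that selection logic, not pip's actual code. The release table is a simplified assumption (5.3.0 accepting 2.7 and later, 6.0.0 needing 3.3 and later), and real version specifiers are richer than a single minimum.

```python
import sys

# Hypothetical release table: release version -> minimum supported Python.
RELEASES = {
    (5, 3, 0): (2, 7),
    (6, 0, 0): (3, 3),
}

def pick_release(python_version, releases=RELEASES):
    """Return the newest release whose minimum Python is satisfied."""
    compatible = [
        release for release, min_python in releases.items()
        if tuple(python_version[:2]) >= min_python
    ]
    return max(compatible) if compatible else None

chosen = pick_release(sys.version_info)
print("would install:", ".".join(map(str, chosen)) if chosen else "nothing")
```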

I hate that. For example many IDEs look for pylint, not pylint3 or pylint-3 ...

The one without a version number should be a symlink instead. RPM and DEB support that.

This recommendation should really be updated IMO. `python3` will never see full adoption as long as `python` is reserved for Python 2. I understand the need as a compatibility thing, but at some point the forward-looking cost of distinguishing `python3` exceeds the legacy cost of applications that just say `python` when they mean `python2`.

The best option (in my opinion) would be to add a simple way on the first line of a python file to express the language.

Then 'python' can read the file, and execute it under the right language.

The problem is if we ever switch 'python' to 'python3', then there will be 5 or so years when 'python' becomes unusable, as you won't know on a given machine if it's python2 or python3, and almost no program works on both.

> if we ever switch 'python' to 'python3'... you won't know on a given machine if it's python2 or python3

This kind of already happened. Arch already has Python 3 as 'python', and if you use any of the popular environment tools (virtualenv, pyenv, conda), whichever version of Python you pick when creating an environment will be 'python' inside that env.

It's not that big a problem, though. It's quite possible to write code that runs on both Python 2 and 3, and it's surprisingly rare (in my experience) to call unqualified 'python' rather than using an explicit path.
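
As a rough illustration of code that runs unchanged on both 2.7 and 3.x, the usual trick is to guard the handful of names that moved. The helper name `host_of` is made up for this example.

```python
import sys

try:
    # Python 3 location
    from urllib.parse import urlparse
except ImportError:
    # Python 2 fallback
    from urlparse import urlparse

def host_of(url):
    """Return the host part of a URL on either major version."""
    return urlparse(url).netloc

# print() with a single argument behaves the same on 2.7 and 3.x.
print("Python %d.%d: %s" % (
    sys.version_info[0],
    sys.version_info[1],
    host_of("http://blog.jupyter.org/2017/04/19/release-of-ipython-6-0/"),
))
```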

There's no need, if you use virtualenv (as you should) you'll get the right version.

I really like that I can install command line tools and then just reference: <virtual env>/bin/<tool name> and it will run it with the correct interpreter.

That's fine for me and you, but no good for just distributing a script. Then I want to be able to just say to people 'run this script', maybe explain 'chmod +x' to them at best. Every extra step is a pain, and virtual env is a major pain.

I'm not sure what you are talking about. If you use entry_points in setup.py, it will create the right entries, set proper permissions and ensure the right python is run.

If you use setuptools correctly it is fairly easy to install and run packages e.g.:

  venv-3.5 my_env
  my_env/bin/pip install mypackage

or, if you don't want to publish it to PyPI:

  venv-3.5 my_env
  my_env/bin/pip install mypackage.whl
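
For reference, the `entry_points` mechanism mentioned above looks roughly like this in a `setup.py`. Everything named here (`mytool`, `mypackage.cli:main`, the version number) is a placeholder, and the keyword arguments are kept in a dict so the sketch can be read and run without triggering an install.

```python
# Hypothetical setup.py arguments; in a real setup.py you would call
# setuptools.setup(**SETUP_KWARGS) at the bottom of the file.
SETUP_KWARGS = {
    "name": "mypackage",
    "version": "0.1.0",
    "packages": ["mypackage"],
    "entry_points": {
        # Installs a `mytool` command that calls mypackage.cli.main()
        # using whichever interpreter the package was installed into.
        "console_scripts": ["mytool = mypackage.cli:main"],
    },
}

print(SETUP_KWARGS["entry_points"]["console_scripts"][0])
```

After `my_env/bin/pip install mypackage`, the generated command would live at `my_env/bin/mytool`.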

I'm talking about ease of use for non-experts.

I sometimes send people python scripts, I can just tell them to run it from the command line. I can assume any linux or Mac OS X distro will have some version of python2 installed, so I can just say "run 'python myscript.py'".

If I give them your instructions, well, on my computer I don't have venv-3.5, or venv. I'm sure they won't either. Standard Ubuntu installs don't have virtualenv or pip either, so non-root users have pain installing any of these things.

OK, so you're talking about simple scripts that are contained in a single file. In that case it's possible to simply write a script that works on all versions. I have written many scripts like this (OK, not all versions, but 2.6-3.6); it's really on the developer.

I mean, even within a single major version there are incompatible changes. For example, 2.6 doesn't have OrderedDict or set comprehensions; 3.3 doesn't have asyncio; 3.4 doesn't have async/await; 3.5 doesn't have type annotations for variables; and so on.

So having python2/python3 doesn't really benefit much especially that at this point everyone is moving away from 2.7.

Edit: virtualenv functionality has been built into Python (as the venv module) since 3.3.
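
Those per-version differences are usually handled by probing for the feature rather than checking only the major version. A small sketch follows; falling back to a plain `dict` is a simplification chosen for the example and loses ordering on old interpreters.

```python
import sys

try:
    from collections import OrderedDict  # available from 2.7 / 3.1
except ImportError:
    OrderedDict = dict  # simplification: no ordering guarantee on 2.6

# Gate optional functionality on the running interpreter's version.
HAS_ASYNC_AWAIT = sys.version_info >= (3, 5)

def make_settings(pairs):
    """Build a mapping from (key, value) pairs, ordered where supported."""
    return OrderedDict(pairs)

settings = make_settings([("b", 1), ("a", 2)])
print(list(settings.keys()), "async/await available:", HAS_ASYNC_AWAIT)
```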

What's wrong with the standard shebang?

#!/usr/bin/env python3

Python won't read and "execute" that line. The shell does that. So...

    ./script_in_indeterminate_version.py

would work, but

    python script_in_indeterminate_version.py

would not.

Yes, that's how I usually do it, although most of the time I'm on Windows. Is there a benefit of executing python directly?

Most obviously, calling the script as an executable requires you to have execute permissions on it, whereas calling python requires read permissions on the script.

More generally, the whole subthread is about the behavior of the command 'python'. Responding to the complaint "I want 'python' to have behavior X" with "it doesn't, so do something different" isn't really appropriate to a discussion of how 'python' should behave.
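
The permission difference is easy to check from Python itself. This is a sketch for POSIX systems (on Windows `os.access` with `X_OK` behaves differently), and the temporary file merely stands in for a script.

```python
import os
import stat
import tempfile

def can_run(path):
    """Report how a script could be launched given its permissions."""
    return {
        # Needed for `python script.py`
        "via_interpreter": os.access(path, os.R_OK),
        # Needed for `./script.py`
        "directly": os.access(path, os.X_OK),
    }

with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write("#!/usr/bin/env python3\nprint('hi')\n")
    script_path = handle.name

os.chmod(script_path, stat.S_IRUSR | stat.S_IWUSR)  # rw-------
before = can_run(script_path)
os.chmod(script_path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR)  # rwx------
after = can_run(script_path)
os.remove(script_path)

print("without +x:", before)
print("with +x:   ", after)
```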

Why duplicate functionality already available in the shell though? Maybe I'm missing something, but if you already have write permission on the script and the permission to run python, you might as well make the script executable.

They did duplicate this functionality in the Python launcher on Windows [1], but that's 'py', not 'python'. There might be good reasons for not replacing the regular python binary with a launcher on Linux, I don't know.

[1] https://docs.python.org/3/using/windows.html#from-a-script

> Why duplicate functionality already available in the shell though?

Well, the only reason that functionality is available in the shell is that python is capable of processing python scripts. The shell isn't. So this question doesn't make sense to me.

#!/usr/bin/env python3 heading an executable script.py means "when I execute this file, I really mean 'run the command /usr/bin/env python3 script.py'", and that in turn means "run the command 'python3 script.py'".

And running scripts by invoking python yourself is absolutely standard. Consider "python setup.py install" or, for django, "python manage.py shell".

The functionality is available in the shell to let you execute scripts without having to know or care what language or version something is written in. That's taken care of by the script author who puts the appropriate shebang in there instead.

I'm not suggesting that the shell would execute Python scripts, I'm suggesting that the shell can be used to dispatch the script to the correct Python version. Just like all the other scripting languages. You seem familiar with the concept, so I don't get what's confusing about it.

I agree that 'python' should read the shebang when available. At the moment PEP 394 says it should mean 'python2', though that is subject to change in the future. But it's easy to solve the problem yourself: either invoke scripts directly with the ./script.py syntax, or write a script that reads the shebang and uses the right version, then symlink that as 'python' instead.
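
The "script that reads the shebang" idea can be sketched in a few lines. This is a simplified illustration, not a drop-in replacement for `python`: the names `interpreter_for` and `main` are made up, the `python2` fallback simply follows the PEP 394 behaviour described above, and only the `#!/usr/bin/env NAME` and `#!/path/to/NAME` forms are handled.

```python
import os
import shlex

def interpreter_for(script_path, default="python2"):
    """Return the command prefix named by the script's shebang line."""
    with open(script_path, "rb") as handle:
        first_line = handle.readline().decode("utf-8", "replace").strip()
    if not first_line.startswith("#!"):
        return [default]
    parts = shlex.split(first_line[2:])
    if not parts:
        return [default]
    # `#!/usr/bin/env python3` -> ["python3"]; drop the env wrapper.
    if os.path.basename(parts[0]) == "env" and len(parts) > 1:
        return parts[1:]
    return parts

def main(argv):
    """Re-run argv[1:] under the interpreter its shebang asks for."""
    command = interpreter_for(argv[1]) + argv[1:]
    # Replace this process with the chosen interpreter.
    os.execvp(command[0], command)
```

A wrapper installed as 'python' would call `main(sys.argv)`; it is left uncalled here so the sketch can be imported and tried safely.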

I may be wrong, but this is simply saying that jupyter (the notebook interface) requires python 3, not IPython (the backend interface), which should still support python 2.

Read the blog post (http://blog.jupyter.org/2017/04/19/release-of-ipython-6-0/): it's IPython that is dropping support for 2.7, not the Jupyter notebook (yet).

Wow, I thought that this would never happen.

I personally think they should have just declared Python 3 to be a new language, inspired by python 2, like many other competing new-pythons.

Instead, they didn't play fair, and gave themselves an unfair advantage.

Damn that Python team, giving themselves an unfair advantage in developing Python.

Seriously, what am I reading here?

Maybe this is a joke, and then I'd say it's a pretty neat one. I mean, just look at that: what other language has had a problem of that scale? And let's not pretend it has all passed and we are happily moving forward: this year (this month, actually) I've seen very prominent NN-course, suggesting Python 2.7 for all assignments and code samples, and quite useful library (also NN-related), supporting 2.7 only. I guess some of mitsuhiko's (Armin Ronacher's) projects haven't moved to 3.* either.

So, yeah, IPython dropping 2.7 is pretty huge. Almost like moving to a different language.

> I've seen very prominent NN-course, suggesting Python 2.7 for all assignments and code samples, and quite useful library (also NN-related), supporting 2.7 only.

Oh, that's plain silly. Isn't the whole point of Python 2.7 to facilitate the transition to Python 3?

Silly or not, it's still a very common mindset among scientifically oriented people (who care about ML, econometrics, statistics and such much more than "programming" per se), who form quite a notable category of Python users. Actually, they are the reason I'm still using Python a lot (although I try to stick to 3.x), because for other use cases, like scripting or web dev, I have mostly (but not completely) moved to other languages.

What defines python? You purposely ignored the distinctions I clarified in my post.

Let's say I developed a python-like language called "schlython", can I call on the python-2 community to convert all "old" python-2 code into my competing language?

Let's say you developed a Python-like language called Python 2.6, can you call on the Python-2.5 community to convert all old code? Come on, nobody was forced at gunpoint to port to Python 3, that's why there are still abandoned Python 2 projects all over the Cheese Shop.

What conversions are needed for the py2.6 code?

> nobody was forced at gunpoint

The caretakers of py2 are the promoters of py3. Support was dropped for py2 without any serious effort to find new caretakers, so that the demise of py2 would promote switching.

Enough with the conspiracy theories already. There was no reason to find "new caretakers" for Python 2.7 because it's simply an old version of the language.

Reading through your posts here, I don't know what your problem is with the Python team, but please do get over it. It's incredibly annoying dealing with someone who just seems to think they're entitled to more than a decade of support for a version of a free product.

What conspiracy theory - Who decides that py2 needs to be killed off?

> but please do get over it

This is an ad hom - please make your posts more substantial, it feels like you are the one with a chip on your shoulder if you can't respond to these points. Your original comment added nothing to the thread but snark.

There will be many Python 2 production codebases that will forever remain at Python 2. Whether Python 3 is a new language or not will not make a difference.

I really don't understand why. Python 3 is not a radical departure and the usual complaint is the dependency graph. Over the last 18 months that has really healed up, to the point where the Wall of Superpowers is almost 100% green.

I understand that some people want to keep old code running as long as humanly possible, which is their prerogative. But there's no reason to imply that conversion to Py3 is unduly difficult or something that shouldn't be undertaken, even for large/complex codebases. If Google can write grumpy to transpile Python code to Go, there's no reason it can't improve 2to3 to handle their incompatibilities.

I think the main issue is library compatibility, which gives people the impression that compatibility is an unreachable target.

The truth is, once most libraries are 3.x compatible, porting is very easy (as easy as a 2.x -> 2.y transition). And now, we've reached that point. People are starting to catch up.

There are some exceptions of course, those who heavily rely on 2.x unicode behaviour and such, but all in all they are rare. So now, it's much easier than it used to be.

> as easy as a 2.x -> 2.y transition

You mean I can just run all my python2 programs with python3 runtime with no changes at all. Because that was how every upgrade before python3 went.

This is simply not true. For example, Python 2.6 deprecated string exceptions. If you expect in any language to write code and have it run forever, you expect that language never to change.
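
For anyone who never met them: a string exception was `raise "some message"`, with a plain string instead of an exception instance. On any current interpreter that fails with a `TypeError`, as this small check shows (the helper name is made up for the example).

```python
def raise_string_exception():
    """Try the old-style `raise "message"` and report what happens."""
    try:
        raise "something went wrong"  # a str, not an exception instance
    except TypeError as error:
        return "TypeError: %s" % error

print(raise_string_exception())
```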

Yep. It's not Python-exclusive either. I can't upgrade to Ruby 2.4 yet because one of my dependencies hasn't been updated.

Breaking changes are a routine thing in any maintained language; heck, even a conservative project like GCC breaks code between major releases.

Were string exceptions deprecated in 2.6 because they don't exist in py3?

String exceptions don't exist in py3 because they were deprecated in 2.6.

Nearly all my code co-runs on 2.7 / 3.6 with almost no hacks. So, yes.

And your code is all code?


The code that doesn't run on Python 2+3 only runs on Python 3. Have at it.

What is your point here?

What is yours? You're the one asking.

Asking what?

eikenberry asked: "can just run all my python2 programs with python3 runtime with no changes at all"

You replied: "Nearly all my code co-runs on 2.7 / 3.6 with almost no hacks. So, yes.", implying that, yes, you can run py2 programs without change, because this is true of your code

To which I replied "And your code is all code?"; Meaning, just because this is true of your code, doesn't mean it is true of eikenberry's code, or any/all code.

You then responded with a project that apparently runs fully on py3, but only partially on py2 - what is the point you are making by posting that project?

I was going to say "because hosted solutions don't support Python 3" but I see that two days ago Amazon announced Lambda support for Python 3, which was my big holdup.

So I guess it's time for me to get on board with Python3!

What is the largest Python 2 codebase you can point at which has gone Python 3?


"I really don't understand why."

Because you are suffering from the exponential growth mindset, in which the next new thing is as important as the sum total of all things that came before it. In that mindset, the future is emphasized and the past is heavily discounted.


But that mindset is not shared by most firms in most industries, it's a unique pathology to the web.

Businesses don't re-write working code just because a newer language version is available. Landlords don't tear down old apartment buildings just because more efficient building technologies are available. We live in a world in which COBOL runs a lot of mission critical code.

Anything that touches important code that is now working is a risk and an expense, and what is the tangible gain for undertaking this expense? What new features will be added? How much more revenue will you get? What is the opportunity cost of having your engineers do that than something else, like adding a feature or improving test coverage?

Before you say that it's simple, be aware that you are talking about messing with libraries that may not be maintained anymore. You are going to spend a lot of time debugging those old libraries as well as writing unit tests for them. Nothing in the world of software engineering is simple, especially when it comes to maintaining a large body of scripts.

Imagine if in the Java world, it was announced that the Java 8 runtime would not support running Java 6 or older jars. How many jars are businesses running that were written in 2005? No one even knows who wrote those libraries.

Suppose in the C world, it was announced that code written in C99 and before would no longer compile. The Linux kernel has code written in the 90s, and GCC has code written in the 70s and it's still supposed to compile under the most modern compiler.

Moreover automated code re-writing tools don't come with guarantees of soundness or accuracy. There will be breakage, it will occur in random places, and there is zero upside to spending a lot of money to get an existing project to the same level of functionality as it had before you started messing with it.

People who work in other languages get this. It's really not a difficult concept. Everyone other than the Python community agrees that Python did a massive screw up with python 3, just as the Perl community committed suicide by making Perl 6 not be compatible with Perl 5.

I don't pretend to know what the future of Python will be -- maybe there aren't enough enterprise users out there to make a difference for the direction of the language -- but you do need to understand why breaking existing code is a deal breaker for the majority of business customers. You may disagree, but at least don't pretend that there is some irrational mysterious resistance to converting python2 code to python3.

> making Perl 6 not be compatible with Perl 5.

Imo that's either inaccurate or misleading.

In Perl 6, `use` followed by a module name followed by a `:from` adverb leads Perl 6 to load the module, initialize it, and import its public functions, constants, variables, etc., AUTOMATICALLY BINDING THEM TO PERL 6, for any given supported other language the module is written in, provided someone has installed a suitable loader/binder (and the user has installed that plus the other language's interpreter, and the other language's module(s) that they wish to use).

I'll start with a toy example.

With a suitable "Perl 5" loader/binder in place, one could write the following and it would work as expected provided the user had installed the loader/binder, a compatible Perl 5 interpreter, and the Business::ISBN module (from https://metacpan.org/pod/Business::ISBN):

    use Business::ISBN:from<Perl5>;
    my $isbn = Business::ISBN.new( '9781491954324' );
    say $isbn.as_isbn10.as_string;

Stefan Seifert has written just such a Perl 5 loader/binder and steadily improved it over the last few years to the point it's a serious solution.

With Stefan's loader/binder, not only do toy examples like the above work, so do non-toy examples like writing controllers for the Catalyst MVC web framework even though Catalyst is large and complex, written in Perl 5 with XS (C lib) extensions, written without regard to Perl 6, and even though a controller has to be a subclass of a Perl 5 class provided by Catalyst.

This argument is disingenuous.

The same feature is available in any language with a decent FFI (must support loading an arbitrary shared library, marshalling/demarshalling data between languages, and invoking functions in the shared library).

This is possible in Python, Ruby, Java, Racket, Rust, Perl, Go, and plenty of other languages I'm forgetting or haven't used. Yet who would seriously argue that "Python is compatible with Perl" because you can embed libperl in a Python program?

You might as well argue that "Rakudo is compatible with libssl", for all the relationship Rakudo source code has with Perl source code.
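
To make the FFI point concrete, here is the generic mechanism from Python's side using the standard `ctypes` module: load a shared library, declare a function's signature, call it. The C library's `strlen` is used purely as a stand-in, and the sketch assumes a POSIX system where the C library can be loaded.

```python
import ctypes
import ctypes.util

# Locate and load the C library; if it cannot be located, CDLL(None)
# falls back to symbols already loaded into the process on POSIX.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the signature so arguments and results are marshalled correctly.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

def c_strlen(data):
    """Return the length of a bytes object as computed by C's strlen."""
    return libc.strlen(data)

print(c_strlen(b"Rakudo"))
```

Nothing here makes Python "compatible with C" in the source-code sense, which is the distinction being argued above.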
