Why Is Python Growing So Quickly? (stackoverflow.blog)
254 points by lainon 9 months ago | 267 comments

> Why Is Python Growing So Quickly?

Really easy to answer: It is stupid fun to program with Python. There are libraries for everything you can imagine and they are generally very easy to use. Once you grok the virtual environment thing, writing apps/programs/scripts is just a matter of creating a new env and installing the libs you need. Testing ideas with Jupyter notebook is fun, fast and rewarding. Pycharm is awesome. VSCode with Python just works. You can automate tons of boring stuff (Thanks Al!)... and the list goes on... and on...

You're listing off tooling, libraries, and frameworks written with or targeting Python, which isn't Python itself. As soon as you want to go off the rails in any of those things, Python gets as complicated as any other language.

My take on the matter is that Python is gaining traction due to its market share of the Data Science/Machine Learning field, which is what's really gaining traction and growing quickly. Had Perl or Ruby been the premier language to use for DS/ML, they would be growing instead of Python.

Come on, we all know a language and its library ecosystem go hand in hand. Trying to separate them from the standpoint of adoption is pointless. TFA comes to its conclusion by looking at libraries. In your hypothetical where Perl or Ruby take Python's place, that would manifest as Perl or Ruby having better data science libraries. The causality would probably be just as cyclical as it is for any "best language/library for task X" question.

This. Most schools are moving away from C or Java being their primary language. Most comp sci professors acknowledge the usefulness of a scripting language and so when they need to pick one, they pick one they have the most familiarity with. Due to numpy and scipy this tends to be Python over Ruby.

Also many things about python that I don't like such as `len` being a function rather than a method on a list are very trivial when coming from a language like C which has hordes of tiny inconsistencies.

The whitespace thing in python is pretty gimmicky but does also hold a certain amount of pedantic appeal.

The Python design FAQ explains why `len` is a built-in function instead of a method. Guido wanted a guarantee that whenever he called it, it'd return a whole number. As just another method, you might find a new type that defines a length in fractional terms, which would break any code wanting to use that as a list index, etc.
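A small sketch of that guarantee in practice: `len()` checks whatever `__len__` returns. The `FractionalLength` and `NegativeLength` classes are hypothetical, made up only to trigger the checks.

```python
class FractionalLength:
    """Hypothetical container that tries to report a fractional length."""

    def __len__(self):
        return 2.5


class NegativeLength:
    """Hypothetical container that tries to report a negative length."""

    def __len__(self):
        return -1


def try_len(obj):
    """Return len(obj), or the name of the exception that len() raised."""
    try:
        return len(obj)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


print(try_len([1, 2, 3]))            # an ordinary list
print(try_len(FractionalLength()))   # rejected: not an integer
print(try_len(NegativeLength()))     # rejected: negative
```

A plain method call such as `obj.__len__()` would hand back `2.5` without complaint; it is the built-in that enforces the contract.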

But that would be on you for misusing the length method. All the built-in and popular libraries would still define length properly.

Perhaps, but "your" mistake might impede the popularity of the language. A wide range of well-designed community libraries are a major reason to choose Python.

Take a look at the design rationale for the new __matmul__ magic method. It's not for the standard library, it's for the community.
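For anyone who has not used it, here is a minimal sketch of the `@` hook. The `Matrix2x2` class is hypothetical and exists only to show how a community library could opt in; real code would use NumPy or similar.

```python
class Matrix2x2:
    """Hypothetical 2x2 matrix, only here to show the @ operator hook."""

    def __init__(self, a, b, c, d):
        self.rows = ((a, b), (c, d))

    def __matmul__(self, other):
        # Called for `self @ other` (PEP 465, Python 3.5+).
        (a, b), (c, d) = self.rows
        (e, f), (g, h) = other.rows
        return Matrix2x2(a * e + b * g, a * f + b * h,
                         c * e + d * g, c * f + d * h)


identity = Matrix2x2(1, 0, 0, 1)
m = Matrix2x2(1, 2, 3, 4)
product = m @ identity
print(product.rows)
```

None of the built-in types implement `@`, which fits the point that the operator was added for third-party numeric code.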

The need to do that reveals a couple of weaknesses (or more charitably, tradeoffs) of Python: lack of type constraints, and (related) lack of well-defined, invariant interfaces on objects.

Trade-offs indeed. Magic methods indicate the use of interfaces defined by traits rather than formal types. Duck Typing. It's flexible.

But you can also see that Guido has been into types lately. He's been pushing the development of mypy and the new optional type system.

Ruby is just used rarely outside RoR, realistically speaking.

Actually, you should qualify that a bit more. Ruby is rarely used outside of Ruby on Rails in North America. In Japan, it is widely used and Ruby on Rails is far less common.

What is Ruby used for in Japan?

Ruby is a very versatile language in its own right. While the other poster is correct that it is used for anything that Perl and Python are used for, it's also taking a large chunk from the Java and C# crowd in Japan. Everything from server-side applications to command-line tools and even GUI-based desktop applications can be written in Ruby. Using FFI I have made some quite amazing applications with Ruby.

Rails is amazing and I personally love it. But it's not the only thing Ruby is good for.

Probably anything that Perl and Python are used for.

Well, true, you have a point. Ruby is rarely used outside of Rails. Which is a shame, really.

Go to Japan.

Whitespace is no issue with a good editor (Sublime, vim, emacs...).

I dislike the many inconsistencies in Python. You mentioned one. Others: sort vs sorted (destructive vs non destructive). Also: not being able to chain methods because many don't return a value. (Scheme, Elixir or even Ruby are imho much more elegant.)

But all in all Python is a very useful programming language with a rich ecosystem.

The Python design FAQ explains that .append and .sort return None to remind you they are mutation methods. I find this more elegant than Ruby, etc., where many things look like pure functions but aren't.
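A short sketch of the behavior being discussed, using only built-in list operations:

```python
data = [3, 1, 2]

# Mutating methods return None, so they cannot be chained.
result = data.append(7)
print(result)          # None
print(data)            # the list itself was changed in place

# sorted() is the non-destructive counterpart: it returns a new list.
ordered = sorted(data)
print(ordered)
print(data)            # unchanged by sorted()

# list.sort() sorts in place and, again, returns None.
print(data.sort())
print(data)
```

Something like `data.append(7).append(3)` fails because the first call evaluates to `None`, which has no `append`.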


I know the reasons why and how mutating methods work in Python.

Nevertheless I find this here much more elegant:

    data.sort.tail.first
    # or this
    data.append(7).append(3)

More elegant than this:


data.append(7) # and then


I love method chaining in Elixir or Javascript. In Python I have to create temp variables here and there to capture / further process intermediate results.

Having said that: Python has its strengths. I use it more than any other programming language.

I understand the preference and the difficulty created by methods and operators that are essentially statements instead of expressions. However, when making that criticism we should also acknowledge the benefit of requiring or encouraging multiple lines for those actions.

    if (x=42) {}
Has burned many people. Append having a useful return is comparable, though probably not as much of a trap.
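For comparison, Python refuses the equivalent construct outright. A minimal sketch that asks the compiler directly (the `compiles` helper is made up for this illustration):

```python
def compiles(source):
    """Return True if `source` is syntactically valid Python."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


print(compiles("if x == 42:\n    pass"))  # comparison: accepted
print(compiles("if x = 42:\n    pass"))   # assignment statement: rejected
```

Python 3.8 later added `:=` (PEP 572) for assignment inside expressions, but it is a distinct operator, so a mistyped `==` still cannot silently become an assignment.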

> sorted(data)[1:][0]

I don't want to get into the weeds, but shouldn't that just be

Or if you like unpacking for clarity

    first, second = sorted(data)[:2]
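A quick check, with made-up sample data, that the slice-then-index form and a direct index pick the same element, and what the unpacking version binds:

```python
data = [5, 3, 8, 1]

# Slicing and then indexing...
via_slice = sorted(data)[1:][0]
# ...picks the same element as indexing directly.
via_index = sorted(data)[1]

# Unpacking names the two smallest values explicitly.
first, second = sorted(data)[:2]

print(via_slice, via_index, first, second)
```

The unpacking form raises `ValueError` when `data` has fewer than two elements, whereas the index forms raise `IndexError`.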


> Had Perl or Ruby been the premier language to use for DS/ML, they would be growing instead of Python.

Well, Perl was the premier data science language for a while. Python made inroads and eventually took over. I think you're mostly right though; it's just a matter of other factors affecting the communities at the time, causing Perl to not grow nearly as quickly as Python, and Python getting a good numeric library. Had Perl not lost a lot of community in the early 2000s, it might have retained this area.

Perl just lost the plot. I think BioPerl was ahead of the game before the Python scientific libraries stole the show, so they only have themselves to blame. Perl's successor Ruby seems to be headed in the same direction by failing to diversify, which is a shame because Ruby is a much better language than Python.

You still have to explain why python is the premier language for that.

Libraries. And to build them you need certain language features. 2 big ones: complex numbers and operator overloading. The lack of these meant that everyone doing simulation and data analysis skipped Java altogether.

The reason C is so popular is because it's the scripting language of Unix. C++ is the scripting language of Windows. Objective-C, Macs and iOS.

Well, Python is the scripting language of a ton of useful libraries which are relevant to a lot of very interesting fields.

R sits in a similar spot, but Python is a better, friendlier language, which means actual programmers are more apt to recommend it as an offhand solution to a simple problem someone who isn't seen as a programmer is having: "Oh, that's easy. Just copy and paste this Python code, and maybe hack at it until it does something." You can get scientists programming with recommendations like that.

"It's pseudo-code that works"

This. I took the habit of writing very Python-like pseudocode while taking lecture notes, and having it run almost verbatim at home was a blast.

Honestly, if other languages had the ability to use words like and, or, not, foreach <thing> in <thing>, it would make learning to code that much more accessible. Even a language like C could do this. I understand that obviously you're not going to change something like a bit-wise (>>/<<) operator into something that flat-out says what it does (drop bit right/left), and I appreciate a good use of a ternary operator, but you'd think it would be fairly trivial to de-codify some of the more basic operations.
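To make the "words instead of symbols" point concrete, here is a small sketch in Python. The `is_prime` helper and the sample numbers are made up for illustration.

```python
def is_prime(n):
    """Trial division, written close to how one might jot it in notes."""
    if n < 2:
        return False
    return not any(n % d == 0 for d in range(2, int(n ** 0.5) + 1))


numbers = [2, 9, 11, 15, 17]
odd_primes = []
for n in numbers:
    if is_prime(n) and not n % 2 == 0:
        odd_primes.append(n)

print(odd_primes)
```

The control flow reads as `for ... in`, `if ... and not ...`, with no `&&`, `!`, or index bookkeeping.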

Python has super-easy bindings to C too; that's another reason it's popular in scientific computing.

And because it has bindings to C, it has bindings to a lot of other languages traditionally compiled to machine language, such as Fortran and C++, making SciPy possible, because that optionally depends on LAPACK, which is written in Fortran.


Of course, the people who just install and run SciPy don't need to know that, but the fact Python has a nice C FFI means it can run literally decades' worth of software across a fair number of languages.
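As a minimal sketch of what that FFI looks like from the Python side, the standard-library `ctypes` module can call straight into the C library. This assumes a POSIX system (Linux or macOS); the choice of `strlen` is just an illustration.

```python
import ctypes
import ctypes.util

# Locate and load the C standard library. find_library may return None on
# some platforms; on POSIX, CDLL(None) then exposes the symbols already
# loaded into the current process, which include libc.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
print(length)
```

Libraries such as SciPy use compiled extension modules rather than `ctypes`, but the underlying idea is the same: Python objects are converted at the boundary and the heavy lifting runs in native code.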

I'd be using lua if it had python's library breadth. Or maybe julia. Not python.

I have come to the conclusion that part of the reason python is so fun is because it is slow.

There are so many lovely and easy to use libraries in Python. I took a couple and tried to be inspired by them to write a C++ library, maybe for addition to boost.

However, if you are writing C++, people expect your library to be fast and memory efficient -- you can't go around sticking the entirety of a parsed file in a std::string. Also, people then start saying your code had better interoperate well with their templated classes, and they want to replace the memory management entirely, and it goes on...

Of course, the resulting libraries are very fast and very clever, but they have lost the joy. They also take much longer to write, maintain and use.

Watch out for conflating abstraction level with speed. Coding at a low level of abstraction means sweating the small stuff, but it's easy to fool yourself into thinking you're making things fast - when actually, you're micro-managing.

Painting with broader brushstrokes with less concern for micro-efficiencies lets you block out a solution faster. Another part of the problem with C++ is that the components come with so many rough edges that need to be filed down that you can't help getting swamped in a certain level of detail, unless you start with a large toolbox or framework that lets you work back at a higher level again.

You ought to have the equivalent of Java's file stream, buffer and text decoder as three separate bits that you can compose, so that putting a whole file in a string is never even a temptation. But if you start out with a sparse toolbox, it's a slog.
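Python's standard `io` module happens to follow the same three-layer shape, so here is a sketch of that composition in Python rather than Java or C++. An in-memory `BytesIO` stands in for a real file so the example is self-contained.

```python
import io

# Layer 1: a raw byte source (stand-in for an opened file).
raw = io.BytesIO("caf\u00e9 au lait\nsecond line\n".encode("utf-8"))
# Layer 2: buffering, independent of where the bytes come from.
buffered = io.BufferedReader(raw, buffer_size=4096)
# Layer 3: text decoding, independent of the buffering strategy.
text = io.TextIOWrapper(buffered, encoding="utf-8")

# Iterating yields one decoded line at a time; the whole input is never
# required to sit in a single string.
line_lengths = [len(line.rstrip("\n")) for line in text]
print(line_lengths)
```

Each layer can be swapped without touching the others, for example a different encoding or a socket-backed byte source.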

On the other hand, there were times when I had to get a piece of paper and draw diagrams to construct a twenty-line sequence of numpy calls, when the same would have been achieved by several completely straightforward double (or triple) loops in C++. Writing the same loop in Python was out of the question because it would be too slow.

Performance grants you some freedom.

Use Numba (http://numba.pydata.org). Have your cake and eat it too.

One of my clients recently replaced a major chunk of their C++ code with Python and achieved a speedup from multi-hour to sub-millisecond. That's a big win.

Python encourages readable code, which enables smarter algorithms, which ultimately swamps all those little memory optimizations. Oh, and PyPy/Numba/Cython let you get those little optimizations, too.

In addition to that there are loads of libraries for Python which are written in C, which gives you speed in the right moments.

I think the readable code thing is really, really important, too. I can understand other people's code quicker and learn more about the actual algorithm, unlike in C++, where I wonder half of the time about syntactical nonsense and optimizations for certain compiler targets.

Then again, Python and C++ target very different groups and I like both languages for certain things.

> One of my clients recently replaced a major chunk of their C++ code with Python and achieved a speedup from multi-hour to sub-millisecond.

I'd really like to know what they were doing and how they could do it so wrong that the C++ version took hours while the Python version was faster than a blink of the eye.

Good question. I did have a situation where looping over a fairly large Pandas dataframe took 12 minutes, but using the underlying numpy values took only 3 seconds. The looping was required to do complicated row and column operations.

Similarly, maybe the C++ code was using an inefficient data structure for a large dataset compared to using something like Numpy.

It had to do with the number of dimensions and combinations that were being looped over. That kind of scale difference generally comes from combinatoric explosion -- exponential or high polynomial time vs linear or log-linear time.

The rewrite realized an approximation was reasonable in place of a more "correct" algorithm.

Note that I'm not saying Python was the cause, but the ease of reading and writing code may have helped the programmer come to this algorithm, instead of stressing about bugs and deadlines. Algorithm choice seems correlated with language choice, in my experience.

> One of my clients recently replaced a major chunk of their C++ code with Python and achieved a speedup from multi-hour to sub-millisecond.

Code rewrites are often more than a literal translation. They often eliminate old code, and allow you to rethink the architecture. So it's really not a surprise that Python code would be faster here. A rewrite to C++ would probably be much faster still.

Perhaps a C++ rewrite would be faster, but it depends on how many of the dependencies you revise to be optimal for this particular task and how much more clever you are than the NumPy authors and a jit-compiler.

Depending on how heavily the Python code leveraged Numpy.

Oops, typo: sub-second, not sub-millisecond.

It's a funny way of phrasing it, but you're not wrong.

C++ libraries should allow people to squeeze every bit of performance juice out of them (e.g. custom allocator).

Python libraries should be intuitive to use.


Naturally, there are exceptions, like numpy.

C++ may let you "squeeze every bit of performance," but in a lot of cases it does so by handing you a lot of rope.

You might use it to make a rope bridge across that performance gap, but how good your bridge is depends a lot on you - and if you just manage to hang yourself with it, well, that's on you as well.

With Python it may take longer to get across because you detour a mile over to a more solidly constructed bridge, but you still get there and for most cases you get there fast enough.

"you can't go around sticking the entirety of a parsed file in a std::string"

Sure you can. C++ is at its best when programmed like Python. Just throw the working first version together - and once you hit a performance gap, you can optimize it.

Using a profiler.

Premature optimization is mostly pointless because for most non-trivial performance-critical apps the bottleneck won't be where you expect it.

The faster you find the actual problematic hotspots with production content the better. The way to get there fast is to treat C++ like filthy python.
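The comment is about C++, where one would reach for a native profiler, but the measure-first workflow is the same everywhere. A minimal sketch of it in Python with the standard-library `cProfile`; the `slow_sum`, `fast_part`, and `main` functions are made up to give the profiler something to find.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive hotspot: builds a list just to sum it."""
    return sum([i * i for i in range(n)])


def fast_part():
    return 42


def main():
    fast_part()
    return slow_sum(200_000)


profiler = cProfile.Profile()
profiler.enable()
main()
profiler.disable()

# Sort by cumulative time and show only the top few entries.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

The report points at `slow_sum` rather than `fast_part`, which is the kind of evidence worth having before optimizing anything.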

Don't trust anyone trying to goad you into premature optimization unless they understand the specific problem you are trying to solve, and only then with skepticism.

Well, in my own code I would, but you would never get some file handling / parsing library into boost, or the standard, that did that. And then (another C++ issue), installing libraries can be fiddly.

"but you would never get some file handling / parsing library into boost, or the standard, that did that."

Why would anyone want to write code that would go into either of those? In the general sense, I mean.

Good production code is simple and is preferably implemented verbosely and understandably rather than cleverly.

Speaking of good software patterns, I'm not sure either boost or STL is a great example of good software architecture. Boost especially is very templatey-bloatey. The biggest merit of STL is that it exists.

"installing libraries can be fiddly."

Does anyone really need to do that? Install C++ specific libraries, I mean. In 10+ years I've never developed C++ software that would not store all of its dependencies as part of the project hierarchy or sub-structure, except for the standard library.

The worst part of C++ is that it has these warts that subtly guide developers into copying all sorts of patterns for their own sake because they are seen "as the standard way to do it". Object orientation and templates should not be considered the basic building blocks of the architecture of a C++ program, but rather escape valves when some usage pattern would end up more complex without them.

For example, when someone spends long production time creating an elegant template based solution for whatever, generally it is an indication that something is wrong somewhere. There are of course exceptions to this rule, but just saving some boilerplate is never a good reason to do it.

> The biggest merit of STL is that it exists.

The STL is a library of common algorithms and data structures that are decoupled from one another, and the complexity of operations in the STL has been enshrined into the C++ standard. These two achievements alone warrant granting it huge respect. Imho, it's the essence of great design and it should be the very first thing taught to programmers new to C++.

> copying all sorts of patterns for their own sake because they are seen "as the standard way to do it".

Yeah, Python programmers would never do anything like that. If they did they might all run around calling it being 'Pythonic' or something.

> Object orientation and templates should not be considered the basic building blocks of the architecture of a C++ program

Just tools in the toolbox, and the amount of mileage you can get out of just these two features, along with other staples like RAII and function overloading is pretty astounding.

Part of it is just that Python has an awesome, supportive community. Sounds like the C++ community is kinda critical and demanding, in your experience.

> python is so fun is because it is slow

C++ pushes you to conflate optimization with the higher level meaning of what you're trying to implement. In Python you can write the higher level meaning in a more 'pure' way, which is fun of course. The really interesting question is if you can program at the higher level while still running fast, perhaps by specifying optimizations separately?

(BTW, for Python code that can be as fast as C++ see https://kratzert.github.io/2017/09/12/introduction-to-the-nu... - it doesn't sacrifice clarity, but is limited to a subset, for now).

Part of the reason is same side network effect. It makes sense to create new libraries in languages having great existing libraries.

Julia is fast and more fun than Python.

> It is stupid fun to program with Python

Around 2009 I started questioning if I even enjoyed coding anymore. I was primarily using PHP and Java. Rather than quit the profession I tried Python and suddenly it was fun again. 8 years later, it still is. Python just spoke my language.

My career is based around using Python because it's the way I enjoy the work. Everything else, like the actual properties of Python relative to other languages, is irrelevant to me if using another language would make the work unenjoyable again. (Though if Python didn't measure up very well compared to other languages it wouldn't have become my go-to in the first place.)

Same thing here. I worked with PHP and Java apps and it was just pain having to deal with the awkwardness of PHP's designs and Java's boilerplate. I actually went into sysadmin roles because of that. Python gave me hope that there's fun development roles to be had.

Nowadays the panorama for good languages, thankfully, is much better. PHP is no longer that much of a dumpster fire with modern frameworks, and Java has decided to adopt better defaults and drop the XML madness. Not to mention all the good languages that have come up (Elixir, Clojure, Rust, etc).

Try Clojure and I promise you will have even more fun.

Maybe you should try F#, to me both are kind of fun compatible (plus the functional focus that is kind of fun by itself).

Are you suggesting that we ignore the article's data-driven conclusion that it is because of pandas and the rise of data science and machine learning?

No, they're explaining why Pandas was written in Python and why data science and machine learning projects choose Python over R or Java or Haskell, etc.

That is exactly what I am thinking myself.

I think you're reading the article wrong—pandas is a symptom, not a cause.

That said, it seems foolish to ignore the clear data science use case. But that's also been a theme for at least five years now.

This is an honest question: what's the size of projects you have written in Python? For me, once a project gets beyond some size, the aggravation of debugging old code and adding new only grows. Mostly, I see this as a lack of compilation and types in Python. How do others overcome this and enjoy large Python projects?

OK. Bear with me here. This might be overstating my case...

In some sense - there are no large projects. Only projects that have failed at being modular.

Maintainability involves keeping clear boundaries between functional units so that you can keep enough code in your head at any one time to reason about it.

If we assume that Python is capable of handling more code than you can keep in your head at any given time, then surely the problem is simply one of making sure your modules are nicely self-contained from each other.

Agreed, modularity definitely helps us minimize the mental load of understanding all the objects at once. But how does it handle this situation:

Write a function in module A that expects some complex data structure that comes in. In a statically typed language, you put a type on the argument and thus guarantee, to some degree, the data in the object can be safely manipulated.

Over in module B we call that function. Because Python doesn't enforce types, I can pass in some type of data structure that might behave well in many situations, but in some that function is going to fail because I didn't properly construct the data structure.

How do other Python projects avoid this? Do they avoid large data structures of multiple fields?

> Write a function in module A that expects some complex data structure that comes in. In a statically typed language, you put a type on the argument and thus guarantee, to some degree, the data in the object can be safely manipulated.

Let's question some of the assumptions. Do you NEED the entire thing to be passed around? If your functions are small and only do one thing, can't they be fed the portion of the data they need and only manipulate that? If they are getting unnecessary data, soon someone will like the convenience and that piece of data will now become necessary.

I have seen Java projects with humongous "DAO" objects being passed across all apps. Even if Java had Haskell's type system, at some point it breaks down.

So, you do not enforce that on Python itself. Rather, you have to apply due diligence and ensure your data structures are clean, and functions do what they are supposed to do and no more.

> I can pass in some type of data structure that might behave well in many situations, but in some that function is going to fail because I didn't properly construct the data structure.

Then you construct the structure in a single place, and that should have some sanity checks and sane defaults.

Even if you are using a static typed language, there's no guarantee that what's inside the data structure even makes sense, only that the right type of structure is being passed on. That's helpful, but not as much as people would think.

And you write tests. Lots of tests. It is amazing the amount of Python code that gets written without proper testing. And Java, for that matter. Only Python programmers have fewer excuses: it's easier to replace what you need for testing.

>Even if you are using a static typed language, there's no guarantee that what's inside the data structure even makes sense.

You should always put validation in the constructor so you are 100% sure that the object created is valid.

How can you be sure your validation covered all cases?

Say you have a Circle class: you need a center point, so you validate that the center is not null and you check that the radius is greater than zero. Do you have an example where it is impossible to validate all cases? If so, how will your unit tests validate/test all those cases?
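A minimal sketch of the constructor-validation idea for that Circle example. The `Point` and `Circle` classes and the exact checks are assumptions made for illustration; the thread does not show any code.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    x: float
    y: float


class Circle:
    """Hypothetical Circle that refuses to exist in an invalid state."""

    def __init__(self, center, radius):
        if center is None:
            raise ValueError("center must not be None")
        if not (math.isfinite(radius) and radius > 0):
            raise ValueError("radius must be a finite number greater than zero")
        self.center = center
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


unit = Circle(Point(0.0, 0.0), 1.0)
print(unit.area())

try:
    Circle(Point(0.0, 0.0), -2.0)
except ValueError as exc:
    print("rejected:", exc)
```

This only checks what the author thought to check. Whether that covers every state downstream code cares about is exactly the open question in this subthread.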

In your example, you can't anticipate that center points approximately but not quite equal to pi will cause a ... whatever bug in ... whatever downstream use of the class. One can never really know what states are acceptable.

The circle is valid; what you describe is a bug in another section that crashes on valid circles. Types can prevent invalid inputs to functions, but what you describe is a bug in an algorithm. I don't think it is possible for a compiler to prevent bugs in algorithms, such as division by zero or precision errors.

Type systems can do some pretty miraculous things. For example, you could distinguish between a connected and disconnected socket with types so that you don't accidentally send on a disconnected socket.
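A rough sketch of the socket idea using Python type hints. Both classes are hypothetical and do no real networking; the point is only that `send` exists on one type and not the other, so a static checker such as mypy can reject the misuse.

```python
class ConnectedSocket:
    """Only this type has send(); hypothetical, no real networking."""

    def __init__(self, address: str) -> None:
        self.address = address
        self.sent = []

    def send(self, payload: bytes) -> int:
        self.sent.append(payload)
        return len(payload)

    def close(self) -> "DisconnectedSocket":
        return DisconnectedSocket()


class DisconnectedSocket:
    """Has no send() at all, so misuse is visible to a type checker."""

    def connect(self, address: str) -> ConnectedSocket:
        return ConnectedSocket(address)


idle = DisconnectedSocket()
live = idle.connect("example.invalid:80")
print(live.send(b"ping"))

# idle.send(b"ping") would be reported by a static type checker, and at
# runtime it raises AttributeError because the method does not exist.
```

Python cannot stop code from holding on to a stale `ConnectedSocket` after `close()`, so this is weaker than what languages with ownership or linear types can express.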

I guess I'm arguing both sides now. My point is that one needs to root out the weird cases somehow and that it's often hard to have "100%" validation of inputs. One's initial understanding of what is the valid and invalid state space is often incorrect.

Twisted is a massive codebase, and uses zope.interface. There are definitely options for stricter guarantees, but it's optional, and generally you'd rather be treating input as ducks.

Also, even without actual type checking (mypy), IDE support + type hinting (PEP 484 or reST et al) means you can find most issues before the interpreter/compiler is involved. PyCharm is fantastic at this.
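A small sketch of what PEP 484 hints look like on the kind of "complex data structure" asked about upthread. The `Order` record and `describe` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Order:
    """Hypothetical record standing in for a 'complex data structure'."""
    order_id: int
    items: List[str]
    total_cents: int


def describe(order: Order) -> str:
    # The annotation documents the contract; Python does not enforce it
    # at runtime, but mypy or an IDE can check callers against it.
    return (f"order {order.order_id}: {len(order.items)} item(s), "
            f"{order.total_cents / 100:.2f}")


print(describe(Order(order_id=7, items=["tea", "cup"], total_cents=1250)))

# A call such as describe({"order_id": 7}) is what a checker would flag:
# a dict is not an Order. At runtime it fails only when an attribute is read.
```

The hints are optional and ignored by the interpreter, so they can be added module by module.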

>How do other Python projects avoid this?


Here's one way. Those asserts will typically pick up on errors during spikes, development and tests and turn a head scratching problem into a straightforward issue.

A couple of times on a disastrously tech debt ridden project I used it to pick up obscure configuration errors in prod, too (shouldn't see it in prod on a relatively well written project tho).
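The snippet this comment refers to did not survive on this page. What follows is not the original code, only a guess at the general shape of assert-based checks, using hypothetical `EventLoop` and `Process` classes.

```python
class EventLoop:
    """Hypothetical stand-in; the thread does not show the real class."""

    def run(self):
        return "running"


class Process:
    def __init__(self, loop, name):
        # Fail loudly at construction time instead of deep inside later code.
        assert isinstance(loop, EventLoop), (
            f"expected EventLoop, got {type(loop).__name__}")
        assert isinstance(name, str) and name, "name must be a non-empty string"
        self.loop = loop
        self.name = name


worker = Process(EventLoop(), "worker-1")
print(worker.loop.run())

try:
    Process("not a loop", "worker-2")
except AssertionError as exc:
    print("caught:", exc)
```

Note that `assert` statements are removed when Python runs with `-O`, so checks that must hold in production are better written as explicit `raise` statements.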

You might consider this too much of an edge case but what if I want to pass something that quacks like an EventLoop to Process init ?

Unless it's a subclass (which would pass), I don't imagine that would be at all likely.

Never been much of a fan of duck typing. It only really makes sense when you have objects that approximate built in types.

Take a random object with a .run() method, for example, and you don't have a fucking clue what it might be doing so there's no point interpreting that as a quack.

Everything's a server, er, microservice, passing JSON back and forth!

Microservices are great, because you can enforce static typing within their interfaces. /s

> Mostly, I see this is as a lack of compilation and types in Python. How do others overcome this and enjoy large Python projects?

Divide code into modules.

If you are worried about types, use object oriented programming to define the classes you want and then use the assert() instruction to check variables to comply with said classes/types.

A good IDE like PyCharm will take advantage of such assert() statements.

A combination of:

* Realistic integration tests that cover as many user stories as possible.

* Raise exceptions ASAP for any kind of invalid data or state (could be checking types, but could also be checking for file/directory existence). Could also be pip installing and using 'schema'.

* Do not build on poor quality modules and strive to decouple and uninvent poorly reinvented wheels.

* Just in general: loosely couple everything.

I don't see this as a static typing issue at all. Static typing only helps with a small part of the problem and it usually does that at some expense (usually more verbose & less flexible code).

I'm a little skeptical that super-strong type systems like haskell necessarily help all that much either. They clearly help eliminate classes of bugs, but the overhead is very high (the amount of everyday software written in haskell is seemingly small relative to its apparent popularity).

That description doesn't explain why there has been such a sudden growth in recent years. Python is an old language. The real reason it is growing so much just recently is of course due to ML.

> The real reason it is growing so much just recently is of course due to ML.

This is certainly a potential explanation. But I don't think it is self-evident that the trend of recent growth is entirely due to this.

Edit: The article supports the claim - I should have read it before commenting.

It doesn't need to be self-evident, as it's what the article is showing with data.

My bad, this is what I get for not reading the article. I added a note to my comment, thanks.

> Once you grok the virtual environment thing

It's incredible how hard it is to get some people on board with this. I've gotten used to good project-local package management in other languages; I can't imagine working with anything that depended on stuff happening to be available in a global namespace ever again.

It is a bit more of a pain to work with than similar features in other languages. Like as far as I know you can't just move a venv, you have to rebuild it in the new place. Whereas `node_modules` is just a dir. Also, there is a weird layering where venv abstracts away installed eggs/wheels/whatever they're calling them these days, but then since that's still painful to work with, you need another layer on top like virtualenvtools or burrito or whatever it is. (Sorry, it's been a couple years so my memory is fuzzy)
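One way to see why an environment is tied to its location is to create one and look at what it records. A minimal sketch using the standard-library `venv` module; the directory name `demo-env` is arbitrary.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "demo-env"
    # with_pip=False keeps the sketch fast and offline.
    venv.create(env_dir, with_pip=False)

    # pyvenv.cfg records where the base interpreter lives.
    config_text = (env_dir / "pyvenv.cfg").read_text()
    print(config_text)
```

The `home` entry points at the base interpreter, and scripts installed into the environment typically carry absolute paths as well, which is why rebuilding is usually recommended over moving the directory.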

"a bit"? honestly I would say it's much closer to a lot more pain

I was being kind. :)

Yeah, Python's environment/import logic is really convoluted. It evolved organically with little forethought, and only got a minimal pruning with Python 3.

It was essentially created by one dude, no?

As someone who has largely moved from Python to Node: it's not the concept of project-local packages that is the problem. It's virtualenv's awful implementation and UI.

Maybe I've got Stockholm Syndrome but I never have any trouble with pip. I don't find it any worse than npm for my usage patterns.

I'm one of those people who are really confused about the whole Python project management situation. Is there a resource where I can learn the equivalent of common operations found in other project management tools such as npm / yarn / cargo?

E.g. is there something like 'cargo run' that just checks that the required python version is installed (I got the impression that this can be specified somewhere), installs deps if they're not installed yet and then runs the main module? Or something like 'npm install --save' that adds the latest version of a package you already know the name of as a dependency and installs it?

Tried conda or pipenv?

Have them try Pipenv!


While virtualenv management isn't that difficult, Pipenv is VERY intuitive.

There are still widely used packages that don't work with pip, like PyQt.

I think a big part of the answer for SO is the popularity and inscrutable nature of pandas. 'Popular, useful and incomprehensible' make for excellent Stack Overflow growth.

Got to agree. I used to google pandas problems for remotely complex stuff because it wasn't really intuitive how to do it in pandas. Now I have a better context for how simple such things can be because of data.table in R.

I recently worked with Python for a couple of months, for the first time. Even though reading code and writing simple parts is fairly easy, I found several details to be quite confusing, like variable scope, or the various underscore conventions/variables. Also, I struggled to find a good free IDE (there was no money for PyCharm), which made debugging a hassle. Also, dependency management with pip feels a bit quirky compared to e.g. Maven or npm.

All in all, I found it neither much more nor less fun to program with than e.g. Java, C# or TS. It might be a better fit for small projects than others, though.

> variable scope

Which parts did you find confusing? Was it the lack of a block-level scope, such as within for loops and if statements? I personally find the LEGB (Local, Enclosing, Global, Builtin) system to be pretty intuitive.

[0]: https://stackoverflow.com/a/23471004/3023252
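
For anyone following along, here is a minimal sketch of the LEGB lookup order and the missing block-level scope. All function and variable names are made up for illustration.

```python
# Hypothetical names, chosen only to show the lookup order.
x = "global"

def outer():
    x = "enclosing"

    def inner_local():
        x = "local"      # a local assignment wins first
        return x

    def inner_enclosing():
        return x         # no local x, so the enclosing function's x is used

    return inner_local(), inner_enclosing()

def uses_global():
    return x             # no local or enclosing x, so the module-level x is used

def uses_builtin():
    return len           # not defined in this file, so it is found in builtins

def no_block_scope():
    for i in range(3):
        last = i
    return i, last       # names bound inside the loop are still visible after it
```

Calling `outer()` gives `("local", "enclosing")`, and `no_block_scope()` returns `(2, 2)` because a `for` loop does not open a new scope.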

I think one thing I found particularly confusing is how class and instance variables relate to each other. It takes a while to understand how they get masked and in what case which one is referenced.
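
A small sketch of the masking behaviour, using a hypothetical `Counter` class: assigning through an instance creates a new instance attribute, while mutating a shared object through an instance does not.

```python
class Counter:
    count = 0                 # class attribute, shared by all instances
    tags = []                 # mutable class attribute, also shared

a = Counter()
b = Counter()

a.count = 5                   # creates an instance attribute that masks Counter.count
a.tags.append("x")            # mutates the shared list; no instance attribute is created

results = {
    "a.count": a.count,                   # the instance attribute
    "b.count": b.count,                   # still the class attribute
    "Counter.count": Counter.count,       # unchanged
    "b.tags": b.tags,                     # sees the change made through a
    "a has own count": "count" in vars(a),
    "a has own tags": "tags" in vars(a),
}
```

Here `a.count` is 5 while `b.count` and `Counter.count` stay 0, yet `b.tags` is `["x"]`, which is usually the case that surprises people.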

An excellent free Python IDE is Visual Studio Code with the Python extension by Don Jayamanne installed. VS Code is a solid Python experience.

Pycharm and visual studio have free editions. Vim, emacs, Atom, VS Code, sublime - there are many options.

I think it was the remote debugger I was missing, and as far as I could tell there was no such feature in any free editor/IDE, probably because it's not a language-native feature afaik.

IDE? Visual Studio Code.

Python is not perfect but the final code is simple to write and understand years later.

I use sublime for python development, don't use fancy debug features, just print statements

I generally use Notepad++. Python lets you get away with it, for the most part. Other languages would be a pain in the butt to write whole programs in it.

> Once you grok the virtual environment thing

I wish this would die, it just scares off newbies. In my twenty years of Python I encountered conflicting reqs once or twice. It's easy to direct folks to the documentation in those circumstances rather than cramming it down every newbie's throat---YAGNI.

Now that pip can install with --user and containers are popular, the problem isn't even possible in production in many circumstances.

Virtual environments are not just for conflicting requirements. I write code for clients on my laptop and don't want to clog up my packages directory with project specific libraries that I'll never use myself. Once the project is done and I turn code over to a client, I can delete the virtual env and all the installed libraries. Virtual envs also allow me to have multiple versions of Python installed on my machine.

This discussion is a lot like version control. How many people made daily (hourly) duplicates of project folders before learning the benefits of version control?

A use case that newbs need not concern themselves with.

You normally won't get a version control lecture during every programming language tutorial either.

Java could have libraries almost as easy to use as python, but doesn't.

However, some things can't be as easy as in Python, such as sympy, which does symbolic processing of in-python expressions (differentiation, integration etc).

I actually don't find python easy to use - every time I use it, I have to look up how to do for loops (what? range? zero-based? does it include the last one?), and the docs don't have links for arguments (even though they require specific types), because no static types.
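
For the loop questions specifically, a short sketch (variable names are illustrative): `range` starts at zero by default and always stops before its end value.

```python
squares = []
for i in range(5):                # 0, 1, 2, 3, 4: starts at zero, stops before 5
    squares.append(i * i)

one_to_five = list(range(1, 6))   # explicit start; the stop value 6 is excluded

names = ["a", "b", "c"]
pairs = [(i, name) for i, name in enumerate(names)]   # index and item together
```

When the index is only needed to fetch the item, iterating over the list directly (or with `enumerate`) avoids the off-by-one question altogether.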

But the libraries absolutely are easy to use. And because many are written in C (unlike most of Java's libraries), you're arguably running C.

I haven't overcome the virtual env thing. There are a few virtual environment libraries, not sure which one to use. Then it is awkward to have to change your entire environment over by activating it and for it to be global. What if I want to work on two different projects in two different terminals? Why can't I activate an environment by just CDing into a directory with my project?

Then there is the GIL. I still don't know if multithreaded python apps are really MT or just faking it. And yes, I know you can do event driven stuff and that does look very interesting.

Use "python3 -m venv"

You can activate two different environments in two different terminals. Why wouldn't you be able to?

You can automatically activate venvs when you cd into a directory with virtualenvwrapper.

Rule of thumb: if a thread goes into C it's releasing the GIL. Not always, but often. Like reading a file, or a library like numpy or Pillow.
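
A rough way to see that rule of thumb in action. This is only a sketch: `time.sleep` stands in for a blocking call that releases the GIL (my substitution, not a file read or a numpy call), and the 0.2 second duration is arbitrary.

```python
import threading
import time

def blocking_call():
    time.sleep(0.2)    # sleeping releases the GIL, so other threads can run

def run_threads(n):
    """Start n threads that each block once and return the wall-clock time taken."""
    threads = [threading.Thread(target=blocking_call) for _ in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

elapsed = run_threads(4)   # four 0.2 s waits overlap instead of adding up to 0.8 s
```

Pure-Python CPU-bound work would not overlap like this, since only one thread can execute Python bytecode at a time under the GIL.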

I actually love forcing people into a fairly consistent coding style, but everything else about the language irks me compared to my previous extensive use of Perl (on a large enterprise software product).

One particularly big thing, that it doesn't have much of a distinct compilation phase. You'll write some script you want to run to do some one-off task that will take the next hour, only to find it crashed after 20 minutes when it hits some boneheaded syntax error you made.

For that though you can use linting tools such as "pylint"; catches a lot of errors that might otherwise be runtime exceptions.
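
A cheap pre-flight check along the same lines, as a sketch: the standard library's `ast.parse` reports genuine syntax errors without running anything. The helper name `syntax_problem` is made up, and it will not catch misspelled names or wrong types, which is where a linter earns its keep.

```python
import ast

def syntax_problem(source, filename="<script>"):
    """Return None if source parses, else a short description of the syntax error."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError as exc:
        return f"{filename}:{exc.lineno}: {exc.msg}"
    return None

good = "total = 0\nfor n in range(3):\n    total += n\n"
bad = "total = 0\nfor n in range(3)\n    total += n\n"   # missing colon
typo = "total = 0\nprint(totl)\n"   # parses fine; fails only at runtime with NameError
```

`syntax_problem(bad)` returns a message, while `syntax_problem(typo)` returns `None`, which shows why the runtime crash after 20 minutes is usually a name or type mistake rather than a true syntax error.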

Take a look at mypy and pytype - there is a drive in the Python community to add a verification system to the language.

    only to find it crashed after 20 minutes when it hits
    some boneheaded syntax error you made
I find mypy to be pretty useful for this.

For this reason, I generally test scripts like that with a small slice of the data, or small set of dummy data. Then one can also verify that the output is reasonable -- which is good to do even with a compiled language that would catch all syntax errors ahead of time.

+1 for PyCharm. With it I realized how much I was missing with Vim/Emacs/Sublime (even though I configured a lot in those editors).

for whatever reason, the word "grok" annoys me to no end

The origin of the term is fascinating: https://en.wikipedia.org/wiki/Grok

In other words, you don't grok "grok"?

I thought it was just me.

For me it's the fact that no English words end in "ok".

Heh. There was a python web framework called grok.

The real answer is: data scientist everywhere

They usually don't know better

What would they be using if they knew better? R?

It's ok

They'd make a mess with every language...

A beautiful mess, don't get me wrong

There are better languages but there isn't enough "science cargo cult" around them

And honestly there aren't as many code snippets to copy from

My hopes are in luna[1]

p.s. distributing load or using multiple cores is still shitty in Python; the more datasets grow in size, the more Python will struggle

[1] http://www.luna-lang.org/


Mostly because the whole Data Science and ML stuff is catching on. Everyone and their grandma now wants to do something related to AI, ML etc. The easiest language for these is? Python. Hence the growth.

Once upon a time it was all about cloud, SaaS and building a webapp. Hence the growth of JS.

The chart makes a pretty compelling case that this is the right answer. The top trending libraries are pandas, numpy, and matplotlib (both flask and django are probably trending at the same baseline rate of any top 10 popular language). These are all libraries heavily used in data science.

We also know that ML and AI are new macro trends in the tech industry. Python's elegant syntax isn't really evidence for why Python is trending now as opposed to any other time in the last decade.

> These are all libraries heavily used in data science.

These are all libraries heavily used in any technical computing field. The cost of MATLAB, the simplicity of Jupyter, and the community around Python have caused Python to slowly be a language used for general technical computing, not just data science.

I don't like how Numpy, Matplotlib and Pandas are portrayed as "Data Science". There is nothing particular to "Data" in these three packages. All I see is "Science".

All scientists need numerical simulations or evaluations, plotting, and statistics. That's what these libraries do. Not "Data Science", but "Science".

The correlation with TensorFlow/Keras suggests Data Science, maybe. But I bet there is a similar correlation with scikit-learn, Scipy, scikit-image, basemap, and heck, Matlab, Julia and R--and those each suggest different scientific endeavours.

TL;DR: In my opinion, Python is becoming the language of Science, which includes, but is not limited to, Data Science.

I'm still at a loss as to what type of science there is that doesn't have data.

"Data science" to me seems like a buzzword that has connotations of a field or industry starting to use science where they previously didn't. Over here in academia it's just "science", or possibly "data analysis" if you want to emphasise that that's the part of your project you're working on right now as opposed to data collection.

Theoretical physicists don't always use data. Many scientists sometimes use case studies instead of systematic data collection.

Eh. Case studies are still data. And the theoretical physicists that don't use data are criticised as not really doing science. Most theorists I know (I am in physics) are heavily data-driven.

You seem to treat "data science" as a subset of science, but it's really not - it generally is used to refer to the application of scientific methods to data analysis outside of "the field of science" in various business environments, and that is a much wider field with more people than those actually working in science.

Python has a reputation for being a good "beginner" or "scripting" language. Describing it as such is really a disservice.

One of the things that makes Python a killer language (apart from its readable syntax and straightforward programming model) is how it enables you to jump in at any skill level and be immediately productive, without hampering your ability to get things done quickly and elegantly as your skills grow.

There are faster, trendier languages out there, but I suspect Python will still be thriving when many of them have become footnotes.

This has been such a frustration of mine when mentoring people through learning to code.

Python is a great beginner language, but it isn't a language that is only for beginners.

The interesting thing, to me, with Python is that it can fall out of favor but seems very good at coming back. Look at when Ruby got big and became the most popular scripting language, yet Python kept trucking on until the data science explosion made having things like Numpy and Scipy matter so much (and then from there libs like Tensorflow).

Python is what BASIC was in the 1970s and 80s, in a good way.

It enables people who normally would not program to do things that could not otherwise be done with traditional GUI software, whether web-based, apps or desktop applications.

It is a stepping stone into computing that otherwise would have been inaccessible for many. Unlike BASIC, the design of the language is good enough that it can be used in professional applications, and not only as a stepping stone.

because readability is one of the most, if not the most, important quality for mainstream programming language adoption.

As much grief as Python gets for significant whitespace, I think it really helps the readability. Not only does it enforce what should already be good indenting habits, but it eliminates the noise of braces.

I've always had the opinion that computers should serve the needs of humans, not the other way around.

In languages where white space is insignificant, it is pretty much universal that we use syntactic elements to communicate block structure to the compiler, and white space to communicate block structure to humans. Having two different ways of communicating block structure is: a) redundant and wasteful, and b) a source of bugs.

So, if you buy into the idea that having two different ways of communicating block structure is somewhere between an annoyance and a language design defect, and also buy into the idea that computers should serve the needs of humans, then it follows that the compiler should extract block structure from source the same way that humans do. That being: significant white space.

The {} languages seem very primitive to me any more.

You could argue, using the same logic, that when using {} languages, the computer can serve you by automatically doing the indentation anyway. This is the case in practice: I never have to worry about indenting my code in C, as support for that in most editors is great.

{} languages have the advantage of allowing you an immediate restyling of the code when the original author (who can be of varying skill level) wrote something you don't like looking at, or when you want to change the style of your own code for various reasons.

I find that Python-style blocks generally cause me more mild annoyances than {} blocks. A frequent thing that happens is copy-pasting/moving a line from some place to another with a different depth: In C, I paste and run my automatic indentation, while in python I have to specify manually by how much I want to indent my new code so that it can know within which block I want it. ... And it's not like you could do so automatically: if I want to paste something after the code of an "if ...:" block, the editor cannot know if I want it inside or outside the block.
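
To make the paste ambiguity concrete, here is a sketch with two hypothetical snippets that differ only in the indentation of the last line. Both are valid, so no editor can pick the right one without knowing the intent.

```python
inside = """
count = 0
if False:
    count += 1
    count += 10
"""

outside = """
count = 0
if False:
    count += 1
count += 10
"""

def run(source):
    """Execute a snippet and return the final value of its `count` variable."""
    namespace = {}
    exec(source, namespace)   # fine for a demo; avoid exec on untrusted input
    return namespace["count"]
```

`run(inside)` gives 0 and `run(outside)` gives 10, from the same pasted line at two different depths.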

On a more abstract level, I personally prefer a language where the end of a block is explicitly specified by the presence of a visible character rather than by the absence of an (arguably) invisible one, and where the form of the code does not affect its function in any way...

I'm not trying to argue that much against the python logic though, because the potential for disastrous style is indeed limited, it has its specific advantages, but I don't think delimiter-based languages can be qualified as more primitive just because of this.

I agree that modern tooling largely erases the tedium of getting indentation right. Whether C++ or Python, selecting a block and hitting either tab or shift-tab fixes indentation. (Which beats the heck out of re-indenting on punch cards, which I'm old enough to have used for CS homeworks...)

And, yes, it is hard to argue that presence of {} makes a language objectively more primitive, which is why I used the word "feel"... the syntactic sugar necessary to keep the block structure straight just looks like noise to me any more. But as with many things, it comes down to personal preferences.

I find that the amount of bugs I tend to get when I copy-paste code far outweighs any potential aesthetic pleasure I get from seeing no begin-end block identifiers. Also, I don't want to let python have credit for not having those block identifiers because python has the colon begin identifier, which is completely redundant.

I always figured the people complaining about the whitespace thing are people whose code I really don't want to be reading anyway

Why is that?

My code has whitespace that reasonably closely matches Python's rules regardless of the language. I only say "reasonably closely" because I don't spend enough time in Python to really be sure about any potential gotchas.

But like PEP 20 says:

> Explicit is better than implicit.

Brackets are explicit. Whitespace is implicit.

Indentation is explicit.

Indentation is implicit, because you can't see the characters that cause it without turning on whitespace rendering. You only can tell it's there by either moving the cursor around or by observing gaps between characters and other characters or characters and the left margin.

And the end of a block, the lack of indentation that denotes that, is absolutely implicit.

The problem isn't whitespace period, it's that if you aren't careful the significant whitespace can bite you, for example people sometimes accidentally mix tabs and spaces, which can get... ugly.

Personally I prefer {} blocks so I don't have to know what sort of white space is creating the indentation, even though I do miss using python sometimes and may come back to it for the right project.

> accidentally mix tabs and spaces, which can get... ugly.

This is trivially solved. You can find and replace with sed[0] if you want to be fancy, or just use your favorite text editor/IDE. The teams I've worked on have never had a problem with this. There is always an established convention in a project, and if you're breaking the convention -- e.g. because you prefer tabs -- you're doing it wrong, just by virtue of breaking the convention. If your project is new and you need to set a convention, do that. (also, PEP8 recommends spaces, so there is an authority to reference as a tie-breaker)

[0]: something like: find ./ -type f -exec sed -i -e 's/ / . /g' {} \;
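
A rough Python counterpart to the find-and-replace idea, as a sketch only. It assumes indentation consists purely of leading tabs and that 4 spaces per tab is the project convention; the function name is made up.

```python
def tabs_to_spaces(text, width=4):
    """Replace each leading tab on every line with `width` spaces."""
    fixed = []
    for line in text.splitlines(keepends=True):
        body = line.lstrip("\t")              # drop only the leading tabs
        n_tabs = len(line) - len(body)
        fixed.append(" " * (width * n_tabs) + body)
    return "".join(fixed)
```

Tabs in the middle of a line are left untouched, and lines that already mix spaces and tabs in their indentation would need more care than this.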

Furthermore, there is no need for anyone to care what spacing scheme their team uses, after configuring their editor on day 1 with that team. I have Vim convert Python code to tab indentation when opening files, and have it convert to 4-spaces indentation when saving a Python code file. No one should be wasting even a flicker of brain activity over indentation in the 21st century, regardless of how backwards everyone else's preferences are.

While you're at it, please also s/\s+$//g on save. Those unintentional trailing spaces really clutter up Git history, and that comprehensive, clean history is what enables the team leads to do their job without bugging you on Slack. It also signals carelessness, or at least a lack of care for personal tools, and has your name attached to it.
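
The same cleanup as a small Python sketch, in case it needs to run outside the editor. The character class is limited to spaces and tabs on purpose: with `re.MULTILINE`, a bare `\s+$` would also swallow blank lines.

```python
import re

def strip_trailing_whitespace(text):
    """Remove spaces and tabs at the end of every line, keeping the line breaks."""
    return re.sub(r"[ \t]+$", "", text, flags=re.MULTILINE)
```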

> people sometimes accidentally mix tabs and spaces

I just tried with three different editors, and literally could not manage to get code in a file indented with a mix of tabs and spaces despite actively attempting to. Who are your co-workers who actually personally have had this happen to them in, say, this decade (not nebulous stories from the internet of "I heard once about this"), and how did they do it?

I know I ran into it at one point, but I haven't slung python in anger in like a decade, so maybe the editors most people are using have started preventing this sort of thing better.

Though I do know in Visual Studio (what I do my day to day coding in these days) I've had situations at work where tabs and spaces got mixed together and at some point it finally started complaining but not right away.

Lack of braces is visually appealing, but I really miss brace hopping in Vi - % to bounce between open and close brace. That also applies to languages with do/end instead of braces.

Good point, I use that feature in C++ quite a bit. But I find that my Python programs tend to use smaller blocks, I don't know if it's just because I'm solving smaller problems or because it's easier to break things down in Python.

I've had mixed feelings about it, but I think one of the better things about it is that it forces less experienced scientist and data analysis programmers to use consistent whitespace.

It does cut a large chunk of issues off. Also helps "canonicalize" idioms. That said you can still have mess from time to time. But that's alright.

But significant whitespace means that I can't write code like this:

   sub (@) { return map { "@{[ $_[0] . $_ ]}" } @$_ };

Disclaimer: The majority of the programmers that I have known in real life that have a significant (and very vocal) dislike of Python's whitespace requirements have all written code like that above (this is my from-memory reproduction of actual code that one of them wrote). Python sucks because it's "restricting" them from being "expressive" with their code.

Phew -- it's an automatic filter.

Sadly, you can write code just like that in Python. I don't know what that perl snippet does, my caches have long since flushed ;) -- but here's a Python snippet that's relatively dense:

   def foo(y): return lambda x: {str(i) : '_{}'.format(i**y) for i in range(x)}
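
For comparison, a sketch of the same function spelled out over several lines. The name `make_power_table` is made up; the one-liner is repeated so the two can be checked against each other.

```python
def foo(y): return lambda x: {str(i) : '_{}'.format(i**y) for i in range(x)}

def make_power_table(exponent):
    """Return a function that maps '0'..'n-1' to '_<i ** exponent>'."""
    def build(n):
        table = {}
        for i in range(n):
            table[str(i)] = "_{}".format(i ** exponent)
        return table
    return build
```

Both `foo(2)(3)` and `make_power_table(2)(3)` produce `{"0": "_0", "1": "_1", "2": "_4"}`.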

For statistical programming in particular, readability of your code by others is the most important part of the coding style.

That's absolutely true of any programming.

Yep. An order of magnitude more time is spent reading and maintaining a line of code than is spent writing it. Readability adds a tiny amount of time to the writing, but makes reading and maintainability so much better.

Glad I read your disclaimer before responding. :P

Was just reviewing colleague's code with them the other day. Hit some blocks like this (not quite as offensive, but tackling a bit in one line), and hell if they knew what was going on in their own code.

I've never posted about whitespace sensitivity before today, so maybe I don't fit in the group you're pointing at, but I don't like Python's whitespace sensitivity either.

Not because I want to write everything on one line, but because it's an extra step when trying to understand what code does.

For instance, a piece of code is misbehaving, not being called as part of the block it visually appears part of. In a bracket/paren language I can highlight one bracket and my editor shows which one matches. In Python I have to figure out what the indentation is built from, spaces or tabs, and make sure there's no ambiguity there.

We have developers here that are claiming they wrote things in Python for their PhD, and yet when I look at their code in any language there are tabs and spaces mixed in the same line, with inconsistent indentation. Multiple people have tried to express the issue or teach ways to deal with it, to no avail.

The fact of the matter is that whitespace is by definition invisible, and that makes it hard for people to grasp its importance. Basing control flow on invisible characters is asking for trouble.

To quote PEP 20:

> Explicit is better than implicit.

Brackets are explicit, whitespace is implicit. You only can see whitespace by the gaps between visible characters, and then you have to guess. Is it an actual U+0020 like you see 99% of the time? Is it an nbsp? Is it a tab that happens to stop after the width of a space? You don't know until you encounter an error or turn on whitespace rendering.

Was it copied and pasted from www.python-help-7575.com, which uses a platform that favors punctuation spaces? Was it copied from a site that didn't wrap their code in <code> or <pre>? (I see this all the time in comment systems on programming blogs) Should we claim that copying and pasting code in a beginner targeted language is something you do at your own risk?
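
One way to take the guessing out of it, as a sketch: a small helper (the name is hypothetical) that names every whitespace character at the start of a line using the standard `unicodedata` module.

```python
import unicodedata

def leading_whitespace(line):
    """Return (codepoint, name) pairs for each whitespace character that starts the line."""
    found = []
    for ch in line:
        if not ch.isspace():
            break
        # Control characters such as tab have no Unicode name, so supply a fallback.
        name = unicodedata.name(ch, "TAB" if ch == "\t" else "CONTROL")
        found.append((f"U+{ord(ch):04X}", name))
    return found
```

Calling it on a line that starts with a space, a no-break space and a tab reports `SPACE`, `NO-BREAK SPACE` and `TAB` in that order, which is the information whitespace rendering in an editor would otherwise provide.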

> In Python I have to figure out what the indentation is built from, spaces or tabs, and make sure there's no ambiguity there.

The following program throws a TabError in Python 3:

  def demo():
  	print("1 tab")
          print("8 spaces")
It does work in Python 2, but I'm pretty sure the majority of developers have 4 space wide tabs by now, meaning they won't be able to mix tabs and spaces in Python 2 code either.
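
The same check as a sketch that survives copy and paste, since the tab and the spaces are written as escapes inside a string and handed to the built-in `compile`:

```python
source = (
    "def demo():\n"
    "\tprint('1 tab')\n"
    "        print('8 spaces')\n"
)

try:
    compile(source, "<demo>", "exec")   # only compiles, never runs the code
    outcome = "compiled"
except TabError as exc:
    outcome = f"TabError: {exc.msg}"
```

On Python 3, `outcome` ends up holding the `TabError` message rather than `"compiled"`.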

EDIT: So Hacker News has code blocks, but doesn't strip the indentation that is required to introduce them. So you'll have to delete the 2 leading spaces in each line if you want to try this out for yourself.

> So Hacker News has code blocks, but doesn't strip the indentation that is required to introduce them. So you'll have to delete the 2 leading spaces in each line if you want to try this out for yourself.

Your example is good, but the technical issues experienced in posting your example are perfect.

Alternative syntax for short one liner functions (e.g. As recently added to C#, always in scala) can reduce a lot of this friction without sacrificing formatting guidelines. However, I'm not sure if it's in python's philosophy to explore such things.

Have you ever looked e.g. at the distutils code or tried to figure out how to use it without using Stackoverflow?

That function is easy to parse if you take 20 min learning the syntax. Object oriented monstrosities aren't.

What does that do?

So let's break it down:

It's an anonymous sub

It has a prototype that says it accepts a list of arguments (this is the default in perl though).

Prototypes are mostly for giving hints to the compiler on how to unambiguously parse calls to your function later. For an anonymous sub, it's pointless because you assign its value to a variable, and then brackets are required when calling it.

It returns the result of the map (the default in perl is to return the result of the last expression & everything is an expression though)

same as python map(code, list)

    { "@{[ <some stuff> ]}" }
This is a particular perl goof for interpolating array lookups into a string. Usually, perl's sigils let variables be interpolated without any issue, but since we're using $_[0] to use the first element of the @_ array, it's ambiguous with $_ . '[0]'.

The @{} dereferences the contents of the braces as an array, and [] creates a new arrayref. The arrayref has one element, and perl's default stringification of an array (join them all with no separator) means that it… interpolates the thing inside the @{[]} into the surrounding string.

This is pointless though, since it's the only thing in the string. The author might have been trying to give a string context to force stringification of the thing, but the dot operator already provides that context.

    $_[0] . $_
This is a string of the first element of … hmm. Not of the currently-mapped element (that's $_). $_[0] refers to the first element of @_. This makes a new value which is the concatenation of the first element with the current element.

The trailing `@$_` is a short way of writing @{ $_ }: dereference the arrayref that's stored in $_. But $_ is not set anywhere, so I assume this is a typo of @_

I would have written this as:

    my $prefixer = sub { my($prefix , @rest) = @_; return map { $prefix.$_ } @_ };
And it would work like this:

    say $prefixer->('Hi_', 1, 'str', 7);
    > 'Hi_Hi_Hi_1Hi_strHi_7'
The 'best' way to write it would have been:

    sub { $_[0].join($_[0], @_) }

The python equivalent is

    lambda *lst: ''.join([str(lst[0]), str(lst[0]).join([str(_) for _ in lst])])
Which is uh… begging for an initial line of "lst = map(str, lst)"
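
With that initial conversion line in place, a spelled-out sketch might look like the following. The name `prefixer` is borrowed from the Perl rewrite; like that version, it prefixes every argument, including the first one itself.

```python
def prefixer(*items):
    """Prefix every argument (including the first) with the first argument, then join."""
    items = [str(item) for item in items]   # convert everything to str once, up front
    prefix = items[0]                       # raises IndexError if called with no arguments
    return "".join(prefix + item for item in items)
```

`prefixer('Hi_', 1, 'str', 7)` returns `'Hi_Hi_Hi_1Hi_strHi_7'`, matching the Perl output shown above.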

If I follow correctly, the equivalent in K would be

Flatten (,/) the first element of the list (*t) joined with each element of the list (a,/:b), where t is the input x cast to a list of strings (t:$x). Curly braces make the expression an anonymous function, and since no argument names are specified explicitly the name x is the single input argument.

A more idiomatic solution:

    lambda *lst: ''.join(['{}{}'.format(lst[0], _) for _ in lst])

Wow, that was a journey. Thanks.

It's syntactically correct, but pretty much nonsensical.

(I guess the original code did something useful and looked very similar to this.)

With Python even the code I wrote as a junior developer basically stands up five years later, looking back as a senior developer. There are definitely things I would have done differently, but there isn't any code where it would take several weeks just to figure out what the hell was going on and then another several weeks to make some modest improvements.

While better than perl, as someone who has to maintain others' python code, the tendency to want to write stuff compactly makes it more difficult to maintain than other languages in my opinion (the 5 others I work with). I've found other languages easier to read.

It might just be the code base I'm working with..

I think there are two forms of readability: localized readability and whole-program readability. Python only satisfies the former. Your statement about readability being one of the most important qualities I believe is true, but more for whole-program readability in my experience.

Agreed. Working on a large Python codebase, I find myself greatly missing Java's improved static analysis, even for simple things like "find the definition of this function".

If you aren't ideologically opposed to the use of an IDE then PyCharm solves this for the vast majority of cases. I get ~90% of the editor functionality I'd get in a statically typed language in PyCharm. Automated refactors and all that jazz...

Why do we engineers have a tendency to oversimplify everything to one big major cause like this? I see it all the time. Obviously in this case this is not the only reason. The real reason is a mix of many influences.

Golang forces readability; might want to give it a shot.

Just the other day I started looking back into Golang. Syntactically, and with Gofmt, it is quite readable.

But then they have to go and ruin it by making single letter variables a common thing.

based on Stack Overflow question visits

Is Python use really growing? Or is this a Stack Overflow thing? I thought it was being displaced by Javascript Everywhere.

Python had a 5-year setback from the Python 3 transition, but we're now mostly past that.

Well, most devs I know would rather use Python for server-side and scripting, but you're not wrong. Python never found a way to do cross-platform UI as well as React Native; JavaScript is in every browser; and you don't have to unlearn brace syntax.

That said; the idea of something like numpy in javascript is years off and laughable.

It's certainly growing here in the UK. From being level with Ruby 5 or 6 years ago the Python:Ruby jobs ratio is now 2.9. I've also watched Django recently edge ahead of Rails for the first time.

I think if anything the SO stats underestimate the pervasiveness and popularity of Python, even outside the data science world.

Javascript isn't displacing Python or any other language in data science. The web doesn't encompass all of computing.

Really simple answer...

University undergraduate classes in a variety of courses use it as their default language, so it is a bunch of students trying to complete their homework assignments.

OK Stackoverflow, that's enough charticles about this language or that, and which one's growing and which one gets used by whom at what time of day and in relation to which other query etc. It's like they didn't expect the attention they got for these, and now they're addicted to it. Nothing against Python... because if anything my underlying point might be that language doesn't ultimately matter that much.

But besides, there still remains the same question as always: Doesn't the posing of Stackoverflow questions tend to reflect more who's starting to use a language, than the sum total of who uses it per se? (Since presumably questions taper off with time.) If you're measuring growth or adoption, as in this case, sure, it might seem like a reasonable approximation, but it doesn't tell you anything about long-term use (by people who become experts and stop visiting Stackoverflow) or about attrition (people who quit using it and move on to something else). Thus any trend Stackoverflow can see, probably overemphasizes short-term phenomena. Which is great if you're doing algorithmic trading on the stock market and make your money by responding to the market's every short-term boom & bust (and other investors' overconfidence & diaper-crappings respectively), but not so great if you're choosing a language to which you'll devote months of study.

Python is a great language, despite its quirks. Imagine how popular Python would be if it had decent basic documentation and a single active version? The confusion caused by Python 3 caused at least a decade of slow adoption and the docs, well, that's pretty much a lost cause. (Compare https://docs.python.org/3/library/os.path.html to https://nodejs.org/api/path.html - the latter has a nice summary at the beginning, an index of methods at the top of the page listing everything below, method inputs and return values are noted for each method clearly, concise non-interactive terminal examples that you can copy/paste for each method is given, the exact version the method was added, a collapsed history of changes and any error types thrown. My only assumption is that the Python maintainers must have a side deal with O'Reilly to keep the docs in the worst format ever as a source of continued revenue for both.)

Python's documentation is terrible! I do like Python (only been using it for about a year), but the docs make it so hard to figure out how to do something.

I'd say that, over the last twenty years, Python's docs have gone from excellent to good to middling to excellent to good to middling (dropping as the language expanded and the generally expected standard went up, and improving when they did one major revamp).

So I agree that another revamp would be good now, but historically the docs have been a plus as much as they've been a minus, relative to Python's alternatives.

Much plain reference data is missing from the docs (like return types). It's neat to describe a function, but if you don't hand out the API, then the docs are not all that useful.

I'm doing mostly Ruby now and I miss the Python docs.

Many have listed out the technical reasons, but I think there's a cultural aspect: I think many people are turned off by the community of languages like Ruby. Sometimes you just want a great tool, not unicorns and kittens.

What exactly are you talking about here?

Are you referring to Github, Shopify, Basecamp and many other functional businesses scaled and built on the shoulders of ruby "unicorns" and "kittens" ?

Having worked with both languages, I can confirm that they simply have different view points with tradeoffs, and that in itself is a matter of personal opinion.

But claiming one community does not build great tools because you view them as "kittens" is ridiculous.

Also please do not generalize things like "most people" prefer X.

[EDIT] I sincerely hope this comment does not come off as rude. I am just trying to understand OP's point of view

From my outsider perspective, equivalent Ruby tools advertise themselves as the Most Awesome Thing Ever(TM), with well-designed pages, funny characters and exceedingly strong developer personalities.

In comparison, big Python projects have a lower profile, a more serious, adult tone, and generally don't brag too much about their capabilities.

In practice I've seen that the Python community's greater skepticism towards new packages means that libraries have evolved more slowly but with more certainty, whereas Ruby libs have been swapped out as fashions come and go. Witness the fact that Rails decided to package Coffeescript at its core, whereas Django has no strong opinions on frontend matters. Also look at Django's mature deprecation policy compared to pretty much everything else.

I think he means those intro to ruby tutorials where they make ruby (and programming in general) sound all magical, where it's so easy to program and programming is just unicorns and kittens or whatever. It becomes tiring and dumb after a while.

Add in ducks and geese and a whole slew of other animals and I end up wondering if I'm programming in a zoo with ruby lol.

Your comment is a little bit snarky. But I have to admit that stuff does put me off

The most interesting part of the post is the conclusion where the author says that he's a data scientist who uses R and doesn't feel the need to switch to Python!

He also has very good thoughts, and very important for beginners to know, about how there isn't a single right way to learn programming.

> In any case, data science is an exciting and growing field, and there’s plenty of room for multiple languages to thrive. My main conclusion is to encourage developers early in their career to consider building skills in data science. We’ve seen here that it’s among the fastest-growing components of the software development ecosystem, and one that’s become relevant across many industries.

How is that interesting? Statisticians and economists everywhere use R instead of Python

The fact that the author wrote a sizable post about how big Python is and is growing, but then says he's more of an R type of person instead of Python. Obviously that's interesting to see when authors aren't giant fans of what they're trying to promote.

What's particularly interesting is that you can tell someone's academic pedigree from whether they use R or Python. As you mentioned, R is very popular in stats, while Python is very popular in computer science. In fact, you can tell whether other academic fields are more influenced by stats or comp. sci by this measure as well. R in econ is a great example of this, while fields that build simulations to do research tend to skew Python.

My simple answer to this, apart from the arguments laid down in the article, is that the Python core devs (and community) have found a unique balance between sobriety and adventurousness. I think this explains why the language was adopted by data scientists and ML researchers. My complaints with other languages usually center around them being too preoccupied with theoretical or academic concerns (e.g. Haskell) or being too loose in adopting new features or opaque syntax (e.g. Javascript).

Also, Python is probably the most concise and easy-to-read imperative language that has gained wide acceptance.

Besides the obvious answer (it reads like pseudocode), don't forget: worse is better.

Of all major languages it is by far the slowest and hardest to extend, and by now it has easily surpassed PHP and JavaScript in its number of design quirks.

I don't think SO questions are the best way to gauge the popularity of a language and especially not a package.

A big reason why Pandas is so popular is because its API is so complicated. I probably have 10x more experience using Pandas than Django. And yet, when I'm doing something in Pandas I'm visiting the documentation far more often than Django. Let's compare the top level APIs:

>>> import pandas as pd

>>> len(dir(pd))


>>> import django

>>> len(dir(django))


Also, I find the data manipulation Pandas provides is far inferior to the R Hadleyverse and is one of the few things that R does better than Python.
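For anyone who wants to try the same kind of count themselves, here is a minimal sketch. It uses standard-library modules as stand-ins because pandas and Django may not be installed; the helper name `top_level_names` is made up for illustration, and the numbers will vary by Python version, so none are shown.

```python
import importlib


def top_level_names(module_name):
    """Return the public (non-underscore) names a module exposes via dir()."""
    module = importlib.import_module(module_name)
    return [name for name in dir(module) if not name.startswith("_")]


# Stand-in modules; swap in "pandas" or "django" if they are installed.
for module_name in ("json", "os.path"):
    print(module_name, len(top_level_names(module_name)))
```

Note that `dir()` only sees what the top-level package has already imported, so a count like this says little about a package with a deep tree of submodules.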

Django has a very deep package tree, there are tons of stuff in there — a lot of things you wouldn't expect.

Django admin could fill a section by itself.

Because programming in Python is great for productivity, has lots of useful libraries, is rather bearable/enjoyable depending on your view point and for 90% of people/companies you don't need the speed of anything more than PyPy or Cython because, believe me, you're not at Facebook or Google scale, and even those companies often use Python (Google most famously).

Btw, for those of you who dislike the dynamic typing (e.g. myself) of Python, check out Cython, it's really heaven.

I'm a Python answerer on Stack Overflow, I teach Python at NYU, and whaddaya know, I write Python for my day job.

Why is it growing so quickly? I speculate:

Lots of people want to learn to program - and Python is easy to learn and teach.

I think it blows other languages out of the water. I'm going to compare it to its closest competitor, Ruby. Ruby is also open-source, high level, popular relative to other languages (which means you can probably get help if you need it), it has lots of libraries (meaning you can do a lot with it), ok tutorials, and a decent package manager. On the downside, Ruby is also a very large language, which makes it much harder to learn to mastery.

I think Python blows Ruby out of the water in terms of how easy Python is to learn relative to Ruby.

I don't encourage people to learn Ruby. I spent a long time (months) trying to learn it. I still don't tell people I know it.

I used to think that Ruby's syntax is twice as much to learn - now I think it might be even more than that.

Ruby's YACC file is ~918 lines long[0], while Python's grammar file is 149 lines long.

    ~$ wc cpython/Grammar/Grammar 
     149  879 6472 cpython/Grammar/Grammar
Even if 1/2 of Ruby's grammar output is whitespace, that's more than twice as much grammar to learn as Python's grammar[1].

I could compare Python to other languages, but I don't think they come close to matching the upsides of Python.

I still want to learn other languages, like Haskell, lisps, and C, to a level of mastery - but since I can be so productive so fast with Python, the other goals are something I work on in my spare time without a bunch of urgency.

[0] https://stackoverflow.com/a/16629318/541136

[1] https://docs.python.org/3/reference/grammar.html

PS I can see that this comparison to Ruby isn't popular due to the downvoting - but nobody is contradicting my point.

Cough. Python's cpython/Grammar/Grammar uses a compact EBNF notation. Converted to yacc it is about 850 lines long.

Now that powerful machines are commonplace, the biggest limiting factor of software creation is developer productivity.

Python's so popular because it maximizes developer productivity.

I say all the following as an avowed Rubyist

1) Computational Linguistics libraries

2) Machine Learning libraries

3) Still one of the blessed languages internal to Google? and BDFL was employed for a while there

4) Jupyter Notebooks

5) Coder Dojo approved language

I used to think that all dynamic languages (Ruby/Python/Javascript/Perl/…) were of a par but I have to admit that Python has an awful lot of mindshare and it's beginning to exhibit strong network effects.

>I used to think that all dynamic languages (Ruby/Python/Javascript/Perl/…) were of a par but I have to admit that Python has an awful lot of mindshare and it's beginning to exhibit strong network effects.

Ruby is a wonderful language with a really unfortunate standard implementation. It's easily 5 times slower than Node at practically anything, which is sad because Ruby is infinitely better than Javascript. I really think this is what it comes down to and why Rails has been dying off.

That's why I have high hopes for Crystal. Yes, it isn't exactly Ruby, but it is a compiled language that has a very Ruby-ish feel.

Agreed on Ruby as a language, but Node also has the advantage of being isomorphic. It's hard to beat JS when it's the browser language, and then you provide a really nice server side platform for it.

Plus there has been a reaction to Rails size and complexity. I kind of hate that everyone thinks of Rails when Ruby is mentioned. But I thought Sinatra was a really nice microframework.

> but Node also has the advantage of being isomorphic. It's hard to beat JS when it's the browser language

This has been repeated often but so far I haven't found a really convincing argument. All my friends who know C, C++, C#, Java (and other languages) had no problem picking up Javascript/ES7 for programming the browser using whatever you need.

I think this argument was exciting for the front-end developers who wanted to try doing things on the server but couldn't be bothered into learning a new language. So now with Node.js they are happy and satisfied, until they realize that the problem is not the language; the problem is that server programming is a different matter altogether, no matter what the language is.

All that beautiful meta-programming flexibility has come at a price. The research projects sponsored by the likes of IBM and Oracle to give Ruby a much needed speed boost are very slow in the offing – it feels like V8 was developed overnight in comparison. Aesthetically I think Ruby beats Python/Javascript/Perl but unfortunately that's not enough to move the needle on the dial.

It'll be interesting to see whether Python or Ruby gets to WebAssembly first – am I correct in saying that neither has as of yet?

Common Lisp found a way to be performant. Is Ruby really that much harder to optimize than other dynamic languages? JS had the advantage that there was strong incentive to make it fast in the browser, and V8 had Lars Bak working on it.

Apologies to everyone for turning this into a Ruby thing on a Python topic.

You can check out the Oracle Labs code and progress of Graal/TruffleRuby here: https://github.com/graalvm/truffleruby

Relevant page: https://labs.oracle.com/pls/apex/f?p=labs:49:::::P49_PROJECT...

And IBM's Ruby+OMR code and progress is here: https://github.com/rubyomr-preview/ruby

Relevant post: https://developer.ibm.com/open/2017/03/01/ruby-omr-jit-compi...

Can Matz somehow in the nicest possible way kidnap Lars Bak and pay him in gummy bears or something?

Ruby needs Matz's 3x3 yesterday as it's haemorrhaging mindshare all over the place. Node is eating its lunch in web development, Golang is encroaching on the niche held by Chef and Puppet and Elixir offers the missing concurrency. Consequently Ruby is slipping, little by little, in the job statistics. There are just too many alternatives for it to grow unless Truffle Ruby or Substrate VM come up with something soon.

> All that beautiful meta-programming flexibility has come at a price

Scheme, Common Lisp, and Julia all have good metaprogramming facilities (I'd argue, for Scheme & CL, way beyond what Ruby offers), and they are highly performant, indeed much faster than Ruby, Python (even under Jython) and Javascript under V8.

Ruby 2.4 and Python 3.6 are the same speed unless you're using a super-optimised library such as numpy. In fact, for string parsing Ruby is faster than Python due to its superior regex implementation. PHP 7, on the other hand, is much faster than Python or Ruby. I don't know how Zend pulled it off.

> I used to think that all dynamic languages (Ruby/Python/Javascript/Perl/…) were of a par

Try Lua, Common Lisp, Julia, and Racket, for a surprise.

I have mostly left Python and moved to R, Racket and Lua for my needs. It is hard to see how Python won't stay the predominant first language for decades to come just due to momentum and the good reputation.

My first language was Basic and my second language was C64 Assembly and Pascal. I would have killed for Python as a kid.

It's the libraries, I think. That's why I started using it. I don't think it's particularly different syntactically from most other languages. The built-in data types are nice, though I don't know why they insist on using different names from other languages (lists, dictionaries ...)

Better editors help too.

I may be biased, but I much prefer "dictionary" to "hash table" and "list" to "array." They provide much better metaphors to real-life things, instead of requiring people to learn and use additional jargon.

It's dead simple and people complain about speed but I use it for massive distributed backends handling hundreds of messages each second and scaling without problem.

And this was way back when twisted was the only asynchronous game in town. So I'm not surprised it's growing.

How do people create and manage internal python packages that work together as an ensemble? I've found the packaging system with setup.py dependency_links to be pretty cumbersome and not well supported. How do people version these internal packages and manage them? Running your own pypi seems nontrivial and using github eggs/releases isn't very functional either.

My experience with python has been positive but I don't understand how people who want to create separate libraries for specific functionality tie these codebases together without using deprecated features or containerizing everything.

Gemfury provide a service which might be useful.

If you need to keep things on premise, is running your own pypi particularly hard? I've never tried so maybe there's a bunch of hidden complexity but there are some packages to run one and even a docker image: https://hub.docker.com/r/codekoala/pypi/builds/bwe6cdn4swgyi...

pip2pi makes it really easy to run your own PyPI. All you need is something that can serve static files (Nginx, S3, etc.)


The obvious answer to me is the slow trickle down of people from universities as CS programs have migrated from Java/Scheme to Python. This also overlaps with Stack Overflow usage, so I think that explains a lot.

Python has been continuously becoming the entry point of computer programming. I think the language reached that goal and now people who started with python are contributing in different fields.

Personally I met with python thanks to course assignments. It helped me to finish my projects way faster than any other language. I loved its simplicity and now even in my complex projects, I tend to solve my problems in "python way".

Also, things like modularity and almost non-existent boilerplate code attract data scientists without a computer science background.

#1 - it comes installed on most Linux and on Mac (even though different versions can be a bit of a pain sometimes)

#2 - it's dead simple to do things; the syntax is, to me at least, about as direct and uncomplicated as you can get (and without being overly wordy!)

#3 - so many handy libraries... it's good for full projects, and it's great for integrating and gluing things together

I wouldn't call programming in Python fun, per se, but I do call "getting things done quickly and easily" fun.

It's not only because of data science or machine learning. Big, high profile projects are being done in Python (eg. Openstack).

I am at a big company, and it seems most new projects are being done in Python. No one asks why, it is just accepted.

Now the question is why. I have no explanation, other than maybe that's because universities have been teaching python more often, and it happens to be practical.

I just wish everyone were using Python 3.

As someone introduced to Python and C two semesters ago, I'm not surprised at all. If I want to be really nerdy and crank some performance, I write C. If I want to test ideas, write to a text file, or basically anything else, I use Python. Python just makes it all super easy and nice. I respect the performance gains of C, but ease of use will always win.

I've started to use python a lot for 3d scripting on Blender and Maya. On Blender this is the only option.

Python has been in the top 3 of hiring posts within "Who is hiring" threads for at least 4 years now[1] so interesting to see other datasets noting Python's popularity as well.

1. https://hntrends.com

Why is Python So Aggravating?

That's a question that bugs me. I can't believe the level of obviousness of the things that are wrong. That is because we write complex code. Things like direct access to data members (which is 'pythonic'!) are cruel blows to software productivity for complex code. Ugh, Ugh, let me count the ways: No compiler to check the code. The other day I dumped 800 extra lines into a file as a typo. Many, many duplicate functions. Of our whole test suite, how many tests failed? One!!! Ugh. This is not even to mention the topics of Black.Holes.In.Your.Code because you can't tell what types the functions receive. I really think python is driven by non-programmers who consume rather than produce complex code and the teachers who teach these people. I believe the problems will increase over time. Does anyone know the story of Hack at Facebook? I heard the lead developer give a talk once. Be afraid.

Keep in mind that there are Python development practices which might be adequate to solve your problems.

> Things like direct access to data members

I think you're complaining about (lack-of-) encapsulation. Between methods, functions, and properties, there's ample opportunity for encapsulation. The convention is for internal-use-only members to be prefixed with an underscore.
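To make the underscore convention concrete, here is a minimal sketch. The `Account` class and its attribute names are hypothetical, invented only for illustration; the point is that a property can expose a read-only view of an internal attribute.

```python
class Account:
    """Hypothetical example class; not taken from any real codebase."""

    def __init__(self, balance):
        # Leading underscore: internal by convention, not enforced by the language.
        self._balance = balance

    @property
    def balance(self):
        # Read-only public view of the internal attribute.
        return self._balance

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount


account = Account(100)
account.deposit(50)
print(account.balance)  # 150

try:
    account.balance = 0  # no setter is defined, so assignment fails
except AttributeError:
    print("balance is read-only")
```

Nothing stops a caller from touching `account._balance` directly; the underscore is a signal to other developers, not an access check.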

> No compiler to check the code.

In fact there are many static checkers. pylint, pyflakes, etc.

> This is not even to mention the topics of Black.Holes.In.Your.Code because you can't tell what types the functions receive.

In fact, you can use type hints [1] to do this.
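A minimal sketch of what that looks like, assuming Python 3.5+ and the `typing` module; the function `mean` is just an illustrative name.

```python
from typing import List


def mean(values: List[float]) -> float:
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(mean([1.0, 2.0, 3.0]))  # 2.0

# The hints are stored on the function and can be read by tools.
print(mean.__annotations__)
```

The interpreter does not enforce these hints at runtime; they pay off when the code is run through an external type checker such as mypy.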

> The other day I dumped 800 extra lines into a file as a typo. Many, many duplicate functions. Of our whole test suite, how many tests failed? One!!! Ugh.

This isn't quite explicit enough for me to understand. You unintentionally redefined a function multiple times in the same file, and your test suite caught the error. You think that instead Python should've refused to execute/import this module w/redefined functions? Hmm, I suppose it's a very uncommon need to redefine functions but it's not uncommon to assign a function to a file-scope name. Also keep in mind that both pylint and pyflakes detect this mistake.
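For readers who haven't hit this: a later `def` with the same name simply rebinds the name, with no error. A minimal sketch with a made-up `area` function:

```python
def area(width, height):
    return width * height


def area(width, height):  # accidental duplicate: silently replaces the first
    return 2 * (width + height)


# The second definition wins; the first is no longer reachable by name.
print(area(2, 3))  # 10
```

This is why a pasted block of duplicate functions can import cleanly and only surface through tests or a linter.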

[1] https://docs.python.org/3/library/typing.html

> I'm not sure that you've taken the time to understand python development practices.

I think this tone is misplaced here.

I think you are inferring more tone than was intended but I will edit it because it could be considered rude.

> This is not even to mention the topics of Black.Holes.In.Your.Code because you can't tell what types the functions receive.

Use aptly named parameters, always, add assert() statements for type checking whenever you feel mistakes could happen.

I can bet, based on my experience, that it's much "safer" (less error-prone) to use a language that allows calling functions with named parameters (where the order of parameters doesn't matter and you know exactly which parameter you are sending a value into), than a language like C++ or Java which has a static typing system but has no named parameters, plus depends on argument order to call a function.
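A minimal sketch of that style, assuming Python 3 (the bare `*` for keyword-only arguments is Python 3 syntax); `resize` and its parameters are hypothetical names.

```python
def resize(image_path, *, width, height):
    # The bare * makes width and height keyword-only.
    assert isinstance(width, int), "width must be an int"
    assert isinstance(height, int), "height must be an int"
    return "{}: {}x{}".format(image_path, width, height)


# Argument order no longer matters, and the call site documents itself.
print(resize("cat.png", height=600, width=800))  # cat.png: 800x600

try:
    resize("cat.png", 800, 600)  # positional use is rejected
except TypeError:
    print("width and height must be passed by name")
```

One caveat: `assert` statements are skipped when Python runs with `-O`, so they work as a development-time check rather than a guarantee.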

You should try using Python in this way.

We do that. Try developing a complex code base with a medium sized team. It becomes very hard to make changes!!! You just need compiler support, IMHO. Python offers ease and fun, and that has led to an explosion of functionality. But there is a tradeoff. And at some point we will hit that wall, as an industry.

Just use kotlin or golang. You get a modern language and you get types. And you still have java libraries with kotlin.

I totally understand the lack of typing & static compilation making things painful once you reach a scale where you can't put the codebase in your head. Asserts, unit tests and variable names are poor substitutes. As an interim solution have you tried converting your internal python code to be fully typed with 'optional types'?

>golang. You get a modern language

Go is far from a modern language, being more like a mutant, crippled subset of Algol-68, which, as the name says, dates from 1968...

