
Ask HN: What about Python makes it popular for ML? - ccajas
I don't have experience with Python, but from my admittedly superficial point of view I don't see it as much more powerful or flexible a VHLL than, say, Ruby or Perl. What made Python the high-level language of choice for machine learning? Were there just a couple influencers in the industry who decided to build ML libraries for it because Python was already a familiar language to them? Would the language choice have been immaterial if they had been working with a different high-level language?
======
eesmith
My two bits. A bit pedagogical, so apologies for covering things you might
already know.

Python effectively started in the early 1990s. Perl was the hot new language
for "scripting", but Perl 4 was lousy as an embedded language and didn't
support extensions. This was fixed in Perl 5, which also made it possible to
do complex data structures. I think early on the extension mechanism was also
Unix-centric, and required a Makefile?

Tcl was amazingly easy to embed and extend. It was designed with that in mind.
It wasn't so good at complex data structures. OO programming was an add-on,
with several different available extensions. I also recall some problems with
garbage collection of extension objects ... it's been a while.

Python was possible to embed, and easy to extend. It had support for more
complex data types. It was a better fit than Perl 4 or Tcl for 2D and higher
arrays. As such, it was picked up by the more numerically oriented developers,
who used Python to, using the popular phrase from then, 'steer'
computationally intensive jobs.

Python-the-language also changed to make it easier to support multi-
dimensional array indexing, specifically for numerical computing.
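
A minimal sketch of what that indexing support looks like at the language
level, assuming the changes meant here are comma-separated subscripts,
extended slices, and the `...` literal. `IndexSpy` is a hypothetical helper
invented for illustration; it just hands back whatever key the interpreter
passes in, which is what an array library gets to work with.

```python
class IndexSpy:
    """Hypothetical helper: returns the key Python passes to __getitem__."""

    def __getitem__(self, key):
        return key


spy = IndexSpy()

# A comma-separated subscript arrives as a single tuple, one entry per axis.
key_2d = spy[1, 2]

# Slices with a step and the Ellipsis literal are passed through as objects,
# so a library can interpret "every other row, all remaining axes" itself.
key_sliced = spy[0:10:2, ...]

print(key_2d)      # (1, 2)
print(key_sliced)  # (slice(0, 10, 2), Ellipsis)
```

An array type only has to implement `__getitem__` to give `a[i, j]` and
`a[::2, ...]` whatever multi-dimensional meaning it wants.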

(Perl at the time was very popular in bioinformatics, which at that time
involved a lot of RE-style pattern matching, a lot of format munging, and a
lot of making different pieces work together - things that Perl excels at. But
not numerical computing.)

As a result, Python had several iterations of Numeric/numarray/NumPy while
Perl, Ruby, and Tcl - languages which were used more by non-numeric subfields
- did not.

Then came the web. First Zope, and later Django, helped make Python a popular
language for web development. The easy extension mechanism made it easy to add
all sorts of C extension modules.

And Fortran extensions. And R extensions through adapters.

A lot of the early ML code was written in C, C++, Fortran, or R. The natural
solution is to put the data into a NumPy array, pass it via some bindings to
the other language, and continue to work with it in Python.

I don't think Perl and Ruby had the same level of numeric support +
extensions, making it more difficult. Python got critical mass, and continued
to grow.

People have been doing machine learning with Python since the 1990s. I can't
think of who the "couple influencers" might be.

------
jimnotgym
1) Libraries: scikit-learn, pandas, and NumPy make it easy to work with data.

2) Easy to learn but powerful, lots of examples online, lots of good books,
great community.

3) IPython/Jupyter notebooks make it really easy to work through a solution
and share it.

~~~
stephen82
You just saved me from typing the same things more or less! hahaha ^_^

------
lordCarbonFiber
The huge thing for Python is its ability to call out to lower-level (faster)
languages. From a machine learning point of view, the ability to stuff data
into buffers cheaply, to be consumed by C/Fortran, is what's important (all of
those fancy math and data science libraries do the heavy lifting in
C/Fortran).

[https://www.python.org/dev/peps/pep-3118/](https://www.python.org/dev/peps/pep-3118/)
for more on how this was accomplished and a bigger picture of the story.
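
A rough sketch of the "stuff data into buffers cheaply" idea using only the
standard library, so it stands in for what NumPy and the C/Fortran libraries
do rather than showing their actual code. `array.array` exposes its memory
through the buffer protocol; `memoryview` and `ctypes` then look at that same
memory without copying. The variable names are made up for illustration.

```python
import array
import ctypes

# A contiguous block of C doubles owned by Python.
samples = array.array("d", [1.0, 2.0, 3.0, 4.0])

# memoryview exposes the buffer's layout (element format, item size)
# without copying the data.
view = memoryview(samples)

# ctypes builds a C-compatible double[4] on top of the same memory. This is
# the kind of object whose address could be handed to a C function.
c_doubles = (ctypes.c_double * len(samples)).from_buffer(samples)

# Writing through the C-side view changes the original array: no copy was made.
c_doubles[0] = 10.0

print(samples[0])   # 10.0
print(view.format)  # d
```

Because every participant shares one block of memory, the cost of crossing
from Python into compiled code does not grow with the size of the data.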

------
srinathkrishna
Simplicity is one of the main traits of Python. It’s conducive to prototyping
and getting things done quickly. Libraries like scikit-learn and PyTorch have
also helped developers build larger solutions from smaller building blocks
without worrying too much about implementations.

Nevertheless, Python suffers from serious performance constraints, and
productionizing an ML service that requires real-time analysis is going to
take some effort. For such systems, folks usually lean towards a hybrid
stack.

