
Myths of Enterprise Python - rbanffy
https://www.paypal-engineering.com/2014/12/10/10-myths-of-enterprise-python/
======
plinkplonk
This is a fairly weak article, full of deflections around the real weaknesses
of python. I'm surprised it is on the Paypal Engineering blog.

Like all languages, Python has strengths and weaknesses, and there is no shame
in that. An honest article would address the negatives head on, acknowledge
them as potential negatives (not skip around them), and provide alternatives.

The strawman "Python has a weak type system" is a good example of such
deflection.

No one (who understands type systems) would complain that "Python has a weak
type system"

A more common "criticism" would be "Python does not have a _static_ type
system, which would be handy for large codebases worked on by large teams".

Address _that_ upfront, and you have a decent article. This is just a fanboy
(who just happens to be at Paypal) praising his favorite language and ignoring
its weaknesses.

~~~
TorKlingberg
Or the criticisms of Python that he has encountered are just different from
your criticisms.

I have not seen anyone complain that Python is not compiled for at least a
decade, but maybe the author has.

~~~
TheOtherHobbes
I always read "not compiled" as a synonym for "slow."

I'm not worried about Python's type system. At worst it's manageable, at best
it's expressive for prototyping.

But when I see benchmarks that suggest Python is 10 to 100X slower for
critical server code, I have to wonder why anyone would use it for enterprise
development.

Which is why there are so many Java and C++ code jockeys working in
enterprise. Neither language is pretty or fun or interesting from a CS point
of view. But there's no arguing both consistently run faster than anything
this side of assembler.

I would have expected critical industrial infrastructure code to pay some
attention to that - because speed isn't an abstraction. When you're running
giant data centres, extra cycles consistently cost _real money_.

Dev costs are relatively small compared to operating costs. So it's well worth
spending extra time getting good, fast compiled code working.

~~~
crdoconnor
>But when I see benchmarks that suggest Python is 10 to 100X slower for
critical server code, I have to wonder why anyone would use it for enterprise
development.

Because CPU cycles are cheap and bugs are not.

>Dev costs are relatively small compared to operating costs.

Uhh, not in my experience.

~~~
aetherson
It really just depends on the project. Sometimes, all you want to do is, like,
take data, type-check it, maybe do a couple of simple transforms, and then
store it. But you want to do it 50,000 times per second.

In that case, it may very well be the case that ops costs absolutely dwarf dev
costs.

Similarly, it may be that what you want is to take data and run it through
super-complicated algorithms depending on a lot of business data, and massage
it all over the place... but you only need to do this 10 times per second. In
which case your dev costs may absolutely dwarf your ops costs.

------
mapgrep
If you want me to trust what you say about a language — or any technology,
actually — be forthright about its deficiencies.

Because I don't believe you really, truly understand a language until you can
tell me what sucks about it. It takes significant time (in a reasonably decent
language) to discover the corner cases, performance bottlenecks, quirks, big-
deficiencies-hidden-in-plain-sight and outright bugs in a language like, say,
python.

Something as "rah-rah" as this, which goes so far as to basically call the GIL a
source of unicorns and rainbows, is convincing almost in inverse proportion to
its stridency. Suddenly I'm wondering if all these "myths" about python might
be smoke pointing toward a fire. That's probably a bit unfair, but it's hard
to know what to take seriously when you're listening to a voice that's less
than credible.

~~~
crdoconnor
>Something as "rah-rah" as this, which goes so far as to basically call the GIL
a source of unicorns and rainbows

The GIL is definitely no source of unicorns and rainbows, but I think the case
against it is usually overstated. There are numerous ways of sidestepping it
(multiprocessing, pypy, C extensions, etc.), and it _does_ serve a useful
purpose.

~~~
w0utert
>> _The GIL is definitely no source of unicorns and rainbows, but I think the
case against it is usually overstated. There are numerous ways of sidestepping
it (multiprocessing, pypy, C extensions, etc.), and it does serve a useful
purpose._

That's definitely all true, but the article brushes its implications for
multithreaded Python off as if they don't exist, which is what one of the
posters above me was probably referring to.

Yes, you can do multiprocessing, but if I have lots of shared, volatile state
that's not what I want. Yes, you can use PyPy, but if that doesn't work for
some Python framework I use, or if I can't control my deployment environment
and it only has CPython, I can't use PyPy. Obviously you can write C extensions
for just about any language, so using that as an argument for why the GIL is
not a problem for multithreaded Python is disingenuous. Maybe I don't know or
don't like to program in C? Green threads are not a substitute for
multi-threading either, as they still don't allow full utilization of multiple
cores and are really only a solution for I/O-bound processing.

Of the 'numerous ways to sidestep the GIL', none are satisfactory if you have a
CPU-bound problem operating on shared state that lends itself well to parallel
execution, and there are many such problems. I wouldn't use Python to write a
video codec or to do DNA sequence processing, for example. It's not a fatal
flaw for Python-as-a-language, but it's a flaw of CPython nonetheless, and not
an insignificant one.

~~~
crdoconnor
I wouldn't write a video codec in python either, nor would I write code that
requires a huge amount of shared volatile state, but these things are not
common coding tasks in general, particularly not in enterprisey-type
programming.

> Maybe I don't know or don't like to program in C?

If you want to write a video codec or highly performant multithreaded code,
you should probably give it a go.

>Of the 'numerous ways to sidestep the GIL', none are satisfactory if you have
a CPU-bound problem operating on shared state that lends itself well to
parallel execution, and there are many such problems.

You mean exactly like the matrix calculations done in the C extensions of
numpy?
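For illustration, the pattern that comment points at looks roughly like this: numpy's matrix routines execute in C, where the GIL can be released, so ordinary OS threads can keep multiple cores busy (sizes here are arbitrary):

```python
import threading

import numpy as np

a = np.random.rand(300, 300)
b = np.random.rand(300, 300)
results = [None] * 4

def work(i):
    # np.dot runs inside numpy's C code, which can release the
    # GIL, so these threads are not serialized the way pure
    # Python bytecode would be
    results[i] = np.dot(a, b)

threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whether you actually see a multi-core speedup depends on the BLAS build numpy is linked against, but the GIL itself is not the obstacle here.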

~~~
Lofkin
Also numba. It is a multithreaded JIT compiler for a subset of python code.
Just need a decorator.

------
jrochkind1
Meh.... I think he's right that there's no good reason _not_ to use Python.

But I think he overstates the case.

Calling Python "compiled" doesn't seem right -- he's more right that this
doesn't matter: there's no reason not to use something because it's "not
compiled".

Calling Python's typing "strong typing", eh, really? He provides a link to a
Wikipedia article on typing in general; how about a link to anyone else making
the case that Python should be considered 'strongly typed', for a particular
definition of 'strongly typed'? I'm dubious that it would be any definition of
'strongly typed' that those who think they want it... want. Again, I think a
better argument might be: so what, you don't need strong typing for success,
because of a, b, and c.

And the concurrency discussion is just silly. Sure, you can do _all sorts_ of
things without multi-core concurrency, you may not need it or even benefit
from it, and scaling to multi-process can sometimes work just fine too. But
there are also _all sorts_ of cases where you do really want multi-core
parallelism, and Python really doesn't have it -- and trying to justify this
as "makes it much easier to use OS threads" is just silly. I guess it makes it
'easier' in that it protects you from some but not all race conditions in
fairly unpredictable ways -- but only by eliminating half the conceptual use
cases for threads in the first place (still good for IO-bound work, no longer
good for CPU-bound work).
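The IO-bound half of that split is easy to demonstrate; here sleep stands in for a real network or disk wait:

```python
import threading
import time

def fake_io():
    # time.sleep releases the GIL, just like a blocking socket
    # or disk read, so these five threads overlap
    time.sleep(0.2)

start = time.time()
threads = [threading.Thread(target=fake_io) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start
# elapsed is roughly 0.2s rather than 5 * 0.2s; a CPU-bound
# fake_io would show no such overlap under the GIL
```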

I think there's a reasonable argument to be made that Python will work just
fine for 'enterprise' work. There are also, surely, like for all
languages/platforms, cases where it would be unsuitable. By overstating the
case with some half-truths and confusions, he just sounds like a fanboy and
does not help increase understanding or accurate decision-making -- or
confidence in the Python community's understanding!

~~~
dgemm

      >>> 1 + "0"
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      TypeError: unsupported operand type(s) for +: 'int' and 'str'
    

This is what strong typing refers to.

~~~
reinhardt
It's not so clearcut. If this is evidence Python is strongly typed, would the
following be evidence to the contrary?

    
    
      In [1]: 1 + 2.5
      Out[1]: 3.5
    
      In [2]: "foo" * 3
      Out[2]: 'foofoofoo'
    
      In [3]: [1, 2] * 3
      Out[3]: [1, 2, 1, 2, 1, 2]

~~~
agentultra
It's not contrary, it's just convenience. One of the weaknesses in Python (if
you can call it that) is that everything is an "object." The classes
implementing string, list, and integer simply have methods that respond to
those operators and rhs types.

    
    
      In [1]: "foo" + 3
      ---------------------------------------------------------------------------
      TypeError                                 Traceback (most recent call last)
      <ipython-input-1-21582e79f06e> in <module>()
      ----> 1 "foo" + 3
    
      TypeError: cannot concatenate 'str' and 'int' objects
    

That's because the "special" method "__add__" implemented by the string class
will raise TypeError if it gets any object that isn't an instance of a
subclass of string.
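The same mechanism can be sketched with a toy class (a hypothetical `Money` type; returning `NotImplemented` is what makes Python raise the familiar `TypeError`):

```python
class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        # Refuse anything that isn't another Money; returning
        # NotImplemented lets Python raise TypeError for us
        if not isinstance(other, Money):
            return NotImplemented
        return Money(self.cents + other.cents)

total = Money(150) + Money(250)
print(total.cents)  # 400

try:
    Money(150) + 3
except TypeError as e:
    print("rejected:", e)
```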

In a way it is kind of funny to call it, "strongly typed," but it does work.
Maybe it should be called, "instancely-typed."

~~~
jrochkind1
Okay, with this definition of 'strongly typed', can anyone come up with an
example language that _isn't_ strongly typed?

It seems to make 'strongly typed' pretty meaningless, and this definition is
probably _not_ what anyone who says they want a "strongly typed" language is
using, so it hardly counters them to say that Python is "strongly typed" under
another definition; it's just confusing them with semantics.

(Of course, the people who say they want 'strongly typed' may have no idea
what they're actually talking about, but wouldn't it be better to educate them
than to take advantage of their ignorance to push your pro-python agenda?)

~~~
Retra
C is not strongly typed.

In general, when someone is talking about strong typing, they are talking
about silent failure for unintuitive or ambiguous constructs. For example, if
the expression

    
    
        x = '1' + 1
    

results in an error, you are probably using a strongly-typed language. In C,
this is equivalent to

    
    
        x = 50;
    

In javascript you get

    
    
        x = '11'
    

In PHP you get

    
    
        x = 2
    

These are examples of weak typing.

~~~
agentultra
Strong, as opposed to weak, implying that there is type-constraint checking;
it's just done at run-time in Python.

You could also describe Python's type system to be dynamic in that instances
of the type meta-class define the constraints and objects (instances of a
class) can have constraints added and removed at run-time. Python is still
fairly strong in this regard in that the built-in classes are immutable (i.e.
it is a TypeError to assign a bound method to an attribute on a built-in class
such as str).
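That built-in immutability is easy to check at a prompt:

```python
# CPython refuses run-time modification of built-in classes
try:
    str.shout = lambda self: self.upper() + "!"
except TypeError:
    print("str is immutable")
```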

I suggest "instancely-typed" because of categories, unions, and type theory.
I'm only coming to grips with that through OCaml, whose type solver can be
awesome, annoying, and cryptic all at once. And at the end of the day I'm still
not sure what it's buying me other than proving exhaustive pattern matches in
certain conditions, fast pattern dispatching, and requiring specialized
operators (+, -, *, / for ints... +., -., *., /. for floats... etc.). I'm sure
the enlightenment will come when it stops being such a PITA to write a basic
program.

Sometimes not having to satisfy the constraints up-front makes exploratory
programming (where the constraints are not specified and known up front)
easier. Python is going the annotation route in newer versions of the
language, which is rather useful: tools could be written to verify consistency
up-front (or at least provide hints).

------
gamesbrainiac
As a pythonista, the only issue I had is with types. A lot of the time, I
don't know what to expect from a function; it could be None or an int (legacy
code). In a typed language like Go, you would not have the ability to do this.
So, although powerful, it opens the door for potentially bad decisions.

Python's proposal for type hinting will dampen the effect of type
inconsistency I feel.
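With the proposed hinting (PEP 484-style annotations), that expectation could at least be written down; a sketch with invented names:

```python
from typing import Optional

def find_user_id(name: str) -> Optional[int]:
    # The annotation documents that None is a legal result, so
    # callers (and tools like mypy) know to check for it
    users = {"alice": 1, "bob": 2}  # stand-in for a legacy lookup
    return users.get(name)

print(find_user_id("alice"))  # 1
print(find_user_id("carol"))  # None
```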

~~~
mox1
Yes, in theory this is a problem in lots of languages which are dynamically
typed. There are lots of ways of handling this, some generic and some
language-specific.

In python I would suggest wrapping your call in a try / except block or using
"isinstance()".

If you are using a publicly available library or popular piece of code that
can return None or an Integer, I would argue that piece of code was written
incorrectly. Newbie python users might do this, but I think experienced python
devs would see the problem.

Finally, some documentation or justification for the reasoning behind this
decision at the top of the function or on a web page somewhere would help as
well.

As with most powerful, full-featured languages, it is pretty easy to shoot
yourself in the foot if you are not careful. I don't think this is a
Python-specific problem...

~~~
chucksmash
I would counter that isinstance() checking is - at least in the general case -
an antipattern.

Better to check that the object you've been passed implements the interface
you need rather than reasoning about its inheritance chain. Duck typing!
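A small sketch of the duck-typed alternative: accept anything file-like instead of a specific class (names are illustrative):

```python
import io

def save(obj, data):
    # Care about capability, not ancestry: anything with a
    # write() method will do, regardless of its class
    if not hasattr(obj, "write"):
        raise TypeError("expected a file-like object with write()")
    obj.write(data)

buf = io.StringIO()
save(buf, "hello")
print(buf.getvalue())  # hello
```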

~~~
marcosdumay
Sometimes one wants classes to be semantic, not only structural. That means
that when the developer writes "class C(Foo)" he means something different
than writing "class C(Bar)", even if both classes implement the same methods.

------
robmccoll
Python is great for relatively small projects written by a single person or
for glue code, but large Python projects are scary. Diving in and reading
someone else's code can be quite difficult (what are the types of these
parameters? they're being used like dicts, but what if they're really a
completely different class with different behavior? how can I trace this down
when it goes through multiple IPC indirections since they chose to use
multiple processes to get around the GIL? oh i'm missing packages and they
don't seem to be in pip...). The bigger the project, the more places there are
for runtime errors that a statically compiled language would have caught or at
least has better tooling to find (syntax errors in an error path block that
rarely gets hit, type errors, uncaught exceptions, missing methods,
accidentally fat-fingering a variable name and instantiating a new variable).
You end up leaning really heavily on tests that wouldn't have to be written in
other languages and documentation that would otherwise be statically enforced
in many other languages. Even when using Python for glue code, be wary that it
will need to be in charge (i.e. embedding your code in Python tends to work
better than embedding a Python interpreter in your code, especially with
CPython).

Not saying it's not possible and doesn't work well for a lot of companies, but
I do think you take on technical debt to get up and running fast when you
choose Python. It's a choice I would personally think twice about.

~~~
mpdehaan2
This is an argument for unit tests, not for a statically typed language.

Type errors tend to be very few and far between compared to logic errors.

Multiple processes in most web applications are best handled by a pre-forking
webserver in front of something like mod_wsgi, and using an engine like Celery
for asynchronous and long-running operations on the backend, often launched
with something like supervisor.

I do take issue with some parts of the article, namely saying there is a good
type system (there's not, but it's all fine, duck typing, etc) and that
twisted is a good framework.

It is definitely true however that Python is a great fit for all kinds of
serious applications, but with all things, it takes discipline.

~~~
Retra
What is the benefit to using a dynamically typed language if you have to spend
all your time writing tests and verifying the static properties of your code?

~~~
spdy
Mainly the boilerplate other languages force upon you. Whenever I look at Java
code I start to chuckle.

It's so much more information that you need to load into your head.

To quote a picture from the post: [https://www.paypal-engineering.com/wordpress/wp-content/uploads/2014/12/cpp_py_medium.png](https://www.paypal-engineering.com/wordpress/wp-content/uploads/2014/12/cpp_py_medium.png)

And it's not like you need static typing everywhere; just in some places it's
better to enforce it for your own sanity. Type hinting looks like a good way
to solve this problem in Python.

~~~
Retra
That image isn't really a fair comparison (in that C++ / Java aren't the only
statically typed languages).

I imagine a statically typed Python dialect would only have an extra 15 or so
lines, and the benefits would be numerous.

I love python as much as anyone, but having to keep a reference manual handy
just to use someone's library because they had the audacity to use a variable
is not my idea of a fun time.

EDIT: I'm probably exaggerating too much.

Would I use Python in a large, complicated, multi-developer project? Yes, I
would. And my only real complaint would probably be that I like static typing
so much that I'd miss it.

~~~
viraptor
> I imagine a statically typed Python dialect would only have an extra 15 or
> so lines

Actually you can have a statically typed Python if you want. Have a look at
[http://docs.cython.org/src/quickstart/cythonize.html](http://docs.cython.org/src/quickstart/cythonize.html)

All type declarations are optional. It compiles modules to versions completely
interoperable with the rest of Python code. They're importable, behave as
you'd expect, etc.

------
jammycakes
I've found over the years that there's a fairly widespread prejudice against
Python among many enterprise developers in general and .NET developers in
particular. Mention to a few random .NET developers that you're using it on a
side project and perhaps about half of them will give you a look of puzzlement
at best.

The main problem isn't anything to do with technical failings of the language,
but a misconception that it's an obscure niche language that nobody uses
outside of academia, with an ecosystem bordering on the nonexistent. People
seem to think that if you start using Python you're going to end up with an
incomprehensible codebase that's impossible to maintain because you can't hire
developers who know it. They're generally quite surprised when I tell them how
widely used it is and what all it gets used for.

~~~
meddlepal
A .NET developer was recently telling me that the .NET community tends to be
very insular; a lot of interesting ideas are ignored unless Microsoft makes
them first-class citizens of the ecosystem. That's unfortunate, IMO.

~~~
jammycakes
That's the most common criticism of the Microsoft ecosystem by far. It's not
just being insular, it's that huge swathes of the .NET community insist on
being spoon-fed by Microsoft. The attitude is that if something isn't included
out of the box with Visual Studio, you've no business whatsoever paying the
slightest bit of attention to it.

You see it in spades in the Silverlight community. They're all blaming
Microsoft for the decline of Silverlight, even though it's largely due to
factors outside Microsoft's control.

~~~
pnathan
This is my experience of the .NET world: spoon-fed is, oddly, preferred. I
don't get it, myself.

------
frownie
I have a fairly large code base (80K LOC). When the size grows, lack of typing
can become a problem. 95% of the time, the code is explicit enough.
However, when you end up with meta code, then it can become really difficult
to track the types down in the n-th level of recursion...

Python2 to 3 migration is not easy. There are tools but the problem is that
they don't see everything and therefore, you end up with 95% of your code
converted. Then it's up to you to figure out the last 5%, which is an order of
magnitude harder than the first 95%... So I ended up having a fairly long
transition of migrating Python2 code to Python2+3 code.

For both these issues, the common answer is: have proper test coverage. But we
live in the real world, and maintaining code coverage strong enough to
alleviate the problems (around 90%) is just very hard. If you're coding alone,
that may be just too much (my case). In a team setting, with money to spend,
it may be possible, but you'd need a very disciplined team.

But anyway, AFAIC, working with Python is just super productive (I'm comparing
to Java). It also feels much more battle-tested than, say, Ruby.

For me python is not a scripting language but a "glue" language set in the
middle of a _huge_ ecosystem of libraries.

Now, I didn't do XA transaction stuff, high performance stuff, etc. For me
it's more alike a well done Visual Basic : you can achieve a lot very quickly.
Contrary to VB, the language and its ecosystem are really clean.

I'm lovin' it

~~~
jeffasinger
I work on a team of three that has a project slightly larger than yours,
100K-110K LOC (and growing by about 5K LOC a week). We've managed to keep test
coverage at about 95%, and have found it's worth the investment upfront, as it
makes refactoring so much easier.

Looking back, I don't think that even for a personal project I would ever do
something that wasn't a one-off without good test coverage. Skipping it is
essentially taking on technical debt, as it makes you much more afraid of
fixing anything.

------
jpolitz
Re the typing point:

Statements like "Python is more strongly-typed than Java" can mean too many
different things without a precise definition of what "strongly-typed" means.
The Wikipedia page linked from the article even supports the position that
there isn't an accepted definition for strong vs. weak!
([https://en.wikipedia.org/wiki/Type_system#.22Strong.22_and_.22weak.22_type_systems](https://en.wikipedia.org/wiki/Type_system#.22Strong.22_and_.22weak.22_type_systems))

These terms are not very illuminating, and I don't understand the post's
argument about types as a result, especially w.r.t None vs. null.

One argument that the post might be making, and I'd like to see fleshed out,
is "Python's expressions and built-in operators check their inputs at runtime
in a way that gives useful and effective error messages in practice." That
seems like a lesson from experience that could be backed up with anecdotes and
provide some useful feedback on the language. That avoids the terminology
debate about types, which is more about picking definitions than about the
quality of the language for certain purposes.

It does require defining "useful and effective" for error messages, but I'm
more interested in that debate :-)

------
bmoresbest55
Python is very much an established language, and myths like these need to go
away in order for people to feel comfortable using Python for serious, well
designed projects that scale and do any and every thing that a person may want
or need. Great post.

~~~
baldfat
I agree BUT... Python 2 to Python 3 migration is still a major issue. It has
come a LONG way in the last 18 months.

Also, Python is good at many things, but

> feel comfortable using python for serious, well designed projects that scale
> and do every and any thing that a person may want or need

might be a little too bold of a statement :)

~~~
coldcode
As a non Python programmer, is 3 the way to go for all new projects? Or is
there some advantage to 2 still?

~~~
bjacobel
There is no advantage to 2 unless you're using a legacy environment or a
legacy dependency.

Edit: the "Python 3 Wall of Superpowers"[1] is a good resource for seeing how
far the ecosystem's conversion to Py3k has come. Many of the "red" packages
even have Python 3-compatible forks (e.g., Supervisor).

[1]: [https://python3wos.appspot.com](https://python3wos.appspot.com)

~~~
crdoconnor
There are still a _lot_ of libraries which haven't moved yet.

I periodically check my requirements.txt's and there's still a few big
holdouts left.

~~~
rspeer
Which are they?

I know that python3wos is out of date regarding the libraries people actually
use. When I look at the py2-only libraries on there, I see:

- the big "legacy dependencies", _Twisted_ and _gevent_

- libraries that are ported but the site doesn't know it, such as _protobuf_

- highly specific code to extend a particular system, such as
_tiddlywebplugins_

- system utilities where it doesn't matter what language they're in, such as
_supervisor_ and _Fabric_

- libraries that have been abandoned (in all versions) and superseded, such
as _MySQL-Python_

I'm not saying you're wrong, I'm saying that we need a better python3wos. It
makes it look like "unless you bet on a massive legacy asynchronous framework,
you're fine".

I think that to some extent this should be the case, but to some extent, there
are things that should be ported that we're not seeing because of the
unrepresentative set of packages on python3wos.

~~~
crdoconnor
Twisted is one, and it's there partly because a lot of code relies upon it,
not because I like it. I also use a few niche libraries which haven't been
bumped, and there are a few libraries like mechanize too - not so niche yet
still not bumped.

Honestly, if there were some big incentive at this point I _would_ go through
the hassle of upgrading, but there isn't. The gains on python 3 seem
relatively minor and incremental.

------
kzhahou
I _love_ python, and this is a terrible article, as many others have said
here.

You have to understand that these articles are written for a number of
reasons:

* The company (here, paypal/ebay) needs to recruit people.

* Often, the language or technology is under siege internally, and these external posts strengthen its position.

* The company wants to get _some_ message out there, but it doesn't actually have any interesting technology of its own, so it goes on and on about some known tech.

------
cdnsteve
Python is crème de la crème.

I was using the Requests library last night to work with some APIs, it's an
absolute gem, it's so easy to use. The Python package manager (pip) just
works, and the idea of isolating your dev environment with virtualenv is
fantastic. PEP8 as a universal style guide is underestimated in large
projects.

~~~
Cyther606
You can do worse than pip and virtualenv, yes, but I wouldn't call pip the
cream of the crop by any means. NPM is way easier to use for starters, and Go,
Rust and Nim go one step beyond NPM by compiling language dependencies down to
a single binary file which can be shipped to users. It's very succinctly done.

Python really needs to step up its game to stay ahead of up-and-coming
languages like Nim, which looks like this:

    
    
        import rdstdin, strutils, sequtils
    
        let
          time24 = readLineFromStdin("Enter a 24-hour time: ").split(':').map(parseInt)
          hours24 = time24[0]
          minutes24 = time24[1]
          flights: array[8, tuple[since: int,
                                  depart: string,
                                  arrive: string]] = [(480, "8:00 a.m.", "10:16 a.m."),
                                                      (583, "9:43 a.m.", "11:52 a.m."),
                                                      (679, "11:19 a.m.", "1:31 p.m."),
                                                      (767, "12:47 p.m.", "3:00 p.m."),
                                                      (840, "2:00 p.m.", "4:08 p.m."),
                                                      (945, "3:45 p.m.", "5:55 p.m."),
                                                      (1140, "7:00 p.m.", "9:20 p.m."),
                                                      (1305, "9:45 p.m.", "11:58 p.m.")]
    
        proc minutesSinceMidnight(hours: int = hours24, minutes: int = minutes24): int =
          hours * 60 + minutes
    
        proc cmpFlights(m = minutesSinceMidnight()): seq[int] =
          result = newSeq[int](flights.len)
          for i in 0 .. <flights.len:
            result[i] = abs(m - flights[i].since)
    
        proc getClosest(): int =
          for k,v in cmpFlights():
            if v == cmpFlights().min: return k
    
        echo "Closest departure time is ", flights[getClosest()].depart,
          ", arriving at ", flights[getClosest()].arrive
    

And performs like this:

    
    
        Lang    Time [ms]  Memory [KB]  Compile Time [ms]  Compressed Code [B]
        Nim          1400         1460                893                  486
        C++          1478         2717                774                  728
        D            1518         2388               1614                  669
        Rust         1623         2632               6735                  934
        Java         1874        24428                812                  778
        OCaml        2384         4496                125                  782
        Go           3116         1664                596                  618
        Haskell      3329         5268               3002                 1091
        LuaJit       3857         2368                  -                  519
        Lisp         8219        15876               1043                 1007
        Racket       8503       130284              24793                  741
    

[http://goran.krampe.se/2014/10/20/i-missed-
nim/](http://goran.krampe.se/2014/10/20/i-missed-nim/)

~~~
viraptor
> Go, Rust and Nim go one step beyond NPM by compiling language dependencies
> down to a single binary file

And the person in the security hat now says: so how do you deal with library
upgrades? If you need to go back to the original app developers to provide you
with a new version just to update one library, then you've got a problem.

~~~
kibwen
Rust gives you the option to dynamically link, and I expect Nim does as well.
As for Go, I believe dynamic linking is somewhere on their roadmap, though I
don't know how high of a priority it is.

------
kreshikhin
>> Python has great concurrency primitives, including generators, greenlets,
Deferreds, and futures.

It's a controversial statement.

Generators, greenlets, deferreds, and futures are definitely not great
concurrency primitives. They're okay for typical "Python" tasks, but many
applications need more powerful solutions, like Go's channels for example.
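For reference, the futures under discussion look like this in the standard library; a minimal sketch with an invented work function:

```python
from concurrent.futures import ThreadPoolExecutor

def fetch(n):
    # Stand-in for an I/O-bound call such as an HTTP request
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = [pool.submit(fetch, n) for n in range(4)]
    results = [f.result() for f in futures]
print(results)  # [0, 1, 4, 9]
```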

~~~
rubiquity
Go's channels aren't a concurrency primitive, they are a concurrency
abstraction. Goroutines are the primitive. I also don't think it would be hard
to implement Go channel semantics in Python.
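A rough sketch of that idea on top of the stdlib; the `None` close sentinel here is an ad-hoc convention, not part of any API:

```python
import queue
import threading

def producer(ch):
    # Send three values, then a close sentinel (a crude
    # stand-in for Go's close(ch))
    for i in range(3):
        ch.put(i)             # roughly: ch <- i
    ch.put(None)

ch = queue.Queue(maxsize=1)   # small buffer, like a buffered channel
threading.Thread(target=producer, args=(ch,)).start()

received = []
while True:
    item = ch.get()           # roughly: <-ch; blocks until a value arrives
    if item is None:
        break
    received.append(item)
print(received)  # [0, 1, 2]
```

It lacks `select` and unbuffered rendezvous semantics, but the basic send/receive shape carries over.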

~~~
kreshikhin
In my opinion the difference between abstractions and primitives lies in the
fact that an abstraction cannot be instantiated.

Channels can be instantiated; goroutines cannot.

Correct me if I'm wrong please.

~~~
rubiquity
Your definition seems to have its roots tightly coupled to abstraction vs
primitive in the sense of Object-oriented programming. Your use of the word
"instantiation" makes me feel that way. Let me know if that is misguided.

In the world of concurrency you could probably even call goroutines an
abstraction but goroutines are at least closer to the fundamental concurrency
primitives such as threads, locks, mutexes and tasks/coroutines/whatever.

------
raymondh
Mahmoud makes a spirited defense of Python with ten general themes, but the
most important thing he had to tell us about PayPal's experience with Python
was this:

"Our most common success story starts with a Java or C++ project slated to
take a team of 3-5 developers somewhere between 2-6 months, and ends with a
single motivated developer completing the project in 2-6 weeks (or hours, for
that matter)."

------
sago
I don't understand why the GIL gets so much derision from the cool kids, where
node.js remains trendy.

~~~
dragonwriter
I expect that "cool kids" aren't actually a homogeneous group, and the subset
of them deriding Python for the GIL are disjoint, or nearly so, from the
subset making node.js trendy.

Or, they aren't actually deriding Python for the GIL, they are noting that,
given the GIL, you need to use an evented approach, so you might as well use a
platform designed for that as its central model, rather than one that's
designed around the threaded model but without the ability to use it
effectively.

~~~
xorcist
Wouldn't you expect the cool kids to all be using Twisted then?

I get the impression that most web devs want to keep up with the latest tech,
and by most counts Python is old now.

~~~
dragonwriter
> Wouldn't you expect the cool kids to all be using Twisted then?

No, because Python with Twisted isn't the same as a platform built for the
evented model from the ground up; it's a library built to handle the evented
model on top of a traditional threaded platform.

> I get the impression that most web devs want to keep up with the latest
> tech, and by most counts Python is old now.

I don't think the desire of devs to be working in something that they perceive
to be in demand and growing moreso is restricted to web devs.

------
jerf
This article overshoots. Yes, Python is slow, and yes, all four runtimes cited
are slow. As I've said before, I no longer believe the idea that languages
don't have performance characteristics, only runtimes do, and as it happens,
the decades-long efforts to speed up Python and their general failure to get
much past 10-20x slower than C are a big part of why I believe that. NumPy
being fast doesn't make Python fast; it's essentially a binding. A great
binding that, if it meets your use case, means you can do great work in
Python, but does nothing to help you if you don't need that stuff.

As for PyPy's "faster than C" performance, people _really really really_ need
to stop believing anything about a JIT running a single tight loop that
exercises one f'ing line of code! Follow that link and tell me if your code
even remotely resembles that code. In practice, PyPy is, I believe, "faster
than CPython" with a _lot_ of caveats still, but "faster than CPython" isn't a
very high bar.

(Similarly, though another topic, Javascript is not a "fast language". People
seem to believe this partially because of JIT demonstrations in which integers
being summed in a tight loop runs at C speed. But this is _easy mode_ for a
JIT, the base level of functionality you expect from one, not proof that the
whole language can be run at C speeds.)

There is no version of Python that will run you at anything like C or C++ or
Java or Go or LuaJIT speeds on general code. It can't; for one thing you have
_no choice_ but to write cache-incoherent code in Python, to say nothing of
the numerous other problems preventing Python from going fast. (Copious hash
lookups despite the various optimizations, excessive dynamicness requiring
many things to be continuously looked up by the interpreter or verified by the
JIT, etc.)
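(As a small illustration of the hash-lookup point above — hypothetical
function names — even a plain method call in a loop involves an attribute
lookup per iteration, which is why the old micro-optimization of hoisting the
bound method out of the loop exists at all:)

```python
def build_slow(n):
    out = []
    for i in range(n):
        out.append(i)      # attribute lookup on `out` every iteration
    return out

def build_hoisted(n):
    out = []
    append = out.append    # bind the method once, outside the loop
    for i in range(n):
        append(i)          # plain local-variable call
    return out

# Both produce the same result; the hoisted version is measurably faster
# in CPython precisely because it skips the per-iteration lookup.
```

That this trick helps at all is a symptom of the dynamism jerf is describing:
the interpreter can't assume `out.append` means the same thing twice.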

I've drilled down on this one, but there are several other "debunkings" here
that are equally questionable. Python does not have a "great" concurrency
story... it has a collection of hacks of varying quality (some really quite
good, though; gevent is _awesome_ ) that get you around various issues, at the
cost of some other tradeoff. The definition of "strongly typed" that Python
conforms to is almost useless, because everything is a "strongly typed"
language by that definition. In practice, it's a dynamically typed language,
and yes, that can cause problems. Another collection of hacks is available to
get around that, but they're add-ons, which means the standard library and
other libraries won't use or support them. Yes, Python is a scripting
language; it's just that "scripting languages" turned out to be a great deal
more powerful than was initially conceived.

Wow, I must hate Python, huh? Nope. It's a _fantastic_ language, certainly in
my top 3, and still probably underutilized and underrespected despite its
general acceptance. When I hear I get to work with it, I generally breathe a
sigh of relief! It is suitable for a wide variety of tasks and should
certainly be in consideration for a wide variety of tasks you may have to
solve.

 _But_ , it is always _bad advocacy_ to gloss over the problems a language
has, and all languages have problems since no one language can solve all
problems perfectly. If you are doing something really performance sensitive,
stay away from Python. If you've got something highly concurrent, think
twice... your problem needs to comfortably fit on one CPU using one of the
existing solutions or there's hardly any reason to prefer Python. (Yes, Python
_can_ sort of use more than one CPU but if you're going to do that you'll
probably be happier elsewhere. "Just throw lots of processors at the problem"
isn't generally a good solution when you're starting from a language that can
easily be ~50-100x slower than the competition... that's still an expensive
answer, even today.) Yes, it is dynamically typed, and there are situations
where that is OK and situations where it is contraindicated. In the long
term, you don't help a language by trying to minimize the problems... you help
by making it clear exactly what it is good for, when you should use it, and
when you shouldn't. Otherwise, you convince a hapless programmer to pick up
your solution, pour a year or two into discovering it doesn't actually do what
you said it did, and now you've made an _enemy_ for your language. Better for
them to never pick it up because you truthfully told them it wasn't suitable.

That said, though, be _sure_ your task is performance sensitive before just
writing Python off... the performance of modern machines is really hard to
understand, and most people's intuitions are pretty bad nowadays. Dynamic
typing has its problems, but so does static. Etc. etc. No easy answers, alas.

~~~
njharman
Needing to be 10-20x faster than CPython is a niche requirement. The General
User in the General Case just does not care.

I've had projects fail because they didn't get done, or because they were
buggy, but never because they weren't fast enough.

Finally, slowness, until you get really low down, is a relatively easy problem
with many solutions.

~~~
jerf
To be clear, in general, I agree. However...

"Finally, slowness, until you get really low down, is a relatively easy
problem with many solutions."

There is a barrier that you can hit in Python/Perl/Ruby/Javascript where
you're trying to do something, you've optimized the Python/etc. to within an
inch of its life, and it's still just too slow. I've hit it twice now in
pretty serious ways. Once you've removed all the slowness-that-has-easy-
solutions, you're _still_ using a very slow language... the 50-100x number I
cite is with the slowness already removed for optimal code, though, to be
fair, this is in comparison to fairly optimal C/C++ as well. Well-written
Python can be competitive with poorly-written C, and that is also not even
slightly a joke, since it's generally easier to get to the well-written
Python. But you can still run out of juice on a modern machine.

But ultimately this is just something you want to know and understand, and not
be too bedazzled by claims that everything's hunky dory in every way.

------
geekam
In Python world deployment is still not as easy as dropping a jar in a
container. Right?

~~~
vetinari
It's easier: drop an egg onto your pythonpath, or install a deb/rpm/msi
through the system package manager.

~~~
emidln
Sorta. It's not that easy in practice (although not terribly hard, since you
can package up your Python project with setuptools, deploy it to an internal
(or the actual) PyPI, and then use distribution packages for a wsgi server
like gunicorn, mod_wsgi, etc.). In reality, you typically care about
third-party modules enough, and building rpms/debs is typically not fun
enough, that you just use the normal pip/virtualenv story.

There is a solution from Twitter that my devops team has been flirting with
called PEX[1]. It builds all of your dependencies into a zip file similar to a
jar and sets it up to work by just putting it on your pythonpath. This would
in practice be very similar to an uberjar.

[1] -
[https://pantsbuild.github.io/pex_design.html](https://pantsbuild.github.io/pex_design.html)

~~~
vetinari
Actually, building rpm/deb is quite easy, setuptools handles bdist_rpm by
itself, for bdist_deb there is a plugin. For windows folks, bdist_msi is by
default part of setuptools[1]. As a backup plan, you can still build eggs with
bdist_egg and let pip handle the package manager duties for you.

The real complexity comes elsewhere - both in java and python world, setting
up the java application server or wsgi server for python is more involved than
just dropping an app there. And then there comes the debugging the
exceptions... there I would pretty much prefer the python world.

Also, be careful with zipping python projects. While .zip is a valid member of
pythonpath, packages can have problems with finding their assets (if they have
any). For example, you cannot zip django this way.

[1] It even handles setting up vcvars when building native extensions. I was
impressed; it was the easiest building of Windows binaries for free software
I've ever seen.

~~~
emidln
Except that bdist_rpm only builds the current package. That's not really any
different than pip. You'd need to recursively build packages for all of the
dependencies, and have some magic to autodetect the provides/requires inside
your little package ecosystem, for any non-trivial app to be deployable via
rpm. This isn't impossible, but it's a far cry from PEX or an uberjar.

------
teleyinex
I'm the project lead of Crowdcrafting (powered by PyBossa) with 21.7k lines of
code written in Python.

Our stack in Crowdcrafting is fairly simple in terms of hardware (we've two
servers with 2 GB of RAM each and only 2 cores, while the DBs have 4 GB of RAM
with 4 cores). You can read more about it here:
[http://daniellombrana.es/blog/2015/02/10/infrastructure.html](http://daniellombrana.es/blog/2015/02/10/infrastructure.html)

In my experience, the problems we've had are always related to how we develop,
not to the technologies. One of our ways of solving problems is avoiding
increasing the hardware resources, as that will "hide" the real issues in your
software. Thanks to this approach we were able to store more than 1.5 records
per second over a period of 24 hours in 2014, with a DB running on less than
1 GB of RAM and our servers with 2 GB. It worked really well! (Actually, after
that our provider called us to ask us to spend more on our hardware.)

Our platform is designed to scale horizontally without problems. We use a very
simple set of technologies that are widely used, and we try to use very simple
libraries all the time. This has proven to us to be very efficient and we've
managed to scale without problems.

We heavily test and analyze our software. This is mandatory. We have almost
1000 tests covering 97% of our code base. We also analyze the quality of our
code using a third-party service, where we have a score of 93%; this helps
people send patches and new developers join the platform.

Right now our servers are responding in less than 50ms on average, and we are
not even using Varnish (check out this blog post:
[http://daniellombrana.es/blog/2015/03/05/uwsgi.html](http://daniellombrana.es/blog/2015/03/05/uwsgi.html)).
Thus, yes, Python is a good language, it can scale, and if you have a problem
it will usually be in your own code. Hence, debug it!

Cheers,

Daniel

------
rubiquity
As it pertains to "Enterprise" and Python, I think the most interesting
project is OpenStack. I don't use OpenStack but it has been interesting that
in conversations with Enterprise devs/managers OpenStack has come up several
times. It turns out Enterprise is still really skeptical of the public cloud
and are going through the efforts of running their own IaaS/PaaS internally.
In my experience OpenStack seems to be coming up much more frequently than
CloudFoundry.

And for the record, by Enterprise I'm not talking Silicon Valley Enterprises.
These are strictly non-tech Enterprises that traditionally view anything
software related as a cost center.

------
slantedview
The section on concurrency is a hilarious joke that deftly dodges the obvious
drawback of Python's concurrency model while peppering the reader with
distracting solutions built on top of multi-processes or single threads. What
is missing here? True, multi-threaded (at the OS level) concurrency.

We're engineers here so let's be real and recognize that different tools are
suited to different tasks. Limiting oneself to concurrency without OS threads
is, to be polite, not necessary. Obviously you CAN build pretty much anything
you want with pretty much any tool, as the examples in this article show, but
that doesn't mean you should.

(written by someone who writes Python daily)

------
thearn4
I'd also add to this that the government is pretty heavily invested into
python (at least NASA, DoD, and DoE based on direct experience). It's kind of
becoming the scientist's and engineer's new Fortran.

~~~
fa
It's a matter of ongoing surprise to see many scientist-engineers in my niche,
in the USAF-sphere, who are jumping from Matlab to Julia, leapfrogging Python.
Kudos to them.

~~~
thearn4
Julia is pretty cool; I made a point to experiment with it a bit last year.
What we work on is pretty heavily object-oriented, and the type system seemed
a bit too awkward for our purposes at the time. I'll probably look at it again
this year (probably through a Jupyter/IPython notebook).

------
pnathan
I've found Python to be a rolling disaster and a source of technical debt for
any source code which starts to get beyond a single file or 150 lines. (5
years professional Python experience). I hope to transition somehow to write
Java or another language with better design framing.

edit: (I know a boatload of different languages - Java is the most comparable
and least shocking for developers who aren't FP nerds. Go is inadequately
expressive.).

------
diminoten
> and while it has excellent modularity and packaging characteristics

If only... In actuality, the majority of python's package management is being
redone by pypa, and it's not complete.

That's been my biggest problem with python; the circuitous path to deployment
the Right Way in an enterprise, where one doesn't simply publish to pypi.

It'll get there, but to throw around the word "excellent" hides a lot of the
current pain.

------
theophrastus
Does anyone else get a consistent ssl error visiting that site?:

    
    
        Secure Connection Failed
        An error occurred during a connection to www.paypal-engineering.com. Cannot communicate securely with peer: no  common encryption algorithm(s). (Error code: ssl_error_no_cypher_overlap)

------
Xion345
This is such a troll. This article mixes valid points with obvious
inaccuracies.

Ok, Python is a fine language, even for building large-scale systems. No, it
is not free from drawbacks.

#1 : Ok, Python is not new and is mature.

#2 : "Compiled" is almost a meaningless word. The general definition of a
compiler is a program that transforms a document written in a source language
into a document written in a destination language. TeX is a compiler; a
CSV-to-XML converter is a compiler. As regards security/reverse engineering,
Python and Java bytecode files can be very easily decompiled into the original
program. That is not the case for C/C++, and thus reverse engineering is much
more difficult. Anyway, this issue can be mitigated using obfuscation. So ok
for this myth.

#3 : Again speaking about "security": I don't see why Python would be more
"secure" than Java. I can see why it would be more secure than C/C++ (it runs
on a virtual machine, so fewer buffer overflows, bounds checking, etc.).

#5 : Yes, a very valid point for Python. And then you start comparing the JVM
(a platform, a runtime, a virtual machine) with Python (a language). This is
like comparing apples to oranges. And no, Java is not dynamically typed.
Python is.

#6 : This is the most wrong point of your article. Python is slow, very slow.
Excuse me: CPython, the runtime that virtually everyone uses, is very slow (as
are all non-JIT virtual machines). Usually 5-10x slower than a native program
and 2-4x slower than a JVM program. You can't completely decouple the language
from the platform. Speaking about Java performance means discussing the JVM
(although Java can be compiled). Speaking about Python means discussing
CPython.

Performance-wise, Jython is not much better (and sometimes a lot worse than)
CPython, and they only implement Python 2.5. Pypy does an excellent job at
optimizing Python and often yields on par performance with the JVM. However,
it is incompatible with C extensions/modules written for CPython, which means
it can rarely be used in production (goodbye databases etc.)

There are countless stories in HN of start up developers rewriting their
systems from Python/Ruby to Java/Go because performance was an issue.

#7 : I think some of your examples are contrived (e.g. YouTube), as we don't
know exactly where Python is used in the infrastructure. I only use Python to
draw graphs; does that mean my infrastructure relies on it?

About GC pauses: yeah, there are no pauses in Python because the GIL allows
only one thread to run at a time.

#8: The GIL is a performance optimization? Really? The GIL is there because it
eased the development of the CPython VM, but it is terrible for performance.
Because of the GIL, CPython cannot run more than one thread concurrently. In
the era of multicore processors, this is terrible for performance. Yes, there
are workarounds for _I/O operations_ (greenlets etc.), but for
processing-intensive tasks it is impossible to exploit a multicore processor
with Python. And no, multiprocessing is not a solution, as it does not allow
sharing memory between processes. Also remember that Python is often 4x slower
than Java on a single core...
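(To make the threads-vs-processes point concrete — a minimal sketch with
hypothetical function names: CPU-bound work gains nothing from threads under
the GIL, whereas `multiprocessing` runs it in separate interpreter processes,
each with its own GIL:)

```python
from multiprocessing import Pool

def cpu_bound(n):
    # Deliberately CPU-heavy; Python threads would serialize on the GIL here.
    return sum(i * i for i in range(n))

def parallel_sums(inputs):
    # Separate interpreter processes, so the work genuinely runs in parallel
    # (at the cost of serializing inputs/results between processes).
    with Pool(2) as pool:
        return pool.map(cpu_bound, inputs)

if __name__ == "__main__":
    print(parallel_sums([10, 100]))
```

The per-process memory-isolation cost is exactly the shared-memory complaint
above, so this is a workaround rather than a refutation of it.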

#9: True, but this looks like a strawman. Nobody ever said Python programmers
are scarce.

#10: I am not having this debate again, but a static type system is useful for
big projects. I don't say it's mandatory, but it reduces development time.

~~~
bkcooper
_And no multiprocessing is not a solution, as it does not allows to share
memory between processes._

You actually can. multiprocessing defines a few classes (e.g. Array) that
store their data in a mmap kept in the module. Data there will be visible to
all processes, and you can even have a numpy array as a view to this data.
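A minimal sketch of that (hypothetical names; it uses the plain ctypes-backed
`Array` rather than a numpy view, but the same mmap-backed shared storage):

```python
from multiprocessing import Array, Process

def worker(shared):
    # Mutate the shared buffer in place; the parent sees the changes
    # because the Array is backed by an anonymous mmap, not pickled copies.
    for i in range(len(shared)):
        shared[i] = shared[i] * 2

def double_in_child(values):
    shared = Array('d', values)  # 'd' = C double
    p = Process(target=worker, args=(shared,))
    p.start()
    p.join()
    return list(shared)

if __name__ == "__main__":
    print(double_in_child([1.0, 2.0, 3.0]))
```

(On Windows, where processes are spawned rather than forked, the worker must
be importable from a module, which is part of the portability pain described
below.)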

That said, my attempts at working with this recently have been pretty painful.
multiprocessing tries, but at least in Python 2.7 does not succeed in
abstracting away the differences between Unix forks and Windows spawns. This
results in a lot of weird issues: things running differently in command line
vs. console IPython vs. IPython notebook; the need to structure your code to
avoid pickling errors; etc. This is probably the biggest portability issue
I've encountered with Python thus far (using it mostly for scientific
problems.)

I'm aware that there are solutions to some of these issues out there, but it's
too bad that the standard library implementation has these issues.

~~~
slantedview
The mmap solution, along with various others, are ridiculous workarounds for
folks dead set on jamming a square peg into a round, single-threaded hole. If
a technology doesn't support something that you need, natively, don't rely on
lame workarounds, use a more appropriate technology.

------
ChikkaChiChi
I predict this will become a reference link for pro-Python arguments in the
same way that insufferable 'fractal' article has become a boondoggle whenever
PHP is mentioned.

~~~
yen223
Eh. As a Python fanboy, I think a lot of his points are pretty weak. His
examples of Python being fast include libraries which call into non-Python
code, and a non-standard implementation.

Besides, there are way better links for pro-Python arguments. I would point to
Norvig's spell-checker in 21 lines of Python for example
([http://norvig.com/spell-correct.html](http://norvig.com/spell-
correct.html)).

------
kokey
When I read the title, I thought it actually referred to an enterprise version
of Python. Something along the lines of ActivePython, where certain versions
of the language and packages are supported. It's strange that the post doesn't
mention versioning or packaging at all. This is hugely important for
enterprise software. Hopefully the work Nick Coghlan is doing around packaging
will address this to some degree.

------
kevinaloys
Guido has been conceiving the idea of gradual typing in Python and wants to
introduce it in Python 3.5. I think this will be a great mix of both dynamic
and static type checking.

[https://www.python.org/dev/peps/pep-0484/](https://www.python.org/dev/peps/pep-0484/)
[http://baypiggies.net/](http://baypiggies.net/)
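The annotations in the draft PEP are ordinary Python 3 function annotations; a
minimal example in the PEP's style (ignored at runtime unless a checker such
as mypy inspects them):

```python
def greeting(name: str) -> str:
    # The `: str` and `-> str` annotations have no runtime effect in
    # CPython; a static type checker uses them to catch mismatches.
    return 'Hello ' + name
```

So existing unannotated code keeps working unchanged, which is what makes the
typing "gradual".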

------
boothead
Picking out BAML and JPM as examples of successful adoption at scale has got
to be a joke!

Giving a large number of "enterprise" developers a language like Python means
that you end up with ~10m lines of poorly designed, un-pythonic mess.

My personal opinion is that using Python for any kind of large-scale
infrastructure is a very bad idea.

------
andars
I recently heard someone advise a friend to learn Python 2.7 instead of
Python 3. As someone not in the industry, this was surprising to me. Are
companies/people still writing new code in Python 2?

~~~
goostavos
I don't think the issue is _new_ code, so much as it's all the old code that
needs supporting.

Also, library support still isn't 100% there. So, if one critical piece of
your stack isn't on 3.x, you've got to throw out the whole thing and stick
with 2.x.

~~~
yen223
The biggest issue is that there's no real compelling reason to migrate away
from Python 2 now. It's not meant to be an attack on Python 3 - Python 2.7 is
still well-supported, and is plenty good enough for a lot of us.

That said, if I were starting out now, I'd start with Python 3.

------
stefantalpalaru
> Python is in fact compiled to bytecode

When we talk about compiled languages, we usually mean "compiled to machine
code". It's wrong to put a non-JITed "compiled to bytecode" language in the
same category as C/C++/Haskell/Rust/Nim/etc.

> Furthermore, CPython addresses these issues by being a simple, stable, and
> easily-auditable virtual machine.

[http://www.cvedetails.com/vulnerability-
list/vendor_id-10210...](http://www.cvedetails.com/vulnerability-
list/vendor_id-10210/product_id-18230/version_id-92056/Python-Python-2.7.html)

> Each runtime has its own performance characteristics, and none of them are
> slow per se.

Unfortunately, they all are slow when compared to the most efficient
language/compiler available on the same platform. And no hand waving will fix
this. Nor the small and not so small differences that render non-CPython
implementations unusable in practice. The only valid argument remaining is
that not all programs are CPU bound, but how many Python programmers are
willing to say "I can only write non CPU-bound software in this language"?

> Having cleared that up, here is a small selection of cases where Python has
> offered significant performance advantages:

> Using NumPy as an interface to Intel’s MKL SIMD

The fast parts of NumPy are written in C. Python doesn't get to claim any
performance advantage here.

> PyPy‘s JIT compilation achieves faster-than-C performance

It doesn't; they are just comparing different algorithms.

> Disqus scales from 250 to 500 million users on the same 100 boxes

Yes, thanks to Varnish, which caches those slow Django requests, according to
the linked article. Can you guess what language Varnish is written in? In the
meantime, Disqus has also moved its slow-as-hell services to Go:
[http://highscalability.com/blog/2014/5/7/update-on-disqus-
it...](http://highscalability.com/blog/2014/5/7/update-on-disqus-its-still-
about-realtime-but-go-demolishes.html) (a case where a shitty compiler that
produces machine code beats a bytecode VM by so much it isn't even funny).

> It would be easy to get side-tracked into the wide world of high-performance
> Python

This depends on what you're smoking.

> One would be hard pressed to find Python programmers concerned about garbage
> collection pauses or application startup time.

Not garbage collection pauses, but on more than one occasion, when doing data
migrations with processing done in Python, I had to call gc.collect() manually
every N iterations to keep memory usage under control.
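(The pattern being described looks roughly like this — a hedged sketch with
hypothetical names; the manual collection exists to break up reference cycles
that the refcounting collector alone won't reclaim promptly:)

```python
import gc

def migrate(rows, collect_every=10000):
    processed = 0
    for row in rows:
        # ... transform and write `row` somewhere ...
        processed += 1
        if processed % collect_every == 0:
            gc.collect()  # force a full cycle collection every N rows
    return processed
```

Needing this at all in a long batch job is the commenter's point: the default
collector's behavior isn't always "something you never think about."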

> With strong platform and networking support, Python naturally lends itself
> to smart horizontal scalability, as manifested in systems like BitTorrent.

The BitTorrent library of choice nowadays is
[http://www.libtorrent.org/](http://www.libtorrent.org/) \- written in C++.

> The GIL makes it much easier to use OS threads or green threads (greenlets
> usually), and does not affect using multiple processes.

The devil is in the details. OS threads can only run in parallel if at most
one of them is executing Python code. To parallelize multiple instances of
Python code, you need to start multiple instances of the Python interpreter.
Or you can pretend that you don't really need parallelism after all, and cycle
green threads on the same core while bragging about millions of requests
per... day.

> Here’s hoping that this post manages to extinguish a flame war [...]

:-)

