
Instagram Makes a Smooth Move to Python 3 - fjordan
https://thenewstack.io/instagram-makes-smooth-move-python-3/
======
dmalvarado
“Yeah, Python is great in so many ways, too bad it’s not really scalable.”

I'm not even sure what this means anymore. I guess I'm just not sure how any
language, when used correctly, could be inherently unscalable. My guess is
statements like this came from a time when monoliths were the application
design of choice? Now, assuming Instagram has just 1,000 photo handling
servers, each one is only responsible for 95,000 photos a day.

Of course, that's not to say that Instagram doesn't have CAP issues. It does,
especially in the "C" area, but again, not a problem inherent in the language.

~~~
mahyarm
The threading story isn't great. It's not statically typed, which can make
working on a large codebase with more people more painful. It also forces you
to write more unit testing code coverage to make up for the lack of a static
compiler checking things for you. Raw performance is not good compared to
golang or java.

When you start getting to a certain scale, developers are cheaper than your
server costs in some cases. That is when something being performant is more
worth it.

~~~
bpicolo
And at that scale you can make additional microservices for the critical path
where it matters.

~~~
mahyarm
Not really. You'll notice a pattern in most tech bigcos where they move from
dynamic lang X to a statically typed language that can multithread properly.

A few examples: Ruby on rails to java (twitter). Java & c++ (google). Java
(linked in). Python/Node -> Java/Go (uber). Or they start doing silly things
like make a new VM (facebook).

~~~
stuartaxelowen
I heard through the grapevine that Go was only chosen at Uber to help with
hiring.

~~~
stock_toaster
That seems silly. There are far more Java devs out there than Go devs.

Also, I would assume Uber probably has hiring problems in general at this
point.

~~~
sidlls
> That seems silly. There are far more Java devs out there than Go devs.

I wouldn't underestimate the level of hype-driven development that exists in
this area. "Chasing the new shiny" seems like it could be a line item in a
resume, these days, sometimes.

------
Dowwie
Members of the Py3 transition team (the authors of that article) gave a talk
about the project at PyCon 2017:
[https://www.youtube.com/watch?v=66XoCk79kjM](https://www.youtube.com/watch?v=66XoCk79kjM)

------
bifrost
I can't say I think this has ever been really true: "Performance speed is no
longer the primary worry. Time to market speed is."

Performance has been a concern but programmatic load balancing has been around
for decades. When I worked at MSN/Linkexchange back in the late 90's we never
really worried heavily about the performance of the language we used (Perl)
because we could scale out servers. Perl isn't that speedy but it sure was
easy to develop in. We served a billion and a half clicks per month with 8-10
machines from a single datacenter before I left, with Perl.

------
mixmastamyk
Right, you hear a lot of griping about moving from Python 2 to 3 but I
personally didn't have as much trouble as expected. Some of my projects just
worked. One small tip I don't think they mentioned. Start using the logging
module instead of print and it will eliminate one class of potential issues.
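
A minimal sketch of that tip, assuming a module-level logger. The `process` function and its message are made up for illustration. A `logging` call is written the same way on Python 2 and 3, so it sidesteps the print statement vs. print() function change.

```python
import logging

# Module-level logger; the name is taken from the module itself.
logger = logging.getLogger(__name__)


def process(item):
    # Hypothetical function, only here to show a logging call.
    # The Python 2 statement `print "processing", item` is a SyntaxError
    # on Python 3; this line is valid on both.
    logger.info("processing %s", item)
    return item


if __name__ == "__main__":
    # Configure output once, at the program's entry point.
    logging.basicConfig(level=logging.INFO)
    process("photo-1")
```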

~~~
vhost-
I've heard people complain about print being a function now more than a
handful of times and my response is always "are you really using print that much
in your code base?"

~~~
MatthewWilkes
You still use it in debuggers and in the REPL. Using it seldom makes it harder
to relearn the muscle memory, and only hitting it when you're debugging makes
it more likely that you're already frustrated when it happens.

------
passive
I've been writing mostly small python projects for 15 years, starting with
2.2.

I've had no issues in the migration to 3.

I did some building with it around 3.3, starting to commit around 3.4, and
with 3.5 I build everything in it.

I don't have the performance challenges Instagram has, but my experience with
application development in general is that 98% of performance challenges can
be solved with (not-too) clever engineering. This applies to projects in every
language.

There are a vanishingly small number of scenarios where the performance of
your runtime actually dictates your performance limits.

If you're working on something and are worried about Python's performance, or
which Python to use, don't. Use 3, optimize later.

------
MatthewWilkes
I watched the PyCon keynote on this topic, and while it's nice to hear they've
moved to Python3 their approach probably shouldn't be copied.

For example, in their codebase they had ambiguity between bytestrings and
unicode strings. Python3 tries to prevent you doing this, to resolve a big
footgun from Python2.

The right fix here is to be consistent in your use of strings. Sometimes that
is tricky because of how third party libraries have decided to implement their
2/3 compatibility, but it helps prevent shooting yourself in the foot with
unicode bugs down the line.
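
As a rough illustration of what "consistent" can look like: decode bytes to text once at the edge and keep `str` everywhere inside. The names `read_caption` and `write_caption` are hypothetical, not anything from Instagram's codebase.

```python
def read_caption(raw):
    # Hypothetical boundary function: bytes come in from the network or disk,
    # text (str) goes out to the rest of the program.
    return raw.decode("utf-8")


def write_caption(text):
    # The reverse boundary: encode only when the data leaves the program.
    return text.encode("utf-8")


caption = read_caption(b"caf\xc3\xa9")

# Mixing the two types fails loudly on Python 3 instead of guessing an encoding.
try:
    caption + b"!"
except TypeError:
    mixed_ok = False
else:
    mixed_ok = True
```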

Instagram did not do this. They created utility functions to force their data
into the format they wanted at the point it is used. In other places they used
tuple() to make sure that map calls that had side effects were fully iterated
over.
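
To show why the tuple() wrapper changes behaviour, here is a small sketch on Python 3. The `record` function is made up for the example; the point is only that map() is lazy there, so side effects don't happen until something consumes it.

```python
seen = []


def record(x):
    # Side effect: remember every value we were called with.
    seen.append(x)
    return x * 2


lazy = map(record, [1, 2, 3])
# On Python 3 nothing has run yet, because map() returns a lazy iterator.
calls_before = len(seen)

# Wrapping in tuple() consumes the iterator, so the side effects happen.
forced = tuple(map(record, [1, 2, 3]))
calls_after = len(seen)

# A plain loop says the same thing without leaning on map() for side effects.
for x in [1, 2, 3]:
    record(x)
```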

In short, they had bad Python2 code and now have bad Python3 code. Sometimes,
at large scale, it's your only choice. But to smaller companies looking at
this it's a bad idea. You're setting a precedent in your code that it's okay
to make the same mistakes that Python3 tried to prevent.

------
richard_todd
So their server needs are growing faster than their user-base to the extent
that they considered switching languages. PHP didn't seem to perform much
better, so they stuck with python and got a ~12% CPU usage improvement by
moving to python3. It doesn't seem like a 12% one-time improvement actually
solves the original problem, though. Perhaps pypy would have been better?

~~~
williamstein
Instagram also uses Cython a lot (so I've heard from a talk), so switching
away from CPython might not provide as much of a speed up as one might
otherwise expect, as one can get C levels of speed (and concurrency) with
significant effort using Cython. Also, Cython and PyPy might not play so well
together...

------
eggie5
types in python 3.5?! I had no idea -- that's exciting.

~~~
joobus
Note they are type _annotations_. They are for tooling/development only; the
runtime doesn't care about the types at all.
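
A quick sketch of what that means in practice. The `double` function is made up for the example.

```python
def double(n: int) -> int:
    # The annotations say int, but nothing enforces that at run time.
    return n * 2


# Passing a str runs without error; a checker such as mypy would flag this call.
result = double("ab")

# The annotations are just stored as metadata on the function object.
hints = double.__annotations__
```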

~~~
ice109
which language's run time cares about types?

~~~
nrinaudo
Any dynamic language - by definition, they check types at runtime.

Or, unfortunately, a lot of static languages. Any static language that allows
type casting, for instance - that's the only way they can check whether a cast
from a type to one of its descendants is valid (eg, in Java, casting an Object
to a, say, URL).
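
For the dynamic-language case, a tiny Python sketch (the `add` function is just for illustration): the type check happens when the operation actually runs, not before.

```python
def add(a, b):
    # No declared types; the check happens when + is evaluated.
    return a + b


try:
    add(1, "a")
except TypeError as exc:
    # The runtime refused to add an int and a str.
    error = str(exc)
```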

------
nyangosling
> _It made sense that, if we were going to stay on Python for the next ten
> years, we should invest in the latest version of the language._

I don't know why this stands out to me in particular, but the 10-year commitment
is definitely a big decision. I suppose I've never had to make a similar
decision so perhaps this is more common than I think.

~~~
treve
Most successful applications I've worked on have either already been around for
10 years, or have lived on to become 10 years old. The ones that didn't were
usually not successful.

It's a good reason to be careful to avoid the latest and greatest, but go for
tech that has (some) track record of being maintained and used for a few
years.

