
Profiling Python in Production - mau
https://nylas.com/blog/performance
======
mau
Here is a profiler developed by dropbox using the same technique described in
the article:

\- [https://blogs.dropbox.com/tech/2012/07/plop-low-overhead-pro...](https://blogs.dropbox.com/tech/2012/07/plop-low-overhead-profiling-for-python/)

\- [https://github.com/bdarnell/plop](https://github.com/bdarnell/plop)
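The technique behind Plop (and the article) is statistical sampling: a timer signal periodically interrupts the process, and the handler records the current call stack; stacks that appear often are where CPU time goes. A minimal sketch of that idea, with names of my own choosing (not Plop's actual API):

```python
import collections
import signal
import traceback

# Counts of sampled call stacks, keyed by a tuple of function names.
stack_counts = collections.Counter()

def _sample(signum, frame):
    # The signal handler runs inside the interrupted Python frame;
    # extract_stack walks outward from it to capture the full stack.
    stack = traceback.extract_stack(frame)
    stack_counts[tuple(f.name for f in stack)] += 1

def start_sampling(interval=0.01):
    # ITIMER_PROF fires based on CPU time consumed, so an idle
    # process generates no samples and almost no overhead.
    signal.signal(signal.SIGPROF, _sample)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)

def stop_sampling():
    signal.setitimer(signal.ITIMER_PROF, 0, 0)
    signal.signal(signal.SIGPROF, signal.SIG_DFL)
```

After `start_sampling()`, let the workload run, then call `stop_sampling()` and inspect `stack_counts.most_common()`. This is Unix-only (signals plus `setitimer`) and samples only the main thread, which is roughly the set of trade-offs the sampling approach makes in exchange for its low overhead.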

------
sitkack
Reducing CPU load also

* reduces power usage, wear and tear on hardware

* gives more capacity for traffic surges

* gives more headroom for new features

* enables running on a smaller instance

This isn't an argument against slow code, more of a suggestion to tune out
unnecessary work.

~~~
valarauca1
Slow code is just paying for developer productivity in CPU time instead of
labor time.

It's great in theory until the AWS bill arrives. Oftentimes it's still fine;
it just depends where the bottom line is.

~~~
sitkack
Agreed, I love slow interpreted languages like Python, and writing _fast_
Python is a different kind of optimization than one would do in C. In cases
like the article, the biggest gains come from not doing something at all, or
doing it less often.

------
sanxiyn
Here is another Python profiler intended for live use:
[https://github.com/what-studio/profiling](https://github.com/what-studio/profiling)

------
iamspoilt
This question might be a bit naive, but how is this approach any better or
different from monitoring tools like NewRelic, which do the profiling for
you?

~~~
emfree
Author of the post here. That's a good question. I don't know if this approach
is objectively better, but it has a few nice features.

* We generally favor free/open source solutions where practical.

* It is quite a bit cheaper in dollar terms.

* The actual code to make this work is very lightweight. By doing it yourself, you have total control, and can extend or tweak it to get exactly the data you want. Being able to easily add bespoke instrumentation is really powerful. To give an example from one of our use cases (IMAP sync), let's say you wanted to cohort your data by mail provider. I.e., you suspect that the workload profile when syncing against server A is significantly different from syncing against server B, and you want to know for sure. It's pretty easy to take your codebase and your instrumentation and add that by inspecting some thread-local context at runtime. That might be hard to do with an off-the-shelf commercial tool.
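The thread-local cohorting described above might be sketched like this. All names here are hypothetical, not the actual Nylas instrumentation: the idea is just that each worker tags itself with the provider it is syncing, and the profiler's sample handler files each stack under that tag.

```python
import collections
import threading

# Per-thread context; each sync worker records which provider it is
# currently syncing against.
_context = threading.local()

# Sampled stacks, bucketed by cohort (e.g. mail provider).
samples_by_cohort = collections.defaultdict(collections.Counter)

def set_cohort(name):
    # Call when a worker starts syncing against a given provider.
    _context.cohort = name

def record_sample(stack_key):
    # Called from the profiler with the sampled stack; threads that
    # never set a cohort fall into an "unknown" bucket.
    cohort = getattr(_context, "cohort", "unknown")
    samples_by_cohort[cohort][stack_key] += 1
```

With this in place, `samples_by_cohort["gmail"]` and `samples_by_cohort["yahoo"]` can be compared directly to test the "server A behaves differently from server B" hypothesis.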

~~~
iamspoilt
I completely agree about the bespoke instrumentation you've discussed here,
and this approach makes a lot more sense to me now. I'm gonna use it for one
of our micro-services soon. Cheers for sharing this amazing post.

------
moonchrome
>It’s a large Python application (~30k LOC) which handles syncing via IMAP,
SMTP, ActiveSync, and other protocols.

In what context is 30k LOC a large application? 30k LOC is small enough that
one programmer can write it and easily keep an overview of the entire
codebase. Maybe it's a typo and it's 300k LOC.

~~~
jonesb6
I'd also consider a poorly written and documented 30k LOC to be a "large"
codebase, and it would probably take multiple people to wrestle with it.

