
Show HN: Py-spy – A new sampling profiler for Python programs - benfrederickson
https://github.com/benfred/py-spy
======
hathawsh
Many thanks for building and releasing this. It's ridiculously easy to install
(especially in virtualenvs) and very powerful. When I push '3' or '4', I get
informative, stable output.

Minor feature request: an explicit 'pause' button would make it easier to copy
file paths from the output. Ctrl-S is a reasonable alternative, but it's a
little hacky.

Also, it would be nice to somehow eliminate time spent in poll() from the
results. I'm profiling a server process and 99.9% of the time is spent in a
poll function. Perhaps there could be an option to disregard time spent in
system calls rather than user code. Most of the time I'm interested in
profiling only user code.

(Actually, I've been looking for a reason to play with Rust code. Maybe I'll
try to add these features myself!)

~~~
benfrederickson
Thanks! Both of your suggestions totally make sense. I've created an issue to
track the poll() issue here:
[https://github.com/benfred/py-spy/issues/13](https://github.com/benfred/py-spy/issues/13).
I think that should be an easy fix.

~~~
toxik
I think a more robust solution might be to have counters for in-Python
samples, outside-Python, and in-syscall.
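
A minimal sketch of that counter idea, in Python for illustration only. The `Sample` record and its `in_syscall` flag are hypothetical and are not py-spy's actual data model; the sketch assumes the sampler can somehow tell whether the thread was blocked in the kernel at sample time.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple


@dataclass(frozen=True)
class Sample:
    """One hypothetical stack sample (not py-spy's real data model)."""
    stack: Tuple[str, ...]  # Python frames, outermost first; empty if none captured
    in_syscall: bool        # assumed flag: thread was blocked in the kernel


def bucket(sample: Sample) -> str:
    """Assign a sample to exactly one of the three counters."""
    if sample.in_syscall:
        return "in-syscall"
    if sample.stack:
        return "in-python"
    return "outside-python"


def summarize(samples: Iterable[Sample]):
    """Return (samples per bucket, samples per innermost function for in-python only)."""
    buckets = Counter()
    hot = Counter()
    for s in samples:
        b = bucket(s)
        buckets[b] += 1
        if b == "in-python":
            # Only on-CPU Python samples feed the hot-function table,
            # so a server idling in poll() no longer dominates it.
            hot[s.stack[-1]] += 1
    return buckets, hot
```

Keeping the three counts separate, rather than silently dropping idle samples, means the report can still show how much of the wall-clock time was excluded.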

------
gnufx
The adaptations of HPC-type performance tools to Python, and to the non-Python
(specifically parallel) libraries it calls, might be of interest:

TAU:
[https://www.cs.uoregon.edu/research/tau/docs/newguide/ch03s0...](https://www.cs.uoregon.edu/research/tau/docs/newguide/ch03s09.html)
Extrae/Paraver:
[https://www.researchgate.net/publication/317485375_Performan...](https://www.researchgate.net/publication/317485375_Performance_Analysis_of_Parallel_Python_Applications)
Score-p/Scalasca:
[http://score-p.org](http://score-p.org)
[https://github.com/score-p/scorep_binding_python](https://github.com/score-p/scorep_binding_python)

------
maxmcd
The localhost talk for rbspy (the inspiration for this project) is awesome:
[https://www.recurse.com/events/localhost-julia-evans](https://www.recurse.com/events/localhost-julia-evans)

I imagine it should also provide relevant insights into the structure of this
tool.

------
TTPrograms
This is really fantastic. I just managed to find a 3x speedup on a compute
heavy job I run using Py-spy - there was an unneeded hotspot in a library I'm
using that I didn't previously suspect. It would have taken a long time using
kernprof to dig in through the call stack to find the issue.

------
jpeanuts
This is a really great tool - the kind I didn't even know that I needed until
I was given it!

One question, does anyone here know how to interpret the GIL (Global
Interpreter Lock) percentage display in the top-left? In my code, "Active"
sticks nicely at 100%, but the GIL jumps around from 1% to 100%, changing on
every sample.

edit: Now that I think about it, my code spends a lot of time in C API calls -
maybe the GIL is released there?

~~~
toxik
Wow, it tracks GIL contention? Major feature for me. I use a lot of Numba, and
it releases the GIL if you want — so threading is actually useful.

------
lathiat
I've been enjoying pyflame from Uber - which the author quotes in their info

> The only other Python profiler that runs totally in a separate process is
> pyflame, which profiles remote python processes by using the ptrace system
> call. While pyflame is a great project, it doesn't support Python 3.7 yet
> and doesn't work on OSX or Windows.
>
> Py-spy works by directly reading the memory of the python program using the
> process_vm_readv system call on Linux, the vm_read call on OSX or the
> ReadProcessMemory call on Windows.

I think ptrace is fundamentally letting you do the same thing, in terms of how
pyflame is using it, and the same ptrace access permission governs whether
you can use process_vm_readv.

The real win for this project is a real-time "top" or "perf top" style UI
instead of only generating flamegraph output. I love that feature, and it will
be particularly good for quick-shot "what is this process doing" type info, as
opposed to specifically profiling some timeframe to analyse the resulting
flamegraph (which is all pyflame lets you do).

Nice work!

------
samstave
"top" for python programs. That's pretty awesome - not sure if this has existed
in other traces, but the output is great.

~~~
_verandaguy
If you're talking about the flame graphs, they're a fairly common feature of
modern profilers. The oldest implementation I know of is at
[https://github.com/brendangregg/FlameGraph](https://github.com/brendangregg/FlameGraph).

~~~
gnufx
I've never understood why flame graphs are better than the normal presentation
of inclusive and exclusive timings in performance tools, even if they're not
"modern", but embody some decades' experience. Anyone care to explain?

~~~
detaro
I'm far from a performance expert, but my impression is:

It shows the call paths to the functions and what part each path took, that's
not so obvious from the typical table. On the other hand, finding functions
that are called quite a lot all over the place and add up is easier in the
table, so it's not become useless.

~~~
gsteinb88
The latter can be accomplished with inverted flame graphs (sometimes called
icicle graphs) which show the call stack inverted
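
To make the comparison concrete, here is a small sketch that derives both views from the same data. It assumes samples in the folded-stack text form (`frame;frame;frame count`) consumed by the FlameGraph scripts; the function names and the example stacks are made up for illustration.

```python
from collections import Counter

# Made-up example: "parse" is called from two different places.
EXAMPLE = """
main;load;parse 30
main;render;parse 25
main;render;draw 45
"""


def parse_folded(lines):
    """Parse 'a;b;c N' lines into {(a, b, c): N}, outermost frame first."""
    stacks = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        frames, _, count = line.rpartition(" ")
        stacks[tuple(frames.split(";"))] += int(count)
    return stacks


def inclusive_exclusive(stacks):
    """Classic table view: per-function inclusive and exclusive sample counts."""
    inclusive, exclusive = Counter(), Counter()
    for stack, n in stacks.items():
        exclusive[stack[-1]] += n   # samples where the function was innermost
        for fn in set(stack):       # set(): count a recursive function once per stack
            inclusive[fn] += n
    return inclusive, exclusive


def invert(stacks):
    """Icicle view: reverse each stack so callees become roots and callers branches."""
    inverted = Counter()
    for stack, n in stacks.items():
        inverted[tuple(reversed(stack))] += n
    return inverted


stacks = parse_folded(EXAMPLE.splitlines())
inclusive, exclusive = inclusive_exclusive(stacks)
print(exclusive.most_common())  # [('parse', 55), ('draw', 45)]
for stack, n in sorted(invert(stacks).items()):
    print(";".join(stack), n)
```

In the regular orientation the two `parse` call sites show up as separate boxes of 30 and 25 samples; in the inverted output they share the root `parse`, which is the same total of 55 the exclusive column reports, while still showing which callers contributed.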

------
SketchySeaBeast
Oh that's great - I had it up and running in 30 seconds.

------
fake-name
Does it support python multiprocessing?

Basically nothing out there does that I've found, and it's a really major
pain-point for me.

~~~
marmaduke
yappi?

[https://bitbucket.org/sumerc/yappi/](https://bitbucket.org/sumerc/yappi/)

~~~
fake-name
> If you want to profile a multi-threaded application, you must give an entry
> point to these profilers and then maybe merge the outputs.

It basically boils down to this: (currently) multiprocessing profiling is a
giant pain in the ass. You have to manually attach the profiler yourself if
you ever launch another process, and every profiled process produces its own
output file.

It's not _impossible_, it's just very annoying. I've been vaguely meaning to
write a thing which attaches to the fork() call and automatically starts the
profiler in the child-process, and handles aggregating all the results back to
a single output when all children exit.

~~~
wrmsr
As a heads-up, if you hadn't already seen it: 3.7 added fork callbacks for
stuff like this:
[https://docs.python.org/3/library/os.html#os.register_at_for...](https://docs.python.org/3/library/os.html#os.register_at_fork).
Much nicer than patching-and-praying.
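
A rough sketch of how such a hook could be wired up with `os.register_at_fork` (Python 3.7+, Unix only). The hook body is a placeholder: `start_profiler_in_child` and `fork_and_report` are hypothetical names, and a real hook would start whatever profiler is in use and point it at a per-PID output file.

```python
import os

hook_pids = []  # PIDs in which the hook has run (each process has its own copy)


def start_profiler_in_child():
    """Hypothetical hook: a real one would start a profiler for this child here."""
    hook_pids.append(os.getpid())


# Runs in every child created through os.fork() from this interpreter.
os.register_at_fork(after_in_child=start_profiler_in_child)


def fork_and_report():
    """Fork once; return (hook ran in child?, hook PIDs seen in the parent)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: the hook has already run by the time fork() returns here.
        os.close(read_fd)
        ran = hook_pids == [os.getpid()]
        os.write(write_fd, b"1" if ran else b"0")
        os._exit(0)
    # Parent: collect the child's answer and reap it.
    os.close(write_fd)
    data = os.read(read_fd, 1)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data == b"1", list(hook_pids)
```

The hooks only fire for forks that return control to the Python interpreter, so workers started with multiprocessing's "spawn" start method, or programs launched via `subprocess`, would still need to be handled separately. Merging the per-process outputs afterwards is a separate step this sketch does not cover.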

~~~
fake-name
Yeah, saw that while looking about for atfork stuff.

Ironically, I was involved in
[https://bugs.python.org/issue6721](https://bugs.python.org/issue6721) which
is one of the major bugs leading to the acceptance of
[https://bugs.python.org/issue16500](https://bugs.python.org/issue16500),
which is the patch including atfork().

------
craftyguy
This is great!

I also took the liberty to add this (and setuptools_rust) to Arch Linux's AUR:
[https://aur.archlinux.org/packages/python-py-spy](https://aur.archlinux.org/packages/python-py-spy)

------
kawsper
I would love something like that for Ruby :)

I am envious, well done!

~~~
benfrederickson
Check out rbspy
[https://github.com/rbspy/rbspy](https://github.com/rbspy/rbspy) (rbspy was
the inspiration for this project =)

------
pvg
Wonderful. Can the data it produces be munged into something KCachegrind can
show?

~~~
benfrederickson
Not yet - but I'm hoping to have a version that supports this next week. Will
update this issue when it's done:
[https://github.com/benfred/py-spy/issues/3](https://github.com/benfred/py-spy/issues/3)

~~~
pvg
Great, thanks!

And as a slightly different take than that of the person posting the issue:
interfaces like kcachegrind are pretty clunky (if powerful, in their clunky
way). The profiler coming with some built-in presentation and reporting of
its own, like the flamegraph and the realtime display, is a big win, and the
lack of it is a serious deficiency in most python profilers.

------
nevon
Does anyone know if there is something like this for Nodejs? Of course you can
enable profiling, but it would be nice to be able to look at running
processes.

------
bobwaycott
Gave it a try, and wasn’t expecting to have to execute via sudo on macOS. I
almost never use sudo, so this stands out as quite unexpected for a developer
tool. Is there something off with my system, or are sudo privileges always
going to be required on macOS?

Edit: I pip installed it into a virtualenv, if that matters.

~~~
MetricMike
[https://github.com/benfred/py-spy#when-do-you-need-to-run-as...](https://github.com/benfred/py-spy#when-do-you-need-to-run-as-sudo)

tl;dr - yes, it's a limitation(?) of macOS syscalls.

------
AdamM12
Dangit. I was thinking of writing an AST inspector named pyspy.

------
ant6n
why the dash, py-spy vs rbspy?

~~~
A2017U1
Just a guess but remove the dash and read it out loud.

~~~
ant6n
?

~~~
yesenadam
It would rhyme with _crispy_.

