Minor feature request: an explicit 'pause' button would make it easier to copy file paths from the output. Ctrl-S is a reasonable alternative, but it's a little hacky.
Also, it would be nice to somehow eliminate time spent in poll() from the results. I'm profiling a server process and 99.9% of the time is spent in a poll function. Perhaps there could be an option to disregard time spent in system calls rather than user code. Most of the time I'm interested in profiling only user code.
(Actually, I've been looking for a reason to play with Rust code. Maybe I'll try to add these features myself!)
Score-p/Scalasca: http://score-p.org https://github.com/score-p/scorep_binding_python
Imagine it should also provide relevant insights into the structure of this tool
One question, does anyone here know how to interpret the GIL (Global Interpreter Lock) percentage display in the top-left? In my code, "Active" sticks nicely at 100%, but the GIL jumps around from 1% to 100%, changing on every sample.
edit: Now that I think about it, my code spends a lot of time in C API calls - maybe the GIL is released there?
> The only other Python profiler that runs totally in a separate process is pyflame, which profiles remote python processes by using the ptrace system call. While pyflame is a great project, it doesn't support Python 3.7 yet and doesn't work on OSX or Windows.')
> Py-spy works by directly reading the memory of the python program using the process_vm_readv system call on Linux, the vm_read call on OSX or the ReadProcessMemory call on Windows.
I think ptrace is fundamentally letting you do the same thing in terms of how pyflame is using it.. and the same ptrace access permission governs whether you can use process_vm_readv
The real win for this project is a real-time "top" or "perf top" style UI instead of only generating flamegraph output. I love that feature, and will be particularly good for quick shot "what is this process doing" type info as opposed to specifically profiling some timeframe to analyse the resulting flamegraph (which is all pyflame let you do)
It shows the call paths to the functions and what part each path took, that's not so obvious from the typical table. On the other hand, finding functions that are called quite a lot all over the place and add up is easier in the table, so it's not become useless.
Basically nothing out there does that I've found, and it's a really major pain-point for me.
It basically boils down to (currently) doing multiprocessing profiling is a giant pain in the ass, you have to manually attach the profiler yourself if you ever launch another process, and every profiled process produces it's own output file.
It's not impossible, it's just very annoying. I've been vaguely meaning to write a thing which attaches to the fork() call and automatically starts the profiler in the child-process, and handles aggregating all the results back to a single output when all children exit.
Ironically, I was involved in https://bugs.python.org/issue6721 which is one of the major bugs leading to the acceptance of https://bugs.python.org/issue16500, which is the patch including atfork().
I also took the liberty to add this (and setuptools_rust) to Arch Linux's AUR: https://aur.archlinux.org/packages/python-py-spy
I am envious, well done!
SBCL's sampling profiler is pretty finicky, and I haven't figured out yet how to use Linux-level profiling tools to get something useful out of a CL image.
So I second the OP, well done!
And as a slightly different take than that of the person posting the issue - interfaces like kcachegrind are a pretty clunky (if powerful, in their clunky way) - the profiler coming with some built-in presentation and reporting of its own like the flamegraph and the realtime display is a big win and a serious deficiency in most python profilers.
Edit: I pip installed it into a virtualenv, if that matters.
tl;dr - yes, it's a limitation(?) of macOS syscalls.