Interesting testimonials section. Two out of five are "this looks interesting" and "this might be useful." Another two make it unclear whether they've actually used it. The last one does seem to be from an actual user (and is very positive, for what it's worth).
For those Python programmers out there, if you don't mind sharing your experiences: do you spend any time at all in the REPL? If so, approximately what fraction of the time? Using IPython, JupyterLab, or something else? Or do you just run it directly from VS Code or PyCharm? Anything you may want to add about your routine would be appreciated.
Oh, if you (experienced programmer or not) happen to know about a good site or YouTube channel to see Python programmers in action (as opposed to tutorials), please share.
Thanks in advance, and apologies for the digression.
I use Jupyter and vscode together. I'll write new snippets of code in Jupyter, then move them into stand-alone .py files when I'm happy with them. I'll use vscode to work on already-established code. The auto-reloading extension in Jupyter is super helpful.
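(That's most likely IPython's autoreload extension; a minimal sketch of the setup, with the module name being illustrative:)

```python
# First notebook cell: re-import edited .py files automatically, so the
# snippets being promoted to stand-alone modules stay in sync.
%load_ext autoreload
%autoreload 2

import mymodule            # hypothetical module grown alongside the notebook
mymodule.process("demo")   # always runs against the latest saved mymodule.py
```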
Notebooks are just plain awesome. Whenever I use a new API or service, I'll make a notebook with cells showing how to call/run each operation and commit it as a sort of executable documentation.
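A sketch of what one of those notebooks might look like; the service and endpoints are purely illustrative, and each comment marks a notebook cell:

```python
# Cell 1: shared setup
import requests

BASE = "https://api.example.com/v1"   # hypothetical service

# Cell 2: list widgets (running the cell shows the live response)
resp = requests.get(f"{BASE}/widgets", timeout=10)
resp.raise_for_status()
resp.json()

# Cell 3: create a widget
resp = requests.post(f"{BASE}/widgets", json={"name": "demo"}, timeout=10)
resp.raise_for_status()
resp.json()
```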
But I normally just create two terminals (I have a tiling window manager): in one I open a Python file under /tmp/ and write my code, and in the other terminal I execute it.
I would probably use a REPL if it were integrated in my favorite editor (https://helix-editor.com).
But everything else I tried was too "clunky" for me.
I do work with data scientists, though, and they love to do everything inside JupyterLab.
Ah, it's spooky reading someone with basically the exact same workflow as me!
I use helix in the terminal, regularly opening up a split pane in tmux to either drop into a breakpoint or test out bits of code interactively. I'm not quite as organized as having two regular panes; I'll close and open them pretty quickly, often just to try some toy example of reorganising a dict or something before writing it out into code.
Haha, great to hear I'm not the only one! I just miss the speed of Helix when using JupyterLab et al., so I do it this way.
Yeah, I'm definitely not that organized either; I also don't keep both open all the time, since my fingers are sometimes too quick and close one of the terminals without me thinking about it. But I kept my example simple so others could get the idea, as this is basically the "concept" behind my workflow.
I usually develop locally with the vscode debugger. On staging servers I often do remote vscode sessions also. On production I often use the REPL since I don’t want to install additional tools, but still need to inspect the state of a pipeline in a more step-by-step fashion.
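A sketch of that zero-install, step-by-step inspection, assuming the pipeline is a plain script (all names here are illustrative): running it with the interpreter's -i flag drops you into a REPL with the script's state still live.

```python
# pipeline.py -- hypothetical; run on the box as:  python -i pipeline.py
stages = ["extract", "transform", "load"]
results = {}

for stage in stages:
    results[stage] = f"{stage}: ok"   # stand-in for the real work

# When the script finishes (or raises), -i leaves you at a >>> prompt
# where `stages` and `results` can be poked at interactively:
#   >>> results["transform"]
#   'transform: ok'
```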
Work with a ton of ETL and web APIs, and build internal tools. Almost everything new starts off in a notebook and then either moves to a standalone .py, ends up in AWS Lambda (typically Zappa Flask projects), or lands in an Airflow DAG.
The reason is that the Jupyter environment is light-years more powerful than the REPL. It feels like only those who don't code would use the bare REPL. I didn't even use it after the first day.
For Python: 90/100 times I just run the code/tests/debugger; 8/100 times I'll pull out the code and step through it with my own inputs; 2/100 times I'll use a REPL and break on areas of interest to debug, because the debugger just isn't cooperating. It's just too easy to use the debugger in something like vscode to run a module. Especially in Python, you can right-click-run any old module a lot of the time: just make a __main__, feed in some parameters, and step through it, unlike with a static language.
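A minimal sketch of that right-click-run pattern (module and function names are illustrative):

```python
# some_module.py -- hypothetical module under debug
def transform(records: list[dict]) -> list[dict]:
    return [{**r, "total": sum(r.get("vals", []))} for r in records]

if __name__ == "__main__":
    # Ad-hoc harness: feed in some sample parameters, put a breakpoint
    # on the print line, and right-click "Run/Debug" this file.
    sample = [{"name": "a", "vals": [1, 2, 3]}, {"name": "b"}]
    print(transform(sample))
```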
I used this tool recently to debug some issues where one of our batch tasks was using more memory than it should and sometimes OOMing in an environment where it shouldn't come close to hitting a memory ceiling.
Found the issue almost immediately with the high-watermark analysis, which provides visibility into the part of the codebase/call stack that allocated each piece of memory still live at the high-watermark point. That made diagnosing the issue (an unexpectedly large amount of memory required by a method from a third-party library, xarray.merge) and remedying it incredibly easy.
I was very impressed by the quality and utility of this tool and am now a huge fan!
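For anyone wanting to try the same analysis, here's a minimal sketch using Memray's Tracker API; the workload and file names are illustrative:

```python
from memray import Tracker

def batch_task():
    # stand-in for the real batch job
    return [bytes(4096) for _ in range(100_000)]

with Tracker("batch_task.bin"):
    data = batch_task()

# Then, from a shell, render the report -- by default the flamegraph
# shows the allocations live at the high watermark:
#   memray flamegraph batch_task.bin
```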
I have used memory profilers in PHP. In my experience, it is difficult to analyze why an application uses a lot of RAM just by looking at stack traces. It is better when you have a graph of references showing which variable references which memory block, so that you can see, for example, that `cache.users[23].comments[56].attachments[2].binary_data` uses 10 MB of RAM.
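In Python, the standard library can at least walk that reference graph one step at a time; a rough sketch with gc.get_referrers, mirroring the structure above:

```python
import gc

binary_data = bytearray(10 * 1024 * 1024)   # the 10 MB block to explain
cache = {"users": [{"comments": [{"attachments": [{"binary_data": binary_data}]}]}]}

# One step up the reference graph: which containers hold this block?
# (This also finds the module globals, which hold it under "binary_data".)
for ref in gc.get_referrers(binary_data):
    if isinstance(ref, dict):
        keys = [k for k, v in ref.items() if v is binary_data]
        print("referenced by dict under key(s):", keys)
```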
When it comes to profiling in Python, never underestimate the power of the standard library's profiler. You can supply it with a custom timing function when instantiating the Profile type [1], and as far as the module is concerned, this can be any function which returns a monotonically-increasing counter.
This means that you can turn the standard profiler into a memory profiler by providing a timing function which reports either total memory allocation or a meaningful proxy for allocation. I've had good results in the past using a timing function which returns the number of minor page faults (via resource.getrusage).
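A minimal sketch of that trick (Unix-only, since it relies on the resource module):

```python
import cProfile
import pstats
import resource  # Unix-only

def page_faults() -> float:
    # A monotonically increasing "clock": cumulative minor page faults
    # for this process, as a rough proxy for memory allocation.
    return float(resource.getrusage(resource.RUSAGE_SELF).ru_minflt)

def workload():
    return [list(range(10_000)) for _ in range(200)]

prof = cProfile.Profile(page_faults)
prof.enable()
workload()
prof.disable()

# The "time" columns now count page faults instead of seconds.
pstats.Stats(prof).sort_stats("cumulative").print_stats(5)
```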
I've used Memray and it works great locally. But when I deploy it over long-running processes (i.e. in production), because I want to see memory usage over a long period of time, the profiler outputs get really large, like hundreds of GBs. They cause disk outages and also take forever to download and visualize with the flamegraphs. What do people use to understand the memory usage of long-running workloads in production?
I have a hard time listening to him, knowing how unsavory he tends to be in response to GitHub comments and issues. He has made some good tools for sure, but his interpersonal comportment is quite off-putting.
Currently I actually need a Python memory profiler, because I want to figure out whether there is some memory leak in my application (a PyTorch-based training script), and where exactly (in this case, it's not a problem of GPU memory but of CPU memory).
I tried Scalene (https://github.com/plasma-umass/scalene), which seems to be powerful, but somehow the output it gives me is not useful at all? It doesn't really give me a flamegraph or a list of the top lines with memory allocations; instead it gives me a listing of all source-code lines, printing some (very sparse) information on each one. So now I need to search through that listing by hand to find the spots? Maybe I just don't know how to use it properly.
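For reference, the kind of "top allocating lines" listing I was hoping for is roughly what the standard library's tracemalloc produces (the workload here is a deliberately leaky stand-in):

```python
import tracemalloc

def training_step(step: int) -> list:
    # hypothetical stand-in for the real work; "leaks" on purpose
    return [bytes(1024) for _ in range(1000 * step)]

tracemalloc.start(25)  # keep up to 25 frames per allocation
leaked = [training_step(s) for s in range(1, 5)]

snapshot = tracemalloc.take_snapshot()
for stat in snapshot.statistics("lineno")[:10]:
    print(stat)  # top allocating source lines, biggest first
```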
I tried Memray, but first ran into an issue (https://github.com/bloomberg/memray/issues/212); after applying a workaround, it worked. I get a flamegraph out, but it doesn't really seem accurate: after a while, there don't seem to be any new memory allocations at all anymore, and I don't quite trust that this is correct.
Somehow this experience so far was very disappointing.
Side note: I debugged some very strange memory allocation behavior of Python before, where all local variables were kept around after an exception, even though I made sure there was no reference anymore to the exception object, to the traceback, etc., and I even called frame.clear() on all frames to really clear them. It turns out that frame.f_locals creates another copy of all the local variables, and the exception object and all the locals in the other frame stay alive until you access frame.f_locals again. At that point, it syncs f_locals with the real (fast) locals again, and only then can it finally free everything. It was quite annoying to find the source of this problem and to find workarounds for it. https://github.com/python/cpython/issues/113939
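A minimal sketch of that behavior as described in the linked issue, assuming CPython < 3.13 (PEP 667 reworked f_locals in 3.13, so newer versions free the object right after clear()):

```python
import gc

class Big:
    live = 0
    def __init__(self): Big.live += 1
    def __del__(self):  Big.live -= 1

def fail():
    big = Big()  # a large local we expect to be freed after the exception
    raise ValueError

try:
    fail()
except ValueError as exc:
    frame = exc.__traceback__.tb_next.tb_frame  # fail()'s frame
    _ = frame.f_locals   # copies the fast locals into a dict cached on the frame
    del exc, _

frame.clear()            # clears the fast locals of fail()'s frame...
gc.collect()
print(Big.live)          # ...yet on CPython < 3.13 this still prints 1

_ = frame.f_locals       # re-syncs the cached dict with the now-empty fast locals
print(Big.live)          # 0 -- only now is `big` actually freed
```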
Another side note: Bloomberg has a couple of nice open-source projects. E.g., I just realized that PyStack (https://bloomberg.github.io/pystack/) is also by Bloomberg.
It is still in its experimental stages, but would you be willing to test the memory mode of Echion (https://github.com/P403n1x87/echion)? It essentially traces all memory allocations. Mind that if an allocation is followed by a matching deallocation, it won't be reported; so if you profile from start to end and all the memory is released, the profile should be almost empty (in reality this will almost never be the case). This might explain why the data from the tools you have already tried doesn't seem to make sense (assuming they do a similar thing).