Also, I've noticed that she takes simple things we don't usually think deeply about and drills down into them, opening up the underlying complexity and pointing a metaphorical magnifying glass at it. (An example I quickly picked from the blog: http://jvns.ca/blog/2014/09/28/how-does-sqlite-work-part-1-p...)
I've done this to debug OpenStack in the past and it worked very well. There are many similar projects for Python; I used this one since it's in the RHEL repo.
Useful summary on Stack Overflow: https://stackoverflow.com/questions/4163964/python-is-it-pos...
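For reference, one of the approaches that Stack Overflow thread covers is attaching gdb to a live CPython process and dumping its Python-level stack with py-bt. A minimal sketch, assuming gdb plus CPython's gdb helpers and debug symbols are installed (python-debuginfo on RHEL/Fedora); the PID is hypothetical:

```shell
PID=12345  # replace with the stuck Python process's pid

# Write the gdb commands to a script so the attach is non-interactive.
cat > /tmp/py-snapshot.gdb <<'EOF'
py-bt
detach
quit
EOF

# Attach and dump the Python stack (commented out: needs a live
# process and ptrace permission):
# sudo gdb -p "$PID" -batch -x /tmp/py-snapshot.gdb
```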
That has been good enough for the problem of "wtf is this ruby process doing for _minutes_ at a time?" That doesn't get you flame graphs, but you can take a few snapshots and get an idea of what is happening.
For more involved perf debugging I've used ruby-prof.
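The snapshot trick above can be sketched like this for MRI: attach gdb and call into the interpreter to print the current Ruby stack. This is a hedged sketch, not a production recipe — rb_backtrace writes to the target process's stderr, the PID is hypothetical, and attaching does pause the process briefly:

```shell
PID=12345  # replace with the pid of the mysterious ruby process

# Write the gdb commands to a script so the attach is non-interactive.
cat > /tmp/ruby-snapshot.gdb <<'EOF'
call (void) rb_backtrace()
detach
quit
EOF

# Take one snapshot (commented out: needs a live MRI process and
# ptrace permission). Repeat a few times to see where time is going:
# sudo gdb -p "$PID" -batch -x /tmp/ruby-snapshot.gdb
```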
“I'm constantly surprised by how many people don't know you can do this. It's amazing.”
I'm probably nitpicking, but I was sad to see this in the article. One of the things I love about Julia's writing otherwise is that it's free of this sort of "I'm surprised you don't know this simple thing" expression.
I'll fix it when I get home.
I'm still puzzled that there seems to be no simple gprof-style graph profiler included with the JVM.
Last time I needed this I knocked something up that could use the internals of Mission Control or VisualVM to turn their profile formats into flame graphs, but I doubt it would keep working across versions, and it really needed more work to be something anybody else could use sensibly.
It was, however, very easy to do (maybe half an hour's work) and produced very useful results. If you're using an invokedynamic-based language implementation, though, you'll want to filter a lot of internal stack frames out of the graph, or you'll have trouble seeing past the LambdaForms to what's really going on.
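The filtering step can be as simple as a sed pass over the collapsed stacks before handing them to flamegraph.pl. A sketch — the sample input here is made up; real collapsed stacks would come out of a profiler via one of the stackcollapse-*.pl scripts:

```shell
# Hypothetical collapsed-stack input: "frame;frame;...;frame count"
cat > /tmp/stacks.txt <<'EOF'
main;Script.run;java.lang.invoke.LambdaForm$MH.invoke;java.lang.invoke.LambdaForm$DMH.invokeStatic;MyLang.hotLoop 90
main;Script.run;Helper.setup 10
EOF

# Drop any frame whose name contains "LambdaForm", keeping the rest
# of the stack and the trailing sample count intact.
sed 's/[^;]*LambdaForm[^;]*;//g' /tmp/stacks.txt > /tmp/filtered.txt
cat /tmp/filtered.txt

# Then render as usual:
# flamegraph.pl /tmp/filtered.txt > profile.svg
```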
Linux-only seems perfectly acceptable, especially as more and more development moves to Docker containers.
Great write-up, looking forward to seeing a completed tool!
I think this is why you shouldn't run this on the production server itself: each call is very resource-intensive there.
I believe the right way to analyze memory is to use "gcore" to dump the process's memory, scp the dump to a local VM running the same OS as production, download the same ruby binary that production is running, and use gdb on the VM to analyze the memory dump.
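That workflow sketches out roughly as below. All the names (host, pid, paths) are hypothetical; gcore ships with gdb, and the key point is that only step 1 touches production:

```shell
PID=12345          # pid of the ruby process on production
HOST=prod-app-01   # hypothetical production host

# 1. On the production box: snapshot the process's memory. gcore only
#    pauses the process while the dump is written, unlike poking at it
#    interactively with gdb.
# gcore -o /tmp/ruby-core "$PID"     # writes /tmp/ruby-core.12345

# 2. Copy the dump and the exact ruby binary to a local VM running the
#    same OS as production.
# scp "$HOST:/tmp/ruby-core.$PID" "$HOST:$(which ruby)" .

# 3. Analyze locally; production is no longer involved.
# gdb ./ruby "ruby-core.$PID"
```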
And that's exactly what this blog post is about.