Netflix FlameScope (medium.com)
389 points by brendangregg 10 months ago | 74 comments



I’ve done something like this at Microsoft in 2002. Except in my case it was a multi-user system and each row represented a logical user, or a range of users in a range partitioned space. You could vary granularity, click through to see what’s busted, zoom into user ranges, discover correlated failures, discover cases when something is failing for just one user consistently, correlate all of it to performance counters, etc, etc. No one gave a shit.


What did you use to build it?


IIRC the standard MS stack at the time: C#, ASP.NET (which had just come out), MS SQL, IE, some in-memory caching and bitmap rendering (canvas didn’t exist), AJAX. Pretty bleeding edge, but it all worked OK.


Given when you were using that then-bleeding-edge stack, I wonder if you used XML (i.e., the "X" in the then-acronym AJAX).

Later de-acronym-ized to "Ajax", with JSON replacing XML.


Of course. JSON wasn’t popularized back then. But mostly it was about reloading parts of the page through DOM manipulation.


Yeah, I was there too. (Web dev for a living since 1998.) I remember sending HTML fragments over the wire too...


If the data were in XML format, XSLT could be used from JavaScript (or VBScript) in the browser for the DOM generation. It was pretty slick up to a point, but since it was a blocking operation, anything beyond a certain size started to noticeably lock up the browser.


I’ve done some of that as well, in different contexts. Those were the days. Virgin territory, uncomplicated, no cross-browser concerns. I haven’t touched frontend since 2005, and looking at it now it’s horrifying. Even the hairiest C++ backend is easier for me now, just because once you write it it doesn’t need to be “updated” for the next decade. Frontend seems to change every 6 months for no apparent benefit.


> "no cross-browser concerns"

wat.

the original browser wars? ns vs ie? 2 incompatible implementations w/ no tooling to help?


It was a Microsoft tool for Windows only. Supporting other browsers wasn’t even a question.


Uncomplicated in those days meant IE5, everywhere.


Yes, but we also change every 9 months based on the analysis of what happened with the 6-month no-benefit change. So there's that!


This is lovely. An interesting thing to think about here is the layout of the sequential time-series data. This visualisation breaks the sequence out into columns, which means that squares that are close to each other in the sequence don't cluster visually.

I've played with using space-filling curves for this to visualise binaries:

https://binvis.io/

You can play around on there with the scan layout (similar to the Netflix layout) and compare it to the Hilbert curve. You can click and drag to select regions in both (similar to the video for this visualisation).

There are tradeoffs - the Hilbert layout is not intuitive, and you have to play around to select regions accurately. On the other hand, the clustering lets you visually pick hotspots in much greater detail, e.g.:

https://binvis.io/#/view/examples/tcpview.exe.bin?colors=ent...

I've found it useful to be able to toggle between both scanning and clustered layouts, and plan to explore some more interaction paradigms when I get time to turn back to the project.
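For anyone curious how the clustered layout works: a Hilbert curve maps a 1-D offset to 2-D coordinates so that offsets near each other in the file stay near each other on screen. A minimal sketch of the textbook iterative mapping (not binvis's actual code) looks like this:

  def d2xy(n, d):
      """Map index d on an n-by-n Hilbert curve (n a power of two) to (x, y)."""
      x = y = 0
      t = d
      s = 1
      while s < n:
          rx = 1 & (t // 2)
          ry = 1 & (t ^ rx)
          if ry == 0:              # rotate the quadrant so the curve stays connected
              if rx == 1:
                  x, y = s - 1 - x, s - 1 - y
              x, y = y, x
          x += s * rx
          y += s * ry
          t //= 4
          s *= 2
      return x, y

Plotting byte i of a file at d2xy(256, i) gives the clustered layout; replacing that with (i % width, i // width) gives the scan layout.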


I did binary visualizations many years ago too:

https://github.com/brendangregg/Dump2PNG/wiki

So with the subsecond-offset heat map, you get clustering from one column to the next for events that happen at the same time offset. Which is common with periodic tasks: see the example patterns in the post.


It looks like we've traveled a similar path. The precursor to binvis is scurve, a (slow, dodgy) command-line tool to do basically what dump2png does:

https://github.com/cortesi/scurve

I have a blazing fast command-line version of binvis in Rust that I plan to release in the next month or so. I really need to make some time to turn back to this.


Wow, I haven't seen your name in years; I had forgotten! I just want to thank you for having an incredible blog and visualization tools. sortvis.org was frequently a resource I used to learn sorting, and later, as a tutor, I shared it with other students to help them learn. I've shared sortvis and the posts you did on Hilbert curves with young, curious family members to inspire them, and it always did! So thank you!

And Brendan's blog has been more recently inspiring and educational; seeing you guys chatting on HN is too cool. Thank you both for the awesome and freely accessible material.


My pleasure! Sortvis is down at the moment, but will be back up soon. I've been swallowed up in other projects, but I have a perpetual new year's resolution to write more blog posts. Keep an eye out in the next few months. :)


binvis.io is pretty great!

So I was more interested in core dumps, which get tricky since they can be 10s of Gbytes in size. Probably the most useful thing I figured out was this palette type:

  x86: grayscale with some (9) color indicators:
    green = common English chars: 'e', 't', 'a'
    red   = common x86 instructions: movl, call, testl
    blue  = binary values: 0x01, 0x02, 0x03
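In code it's roughly a per-byte lookup like the sketch below; each mnemonic has several encodings, so the opcode bytes here are just illustrative picks (0x89 for mov, 0xE8 for call, 0x85 for test), not necessarily what Dump2PNG used:

  def byte_color(b):
      """Map one byte to an (R, G, B) tuple: grayscale by value, with a few highlight colors."""
      if b in (ord('e'), ord('t'), ord('a')):   # common English letters
          return (0, 255, 0)                    # green
      if b in (0x89, 0xE8, 0x85):               # one mov/call/test encoding each (illustrative)
          return (255, 0, 0)                    # red
      if b in (0x01, 0x02, 0x03):               # small binary values
          return (0, 0, 255)                    # blue
      return (b, b, b)                          # everything else: grayscale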


Building more understanding of file formats into binvis is perpetually on my todo list. This would let you interpret different segments of files differently, and correctly interpret instructions and other higher-level structures. It would also help with things like entropy calculation over sliding windows (where you don't want this to span file segments). There are so many interesting avenues to explore here.

While I'm thinking about it, you're of course quite right that spotting periodic patterns in a Hilbert curve layout is really hard. However, I've found that being able to spot them in a scanning layout is highly dependent on your column width matching/resonating with the period of the pattern you're looking for. Another thing on my todo list is to combine Hilbert layouts with Fourier transforms. There's a very suggestive paper on this that I'd love to explore:

https://arxiv.org/pdf/1508.03817.pdf
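The frequency-analysis half is easy enough to sketch: treat the bytes (or per-bucket sample counts) as a 1-D signal and look for the strongest spectral peak. A rough numpy sketch, with an arbitrary minimum period:

  import numpy as np

  def dominant_period(signal, min_period=2):
      """Estimate the dominant period, in samples, of a 1-D signal via the FFT."""
      x = np.asarray(signal, dtype=float)
      x = x - x.mean()                          # drop the DC component
      spectrum = np.abs(np.fft.rfft(x))
      freqs = np.fft.rfftfreq(len(x))           # cycles per sample
      valid = (freqs > 0) & (freqs <= 1.0 / min_period)
      peak = freqs[valid][np.argmax(spectrum[valid])]
      return 1.0 / peak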


Link to the repo -- https://github.com/Netflix/flamescope

From a quick glance, it's interesting to see both Flask and Node.JS used within the same project repo. Is this becoming more commonplace?


Not Node.JS specifically, but just npm for dependencies and task running to build the React client from source. If you're not looking to build the client from source, the distributed client bundle should suffice and only Python is required.


Even if you're using Flask, there are plenty of reasons to have a package.json or other npm resources in your repo. Webpack is a good example.


> a) could not reliably capture a one minute flame graph covering the issue, as the onset of it kept fluctuating; b) capturing a two or three minute flamegraph didn’t help either, and the issue was “drowned” in the normal workload profile. Vadim asked me for help.

One interesting thing here is the statistical nature of this problem: there are spikes, but their duration isn't exact and it isn't known when they start. Here[1] is some reference work on neural spike trains that seems relevant.

[1]http://www.stat.columbia.edu/~liam/teaching/neurostat-spr12/...


Wow, great visualization! Thanks for sharing the source.

A minor suggestion: for the first part of the demo I was thrown off by the fact that each second is laid out bottom->top. Maybe moving the time labels below the plot would provide a visual cue for the direction.


I think this is intentional, as the norm is for 0 on a Y-axis to be at the bottom and increase going up.

I can understand why it's confusing, and perhaps could be a UI option to reverse it, but I do think that's the reason.


Yes, that's the reason! It means we get a t=0 origin for both the x- and y-axis. But yes, we can also add UI options to reverse it and other things.


Yes, I agree with it being that way. I just think my brain parses the origin as being the corner that is closest to the "0" label. Therefore, if the labels were below the plot, there would be no parse error :)


Can you open an issue/feature request on GitHub?



Does anyone have any good tools or processes for generating the kind of performance counters and information needed by this tool from a running Python application?

I'd love to be able to generate a flamegraph for a Django app, for example, and visualise it with FlameScope.

https://mail.python.org/pipermail/python-dev/2014-November/1... discusses the problems with Python and the `perf` tool. There may have been some patches applied recently to help, but it doesn't look like there was a full resolution.


Not a direct answer, but there's Python support of some sort available for at least TAU <https://www.cs.uoregon.edu/research/tau/home.php>, Extrae <https://tools.bsc.es/extrae>, and Score-P <https://github.com/score-p/scorep_binding_python>.


I wrote https://github.com/akx/abyss for Django once upon a time. I'm not sure it works these days, but you could give it a shot.


What's the purpose of visualizing it as a grid? I thought it might be good for debugging periodic events (e.g., every 16ms for rendering at 60fps), but it seems like this is fixed to one second per column.

For example, in the flame graph in Chrome's performance tab, you can select/slice/zoom/pan arbitrary time ranges. If you want the high-level view, you just zoom out.


Supporting different y-axis ranges is something we can add, and I was already thinking of adding a 1ms range for visualizing low-level CPU cycle activities (which span nano and microseconds). I like the idea of 16ms for 60fps analysis, where every column is one frame. Could help game development. :)


Actually, seeing the diagonal stripes in the examples [1], maybe it'd be useful to have a mode that lets you rescale the axes to line the stripes up? If you're using this to inspect patterns, it seems to me that the most useful time scale would not be a nice round human number, it'd be one that matches well with the patterns you're trying to inspect - one period, not one second.

Way out in the distance, maybe even a variable-interval mode, where you can click-select blocks and each one is given its own column, so you can directly compare them without the dead space in the middle and starting from the same baseline?

[1] Particularly striking: https://cdn-images-1.medium.com/max/2000/1*EypUnkPtayeKr9bJu...


You could autoscale the y axis to whatever makes the columns have the most similarity (minimize squared differences).


Rather than lsq, you could use autocorrelation within rows to line them up.
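For example, on the flattened per-bucket sample counts, picking the lag with the strongest autocorrelation as the column width (the lag bounds here are arbitrary):

  import numpy as np

  def best_column_width(bucket_counts, min_lag=10, max_lag=2000):
      """Pick a column width (in buckets) at the lag where the per-bucket
      sample counts are most self-similar, so repeating patterns line up
      row-for-row across columns."""
      x = np.asarray(bucket_counts, dtype=float)
      x = x - x.mean()
      ac = np.correlate(x, x, mode="full")[len(x) - 1:]   # autocorrelation at lags 0..len-1
      return min_lag + int(np.argmax(ac[min_lag:max_lag]))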


This thing is extremely periodic. You should do an FFT; there should be a peak that's more obvious than the others. That's your frequency and thus your period.


One second per column, fractions of a second per row: each cell covers a fraction of that column's second.

In your example, I imagine you might see a periodic color shift along the Y axis as things aligned with that 16ms window. The example grid looks to have about 50 rows, so each would be ~20ms, which is a bit too large for your use case. The example usage shows that you get to define the Hertz it samples at, though, so you should be able to define exactly the period length you desire.
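Roughly, each sample lands in a cell like this (just a sketch of the idea, not FlameScope's actual code; the row count and column width would be the configurable bits):

  def heatmap_cell(timestamp, rows_per_column=50, column_seconds=1.0):
      """Map a sample timestamp (in seconds) to its (column, row) cell:
      columns step through whole seconds, rows step through the
      fractional offset within that second."""
      column = int(timestamp // column_seconds)
      offset = timestamp % column_seconds
      row = int(offset / column_seconds * rows_per_column)
      return column, min(row, rows_per_column - 1)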


I think it's helpful for visualizing the seconds, since it's a time-based debugging view.

Maybe open a pull request for a graph-based view and we can A/B test it.


Nice. I had some search cluster optimization problems that this would have been very handy for. I laughed when I got to "Ah, that’s Java garbage collection," as that turned out to be one of the big contributors to variance in response time.


Longer video with more examples: https://www.youtube.com/watch?v=gRawd7CO-Q8


Awesome!

It'd be pretty sweet if you could send things sampled using Vector over to FlameScope. I know the flamegraphs PMDA is experimental but it's always worked fine for me :)

(It'd also be pretty sweet if there was a fleshed out PMDA for bcc tools. And Vector supported them. And, and, and... I have a really long wishlist.)


We'll get it done (although I'm not exactly sure who "we" will end up being yet!). I improved the flamegraph PMDA late last year, but how FlameScope works is really the direction Vector should go: being able to make an API call to get the flame graph stacks as JSON, and then render them in d3, along with buttons for transforms.


What's the advantage of this sort of thing over the trace and profile analysis tools that have been developed at least for HPC performance engineering over many years? (E.g. the BSC tools <https://tools.bsc.es/> amongst several others.) In particular, why are flame graphs preferred to the normal inclusive/exclusive views of profiles which seem more generally useful to me?


I know I am missing something, but passing -F 49 to perf will tell the kernel to adjust the sampling interval dynamically so approximately 49 events/s are recorded, right?

But then how does it make sense to plot the number of events per second in the overview? Or what exactly is being plotted in the overview?

Edit: Perhaps it is simply that idle events are discarded and multiple CPUs allow more than one event in the period?


Haven't followed up on this lately — is sampling Java stack traces with perf going well? What happened to the frame pointer issue? Do we still have to use -XX:+PreserveFramePointer to get the full stack?

Awesome tool nevertheless. Always a pleasure to go through Brendan's utilities regarding performance analysis!


Yes, perf sampling of Java is still going well, via -XX:+PreserveFramePointer. Our one microservice that had high overhead with that option (up to 10%, which is rare) has improved its code, lowered the overhead, and has now enabled this option by default.

But there's also a newer way that can do Java stack sampling from perf without the frame pointer: https://github.com/jvm-profiling-tools/async-profiler

We're not running it yet. I want to try it out. Note that the stacks with async-profiler are a bit broken -- Java methods become detached from the JVM -- but I'm hoping that's fixable (it should be with a JVM change, at least).


All the native linux thread stack visualizers I have used so far lack a way to show single-threaded bottlenecks in multi-threaded applications. Does this solve it?

In the java ecosystem this is solved with utilization timelines.


I don't know which of the normal parallel performance tools support Java, but at least TAU <https://www.cs.uoregon.edu/research/tau/docs.php> does; not that I've worked on Java, which should preferably be kept well away from compute nodes.


Not directly related, but all of your posts on this subject contain broken links - please add spaces between your link and your delimiter, if you are going to use a delimiter, as doing <link> makes the > part of the link, requiring manual removal.


Oh, my question wasn't about java. Quite the opposite. I'm missing the perf equivalent of some java tools I am accustomed to.


I don't know what "perf equivalent" means. Parallel performance tools usually use PAPI, which provides similar data, possibly a superset of perf data on Linux. Score-P has an actual perf plugin. http://www.vi-hps.org/training/material/ is one source of material on such things, though biased towards distributed (MPI) systems and OpenMP threading.

[Sorry, I hadn't realized <> doesn't work as a URL delimiter here, as I'm used to elsewhere since the RFC <URL:...> convention got ignored.]


Pretty awesome - I recall a long time ago a few companies selling software that would do similar (but different) things, so this can help us get closer.


Given the existence of tools like this or https://github.com/uber/pyflame, I get really amazed at the monopoly of NewRelic. Nobody seems to be even attempting a start-up in this space, which is amazing given the amount of money that everyone coughs up for NewRelic.


What was the root cause of the problem that motivated this visualization tool?


It's described in the blog post under "Origin", and the sample profile that we ship with FlameScope (examples/perf.stacks01) and which I used in the video is the first minute of the original problem.

It was intermittent latency that was hard to explain using existing tools.


Can this be used with Node.js CPU Profiles?


Yes, I've already used it on Node.js profiles. :) Get any instructions for doing CPU flame graphs with Linux perf (I've written several), and take the output stage of "perf script" and you can browse it in FlameScope.
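Something like this, for example (a sketch of the usual perf steps driven from Python; the sample rate, duration, and output filename are just examples, and perf record generally needs root):

  import subprocess

  def capture_stacks(seconds=120, hz=49, out="examples/myapp.stacks"):
      """Sample all CPUs with perf, then dump the stacks as text so
      FlameScope can pick the file up from its examples directory."""
      subprocess.run(["perf", "record", "-F", str(hz), "-a", "-g",
                      "--", "sleep", str(seconds)], check=True)
      with open(out, "w") as f:
          subprocess.run(["perf", "script", "--header"], stdout=f, check=True)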


Depends on what you mean by Node.JS CPU profiles: if captured by `perf script`, then yes; otherwise, not yet.


I wish they wouldn't host this on Medium; the pop-ups and layout are maddening, and it almost seems unprofessional for a giant company like Netflix to use Medium instead of its own blog platform on its website.


Yes, Medium has a terrible reading experience. I found this extension (https://github.com/thebaer/MMRA) a while ago. You should try it.


I keep reading comments like this, but I don't ever see any particularly glaring issues when reading from my iPad. Is it a desktop/Chrome thing that iOS/Safari automatically mitigates, or are my standards just so low that I don't notice the issues?


I repeatedly highlight wherever I'm at to be able to focus on a particular part of the text and keep track of it.

Medium automatically adds a tooltip so I can share whatever I highlighted to Twitter/Facebook.

It's an accessibility issue, where they are privileging hypothetical social sharing of content over readers who actually need to consume the content.



Maybe they don’t do it on mobile but on desktop they have a pop-up demanding you create an account to keep reading.


Even when I open the link in an incognito window, I don't see this pop-up. Do you need to be on the page for X time to see it?


Second request in a certain period, I think. I opened this link twice. The second time I got the pop-up, which starts out with something like "We've seen you here before...". I just tried a third time and it did not create a popup.

As things go, I think they're actually handling that part fairly well.


Pretty sure it looks for an existing cookie to pop it up.


answer my question (in the other thread)...


Are you really stalking me to get me to respond to a thread I participated in two days ago? That’s odd.


Would have emailed you if you had one in your bio ;)


Some employers will block Netflix, which is one reason not to put their technical blogging on their own site.


https://outline.com/hpckBV

You lose the images, though, so it's not that helpful in this situation, but it's usually good for Medium.



