

Linux kernel performance: Flame Graphs - brendangregg
http://dtrace.org/blogs/brendan/2012/03/17/linux-kernel-performance-flame-graphs/

======
nitrogen
You know you've written a great article when it has tons of points on HN with
zero comments...

Having used a few profiling tools in the past (such as gprof, valgrind's
callgrind/cachegrind through kcachegrind, and of course clock_gettime()), I
think flame graphs are the best visualization of that kind of data I've seen
so far. I wonder what other performance visualizations exist that I haven't
seen. I'd love to see a 3D flame graph that takes multiple samples over time,
probably with overlapping windows.

------
nosequel
If you haven't read anything else from Brendan Gregg before, do yourself a
favor and bookmark his blog. I've learned more from his blog than anyone
else's.

------
SkyMarshal
Very cool. Exec Summary:

 _Conclusion

With the Flame Graph visualization, CPU time in the Linux kernel can be
quickly understood and inspected. In this post, I showed Flame Graphs for
different workloads: networking, file system I/O, and process execution. As a
SVG in the browser, they can be navigated with the mouse to inspect element
details, revealing percentages so that performance issues or tuning efforts
can be quantified.

I used perf_events and SystemTap to sample stack traces, one task out of many
that these powerful tools can do. It shouldn’t be too hard to use oprofile to
provide the data for Flame Graphs as well._

<https://github.com/brendangregg/FlameGraph>

------
xtacy
Interesting visualization, but it may be a tiny bit misleading. My first
thought was that the larger the horizontal bar, the more time spent in that
function, but that's not the case. Rather, it captures the number of times the
function was seen in some execution path.

So if you have:

    f():
        do_less_work()
        do_large_work()

f()'s width is inflated even though it's do_large_work() that does most of the
work.

The actual execution time of a function itself is its x-width minus the sum
of all its children's x-widths.
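
To make that subtraction concrete, here is a minimal sketch in Python. It assumes the samples are in a "folded" form, one line per unique stack with frames separated by semicolons and a trailing sample count; the sample counts and the helper name `widths` are made up for illustration and are not taken from the article.

```python
from collections import Counter


def widths(folded_lines):
    """Return (inclusive, self_time) sample counts per function.

    Each input line looks like 'frame1;frame2;leaf count'.
    """
    inclusive = Counter()
    self_time = Counter()
    for line in folded_lines:
        stack, count = line.rsplit(" ", 1)
        frames = stack.split(";")
        n = int(count)
        # Every distinct function on the stack was "seen" in these samples,
        # so each one gets the full count added to its drawn width.
        for name in set(frames):
            inclusive[name] += n
        # Only the last (leaf) frame was actually on the CPU.
        self_time[frames[-1]] += n
    return inclusive, self_time


# Hypothetical sample counts for the f() example above.
samples = [
    "f 5",
    "f;do_less_work 10",
    "f;do_large_work 85",
]
inclusive, self_time = widths(samples)
for name in ("f", "do_less_work", "do_large_work"):
    print(name, inclusive[name], self_time[name])
```

With these made-up numbers f() is drawn 100 samples wide, but only 5 of those samples are f() itself: 100 minus the 10 and 85 of its two children.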

