
Where has my disk space gone? Flame graphs for file systems - brendangregg
http://www.brendangregg.com/blog/2017-02-05/file-system-flame-graph.html
======
ajnin
A flame graph does not feel like the best visualization for this kind of data.
You can't really make anything of the top spikes and there is a lot of blank
space that is wasted. I prefer tree maps as they make more efficient use of
the available screen space.

My favorite tool for this is SpaceMonger v1.4.0
([http://www.aplusfreeware.com/categories/LFWV/SpaceMonger.htm...](http://www.aplusfreeware.com/categories/LFWV/SpaceMonger.html)),
which has a very neat layout algorithm. It's a Windows app but it works OK
using Wine with only a few minor graphic glitches.

~~~
jamesrom
On Windows, WinDirStat is a great tool.
[https://windirstat.net/](https://windirstat.net/)

~~~
kraftman
And ADirStat for Android
[https://play.google.com/store/apps/details?id=org.flightofst...](https://play.google.com/store/apps/details?id=org.flightofstairs.ADirStat)

~~~
digi_owl
I'm a sucker for DiskUsage myself.

[https://play.google.com/store/apps/details?id=com.google.and...](https://play.google.com/store/apps/details?id=com.google.android.diskusage)

------
dewey
I usually recommend
[https://dev.yorhel.nl/ncdu/scr](https://dev.yorhel.nl/ncdu/scr) for this
purpose. Doesn't look as colorful but the output is basically the same. The
SVG output is neat though!

~~~
brendangregg
Nice, but the output isn't the same. A flame graph can show multiple levels of
subdirectories at the same time, proportionally sized to their total bytes, up
to a maximum of the screen height (say, 24). Looks like ncdu can only show 1
directory level at a time.

~~~
vthriller
The thing with flame graphs, pie charts and all that is that I rarely find it
that useful to see multiple directory levels at the same time. Unless it is
/var/log/omg-why-dont-you-rotate that occupies hundreds of gigabytes, I always
end up navigating through the largest near-the-root folders with baobab just
to make room for subdirectories on the screen, which is hardly different from
the typical ncdu (or even 'du -hs *, then cd') workflow.

------
m1el
I had this exact idea almost a year ago:

[https://github.com/m1el/du-flamegraph](https://github.com/m1el/du-flamegraph)

    
    
        find "$1" -type f | xargs -n32 -d'\n' -- du -ab \
        | awk -F'\t' '{print $2" "$1;}' | sed 's#/##;s#/#;#g' \
        | flamegraph.pl --title "Disk usage" --countname "bytes" \
          --nametype "file" --width 1900 --minwidth 0.05
    

... and tested it using linux sources:

[http://m1el.github.io/du-flamegraph/linux-sources.svg](http://m1el.github.io/du-flamegraph/linux-sources.svg)

[http://m1el.github.io/du-flamegraph/illumos-gate.svg](http://m1el.github.io/du-flamegraph/illumos-gate.svg)

[http://m1el.github.io/du-flamegraph/dota2.svg](http://m1el.github.io/du-flamegraph/dota2.svg)

~~~
brendangregg
I've done this before with awk, but with "find ... -ls". I don't know why you
want to use du with xargs (is it really to get du's measure of bytes rather
than the stat's count, for when they are different?) -- as that's calling over
a thousand du's to walk the linux tree, when find can print bytes built-in
with -ls (much more efficient).
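
For illustration only, here is a minimal Python sketch of the single-walk idea: visit each file once, take the byte count from its stat record, and print the folded `dir;subdir;file bytes` lines that flamegraph.pl reads. The function names `folded_line` and `walk_folded` are hypothetical and not taken from either script in this thread. It reports `st_size` (the stat byte count), so it can differ from du's measure when the two disagree.

```python
import os
import stat


def folded_line(path, size):
    """Turn a path such as 'a/b/c' plus a byte count into 'a;b;c 123'."""
    parts = [p for p in path.split(os.sep) if p not in ("", ".")]
    return "{} {}".format(";".join(parts), size)


def walk_folded(root):
    """Yield one folded line per regular file under root, in a single walk."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.lstat(full)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode):
                yield folded_line(full, st.st_size)
```

Printing every line from `walk_folded("linux-4.9-rc5")` and piping the result into flamegraph.pl with the options shown above should give a comparable graph without starting any du processes.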

~~~
m1el
Well, this could be explained by my ignorance of *nix tools.

------
throwawayish
KDE: Filelight,
[https://docs.kde.org/trunk5/en/kdeutils/filelight/introducti...](https://docs.kde.org/trunk5/en/kdeutils/filelight/introduction.html)

Windows: WinDirStat

~~~
janus24
Mac: Disk Inventory X [http://www.derlien.com/](http://www.derlien.com/)

~~~
joe_developer
Download link is broken.

Alternative ways to download:

a) Use brew: brew install Caskroom/cask/disk-inventory-x

b) Another download link (same link used in brew/cask)[0]:
[http://www.derlien.com/diskinventoryx/downloads/dev/DIX1.0Un...](http://www.derlien.com/diskinventoryx/downloads/dev/DIX1.0Universal.dmg)

0: [https://github.com/caskroom/homebrew-cask/blob/master/Casks/...](https://github.com/caskroom/homebrew-cask/blob/master/Casks/disk-inventory-x.rb)

------
jackyinger
For Linux, Baobab is a great utility like this. It does the circular equivalent
of a flame graph, though, which can be much easier to read as the circumference
of deeper directories gets larger by nature.

~~~
CarVac
And the name is amazing (a Little Prince reference).

~~~
grzm
I'd love for this to be true. Do you have a reference for this? Baobab trees
are spectacular in their own right. Might the usage in the Little Prince be
just a coincidence?

~~~
CarVac
Sorry, that's just my assumption (though I'd be willing to bet on it). The
circular graph reminds me of nothing more than the image of baobabs swallowing
up a miniature planet.

------
brendangregg
After all the comments on treemaps and sunburst layouts, I think many readers
might enjoy the following article if they haven't seen it already: "A Tour
through the Visualization Zoo", by Jeffrey Heer, Michael Bostock, and Vadim
Ogievetsky.[1] See the section on Hierarchies.

In that article, the icicle graph (same as the flame graph) has long
rectangles suitable for labeling -- although their example has the font
rotated 90 degrees and not making the best use of that space.

But as you can see, these are all related. Hierarchy visualizations. If I were
serious about building a tool to do disk space visualizations, I'd let the
user toggle between all three visualizations.

I should also note that I think both treemaps and sunburst layouts have their
own cons, which I've mentioned here on HN.

[1]
[http://queue.acm.org/detail.cfm?id=1805128](http://queue.acm.org/detail.cfm?id=1805128)

------
jonny_eh
Very nice! On the Mac I recommend GrandPerspective:
[http://grandperspectiv.sourceforge.net/](http://grandperspectiv.sourceforge.net/)

~~~
wiiittttt
For Windows: [https://windirstat.net/](https://windirstat.net/)

~~~
hug
I prefer Wiztree (link below), especially on systems with much larger disks.
Since Wiztree works by scanning the MFT it's about an order of magnitude
faster than Windirstat or Treesize.

[http://antibody-software.com/web/software/software/wiztree-f...](http://antibody-software.com/web/software/software/wiztree-finds-the-files-and-folders-using-the-most-disk-space-on-your-hard-drive/)

~~~
kermire
Hey, thanks for this. Was using WinDirStat. This is really fast! Plus I like
the minimalistic interface.

------
falsedan
When you have a hammer, you hammer out a reimplementation of `ncdu`.

I'm a die-hard fan of `du -kx dir | sort -rn | less`.

~~~
e12e
I prefer "du -h|sort -h" (for new versions of GNU sort that support sorting
"human readable" sizes).

~~~
falsedan
Oh nice, time to update the muscle-memory firmware.

------
chronial
This tool for windows is not known well enough:
[http://www.steffengerlach.de/freeware/](http://www.steffengerlach.de/freeware/)

Screenshot:
[http://www.steffengerlach.de/freeware/scnshot.gif](http://www.steffengerlach.de/freeware/scnshot.gif)

------
LeifCarrotson
The height of the flame in this graph does not seem very informative.

In the example, the 3.88% used by the tower of
"linux-4.9-rc5/drivers/gpu/drm/amd/include" looks larger than
"linux-4.9-rc5/net" over on the right just because the tree is deeper.
Similarly, on my Windows machine at work,
"C:\Users\LeifCarrotson\AppData\Local\Microsoft\Outlook\Leif@Carrotson.com.pst"
(where my email is stored) would take up more screen space than
"C:\hiberfil.sys", just because it's taller.

I don't much care about the depth of the filesystem. The various tree views
seem more useful.

------
Laforet
There exists a 4 year old bug in Ubuntu LTS releases (not sure whether
mainline Debian is affected too but it is very likely) in which kernel
upgrades fail to remove old headers. Because of the way headers are structured it
is possible to run out of inodes long before free space is exhausted if you
don't pay attention to inode use.
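
A rough way to keep an eye on this from a script, sketched in Python. The helper names `inode_percent` and `inode_usage` are made up for the example; `os.statvfs` is the standard call behind them (Unix only), and some filesystems report zero total inodes, which the sketch treats as 0% used.

```python
import os


def inode_percent(total, free):
    """Percentage of inodes in use, or 0.0 when the filesystem reports none."""
    if total <= 0:
        return 0.0
    return 100.0 * (total - free) / total


def inode_usage(path="/"):
    """Return (used, total, percent_used) for the filesystem holding path."""
    st = os.statvfs(path)
    total = st.f_files  # total inodes on the filesystem
    free = st.f_ffree   # free inodes
    return total - free, total, inode_percent(total, free)
```

Calling `inode_usage("/usr")` next to a normal free-space check shows whether the inode table is filling up faster than the blocks are.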

[https://bugs.launchpad.net/ubuntu/+source/update-manager/+bu...](https://bugs.launchpad.net/ubuntu/+source/update-manager/+bug/1089195)

------
adambrenecki
There's a bunch of other tools[1][2] that do this sort of thing already,
although all the ones I've seen display file size as a sort of flame graph/pie
chart hybrid (imagine a flame graph wrapped around a circle).

This flat representation is probably better because it doesn't exaggerate the
size of deeply nested files, but I find the example in the article a bit
harder to read.

[1]: [https://daisydiskapp.com/](https://daisydiskapp.com/)

[2]: [http://www.jgoodies.com/freeware/jdiskreport/](http://www.jgoodies.com/freeware/jdiskreport/)

~~~
brendangregg
You're right, the flame graph (which is really an adjacency diagram with an
inverted icicle layout) doesn't exaggerate subdirectories like a sunburst
layout does. I wrote about problems with the sunburst layout before in
ACMQ[1]:

> The sunburst layout is equivalent to the icicle layout as used by flame
> graphs, but it uses polar coordinates.7 While this can generate interesting
> shapes, there are some difficulties: function names are harder to draw and
> read from sunburst slices than they are in the rectangular flame-graph
> boxes. Also, comparing two functions becomes a matter of comparing two
> angles rather than two line lengths, which has been evaluated as a more
> difficult perceptual task.10

(I should have mentioned that they visually exaggerate deeper slices, too). I
think they are pretty but more difficult to read.
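
To put a number on the exaggeration, here is a small sketch under simplifying assumptions of my own: every ring has the same width, and the centre disc counts as depth 0. The function names are hypothetical. A slice covering the same fraction of the total then grows in drawn area by a factor of 2d+1 at depth d in a sunburst, while the icicle or flame graph rectangle stays the same size at every depth.

```python
import math


def sunburst_area(fraction, depth, ring_width=1.0):
    """Area of a sunburst slice covering `fraction` of the circle at `depth`."""
    inner = depth * ring_width
    outer = (depth + 1) * ring_width
    return fraction * math.pi * (outer ** 2 - inner ** 2)


def icicle_area(fraction, depth, total_width=1.0, row_height=1.0):
    """Area of the equivalent flame graph box; depth does not affect it."""
    return fraction * total_width * row_height


# A directory holding 10% of the bytes, drawn at depth 0 and at depth 3.
ratio = sunburst_area(0.1, 3) / sunburst_area(0.1, 0)
print(round(ratio, 6))  # 7.0, i.e. 2*3 + 1
```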

The other app has a pie chart and trees. Neither can visually show everything,
all subdirectories, at once.

[1]
[http://queue.acm.org/detail.cfm?id=2927301](http://queue.acm.org/detail.cfm?id=2927301)

~~~
masklinn
I agree with that criticism with respect to profiling, but for filesystems I
find the increased surfaces of the outer rings makes the evaluation easier as
the filesystem reaches smaller amounts of data, especially as FS tend to be
_relatively shallow_ in POI terms.

~~~
dkersten
The problem is that humans are not very good at comparing angles, which is one
of the reasons why pie and donut charts aren't good (and sunburst is kinda
like a donut, so the same criticism applies).

------
jaimex2
I recommend Baobab, it's been around for ages and is on pretty much every
distro's repo already.

Shows you a radial chart of your disk you can explore and zoom into and even
better lets you delete directly from its ui.

------
cogs
We built Crab so we could run SQL queries over the filesystem. It's for Win
and Mac, free for personal use, and lets you slice this any way you want and
plot the data how you like. I like the flexibility.

[http://etia.co.uk/win/about/](http://etia.co.uk/win/about/)

e.g.

    
    
      SELECT extension, sum(bytes) FROM files
      GROUP BY extension 
      ORDER BY sum(bytes) DESC;
    

You can GROUP BY any file attributes
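
For readers without Crab, the same grouping can be approximated in a few lines of Python. This is only a sketch of the idea, not how Crab works: `bytes_by_extension` is a hypothetical helper, extensions are lower-cased as a choice made for the example, and files without an extension are grouped under the empty string.

```python
import os
from collections import Counter


def bytes_by_extension(root):
    """Return (extension, total_bytes) pairs under root, largest first."""
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # unreadable or vanished file
            extension = os.path.splitext(name)[1].lower()
            totals[extension] += size
    return totals.most_common()
```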

------
drinkjuice
Another oldie but goldie for Windows:

[http://www.win.tue.nl/sequoiaview/](http://www.win.tue.nl/sequoiaview/)

------
theoh
There was a simple X application called "xdu" from the 90s that did this job
adequately.

At the other end of the practicality scale was the SGI tool fsn, recreated in
open source by this:
[https://en.m.wikipedia.org/wiki/File_System_Visualizer](https://en.m.wikipedia.org/wiki/File_System_Visualizer)

~~~
ailideex
I use this ...
[http://xdiskusage.sourceforge.net/](http://xdiskusage.sourceforge.net/)

~~~
amelius
Me too.

------
chris_wot
A bit off topic, but Windows needs this baked in. And Windows needs hard links
to update ALL links when the underlying file changes, because right now if you
update a file from one hardlinked file the other hardlinked files don't change
their metadata.

Quite ridiculous, it's not like this hasn't been something Unix has been doing
since almost the beginning!

Right now we have a ridiculous situation where the winsxs folder gets out of
sync with the c:\windows\system32 folder. There is nothing treeview or any other
utility can do about it either. And until recently that winsxs store
was holding gigabytes of old and useless updates, because Microsoft's updates
never removed these components (recently, last year some time I think, they
updated the disk cleanup GUI to delete this stuff from Windows 7 upwards. IMO
they recognised a big stuff up caused by their product management team's
decision, which in turn caused this unheralded enhancement).

~~~
ghostly_s
Doesn't Windows still lack the ability to do several basic file-system
operations without resorting to goddamn DISKPART? FS management on that
platform in general is a disaster.

------
brendangregg
I posted a follow up:
[http://www.brendangregg.com/blog/2017-02-06/flamegraphs-vs-t...](http://www.brendangregg.com/blog/2017-02-06/flamegraphs-vs-treemaps-vs-sunburst.html)

------
beeswax
Can anyone recommend a tool for macOS that monitors disk usage changes over
time?

I usually use Disk Inventory X but I'd really like to correlate usage
increases to specific dates / app installs, so it'd be nice to see stats over
time, e.g:

- Installed Android Studio on Feb 1st: Usage in /Applications increased by
850MB, usage in User folder increased by 10G (450MB for
android-studio-2.x.dmg, 8.4GB in /Users/name/.android, largest leaf in
/Users/name/.android/sdk etc)

I tried to do this with the 'du' tools once but simply writing the current
output to disk would take ages and diffs would need some heavy lifting to make
sense of.
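
One possible shape for this, sketched in Python rather than with du: record a total per directory on each run, then compare two snapshots and list only the directories that changed. The names `dir_totals` and `diff_totals` are hypothetical, and sizes are `st_size` byte counts, so they will not exactly match du. Snapshots are plain dicts, so they could be saved with the `json` module, one file per date.

```python
import os


def dir_totals(root):
    """Map each directory under root to the total bytes of files beneath it."""
    totals = {}
    # Bottom-up walk: children are summed before their parent is visited.
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        size = 0
        for name in filenames:
            try:
                size += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                pass  # unreadable or vanished file
        for name in dirnames:
            # Symlinked directories are not walked, so they count as 0.
            size += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = size
    return totals


def diff_totals(before, after, min_change=1):
    """Return (delta_bytes, path) pairs between snapshots, largest growth first."""
    changes = []
    for path in set(before) | set(after):
        delta = after.get(path, 0) - before.get(path, 0)
        if abs(delta) >= min_change:
            changes.append((delta, path))
    return sorted(changes, reverse=True)
```

Raising `min_change` to something like 100 MB keeps the diff short enough to read after an install.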

------
Sukram21
For visualizing your Dropbox' contents as a pie chart, a friend and I
developed dgraph:
[https://github.com/joplapp/dgraph](https://github.com/joplapp/dgraph)

~~~
47_
wow that looks nice, I've always wondered why there isn't some tool like that

------
dioltas
This is really cool. I really like using flame graphs for CPU profiling
too.

For disk usage I would normally use du and/or filelight which is also great.

This is a nice way to visualise all the sub directories too in one go though.

------
maxxxxx
On Windows I like Treesize
[http://www.jam-software.de/treesize_free/?language=EN](http://www.jam-software.de/treesize_free/?language=EN)

------
annnnd
Nice - ideal use of flame graphs. When you are running out of disk space, you
want to know which parts are using what proportion of disk space and IMHO this
visualization is perfect for that. Kudos!

On a side note, when I'm on a server with disk space problems, I usually debug
it like this:

    
    
        # cd /
        # du -sm * | sort -n | tail
          (lists the biggest space users)
        # cd <unusually big subdir>
          (goto step 2: du -sm...)
    

Works like charm, but can be a bit slow when applied to large and slow disks.

------
nobrains
It's nice that you created something on your own.

On Android I use DiskUsage:
[https://play.google.com/store/apps/details?id=com.google.and...](https://play.google.com/store/apps/details?id=com.google.android.diskusage)

Screenshot: [http://imgur.com/CrVcdEk](http://imgur.com/CrVcdEk)

------
djaychela
On Windows, I've always used HDGraph[1], but as no one has suggested it I'm
wondering if I'm missing something important as to why not? I've never really
liked the appearance of WinDirStat, but that's just a personal niggle, I know
it works well.

[1] [http://www.hdgraph.com](http://www.hdgraph.com)

------
partycoder
For me a tree map is clearer and more screen-space efficient than a
flamegraph.

qcachegrind renders profiling output as tree maps, and for filesystems there
are many alternatives, e.g: Baobab (aka Disk usage analyzer) does this, also
KDirStat.

------
gjkood
I am partial to DaisyDisk on OSX. Nice UI. It comes in very handy on the lower
capacity Flash Disks (64GB) on the early generation Macbook Airs. Needs
constant housekeeping on those.

------
akerro
KDE has this thing built-in, I think it's called kdirstat and you can get to
it by clicking on directory size (bottom right corner) in Dolphin.

~~~
pbhjpbhj
Wow - I have an add-on to add "open with k4dirstat" to the right-click menu.
Have used KDE5 since it came out and did not know about that. There is no
affordance there at all; it doesn't even highlight on mouse-over.

I get a menu with options for not-installed apps - filelight, kdiskfree and
two partition tools. 'sudo apt install filelight' and it works, but TBH I'm
not keen on filelight, would be nice to have k4dirstat as an option there
(sounds like a bug report is due).

------
ubercow
This is beautiful. Great stuff. I've always had tools on desktops to do this,
but never anything I could just run on a server.

------
hrjet
Interesting that "drm" occupies almost all of the "drivers/gpu" space!

~~~
Pengwin
Well in this case DRM means Direct Rendering Manager
([https://en.wikipedia.org/wiki/Direct_Rendering_Manager](https://en.wikipedia.org/wiki/Direct_Rendering_Manager))
What I find even more interesting is AMD takes over half of that.

------
emodendroket
I always liked WinDirStat.

------
jwatte
WinDirStat for the win!

------
ghostly_s
Conveniently, I no longer need to delve into any of these visualization tools,
because the answer to the question in the title is always "Debian's overly-lax
default logrotate.d"

------
foxhop

        cd / && du -sh *

~~~
brendangregg
I'd try that first, and if it didn't find my space, then flame graphs.

du's output does the first level of directories, but not subdirectories, and
also requires reading of text rather than visually comparing line lengths
(easier).

~~~
foxhop
You can quickly move toward the problem and continue to run 'du'.

~~~
brendangregg
If I'm at the command line already, that may well work.

Or, it may not (the initial problem I had, I already knew the high level
breakdowns, and was hunting for wasted 1%'s here and there -- the flame graph
made it easy to spy everything at once).

Another use of the flame graph approach is with automated build software.
Imagine automatically generating one with every linux version, to keep track
of where growth is.

