
How to find per-process I/O statistics on Linux - jawngee
http://www.xaprb.com/blog/2009/08/23/how-to-find-per-process-io-statistics-on-linux/
======
nailer
Shorter method: run 'iotop'.

This tool is packaged in every current Linux OS.

If you want to change IO priority, run ionice.
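
If you would rather script it than watch a screen, the same kind of per-process counters can be read straight from /proc. A minimal sketch in Python, assuming a Linux kernel that exposes /proc/<pid>/io and that you have permission to read it. The function names are mine, not part of any tool:

```python
from pathlib import Path


def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            stats[key.strip()] = int(value)
    return stats


def read_proc_io(pid="self"):
    """Return the I/O counters for a process, or None if unavailable."""
    try:
        return parse_proc_io(Path(f"/proc/{pid}/io").read_text())
    except OSError:
        # File missing (kernel built without I/O accounting, or not Linux)
        # or not readable by the current user.
        return None


if __name__ == "__main__":
    print(read_proc_io())
```

The counters are cumulative, so to get a rate you would sample twice and take the difference.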

~~~
Dobbs
I call your bluff. 'iotop' may be included in many Linux distros, but don't
say every. My Arch box doesn't have it, not by default anyway.

edit: What I'm trying to say is that your statement is overly broad. Not all
Linux distros have 'iotop' installed. It is not guaranteed, particularly on
specialized distros.

~~~
jacquesm
"Not by default" for most distros means it is an

apt-get install iotop

or

yum install iotop

away. That took about 6 seconds on the two machines I tried it on. So even if
it is not installed by default, that is hardly a hindrance.

~~~
blasdel
iotop is a very thin ncurses frontend to taskstats in the linux kernel, and is
totally dependent on TASK_DELAY_ACCT and TASK_IO_ACCOUNTING being enabled when
the kernel is built.

They were introduced in 2.6.20, so the stupidly conservative distros that have
been frozen for years on 2.6.18 or 2.6.19 don't get to play.

On nice distros that don't force-feed you their kernel and litter their repos
with broken-out kernel modules, installing something like iotop is not
necessarily just a call to the package manager.
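
One way to check whether a given kernel was built with those two options is to look at its build config. A rough sketch in Python, assuming the config is readable at /boot/config-<release> or /proc/config.gz (neither location is guaranteed to exist on every system). The helper names are made up for illustration:

```python
import gzip
import platform
from pathlib import Path

REQUIRED = ("CONFIG_TASK_DELAY_ACCT", "CONFIG_TASK_IO_ACCOUNTING")


def enabled_options(config_text, names=REQUIRED):
    """Map each option name to True if the config text sets it to 'y'."""
    set_lines = {line.strip() for line in config_text.splitlines()}
    return {name: f"{name}=y" in set_lines for name in names}


def load_kernel_config():
    """Return the running kernel's config text, or None if not found."""
    boot = Path(f"/boot/config-{platform.release()}")
    proc = Path("/proc/config.gz")
    try:
        if boot.exists():
            return boot.read_text()
        if proc.exists():
            return gzip.decompress(proc.read_bytes()).decode()
    except OSError:
        pass
    return None


if __name__ == "__main__":
    text = load_kernel_config()
    print(enabled_options(text) if text is not None else "kernel config not found")
```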

~~~
nailer
'litter their repos with broken-out kernel modules'

You don't seem to understand why modules exist. It doesn't matter how many
modules are available, they won't be loaded unless they're needed. E.g., for PCI
hardware, a driver is loaded only if you have hardware that modules.pcimap
matches to it.
There is no overhead from having modules available to load, just extra
convenience next time you, say, add a NIC or somesuch.

Also no distro force-feeds you a kernel. You're always able to build your own
in the rare event that you need to, or the more likely event that you just
feel interested in doing so.

For a business, most people can understand the benefit of using the same
software that a few million others do.

~~~
blasdel
When I said 'broken-out kernel modules', I was referring to modules that are
distributed as independent packages in the repository, instead of just sitting
in /lib/modules/$(uname -r)

I know exactly why modules exist, having written my own several times. The
real benefit for stuff distributed with the mainline kernel is not runtime
loading (you could just build them all statically), but _unloading_ and
_reloading_.

------
sprachspiel
Does anyone have an idea whether there is any way to distinguish random from
sequential IO?

In my experience sequential IO is never the problem. Instead, it seems to me
that random seeks are really the only performance problem nowadays. In the
most extreme case throughput is only ~100Bits/s instead of ~100MB/s.
Unfortunately random seeks are hidden behind abstraction layers and thus quite
invisible to programmers (until the system freezes). Maybe we just need to
wait for SSDs to become cheap.
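
If you can get hold of the (offset, length) of each request in the order it was issued, for example from a block-level trace, one crude way to tell the two apart is to count how often a request starts exactly where the previous one ended. A minimal sketch in Python, assuming you already have that list. Collecting the trace is not shown, and the function name is hypothetical:

```python
def sequential_fraction(accesses):
    """Fraction of accesses that start exactly where the previous one ended.

    `accesses` is an iterable of (offset, length) pairs in issue order.
    Returns 1.0 when there are fewer than two accesses to compare.
    """
    accesses = list(accesses)
    if len(accesses) < 2:
        return 1.0
    sequential = 0
    for (prev_off, prev_len), (off, _) in zip(accesses, accesses[1:]):
        if off == prev_off + prev_len:
            sequential += 1
    return sequential / (len(accesses) - 1)


if __name__ == "__main__":
    # Three back-to-back 4 KiB requests, then a jump elsewhere on the device.
    trace = [(0, 4096), (4096, 4096), (8192, 4096), (10_000_000, 4096)]
    print(sequential_fraction(trace))
```

A value near 1 suggests a mostly sequential workload and a value near 0 a mostly random one. This ignores interleaving between processes and anything the I/O scheduler reorders, so treat it as a rough indicator only.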

~~~
gnaritas
SSDs are cheap.

~~~
blasdel
...and cheap SSDs are _pathologically bad_ at random writes -- at least an
order of magnitude _worse_ than a mediocre hard drive on average, with
regular latency spikes of _several seconds_!

~~~
wmf
When measured in $/IOPS, even good SSDs are cheap. I think the real problem is
the lack of caching software (but I would think that).

