

UNIX Load Average Part 1: How It Works - helwr
http://www.teamquest.com/resources/gunther/display/5/

======
Erwin
I prefer looking at output of "sar" which shows you nicely, in 10-minute
increments, how idle the system was today (and I think usually this data
captured is rotated daily for a month), and gives you also a good idea of
whether you have processes waiting excessively for IO.

It also has a bunch of other options.

On my own system, I also generally run "ps auwx" every 15 minutes, together
with a scan of what queries Postgres servers are doing and dump of web request
activity (read last 10k lines from access logs, find out how long ago the
first request was to determine rough hits per second and areas of the application
they hit). That way when someone says "hey, the system was slow around this
time" I can go back and find out that some cron job had a dozen processes
taking up tons of memory or blocking on IO.

Some of those statistics also go into some RRD-based system which makes it
easier to follow e.g. number of users logged in or number of Apache children
based on weekday/time of day.

~~~
nailer
sar gets its info from /proc/loadavg (on Linux OSs), just like top and uptime
do, which is produced using the same code the article shows.
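For illustration, a minimal Python sketch of the /proc/loadavg format those tools read (the sample line is made up; on a live Linux system you would read the file itself):

```python
# Fields: 1/5/15-minute load averages, runnable/total task counts, last PID.
# Sample line is an assumed example, not from a real system.
sample = "0.20 0.18 0.12 1/80 11206"

def parse_loadavg(line):
    """Return (1min, 5min, 15min, running, total, last_pid)."""
    one, five, fifteen, runq, last_pid = line.split()
    running, total = runq.split("/")
    return (float(one), float(five), float(fifteen),
            int(running), int(total), int(last_pid))

print(parse_loadavg(sample))
```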

~~~
Erwin
We must be running different versions of sar then, as "sar" by itself here
(RHEL 5) shows information about the time split between
user/system/waitIO/idle -- that certainly does not come from /proc/loadavg.

If you run "sar -q" you could get the load average information, but that's not
particularly useful, as you can't see whether the 20 load avg an hour ago was
caused by heavy disk IO or a dozen CPU bound processes.
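For reference, the user/system/iowait/idle split comes from the aggregate "cpu" line in /proc/stat. A rough Python sketch (the sample counters are made up, and real tools like sar diff two samples over an interval rather than using the cumulative since-boot counters shown here):

```python
# /proc/stat "cpu" line layout: user nice system idle iowait irq softirq ...
# Sample values are assumed for illustration.
sample = "cpu  4705 150 1120 16250 520 16 50 0 0 0"

def cpu_split(line):
    """Return percentage of total jiffies spent in each state."""
    fields = [int(x) for x in line.split()[1:]]
    user, nice, system, idle, iowait = fields[:5]
    total = sum(fields)
    pct = lambda t: 100.0 * t / total
    return {"user": pct(user + nice), "system": pct(system),
            "iowait": pct(iowait), "idle": pct(idle)}

print(cpu_split(sample))
```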

~~~
nailer
Nope, we're likely running the same version. The particular info you mentioned
comes from /proc/stat (you're right that it's a different file), but again
it's the same sources as top:

    
    
        # lsb_release -d
        Description:    Red Hat Enterprise Linux Server release 5.3 (Tikanga)
    
        # rpm -qf $(which sar)
        sysstat-7.0.2-3.el5
    
        # strace /usr/lib64/sa/sa1 1 1 &> results
    
        # grep open results                      
        open("/etc/ld.so.cache", O_RDONLY)      = 3
        open("/lib64/libtermcap.so.2", O_RDONLY) = 3
        open("/lib64/libdl.so.2", O_RDONLY)     = 3
        open("/lib64/libc.so.6", O_RDONLY)      = 3
        open("/dev/tty", O_RDWR|O_NONBLOCK)     = 3
        open("/proc/meminfo", O_RDONLY)         = 3
        open("/usr/lib64/sa/sa1", O_RDONLY)     = 3
        open("/etc/ld.so.cache", O_RDONLY)      = 3
        open("/lib64/libc.so.6", O_RDONLY)      = 3
        open("/etc/localtime", O_RDONLY)        = 3
        open("/sys/devices/system/cpu", O_RDONLY|O_NONBLOCK|O_DIRECTORY) = 3
        open("/proc/tty/driver/serial", O_RDONLY) = 3
        open("/proc/interrupts", O_RDONLY)      = 3
        open("/proc/net/dev", O_RDONLY)         = 3
        open("/proc/diskstats", O_RDONLY)       = 3
        open("/var/log/sa/sa06", O_RDWR|O_APPEND) = 3
        open("/proc/stat", O_RDONLY)            = 4
        open("/proc/meminfo", O_RDONLY)         = 4
        open("/proc/loadavg", O_RDONLY)         = 4
        open("/proc/vmstat", O_RDONLY)          = 4
        open("/proc/sys/fs/dentry-state", O_RDONLY) = 4
        open("/proc/sys/fs/file-nr", O_RDONLY)  = 4
        open("/proc/sys/fs/inode-state", O_RDONLY) = 4
        open("/proc/sys/fs/super-max", O_RDONLY) = -1 ENOENT (No such file or directory)
        open("/proc/sys/fs/dquot-max", O_RDONLY) = -1 ENOENT (No such file or directory)
        open("/proc/sys/kernel/rtsig-max", O_RDONLY) = -1 ENOENT (No such file or directory)
        open("/proc/net/sockstat", O_RDONLY)    = 4
        open("/proc/net/rpc/nfs", O_RDONLY)     = -1 ENOENT (No such file or directory)
        open("/proc/net/rpc/nfsd", O_RDONLY)    = -1 ENOENT (No such file or directory)
        open("/proc/diskstats", O_RDONLY)       = 4
        open("/proc/tty/driver/serial", O_RDONLY) = 4
        open("/proc/interrupts", O_RDONLY)      = 4
        open("/proc/net/dev", O_RDONLY)         = 4
    
        # strace top -n -b 1 &> results
        # grep open results 
        open("/etc/ld.so.cache", O_RDONLY)      = 3
        open("/lib64/libproc-3.2.7.so", O_RDONLY) = 3
        open("/usr/lib64/libncurses.so.5", O_RDONLY) = 3
        open("/lib64/libc.so.6", O_RDONLY)      = 3
        open("/lib64/libdl.so.2", O_RDONLY)     = 3
        open("/proc/stat", O_RDONLY)            = 3
        open("/proc/sys/kernel/pid_max", O_RDONLY) = 3
        open("/etc/toprc", O_RDONLY)            = -1 ENOENT (No such file or directory)
        open("/root/.toprc", O_RDONLY)          = -1 ENOENT (No such file or directory)

------
julio_the_squid
This is the second or third article I've read explaining load average, and sad
to say I still can't explain it.

All I know is that when it's inexplicably at 3-4, you can't determine why (no
processes are using high CPU), 1 in 10 database queries is taking up to 5
minutes under normal load, and that day will not be the best of your life.
Well, what I've determined is that a full storage device, or an overloaded I/O
system for your disk, can produce a very high load along with the accompanying
performance of doom.

~~~
nailer
> All I know is that when it's inexplicably over 3-4

Depends on the box. Say you:

a) have an app that spawns worker threads on demand and doesn't have a limit

b) Have a modern, cheap Nehalem X5600 CPU.

2 sockets * 6 cores * 2 threads per socket mean you're only fully realizing
the investment when you have a load average of 24 (assuming that each of these
cores is 0% idle, which you can tell with eg top).

High IO would show up as high %system time vs %user.
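The arithmetic above can be sketched in a few lines of Python (the 2/6/2 figures are the example box from this comment; on a live machine os.cpu_count() reports the kernel's view of hardware threads directly):

```python
import os

# "Fully busy" load target = total hardware threads the scheduler sees.
# The socket/core/thread counts below are the example box from the comment.
sockets, cores_per_socket, threads_per_core = 2, 6, 2
full_load = sockets * cores_per_socket * threads_per_core
print(full_load)  # 24

# On the running machine, the kernel reports this number directly.
hw_threads = os.cpu_count()
```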

~~~
nailer
'Threads per socket' should read 'threads per core' (long day).

------
patrickgzill
A couple notes:

1\. Load averages are not directly comparable across different versions of
Unix such as FreeBSD, Linux, Solaris, etc.

2\. Load average is a good way to quickly check if there is anything else to
look at.

However, a far better series of tools is sar plus iostat , vmstat etc. which
should help you more quickly determine whether your problem is CPU, disk, or
network IO .

------
mhd
So, what about threads? I've seen Linux versions where they all had the same
load as the base process, resulting in a huge load in a multi-threaded server
application.

~~~
spudlyo
In Linux, threads shown in 'top' (with thread mode 'H' on) in state 'R' or
'D' are counted in the load average. You'll definitely see this in programs
with a ton of threads, like MySQL or the JVM.

If you're not in thread mode in top, this is why you will sometimes see a
process consuming > 100% CPU usage.

------
petercooper
Unnecessarily rewritten headline.

------
pinko
As far as I can tell, this article doesn't mention the most frequent problem
people have comprehending high load averages on Linux: the inclusion of
processes in uninterruptible sleep. This can result in a system with virtually
no CPU load reporting a very high load average. (And, IMHO, makes the load
average much less useful a metric than a pure running/run-queue-based method.)
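The 'D' (uninterruptible sleep) state being described is field 3 of /proc/&lt;pid&gt;/stat on Linux. A small Python sketch of how you could count the processes contributing to load (the sample stat lines are made up; on a real system you would glob /proc/[0-9]*/stat):

```python
# On Linux, load average counts tasks in state 'R' (runnable) and
# 'D' (uninterruptible sleep). Sample lines below are assumed examples.
sample_stats = [
    "1234 (nfs_client) D 1 1234 1234 0 -1 4194560",
    "5678 (python) R 1 5678 5678 0 -1 4194560",
    "9012 (bash) S 1 9012 9012 0 -1 4194560",
]

def proc_state(stat_line):
    # The comm field is parenthesised and may contain spaces, so split
    # after the closing paren rather than on whitespace alone.
    return stat_line.rsplit(")", 1)[1].split()[0]

contributing = [s for s in sample_stats if proc_state(s) in ("R", "D")]
print(len(contributing))  # 2: the D-state and R-state tasks, not the S one
```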

------
barlo
Awesome article with great detail and explanation. I've always been curious
about how load averages are calculated and what exactly they mean.

~~~
spudlyo
tl;dr version:

The average number of threads/processes in the run queue and/or blocked on
disk i/o sampled at 1, 5, and 15 minutes.

------
pak
How about an article about measuring memory usage?

From what I've read, measuring the actual total memory consumption of an
individual process is nontrivial on both Mac OS X and Linux, because of the
way the stats are generated for things like ps, top, etc. and the way both
kernels share memory between processes whenever possible.

------
geoffc
My rule of thumb is that the end users of a web application will perceive that
the system is "slow" if the 1 minute load average is above 4 on the
application or database server.

~~~
seiji
Huh? What if I have a 48 core AMD server? A fully CPU-bound load average of 48
would be great.

Load average can tell you if the system is under-used, but it can't tell you
how the system is over-used.

I regularly have a few dual core systems spike to load averages of 50+ because
of suboptimal NFS mounts. The NFS issue keeps processes waiting around with
nothing to do for a while. Those processes are still counted towards the
"load" even though they have no CPU activity (they are blocked on IO).

~~~
FlorinAndrei
Always divide the load average by the number of cores.

Maybe we should re-define load average, as the old definition divided by the
number of cores.
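On Unix-like systems this normalization is a one-liner; a Python sketch of a per-core load average as the comment suggests:

```python
import os

# Per-core ("normalized") load average: the raw 1/5/15-minute averages
# divided by the number of CPUs the kernel reports.
def normalized_load():
    one, five, fifteen = os.getloadavg()  # available on Unix-like systems
    ncpu = os.cpu_count() or 1
    return one / ncpu, five / ncpu, fifteen / ncpu

print(normalized_load())
```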

