
The Magic of strace - chadfowler
http://chadfowler.com/blog/2014/01/26/the-magic-of-strace/
======
rosser
Small, somewhat nit-picky critique: the man pages for system calls are in
section 2. If you want to see the docs for the "read()" syscall, and not the
bash builtin "read", saying "man read" w̶o̶n̶'̶t̶ may not (see follow-up) do
what you expect. Instead, you should say

    
    
      man 2 read
    

This should probably be mentioned somewhere.

Otherwise, great writeup. Thanks for sharing!

(edited)

~~~
JoshTriplett
There's a manpage bash-builtins in section 7, but I've never seen a system
that had manpages for the individual builtins, let alone having them in
section 1. "man read" on every system I've used opens the read manpage in
section 2.

~~~
rosser
My local CentOS 6.2 VM has bash-builtins in section 1, which is where "man
read" takes me. Same with a RHEL 6.4 machine in the colo.

This might be a distro-specific thing.

~~~
_delirium
Same with OSX: "man read" opens the builtin(1) manpage. However on Debian 7.3,
it opens read(2).

~~~
noselasd
You can configure the priority of the sections in /etc/man.conf or similar.
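
A toy sketch of why the same "man read" can land on different pages: with no section given, man walks its section list in order and shows the first page it finds. The section order and the page inventories below are invented for illustration and are not the real configuration of any distro.

```python
# Toy model of how man picks a page when no section is given.
# SECTION_ORDER and both page inventories are made-up examples.

def resolve_man_page(name, section_order, available):
    """Return the first section in section_order that has a page for name."""
    for section in section_order:
        if (name, section) in available:
            return f"{name}({section})"
    return None


SECTION_ORDER = ["1", "8", "2", "3"]  # hypothetical search order

# System A ships a section-1 page that covers shell builtins; system B does not.
system_a = {("read", "1"), ("read", "2")}
system_b = {("read", "2")}

print(resolve_man_page("read", SECTION_ORDER, system_a))  # read(1)
print(resolve_man_page("read", SECTION_ORDER, system_b))  # read(2)
print(resolve_man_page("read", ["2", "1"], system_a))     # read(2)
```

Either a different section order or a different set of installed pages changes the answer, which is why "man 2 read" is the safe spelling.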

------
MiWCryptohn
Don't forget its userspace equivalent, ltrace (strace covers syscalls). It
tracks all library calls made by a process.

Under windows, strace is an SSL/TLS monitoring tool (also hella useful). It
shows payloads passed to CryptoAPI/CNG libs so you can easily troubleshoot
explicitly encrypted protocols like ldaps. Especially useful if you use client
authenticated TLS where it is not possible to use a TLS mitm proxy to snoop
the layer 7 data.

~~~
wslh
Shameless plug: if you want to trace Windows applications you can take a look
at my company's products SpyStudio[1] and Deviare[2]. Before downvoting me try
them to see how powerful and unique they are in the Windows ecosystem.

VMware is using SpyStudio for creating and troubleshooting application
virtualization packages, this is, for example, a twitter post from a VMware
escalation engineer:
[https://twitter.com/DooDleWilk/status/428562701313662977](https://twitter.com/DooDleWilk/status/428562701313662977)

[1] [http://www.nektra.com/products/spystudio-api-
monitor/](http://www.nektra.com/products/spystudio-api-monitor/)

[2] [http://www.nektra.com/products/deviare-api-hook-
windows/devi...](http://www.nektra.com/products/deviare-api-hook-
windows/deviare-in-process/)

~~~
yread
Thanks, it actually works quite well!

------
leoh
Mac OS X has a suite of tools built on a similar package called
dtrace—opensnoop and execsnoop. Gives really nice real time lists of all files
opened on the system and all binaries executed, respectively.

~~~
IbJacked
Thanks for those, very cool and useful! Every time I see blog posts like the
above, and comments like yours and others, it reminds me just how much stuff I
either don't know or should learn more about.

------
tedivm
So what happened with the Lotus system, and how did strace help?

~~~
pjmlp
Got re-written on top of Eclipse.

Running version 7 on my work laptop. :(

~~~
et5000
The client was rewritten on top of Eclipse, not the server.

~~~
pjmlp
You are right, I should have been clearer.

------
Argorak
Thanks for the writeup! strace should definitely be in your toolbox. There is
also systemtap, which I like a lot as well. It has some problems on Linux,
though: in particular, it is only widely supported on Linux > 3.5 unless the
distro you are using ships the necessary patches. Custom userspace probes are
a real strong point.

I wrote a short article about stap using Ruby's probes as an example:
[http://www.asquera.de/blog/2014-01-26/stap-and-
ruby-2](http://www.asquera.de/blog/2014-01-26/stap-and-ruby-2)

~~~
donavanm
To clarify, systemtap needs UTRACE or UPROBES kernel support to trace user
processes. Without those you can still inspect kernel functions. IIRC UTRACE
was mainly supported by Redhat, and available in their kernels for quite a few
years. In 3.5 the UPROBES functionality was merged into mainline. Other
tools, like perf, can use UPROBES support as well.

~~~
Argorak
Yep, that's the detailed story. A lot of Linuxes (especially current Ubuntu
LTS) ship without those patches, though, which makes the whole exercise of
compiling the kernel yourself (and maybe a debug image to go along with it) a
tedious one, and probably not fit for a production environment.

That's why I decided to focus on Linux > 3.5, which will be available with
Ubuntu 14.04 LTS, where installing gets much easier if you know the right
packages. I definitely wanted to make sure that people can start playing
around with it in a few minutes.

Also, UPROBES are in my opinion one of the most interesting features to grab
with strace. It allows you to easily combine detailed kernel-level tracing
with tracing of your application.

(Not that I want to suggest that compiling your own kernel is hard to do, it
just takes the fun out of "let's trace!")

------
yread
You can use Process monitor [http://technet.microsoft.com/en-
us/sysinternals/bb896645.asp...](http://technet.microsoft.com/en-
us/sysinternals/bb896645.aspx) to see a similar overview of low level
activity. You won't see all the system calls, you can't pipe the output
directly, but there is a UI and you don't have to look up file descriptors

~~~
wslh
If you want to see the DLL calls, exceptions, etc, you can use the tools I
posted here:
[https://news.ycombinator.com/item?id=7156160](https://news.ycombinator.com/item?id=7156160)

------
kev009
If you think strace is useful, wait until you try dtrace.

~~~
fafner
or systemtap
[https://sourceware.org/systemtap/](https://sourceware.org/systemtap/) or ktap
[http://www.ktap.org/doc/tutorial.html](http://www.ktap.org/doc/tutorial.html)
for GNU/Linux

~~~
cbab
or LTTng with both kernel and userspace tracers
[http://lttng.org](http://lttng.org)

Disclaimer: LTTng developer :)

------
mrfusion
Has anyone heard of a program that will take strace (or dtrace) output and
create a pretty diagram showing which commands call which commands and which
files they read or create?

We've got a fairly complicated bioinformatics pipeline that calls about 100
other programs, and creates or reads about 100 different files. I'd love a way
to create a picture of what's going on. Which files each program uses, etc.

If such a program doesn't exist, would that be worth building? Could it be
something I could potentially sell?

~~~
peterwwillis
Sounded like an interesting script, so I just wrote it in about a half hour.
You're welcome to sell it if you want... (also, Perl has no hard recursion
limit, but with -w it warns about deep recursion past 100 levels; unroll the
recursion in walkpid to go farther) Collect logs with 'strace -o pids.log -e
trace=process -f [specify your process here]', run with 'perl printpids.pl <
pids.log'

    
    
      #!/usr/bin/perl -w
      $|=1;
      use strict;
      my (%pidmap, @order);
      while ( <> ) {
          chomp;
          if ( /^(\d+)\s+(\w+)(.*)$/ ) {
              my ($pid, $syscall, $args) = ($1, $2, $3);
              if ( $syscall =~ /(^clone$|fork$)/ and $args =~ / = (\d+)$/ and $1 > 0 ) {
                  my $clonepid = $1;
                  $pidmap{$clonepid} = { -parent => $pid };
                  push(@order, $clonepid);
              }
              elsif ( $syscall =~ /^exec/ and $args =~ / = (\d+)$/ and $1 == 0 ) {
                  my $exec = $args;
                  @order = ($pid) if !@order;
                  $exec =~ s/^\("([^"]+?)",.*$/$1/g;
                  push( @{ $pidmap{$pid}->{-exec} } , $exec );
              }
          }
      }
      foreach my $pid ( @order ) {
          my $spaces = walkpid($pid);
          print "    " x $spaces . join("\n" . ("    " x $spaces), map { $_ . " ($pid)" } @{ $pidmap{$pid}->{-exec} } ) . "\n";
      }
      # Count how many ancestors $pid has; that depth is the indent level.
      sub walkpid {
          my $pid = shift;
          my $c = shift || 0;
          if ( exists $pidmap{$pid}->{-parent} ) {
              return walkpid($pidmap{$pid}->{-parent}, $c+1);
          }
          return $c;
      }
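
For the "pretty diagram" part of the question, here is a minimal Python sketch of the same idea that emits Graphviz DOT instead of an indented list. It assumes the log format written by 'strace -o pids.log -e trace=process -f', where each line starts with a PID. The names `process_tree` and `to_dot` and the sample log are made up for illustration. An execve that strace splits into an unfinished/resumed pair is skipped, and threads created with clone show up as children too.

```python
import re

# "PID syscall(args) = ret", also accepting "PID <... syscall resumed> args) = ret".
LINE = re.compile(r'^(\d+)\s+(?:<\.\.\. )?(\w+)(?: resumed>)?(.*) = (-?\d+)(?: .*)?$')
# First quoted argument of execve: the program path.
PATH = re.compile(r'^\("((?:[^"\\]|\\.)*)"')
SPAWN = {"fork", "vfork", "clone", "clone3"}


def process_tree(lines):
    """Return (child pid -> parent pid, pid -> list of exec'd paths)."""
    parent, execs = {}, {}
    for line in lines:
        m = LINE.match(line.rstrip("\n"))
        if not m:
            continue  # signals, exit notices, unfinished calls
        pid, syscall, args, ret = m.group(1), m.group(2), m.group(3), int(m.group(4))
        if syscall in SPAWN and ret > 0:
            parent[str(ret)] = pid  # in the parent, the return value is the child PID
        elif syscall.startswith("exec") and ret == 0:
            p = PATH.match(args)
            if p:
                execs.setdefault(pid, []).append(p.group(1))
    return parent, execs


def to_dot(parent, execs):
    """Render parent -> child edges as a Graphviz digraph."""
    def label(pid):
        names = execs.get(pid)
        return f"{names[-1]} ({pid})" if names else f"pid {pid}"

    out = ["digraph processes {"]
    for child, par in parent.items():
        out.append(f'    "{label(par)}" -> "{label(child)}";')
    out.append("}")
    return "\n".join(out)


# Invented sample log in the assumed format.
SAMPLE = '''\
1000 execve("/bin/sh", ["sh", "-c", "ls | wc"], 0x7ffd0 /* 20 vars */) = 0
1000 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|SIGCHLD, child_tidptr=0x7f1) = 1001
1001 execve("/bin/ls", ["ls"], 0x7ffd0 /* 20 vars */) = 0
1000 clone(child_stack=NULL, flags=CLONE_CHILD_CLEARTID|SIGCHLD <unfinished ...>
1000 <... clone resumed>, child_tidptr=0x7f1) = 1002
1002 execve("/usr/bin/wc", ["wc"], 0x7ffd0 /* 20 vars */) = 0
1001 +++ exited with 0 +++
'''.splitlines()

parent, execs = process_tree(SAMPLE)
print(to_dot(parent, execs))
```

The printed text can be saved to a file and rendered with Graphviz, for example 'dot -Tpng'. Adding file nodes would mean tracing open calls as well and drawing an edge from each process to each path.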

~~~
mrfusion
Wow that's amazing! Can I have strace launch the program? Or does it already
have to be running?

~~~
alinspired
you can, last strace argument can be a command you want to strace
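
Both modes, sketched as a small Python helper that builds the argument list. The helper name `strace_argv` and its defaults are made up for illustration; the strace options are the ones from the command above, plus -p for attaching to a PID that is already running.

```python
def strace_argv(command=None, pid=None, log="pids.log"):
    """Build an strace argv that either launches command or attaches to pid."""
    if (command is None) == (pid is None):
        raise ValueError("give exactly one of command or pid")
    argv = ["strace", "-f", "-o", log, "-e", "trace=process"]
    if pid is not None:
        argv += ["-p", str(pid)]   # attach to a process that is already running
    else:
        argv += list(command)      # everything after the options is the program to run
    return argv


print(strace_argv(command=["make", "all"]))
print(strace_argv(pid=4321))
# To actually run it: subprocess.run(strace_argv(command=["make", "all"]))
```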

------
gopalv
"perf top -e syscalls:*statfs*"

particularly when you don't know which process is calling all the syscalls.

Mix "perf record" and "perf trace" & you have the next generation of strace
tools.

~~~
edwintorok
So they integrated that upstream, I always wondered what happened to the tool
announced here:
[http://lwn.net/Articles/415728/](http://lwn.net/Articles/415728/)

    
    
      perf --help
        trace           strace inspired tool

------
eldavido
I use strace all the time doing ops at Crittercism. Some of the random things
it's helped with/taught me:

- allowed exploring forking behavior of daemons, in particular the
nitty-gritty of gunicorn's prefork behavior, and understanding the rationale
behind single- and double-fork daemons generally (very important to understand
for job control e.g. writing upstart/init.d jobs)

- isolated hot reads to memcache in situ, by identifying the socket
associated with the memcache connection, and finding which key was read the
most by a process (we built better logging after the fact, but sometimes
there's no substitute for instrumenting prod during tough perf/stress
problems)

- let me explore the behavior of node.js's several threads, and find one of
them sending "X" over a socket to the other (still not quite sure what this
is, some kind of heartbeat/clock tick?)

- helped understanding "primordial processes" and the exact details of how
forking/reparenting work on linux

It's a great tool and one that every ops/infrastructure engineer should be
familiar with.
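
The memcache item above can be sketched roughly like this in Python. It assumes the connection speaks memcached's text protocol, so responses contain "VALUE <key> <flags> <bytes>" lines, and that the socket's file descriptor is already known. The function name `hot_keys`, the fd number and the sample lines are invented. strace cuts strings at 32 bytes by default, so a larger -s is needed to see whole responses.

```python
import re
from collections import Counter

# A read on some fd, with an optional leading PID column (strace -f -o).
READ = re.compile(r'^(?:\d+\s+)?(?:read|recv|recvfrom)\((\d+), "(.*)"')
# strace prints CR LF as the literal characters \r\n inside the quoted string.
KEY = re.compile(r'(?:^|\\r\\n)VALUE (\S+) ')


def hot_keys(lines, fd):
    """Count keys returned in memcached VALUE responses read from fd."""
    counts = Counter()
    for line in lines:
        m = READ.match(line)
        if m and int(m.group(1)) == fd:
            counts.update(KEY.findall(m.group(2)))
    return counts.most_common()


# Invented sample; fd 12 plays the role of the memcache socket.
SAMPLE = [
    r'read(12, "VALUE user:42 0 5\r\nhello\r\nEND\r\n", 4096) = 31',
    r'read(12, "VALUE user:42 0 5\r\nhello\r\nEND\r\n", 4096) = 31',
    r'read(12, "VALUE session:9 0 2\r\nok\r\nEND\r\n", 4096) = 30',
    r'read(7, "VALUE other:1 0 1\r\nx\r\nEND\r\n", 4096) = 27',
    r'read(12, "END\r\n", 4096) = 5',
]

print(hot_keys(SAMPLE, 12))
```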

~~~
alexnewman
It's still no dtrace. Something that I hope enters osx soon.

~~~
retr0h
dtrace has been in osx since 10.5 :/

------
dave1010uk
Quick strace command that I use all the time to see what files a process is
opening:

    
    
        strace -f <command> 2>&1 | grep ^open
    

Really useful to see what config files something is reading (and the order) or
to see what PHP (or similar) files are being included.

There's normally other ways to do this (eg using a debugger) but sending
strace's stderr to stdout and piping through grep is useful in so many cases
it's become a command I use every day or 2.
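
When the grep gets too coarse, a small filter can do the same job. With -f, lines from child processes can carry a "[pid N]" prefix that "^open" would miss, and newer systems often use openat rather than open. A minimal Python sketch, assuming strace's usual line format; the function name `opened_files` and the sample lines are made up for illustration.

```python
import re

OPEN = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'   # optional PID prefix added by -f
    r'(open|openat)\('
    r'(?:AT_FDCWD, |\d+, )?'           # openat's directory fd argument
    r'"((?:[^"\\]|\\.)*)"'             # the quoted path
    r'.* = (-?\d+)'                    # return value: an fd, or -1 on failure
)


def opened_files(lines):
    """Yield (path, succeeded) for every open/openat call, in log order."""
    for line in lines:
        m = OPEN.match(line)
        if m:
            yield m.group(2), int(m.group(3)) >= 0


# Invented sample output.
SAMPLE = [
    'open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3',
    'openat(AT_FDCWD, "/etc/php.ini", O_RDONLY) = -1 ENOENT (No such file or directory)',
    '[pid  4321] open("/etc/myapp.conf", O_RDONLY) = 4',
    'read(3, "abc", 4096) = 3',
]

for path, ok in opened_files(SAMPLE):
    print("ok  " if ok else "FAIL", path)
```

Keeping the failed opens is deliberate: the config files a program looked for and did not find are often the interesting ones.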

~~~
girvo
Hey, that's really useful! Cheers :)

------
justincormack
For OSX you need to use dtruss, for NetBSD and FreeBSD ktrace is what you
need.

~~~
kev009
ktrace is a little more broad and has much different invocation; truss is most
comparable on FreeBSD and SysV.

------
np422
Strace is easy to use, commonly available, and very useful in many situations.

More modern tools such as dtrace for Solaris and systemtap for Linux address
similar problems but with broader coverage.

------
dicroce
Also check out ltrace... Shows the calls to other libraries the process is
making...

I'd also like to point out that a key to using strace successfully is the
result column... Programs that fail often make system calls that fail right
before they exit... You can often tell what the program is trying and failing
to accomplish...
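
A minimal Python sketch of reading the result column: keep only calls that returned -1 and show the last few before the program died. It assumes strace's usual "= -1 ERRNO (message)" suffix; the function name `last_failures` and the sample trace are invented. Plenty of failures are harmless probing (a missing ld.so.preload, for example), so the ones closest to the exit usually matter most.

```python
import re
from collections import deque

FAILED = re.compile(
    r'^(?:\[pid\s+\d+\]\s+|\d+\s+)?'        # optional PID prefix
    r'(\w+)\(.* = -1 (E[A-Z0-9]+) \((.*)\)$'  # syscall, errno name, message
)


def last_failures(lines, n=5):
    """Return the last n failed syscalls as (syscall, errno, message)."""
    recent = deque(maxlen=n)  # older failures fall off the front
    for line in lines:
        m = FAILED.match(line.rstrip())
        if m:
            recent.append((m.group(1), m.group(2), m.group(3)))
    return list(recent)


# Invented sample trace of a program that fails to write its pid file.
SAMPLE = [
    r'access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)',
    r'open("/etc/app.conf", O_RDONLY) = 3',
    r'read(3, "", 4096) = 0',
    r'open("/var/run/app.pid", O_WRONLY|O_CREAT, 0644) = -1 EACCES (Permission denied)',
    r'write(2, "fatal: cannot write pid file\n", 29) = 29',
    r'exit_group(1) = ?',
]

for syscall, errno_name, message in last_failures(SAMPLE):
    print(f"{syscall}: {errno_name} ({message})")
```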

------
kyaghmour
In case you're curious, this is how ltrace (strace's library equivalent)
works: [http://www.opersys.com/blog/ltrace-
internals-140120](http://www.opersys.com/blog/ltrace-internals-140120)

------
Anthony-G
I’ve used strace before to help diagnose issues with buggy software I was
using and I thought this was a great article.

I just thought I’d let people know that it can be a lot easier to read
strace’s output if you read the output log file using Vim as it contains a
syntax file which can highlight PIDs, function names, constants, strings, etc.
Alternatively, if you don’t want to create an strace log file, you could pipe
the output to Vim and it will automatically detect it as being strace output,
e.g.

    
    
      strace program_name 2>&1 | vim  -

------
csmithuk
strace taught me that glibc never does what you think it does behind the
scenes!

------
memracom
Totally agree that strace is an awesome tool. I've even used it with Java apps
that were behaving weirdly, just attach and see what it is saying to the
kernel.

------
chadfowler
Bending to the will of the people, I have appended a conclusion, clarifying
the fate of the Lotus Domino server.
[http://chadfowler.com/blog/2014/01/26/the-magic-of-
strace/](http://chadfowler.com/blog/2014/01/26/the-magic-of-strace/)

------
davyjones
Just a few hours ago, a newly minted Ubuntu binary was crashing due to a library
version mismatch. I thought I had updated the shared libraries to point to the
new versions. But definitely something was still hooked to the old version. I
just couldn't figure out how/where. ldd wasn't of much help because everything
was OK according to it. "If only I can get a bit more info when the binary is
running and spit out everything before the crash."

Tried my luck with gdb. Sure enough...there was libQt5DBus pointing to the old
libs leading to the crash. If you are feeling particularly adventurous, you
can step one instruction at a time after starting. Even without debug symbols,
there is quite a lot of info that can be used while troubleshooting.

------
peterwwillis
There's a lot of fun to be had with strace. I wrote a tiny perl script that
spies on the file descriptors of another process and outputs it to your
terminal: [https://github.com/psypete/public-bin/blob/public-
bin/src/sy...](https://github.com/psypete/public-bin/blob/public-
bin/src/system/dumpfd.pl)

------
kylequest
Even in the 90s Java decompilers existed, so the "We had no source code"
excuse sounds a bit strange :-)

~~~
chadfowler
Oh we used those too, but in this case there were also native libraries. I was
a regular user of jad, even sometimes recompiling and replacing stuff
(ooooweeee) in production.

------
mwcampbell
It's instructive to see how much simpler the strace output for a simple
program is when the program is statically linked. Especially if you use an
alternative libc like musl ([http://musl-libc.org/](http://musl-libc.org/)).

------
arca_vorago
Don't forget that sometimes strace is overkill, and similar more easily parsed
things can be used instead, for example, /usr/bin/time (vs bash time) has been
coming in more and more handy for me.

------
alexnewman
The first level up on java is being able to tell useful things about it via
simply straceing it. Once again another win for dtrace.

------
Derpdiherp
Useful article. But the background of the blog flickers rather badly, it's
pretty migraine inducing.

------
alinspired
my favorite use of strace is to learn which files (especially config files)
are being opened by a new daemon/tool: strace -f -s1024 2>&1|grep open

also remember the equally useful 'ltrace' - library call tracing

------
LinuxIsNotUniX
Unix tools? You mean Linux....

------
CrispEditor
Please see

ftp://86.0.252.89/pub/release/website/tools/trace-20140126-x86_64-b95.tar.gz

This is a tool called ptrace - which does everything that strace does and a
lot more. You have working binaries in there, and most of the source - I
haven't extricated the full build dependencies so it all builds, but this
includes extra facilities like reporting summaries of process trees, showing
only connections or files, and shlib injection into a target process.

If people are interested in more on this, contact me at
CrispEditor-a.t-gmail.c-o-m

~~~
jrockway
I am a bit paranoid about downloading a binary from a site without a domain
name.

~~~
jenrzzz
ftp://crisp.dyndns-
server.com/pub/release/website/tools/trace-20140126-x86_64-b95.tar.gz works
too if it makes you feel any better.

~~~
rhizome
dyndns ain't much better.

~~~
csmithuk
Indeed. Dyndns tends to be the source of much Internet clap...

