
Introduction to strace - aburan28
https://jorge.fbarr.net/2014/01/19/introduction-to-strace/
======
brendangregg
No mention of overhead, and finishes with an strace -c and "This option is
very useful when trying to find out why a program is running slow." Well, it
will run slow if you run "strace -c", because you're running "strace -c".

Because of the overheads, I wouldn't trust the high resolution timestamps also
included in the blog post.

Also not mentioned: "strace -e" to select a single syscall doesn't reduce
overhead -- you pay the tax for tracing all syscalls anyway.
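
A rough way to see this overhead for yourself is sketched below. It is not taken from the linked posts: the `elapsed_ms` helper, the dd workload, and the choice of `close` as the filtered syscall are illustrative assumptions, and `%N` assumes GNU date. Absolute numbers vary widely with kernel and strace version, and if ptrace is not permitted the traced runs simply fail fast.

```shell
# Wall-clock time of a command in milliseconds.
# Assumption: GNU date, which supports %N (nanoseconds).
elapsed_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Syscall-heavy workload: 20000 one-byte reads and writes.
set -- dd if=/dev/zero of=/dev/null bs=1 count=20000

echo "untraced:        $(elapsed_ms "$@") ms"
if command -v strace >/dev/null 2>&1; then
    # Count every syscall.
    echo "strace -c:       $(elapsed_ms strace -c "$@") ms"
    # Print only close(); compare against the -c figure.
    echo "strace -e close: $(elapsed_ms strace -e trace=close "$@") ms"
else
    echo "strace not installed; skipping traced runs"
fi
```

Run this against a throwaway workload like the one above, not against a production process.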

Because of strace's behavior, there have also been situations in past kernels
where it could hang processes and require a kill -9. Your application is now
paused while you're madly typing at the command line -- if I did that on one of
my instances at work (although I believe that bug was fixed a long time ago),
request timeouts might have the instance fail over before I could finish
hitting enter. That's not so bad, but in your environment it could be much
worse. You may get such timeouts just from the overheads of strace.

It's one thing to write a post that has errors. It's another to write a post
that's dangerous.

This is why I wrote
[http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-...](http://www.brendangregg.com/blog/2014-05-11/strace-wow-much-syscall.html)

~~~
gbrown_
> Because of the overheads, I wouldn't trust the high resolution timestamps
> also included in the blog post.

Naive question: why would the timestamps be inaccurate? Surely each step of the
program's execution would just be slower? Am I missing something about how
the timestamp of a syscall would be off?

~~~
Hello71
they're _absolutely_ inaccurate for the reason you mentioned, but they're also
_relatively_ inaccurate because not all of the program's execution is
syscalls. regular program code runs at the same speed (more or less), but
syscalls are made much slower so appear to take more time than they really do.
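
A back-of-the-envelope model of the point above, using made-up numbers: every value here (compute time, syscall count, per-syscall cost, tracing overhead) is a hypothetical assumption chosen only to show how the proportions shift, not a measurement.

```shell
# Hypothetical workload, all times in microseconds (made-up numbers).
compute_us=1000000     # time spent in ordinary user-space code
syscalls=10000         # number of syscalls made
syscall_us=2           # assumed real cost of one syscall
trace_us=100           # assumed extra cost per syscall while traced

# Percentage of total run time spent in syscalls (integer division).
syscall_share() {      # $1 = per-syscall cost in microseconds
    in_syscalls=$(( syscalls * $1 ))
    echo $(( 100 * in_syscalls / (compute_us + in_syscalls) ))
}

echo "untraced: $(syscall_share "$syscall_us")% of run time in syscalls"
echo "traced:   $(syscall_share $(( syscall_us + trace_us )))% of run time in syscalls"
```

With these assumed figures the user-space portion is unchanged, while the share of run time attributed to syscalls goes from about 1% to about 50%, which is why a traced profile can point at syscalls that are not the real bottleneck.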

------
musha68k
Back in the early aughts I was applying as a junior sysadmin for one of those
up-and-coming LAMP web-hosting companies here in Vienna. The interviewer was a
great, all-around-nice guy and a very experienced sysadmin. All went rather
well until he asked me how I would go about fixing a specific Apache httpd
issue. Suffice it to say I was still wet behind the ears and didn't know about
strace, and that was pretty much it.

I'm still very grateful that I funked up that interview, since otherwise I
might not have gone into systems programming / Unix internals as easily down
the line.

Julia Evans has a great strace-related zine on her website; check it out as
well:

[http://jvns.ca/blog/2015/04/14/strace-zine/](http://jvns.ca/blog/2015/04/14/strace-zine/)

Also - if you are on macOS/BSD you should check out the somewhat related
dtrace/dtruss, which are very powerful tools.

------
majke
Good brief intro article.

At some point I tried to dig deeper and understand the ptrace(2) syscall, the
technology behind the strace command-line tool. I wrote this piece:

[https://idea.popcount.org/2012-12-11-linux-process-states/](https://idea.popcount.org/2012-12-11-linux-process-states/)

The ptrace() articles were never finished, but oh well. I guess ptrace() is
doomed to be undocumented and barely understood. Recently I found this gem in
the 1983 4.2BSD man page:

> _Ptrace is unique and arcane; it should be replaced with a special file
> which can be opened and read and written. The control functions could then
> be implemented with ioctl(2) calls on this file. This would be simpler to
> understand and have much higher performance._

[http://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/man/man...](http://www.tuhs.org/cgi-bin/utree.pl?file=4.2BSD/usr/man/man2/ptrace.2)

~~~
pm215
ptrace() isn't undocumented, it has a fairly long manpage:
[http://man7.org/linux/man-pages/man2/ptrace.2.html](http://man7.org/linux/man-pages/man2/ptrace.2.html)

It's certainly pretty hairy, but on the other hand only a handful of programs
(debuggers, strace, similar debug type tools) are ever going to need to use
it.

~~~
tcoppi
Pretty hairy is an understatement. There are so many edge cases with ptrace
based on how it works (interactions with signals especially) that it is
almost impossible to make a truly robust debugger with it. Solaris did it
right with their procfs debugger interface. I really wish Linux would step up
its game here.

------
MichaelBurge
You can also use gdb to debug more than just C code.

I used to know a DBA who had symbols installed on the production database
server so he could attach a debugger to the running Postgres backend process
and use the backtrace to tell you what the query was doing and why it was
running slowly.

Also, note that running strace on a production process can change its observed
behavior.
[http://man7.org/linux/man-pages/man2/ptrace.2.html](http://man7.org/linux/man-pages/man2/ptrace.2.html)

~~~
bch
Maybe I'm not following you, but isn't that just attaching gdb to C code?

~~~
tonyarkles
There are often macros you can use to help. In that case, sure, it's C code.
Another "it's just C code" is debugging CPython:
[https://wiki.python.org/moin/DebuggingWithGdb](https://wiki.python.org/moin/DebuggingWithGdb)
These macros let you inspect both the C stack and the Python stack, and some
other stuff. Super super handy for rare cases where shit just gets weird.

------
mandarg
Tangentially, there's a funny Easter Egg in strace – with some trial and
error, you can get it to strace its own pid.

    
    
      $ strace -p 957
      strace: I'm sorry, I can't let you do that, Dave.

~~~
digitalsushi
let z=$(readlink /proc/self)+1 && strace -p $z

probably a race. i'm not a real programmer.

~~~
qb45

      sh -c 'exec strace -p $$'
    

This is race-free and POSIX-compliant I think.
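
The reason this has no race is that exec replaces the shell's process image without forking, so the PID that `$$` expanded to is the PID strace ends up with. A small sketch of that property without strace involved (the variable names are just for illustration):

```shell
# The outer shell prints its PID, then exec replaces it with a second
# shell, which prints its own PID. exec does not fork, so both match.
pids=$(sh -c 'echo $$; exec sh -c "echo \$\$"')
before=$(echo "$pids" | sed -n 1p)
after=$(echo "$pids" | sed -n 2p)
echo "before exec: $before, after exec: $after"
```

Both numbers should be identical, unlike the `readlink /proc/self` approach, which has to guess the next PID.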

------
jdamato
Great introductory article, thanks for writing and sharing this!

I wrote an article explaining the inner workings of strace [1], and a detailed
article about Linux system calls [2] which others interested in this article
may find relevant.

[1]: [https://blog.packagecloud.io/eng/2016/02/29/how-does-strace-...](https://blog.packagecloud.io/eng/2016/02/29/how-does-strace-work/)

[2]: [https://blog.packagecloud.io/eng/2016/04/05/the-definitive-g...](https://blog.packagecloud.io/eng/2016/04/05/the-definitive-guide-to-linux-system-calls/)

------
valbaca
I much prefer Julia Evans's zine on strace:

[http://jvns.ca/strace-zine-unfolded.pdf](http://jvns.ca/strace-zine-unfolded.pdf)

------
amelius
The problem with strace is that it is not reentrant. I.e. you can't run strace
through strace.

So don't use it in scripts unless you are debugging.

~~~
GFK_of_xmaspast
If you're not debugging, why are you using strace?

