
Pin - A Dynamic Binary Instrumentation Tool - nkurz
http://pintool.org/
======
xal
Here is an actual hacker tool that allows all sorts of insane close-to-the-CPU
performance work. No comments all day. This is heartbreaking.

~~~
lallysingh
First, it's Sunday night.

Second, PIN gives you very low-level access, sure, but it also requires quite
a bit of work to get that data. You can get a lot of the same data (as seen in
the examples) with libpfm4 (<http://perfmon2.sourceforge.net/>). CPU
performance counters can give you a lot of this data with _much_ lower runtime
overhead, and a lot less work.

Also, PIN's a little hairy. If you just want to generate code for run-time
execution, LLVM's your best bet. If you want to diddle with a running
executable, you can always use libelf(3) and ptrace(2) to read and diddle with
the running process. PIN may be useful for specific sorts of analyses you want
to run on an executable, but it's messy. If you're doing performance
instrumentation, dynamically modifying the code is going to alter your results
in ways that can be hard to compensate for.

~~~
tptacek
You can't effectively do things like instrumenting every write instruction in
a program using ptrace. Also, the techniques Pin uses sound hairy, but they're
the same things software virtualization does.

~~~
lallysingh
True, but what do you do with that data? Transfer it out of process? Analyze
it? And how many systems can withstand that sort of slowdown without timeouts?

Performance counters can tell you quite a bit, and they cost very little to
set up. Snapshots and a little differential analysis can get you more
comprehensible data without transfer/storage problems.
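
A minimal sketch of the snapshot-and-diff idea, assuming the counters have
already been read into plain dictionaries. The counter names and the numbers
are made up for illustration; this is not the libpfm4 API, just the arithmetic
you would do on two readings:

```python
from typing import Dict


def diff_snapshots(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Per-counter delta between two snapshots (counters present in both)."""
    return {name: after[name] - before[name] for name in before if name in after}


def derived_metrics(delta: Dict[str, int]) -> Dict[str, float]:
    """Turn raw deltas into ratios that are easier to compare across runs."""
    cycles = delta.get("cycles", 0)
    instructions = delta.get("instructions", 0)
    misses = delta.get("cache_misses", 0)
    return {
        "ipc": instructions / cycles if cycles else 0.0,
        "misses_per_kilo_instr": 1000 * misses / instructions if instructions else 0.0,
    }


# Hypothetical readings taken before and after the region of interest.
before = {"cycles": 1_000_000, "instructions": 800_000, "cache_misses": 1_200}
after = {"cycles": 3_000_000, "instructions": 3_800_000, "cache_misses": 4_200}

delta = diff_snapshots(before, after)
print(delta)
print(derived_metrics(delta))
```

Only the two small snapshots and their difference ever need to be stored, which
is the point about avoiding transfer/storage problems.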

------
mspecter
An interesting application of Pin for malware analysis / visualization is
Danny Quist's Vera[1] and de-obfuscation framework[2]. It's also used in MIT's
Architecture course to benchmark different architecture designs.

[1] <http://www.offensivecomputing.net/?q=node/1687>

[2] <http://www.offensivecomputing.net/?q=node/492>

------
nnethercote
Here's a paper about Valgrind that includes some details on how it differs
from Pin: <http://www.valgrind.org/docs/valgrind2007.pdf>.

------
interconnector
Discussed at greater depth in the paper at <http://goo.gl/YDTwu>, if anyone's
interested.

~~~
nkurz
Thanks! Direct link to the paper here:
<http://ursuletz.com/people/faculty/pdfs/p190-luk.pdf>

------
smtddr
This is definitely something I won't forget about the next time I'm trying to
figure out exactly what a binary is doing. Also, I _really_ need something
like this for OSX right now.

~~~
k4st
I am working on a DBT framework that has some user space support. I do my main
development in OS X and Linux, and so I have done some testing of it on OS X.

The main focus of the DBT tool is Linux kernel modules, but let me know the
kinds of stuff you need it for and I can a) figure out if my tool is
applicable, and b) perhaps share the code.

~~~
smtddr
/Applications/Xcode.app/Contents/Developer/usr/bin/instruments

I need to know everything about that binary: how it works, what ports (Unix
domain & network sockets) and files on the hard drive it opens, what libraries
it's linked to, how it decides what to do. Anything & everything there is to
know about it. ^_^

~~~
simscitizen
There are plenty of built-in performance/introspection tools in OS X that you
can try first before resorting to a third-party solution:

1) What ports it opens:

\- netstat shows you what ports a program has open

\- DTrace shows you all syscalls a process makes (among other things). dtruss
is a convenient wrapper script included in OS X which shows you all the
syscalls a process makes (including opening sockets).

2) What files it opens

\- Again, DTrace's syscall provider lets you introspect all syscalls,
including open(). There's even a handy wrapper script included with OS X
called opensnoop.

\- Alternately, you can use the fs_usage command line tool to tap into the xnu
kernel's trace mechanism. This shows all sorts of filesystem events, including
what files are opened.

3) What libraries a binary is linked to

OS X binaries use the Mach-O format, not ELF like most other Unixes. So you
have to use OS X's binary introspection tools to understand that format rather
than the standard GNU binutils. What you're looking for here is otool, which
lets you introspect Mach-O binaries. For instance, "otool -L
/Applications/Mail.app/Contents/MacOS/Mail" shows you which libraries Mail links
to. Run this recursively to get the transitive closure of all dependencies a
binary links against. Another way to do this is to run "vmmap -v <pid>" to
show you the vm layout of a process, which includes the __TEXT/__DATA segments
of all libraries the process links against.
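
A rough sketch of the "run this recursively" step, assuming `otool -L` prints
the binary's own path on the first line followed by one indented library entry
per line. The helper names (`parse_otool_output`, `otool_deps`,
`transitive_deps`) and the `FAKE` table are made up for illustration; entries
that are not absolute paths (such as `@rpath/...`) are recorded but not
followed, and libraries that cannot be read are skipped:

```python
import subprocess
from collections import deque
from typing import Callable, List, Set


def parse_otool_output(text: str) -> List[str]:
    """Extract library paths from the text printed by `otool -L`."""
    libs = []
    for line in text.splitlines()[1:]:  # first line names the inspected file
        line = line.strip()
        if line:
            # Entry shape: "<path> (compatibility version X, current version Y)"
            libs.append(line.split(" (compatibility version", 1)[0])
    return libs


def otool_deps(path: str) -> List[str]:
    """Direct dependencies of one Mach-O file, via the real otool binary."""
    result = subprocess.run(
        ["otool", "-L", path], capture_output=True, text=True, check=True
    )
    return parse_otool_output(result.stdout)


def transitive_deps(
    root: str, list_deps: Callable[[str], List[str]] = otool_deps
) -> Set[str]:
    """Breadth-first walk over dependencies, visiting each library once."""
    seen: Set[str] = set()
    queue = deque([root])
    while queue:
        current = queue.popleft()
        try:
            deps = list_deps(current)
        except (OSError, subprocess.CalledProcessError):
            continue  # unreadable or missing file: skip it
        for dep in deps:
            if dep == root or dep in seen:
                continue  # dylibs list their own install name first
            seen.add(dep)
            if dep.startswith("/"):
                queue.append(dep)
    return seen


# Stand-in dependency table so the sketch runs without otool installed.
FAKE = {
    "/bin/app": ["/usr/lib/libA.dylib", "/usr/lib/libB.dylib"],
    "/usr/lib/libA.dylib": ["/usr/lib/libA.dylib", "/usr/lib/libC.dylib"],
    "/usr/lib/libB.dylib": ["/usr/lib/libB.dylib", "/usr/lib/libC.dylib"],
    "/usr/lib/libC.dylib": ["/usr/lib/libC.dylib"],
}
print(sorted(transitive_deps("/bin/app", list_deps=lambda p: FAKE.get(p, []))))
```

On an actual Mac you would call `transitive_deps(path_to_binary)` and let the
default `otool_deps` do the work.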

And of course, gdb/lldb is included with the developer tools; you can just
attach to whatever process you care about and set breakpoints, type "info
sharedlib" to see what libraries are in the address space, etc. Also, for
better or worse, Objective-C is an extremely dynamic language, so you can even
do things like write a shared library with code you want to inject into a
process (potentially monkey-patching existing methods using ObjC categories)
and dlopen it from gdb to insert it into the target process's address space.

~~~
smtddr
Nice, never heard of otool or vmmap. I'll definitely try 'em out. thanks.

------
caf
The user guide gives a good flavour of the kind of things you can do with this
(<http://software.intel.com/sites/landingpage/pintool/docs/58423/Pin/html>).

------
zokier
> Pin is proprietary software developed and supported by Intel and is supplied
> free of charge for _non-commercial_ use.

Huh. What are the licensing conditions and price for commercial use then?

~~~
lgeek
If this is a concern, feel free to use the open source, BSD-licensed main
competitor: DynamoRIO: <http://www.dynamorio.org/>

~~~
qznc
Isn't valgrind a more popular competitor?

~~~
lgeek
PIN/DynamoRIO and Valgrind have slightly different design aims. In short:

* Valgrind was designed to support rich analysis plugins (like Memcheck, which keeps a shadow copy of every bit of data) and performance was a secondary concern (on Valgrind, applications run on average about 4x slower, threads are serialized, etc).

* DynamoRIO and PIN are designed not to make much of an impact on performance (usually a few percent) and are more suitable for running in production, but it's somewhat more complicated to write plugins for them.

Both DynamoRIO[0] and Valgrind[1] maintain lists of publications which go into
much more detail.

[0] <http://www.dynamorio.org/pubs.html>

[1] <http://valgrind.org/docs/pubs.html>

------
cwp
How does this compare to dtrace?

~~~
rainforest
They differ quite significantly. dtrace uses probes - points where
instrumentation can be installed to inspect the process as it runs. Since it
requires these probes to be defined, probes only come "for free" in kernel
space: e.g. tracing syscalls, pageins, that kind of thing. SystemTap offers
similar functionality - userspace probes can be defined for it too.

Pin, on the other hand, dynamically rewrites the binary to inject
instrumentation. This allows it to inject instrumentation code at a finer
granularity (individual instructions). It's useful where the application might
not define dtrace probes, or, for example on OSX, where the application has
"opted out" of it to protect it from being screenshotted (a la iTunes).

Pin is more like a scripted debugger than an instrumentation tool.
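
To make the rewriting idea concrete, here is a toy sketch in Python. It has
nothing to do with Pin's real API: the tuple-based "instructions", the
`instrument` and `run` functions, and the `log_write` hook are all invented for
illustration. It only shows the shape of the technique, which is to copy the
program and insert a call before every instruction that matches a predicate
(here, every store):

```python
from typing import Callable, Dict, List, Tuple

Instr = Tuple  # ("set", reg, value) | ("add", dst, a, b) | ("store", addr, reg)


def instrument(
    program: List[Instr],
    predicate: Callable[[Instr], bool],
    hook: Callable[[Instr, Dict[str, int]], None],
) -> List[Instr]:
    """Return a rewritten copy with a hook inserted before each match."""
    rewritten: List[Instr] = []
    for ins in program:
        if predicate(ins):
            rewritten.append(("hook", hook, ins))
        rewritten.append(ins)
    return rewritten


def run(program: List[Instr]) -> Dict[int, int]:
    """Interpret the toy program and return the final memory contents."""
    regs: Dict[str, int] = {}
    mem: Dict[int, int] = {}
    for ins in program:
        op = ins[0]
        if op == "set":
            regs[ins[1]] = ins[2]
        elif op == "add":
            regs[ins[1]] = regs[ins[2]] + regs[ins[3]]
        elif op == "store":
            mem[ins[1]] = regs[ins[2]]
        elif op == "hook":
            ins[1](ins[2], regs)  # injected analysis code sees live state
        else:
            raise ValueError(f"unknown instruction: {ins!r}")
    return mem


program = [
    ("set", "r0", 2),
    ("set", "r1", 40),
    ("add", "r2", "r0", "r1"),
    ("store", 0x10, "r2"),
    ("store", 0x18, "r0"),
]

writes: List[Tuple[int, int]] = []


def log_write(ins: Instr, regs: Dict[str, int]) -> None:
    """Record (address, value) for the store that is about to execute."""
    writes.append((ins[1], regs[ins[2]]))


instrumented = instrument(program, lambda ins: ins[0] == "store", log_write)
memory = run(instrumented)
print(writes)
```

The original program needs no predefined probe points; the instrumentation is
decided entirely by the predicate applied at rewrite time.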

~~~
cwp
Ah, very clear. Thank you.

