
Intercepting and Emulating Linux System Calls with Ptrace - ingve
http://nullprogram.com/blog/2018/06/23/
======
schoen
This was a great tool many years ago for scripting this functionality with
Python:
[https://github.com/tutufan/subterfugue](https://github.com/tutufan/subterfugue)

The tagline was "strace meets expect". You could write a Python script to
control what happened when the process being traced made a syscall of your
choice! This was super-fun and super-educational and also came with a little
manifesto about ensuring users' control over software rather than the other
way around.

Unfortunately subterfugue hasn't been maintained in years and I think it
hasn't worked with the last few major releases of the Linux kernel. It would
be amazing to see a similar tool nowadays.

------
aray
Cool! I did this a while ago for some CPU-bound stuff, and ran into a bunch of
performance bottlenecks.

Some things that helped me scale ptrace-interception up:

- SECCOMP_BPF filter (getting these right matters a lot)
- moving all of your intercept work to a single side (enter or exit)
- ensuring affinity between the traced and tracing processes
- nuking the vDSO
- removing the vDSO from the aux vector (otherwise good libcs will find it again)
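
To make the aux vector item concrete, here is a minimal sketch of the rewrite step only. It assumes a 64-bit Linux layout (each entry is two 8-byte words, a_type then a_val) and operates on a plain byte buffer; the name `hide_vdso` is made up for illustration. A real tracer would still need to copy the result back onto the tracee's stack, which is not shown here.

```python
import struct

# Entry types from <elf.h>.
AT_NULL = 0            # terminates the vector
AT_IGNORE = 1          # entry should be skipped by the consumer
AT_SYSINFO_EHDR = 33   # address of the vDSO ELF header

# Assumption: 64-bit target, native byte order, (a_type, a_val) pairs.
ENTRY = struct.Struct("=QQ")

def hide_vdso(auxv: bytes) -> bytes:
    """Return a copy of auxv with the vDSO entry retyped to AT_IGNORE."""
    out = bytearray(auxv)
    for off in range(0, len(out) - ENTRY.size + 1, ENTRY.size):
        a_type, a_val = ENTRY.unpack_from(out, off)
        if a_type == AT_NULL:
            break  # end of vector
        if a_type == AT_SYSINFO_EHDR:
            # Keep the value, change only the type so lookups miss it.
            ENTRY.pack_into(out, off, AT_IGNORE, a_val)
    return bytes(out)
```

On a 64-bit Linux box you can try it on your own vector with `hide_vdso(open("/proc/self/auxv", "rb").read())`.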

At the end of the day unfortunately the better solution would have been to
write kernel support for what I wanted to do, but it's a fun exercise in
learning about system calls.
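
On the SECCOMP_BPF item from the list above, this is roughly the shape of filter I mean, sketched as a builder for the raw classic-BPF program. It assumes x86-64, and `insn` and `build_trace_filter` are illustrative names. Only the listed syscalls return SECCOMP_RET_TRACE (which stops the tracee for a tracer that set PTRACE_O_TRACESECCOMP); everything else runs at full speed. Installing the program via prctl/seccomp is left out.

```python
import struct

# Classic BPF opcodes (<linux/bpf_common.h>).
BPF_LD_W_ABS = 0x20   # BPF_LD | BPF_W | BPF_ABS
BPF_JEQ_K = 0x15      # BPF_JMP | BPF_JEQ | BPF_K
BPF_RET_K = 0x06      # BPF_RET | BPF_K

# Return actions and arch token (<linux/seccomp.h>, <linux/audit.h>).
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000
AUDIT_ARCH_X86_64 = 0xC000003E

# Offsets of the fields we read in struct seccomp_data.
OFF_NR, OFF_ARCH = 0, 4

def insn(code, jt, jf, k):
    """Pack one struct sock_filter: u16 code, u8 jt, u8 jf, u32 k."""
    return struct.pack("=HBBI", code, jt, jf, k)

def build_trace_filter(traced):
    """Build a filter that traps the given syscall numbers and allows the rest."""
    n = len(traced)
    if not 0 < n <= 255:
        raise ValueError("jump offsets are 8-bit; need 1..255 syscalls")
    prog = [
        # Refuse to run under an unexpected ABI, where numbers mean other things.
        insn(BPF_LD_W_ABS, 0, 0, OFF_ARCH),
        insn(BPF_JEQ_K, 1, 0, AUDIT_ARCH_X86_64),
        insn(BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),
        insn(BPF_LD_W_ABS, 0, 0, OFF_NR),
    ]
    for i, nr in enumerate(traced):
        # On a match, skip the remaining compares and the ALLOW return.
        prog.append(insn(BPF_JEQ_K, n - i, 0, nr))
    prog.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    prog.append(insn(BPF_RET_K, 0, 0, SECCOMP_RET_TRACE))
    return b"".join(prog)
```

The arch check is the part that is easy to forget: without it, a process using a different ABI (say, 32-bit syscalls) hits different numbers and slips past the compares.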

------
dgl
It’s a nice write-up, and the part on system call interception only serves
as an example, but be careful if you’re considering ptrace for anything where
security matters.

See the answer on
[https://stackoverflow.com/questions/4414605/how-can-linux-ptrace-be-unsafe-or-contain-a-race-condition](https://stackoverflow.com/questions/4414605/how-can-linux-ptrace-be-unsafe-or-contain-a-race-condition)
for an example of a potential race. I believe OpenBSD’s systrace had some
issues like this.

------
sanxiyn
> The catch is that a process can only have one tracer attached at a time, so
> it's not possible to emulate a foreign operating system while also debugging
> that process with, say, GDB.

This is actually possible. How? The obvious way... You intercept and emulate
ptrace calls with ptrace.

[https://robert.ocallahan.org/2016/04/using-rr-to-debug-rr.html](https://robert.ocallahan.org/2016/04/using-rr-to-debug-rr.html)
explains how this works in practice.

------
gnufx
This is (basically?) the technique used by PRoot, which has issues with using
SECCOMP [1]. If anyone has the expertise, the PRoot maintainers would
doubtless be grateful for help with the problem.

PRoot is pretty useful as a pure-userland solution for jobs that container-ish
things might otherwise do, e.g. user-mode installation of Nix/Guix.

[1] [https://github.com/proot-me/PRoot/issues/106](https://github.com/proot-me/PRoot/issues/106)

------
yjftsjthsd-h
That's cool :) It looks like this is how WINE works, too, if I read correctly
([https://askubuntu.com/questions/146160/what-is-the-ptrace-scope-workaround-for-wine-programs-and-are-there-any-risks](https://askubuntu.com/questions/146160/what-is-the-ptrace-scope-workaround-for-wine-programs-and-are-there-any-risks))

~~~
antirez
Hello, actually WINE works in a different way. It does not attempt to
intercept system calls (that is not possible because Windows system calls work
in an incompatible way). Instead it reimplements the Windows DLLs: normal
application code never makes syscalls directly, but always goes through the
DLLs that implement the libc and other libraries. So the DLLs for the core
Windows functions are simply replaced with WINE implementations built on top
of the POSIX API.

~~~
penagwin
So they essentially had to rewrite the Windows system files (the DLLs) to
implement libc and other Linux-friendly stuff?

~~~
monocasa
Yeah.

FWIW, Microsoft's own implementations of Win32 have swapped out the kernel a
few times anyway (DOS/Win32s, Win9x, WinNT), so it's not the biggest leap for
a third party.

Because of this, Microsoft considers the system call layer to be unstable, and
appears to go out of its way to change the system call table every service
pack. So the DLL entry points are the stable ABI, and that's what 99.99% of
developers rely on.

~~~
saagarjha
That's an oddly un-Microsoft move given their general attitude towards
maintaining compatibility even for undocumented API…

~~~
monocasa
They've backed away from their classic Herculean efforts at back compat over
the past decade or so.

------
saagarjha
> The request field selects a specific Ptrace function, just like the ioctl(2)
> interface. For strace, only two are needed:

> - PTRACE_TRACEME: This process is to be traced by its parent.
> - PTRACE_SYSCALL: Continue, but stop at the next system call entrance or exit.
> - PTRACE_GETREGS: Get a copy of the tracee’s registers.

You have three things here ;)
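
For anyone who wants to see how those three requests fit together, here is a rough sketch using ctypes. It assumes Linux on x86-64 (PTRACE_GETREGS and the register layout are architecture-specific), and `trace_syscalls` and `UserRegs` are names made up for this example. It ignores signal stops, so treat it as a toy rather than a real strace.

```python
import ctypes
import os
import signal

# Request numbers from <sys/ptrace.h>.
PTRACE_TRACEME = 0
PTRACE_GETREGS = 12   # x86 only
PTRACE_SYSCALL = 24

class UserRegs(ctypes.Structure):
    """struct user_regs_struct on x86-64 (<sys/user.h>)."""
    _fields_ = [(name, ctypes.c_ulonglong) for name in (
        "r15 r14 r13 r12 rbp rbx r11 r10 r9 r8 rax rcx rdx rsi rdi "
        "orig_rax rip cs eflags rsp ss fs_base gs_base ds es fs gs").split()]

libc = ctypes.CDLL(None, use_errno=True)
libc.ptrace.restype = ctypes.c_long
libc.ptrace.argtypes = [ctypes.c_long, ctypes.c_int,
                        ctypes.c_void_p, ctypes.c_void_p]

def ptrace(request, pid, addr=None, data=None):
    """Thin wrapper that turns a -1 return into OSError."""
    if libc.ptrace(request, pid, addr, data) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

def trace_syscalls(argv, limit=10000):
    """Run argv under ptrace; return the syscall numbers seen at entry."""
    pid = os.fork()
    if pid == 0:
        try:
            libc.ptrace(PTRACE_TRACEME, 0, None, None)
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)  # only reached if exec failed
    seen = []
    regs = UserRegs()
    _, status = os.waitpid(pid, 0)          # stop triggered by execve
    while os.WIFSTOPPED(status) and len(seen) < limit:
        ptrace(PTRACE_SYSCALL, pid)         # run until syscall entry
        _, status = os.waitpid(pid, 0)
        if not os.WIFSTOPPED(status):
            break
        ptrace(PTRACE_GETREGS, pid, None, ctypes.addressof(regs))
        seen.append(regs.orig_rax)          # syscall number on x86-64
        ptrace(PTRACE_SYSCALL, pid)         # run until syscall exit
        _, status = os.waitpid(pid, 0)
    if os.WIFSTOPPED(status):               # hit the limit: clean up
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    return seen
```

Something like `trace_syscalls(["true"])` should hand back a list of syscall numbers, assuming ptrace is permitted in your environment.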

