
Show HN: TARDIS – Warp a process's perspective of time by hooking syscalls - DavidBuchanan
https://github.com/DavidBuchanan314/TARDIS
======
foob
It's worth mentioning that libfaketime [1] is a more mature alternative with
macOS support and more complete coverage of the relevant system calls. Nothing
against the current project but that might be a better choice for many people.

[1] -
[https://github.com/wolfcw/libfaketime](https://github.com/wolfcw/libfaketime)

~~~
nzmsv
Sadly, the license is GPL rather than the expected LGPL.

~~~
jwilk
In what sense is LGPL "expected"?

~~~
zeckalpha
For a lib

~~~
dTal
This library is not meant to be incorporated into an application - it is meant
to be preloaded to modify the behaviour of an existing program. So I don't see
what permission to link into a proprietary program really gets you.

------
aray
Nice implementation! I built something similar a while ago to warp time in
video games (for training reinforcement learning agents).

Some issues off the top of my head (that I ran into): VDSO censoring is a lot
harder than just symbol overriding. It has to actually be removed from the aux
vector (the third thing on the process stack when the process launches, after
arguments and environment variables). The EHDR entry (AT_SYSINFO_EHDR) is what
you need to remove.

Gist for censoring EHDR:
[https://gist.github.com/machinaut/a08b581c921775263cf0e20ccc...](https://gist.github.com/machinaut/a08b581c921775263cf0e20ccc974cbd)

Some libc's (notably glibc) are really good at finding/using EHDR even if you
do that symbol overriding, so dumping EHDR is the most assured way of making
sure it's gone.
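
To make the aux vector part concrete, here is a minimal sketch in Python. It works on a byte copy of an auxv (64-bit type/value pairs, as exposed by /proc/<pid>/auxv) and overwrites the AT_SYSINFO_EHDR pair with AT_IGNORE. The helper names `find_ehdr` and `censor_ehdr` and the sample values are made up for illustration, and this is not necessarily what the gist does. A real tool has to patch the tracee's own stack before libc reads it.

```python
import struct

# Aux vector entry types from <elf.h>
AT_NULL = 0           # terminates the vector
AT_IGNORE = 1         # entry should be ignored
AT_PAGESZ = 6
AT_SYSINFO_EHDR = 33  # address of the vDSO ELF header

# One auxv entry on a 64-bit process: (a_type, a_val), both unsigned 64-bit.
PAIR = struct.Struct("=QQ")

def find_ehdr(auxv):
    """Return the vDSO address advertised in auxv, or None if absent."""
    for off in range(0, len(auxv) - PAIR.size + 1, PAIR.size):
        a_type, a_val = PAIR.unpack_from(auxv, off)
        if a_type == AT_NULL:
            break
        if a_type == AT_SYSINFO_EHDR:
            return a_val
    return None

def censor_ehdr(auxv):
    """Return a copy of auxv with AT_SYSINFO_EHDR turned into AT_IGNORE."""
    out = bytearray(auxv)
    for off in range(0, len(out) - PAIR.size + 1, PAIR.size):
        a_type, _ = PAIR.unpack_from(out, off)
        if a_type == AT_NULL:
            break
        if a_type == AT_SYSINFO_EHDR:
            PAIR.pack_into(out, off, AT_IGNORE, 0)
    return bytes(out)

# Hypothetical sample vector; the address is invented.
sample = b"".join(PAIR.pack(t, v) for t, v in [
    (AT_SYSINFO_EHDR, 0x7FFD3A5F6000),
    (AT_PAGESZ, 4096),
    (AT_NULL, 0),
])
censored = censor_ehdr(sample)
```

The vector keeps its length and order, so every other entry stays where the loader expects it.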

ptrace overhead is HUGE -- because you're debugging a userspace program with
another program, every time call now results in 4 context switches (to/from
your debugging program at every time call entry/exit). Even with both pinned
to the same CPU, this is not fast.

This is where my least favorite part of the linux kernel comes in handy:
SECCOMP-BPF. Instead of stopping on _every_ syscall, you can write a list of
BPF filter rules that only matches certain time-based syscalls with certain
arguments. This greatly improves the performance (but for me, still not fast
enough to play video games live).
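
A rough sketch of what such a filter could look like, written in Python so it can be checked without installing anything. It builds a classic BPF program that returns SECCOMP_RET_TRACE for a handful of time syscalls and SECCOMP_RET_ALLOW for everything else, then runs it through a tiny interpreter. The syscall selection is an assumption, the numbers are x86-64 only, and it matches on syscall number alone (no argument checks). Actually installing it would go through prctl(PR_SET_SECCOMP, ...) in the tracee, with the tracer setting PTRACE_O_TRACESECCOMP.

```python
import struct

# Classic BPF opcodes, combined from linux/bpf_common.h
BPF_LD_W_ABS = 0x20  # BPF_LD | BPF_W | BPF_ABS: load 32-bit word at offset k
BPF_JEQ_K = 0x15     # BPF_JMP | BPF_JEQ | BPF_K: jump if accumulator == k
BPF_RET_K = 0x06     # BPF_RET | BPF_K: return constant k

# Filter return values from linux/seccomp.h
SECCOMP_RET_KILL_PROCESS = 0x80000000
SECCOMP_RET_TRACE = 0x7FF00000
SECCOMP_RET_ALLOW = 0x7FFF0000

AUDIT_ARCH_X86_64 = 0xC000003E

# x86-64 syscall numbers; which calls to trap is an assumption of this sketch.
TIME_SYSCALLS = {"nanosleep": 35, "gettimeofday": 96, "time": 201,
                 "clock_gettime": 228, "clock_nanosleep": 230}

def build_filter(syscall_numbers):
    """Return a list of (code, jt, jf, k) instructions."""
    nrs = list(syscall_numbers)
    prog = [
        (BPF_LD_W_ABS, 0, 0, 4),               # load seccomp_data.arch
        (BPF_JEQ_K, 1, 0, AUDIT_ARCH_X86_64),  # expected arch: skip the kill
        (BPF_RET_K, 0, 0, SECCOMP_RET_KILL_PROCESS),
        (BPF_LD_W_ABS, 0, 0, 0),               # load seccomp_data.nr
    ]
    for i, nr in enumerate(nrs):
        # On a match, jump over the remaining tests and the ALLOW return.
        prog.append((BPF_JEQ_K, len(nrs) - i, 0, nr))
    prog.append((BPF_RET_K, 0, 0, SECCOMP_RET_ALLOW))
    prog.append((BPF_RET_K, 0, 0, SECCOMP_RET_TRACE))
    return prog

def encode(prog):
    """Pack the program as an array of struct sock_filter (8 bytes each)."""
    return b"".join(struct.pack("=HBBI", *insn) for insn in prog)

def simulate(prog, nr, arch=AUDIT_ARCH_X86_64):
    """Interpret the three opcodes used above against (nr, arch)."""
    data = struct.pack("=iI", nr, arch)  # first 8 bytes of seccomp_data
    acc, pc = 0, 0
    while True:
        code, jt, jf, k = prog[pc]
        pc += 1
        if code == BPF_LD_W_ABS:
            acc = struct.unpack_from("=I", data, k)[0]
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError("opcode not handled by this sketch")

prog = build_filter(TIME_SYSCALLS.values())
```

Only the calls that return SECCOMP_RET_TRACE wake the tracer. Everything else never leaves the kernel's fast path, which is where the speedup over plain syscall tracing comes from.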

At the end of the day I ended up reviving a >10 year old patch someone sent to
the linux kernel to add these parameters (time offset and time warp) to thread
structs and do the warping in the kernel (much faster -- don't pay the context
overhead, etc). Sadly even this didn't work because our end application needed
to run on multiple clouds in docker, and we'd need to have access to the host
kernel to do these operations.

I'd like to have an affine time warp as part of the cgroups, and then maybe
extend it through runc so anyone can run time-warped docker containers, but
maybe that's wishful thinking.
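
For anyone wondering what "affine" means here: warped time is the real time scaled by a rate, plus an offset. A minimal sketch in Python, using integer nanoseconds and an exact rational rate to avoid float drift. The function name `warp_ns` and the `origin_ns` parameter (the instant at which warping starts) are my own choices for illustration, not taken from the kernel patch.

```python
from fractions import Fraction

def warp_ns(real_ns, origin_ns, rate, offset_ns=0):
    """Affine time warp: origin + offset + rate * (real - origin).

    real_ns    real clock reading, in nanoseconds
    origin_ns  reading at which the warp was switched on
    rate       speed factor (int or Fraction); 2 means time runs twice as fast
    offset_ns  constant shift added on top
    """
    elapsed = real_ns - origin_ns
    return origin_ns + offset_ns + int(elapsed * Fraction(rate))

ONE_SECOND = 1_000_000_000
origin = 1_000 * ONE_SECOND

# Half a real second at rate 2 reads as one warped second.
doubled = warp_ns(origin + ONE_SECOND // 2, origin, 2)

# Rate 1/2 plus a one hour jump into the future.
slowed = warp_ns(origin + 10 * ONE_SECOND, origin, Fraction(1, 2),
                 offset_ns=3600 * ONE_SECOND)
```

With rate 1 and offset 0 the function is the identity, so an unwarped process sees the real clock.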

Overall I think this is great work, and super happy you posted it. I'd love to
chat about it sometime.

(P.S. most ironic to me was that my version of this was called 'timelord' :)

~~~
AstralStorm
And the most foolproof way would be to run in a virtual machine or a prepared
container. Pretty fast too.

Having a clock cgroup would be easier and more useful than you'd think. Also,
you can play tricks like ntpd does in a container. (e.g. adjtime)

~~~
aray
This depends on the virtual machine or container!

Ironically, because the folks working on containers/VMs are _really_ good at
what they do, time access calls in particular have been really optimized (they
get called a lot). This makes it very hard to intercept time calls at this
layer! e.g. KVM and LXC both essentially hand time calls straight to the host.

This means time intercepts at the VM/container layer need fundamental support
(I mentioned affine time transformation in the linux kernel in another
comment), which doesn't work for people who need to deploy on current hosted
containers.

------
tyingq
There was a similar commercial implementation called "time machine" (I think)
that sold like hotcakes during the Y2K prep work... had versions for all the
various RISC vendors, Linux, etc.

Edit: Yep, still exists. Was $2000 per server back then. Wonder if the price
was some sort of inside joke... $2k to prepare for Y2K?
[https://www.cnet.com/news/new-tool-tests-for-y2k-compliance/](https://www.cnet.com/news/new-tool-tests-for-y2k-compliance/)

------
hoytech
There's also fluxcapacitor:

[https://github.com/majek/fluxcapacitor](https://github.com/majek/fluxcapacitor)

~~~
majke
fluxcapacitor author here.

Fluxcapacitor is focused on speeding up complex programs - most notably test
suites (that do fork/execve). The idea is to cheat on time, to allow testing
timeout-related branches in code. You can spin out a server and a client,
write a test that needs to wait 60 seconds for completion and see it pass in
0.6 seconds.
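
To illustrate the "cheat on time" idea only: the sketch below is an in-process fake clock in Python, where sleeping advances a counter instead of waiting. This is a different mechanism from fluxcapacitor, which works from the outside on unmodified programs. `VirtualClock` and `wait_for_reply` are hypothetical names invented for the example.

```python
class VirtualClock:
    """Clock whose sleep() jumps forward instead of blocking."""

    def __init__(self, start=0.0):
        self.now = start

    def monotonic(self):
        return self.now

    def sleep(self, seconds):
        # No real waiting: just move the virtual time forward.
        self.now += seconds


def wait_for_reply(clock, poll, timeout=60.0, interval=1.0):
    """Poll until poll() returns True or the timeout expires."""
    deadline = clock.monotonic() + timeout
    while clock.monotonic() < deadline:
        if poll():
            return True
        clock.sleep(interval)
    return False


# A peer that never answers: the 60 second timeout branch is taken,
# but the test itself finishes almost instantly.
clock = VirtualClock()
timed_out = not wait_for_reply(clock, lambda: False)
```

The code under test takes its clock as a parameter, which is the price of doing this in-process. Hooking the syscalls avoids that requirement.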

Tardis, on the other hand, seems single-threaded, which makes it useful for...
not really sure. I guess as a demo of how to use ptrace.

The problem with syscall interception with ptrace is that it doesn't work for
golang. Golang doesn't use libc. This means there is no way to hook into the
VDSO-based[1] syscalls. They are just a jump from userspace to a special
userspace memory region, so ptrace won't ever see them.

So, this approach, using ptrace, as used in tardis and fluxcapacitor will not
work for golang.

[1] [http://man7.org/linux/man-pages/man7/vdso.7.html](http://man7.org/linux/man-pages/man7/vdso.7.html)
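
The "special userspace memory region" shows up as `[vdso]` in /proc/<pid>/maps. A small Python sketch that pulls its address range out of maps-formatted text. `find_vdso` is a name chosen for the example and the sample addresses are invented.

```python
def find_vdso(maps_text):
    """Return (start, end) of the [vdso] mapping, or None if not listed."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vdso]":
            start, end = (int(part, 16) for part in fields[0].split("-"))
            return start, end
    return None


# Hypothetical excerpt in the /proc/<pid>/maps format.
SAMPLE_MAPS = """\
7ffd3a5d1000-7ffd3a5f2000 rw-p 00000000 00:00 0                          [stack]
7ffd3a5f2000-7ffd3a5f6000 r--p 00000000 00:00 0                          [vvar]
7ffd3a5f6000-7ffd3a5f8000 r-xp 00000000 00:00 0                          [vdso]
"""

vdso_range = find_vdso(SAMPLE_MAPS)
```

On a Linux box, `find_vdso(open("/proc/self/maps").read())` should report the range for the running interpreter.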

~~~
aray
Syscall interception works for _every_ program, it's just a matter of doing it
correctly.

VDSO is a small set of (3) calls which are not syscalls but direct calls (for
speed/efficiency). Our goal is to remove this functionality to force libs to
call through the (slower) syscall route instead.

I mention in another comment how EHDR censoring is needed for robust VDSO
removal.

I've not run into a libc where censoring EHDR breaks time calls (i.e. it
doesn't fall back to syscalls) but possibly golang has this.

In this case it's straightforward to set up a fake VDSO and then, instead of
EHDR censoring, you just replace it with your fake VDSO address and you're
golden!

------
amelius
It uses ptrace, and ptrace sucks under Linux because you can't use it
recursively (i.e., a process under ptrace can't ptrace another process).

~~~
jwilk
Does recursive ptrace work elsewhere?

------
wyldfire
I designed something similar for general fault injection [1] (and to learn
rust). There's no intercept written for time syscalls yet, but it's on the
issues list.

[1]
[https://github.com/androm3da/libfaultinj](https://github.com/androm3da/libfaultinj)

~~~
jwilk
How does it compare to libfiu?

[https://blitiri.com.ar/p/libfiu/](https://blitiri.com.ar/p/libfiu/)

~~~
wyldfire
It's nowhere near as feature complete or well-documented. :(

Oh well, it's an interesting undertaking anyways.

It looks to me like libfiu has a pretty clever way of generating the libc
wrappers from a config file.

------
throwaway_374
So... in English... for us mere mortals with limited Linux kernel exposure...
does this accelerate (or decelerate) the system clock, or does it mock-patch
the system time function calls?

------
partycoder
There are people who did this to cheat on our games. But using server time
exclusively can help mitigate this.

~~~
jfoutz
It's also used in virus and malware detection. Lots of stuff lays low for a
week or two to help hide the attack vector.

------
franze
cool name, sadly copyrighted
[http://tardis.wikia.com/wiki/Tardis:Copyrights](http://tardis.wikia.com/wiki/Tardis:Copyrights)

~~~
wyldfire
As pointed out elsewhere this page identifies a trademark on the term "TARDIS"
and not a copyright. Terms like "TARDIS" cannot be copyrighted.

But trademarks have limited scope and the standard test for infringement is
that it be "confusingly similar". Hilariously, BBC's TARDIS USPTO word mark
[1] includes "... computer software for use in database management; ...
computer, electronic and video games programs and equipment, namely,
software,"

[1] registration #4161487 --
[http://tmsearch.uspto.gov/bin/showfield?f=doc&state=4803:loe...](http://tmsearch.uspto.gov/bin/showfield?f=doc&state=4803:loemzg.2.2)

~~~
jwilk
The link doesn't work:

 _This search session has expired. Please start a search session again by
clicking on the TRADEMARK icon, if you wish to continue._

