
rr: lightweight recording and deterministic debugging - pmoriarty
http://rr-project.org/
======
Elv13
I can't thank Mozilla enough for funding the development of this.
This tool works wonderfully and is a real time saver when debugging
hard-to-reproduce issues, or issues that happen only in the Nth iteration of a
method call.

With rr, you just have to reproduce the problem once, then set a breakpoint at
a point back in time when the state was still good and reverse-continue to it.
That leaves a very narrow window to debug instead of hours of head scratching.

I wrote many GDB frontend extensions for my personal use (they may or may not
work for others and may be broken in Python 3; I am not a Python dev). This one
is very useful with rr. It logs `print` expressions at auto-generated
breakpoints, then writes them into a spreadsheet.
[https://gist.github.com/Elv13/92b98579e62f086cd9c12f44e510ca...](https://gist.github.com/Elv13/92b98579e62f086cd9c12f44e510cad0)

It's very useful with rr because you can modify the `print` columns as many
times as you like and ask rr to regenerate the spreadsheet without executing
the program again.

~~~
roca
FWIW Mozilla funded it until 2016 but since then Kyle Huey and I have been
maintaining it out of our own pockets while we work on our startup.

~~~
Elv13
And thanks for continuing to work on it ;)

After this problem sat on the GNU "high priority" list for a decade, it's nice
to see this thing exist. My comment was more about the initial "make it
happen" part. To me, `rr` seems like a really non-trivial project to get going
in the first place. Not a lot of orgs would have taken a risk on this.

~~~
roca
We deliberately chose a design that could be implemented with a very small
team. In fact there has never been more than about one person working full
time on rr, usually less. That's one reason we bet on not using code
instrumentation, for example.

But yes, Mozilla deserves major credit for supporting us building this crazy
thing, and of course, for releasing it.

------
mijoharas
For anyone that isn't aware, this is a brilliant tool. I just wish it
supported ARM hardware (as far as I remember there are some interrupts that
clobber state, so this can't be used. Please fill in the details if anyone
remembers).

~~~
pm215
The upstream issue with discussion about Arm support is
[https://github.com/mozilla/rr/issues/1373](https://github.com/mozilla/rr/issues/1373).
The underlying problem is that rr's design assumes that if you execute N
instructions you'll always deterministically end up in the same place. On
architectures which implement atomics via a load-linked/store-conditional loop
(including Alpha, Arm, MIPS, PPC, ...) this isn't true, because differences at
the OS level (e.g. timing of other interrupts or process scheduling) can cause
an LL/SC loop to go round more times. It's not clear how this could be
addressed, because the assumption is pretty deeply baked into rr's design.
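
To make the problem concrete, here is a toy Python model of an LL/SC increment loop. It is only a sketch: the function name, the `sc_fails_on` parameter and the per-iteration instruction costs are invented for illustration and are not taken from rr or from any real ISA. It shows the same atomic increment producing the same result but a different executed-instruction count, depending on how often the store-conditional was disturbed.

```python
def atomic_increment(memory, sc_fails_on):
    """Increment memory[0] with a simulated LL/SC retry loop.

    sc_fails_on is a set of 0-based attempt numbers at which the
    store-conditional fails, standing in for an interrupt or context
    switch that cleared the reservation (a hypothetical model).
    Returns the number of simulated instructions executed.
    """
    instructions = 0
    attempt = 0
    while True:
        value = memory[0]        # load-linked
        value += 1               # add
        instructions += 3        # count ll + add + sc
        succeeded = attempt not in sc_fails_on
        if succeeded:
            memory[0] = value    # store-conditional succeeded
        instructions += 1        # conditional branch back to the top
        attempt += 1
        if succeeded:
            return instructions


quiet, noisy = [0], [0]
quiet_count = atomic_increment(quiet, set())    # no interference
noisy_count = atomic_increment(noisy, {0, 1})   # first two SCs fail
print(quiet, quiet_count)
print(noisy, noisy_count)
```

Both runs leave the same value in memory, yet the counts differ, so "position N in the instruction stream" no longer identifies a unique program state.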

~~~
roca
CPU support for trapping on a failed LL/SC would suffice.

~~~
pm215
Do you have a sketch of how that would work? It seems plausible but I haven't
thought through the details. Issue 1373 suggests a perf counter of failed-SC
events and looking at "branches taken - failed_SC", and I'm definitely
sceptical that would be reliable.

~~~
roca
Sure, I just added it here:
[https://github.com/mozilla/rr/issues/1373#issuecomment-43627...](https://github.com/mozilla/rr/issues/1373#issuecomment-436270586)

------
dang
Discussed in 2014:
[https://news.ycombinator.com/item?id=8817954](https://news.ycombinator.com/item?id=8817954)

------
sddfd
This is a great help, I'm using it most of the time. After I got used to it,
plain gdb feels incomplete (mostly because rr allows you to reverse-step and
reverse-continue even with watch/breakpoints).

------
zurn
Is this language agnostic, supporting all GDB's languages, or is there a
specific set of languages that it supports?

(GDB supports e.g. Ada, Fortran and Rust)

edit: I had a look: the web page says "C/C++", there is some evidence on the
bug tracker of people using it with Rust. So my own quick peek was
inconclusive.

~~~
kibwen
Indeed, rr has worked with Rust since at least 2015
([http://huonw.github.io/blog/2015/10/rreverse-debugging/](http://huonw.github.io/blog/2015/10/rreverse-debugging/)),
and I see the author of rr around the Rust community enough (I believe he works
at Mozilla?) that I doubt this has regressed in the meantime.

~~~
rebelwebmaster
roc left Mozilla back in 2016.
[https://robert.ocallahan.org/2016/03/leaving-mozilla.html](https://robert.ocallahan.org/2016/03/leaving-mozilla.html)

------
de_watcher
I've got an option `--rr` on my tests that launches the program under rr, so
I can debug a failed test in both directions.

------
cntlzw
Can anyone explain how these tools work? How are they recording program
execution? Don't they need to keep track of every register, memory address and
such? Seems rather complicated.

~~~
roca
The basic idea is that since CPUs are deterministic, if you record and replay
all _inputs_ to a process you don't have to record what goes on inside the
process such as registers and memory.
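
A minimal Python sketch of that idea, not rr's actual mechanism: `program`, `record` and `replay` are hypothetical names, and a callable stands in for the system calls and signals a real process would receive. Only the values crossing the boundary are logged; the internal state (`total`) is never saved and is simply recomputed on replay.

```python
import random


def program(read_input):
    """A deterministic computation; its only nondeterminism is read_input()."""
    total = 0
    for _ in range(5):
        value = read_input()
        # Internal state is not logged; replay recomputes it.
        total = total * 31 + value
    return total


def record(source):
    """Run program(), logging every value the outside world hands it."""
    log = []

    def recording_input():
        value = source()
        log.append(value)
        return value

    return program(recording_input), log


def replay(log):
    """Re-run program(), feeding it the logged inputs instead of the real world."""
    logged = iter(log)
    return program(lambda: next(logged))


if __name__ == "__main__":
    rng = random.Random()  # unseeded: different inputs on every recording
    result, log = record(lambda: rng.randrange(1000))
    assert replay(log) == result
    print("recorded inputs:", log)
```

The log holds five integers no matter how much work `program` does internally, which is the reason recording inputs is so much cheaper than recording state.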

------
qalmakka
Does anyone know if this also supports LLDB? Or is it strictly tied to gdb? I
happen to slightly prefer LLDB these days (mainly because its `list`
command is much, much saner).

~~~
sanxiyn
rr is a gdb protocol server. By default, rr also launches a gdb client and
automatically connects it to the server, but you can use the -s PORT option to
run rr in server-only mode.

As I understand it, once rr is running in server-only mode, you can connect to
it from gdb with "target remote :PORT", or from LLDB with "gdb-remote :PORT". I
haven't tested this, but it should work as long as LLDB implements the gdb
protocol in a compatible manner.

------
lnyng
It reminds me of the ReVirt [1] paper I read in an advanced OS class (it is
actually mentioned in the slides). I didn't watch the complete talk. I wonder
how much it differs from ReVirt and other record-and-replay debugging tools.

[1]
[https://www.usenix.org/legacy/events/osdi02/tech/full_papers...](https://www.usenix.org/legacy/events/osdi02/tech/full_papers/dunlap/dunlap.pdf)

~~~
DSingularity
ReVirt logs non-deterministic inputs at the hypervisor level, whereas rr
records these inputs at the kernel level. The biggest difference is therefore
that rr cannot replay the execution of the OS itself.

~~~
MarkUndo
Interestingly, hypervisor-level logging of this stuff is both harder (because
it has the constraints of kernel-level code) and simpler (because the non-
deterministic behaviours are fewer and better-documented at the hardware level
than the Linux API level!)

I think it's very likely that, overall, recording a single process is
substantially more complex to implement than recording a whole VM. (With
significant caveats - recording a whole VM with good performance is going to
be _hard_ and making it really useful probably is a whole load of extra code)

------
aargh_aargh
The site is almost unreadable on mobile. Font too thin, doesn't scale and most
importantly, the contrast is way too low.

------
justinclift
In theory (!), GoLand recently added support for reverse debugging of Go code
via rr:

[https://youtrack.jetbrains.com/issue/GO-3831](https://youtrack.jetbrains.com/issue/GO-3831)

Haven't tried it out yet myself, but I'd expect it to be at least functional.
:)

------
entelechy
How does that compare to undo.io ?

~~~
andrey_utkin
Undo has some features rr doesn't have. It supports shared memory operations.
It works in virtual machines and "in the cloud". It imposes less strict
requirements on kernel or CPU features; for example, it works on AMD.
Fundamentally, what distinguishes UndoDB and Undo Live Recorder from rr is an
architecture based on machine code instrumentation.

~~~
MarkUndo
Actually, I _think_ rr supports shared memory operations under some
circumstances... (I'd love to have my beliefs confirmed / corrected by
somebody more knowledgeable)

My understanding is that rr has some handling for read-only shared memory and
for arbitrary sharing within a tree of recorded processes.

Undo's shared memory is different because it doesn't need the other process to
be recorded, so you can do read/write sharing with arbitrary processes or
devices.

(disclaimer: current Undo engineer)

~~~
roca
You're correct.

------
xvilka
There is also a low-level tool, disassembler and debugger, radare2 [1]. It
can also record a session, in both debug and emulation modes, and replay
it [2].

[1] [https://github.com/radare/radare2](https://github.com/radare/radare2)

[2]
[https://radare.gitbooks.io/radare2book/content/debugger/revd...](https://radare.gitbooks.io/radare2book/content/debugger/revdebug.html)

~~~
roca
radare2 lets you take snapshots of memory and registers and restore them, but
it doesn't let you, say, record the entire execution of multi-process Firefox
from startup to shutdown and later replay that perfectly. It's not capturing
the effects of the environment that you need to make that work.

------
nialv7
A very useful tool that has saved me a lot of trouble. Sadly I can't use it
anymore since I switched to a Ryzen CPU.

~~~
pmoriarty
What makes it unusable on Ryzen?

~~~
roca
Ryzen's performance counters aren't quite accurate enough for rr to use.
[https://github.com/mozilla/rr/issues/2034](https://github.com/mozilla/rr/issues/2034)

I hope that AMD will fix this one day.

------
chappar
Last time I checked, rr did not support multi-threaded applications well.
Has that changed now?

~~~
roca
rr supports multithreaded applications, but only uses a single core. So
parallel applications slow down.

~~~
scott_s
Being slower is not, I think, the major downside. It is that an entire class
of errors, race conditions, is basically outside the scope of the tool.
Which is understandable! Race conditions are hard, and when I read about the
tool, my first thought was "How are they handling race conditions?" and it
turns out, essentially, they're not. But race conditions are also the hardest
part about debugging multithreaded applications.

I'm not sure if the tool ensures deterministic scheduling of threads on the
single core, but I doubt that it does. If it does not, then replays will not
be deterministic, which means you could encounter different race condition
outcomes on each replay. If it does, then while you may have deterministic
replay, the tool is unlikely to help with the class of race conditions that
require simultaneous execution.

To be clear: I'm not criticizing the tool or the work of the people. If I were
to design such a tool, I would probably start with a single core as well. It
seems like a valuable tool and great progress for software debugging. But I do
think race conditions in multithreaded programs are a current limitation.

edit: The technical report says that they deterministically schedule threads
([https://arxiv.org/pdf/1705.05937.pdf](https://arxiv.org/pdf/1705.05937.pdf)):

 _" RR preemptively schedules these threads, so context switch timing is
nondeterminism that must be recorded. Data race bugs can still be observed if
a context switch occurs at the right point in the execution (though bugs due
to weak memory models cannot be observed)."_

The "weak memory model" part means it won't help with, say, debugging
lock-free algorithms where you get the memory-ordering semantics wrong.

~~~
roca
You should read
[https://arxiv.org/abs/1705.05937](https://arxiv.org/abs/1705.05937) so you
don't need to speculate. rr absolutely does guarantee that threads are
scheduled the same way during replay as during recording, otherwise it
wouldn't work at all on applications like Firefox which use a lot of threads.

Also, rr definitely is very useful for debugging race conditions. For example
Mozilla developers have debugged lots of race conditions using it. One thing
that really helps is rr's "chaos mode", which randomizes thread scheduling in
an intelligent way to discover possible races. See
[https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mo...](https://robert.ocallahan.org/2016/02/introducing-rr-chaos-mode.html)
and
[https://robert.ocallahan.org/2016/02/deeper-into-chaos.html](https://robert.ocallahan.org/2016/02/deeper-into-chaos.html)
and
[https://robert.ocallahan.org/2018/05/rr-chaos-mode-improveme...](https://robert.ocallahan.org/2018/05/rr-chaos-mode-improvements.html).
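
As a toy illustration of why recording the schedule is enough to reproduce a data race on one core, here is a Python sketch. It is not how rr or its chaos mode is implemented: generators stand in for threads, each `yield` is a possible preemption point, and the uniformly random scheduler and the names `worker`, `run`, `record` and `replay` are assumptions made for the example.

```python
import random


def worker(shared, rounds):
    """Non-atomic increment: a switch between the read and the write loses updates."""
    for _ in range(rounds):
        tmp = shared["counter"]        # read
        yield                          # possible preemption point
        shared["counter"] = tmp + 1    # write
        yield


def run(choose, n_threads=2, rounds=3):
    """Run the workers on one simulated core.

    choose(k) picks which of the k runnable threads runs next.
    Returns the final counter and the list of scheduling decisions.
    """
    shared = {"counter": 0}
    runnable = [worker(shared, rounds) for _ in range(n_threads)]
    schedule = []
    while runnable:
        index = choose(len(runnable))
        schedule.append(index)
        try:
            next(runnable[index])
        except StopIteration:
            del runnable[index]
    return shared["counter"], schedule


def record(seed):
    """Randomised scheduling; the decisions made are the recording."""
    rng = random.Random(seed)
    return run(lambda k: rng.randrange(k))


def replay(schedule):
    """Re-run with exactly the recorded scheduling decisions."""
    decisions = iter(schedule)
    return run(lambda k: next(decisions))[0]


if __name__ == "__main__":
    for seed in range(100):
        counter, schedule = record(seed)
        if counter != 6:  # 2 threads x 3 rounds should give 6 without the race
            print("seed", seed, "lost updates: counter =", counter)
            assert replay(schedule) == counter  # the bug reproduces every time
            break
```

Trying many randomised schedules is what exposes the lost update; replaying the recorded decisions is what makes it debuggable afterwards. Bugs that depend on weak memory ordering are out of reach of a model like this, because only one thread ever runs at a time.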

~~~
scott_s
Very cool stuff! And yes, I took a look at the paper, as I noted in my edit.
But I think there are still two classes of race conditions outside of its
scope: ones that require simultaneous execution (where you can get surprising
interleavings) and lock-free algorithms where correct use of the memory model
is paramount. In my personal experience, these are the hardest problems to
debug.

~~~
codehog
Even those are probably not 100% outside of its scope. I forget the details of
chaos mode, but that kind of induced thread-switching can cause just the kind
of interleaving you seem to be talking about.

What rr cannot capture is a very small subclass of race conditions involving
things like cache line misses - I think that's what you're alluding to by
"correct use of the memory model is paramount" but it's a subclass even of
those. Yes, those are hugely difficult to diagnose and it would be fantastic
if tools like rr or UndoDB could capture them. But there's a vast swathe of
also very difficult race conditions that this recording tech can and does help
with today.

