Who Is Debugging the Debuggers? Exposing Debug Bugs in Optimized Binaries (arxiv.org)
98 points by matt_d on Dec 3, 2020 | 22 comments



I wonder how much this will help with a very common experience:

    (gdb) p x
    $3 = <value optimized out>
    (gdb) p $ecx
    $4 = 1234
"What do you mean, value is optimized out!?!? It's sitting right there in the bloody register!"

I've long since grown accustomed to mentally decompiling the Asm and finding the correspondence to the source from that instead.
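
For anyone who hasn't run into it, here's a minimal repro sketch (illustrative only; whether it triggers depends on the compiler and version). Build with `gcc -O2 -g`, break on the return, and print x:

    /* demo.c -- build: gcc -O2 -g demo.c -o demo */
    #include <stdio.h>

    int compute(int n)
    {
        int x = n * 41 + 1;   /* x may live only briefly in a register */
        printf("%d\n", x);    /* last use of x */
        return n;             /* here the register may already be reused,
                                 so `p x` can print <value optimized out> */
    }

    int main(void)
    {
        return compute(30) == 30 ? 0 : 1;
    }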


> "What do you mean, value is optimized out!?!? It's sitting right there in the bloody register!"

It's the biggest problem I have with debuggers. Interestingly, with rr you can usually reverse-stepi a little and get the value from a slightly earlier time, where its location was known to the debug info. A life saver, although depending on what the code does to the value, that might not help entirely. If only rr were available on all platforms... I guess TTD can help similarly on Windows.
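
For the unfamiliar, the rr workflow looks roughly like this (an illustrative session, not real output; `rr replay` drops you into gdb against the recording):

    $ rr record ./demo
    $ rr replay
    (rr) p x
    $1 = <optimized out>
    (rr) reverse-stepi
    (rr) p x
    $2 = 1231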


WinDBG Preview's TTD works really well; it's a solid alternative to rr... if you can get past WinDbg's terrible UI/UX and piss-poor discoverability.

I really hope someone makes the TTD stuff available from vscode (or even better, from IDA) so I can use the awesome core functionality without the terrible frontend.


The curious thing is that it usually isn't a bug in the debugger itself. It's the optimizer that doesn't leave proper instructions about where "x" is at that point.


This specific paper won't. It detects incorrect debug information, and it considers loss of debug information to be correct.


It's technically possible to give perfect debug information even with a highly optimized binary. In a typical Linux/BSD ELF file, the debug information is stored as a DWARF program that effectively emulates the state of all the registers in the target machine; that program gets executed from the saved frame state of the current function to produce the state at the current breakpoint, complete with backpointers into the source. To track all information properly would require much, much more state to be preserved across optimization passes and saved with each executable. The end result would effectively embed the compiler's internal state inside each and every target executable. The cost, of course, would be massive (terabyte) executables that take minutes to execute every step in the debugger, and compile times orders of magnitude greater than they are today.

The current assumption is that the programmer knows enough about what they are doing and how their tools work under the hood that they can live with the current limitations when they try to debug non-debug builds.
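
To make the "DWARF program" part concrete: both unwind information and variable locations are encoded as opcode sequences for a simple stack machine that the debugger interprets at the breakpoint. A toy evaluator for a few of the opcodes might look like this (the opcode values match the DWARF 5 spec; everything else is an illustrative sketch, not a real consumer):

    /* Toy DWARF expression evaluator -- a sketch, not a real consumer. */
    #include <stddef.h>
    #include <stdint.h>

    enum { DW_OP_lit0 = 0x30, DW_OP_breg0 = 0x70, DW_OP_plus = 0x22 };

    /* regs[] holds the register file recovered at the breakpoint. */
    uint64_t dwarf_eval(const uint8_t *ops, size_t n, const uint64_t *regs)
    {
        uint64_t stack[64];
        int sp = 0;
        for (size_t i = 0; i < n; i++) {
            uint8_t op = ops[i];
            if (op >= DW_OP_lit0 && op < DW_OP_lit0 + 32) {
                stack[sp++] = op - DW_OP_lit0;        /* push a small constant */
            } else if (op >= DW_OP_breg0 && op < DW_OP_breg0 + 32) {
                int64_t off = (int8_t)ops[++i];       /* real DWARF uses SLEB128 here */
                stack[sp++] = regs[op - DW_OP_breg0] + off;  /* register + offset */
            } else if (op == DW_OP_plus) {
                uint64_t b = stack[--sp];             /* pop two, push the sum */
                stack[sp - 1] += b;
            }
        }
        return stack[sp - 1];   /* top of stack: the variable's location */
    }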


There is a big disconnect between all the stuff compilers shovel into DWARF and the parts of it that gdb actually makes active use of, or even supports to begin with.

DWARF is the perfect case for why it's often better to just have an intertwined compiler & debugger implementation and information format instead of the current tragedy.


Generally, when you see <optimized out> it really is because the compiler didn't emit a location entry for that variable at that point in the program. There's plenty of stuff in DWARF that gdb et al. don't make much use of, but variable locations aren't part of that.
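
You can see those gaps directly by dumping the binary's debug info (e.g. with `llvm-dwarfdump`). A variable's location list looks something like this (the addresses and registers here are invented for illustration):

    DW_TAG_variable
      DW_AT_location  ([0x401130, 0x40113a): DW_OP_reg2 RCX
                       [0x40113a, 0x401150): DW_OP_breg7 RSP-8)
      DW_AT_name      ("x")

Any program counter outside the listed ranges simply has no entry, and that absence is what gdb renders as <optimized out>.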


There is a recent proposal in the LLVM community trying to use "extended lifetime" to solve this problem.


Dear HN community,

I am one of the paper's authors; the paper has been accepted for presentation at ASPLOS 2021.

If you have any questions, let me know!


Thanks for finding all those bugs! I see the BI / SI / LI invariants you define look for program states that aren't present in the unoptimised program, while PI looks for the absence of information that is in the unoptimised code. Do you think it will be possible to define and search for invariants that involve program states that are in the unoptimised program but are presented in the wrong way in the optimised program? For example, variable values being presented at the wrong time, or stepping behaviour that misleads developers.

It'd be great to hunt those kinds of bugs; however, it's hard to define what the "right" behaviour would be in those circumstances.


Hi!

We are able to identify some kinds of mis-stepping behaviour (see Section 7.6). However, as you can imagine, we cannot catch all cases.

As you pointed out, a real problem is that there is no clear definition of what the semantics of debug information in an optimized binary should be (e.g., the compiler is free to squish several lines of source code into a smaller snippet of assembly).
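
A trivial made-up example of that squishing: three source lines can collapse into two or three instructions, so the line table must map them many-to-one, and single-stepping can appear to skip or revisit lines.

    /* At -O2 a compiler may emit something like
     *     lea  eax, [rdi+rsi]
     *     add  eax, edx
     *     ret
     * for this whole function, leaving the line table free to
     * attribute each instruction to whichever source line it likes. */
    int sum3(int a, int b, int c)
    {
        int t = a + b;   /* line A */
        t += c;          /* line B */
        return t;        /* line C */
    }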

We are working to produce more refined results in the near future.


What are your future plans for your framework? Will you spin it out into a startup? Will it only be used for writing future papers? Will it be released under an open source license?


Hi!

We plan to use the framework to further investigate the correctness of debug information. Personally, I am not interested in startup or commercial opportunities.

Releasing the software is a step we are discussing.


Have you done any analysis of the debug symbols provided by Linux distros like Debian or Fedora?


Nice work, thanks for doing this.

Is there anybody working on evolving DWARF in the ways you suggest in the paper?


Many thanks for your interest!

We are not working on DWARF itself. However, we hope that the paper will spur a public discussion of these problems, helping the entire community move toward a more expressive standard.


> We have used \n to find 23 bugs in the LLVM toolchain

For a second here I thought they had called their tool "\n". That would be pretty baller...


Hehe, and then incorporate a company called "rm -rf /"


Many thanks for catching the typos! I will update the arXiv version to fix them.


Very cool. Promising stuff. Is the Debug^2 framework publicly available? Any chance it will find its way into the test suites of the debuggers?

> Our framework feeds random source programs to the target toolchain and surgically compares the debugging behavior of their optimized/unoptimized binary variants.

I thought this had the flavour of John Regehr's work, and sure enough, they're using C-Reduce.

An interesting somewhat related paper about automatically exposing bugs in C compilers, by Regehr et al: Finding and Understanding Bugs in C Compilers, https://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf
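
The core idea from the quoted sentence is easy to approximate at home (a rough sketch of my own, not the authors' actual harness): drive gdb in batch mode over the -O0 and -O2 builds of the same generated program and compare what the debugger reports.

    # rough sketch only; the paper's comparison is far more surgical than a diff
    gcc -O0 -g random.c -o prog.O0
    gcc -O2 -g random.c -o prog.O2
    for b in prog.O0 prog.O2; do
        gdb -batch -x trace.gdb ./$b > $b.log   # trace.gdb: break/step/info locals
    done
    diff prog.O0.log prog.O2.log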

edit: I missed the author's comment in this thread that "Releasing the software is a step we are discussing". Please do release the framework, ideally under a standard Free and Open Source licence. As we've just seen with C-Reduce, this is a helpful thing to do.


> When debugging issues — sometimes caused by “heisenbugs”

IME "Heisenbugs" are specifically bugs whose behavior changes when you attach a debugger (or other instrumentation). It's a joking reference to the Heisenberg uncertainty principle of quantum mechanics. The more you try to measure one property of the system, the less you can know about some other related property. They're usually race conditions or memory corruption where the instrumentation disrupts the timing or order of operations just enough to make the bug go away.
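
The canonical toy example (mine, not from the paper) is an unsynchronized counter: run at full speed it loses updates, but attach a debugger or sprinkle in logging and the interleaving often changes enough for it to "work":

    /* race.c -- build with: gcc -O2 -pthread race.c -o race
     * Typically prints less than 2000000; under a debugger the
     * serialization often hides the lost updates.  `volatile` is only
     * here to force a load/store per iteration; it does NOT fix the race. */
    #include <pthread.h>
    #include <stdio.h>

    static volatile long counter;        /* shared, unsynchronized */

    static void *bump(void *arg)
    {
        (void)arg;
        for (int i = 0; i < 1000000; i++)
            counter++;                   /* racy read-modify-write */
        return NULL;
    }

    int main(void)
    {
        pthread_t a, b;
        pthread_create(&a, NULL, bump, NULL);
        pthread_create(&b, NULL, bump, NULL);
        pthread_join(a, NULL);
        pthread_join(b, NULL);
        printf("%ld\n", counter);
        return 0;
    }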



