Hacker News new | past | comments | ask | show | jobs | submit login
How we use binary search to find compiler bugs (bernsteinbear.com)
78 points by tekknolagi on Oct 21, 2022 | hide | past | favorite | 16 comments



Check out the Delta Debugging paper. There has been a ton of follow-on work (1000+ direct citations), and lots of tools exist.

Here's the citation, abstract and link to a PDF copy:

Simplifying and isolating failure-inducing input Andreas Zeller, Ralf Hildebrandt IEEE Transactions on Software Engineering 28 (2), 183-200, 2002

Given some test case, a program fails. Which circumstances of the test case are responsible for the particular failure? The delta debugging algorithm generalizes and simplifies the failing test case to a minimal test case that still produces the failure. It also isolates the difference between a passing and a failing test case. In a case study, the Mozilla Web browser crashed after 95 user actions. Our prototype implementation automatically simplified the input to three relevant user actions. Likewise, it simplified 896 lines of HTML to the single line that caused the failure. The case study required 139 automated test runs or 35 minutes on a 500 MHz PC.

http://cs.purdue.edu/homes/xyzhang/fall07/Papers/delta-debug...

Edit: adjusted tone


An even more fine-grained binary search technique for isolating compiler bugs is optimization fuel:

http://blog.ezyang.com/2011/06/debugging-compilers-with-opti...


https://dlang.org/blog/2020/04/13/dustmite-the-general-purpo...

Dustmite (say thank you to cybershadow) is the D community's beloved test case reduction tool


JReduce (https://dl.acm.org/doi/abs/10.1145/3453483.3454091) uses similar technique but with semantics for debugging such problems with Java decompiler and bytecodes.

The problem is we don't know if compiling with these functions follows the monotonicity of binary searching: what if a problem is caused by a special combination of two functions? You have to arrange these two functions in the right order in order to find the problem.


Binary search is also helpful in code that doesn’t reliably produce useful stack traces (side eyes Promise.then). Comment out half of the potentially implicated code. If you get the same error, you very likely haven’t found a stack frame. If you get a different error or no error at all, keep halving until you either isolate the thing or know where to look.



> You may have heard of bisecting from geometry or from git bisect1. Those are the two places I heard about it

I heard of it from when I first started programming and I cut the code in half (maybe 3rds or other units, close enough) until I found which part was breaking the program.


bugpoint is the equivalent for llvm: https://www.llvm.org/docs/CommandGuide/bugpoint.html


I have wondered how to extend this to combinations of elements.

What if the bad outcome only happens when two or more elements are enabled?

Is there a generalization of binary search for powersets?

Can this be done efficiently?


If I understand your problem correctly, you could iterate through the elements disabling one at a time and checking whether the bad outcome still occurs.

Assuming this works, it should be optimal: if you’re only getting a binary outcome at each step, you’ll never do better than lg(2^n) = n.


You might want to look at Hypothesis, a testing library that can also randomly generate test cases and reduce examples to a more minimal case. It supports many data types and "strategies".


This does work for combinations -- often I get JIT lists with two functions on it and the issue has to do with something in the call path itself.


Is this a fuzzing?


Oh, good point, I should probably mention fuzzing in similar work. Since we're not modifying the input data or input program, I might not count it as fuzzing. ¯\_(ツ)_/¯


cvise is another tool for test case reduction, similar to creduce mentioned in the article:

https://github.com/marxin/cvise


Cool! Thanks for sharing.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: