Check out the Delta Debugging paper. There has been a ton of follow-on work (1000+ direct citations), and lots of tools exist.
Here's the citation, abstract and link to a PDF copy:
Simplifying and isolating failure-inducing input
Andreas Zeller, Ralf Hildebrandt
IEEE Transactions on Software Engineering 28 (2), 183-200, 2002
Given some test case, a program fails. Which circumstances of the test case are responsible for the particular failure? The delta debugging algorithm generalizes and simplifies the failing test case to a minimal test case that still produces the failure. It also isolates the difference between a passing and a failing test case. In a case study, the Mozilla Web browser crashed after 95 user actions. Our prototype implementation automatically simplified the input to three relevant user actions. Likewise, it simplified 896 lines of HTML to the single line that caused the failure. The case study required 139 automated test runs or 35 minutes on a 500 MHz PC.
http://cs.purdue.edu/homes/xyzhang/fall07/Papers/delta-debug...
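For a feel of the algorithm, here is a compact Python sketch of the ddmin idea (simplified: it only tries removing chunks, while the paper's algorithm also tests subsets and has more caveats):

```python
def ddmin(fails, inp):
    """Shrink `inp` to a smaller input that still fails.

    `fails(candidate)` is assumed to rerun the test and return True
    when the failure still occurs. `inp` is any sliceable sequence:
    a string, a list of lines, a list of user actions, ...
    """
    assert fails(inp), "must start from a failing input"
    n = 2  # granularity: how many chunks to split the input into
    while len(inp) >= 2:
        chunk = len(inp) // n
        reduced = False
        for start in range(0, len(inp), chunk):
            # Remove one chunk and test the rest (the complement).
            complement = inp[:start] + inp[start + chunk:]
            if fails(complement):
                inp = complement          # smaller input, still failing
                n = max(n - 1, 2)
                reduced = True
                break
        if not reduced:
            if n >= len(inp):
                break                     # already removing single elements
            n = min(n * 2, len(inp))      # refine granularity and retry
    return inp
```

A loop like this is how the case study's 896 lines of HTML came down to a single line in 139 automated runs.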
The problem is that we don't know whether compiling with these functions has the monotonicity that binary search relies on: what if the failure is caused by a specific combination of two functions? You'd have to arrange for both of them to land in the same tested half to find the problem.
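To make that concrete, here's a toy illustration (the fails predicate is made up) of plain halving getting stuck when the failure needs two functions that sit in different halves:

```python
# Hypothetical: the build only fails when BOTH f3 and f7 are compiled in.
def fails(enabled):
    return "f3" in enabled and "f7" in enabled

funcs = [f"f{i}" for i in range(10)]
left, right = funcs[:5], funcs[5:]

print(fails(funcs))  # True  -- the full set fails
print(fails(left))   # False -- f7 is missing
print(fails(right))  # False -- f3 is missing
# Neither half fails on its own, so naive halving has nowhere to recurse.
```

This is the case the complement tests and increasing granularity in the ddmin sketch above are designed to handle.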
Binary search is also helpful in code that doesn’t reliably produce useful stack traces (side eyes Promise.then). Comment out half of the potentially implicated code. If you get the same error, the half you removed very likely isn’t the culprit. If you get a different error, or no error at all, the culprit was in what you removed. Keep halving until you either isolate the thing or know where to look.
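When the suspect code can be carved into independent steps, the same halving can be scripted. A minimal sketch, where reproduces(steps) is a hypothetical helper that reruns the program with only those steps enabled:

```python
def bisect_steps(steps, reproduces):
    # Keep whichever half still reproduces the error, then halve again.
    while len(steps) > 1:
        mid = len(steps) // 2
        first, second = steps[:mid], steps[mid:]
        if reproduces(first):
            steps = first
        elif reproduces(second):
            steps = second
        else:
            # Neither half fails alone: the two-ingredient case from the
            # comment above; fall back to something like ddmin.
            break
    return steps
```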
> You may have heard of bisecting from geometry or from git bisect. Those are the two places I heard about it
I first heard of it when I started programming: I'd cut the code in half (maybe thirds or other units, close enough) until I found which part was breaking the program.
If I understand your problem correctly, you could iterate through the elements disabling one at a time and checking whether the bad outcome still occurs.
Assuming this works, it should be optimal: if you’re only getting a binary outcome at each step, you’ll never do better than lg(2^n) = n.
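A minimal sketch of that loop, with bad_outcome(subset) as a hypothetical helper that reruns the system with only subset enabled:

```python
def find_culprits(elements, bad_outcome):
    # n elements -> n reruns, one per disabled element.
    culprits = []
    for i, elem in enumerate(elements):
        rest = elements[:i] + elements[i + 1:]
        if not bad_outcome(rest):
            culprits.append(elem)  # disabling `elem` made the bad outcome vanish
    return culprits
```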
You might want to look at Hypothesis, a Python property-based testing library that can also randomly generate test cases and shrink failing examples to a minimal case. It supports many data types and "strategies".
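For instance, with a deliberately broken function (buggy_sort here is a toy for illustration), Hypothesis both finds a counterexample and shrinks it:

```python
from hypothesis import given, strategies as st

def buggy_sort(xs):
    # Deliberately broken: going through set() drops duplicates.
    return sorted(set(xs))

@given(st.lists(st.integers()))
def test_sort_preserves_length(xs):
    assert len(buggy_sort(xs)) == len(xs)

test_sort_preserves_length()  # fails with a shrunk example like xs=[0, 0]
```

The shrinking step is conceptually the same minimization that delta debugging does, applied to generated inputs.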
Oh, good point, I should probably mention fuzzing in similar work. Since we're not modifying the input data or input program, I might not count it as fuzzing. ¯\_(ツ)_/¯