

Ask HN: What to do when facing a bootstrapping issue while integration testing? - arunaugustine

The abstract question I had:<p>What to do when faced with this situation: I need to test a piece of software for logical correctness. But the computation is so complex that in order to generate the test cases and expected results, I need the very program I am building.<p>Real-life situation: I wrote an AI program for a class that plays the Nine Men's Morris game. It implements both the MINIMAX algorithm and ALPHA-BETA pruning in order to find the best move given a current board position. How do I test if my implementation is correct? How do I generate the test cases, given that it is very tedious (maybe it isn't, but if it is) to work out game trees by hand?<p>PS: I am a beginner programmer and a CS student learning from my mistakes. One of my past failed projects taught me the importance of unit testing. I am just trying to figure out what my current project is trying to teach me. So I apologize in advance if this is a totally newbie question.
======
jonjacky
I have faced somewhat similar problems, testing complicated physics
calculations. A few strategies:

\- There are special cases where it is feasible to do a calculation 'by hand'
(actually with a calculator or a much simpler program). The complicated
program should get (nearly) the same answer as the hand calculation in those
special cases.

\- Any solution should have certain properties which it is feasible to check
'by hand'. For example in physics there are often conserved quantities (energy
etc.). Or the output should change as a known function of one of the inputs,
if you have a series of test cases where only that one input varies.

\- Instrument the program. Write assertions that check intermediate results.
Each intermediate result should be a known function of preceding intermediate
results - write code which checks that.

That last option is similar to the formal verification recommended in another
answer. In full-blown formal verification you try to write a proof that each
intermediate result is the intended function of the preceding intermediate
results, for all allowed inputs -- and then you produce a chain of such proofs
from the input to the output. It is often much easier to just write code that
checks the intermediate results.
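One such fixed property for this exact project: alpha-beta pruning is only an optimization, so it must return the same root value as plain minimax on every tree. A minimal differential check on random trees (an illustrative sketch, assuming game trees encoded as nested lists with integer leaves):

```python
import random

def minimax(node, maximizing):
    """Plain minimax over a tree given as nested lists with int leaves."""
    if isinstance(node, int):
        return node
    scores = [minimax(child, not maximizing) for child in node]
    return max(scores) if maximizing else min(scores)

def alphabeta(node, maximizing, alpha=float("-inf"), beta=float("inf")):
    """Alpha-beta pruning; must agree with minimax at the root."""
    if isinstance(node, int):
        return node
    best = float("-inf") if maximizing else float("inf")
    for child in node:
        val = alphabeta(child, not maximizing, alpha, beta)
        if maximizing:
            best, alpha = max(best, val), max(alpha, val)
        else:
            best, beta = min(best, val), min(beta, val)
        if alpha >= beta:
            break  # remaining siblings cannot change the result
    return best

def random_tree(depth, branching, rng):
    if depth == 0:
        return rng.randint(-100, 100)
    return [random_tree(depth - 1, branching, rng) for _ in range(branching)]

# Seeded RNG so failures are reproducible.
rng = random.Random(42)
for _ in range(200):
    tree = random_tree(depth=4, branching=3, rng=rng)
    assert minimax(tree, True) == alphabeta(tree, True)
```

Neither function needs to be trusted individually for this check to be useful: if they disagree on any input, at least one of them is wrong.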

~~~
arunaugustine
Yes! These strategies make sense. \- I could, for example, run the game tree
algorithms on a tic-tac-toe game and test them there, instead of on Nine Men's
Morris, which is more complicated. I never thought of that!
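That tic-tac-toe idea gives a ready-made oracle: the perfect-play outcome of tic-tac-toe is a known draw, so a correct minimax must score the empty board as 0. A hedged sketch (the board encoding here is invented for illustration, not taken from the original program):

```python
from functools import lru_cache

# Board is a 9-tuple of '', 'X', 'O'; 'X' is the maximizing player.
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
             (0, 3, 6), (1, 4, 7), (2, 5, 8),
             (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in WIN_LINES:
        if board[a] and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)  # memoize on the (hashable) board tuple
def minimax(board, player):
    w = winner(board)
    if w:
        return 1 if w == 'X' else -1
    moves = [i for i, cell in enumerate(board) if not cell]
    if not moves:
        return 0  # board full, no winner: draw
    scores = [minimax(board[:i] + (player,) + board[i + 1:],
                      'O' if player == 'X' else 'X') for i in moves]
    return max(scores) if player == 'X' else min(scores)

# Known result: perfect play from both sides always draws.
assert minimax(('',) * 9, 'X') == 0
# Hand-crafted case: X to move with two in a row can win immediately.
assert minimax(('X', 'X', '', 'O', 'O', '', '', '', ''), 'X') == 1
```

Once the search code passes on tic-tac-toe, the same generic minimax/alpha-beta core can be pointed at the Nine Men's Morris move generator with more confidence.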

\- I wonder if there is a systematic procedure that can enumerate the fixed
properties of an algorithm. I guess this is the hard part in formal
verification theory.

------
micheljansen
Well, to throw in a wildly out-of-context Einstein quote: "We cannot solve
problems by using the same kind of thinking we used when we created them."

What I'm trying to say: you cannot generate a set of test problems using the
program you are trying to test, unless you know the program is correct, which
is probably why you want to test it to begin with (unless you are just writing
tests to prevent regressions). On a side note: if you ever find yourself
working on machine learning, you'll find that the same goes for rigorously
separating your training set from your benchmark/test set.

Formal verification of software is hard. It's a huge field of research. If you
really want to prove your algorithm is correct (not your implementation of
it!), you're probably better off going the mathematical or formal modeling
route. If you want to test your implementation, maybe you don't need to test
every single case. A set of carefully hand-crafted test cases might go a long
way.

~~~
arunaugustine
Thanks for your reply. Yes, my question pertains to testing my implementation
of an algorithm, when hand-crafting the test case is hard to do.

------
agibsonccc
Is there any way you can preset a search path in your algorithm? Since these
are heuristics-based, it wouldn't be a bad idea to walk a finite subset of the
game tree for each side and ensure the "best decisions" are made. You could
also test subsets of the algorithm's nodes to confirm they are pruned
properly. My background is mainly in NLP, so I have only basic knowledge of
game playing, but bootstrapping manually can be, and usually is, the best
route.

Usually tests are deterministic, but when you start throwing probabilities
into the mix, a max likelihood/score for anything usually comes down to a
hand-crafted situation.

~~~
arunaugustine
Yes, compared to formal verification etc., handcrafting seems much easier to
do for most algorithm implementations.

