And how do you do this in practice? I am struggling to think of a good way to keep the production code that fails the test alongside the production code that passes it. I could have the test check out an old version of the production code, compile it, and run against that, but that is hard to get right.
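To make the "check out an old version" idea concrete, here is a rough sketch of what I have in mind, in Python. It assumes the project lives in a git repository and that the test suite finds the production code through an environment variable, which I am calling `PROD_SRC`. That variable, and the helper names `worktree_add_cmd` and `passes_against`, are made up for illustration. A compile step, if the project needs one, would go between the checkout and the test run.

```python
import os
import subprocess
import tempfile
from pathlib import Path


def worktree_add_cmd(revision, dest):
    """Build the git command that checks out `revision` into a new directory `dest`."""
    return ["git", "worktree", "add", "--detach", str(dest), revision]


def passes_against(revision, test_cmd, repo="."):
    """Run the current tests against the production code as of `revision`.

    Assumption: the tests locate production code via the hypothetical
    PROD_SRC environment variable. Returns True if `test_cmd` exits with 0.
    """
    repo = Path(repo).resolve()
    with tempfile.TemporaryDirectory() as tmp:
        old_src = Path(tmp) / "old"
        # Check the old revision out next to the working tree, without touching it.
        subprocess.run(worktree_add_cmd(revision, old_src), cwd=repo, check=True)
        try:
            env = dict(os.environ, PROD_SRC=str(old_src))
            # The tests come from the current working tree; only the code under test is old.
            return subprocess.run(test_cmd, cwd=repo, env=env).returncode == 0
        finally:
            # Unregister the temporary worktree even if the test run raised.
            subprocess.run(
                ["git", "worktree", "remove", "--force", str(old_src)],
                cwd=repo,
                check=True,
            )
```

The check I would then want is that `passes_against(old_revision, test_cmd)` is false while the same test command passes on the current working tree. Even in this small form there is a fair amount of plumbing, which is the part I find hard to get right.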