In some cases, yes. In others, you can only manage that on a micro-level.
An example that comes to mind: at some point I was working on an optimized paintbrush tool that was meant to give results perceptually similar to drawing the brush image repeatedly along a path. I replaced the individual draws with a convolution, but there was no way to make this correspond exactly to the "over" compositing operator that it was replacing. So I used a post-processing step to adjust the result.
In this case, the divergence from the reference implementation was so great that you couldn't really compare the two. But the results looked good, and the performance gains were tremendous. The only way to compare them would have been to create a new reference implementation that imitated the new algorithm, which would have told me very little. Instead, I did lots and lots of real-world testing to ensure there were no missed edge cases.
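To make the mismatch concrete, here is a toy 1-D, alpha-only sketch, not the actual tool. The brush values, the path, and the `tone_map` correction are all assumptions made up for illustration; I am not claiming this is the post-processing step that was used. Repeated "over" compositing gives `1 - prod(1 - a_i)`, while a convolution can only give the plain sum of the `a_i`, so the two cannot match exactly. Mapping the sum through `1 - exp(-s)` is one plausible way to bring it close.

```python
import math

def stamp_over(brush, centers, width):
    """Reference: composite the brush alpha at each center with 'over'."""
    canvas = [0.0] * width
    half = len(brush) // 2
    for c in centers:
        for i, a in enumerate(brush):
            x = c - half + i
            if 0 <= x < width:
                # Porter-Duff "over" for alpha: src + dst * (1 - src)
                canvas[x] = a + canvas[x] * (1.0 - a)
    return canvas

def stamp_convolve(brush, centers, width):
    """Fast path: plain sum of brush alphas, i.e. path convolved with brush."""
    canvas = [0.0] * width
    half = len(brush) // 2
    for c in centers:
        for i, a in enumerate(brush):
            x = c - half + i
            if 0 <= x < width:
                canvas[x] += a
    return canvas

def tone_map(summed):
    """Hypothetical correction: map summed alpha s to 1 - exp(-s)."""
    return [1.0 - math.exp(-s) for s in summed]

# Made-up soft brush and a straight path with one stamp per pixel.
brush = [0.05, 0.15, 0.25, 0.15, 0.05]
centers = list(range(4, 28))
width = 32

reference = stamp_over(brush, centers, width)
raw = stamp_convolve(brush, centers, width)
adjusted = tone_map(raw)

raw_err = max(abs(r - s) for r, s in zip(reference, raw))
adj_err = max(abs(r - a) for r, a in zip(reference, adjusted))
print(f"max error, raw sum vs over:      {raw_err:.4f}")
print(f"max error, adjusted sum vs over: {adj_err:.4f}")
```

Even in this tiny case the adjusted result is close to the reference but never equal to it. With real brush images, color channels and overlapping strokes, the gap grows to the point where a pixel-by-pixel comparison stops being meaningful, which is why I fell back on visual and real-world testing.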