This description doesn't quite make sense. If the threshold changes regularly, the calculation can return the same result for the same 21 parameters, and whether that result counts as a bug will vary from month to month, depending on the threshold. How can you write a test for that without locking in the threshold, or indeed without hard-coding the threshold in the calculation itself?
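To make the concern concrete, here is a rough sketch of what I mean. Everything in it is hypothetical: `calculate` is a stand-in for the real 21-parameter calculation, `is_acceptable` and the two threshold values are invented, and I'm assuming a result above the threshold is what counts as a bug.

```python
def calculate(params):
    # Hypothetical stand-in for the real 21-parameter calculation.
    return sum(params)


def is_acceptable(result, threshold):
    # Assumption: a result above the threshold is what gets called a bug.
    return result <= threshold


params = list(range(21))      # 21 fixed inputs: 0, 1, ..., 20
result = calculate(params)    # the same number every time for these inputs

january_threshold = 250       # invented values, for illustration only
february_threshold = 200

print(result)                                     # 210
print(is_acceptable(result, january_threshold))   # True
print(is_acceptable(result, february_threshold))  # False
```

The calculation returns the same number both times; only the verdict flips. A test that asserts on the verdict has to pin the threshold somewhere.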