If someone is stupid enough not to even change the variable names, they'd be caught easily. With millions of solutions it would be easy to get some replication by chance, but most of the time there wouldn't be any, so a match is what would trigger manual review by a person.
This, in conjunction with your solution, would work pretty well, especially if someone submitted a wrong answer for another question that they haven't seen.
Use an algorithm to cross-check homework submissions, checking for structural copying. If a high correlation is found, as with human intervention, you suspect cheating but cannot prove it.
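To make "structural copying" a bit more concrete, here is a rough sketch of one way such a cross-check could work, assuming the submissions are Python source files. It replaces every identifier and literal with a placeholder, so renaming variables doesn't hide a copy, and then compares the remaining token sequences. The function names and the 0.9 threshold are my own placeholders, not anything a real course uses, and a real detector would need to be much more robust than this.

```python
import io
import keyword
import tokenize
from difflib import SequenceMatcher
from itertools import combinations

# Token types that carry no structural information.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER}
# Token types replaced by a fixed placeholder.
_PLACEHOLDER = {
    tokenize.NUMBER: "NUM",
    tokenize.STRING: "STR",
    tokenize.NEWLINE: "NEWLINE",
    tokenize.INDENT: "INDENT",
    tokenize.DEDENT: "DEDENT",
}


def structure_tokens(source):
    """Reduce Python source to tokens that ignore names, literals and comments."""
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _SKIP:
            continue
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            out.append("ID")  # every identifier looks the same
        elif tok.type in _PLACEHOLDER:
            out.append(_PLACEHOLDER[tok.type])
        else:
            out.append(tok.string)  # keywords and operators are kept as-is
    return out


def similarity(source_a, source_b):
    """Return a 0..1 similarity score between two submissions."""
    a, b = structure_tokens(source_a), structure_tokens(source_b)
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def flag_pairs(submissions, threshold=0.9):
    """Return (name_a, name_b, score) for every pair at or above the threshold.

    `submissions` maps a student name to source text. The threshold is an
    arbitrary assumption and would need tuning per assignment.
    """
    flagged = []
    for name_a, name_b in combinations(sorted(submissions), 2):
        score = similarity(submissions[name_a], submissions[name_b])
        if score >= threshold:
            flagged.append((name_a, name_b, score))
    return flagged
```

A flagged pair is only a reason to look closer, not proof. Short assignments with one obvious solution will score high for honest students too, which is why the follow-up step matters.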
So, only for the potentially cheating students, issue an extra quiz at some scheduled time, and discard the homework score. These quizzes should all be issued in parallel batches and last an hour or so, so solutions can't possibly be copied among cheaters. Make sure the quiz is actually harder than the homework, conceptually.
In other words, rather than trying to punish potential cheaters, just keep testing them until you're sure they're not cheating. This is better IMO than producing a problem set 10x the size and selecting random subsets, because the number of suspected cheaters will probably be a small-ish percentage. Therefore, rather than having to write a ton of extra test material for everyone, you only have to write a few extra quizzes for a subset of the class.