You're thinking at a much more concrete level than I am. When I think about an AST differ for the purpose of identifying copied code, I'd abstract over the different ways of writing loops; I'd treat polymorphic calls and pattern matching as the same construct; I'd break function call graphs down into a forest of basic blocks. In other words, I'd tune my AST comparison to look for similarity of algorithm rather than similarity of syntax, so it wouldn't be so trivially gamed.
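To make the loop point concrete, here is a rough sketch of the idea using Python's standard `ast` module. It reduces a program to a skeleton of control-flow constructs, mapping `for` and `while` to the same abstract "loop" and ignoring names and expressions entirely. The `CONTROL` table, the `skeleton` function, and the two sample snippets are all made up for illustration; a real differ would need to be far less coarse than this, since a skeleton this small will match plenty of unrelated code.

```python
import ast

# Which concrete node types collapse to which abstract construct.
# This mapping is an assumption of the sketch, not a standard.
CONTROL = {
    ast.FunctionDef: "func",
    ast.For: "loop",
    ast.While: "loop",
    ast.If: "branch",
    ast.Return: "return",
}


def skeleton(node):
    """Return the nested control-flow skeleton under `node` as a list of
    (kind, children) pairs. Non-control nodes are transparent: whatever
    control structure they contain is hoisted to the enclosing level."""
    children = []
    for child in ast.iter_child_nodes(node):
        children.extend(skeleton(child))
    kind = CONTROL.get(type(node))
    if kind is None:
        return children
    return [(kind, tuple(children))]


ORIGINAL = """
def total(xs):
    s = 0
    for x in xs:
        if x > 0:
            s += x
    return s
"""

# Same algorithm with renamed identifiers and the loop rewritten as `while`.
DISGUISED = """
def sum_pos(values):
    acc = 0
    i = 0
    while i < len(values):
        if values[i] > 0:
            acc += values[i]
        i += 1
    return acc
"""

# A genuinely different function.
UNRELATED = """
def first(xs):
    return xs[0]
"""

same = skeleton(ast.parse(ORIGINAL)) == skeleton(ast.parse(DISGUISED))
different = skeleton(ast.parse(ORIGINAL)) != skeleton(ast.parse(UNRELATED))
```

Renaming variables and swapping the loop form leaves the skeleton unchanged, which is the kind of gaming a purely textual or node-for-node diff falls for.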

Even better, I'd work with a traced execution, and examine isomorphisms between call graphs and data structures.
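As a toy version of the traced-execution idea, the sketch below records the dynamic call tree of a run with `sys.setprofile` and compares two runs by shape only, ignoring function names. It assumes the simplest possible case: Python-level calls only, and a call tree rather than a general call graph, so that isomorphism reduces to comparing canonical forms of unordered trees. The helper names (`call_tree`, `canonical`) and the sample functions are hypothetical, and nothing here looks at data structures.

```python
import sys


def call_tree(fn, *args):
    """Run fn(*args) and return the tree of Python-level calls it made.
    Each node is a list of child nodes; function names are not recorded."""
    root = []
    stack = [root]

    def profiler(frame, event, arg):
        if event == "call":
            node = []
            stack[-1].append(node)
            stack.append(node)
        elif event == "return" and len(stack) > 1:
            stack.pop()
        # c_call / c_return / c_exception events (builtins) are ignored.

    sys.setprofile(profiler)
    try:
        fn(*args)
    finally:
        sys.setprofile(None)
    return root


def canonical(node):
    """Canonical form of an unordered tree: children are canonicalised
    recursively and then sorted, so sibling order does not matter."""
    return tuple(sorted(canonical(child) for child in node))


def fact(n):
    return 1 if n <= 1 else n * fact(n - 1)


def countdown(k):  # different name and result, same linear recursion
    return 0 if k <= 1 else countdown(k - 1)


def fib(n):  # branching recursion, so a different call shape
    return n if n < 2 else fib(n - 1) + fib(n - 2)


same_shape = canonical(call_tree(fact, 3)) == canonical(call_tree(countdown, 3))
other_shape = canonical(call_tree(fact, 3)) != canonical(call_tree(fib, 3))
```

Two runs with the same canonical form made the same pattern of nested calls regardless of what the functions were named. Matching real call graphs, with shared callees and cycles, is a much harder problem than this tree comparison suggests.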