That's a good point but all it means is that we can't test the hypothesis one way or the other due to never being entirely certain that a given task isn't anywhere in the training data. Supposing that "AIs can't" is then just as invalid as supposing that "AIs can".