Training isn't copying (any more than a browser or CDN cache is) or distribution. Copyright is out of scope. Is there anything else that would prohibit training?
The act of copying while training is incidental, in the same way as browser caches are incidental to the viewing of the content. Except that with training, you don't want to end up with a duplicate. The whole point is not to copy the original. Training is not copying.
In the specific case of Wikipedia I would guess it's allowed by the license, but that's not generally true.