
The author of a work has copyright automatically, so what would prohibit it by default is copyright law.

In the specific case of Wikipedia I would guess it's allowed by the license, but that's not generally true.




Training isn't copying (any more than a browser or CDN cache is), and it isn't distribution either, so copyright is out of scope. Is there anything else that would prohibit training?


All of those things are copying. Browser caches probably fall under fair use. CDNs are contracted by the distributor so literally licensed.


The act of copying while training is incidental, in the same way as browser caches are incidental to the viewing of the content. Except that with training, you don't want to end up with a duplicate. The whole point is not to copy the original. Training is not copying.



