
> I publish copyrighted code. Some company decides to consume it without purchasing a license. The product they distribute is vastly different from my code itself, but I can still sue them into oblivion.

Isn't the analogy more like this: an employee at a company reads your copyrighted code, along with many other pieces of code, and produces a new piece of code? Your code influenced the output, but in no way can you (a) detect that influence or (b) assert any copyright over the output.

In your analogy, your code is still intact within the new product; that's not the case with LLM-produced output.

If the new code is almost identical to the original code, then it is very much subject to copyright, or at the very least is grounds for a lawsuit. And generative AIs can and do produce output extremely similar to some of their training inputs if given the right prompt.

My personal opinion is that training a model isn't infringing the copyright. But generating outputs can, if they are sufficiently similar. And since the model itself can't be liable for such infringing works, I think the creator of the model should be responsible.
