
> I'm actually perfectly fine if StackOverflow wants to sell an answer I made to help train AI.

I’m not.

This was a collaborative effort to make the lives of programmers easier, and the data was always meant to be a public good. OpenAI and, more importantly, all the other LLM developers whose pockets aren’t as deep should be able to just download the database and train on it for free.

I don’t care about any license. I don’t care about attribution. Learning isn’t copying, so copyright is irrelevant. I contributed about a thousand answers to Stack Overflow, all with the understanding that anybody can download and use them for free, not so they can be locked up by Stack Overflow.

What concerns me about deals like this is that they shift the cultural norm toward expanding copyright to cover not just copying, but use. When OpenAI makes deals like this, social and legal pushback becomes more likely when other LLMs are trained without such deals in place.

It’s akin to – and can possibly result in – regulatory capture, making it difficult for new startups to compete with OpenAI.




> the data was always meant to be a public good.

The words are a copyleft-able public good. Concepts, facts, and ideas are not; anyone can use them for anything, including making money. If you're actually worried about specific wording or other creative choices being used improperly by an LLM, then by all means the license should be enforced in those cases. But such cases are very rare, because LLMs are very good at extracting facts from prose.



