Hacker News
Ask HN: Which open source license can I use to forbid GPT-3 model training?
6 points by b1n on June 29, 2021 | 15 comments
Products like GitHub Copilot (https://news.ycombinator.com/item?id=27676266) use open source code to train their "intelligent" code complete suggestions.

I want to release some open source code, but I don't want to make the mistake of training my replacement.

Are there any open source licenses that explicitly forbid use when training an AI?




> Are there any open source licenses that explicitly forbid use when training an AI?

GitHub/OpenAI's defense is "training ML systems on public data is fair use" (https://news.ycombinator.com/item?id=27678354). Unless this assertion gets invalidated in court, I think they mostly don't care about the wording of your license.

> I want to release some open source code, but I don't want to make the mistake of training my replacement.

Most of a software engineer's value is not in the code they write. By and large, employers care instead that you solve their problems. Code is just a by-product, a means to that end.


> Are there any open source licenses that explicitly forbid use when training an AI?

No, though plenty don’t license for that purpose and/or would require any work thereby created that requires a license to use the same license (e.g., all copyleft licenses).

But since Copilot and others use publicly available (independent of license, not specifically open-source) code on the basis that such use does not require a license, the license isn't going to stop them. Are you prepared to sue Microsoft to prevent them from using your code? If not, given their position on “Fair Use”, you aren't going to stop them.


Then that would no longer be an Open source license.


Or, with a little more detail: "Open Source" is defined by the Open Source Definition (https://opensource.org/osd), and clause 6 requires no constraint of fields of endeavor.

Which means, you cannot limit the purposes for which your software is used.

If you wanted to prohibit it being used to train ML models, you could add this restriction to your license, but it would no longer be Open Source(TM).


My understanding is that Open Source means just that - The source is open for inspection. It doesn't immediately imply that the source is open for free usage by all.



Thanks, I was just about to ask.


It's worth reading the definition: https://opensource.org/osd

I think this would likely fall foul of clause 6, in that using the software for training ML models would likely be considered a "field of endeavor". It's not the classic application of this clause (i.e., "you can't use this software for military purposes", etc.), but I think it'd still be covered.

Of course, IANAL, and your lawyer's opinion might differ.


It's just another tool.

In the past, programmers didn't have compilers, or linters, or debuggers, or dependency managers, or ... etc. All the tooling a modern programmer uses could be viewed as having taken work away from humans.

Just like these tools, AI will help you write more, better code. There's a long way to go until we can generate software from an end-user's description of what they want done.



It's worth discussing the fact that this is a tool you can never own or fully control.


Which is not that different from, e.g., Visual Studio, really.

I don't dispute your point though. I just feel like it's more a matter of degree, rather than a binary now-you-do / now-you-don't situation.


IMHO, you are not training your replacement, you are training some kind of assistant.


Yea lol we're overestimating this AI here


> but I don't want to make the mistake of training my replacement.

If that was possible, would you really want to spend much of your life doing work that could be automated away?



