
>However, Copilot generates verbatim code, and it generates novel code. That is, it both contains the plain text of the original (recitation and redistribution) and generates derived code (transformation).

So are you saying that because the language model was trained on GPL code, even the novel code it spits out is derived?

That seems like a pretty expansive view. I’ve read some GPL code in my life, and I’m sure it has influenced me. Does that make all my code “derived”? I wouldn’t say that. To truly be derived, the copied portion needs to be nontrivial; otherwise every time you type “i++;” you’re in violation. And that’s hard to prove.

A clearer-cut case is when a Copilot suggestion is verbatim GPL code. Including that in someone’s codebase would be a GPL violation, but that doesn’t seem to be what you’re arguing. You seem to be arguing that Copilot is in violation simply for suggesting the code.

This means you’re asserting that storing the code in a language model is somehow different from storing it in a database, but you haven’t told me why that is.

Databases have a query execution system and a database file. They are separate pieces. The query executor can work on any database file, and swapping out the database file will give different results, even though the execution code is the same.
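
To make that concrete, here’s a minimal Python sketch (the file names and the items table are hypothetical, and the database files are assumed to already exist):

    import sqlite3

    def top_rows(db_path):
        # The same executor logic runs against any database file.
        conn = sqlite3.connect(db_path)
        rows = conn.execute("SELECT * FROM items LIMIT 3").fetchall()
        conn.close()
        return rows

    print(top_rows("catalog_a.db"))  # one set of results
    print(top_rows("catalog_b.db"))  # different results, same code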

It’s exactly the same for language generators. You have a language model, and a piece of code that makes predictions based on the given text and the language model. Swap out the language model and you get different results.
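
Sketched the same way, with a toy bigram model stored as JSON (the model files and their contents are hypothetical; a real model is far more complex, but the code/data split is the same):

    import json

    def predict_next(model_path, word):
        # The same prediction logic runs against any model file,
        # e.g. {"i": {"++;": 0.9, "--;": 0.1}}.
        with open(model_path) as f:
            model = json.load(f)
        candidates = model.get(word, {})
        return max(candidates, key=candidates.get) if candidates else None

    print(predict_next("model_a.json", "i"))  # one suggestion
    print(predict_next("model_b.json", "i"))  # possibly another, same code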

The storage formats are different, but that doesn’t matter: the data and the code are separate. Given this information, why — and be specific — is a language model not like a database?




> That seems like a pretty expansive view. I’ve read some GPL code in my life, and I’m sure it has influenced me. Does that make all my code “derived”? I wouldn’t say that. To truly be derived, the copied portion needs to be nontrivial; otherwise every time you type “i++;” you’re in violation. And that’s hard to prove.

You're not a piece of software, so the areas of copyright law that are applicable are completely different. (And yes, copyright does acknowledge a minimal amount required to be copyrightable - but that minimal amount may sometimes be argued to be a single line.)

However, you can absolutely face a civil suit if you reproduce overly similar code for a competitor after absorbing the technical architecture at a previous workplace.

> It’s exactly the same for language generators. You have a language model, and a piece of code that makes predictions based on the given text and the language model. Swap out the language model and you get different results.

Legally speaking, Copilot isn't advertised with multiple available language models. It isn't presented that way, so it won't be treated that way. It will be treated as a singular piece of software.

> Given this information, why — and be specific — is a language model not like a database?

In the eyes of the law, to be very specific, the model is marketed as part of the software, and so it is part of the software. The underlying design architecture is utterly irrelevant, because it is presented as one package deal: “GitHub Copilot”.


> You're not a piece of software, so the areas of copyright law that are applicable are completely different. (And yes, copyright does acknowledge a minimal amount required to be copyrightable - but that minimal amount may sometimes be argued to be a single line.)

Putting aside the philosophical aspects of this statement, you proved my point. I said that the party ultimately held liable for violating a license is not a tool, but the person choosing to integrate the changes the tool suggests. Yet now you expect me to believe that the person who built an automaton, but is not directing it and certainly doesn’t have final say over whether its suggestions are incorporated, is legally culpable because they’re held to a stricter standard? If that were the legal standard for any tool, then literally every manufacturer of every tool would be liable for any and all misuse. Obviously, this is not the case.

> Legally speaking, Copilot isn't advertised with multiple available language models. It isn't presented that way, so it won't be treated that way. It will be treated as a singular piece of software.

Actually speaking, you’re not a lawyer, and this is an INCREDIBLY controversial statement that doesn’t really stand up to much scrutiny, since there is a bright line separating the two.

Even if GitHub were ruled against (and they won’t be), case law is filled with examples where injunctive relief is limited to the claims presented (in this case, source related to a specific work) rather than the entire system, including both the playback device and the recording.



