> Copilot is a derivative work of that code, the license terms of that code (e.g. GPL, non-commercial) should extend to the new function that was derived from it.
But the very act of training Copilot is not problematic. In fact, if GitHub never did anything with Copilot, the physical act of training the model would not be problematic at all. And that's what's at issue here. How Copilot is used is orthogonal to the article.
> Sure, I can make a shirt with Spider-Man on it and give it to my brother, but if a company were to use what I made or I tried to sell it, I would expect a cease and desist from Disney.
Yes. And training the model isn't the part where you sell it. It's the part where you make it.
> Training the model may very well be a copyright issue. The images have been copied, they are being used.
What do you think "being used" means here? If I work for a company and download a bunch of text and save it to a flash drive, have I violated copyright? Of course not. If I put that data in a spreadsheet, is it copyright infringement? Of course not. If I use Excel formulas on that text, is it infringement? Still no.
And so how can you claim in any way that the creation of a model is anything more than aggregating freely available information?
I don't disagree with you about the use of a model. But training the model is just taking some information and running code against it. That's what's important here.
This is not model training.