Two years ago, an issue titled "Licensing issues" was opened in Microsoft's IntelliCode GitHub repository[0]. It received a response from a Microsoft employee. Eventually, the argument was made that IntelliCode is a derivative work, since it is derived from thousands (?) of open-source projects. From what I understand, this seems to be true.
However, here's the fun part: Microsoft is training its AI model on a dataset built from these open-source projects. Would the terms of their licenses still apply here?
Further, would you say the law hasn't caught up with this use of open-source projects yet?
I am also curious about the legality of GitHub Copilot, since it seems to do largely the same thing from an AI standpoint.
[0]: https://github.com/MicrosoftDocs/intellicode/issues/201
EDIT: IntelliCode, not IntelliSense!
For why these things are essentially mangled compression engines, take a look at "Hopfield Networks is All You Need" (https://arxiv.org/abs/2008.02217). It shows that the attention mechanism used in modern transformer networks (which is what Copilot is using) is equivalent to the update rule of a modern Hopfield network. That lets you view a transformer as a bunch of Hopfield networks, which are essentially memory modules, connected in some complicated topology to encode some data set.
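To make the "memory module" point concrete, here is a minimal sketch of the retrieval step of a modern Hopfield network: the new state is a softmax-weighted average of the stored patterns, which has the same form as attention. This is only a toy illustration, not how Copilot or any real model is implemented. The function name `hopfield_retrieve`, the `beta` value, and the example patterns are all made up for the example.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hopfield_retrieve(patterns, query, beta=8.0, steps=1):
    """Toy modern-Hopfield retrieval (hypothetical helper, for illustration).

    patterns: list of stored vectors (the "memory")
    query:    a noisy or partial version of one stored vector
    beta:     inverse temperature; larger values give sharper recall
    """
    state = list(query)
    for _ in range(steps):
        # Similarity of the current state to every stored pattern.
        scores = [beta * sum(p_i * s_i for p_i, s_i in zip(p, state))
                  for p in patterns]
        weights = softmax(scores)
        # New state: softmax-weighted average of the stored patterns.
        state = [sum(w * p[d] for w, p in zip(weights, patterns))
                 for d in range(len(state))]
    return state

# Three stored patterns (assumed toy data).
patterns = [
    [1.0, 1.0, -1.0, -1.0],
    [-1.0, 1.0, -1.0, 1.0],
    [1.0, -1.0, -1.0, 1.0],
]

# A corrupted version of the first pattern.
noisy = [0.9, 0.6, -1.0, -0.2]

recalled = hopfield_retrieve(patterns, noisy)
print([round(v, 2) for v in recalled])
```

With a high enough `beta`, one update step pulls the noisy query back to (very nearly) the first stored pattern, which is the sense in which the network "remembers" its data.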