I think without looking at the contracts, we don't really know. Given this is all based on transformers from Google though, I am pretty sure MSFT with the right team could build a better LLM.
The key ingredient appears to be mass GPU and infra, tbh, with a collection of engineers who know how to work at scale.
>MSFT with the right team could build a better LLM
Somehow everybody seems to assume that the disgruntled OpenAI people will rush to MSFT. Between MSFT and a shaken OpenAI, I suspect Google Brain and the like would be much more preferable. I'd be surprised if Google isn't rolling out eye-popping offers to the OpenAI folks right now.
Like the review which allowed them to ignore licenses while ingesting all public repos on GitHub? And yes, true, the T&C allow them to ignore the license, but it's questionable whether everyone who uploaded stuff to GitHub actually had the rights the T&C require (e.g. uploading some older project with many contributors to GitHub).
Different threat profile. They don’t have the TOS protection for training data and Microsoft is a juicy target for a huge copyright infringement lawsuit.
Yeah, that's an interesting point. But I think with appropriate RAG techniques and proper citations, a future LLM can get around the copyright issues.
The problem right now with GPT4 is that it's not citing its sources (for non-search-based stuff), which is immoral and maybe even a valid reason to sue over.
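To make the "RAG with proper citations" idea concrete: the gist is that you retrieve passages along with their provenance and force the model to answer with numbered citations back to those sources. A minimal sketch, assuming a toy in-memory corpus and keyword-overlap retrieval; names like `retrieve` and `build_prompt` are illustrative, not any real library's API:

```python
import re
from dataclasses import dataclass

@dataclass
class Passage:
    source: str  # provenance string, kept so the answer can cite it
    text: str

# Hypothetical corpus; real systems would index licensed/attributed documents.
CORPUS = [
    Passage("github.com/example/repo#README",
            "Transformers were introduced by Google in 2017."),
    Passage("example.org/blog/rag",
            "RAG retrieves passages and cites them in the answer."),
]

def _tokens(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Rank passages by naive keyword overlap with the query (stand-in for a real retriever)."""
    q = _tokens(query)
    return sorted(corpus, key=lambda p: -len(q & _tokens(p.text)))[:k]

def build_prompt(query: str, passages: list[Passage]) -> str:
    """Number each passage with its source so the model can cite [n] in its answer."""
    ctx = "\n".join(f"[{i + 1}] ({p.source}) {p.text}"
                    for i, p in enumerate(passages))
    return (f"Answer using only the sources below and cite them as [n].\n"
            f"{ctx}\n\nQ: {query}\nA:")

query = "Who introduced transformers?"
print(build_prompt(query, retrieve(query, CORPUS)))
```

The LLM call itself is omitted; the point is that provenance rides along with each retrieved passage, so the generated answer can carry verifiable citations instead of regurgitating training data unattributed.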
But why didn't they? Google and Meta both had competing language models spun up right away. Why was Microsoft so far behind? Something cultural, most likely.