I think without looking at the contracts, we don't really know. Given this is all based on transformers from Google though, I am pretty sure MSFT with the right team could build a better LLM.
The key ingredient appears to be mass GPU and infra, tbh, with a collection of engineers who know how to work at scale.
>MSFT with the right team could build a better LLM
Somehow everybody seems to assume that the disgruntled OpenAI people will rush to MSFT. Between MSFT and a shaken OpenAI, I suspect Google Brain and the like would be much more preferable. I'd be surprised if Google isn't rolling out eye-popping offers to the OpenAI folks right now.
Like the review which allowed them to ignore licenses while ingesting all public repos on GitHub? And yes, true, the T&C allow them to ignore the license, but it's questionable whether everyone who uploaded stuff to GitHub actually had the rights the T&C require (e.g. uploading some older project with many contributors to GitHub).
Different threat profile. They don’t have the TOS protection for training data and Microsoft is a juicy target for a huge copyright infringement lawsuit.
Yeah, that's an interesting point. But I think with appropriate RAG techniques and proper citations, a future LLM can get around the copyright issues.
The problem right now with GPT4 is that it's not citing its sources (for non-search-based stuff), which is immoral and maybe even a valid reason to sue over.
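To make the "RAG with proper citations" idea concrete: the gist is that you retrieve passages along with their provenance and force the model to answer with numbered citations back to those sources. A minimal sketch, assuming a toy in-memory corpus and keyword-overlap retrieval; names like `retrieve` and `build_prompt` are illustrative, not any real library's API:

```python
import re
from dataclasses import dataclass

@dataclass
class Passage:
    source: str  # provenance string, kept so the answer can cite it
    text: str

# Hypothetical corpus; real systems would index licensed/attributed documents.
CORPUS = [
    Passage("github.com/example/repo#README",
            "Transformers were introduced by Google in 2017."),
    Passage("example.org/blog/rag",
            "RAG retrieves passages and cites them in the answer."),
]

def _tokens(s: str) -> set[str]:
    return set(re.findall(r"\w+", s.lower()))

def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    """Rank passages by naive keyword overlap with the query (stand-in for a real retriever)."""
    q = _tokens(query)
    return sorted(corpus, key=lambda p: -len(q & _tokens(p.text)))[:k]

def build_prompt(query: str, passages: list[Passage]) -> str:
    """Number each passage with its source so the model can cite [n] in its answer."""
    ctx = "\n".join(f"[{i + 1}] ({p.source}) {p.text}"
                    for i, p in enumerate(passages))
    return (f"Answer using only the sources below and cite them as [n].\n"
            f"{ctx}\n\nQ: {query}\nA:")

query = "Who introduced transformers?"
print(build_prompt(query, retrieve(query, CORPUS)))
```

The LLM call itself is omitted; the point is that provenance rides along with each retrieved passage, so the generated answer can carry verifiable citations instead of regurgitating training data unattributed.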
But why didn't they? Google and Meta both had competing language models spun up right away. Why was Microsoft so far behind? Something cultural, most likely.