It’s not possible to have a license over an ML model trained on other people’s works, since such models are uncopyrightable. They’re more like a phone book: a collection of facts produced by an entirely un-creative process. https://news.ycombinator.com/item?id=36691050
This hasn’t been proven in court, but it seems the most likely outcome.
Not saying that this applies to LLMs, but if you describe them as "a collection of facts [collected and] trained by an entirely un-creative process", then it begins to sound like one could argue for a Database Right.
Llama 2 is open source-ish. The weights are freely available and can be used commercially, but only if you have fewer than 700M users and agree to some "don't do naughty things" terms.
Nope. The limit is 700M monthly active users at the time Llama 2 was released, a weird catch clause aimed at a handful of Meta's competitors. The license doesn't satisfy OSS requirements, but it is quite reasonable.
If the JSON license isn't considered open due to requiring that "The Software shall be used for Good, not Evil.", then I don't see how tacking an additional user-count threshold onto it makes it more open. I don't think Meta even released the training dataset, so you can't replicate it (should you have the funds to do so).
There are other LLMs that don't have such restrictions, and publish their training data.
Open source is both a colloquial term for available/modifiable/distributable code (like Llama 2) and a strict OSI-approved list of licenses. I'd say open source-ish is a great fit here.
Edit: this is in fact a fairly interesting discussion, because LLMs are a new breed of digital product. Meta's terms are practical for limiting usage in commercial applications, and they are designed to protect the general population. It's not the worn-out "protecting us from ourselves"; it's actually preventing Llama users from harming non-users. Yes, we can be jaded and say it's about protecting the brand and dissociating from bad actors. My point is that it's hard to apply the usual arguments for open source and freedom of computing when you're defending the rights of people who want to harm other people.
Sure there is: BSD is bad because xyz, GPL is bad because zyx. That said, the Llama restrictions are rather harsh, and you are not allowed to improve other models with it. So no freedom there, just some OK beer.
From memory, the Llama 2 license does allow tuned models with suitable credit and license inclusion. They restricted using it to train other models, though (a bit like how people use GPT-4 to generate question/answer pairs to train their own models).
Edit: there's no mention of it being open source in the linked article. Maybe the title here is just wrong? @dang