Llama is not open source. It's corporate freeware with some generous allowances.
Open source licenses are a well-defined thing. Meta's marketing saying otherwise doesn't mean they get to usurp the meaning of a well-understood and commonly used term like "open source."
https://opensource.org/license
Nothing about Meta's license is open source. It's a carefully constructed legal agreement intended to prevent any meaningful encroachment by anyone, ever, into any potential Meta profit, and to disavow liability to prevent reputational harm in the case of someone using their freeware for something embarrassing.
If you use it against the license anyway, you'll just have to hope you never get successful enough that suing you and taking your product away becomes more profitable for Meta than it is annoying. When the threshold between annoying and profitable is crossed, Meta's lawyers will start sniping and acquiring users of their IP.
> "Nothing about Meta's license is open source. It's a carefully constructed legal agreement intended to prevent any meaningful encroachment by anyone, ever, into any potential Meta profit, and to disavow liability to prevent reputational harm in the case of someone using their freeware for something embarrassing."
You seem to be making claims that have little connection to the actual license.
The license states you can't use the model if, at the time Llama 3 was released, you had >700 million customers. It also says you can't use it for illegal/military/etc uses. Other than that, you can use it as you wish.
That "etc" is doing a lot of work here. The point of OSI licenses like MIT, Apache 2.0 is to remove the "etc". The licensing company gives up its right to impose acceptable use policies. More restrictive, but still OSI approved, licenses are as clear as they possibly can be about allowed uses and the language is as unambiguous as possible. Neither is the case for the Llama AUP.
Those additional restrictions mean it's not an open source license by the OSI definition, which matters if you care about words sometimes having unambiguous meanings.
I call models like this "openly licensed" but not "open source licensed".
Call it what you will, but it'd be silly if Meta let these 700M+ customer mega-corps (Amazon, Google, etc) just take Meta models and sell access to them without sharing revenue with Meta.
You should be happy that Meta finds ways to make money from their models; otherwise it's unlikely that they'd be giving you free access (until your startup reaches 700M+ customers, when the free ride ends).
> until your startup reaches 700M+ customers, when the free ride ends
No it doesn't. The licence says that those who, on the release date of Llama 3, had 700M+ users need an extra licence to use it. It doesn't say that you lose access if you gain that many users at some point in the future.
You don't lose access, but the free ride ends. It seems that the new licence will include payment terms. Zuckerberg discusses this in the Dwarkesh interview.
What does the “free ride ends” mean? If you mean you can’t use the next model they might release after you have reached that many users, sure that might be true. It is not true that you have to pay for the already released llama 3.
I don't care what Zuckerberg says. I care what the licence says. I recommend you read it. It is shorter and more approachable than the typical rental agreement for a flat.
Here is the relevant Llama 3 license section, below, in its entirety. It says that if you have 700M+ users then you'll need a new license, which Meta may or may not choose to grant to you. It does not say what the terms of that new license will be, but if you are interested you can watch the Dwarkesh interview, or just believe me when I tell you that Zuck said it'll be a commercial license - you will pay.
2. Additional Commercial Terms. If, on the Meta Llama 3 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee’s affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.
It seems pretty clear cut that it’s monthly active users when Llama 3 is released.
> If, on the Meta Llama 3 version release date, the monthly active users … is greater than 700 million monthly active users in the preceding calendar month …
If that’s not true then the free license applies to you.
Presumably the megacorp's lawyers, engaged in due diligence before the acquisition, will be looking into this and evaluating the license. Maybe they have prior licensing agreements with Meta, or plan to replace your use of Llama with something different, who knows.
OTOH if you are being acquired by Elon Musk, then there may be no due diligence, he will tear up any existing license agreements, spend the next year bickering with Meta on Twitter, then be sued to comply.
> Here is the relevant Llama 3 license section, below, in its entirety.
I agree too that this is the relevant section.
> It says that if you have 700M+ users then you'll need a new license
It does not say that. It says that if you or your affiliates had 700M+ users on Llama 3's release date, then you need another licence.
This does not trigger if you merely gain 700M+ users later. It simply does not. It does trigger if you become affiliated with someone who already had 700M+ users on that past date (for example if Google buys you up, or if you become a strategic partner of Google).
The key here is "on the Meta Llama 3 version release date" which sets the exact date for when the monthly active users of the products or services should be counted.
> It does not say what the terms of that new license will be
Correct. And I assume the terms would be highly onerous. That I do not dispute.
> or just believe me when I tell you that Zuck said it'll be a commercial license
I believe you on that. That is not what we disagree on. The bit we seem to disagree on is when exactly you need this extra licence. You state that you need it if your company gains 700M+ users at some future date. That is simply not supported by the very section you quoted above.
In practice this isn't a matter of how you or I interpret this license - it's a matter of how watertight it is legally.
There's no reason to suppose that the terms of any commercial licensing agreement would be onerous. At this stage at least, these models are all pretty fungible and could be swapped out without much effort, so Meta would be competing with other companies for your business, if they want it. If they don't want your business (e.g. maybe you're a Facebook competitor), then they have reserved the right not to license it to you.
In any case, don't argue it with me. In practice this would be your lawyers engaged with Meta and their lawyers, and product licensing team.
Isn’t a simple interpretation of this type of license that some people get the open source license and others get the commercial license? Almost like a switch statement for licenses. If you belong in the category that gets the commercial one, you cannot call it open source for sure, but if you belong to the other category then it seems like an open source license to me. There is no guarantee about future licenses, and some (reasonable) restrictions but all open source licenses have some terms attached.
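To make the "switch statement" analogy concrete, here is a toy sketch in Python; the 700M threshold and the release-date condition come from the clause quoted upthread, while the function and variable names are just mine:

```python
# Toy model of the licence "switch": which terms apply depends only on your
# monthly active users measured around the Llama 3 release date.
MAU_THRESHOLD = 700_000_000

def applicable_terms(mau_at_release_date: int) -> str:
    if mau_at_release_date > MAU_THRESHOLD:
        # Over the line: a separate licence, granted (or not) at Meta's sole discretion.
        return "additional commercial terms required"
    # Everyone else: the community licence as published.
    return "community licence"

print(applicable_terms(1_000_000))  # community licence
```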
I don't understand how the idea of open source became some sort of pseudo-legalistic purity test on everything.
Models aren't code; some of the concepts of open source code don't map 1:1 to freely available models.
In spirit I think this is “open source”, and I think that’s how the majority of people think.
Turning everything into some sort of theological debate takes away a lot of credit that Meta deserves. Google isn’t doing this. OpenAI sure as fuck isn’t.
> Turning everything into some sort of theological debate takes away a lot of credit that Meta deserves.
It's not theological, it's the misuse of a specific legal definition that we all have an interest in maintaining. "Freely available models" or "open license" are accurate.
Other companies keeping things for themselves doesn't warp reality, or the existing definitions we use to describe it. Giving them the credit they deserve, especially in comparison to the others, should be enough.
Hate to break it to you but there’s a thousand court cases a day precisely because “specific legal definition” is a surprisingly flexible concept depending on context. Likewise when new technologies emerge it often requires reappraisal and interpretation of existing laws, even if that reappraisal is simply extending the old law to the new context.
This isn't a problem of interpretation, as I would guess those are. This is a term that clearly describes the requirements for a category, and these models' licenses purposefully and directly exclude themselves from that category.
> In spirit I think this is “open source”, and I think that’s how the majority of people think.
No, it isn't. You do, but, as evidenced by other comments, there's clearly people that don't. Thinking that you're with the majority and it's just a vocal minority is one thing, but it could just as easily be said that the vocal groups objecting to your characterization are representative of the mainstream view.
If we look at these models as the output of a compiler, that we don't have the inputs to, but that we are free (ish) to use and modify and redistribute, it's a nice grant from the copyright holder, but that very much doesn't look like open source. Open source, applied to AI models would mean giving us (a reference to) the dataset and the code used to train the model so we could tweak it to train the model slightly differently. To be less apologetic or something by default, instead of having to give it additional system instructions.
Model Available (MA) is freer and more generous than model-unavailable, but it's very much not in the spirit of open source. I can't train my own model using what Meta has given us here.
And just to note, Google Gemma is the one they are releasing weights for. They are doing this and deserve credit for it.
It doesn’t mean it’s a bad license, just that it doesn’t meet the definition. There are legitimate reasons for companies to use source-available licenses. You still get to see the source code and do some useful things with it, but read the terms to see what you can do.
Meanwhile, there are also good reasons not to water down a well-defined term so it becomes meaningless like “agile” or “open.”
This gets confusing because people want to use “open source” as a sort of marketing term that just means it’s good, so if you say it’s not open source that’s taken to imply it’s bad.
But it’s also a bit absurd in a sense - let’s say you have all of Meta’s code and training data. Ok, now what? Even if you also had a couple spare data centers, unlimited money, and an army of engineers, you can’t even find enough NVIDIA cards to do the training run. This isn’t some homebrew shit, it’s millions upon millions of dollars of computational power devoted to building this thing.
I think at a fundamental level people have to start thinking a little differently about what this is, what open really means, and the like.
People are thinking about what open really means, and they're telling you this isn't open. It definitely isn't Open Source, as defined by the OSI.
Open Source has a specific meaning and this doesn't meet it. It's generous of Meta to give us these models and grant us access to them, and let us modify them, fine-tune them, and further redistribute them. It's really great! But we're still in the dark as to how they came about the weights. It's a closed, proprietary process, of which we have some details, which is interesting and all, but that's not the same as having access to the actual mechanism used to generate the model.
This is like saying an image is or isn't open source. The model itself isn't a program, so asking whether it's open source or not is a bit of a category error.
So it's a bit silly for anyone to claim a model is open source, but it's not silly to say a model is open. What open means isn't well defined when it comes to a model in the same way that source code is.
Imo if someone reveals the model's architecture and makes the weights available with minimal limitations, it's probably reasonable to call it open. I don't know that that would apply to llama though since I believe there are limitations on how you can use the model.
I think you're giving Meta one hell of a lot of credit that is entirely undeserved. This is not a charitable, net-benefit-to-humanity organization. These are not the good guys. These people are responsible for one hell of a lot of harm, and imagining they have good intentions is naive at best. I don't doubt the individual software engineers and researchers are good people. It's the corporation that's in charge of the llama product, however, and it's the lawyers, executives, and middle management that will start cracking down on technicalities and violations of the license. The precise instant that it becomes more profitable and less annoying to sue someone for violation of the license, Meta's lawyers will do so, because that's what companies are obligated to do. The second some group of shareholders start pointing out blatant violations of the license in products using llama, the lawyers will be obligated to crack down.
Meta is a corporation, and not subject to rational, good faith human judgment. It's a construct that boils down to an algorithmic implementation of the rules, regulations, internal policies, communication channels, and all those complex interactions that effectively prevent sensible, good faith human intervention at any given stage that would even allow the company to just let people continue to violate their stated license. Like trademarks, if you don't enforce a contract, the inaction dissipates your ability to enforce it later on. They don't pay these lawyers to come up with these licenses and contracts for shits and giggles.
The license is not the outcome of a happy weekend brainstorm session tacked on ad hoc just to maximize the benefit to humanity and blissfully join the wide world of open source.
The license is intended to prevent any serious competitive use of their AI models by third parties. It was crafted deliberately and carefully and expensively. They didn't use existing open source licenses because no license offered them the particular mix of rights and restrictions that fit their overall strategy. It's for PR, the ability to stifle competition, to get free beta testing and market research, and 100% of every part of the license is intentional and an insidious perversion of the idea of "open."
Meta doesn't deserve credit, they deserve condemnation. They could have gone with any number of open source licenses, using GPL or CC licensing with specific provisions to protect their interests and prevent commercial exploitation, or use dual licensing to incentivize different tiers of access. They deliberately and with a high level of effort pursued their own invented license. They are using weasel words and claiming they are open source all over the place in order to foster good will.
The argument "but nobody has been sued" is more than a little silly. There's simply no product known to use their models currently on the market that's both a blatant enough violation and worth enough money to sacrifice the good will they've been fostering. There's no human in organizations that size with the capacity to step in and prevent the lawsuits from happening. It'll be a collective, rules and policies decision completely out of anyone's hands to prevent, even if Zuck himself wanted to intervene. The shareholders' interests reign supreme.
Meta isn't a moral institution.
It's a ruthlessly profitable one.
What are the practical use cases where the license prohibits people from using Llama models? There are plenty of startups and companies that already build their business on Llama (e.g. phind.com). I do not see the issues that you assume exist.
If you get so successful that you cannot use it anymore (having 10% of earth's population as clients), you can probably train your own models already.
The parameters and the license. Mistral uses Apache 2.0, a neatly permissive open source license. As such, it's an open source model.
Models are similar to code you might run on a compiled VM or native operating system. Llama.cpp is to a model as Python is to a Python script (see the sketch below). The license lays out the rights and responsibilities of the users of the software, or the model in this case. The training data, process, and pipeline used to build the model in the first place are a distinct and separate thing from the models themselves. It'd be nice if those were open too, but when dealing with just the model:
If it uses an OSI recognized open source license, it is an open source model.
If it doesn't use an OSI recognized open source license, it's not.
Llama is not open source. It's corporate freeware.
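To illustrate the "Llama.cpp is to a model as Python is to a Python script" point above, here is a minimal sketch with the llama-cpp-python bindings; the GGUF file path is an illustrative assumption, not a file anyone here ships:

```python
# The runner (llama.cpp / its Python bindings) is open source software;
# the model file it loads is a separately licensed artifact, like a script
# handed to an interpreter.
from llama_cpp import Llama

llm = Llama(model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf")  # hypothetical local file
out = llm("Q: Name one OSI-approved license. A:", max_tokens=16)
print(out["choices"][0]["text"])
```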
Mistral is not “open source” either since we cannot reproduce it (the training data is not published). Both are open weight models, and they are both released under a license whose legal basis is unclear: it's not actually clear if they own any intellectual property over the model at all. Of course they claim such IP, but no court has ruled on this yet AFAIK and legislators could also enact laws that make these public domain altogether.
I have a hard time with the "cannot reproduce" categorization.
There are places (e.g. the Linux kernel? AMD drivers?) where lots of generated code is pushed, and (apart from rants about huge unwieldy commits and complaints that it would be better, engineering-wise, to get their hands on the code generator) it seems no one is saying the AMD drivers aren't GPL-compliant or OSI-compliant?
There is probably lots of OSS that is filled with constants and code the authors couldn't easily rederive, and we still call it OSS?
But with generated code what you end up with is still code that can be edited by whoever needs to. If AMD stopped maintaining their drivers, people could maintain the generated code; it wouldn't be a nice situation but it would work. Model weights, by contrast, are akin to the binary blobs you get in the Android world, binary blobs that nobody calls open source…
I personally think that the model artifacts are simply programs with tons of constants. Many math routines have constants in their approximations, and I don't expect the source to include the full derivation for these constants all the time. I see LLMs as the same category, but with (much) larger sets of parameters. What is better about LLMs than some of the mathematical constants in complicated function approximations is that I can go and keep training an LLM, whereas the math/engineering libraries might not make it easy for me to modify them without also figuring out the details that led to those particular parameter choices.
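"Keep training" really does only need the released weights. Here is a minimal sketch with Hugging Face transformers; the model path, data file, and hyperparameters are placeholders, not anything prescribed by Meta:

```python
# Continued pretraining from released weights: no access to the original
# training data or pipeline is required, only the weights themselves.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

name = "meta-llama/Meta-Llama-3-8B"  # assumed weight location
tokenizer = AutoTokenizer.from_pretrained(name)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding during batching
model = AutoModelForCausalLM.from_pretrained(name)

data = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
data = data.map(lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
                batched=True, remove_columns=["text"])

Trainer(
    model=model,
    args=TrainingArguments(output_dir="continued", num_train_epochs=1,
                           per_device_train_batch_size=1),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```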
Is “reproducibility” actually the right term here?
It’s a bit like arguing that Linux is not open source because you don’t have every email Linus and the maintainers ever received. Or that you don’t know what lectures Linus attended or what books he’s read.
The weights “are the thing” in the same sense that the “code is the thing”. You can modify open code and recompile it. You can similarly modify weights with fine tuning or even architectural changes. You don’t need to go “back to the beginning” in the same sense that Linux would continue to be open source even without the Git history and the LKM mailing list.
> It’s a bit like arguing that Linux is not open source because you don’t have every email Linus and the maintainers ever received. Or that you don’t know what lectures Linus attended or what books he’s read.
Linux is open source, because you can actually compile it yourself! You don't need Linus's email for that (and if you needed some secret cryptographic key on Linus' laptop to decrypt and compile the kernel, then it wouldn't make sense to call it open-source either).
A language model isn't a piece of code, it's a huge binary blob that's being executed by a small piece of code that contains little of the added value, everything that matters is in the blob. Sharing only the compiled blob and the code to run makes it unsuitable for an “open source qualifier” (It's kind of the same thing as proprietary Java code: the VM is open-source but the bytecode you run on it isn't).
And yes, you can fine-tune and change things in the model weights themselves the same way you can edit the binary of a proprietary game to disable DRMs, that doesn't make it open-source either. Fine tuning doesn't give you the same level of control over the behavior of the model as the initial training does, like binary hacking doesn't give you the same control as having the source code to edit and rebuild.
There is an argument to be made about the importance of archeological preservation of the provenance of models, especially the first few important LLMs, for study by future generations.
In general, software rot is a huge issue, and many projects which may be of future archeological importance are increasingly non-reproducible as dependencies are often not vendored and checked into source, but instead downloaded at compile time from servers which lack strong guarantees about future availability.
This comment is cooler than my Arctic Vault badge on GitHub.
Who were the countless unknown contemporaries of Giotto and Cimabue? Of Da Vinci and Michelangelo? Most of what we know about Renaissance art comes from 1 guy - Giorgio Vasari. We have more diverse information about ancient Egypt than the much more recent Italian Renaissance because of, essentially, better preservation techniques.
Compliance, interoperability, and publishing platforms for all this work (HuggingFace, Ollama, GitHub, HN) are our cathedrals and clay tablets. Who knows what works will fill the museums of tomorrow.
In today's Dwarkesh interview, Zuckerberg talks about energy becoming a limit for future models before cost or access to hardware does. Apparently the current largest datacenters consume about 100MW, but Zuck is considering future ones consuming 1GW, which is the output of a typical nuclear reactor!
So, yeah, unless you own your own world-class datacenter, complete with the nuclear reactor necessary to power the training run, then training is not an option.
On a sufficiently large time scale the real limit on everything is energy. “Cost” and “access to hardware” are mere proxies for energy available to you. This is the idea behind the Kardashev scale.
A bit odd to see this downvoted... I'm not exactly a HN newbie, but still haven't fully grasped the reasons people often downvote here - simply not liking something (regardless of relevance or correctness) seems to often be the case, and perhaps sometimes even more petty reasons.
I think Zuck's discussion of energy being the limiting factor was one of the more interesting and surprising things to come out of the Dwarkesh interview. We're used to discussion of the $1B, $10B, $100B training runs becoming unsustainable, and chip shortages as an issue, but (to me at least!) it was interesting to see Zuck say that energy usage will be a disruptor before those do (partly because of lead times and regulations in expanding power supply, and bringing it in to new data centers). The sheer magnitude of projected power consumption needed is also interesting.
There is an odd contingent or set of contingents on here that do seem to down vote by ideology rather than lack of facts or lack of courtesy. It's a bit of a shame, but I'm not sure there's much to be done.
> the same way you can edit the binary of a proprietary game to disable DRMs, that doesn't make it open-source either
This is where I have to disagree. Continuing the training of an open model is the same process as the original training run. It's not a fundamentally different operation.
> Continuing the training of an open model is the same process as the original training run. It's not a fundamentally different operation.
In practice it's not (because of LoRA), but that doesn't matter: continuing the training is just a patch on top of the initial training. It doesn't matter that this patch is applied through gradient descent as well; you are completely dependent on how the previous training was done, and your ability to overwrite the model's behavior is limited.
For instance, Meta could backdoor the model with a specially crafted group of rare tokens to which the model would respond with a pre-determined response (say “This is Llama 3 from Meta” as some kind of watermark), and you'd have no way to figure that out and get rid of it during fine-tuning. This kind of thing does not happen when you have access to the sources.
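For what it's worth, here is what the "patch on top of the initial training" mentioned above looks like in practice with LoRA, sketched with the Hugging Face peft library; the model path and hyperparameters are illustrative assumptions:

```python
# A LoRA fine-tune freezes the original weights and trains small adapter
# matrices alongside them - a patch applied via gradient descent, as described above.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("meta-llama/Meta-Llama-3-8B")  # assumed path
lora = LoraConfig(r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only a tiny fraction of the weights are trainable
```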
That's one of many techniques, and is popular because it's cheap to implement. The training of a full model can be continued with full updates, the same as the original training run.
> completely dependent on how the previous training was done, and your ability to overwrite the model's behavior is limited.
Not necessarily. You can even alter the architecture! There have been many papers about various approaches such as extending token window sizes, or adding additional skip connections, quantization, sparsity, or whatever.
> specially crafted group of rare tokens
The analogy here is that some Linux kernel developer could have left a back door in the Linux kernel source. You're arguing that Linux would only be open source if you could personally go back to the time when it was an empty folder on Linus Torvalds' computer and then reproduce every step it took to get to today's tarball of the source, including every Google search done, every book referenced, every email read, etc...
That's not what open source is. The code is open, not the process that it took to get there.
Linux development may have used information from copyrighted textbooks. The source code doesn't contain the text of those textbooks, and in some sense could not be "reproduced" without the copyrighted text.
Similarly, AIs are often trained on copyrighted textbooks but the end result is open source.
> Not necessarily. You can even alter the architecture!
You can alter the architecture, but you're still playing with an opaque blob of binary *you don't know what it's made of*.
> The analogy here is that some Linux kernel developer could have left a back door in the Linux kernel source. You're arguing that Linux would only be open source if you could personally go back to the time when it was an empty folder on Linus Torvalds' computer and then reproduce every step it took to get to today's tarball of the source, including every Google search done, every book referenced, every email read, etc...
No, it is just a bad analogy. To be sure that there's no backdoor in the Linux kernel, the code itself suffices. That doesn't mean there can be no backdoor, since it's complex enough to hide things in, but it's not the same thing as a backdoor hidden in a binary blob you cannot inspect even if you had a trillion dollars to spend on a million developers.
> The code is open, not the process that it took to get there.
The code is by definition part of a process that gets you a piece of software (the actually useful binary), and it's the part of the process that contains most of the value. Model weights are binary, and they are akin to the compiled binary of the software (training from data is a compute-intensive step like compilation from source code, but orders of magnitude more intensive).
> Similarly, AIs are often trained on copyrighted textbooks but the end result is open source.
Court decisions are pending on the mere legality of such training, and that has nothing to do with being open-source; what's at stake is whether these models can be open-weight at all, or whether it is copyright infringement to publish them.
The starting point is the ability to run the LLM as you wish, for any purpose - so if a license prohibits some uses and you have to start any usage with thinking whether it's permitted or not, that's a fail.
Then the freedom where "source" matters is the practical freedom to change the behavior so the model does your computing as you wish. And that's a bit tricky. One interpretation would require having the training data, training code, and parameters; but for current LLMs the training hardware and the cost of running it are such a large practical limitation that one could argue the ability to change the behavior (which is the core freedom we'd like) is separate from the ability to recreate the model. That freedom is more relevant in the context of the "instruction training" which happens after the main training and is the main determiner of behavior (as opposed to capability). So the main "source" would be the data for that (the instruct training data, plus the model weights before that finetuning), so you can fine-tune the model on different instructions, which requires far fewer resources than training from scratch, and you don't have to start with the instructions and values imposed on the LLM by someone else.
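As a sketch of what that instruction-tuning "source" might look like, here are a couple of records in the common instruction/response convention; the field names are an assumption, not any vendor's actual format:

```python
# A tiny instruction-tuning dataset: pairs you could fine-tune on to impose
# your own instructions and values instead of inheriting someone else's.
import json

instruct_data = [
    {"instruction": "Summarize the licence in one sentence.",
     "response": "It allows broad use but adds extra terms for very large companies."},
    {"instruction": "Decline requests for medical diagnoses.",
     "response": "I can't diagnose conditions, but I can share general information."},
]

# Serialized as JSON Lines, a usual input format for fine-tuning scripts.
with open("instruct.jsonl", "w") as f:
    for row in instruct_data:
        f.write(json.dumps(row) + "\n")
```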
See this discussion and blog post about a model called OLMo from AI2 (https://news.ycombinator.com/item?id=39974374). They try to be more truly open, although there are nuances even with them that make it not fully open. Just like with open source software, an open source model should provide everything you need to reproduce the final output, and with transparency. That means you need the training source code, the data sets, the evaluation suites, the inference code, and more.
Most of these other models, like Llama, are open weight, not open source - and open weight is just openwashing, since you're only getting the final output, like a compiled executable. But even with OLMo (and others like Databricks' DBRX) there are issues with proprietary licenses being used for some things, which prevent truly free use. For some reason, in the AI world there is heavy resistance to using OSI-approved licenses like Apache or MIT.
Finally, there is still a lack of openness and transparency on the training data sets even with models that release those data sets. This is because they do a lot of filtering to produce those data sets that happen without any transparency. For example AI2’s OLMo uses a dataset that has been filtered to remove “toxic” content or “hateful” content, with input from “ethics experts” - and this is of course a key input into the overall model that can heavily bias its performance, accuracy, and neutrality.
Unfortunately, there is a lot missing from the current AI landscape as far as openness.
Weights aren’t source because the goal of having open source software is that you can know how the software you’re consuming works, and you can produce the final software (the executable) using the source yourself. When you only have weights, you are getting something like the executable. Sure you can tweak it, but you don’t have the things you need to reproduce it or to examine how it works and validate it for your purposes. As such open weights are not in the spirit of open source.
Yes or no: do you concede that for almost everyone, none of what you said matters, that almost everyone can use Llama 3 for their use case, and that basically nobody is going to have to worry about being sued, other than maybe Google or an equivalent?
You are using all these scary words without saying the obvious, which is that for almost everyone, none of that matters.
Would you then say that in general Open Source doesn't matter for almost everyone? Most people running Linux aren't serving 700 million customers or operating military killbots with it after all.
> in general Open Source doesn't matter for almost everyone?
Most of the qualities that come with open source (which also come with Llama 3) matter a lot.
But no, it is not a binary, yes or no thing, where something is either open source and useful or not.
Instead, there is a very wide spectrum of licensing agreements. And even if something does not fit the very specific and exact definition of open source, it can still be "almost" there and therefore be basically as useful.
I am objecting to the idea that any slight deviation from the highly specific definition of open source means that it no longer "counts".
If something is 99.9% the same as open source, then you get 99.9% of the benefits, and it is dishonest to say that it is significantly different from open source.
If I build a train, put it into service, and say to the passengers “this has 99.9% of the required parts from the design”, would you ride on that train? Would you consider that train 99.9% as good at being a train? Or is it all-or-nothing?
I don’t necessarily disagree with your point about there still being value in mostly-open software, but I want to challenge your notion that you still get most of the benefit. I think it being less than 100% open does significantly decay the value, since now you will always feel uneasy adopting these models, especially into an older existing company.
You can imagine a big legacy bank having no problem adopting MIT code in their tech. But something with an esoteric license? Even if it’s probably fine to use? It’s a giant barrier to their adoption, due to the risk to their business.
That’s also not to say I’m taking it for granted. I’m incredibly thankful that this exists, and that I can download it and use it personally without worry. And the huge advancement that we’re getting, and the public is able to benefit from. But it’s still not the same as true 100% open licensing.
> If I build a train, put it into service, and say to the passengers “this has 99.9% of the required parts from the design”, would you ride on that train?
Well if the missing piece is a cup holder on the train, yes absolutely! It would absolutely be as good as the binary "contains a cup holder" train design.
So the point stands. For almost everyone, these almost-open-source licenses are good enough for their use case, and the limitations apply to almost no one.
And you have chosen a wonderful example that exactly proves my point. In your example, the people claiming that a "99.9%" train is dangerous to ride are wrong, because they're ignoring the fact that the missing 0.1% is the cup holders.
> You can imagine a big legacy bank
Fortunately, most people aren't running a big legacy bank. So the point stands, once again.
> It’s a giant barrier to their adoption
Only if you are at a big legacy bank, in your example, or similar. If you aren't in that very small percentage of the market, you are fine.
I don't support GP's claims, but you have to realize that you're "almost everyone" up until you build something very successful with lots of capital at stake, and then you definitely become "someone special" and have to think ahead about how the licenses of your models impact you.
Of course random individuals don't care much about the licenses on their personal AI projects. But if you intend to grow something significant, you better read the label from the start.
Or you could just play nice and pay Meta for the privilege at the point you are on the radar? I mean, 99% of YC startups out there are building their business on some kind of proprietary cloud API. The fact that you can even run this on your own servers is a massive departure from the entire tech ecosystem of the last 10-12 years.
> When the threshold between annoying and profitable is crossed, Meta's lawyers will start sniping and acquiring users of their IP.
I'm curious: given that the model will probably be hosted in a private server, how would meta know or prove that someone is using their model against the license?
If they can develop any evidence at all (perhaps from a whistleblower, perhaps from some characteristic unique to their model), they can sue, and then they get to do "discovery", which would force the sued party to reveal details.
This is interesting. Can you point me to an OSI discussion of what would constitute an open source license for LLMs? Obviously they have "source" (network definitions), "training data", and "weights".
Actually right now the OSI is hosting ongoing discussion this year on what it means for AI to be open source. Here is their latest blog post on the subject: