The majority of that revenue comes from violating data protection law, and regulators and litigants are slowly racking up a series of wins that will gut ads margins.
There is no Plan B, they are just going to break the law until they can’t and there’s zero clue what happens after that.
They sat back and let OpenAI kick their ass precisely because ghouls like Prabakar call the shots and LLMs are not a good display ads fit.
As long as the US government can afford to run trillion-dollar deficits. Once the budget cuts take place and the defense budget gets cut in half, all bets are off.
Give me another one then? It’s a search engine with no search index. It’s an ads supported site without an ads network. What is it then if not a wrapper?
I’m saying it’s not worth anything because it is a Bing wrapper. The other “products” are just to drive more traffic to Bing ads. There’s no there there.
That’s a pretty disingenuous framing. Obviously the content of the communication (what you ask the chatbot) can totally be personally identifiable, and is stored.
You aren’t providing communications content privacy, you’re providing a metadata proxy, which is not remotely the same.
If you submit personal information in your Prompts, it may be reproduced in the Outputs, but no one can tell whether it was you personally submitting the Prompts or someone else.
True, but it's also the best possible thing until they can run an LLM in your browser (or maybe we figure out how to do homomorphic encryption of neural networks).
What else should they call it? I suppose "anonymous" could work (i.e. they only know about you what you tell them).
Correct, that would be full communications content protection. This is not that; this is a pretty half-assed product with a misleading set of promises.
Use one of the many open models that beat GPT-3.5 and you have a real, and better, product.
I literally have no idea what the point of this is, or why they didn’t take the actual step of rolling out a real private LLM. It’s literally inferior to what Mixtral provides right now, under the Apache license.
I don’t understand how the idea of open source became some sort of pseudo-legalistic purity test on everything.
Models aren’t code, some of the concepts of open source code don’t map 1:1 to freely available models.
In spirit I think this is “open source”, and I think that’s how the majority of people think.
Turning everything into some sort of theological debate takes away a lot of credit that Meta deserves. Google isn’t doing this. OpenAI sure as fuck isn’t.
> Turning everything into some sort of theological debate takes away a lot of credit that Meta deserves.
It's not theological, it's the misuse of a specific legal definition that we all have interest in maintaining. "Freely available models" or "open license" are accurate.
Other companies keeping things for themselves doesn't warp reality, or the existing definitions we use to describe it. Giving them the credit they deserve, especially in comparison to the others, should be enough.
Hate to break it to you but there’s a thousand court cases a day precisely because “specific legal definition” is a surprisingly flexible concept depending on context. Likewise when new technologies emerge it often requires reappraisal and interpretation of existing laws, even if that reappraisal is simply extending the old law to the new context.
This isn't a problem of interpretation, as I would guess those are. This is a term that clearly describes requirements for a category, with these models' licenses purposefully and directly excluding themselves from that category.
> In spirit I think this is “open source”, and I think that’s how the majority of people think.
No, it isn't. You do, but, as evidenced by other comments, there's clearly people that don't. Thinking that you're with the majority and it's just a vocal minority is one thing, but it could just as easily be said that the vocal groups objecting to your characterization are representative of the mainstream view.
If we look at these models as the output of a compiler, that we don't have the inputs to, but that we are free (ish) to use and modify and redistribute, it's a nice grant from the copyright holder, but that very much doesn't look like open source. Open source, applied to AI models would mean giving us (a reference to) the dataset and the code used to train the model so we could tweak it to train the model slightly differently. To be less apologetic or something by default, instead of having to give it additional system instructions.
Model Available (MA) is freer and more generous than model unavailable, but it's very much not in the spirit of open source. I can't train my own model using what Meta has given us here.
And just to note, Gemma is the model Google is releasing weights for. They are doing this and deserve credit for it.
It doesn’t mean it’s a bad license, just that it doesn’t meet the definition. There are legitimate reasons for companies to use source-available licenses. You still get to see the source code and do some useful things with it, but read the terms to see what you can do.
Meanwhile, there are also good reasons not to water down a well-defined term so it becomes meaningless like “agile” or “open.”
This gets confusing because people want to use “open source” as a sort of marketing term that just means it’s good, so if you say it’s not open source that’s taken to imply it’s bad.
But it’s also a bit absurd in a sense - let’s say you have all of Meta’s code and training data. Ok, now what? Even if you also had a couple spare data centers, unlimited money, and an army of engineers, you can’t even find enough NVIDIA cards to do the training run. This isn’t some homebrew shit, it’s millions upon millions of dollars of computational power devoted to building this thing.
I think at a fundamental level people have to start thinking a little differently about what this is, what open really means, and the like.
People are thinking about what open really means, and they're telling you this isn't open. It definitely isn't Open Source, as defined by the OSI.
Open Source has a specific meaning and this doesn't meet it. It's generous of Meta to give us these models, grant us access to them, and let us modify them, fine tune them, and further redistribute them. It's really great! But we're still in the dark as to how they arrived at the weights. It's a closed, proprietary process, of which we have some details, which is interesting and all, but that's not the same as having access to the actual mechanism used to generate the model.
This is like saying an image is or isn't open source. The model itself isn't a program, so asking whether it's open source or not is a bit of a category error.
So it's a bit silly for anyone to claim a model is open source, but it's not silly to say a model is open. "Open" just isn't as well defined for a model as it is for source code.
Imo if someone reveals the model's architecture and makes the weights available with minimal limitations, it's probably reasonable to call it open. I don't know that that would apply to llama though since I believe there are limitations on how you can use the model.
I think you're giving Meta one hell of a lot of credit that is entirely undeserved. This is not a charitable, net benefit to humanity organization. These are not the good guys. These people are responsible for one hell of a lot of harm, and imagining they have good intentions is naive at best. I don't doubt the individual software engineers and researchers are good people. It's the corporation that's in charge of the llama product, however, and it's the lawyers, executives, and middle management that will start cracking down on technicalities and violations of the license. The precise instant that it becomes more profitable and less annoying to sue someone for violation of the license, Meta's lawyers will do so, because that's what companies are obligated to do. The second some group of shareholders start pointing out blatant violations of the license in products using llama, the lawyers will be obligated to crack down.
Meta is a corporation, and not subject to rational, good faith human judgment. It's a construct that boils down to an algorithmic implementation of the rules, regulations, internal policies, communication channels, and all those complex interactions that effectively prevent sensible, good faith human intervention at any given stage that would even allow the company to just let people continue to violate their stated license. As with trademarks, if you don't enforce a contract, that inaction dissipates your ability to enforce it later on. They don't pay these lawyers to come up with these licenses and contracts for shits and giggles.
The license is not the outcome of a happy weekend brainstorm session tacked on ad hoc just to maximize the benefit to humanity and blissfully join the wide world of open source.
The license is intended to prevent any serious competitive use of their AI models by third parties. It was crafted deliberately and carefully and expensively. They didn't use existing open source licenses because no license offered them the particular mix of rights and restrictions that fit their overall strategy. It's for PR, the ability to stifle competition, to get free beta testing and market research, and 100% of every part of the license is intentional and an insidious perversion of the idea of "open."
Meta doesn't deserve credit, they deserve condemnation. They could have gone with any number of open source licenses, using GPL or CC licensing with specific provisions to protect their interests and prevent commercial exploitation, or use dual licensing to incentivize different tiers of access. They deliberately and with a high level of effort pursued their own invented license. They are using weasel words and claiming they are open source all over the place in order to foster good will.
The argument "but nobody has been sued" is more than a little silly. There's simply no product known to use their models currently on the market that's both a blatant enough violation and worth enough money to sacrifice the good will they've been fostering. There's no human in an organization that size with the capacity to step in and prevent the lawsuits from happening. It'll be a collective, rules and policies decision completely out of anyone's hands to prevent, even if Zuck himself wanted to intervene. The shareholders' interests reign supreme.
Meta isn't a moral institution.
It's a ruthlessly profitable one.
The best parallel for Google is Kodak.