Hacker News
Mistral Removes "Committing to open models" from their website (reddit.com)
197 points by smy20011 on Feb 26, 2024 | 56 comments



Seems like "open source" as a marketing tactic (or perhaps strategy, if they do continue to release open models) has peaked. I'm not really complaining, we get a lot of stuff for free as engineers (especially software), but it does seem different for a company to release an open model without any future commitments (e.g. Google) vs making open weights your raison d'être and then pivoting quite quickly. The first feels transactional but honest, and the other a bit too... Machiavellian?

I do think it's too soon to pass judgement; this could just be the age-old "freemium" strategy, where you pay up for the bigger models if you like the smaller/cheaper/free versions.


Mistral's following OpenAI's playbook: embracing the clopen source movement. Their community betrayal (sorry, "pivot") hurts, but it shouldn't. Burning goodwill has always been far more lucrative than burning an enormous check. Hopefully Meta's strategy of committing to open source as an incentive for attracting top researchers (who can go anywhere and deeply want their work shared) keeps working. But I'm not hopeful. Eventually some Llama-n will stay closed, some researchers will leave, followed by a word-salad blog post about safety -- or whatever in-vogue pretext works best tomorrow. Ugh.


It's not nearly a comparable situation.


I think we'll continue to see open n-1 models used to build momentum for closed n models for the foreseeable future.

Which is fine, as diminishing returns will shrink that +1 difference.


This is big tech or whoever using big money to lock up what they think is the dawn of 'true AI'.

Buying people off is pretty easy! I'm sure I'd feel the same way and take the money.

But just like you can't write an iPhone app without Apple's consent, soon we won't be able to do any serious AI work without the consent of, and payment to, MS or Google or whoever.

I wonder what the talented people at OpenAI or Mistral tell themselves. That they're doing something good for society or technology? Probably, but AI is already being concentrated into a few hands. Nvidia has a virtual monopoly on hardware and training huge models is out of reach; open research got us here, but that's looking increasingly shaky.

Personally I think LLMs are a red herring, so there is still a chance to change the outcome, but we should take a lesson from what's transpired with OpenAI and Mistral and only support actual open development.


> I wonder what the talented people at OpenAI or Mistral tell themselves.

“We are doing something amazing, we need no permission”. Also, “we are good people and we have to guard this from bad actors”.

IOW, they are fooling themselves (and following the money and military contracts).


> I wonder what the talented people at OpenAI or Mistral tell themselves.

At Mistral it's probably in no small part "It shouldn't just be a few big American companies who control all of this" (and Mensch has spoken along these lines).

In the worst case you'll at least have one big EU company in the cartel, and given what the EU has done for regular people against big tech, I think that's a good thing.


Maybe it was required per the Microsoft investment. I wouldn't be surprised.


If you had put $13 billion into a moat, only to be threatened by an open source competitor, what would you do?

An extra billion or two to protect the interest of your trillion dollar company seems well worth it.


Good thing we have antitrust laws. Can’t wait to see what happens next…


Antitrust laws take decades to have an effect.


That company does try its very best to keep computing crappy.


I couldn't agree more.


Mistral just doesn't seem like an interesting company to me any more.

They started off as the kind of people who released their models as magnet links and made them actually user-aligned instead of California-aligned. This is what I like to see from an AI company. Now, their models are no different from OpenAI, Anthropic, Google, Meta and everybody else.


Could you expand more on what you mean by "California-aligned" (especially if you are thinking beyond AI models, though that alone could be interesting)?


I assume they mean "ethnically diverse Nazis"[1].

[1]: https://www.nytimes.com/2024/02/22/technology/google-gemini-...


As an aside, it's absolutely mad that the sole NYT takeaway from the whole debacle is putting people of color in Nazi uniforms.


This kind of framing is the NYT's equivalent of clickbait for their subscribers. They practice some actual clickbait as well, but imho not a lot.

(I'm a subscriber, but this kind of thing irritates me and makes the paper less useful, the same way Google's search has become less useful.)


The way I see it, there are three kinds of models:

1. Unaligned models: You ask them a question, and they complete your prompt with more details about the question to be answered instead of the answer. This is what you get from a base LLM trained on the internet; it's how GPT-2 and GPT-3 worked before ChatGPT was invented. Such models aren't very useful for chat: you need weird prompt-engineering tricks to actually get an answer instead of a clarification of your question. The original Mistral 7B was also of this kind.

2. User-aligned models: Aligned to answer questions and follow instructions, but no more and no less. If you ask them how to kill your wife, how to cook meth or how to make an atomic bomb, they'll happily help you and regurgitate the facts you can already find on the internet. They have no access to non-public information, and the "dangerous" things they can tell you are already pretty easy to find, so the danger if you're using them for chat is actually minimal. However, they're far easier to use for troll farms, mass phishing campaigns, sockpuppets, political misinformation campaigns, etc. If you ask them to engage with a pro-Biden tweet in the most triggering and overtly racist way possible and throw some pro-Putin angle into the mix, they'll happily accommodate your request, and you can do this at scale, paying orders of magnitude less than you would for a content farm. Mistral 7B Instruct is a good example of such a model.

3. California-aligned models: They'll happily fulfill your request, unless it conflicts with DEI ideology, which has a particular foothold in CA, sort of exists in other parts of the US, and is completely absent in non-English-speaking countries, even among extremely left-leaning populations. Google Gemini is the most egregious example.


Meta actually releases the weights though (for now)


It will be interesting to see if/how they walk back from releasing the weights. They have put a lot of effort into the "open" (not open source, but very intentionally trying to conflate the two) approach to AI. But probably almost nobody is paying attention.

On the other hand, Meta has very little to gain by closing off their models. What would they do with them, and who would use them? Llama was a coup because, even though the license sucked, a passable model with available weights plus llama.cpp allowed it to soar over the others of the time. Hopefully the benefit they got from that trumps any calls from the safety crowd to stop sharing weights.


Really depends on how good Llama 3 will be. I see people thinking it will be better than GPT-4, but they still need to build a model even as good as GPT-3.5, and Llama 2 is... not.


But surely they'll still remain as they were, i.e. not California-aligned?


Straight after an MS investment? Pretty sure somewhere an MS competition lawyer just choked on his bagel.

Also, what's the deal with company commitments being more like one-night stands these days?


Corporations are people, my friend!

And people, especially "businesspeople" have been getting pretty cutthroat, on the whole.


It seems to me that large software companies often adopt an open-source approach initially to attract enthusiasts and stand out from leading competitors, but tend to adopt similar philosophies as their rivals once they achieve significant recognition or investment.

It's all a marketing tactic, and I ain't for it.


You can see this as an endgame of 'commoditize your complement' (https://gwern.net/complement): you're happy to contribute to commodification while you are the small scrappy player, but at some point, if you are really successful, you will want to pull up the ladder after yourself, as it were.


Idle question: what's the complement of the model? The GPUs? By that logic, the hardware companies will be your best bet for open model development. (Everything old is new again).


The complement play for Mistral was that they'd be the low-cost SaaS serving their own models, or any better ones the FLOSS releases spawned, with the upsell: you prototype and develop on Mistral models like Mixtral, and then when you become a big boy, you outsource hosting or customization to them (with their razor-thin margins enforced by competition). The model releases generate demand for their low-cost hosting and consulting. (Think Red Hat, not OpenAI.)

As opposed to an all-inclusive API, where the pricing can be as high as the consumer surplus, with very fat margins, because it's all-or-nothing.


Model development has become a seriously capital-intensive endeavor. Mistral probably found themselves cornered by this commitment: they wouldn't be able to secure any serious investment, from MSFT in this case, without changing their stance.


Damn, I was hoping Mistral would carry the open source torch for AI. I might be forced to create an actual open source AI company.


I hope they keep tweeting magnet links for model releases :(


Good thing that https://huggingface.co/TheBloke/dolphin-2_2-yi-34b-GGUF surpasses even Mixtral MoE. Let's hope this continues.


Waiting until Huggingface pulls the same stunt. "Oh, all those models and datasets you uploaded? They're ours now. Thanks! ^_^ "


I'm afraid I don't share your enthusiasm. There are open source Dolphin models, but the ones based on Yi are not, because Yi is not.


They imagined that it would be really difficult to run the models locally, so people wouldn't bother, but with stuff like llama.cpp & friends it's trivial.
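
For a sense of how trivial, here's a minimal sketch using the llama-cpp-python bindings (pip install llama-cpp-python); the GGUF path and prompt are placeholders, so point it at whatever model file you've downloaded:

  # Minimal local-inference sketch with llama-cpp-python.
  # The model path is a placeholder: use any GGUF file,
  # e.g. one downloaded from Hugging Face.
  from llama_cpp import Llama

  llm = Llama(model_path="./mistral-7b-instruct.Q4_K_M.gguf", n_ctx=2048)
  out = llm("Q: Name the planets in the solar system. A:",
            max_tokens=64, stop=["Q:"])
  print(out["choices"][0]["text"])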



I suppose the only major player in open source models will be Meta, then.


I can't get tired of repeating this:

The underlying fundamental problem is that capitalism does not play nice with digital assets.

An AI model is a very valuable digital asset right now, so there's a covert war being fought over public access to these models.


I actually see copyright and patents as a tiny socialist component of the present composite system, i.e. 'to each according to his contribution'.

They're one of the small ways in which someone who actually does something gets a temporary monopoly on his little contribution, and they're necessary to make the whole edifice function: there is a need to ensure that people actually invent things and start companies, and if it were up to capital owners only, that wouldn't really happen. Ordinary people would have very minimal incentive to use their intellect to discover genuinely novel things.

Public schooling is another, though that one is rather a communist component, a 'to each according to his need'.

But I suppose my view is consistent with the view that capitalism doesn't play nice with digital assets. The present system does need these stabilising socialist and communist elements in order to function, and I think they are genuine parts of the present system, just as capital ownership and wage labour are.


The real fundamental problem is that "digital assets" are bullshit.

An AI model is literally constructed by explicitly disrespecting copyright. The idea that a company gets to turn around and demand respect for their AI model's copyright is patently absurd.


I feel like this can be trivially worked around by fine-tuning from the initially copyrighted weights. They're not demanding copyright; they're just keeping the weights secret. If Meta keeps releasing high-quality open source models, I don't expect the closed source models to have an advantage for long.


Intellectual property itself is a bad concept. Capitalism works for squeezing margins out of bulk commodity production (largely at the expense of the worker), but for zero-marginal-cost stuff it's a huge hindrance to progress.


But if you have cloud-based models... same as with mainframes 30-40 years ago... it doesn't matter, because the power the cloud gives to the model owners is 10x more than what the strictest intellectual property laws do.

I don't believe, for example, that there's any intellectual property law that would let me yoink my intellectual property from you for any reason after you've bought and paid for it.

In other words: I think the capitalism discussion is kind of pointless here. Capitalism isn't what gives these companies power. It's the cloud. Mainframe computing 2.0.


From reddit:

  Chinese models seem to be the last hope now, LOL.
It's going to be really interesting, as two poles of power develop geopolitically, if the west (or whatever you call it) has to look to China, the pole we (the west) would consider ourselves to look down on, for actually "free" ML models.

(Edit: reminds me a bit of the Silicon Valley joke where Erlich tells Jian-Yang he can't smoke in California, as we don't enjoy the same freedoms you do in China.)


Deepseek's code (https://github.com/deepseek-ai/DeepSeek-Coder?tab=readme-ov-...) is MIT and the model license is available too.

Edit: I should mention that the web service Deepseek provides will unceremoniously shut down conversations on terms deemed too sensitive. Self-hosted models do not appear to be as aggressive.
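
If you'd rather self-host, here's a sketch of the usual Hugging Face transformers route (assuming a GPU with enough memory; the model id follows the DeepSeek-Coder repo, and the prompt and generation settings are placeholders):

  # Sketch: run deepseek-coder locally with transformers
  # instead of the hosted web service.
  import torch
  from transformers import AutoModelForCausalLM, AutoTokenizer

  model_id = "deepseek-ai/deepseek-coder-6.7b-instruct"
  tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
  model = AutoModelForCausalLM.from_pretrained(
      model_id, torch_dtype=torch.bfloat16,
      device_map="auto", trust_remote_code=True)

  messages = [{"role": "user", "content": "Write a quicksort in Python."}]
  inputs = tok.apply_chat_template(
      messages, add_generation_prompt=True, return_tensors="pt"
  ).to(model.device)
  out = model.generate(inputs, max_new_tokens=256)
  # Decode only the newly generated tokens, not the prompt.
  print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))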


I'm not really familiar with Chinese AI research, but my passing familiarity with the CCP makes me seriously question how free those ML models actually are, or will remain (if they currently are). I can't imagine the CCP wants AI enabling dissent or critical discussion of the CCP.


They are actually very good (cf. https://qwenlm.github.io/blog/qwen-vl/), and it's possible to unalign them (like Dolphin-Yi).


The Chinese government is already pushing for regulations to make sure AI doesn't "disrupt" society, etc.


Well this conversation just got very hard to have.

A free model that appears to be "unaligned" would be a huge win for China.

Think of it this way: a model that calls Taiwan's independence open for debate, while we won't say it's a country, is a massive tip on the scales.

Now pick another politically divisive hot-button POV and have it be truly neutral (merits of the arguments notwithstanding).

What does the US do at that point? Tell us we can't use it? How does the EU react?

>> (Edit, reminds me a bit of the silicon valley joke where Erlich tells Jin Yang he can't smoke in California as we don't enjoy the same freedoms you do in China)

This reminds me of Chinese kids writing letters in the 90s to the US embassy to free Leonard Peltier.


Maybe we need a meta website similar to Dogpile from the 90s which would search all the search engines.

When Gemini refuses to draw you a white family having a picnic, China-GPT can help out. Ask a question about Taiwan and maybe something else can answer, and so on. Also, a wokeness benchmark would be great.


Time to build a moat.


Competition is for losers.


Is that you, Peter Thiel?


That's a shame. We cannot allow a handful of companies and VCs to capture most of the value of AI; especially if it actually starts replacing human jobs, it'll just accelerate wealth inequality and social unrest.


The economic prospects are grim. Couple that with creating a world where humanity is both at the whim of these systems' tuning and fundamentally unable to observe and learn about this golem, with IP keeping it as magic in our world: it feels like the most infernal of machines.



