Not all 'open source' AI models are open: here's a ranking (nature.com)
127 points by weinzierl 9 months ago | 18 comments



For some snark, I'd include "OpenAI" in the table with X on all the metrics...


The authors did exactly that in the paper that the article attempts to summarise:

https://dl.acm.org/doi/pdf/10.1145/3630106.3659005#page=6


Interesting, I wonder why it was omitted in Springer's top tier journal Nature? Oh yeah: https://openai.com/index/axel-springer-partnership/

Edit: Looks like the science publishing company is "operated independently", despite family ties - https://media.springernature.com/full/springer-cms/rest/v1/c...


There are no family ties. These are two completely different companies.

Axel Springer is the company that bought Politico (and produces rags like Bild in Germany).

Springer Nature is majority owned by Holtzbrinck, a competitor to Axel Springer.


Thx. The Springer Nature PDF is somehow STILL confusing as it addresses family ties to NY but not to Axel Springer. However, I guess my priors regarding family business were overstrong.


Why include OpenAI's non-open-source models in a list of open source wannabes?


The entire article is about the semantics of the word "open", so an organization literally named "OpenAI" is obviously relevant on the article's own terms.

Besides which, this is a summary of a research paper that DID include OpenAI on the list. Omitting it is one of several choices that change the takeaways of the paper (linked above), which did a much better job at acknowledging and sorting the benefits of different varieties of "open"-ness.

Overall, TFA takes a cherry-picked view of the naming controversy that hides the fact that some participants are contributing much more to public research and access than others.


Because it's officially "open" AI?


Which models claim that they are "open source" but are not? I think none of them. I think people confuse "open weight" with "open source." That confusion comes from the "internet AI experts," not the labs that produced them.


Facebook has been touting their work as both "open source" and "open science", despite being in direct violation of the terms [1, 2]. Others have done similar things, but not on such a big scale and with models with such big reach. In the community, the terms have now become sufficiently muddied that it is hard to have a conversation about what an open model even is.

[1]: https://arxiv.org/abs/2307.09288

[2]: https://opensource.org/blog/metas-llama-2-license-is-not-ope...

Thankfully, OLMo [3] reverted their decision to go with a license that contained usage restrictions, and hopefully this sets the standard going forward so that corporate actors can no longer appropriate the term and will find a suitable term of their own, such as "available". But it pains me to see that they had to write "truly open" to distinguish themselves from the "poseurs".

[3]: https://allenai.org/olmo


When Llama got leaked, Meta went full EEE (embrace, extend, extinguish), and now the discussion around open source and LLMs will take years to straighten out.


Mistral even used the term 'truly open' in their announcement:

https://mistral.ai/news/mixtral-8x22b/

Also from the Llama 3 announcement: 'Today, we’re introducing Meta Llama 3, the next generation of our state-of-the-art open source large language model.'


Section 1.3 of the paper has lots of examples.

https://dl.acm.org/doi/pdf/10.1145/3630106.3659005#page=6


The confusion also stems from companies deliberately using the term for marketing purposes. Strong separation between the open source, academia and business environments doesn’t exist as much as it used to. Can’t imagine a Richard Stallman or Linus Torvalds engaging in blurry half-open source, half business projects.


Aside from what others have mentioned (tons of them do use the term open source), "open weight" is deliberately meant to parallel "open source" but in practice it doesn't. Most "open weight" models have usage restrictions that would make a code license most definitely not open.

The right term for most of these models would be "weights available".


Standard open source rules apply. Unless you can:

Modify the source code

Compile it yourself*

And be able to do it on your own compute**

Then it’s not open

* Which would include the data and the “compiler”, which is generally the non-FOSS cuDNN or CUDA drivers btw

** It’s almost like we had a mini-computer, microprocessor revolution 40 years ago for precisely the same reason


I think France (Macron) helped push through the 'open' exemption, whatever that practically turns out to be, for the final EU AI Act... It is obviously not lost on people that France is where Mistral lives.


When they say “LLM data”, does that usually include the tokenizer as well? Beginner question from someone at the end of Karpathy’s Zero to Hero course.



