They have more fully open stuff in the pipeline. IMHO it's good that they put out stuff for hobbyists to play around with so that they're not immediately overtaken by people ready to deploy things at commercial scale.
The parent comment was referring to free ($0 a month) models. With those, companies don't need to plan around the possibility that Stability AI hikes up the price afterwards.
You forgot the part where a holding company eventually buys Stability and starts to squeeze the customers who built the core of their product around a pricing structure that's "static + CPI".
With a bit less snark: I think the pricing structure is reasonable, but that doesn't mean customers won't eventually get screwed over by it. See the Unity fiasco from earlier this year (which was backed out, but serves well as an example).
> See the Unity fiasco from earlier this year (which was backed out, but serves well as an example).
They learned from the Russians. The Cuban missile crisis wasn't really about Russian missiles in Cuba; it was about establishing an air base there. The missiles were the "price shock" item, and the air base was the goal. If the Russians had put in just an air base, they very likely would never have been able to keep it.
I think the point is that a company can be owned and run by "the good guys" until it isn't. This is why so many people are worried about the future of Bandcamp.
And 15% of revenue after you pass the "tiny startup" thresholds?
This is shaping up to be the same hook Unity has, except you have to pay for model infra and engineering on top of the product development costs.
It'd be nice if your company morphed into an open source HuggingFace + Together.ai / Replicate.ai where the model code is truly 100% open source, with no commercial use or revenue restriction clause.
Basically, I'm eagerly awaiting the Gitlab of AI to emerge.
I assume you have investor / revenue targets, though. And it's your company.
Nah, it's a flat fee for all the models, no revenue share.
Amazon Prime, but for generative AI. Large companies have already signed up for huge deals and everyone has been happy with the pricing. It's designed as a simple base; then we upsell other services while focusing on research.
This lets us grow with the market, and we'll do models of every modality for every country, including new BERTs and more.
We fund huge amounts of OSS, gave over 10m A100 hours in the last year, and almost all of our collaborations are open.
For the Stable series, we are making it simple and scalable so we don't need to work against the people using the models.
After the Unity re-licensing fiasco and OpenAI yanking old models, are there any protections in place that allow the use of StableLM-Zephyr-3B indefinitely, or will you be able to simply deny continued access to the models?
> Hardware: StableLM Zephyr 3B was trained on the Stability AI cluster across 8 nodes with 8 A100 80GB GPUs per node.
I might be missing it, but do they say how many training tokens were used to train this?
This would help efforts like TinyLlama that are trying to figure out how scaling works with training tokens vs. parameter count, challenging the Chinchilla scaling laws (rough numbers in the sketch below).
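For context, here's a rough back-of-the-envelope sketch in Python, assuming the commonly cited ~20-tokens-per-parameter Chinchilla heuristic (the exact coefficients vary by fit, and the TinyLlama token count is that project's published target, not anything from this announcement):

```python
# Rough Chinchilla-style estimate of compute-optimal training tokens.
# Assumes the widely cited ~20 tokens per parameter heuristic from
# Hoffmann et al. (2022); the exact ratio depends on the fit used.

TOKENS_PER_PARAM = 20  # heuristic, not a law

def chinchilla_optimal_tokens(n_params: float) -> float:
    """Approximate compute-optimal training tokens for a given model size."""
    return TOKENS_PER_PARAM * n_params

for label, n in [("StableLM Zephyr 3B", 3e9), ("TinyLlama 1.1B", 1.1e9)]:
    print(f"{label}: ~{chinchilla_optimal_tokens(n) / 1e9:.0f}B tokens")

# StableLM Zephyr 3B: ~60B tokens
# TinyLlama 1.1B: ~22B tokens
#
# TinyLlama actually trains on ~3T tokens, i.e. far past "optimal" --
# exactly the over-training regime the token count would shed light on.
```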
How do those licenses work for generated content?
If it's non-commercial, does that mean I can still use it at work to generate stuff?
In other words, is it similar to, e.g., using GIMP, which is open source, but where I can still use the created content in a commercial product without attribution?
Yeah, I think this is a great release, but I also suspect that most people won't end up using it just because of the license. It's actually a lot more restrictive than what I would personally consider "commercial" usage:
> Non-Commercial Uses does not include any production use of the Software Products or any Derivative Works.
So even if you want to launch a free service using this, that's not allowed.
Not really. They already chose to show the benchmark where it does best and even then it’s still quite a bit worse (though definitely impressive for its size).
If you take a look at other benchmarks, for example MMLU (5-shot), this model scores 46.3 while GPT-3.5 scores 70.
But there might be some use cases where this one is close enough in performance that the difference in cost and speed makes it a better choice.
Benchmarks are either limited, leak training data, or in most cases just don't reflect real usability, so I've personally stopped looking at them to compare models. If I want to try a new model I hear a lot of chatter about, I use it for a few hours in my daily workflow. My baselines are GPT-3.5 and GPT-4, and I compare new models against them in terms of my day-to-day usage.
The LLM field is still messy at large: if you look at model-performance rankings, they still don't reflect usability in real life. I think one major challenge is finding a benchmark that does.
I'm very interested in high quality 3B models, but it's hard to get excited about this given the increasing array of commercially usable models.