StableLM Zephyr 3B (stability.ai)
123 points by roborovskis on Dec 7, 2023 | 38 comments



"This model is being released under a non-commercial license that permits non-commercial use."

I'm very interested in high quality 3B models, but it's hard to get excited about this given the increasing array of commercially usable models.


They have more fully open stuff in the pipeline. IMHO it's good that they put out stuff for hobbyists to play around with so that they're not immediately overtaken by people ready to deploy things at commercial scale.


It will be included under our membership next week, which starts at $1 a month after a grant ($20 base).


The parent comment was referring to free ($0 a month) models. With those, companies don't need to plan around the possibility that Stability AI hikes up the price afterwards.


It'll be static with a CPI-indexed maximum: a flat membership fee for all core models.

We just released video, SDXL Turbo, and 3D; code and more are coming.

The reaction has been very positive so far, and we will still do our grants for OSS and do OSS collaborations; we've given over 10M A100 hours over the last year.

It launches in the next few days: https://x.com/emostaque/status/1732197072290353455?s=46


You forgot the part where a holding company eventually buys Stability and starts to squeeze the customers who built the core part of their product around a pricing structure that's "static + CPI".

With a bit less snark: I think the pricing structure is reasonable, but that doesn't mean customers won't eventually get screwed over by it. See the Unity fiasco from earlier this year (which was backed out, but serves well as an example).


> See the Unity fiasco from earlier this year (which was backed out, but serves well as an example).

They learned from the Russians. The Cuban missile crisis wasn't about Russian missiles in Cuba; it was about establishing an air base in Cuba. The missiles were the "price shock" item, and the airbase was the goal. If the Russians had put in just an airbase, they very likely never would have kept it.


I know it well from my video game days, but it's being set up for huge predictability/stability, given we have straightforward control of the company.

The base membership is even separate from other services, designed to support open model development.


I think the point is that a company can be owned and run by "the good guys" until it isn't. This is why so many people are worried about the future of Bandcamp.


> Above 1m figuring out, contact https://stability.ai/contact

And 15% of revenue after you pass the "tiny startup" thresholds?

This looks to be shaping up as the same hook that Unity has, except you have to pay for model infra and engineering in addition to the product development costs.

It'd be nice if your company morphed into an open source HuggingFace + Together.ai / Replicate.ai where the model code is truly 100% open source, with no commercial use or revenue restriction clause.

Basically, I'm eagerly awaiting the Gitlab of AI to emerge.

I assume you have investor / revenue targets, though. And it's your company.


Nah, it's a flat fee for all the models, no revenue share.

Amazon Prime, but for generative AI. Large companies have signed up for huge deals already and everyone has been happy with the pricing; it's designed as a simple base, then we upsell other services while focusing on research.

This lets us grow with the market, and we will do models of every modality for every country, including new BERTs and more.

We fund huge amounts of OSS, gave over 10M A100 hours in the last year, and almost all our collaborations are open.

For the Stable series we are making it simple and scalable, so we don't need to work against the people using the models.


After the Unity re-licensing fiasco, and OpenAI yanking old models, are there any protections in place to allow the use of StableLM-Zephyr-3B indefinitely, or will you be able to just deny any continuing access to the models?


We looked at that. Commercial use will be self-service with flat pricing including all base models, and the weights are all downloadable by anyone.

Models are very interesting


Can you link to an explanation of how membership and licensing works for commercial use?


Yeah, Replit is likely the best option out there for a 3B model size, right?


Refact has a decent 1.6B model that I think is better

https://huggingface.co/smallcloudai/Refact-1_6B-fim


This space is so confusing when it comes to licenses.

“Zephyr” is MIT, but “Stability Zephyr” is non commercial. They could have at least used a different name.

“Inspired” in all but license it would seem


> Hardware: StableLM Zephyr 3B was trained on the Stability AI cluster across 8 nodes with 8 A100 80GB GPUs for each node.

I might be missing it but do they say the number of training tokens that was used to train this?

This would help with efforts like TinyLlama that try to figure out how well scaling works with training tokens vs. parameter size, challenging the Chinchilla scaling laws.
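To make that Chinchilla comparison concrete, here is a rough back-of-envelope sketch. It assumes the 4-trillion-token figure Stability cites for the base model and the commonly quoted ~20 tokens-per-parameter Chinchilla-optimal ratio (both ballpark inputs, not exact):

```python
# Back-of-envelope: how far past "Chinchilla-optimal" is a 3B model
# trained on 4T tokens? (~20 tokens per parameter is the usual
# rule-of-thumb from the Chinchilla scaling-law results.)
params = 3e9            # 3B parameters
tokens = 4e12           # 4 trillion training tokens
chinchilla_ratio = 20   # approximate Chinchilla-optimal tokens/param

ratio = tokens / params
print(f"tokens per parameter: {ratio:.0f}")                      # ~1333
print(f"multiple of Chinchilla-optimal: {ratio / chinchilla_ratio:.0f}")  # ~67
```

So the base model is trained far beyond the Chinchilla-optimal point, which is exactly the regime TinyLlama-style efforts are probing.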


We included full training details for the base model, trained on 4 trillion tokens, including wandb logs etc.

https://stability.wandb.io/stability-llm/stable-lm/reports/S...


How do those licenses work for generated content? If it's non-commercial, does that mean I can still use it at work to generate stuff? In other words, is it similar to e.g. GIMP, which is open source, but I can still use created content in a commercial product without attribution?


Note that it uses a non-commercial license. Still pretty cool though!


Yeah, I think this is a great release, but I also suspect that most people won't end up using it just because of the license. It's actually a lot more restrictive than what I would personally consider "commercial" usage:

> Non-Commercial Uses does not include any production use of the Software Products or any Derivative Works.

So even if you want to launch a free service using this, that's not allowed.


Am I reading it right that performance was roughly comparable with GPT-3.5? How is this even possible?


Not really. They chose to show the benchmark where it does best, and even then it's still quite a bit worse (though definitely impressive for its size). If you look at other benchmarks, for example MMLU @ 5-shot, this does 46.3 while GPT-3.5 does 70.

But there might be some use cases where this one is close enough in performance and the difference in cost and speed make it a better choice.


No, it's not (according to their benchmarks).

Zephyr-7B-β still beats it in most benchmarks, but it's close.

This model gets almost Zephyr-7B-β performance at 3B size, which is a lot better for inference requirements.


Yeah, it's got a way to go to beat 3.5, but it beats most of the first-generation LLaMA tunes, even Guanaco 65B.

Lots of improvements to go


By comparing on benchmarks that are either limited, have data leaks, or in most cases just don't make sense in terms of usability. I've personally stopped looking at benchmarks to compare models. If I want to try a new model I hear a lot of chatter about, I use it for a few hours in my daily workflow. My baseline is GPT-3.5 and GPT-4, and I compare models against them in terms of my day-to-day usage.


So in your experience which open model is currently the best?


The LLM field is still messy at large: if you look at the rankings of model performance, they still do not reflect usability in real life. I think one major challenge is finding a benchmark that corresponds to real use.


Can't wait for someone smarter than me to make this compatible with MLC on iPhone.


You can do this by just following a tutorial: https://huggingface.co/docs/diffusers/main/en/using-diffuser.... ML/AI models are just function graphs and most of the frameworks support saving and loading safetensor serialized graphs.


Stability is apparently up for sale, hence the recent steady stream of releases


How fast are these small models on a 4090? Is it like 100 ms? 500 ms?


Mistral-7B gives you 80 tokens/second on a 4090, so this one will be faster...


It’s about twice the speed
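Turning those throughput figures into per-token latency is just a reciprocal. A quick sketch, using the numbers mentioned above (80 tok/s for Mistral-7B on a 4090, and roughly 2x that for a 3B model — both ballpark claims from this thread, not measurements):

```python
# Convert generation throughput (tokens/second) to per-token latency (ms).
def ms_per_token(tokens_per_second: float) -> float:
    return 1000.0 / tokens_per_second

print(f"Mistral-7B @ 80 tok/s: {ms_per_token(80):.2f} ms/token")   # 12.50 ms
print(f"3B @ ~2x (160 tok/s):  {ms_per_token(160):.2f} ms/token")  # 6.25 ms
```

So neither model is anywhere near 100 ms per token on a 4090; the 100-500 ms range is closer to what a full short *response* might take.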


How would one go about making a .llamafile for this?


Convert it to GGML and use a zip tool to add that to a llamafile package.

https://huggingface.co/TheBloke?search_models=Zephyr doesn't have a GGML for it yet but I wouldn't be surprised to see one by the end of the day.
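The "use a zip tool" step works because a llamafile executable is also a valid zip archive, so the weights can be appended as a zip member. A minimal sketch with Python's stdlib, using stub files in place of the real binary and GGUF (all filenames here are hypothetical; the llamafile project itself ships a `zipalign` tool for this so the embedded weights stay page-aligned for mmap, which a plain zip append does not guarantee):

```python
import pathlib
import shutil
import zipfile

# Stand-ins for the real artifacts (hypothetical contents):
pathlib.Path("llamafile").write_bytes(b"#!stub llama.cpp runner\n")
pathlib.Path("model.gguf").write_bytes(b"GGUF stub weights")

# Copy the stock runner, then append the weights as a zip member.
# Opening a non-zip file in "a" mode appends a new zip archive to it.
shutil.copy("llamafile", "model.llamafile")
with zipfile.ZipFile("model.llamafile", "a") as zf:
    zf.write("model.gguf")

# The result is still readable as a zip, with the weights inside.
print(zipfile.ZipFile("model.llamafile").namelist())  # ['model.gguf']
```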


And it's been uploaded https://huggingface.co/TheBloke/stablelm-zephyr-3b-GGUF/tree...

Getting 20 tok/s on an M1 MacBook Air, whereas LLaVA ground to a halt. Very impressed.



