
I remember seeing them on HN when they first started! I never understood what price you pay: how did they get such a big speedup and lower memory usage?





There are previous comments; apparently the founder did a lot of math, re-deriving things from scratch :)

https://news.ycombinator.com/item?id=39672070

https://unsloth.ai/blog/gemma-bugs


Nice work in gemma-bugs -- compared to plenty of research work that is a km deep in real math, this tech note is just a few Python tweaks. But finding those and doing it? Apparently this is useful and they did it. Easy to read (almost child-like) writeup. Thanks for pointing to this.

The main author used to work at Nvidia. There's a free plan, and you can pay to get multi-GPU support.


