I think because they are trained on Claude/o1, they tend to have comparable performance. The small models quickly fail on complex reasoning. The larger the model, the better the reasoning. I wonder, however, if you can hit a sweet spot with 100 GB of RAM. That's enough for most professionals to be able to run it on an M4 laptop, and it would be a death sentence for OpenAI and Anthropic.
Because the Valley is burning money and GPUs training these, and then somebody else comes out with another model for a tiny fraction of the cost, it's an easy assumption to make that it was trained on synthetic data.
It is a laptop. The memory is also shared, which means you can use it for non-gaming workloads. If you have laptop equivalents in the same memory range, feel free to share.
I have laptop equivalents in the same memory range, and they are at least $2,500 cheaper.
Unfortunately, it does not have "unified memory", a somewhat "powerful GPU", and of course no local LLM hype behind it.
Instead, I've decided to purchase a laptop with 128 GB of RAM for $2,500 and then spend another $2,160 on a 10-year Claude subscription, so I can actually use my 128 GB of RAM at the same time as using an LLM.
I see this comment all the time. But realistically, if you want more than 1 token/s, you're going to need GeForce cards, and 100 GB worth of those would cost quite a lot as well.
GB10, or DIGITS, is $3,000 for 1 PFLOP (@4-bit) and 128GB unified memory. Storage configurable up to 4TB.
Two units can be paired to run 405B (4-bit), though probably not very fast: memory bandwidth is lower than a typical GPU's, and it is the main bottleneck for LLM inference.
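To put rough numbers on the bandwidth point, here is a back-of-envelope sketch. It assumes a dense model where every weight is read once per generated token, and it ignores the KV cache, activations, and interconnect overhead, so it gives an upper bound at best. The 250 GB/s figure is a made-up placeholder, not a published spec; plug in the real bandwidth once it is known.

```python
def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def max_tokens_per_s(weights_gb: float, bandwidth_gb_s: float) -> float:
    """Upper bound on decode speed if each token requires streaming
    all weights from memory once (memory-bound, dense model)."""
    return bandwidth_gb_s / weights_gb


size = weight_gb(405, 4)      # 405B parameters at 4 bits each -> 202.5 GB
fits = size <= 2 * 128        # two paired units with 128 GB each

# HYPOTHETICAL bandwidth, for illustration only (not a vendor number).
assumed_bandwidth_gb_s = 250.0
bound = max_tokens_per_s(size, assumed_bandwidth_gb_s)

print(f"weights: {size} GB, fits in 256 GB: {fits}")
print(f"upper bound at {assumed_bandwidth_gb_s} GB/s: {bound:.2f} tokens/s")
```

So the weights fit in 2 x 128 GB with some headroom, but unless the real bandwidth is far above the placeholder, a dense 405B model lands in low single-digit tokens/s territory.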