kamikazeturtles · on Jan 21, 2025

> I think because they are trained on Claude/O1, they tend to have comparable performance.

Why does having comparable performance indicate having been trained on a preexisting model's output?

I read a similar claim in relation to another model in the past, so I'm just curious how this works technically.

wordpad25 · on Jan 21, 2025

Because the Valley is burning money and GPUs training these models, and then somebody else comes out with another model for a tiny fraction of the cost, it's an easy assumption to make that it was trained on synthetic data.

byefruit · on Jan 21, 2025

Do you have any evidence for this accusation?

O1's reasoning traces aren't even shown. Are you suggesting they've somehow exfiltrated them?

elashri · on Jan 21, 2025

At a price of $5,000 before taxes, there would be better and more cost-effective options for running models that require that much memory.

csomar · on Jan 21, 2025

It is a laptop. The memory is also shared, which means you can use it for non-gaming workloads. If you have laptop equivalents in the same memory range, feel free to share.

rfoo · on Jan 21, 2025

I have laptop equivalents in the same memory range, and they are at least $2,500 cheaper.

Unfortunately, it does not have "unified memory", a somewhat "powerful GPU", and of course no local LLM hype behind it.

Instead, I've decided to purchase a laptop with 128GB RAM for $2,500 and then spend another $2,160 on a 10-year Claude subscription, so I can actually use my 128GB RAM at the same time as using an LLM.

csomar · on Jan 21, 2025

That's not the same thing. Also, can you share this 128GB $2500 laptop?

kridsdale1 · on Jan 21, 2025

Ok, but that means you’re not getting full privacy. It’s a trade-off.

kergonath · on Jan 21, 2025

I see this comment all the time. But realistically, if you want more than 1 token/s you’re going to need GeForce cards, and that would cost quite a lot as well, for 100 GB.

nenaoki · on Jan 21, 2025

https://nvidianews.nvidia.com/news/nvidia-puts-grace-blackwe...

GB10, or DIGITS, is $3,000 for 1 PFLOP (@4-bit) and 128GB unified memory. Storage configurable up to 4TB.

Two units can be paired to run a 405B model (4-bit), though probably not very fast: memory bandwidth is lower than a typical GPU's, and it is the main bottleneck for LLM inference.
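
A rough back-of-envelope sketch of that sizing, for anyone who wants to plug in their own numbers. It counts weights only (no KV cache or activations) and assumes a dense model where every weight is read once per generated token. The 250 GB/s bandwidth figure is a made-up placeholder, not a published spec, and the function names are just for illustration.

```python
import math

GB = 1e9  # decimal gigabytes, as used in marketing specs


def weight_bytes(params: float, bits_per_weight: int) -> float:
    """Bytes needed to hold the model weights alone."""
    return params * bits_per_weight / 8


def max_tokens_per_second(model_bytes: float, bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth_bytes_per_s / model_bytes


# 405B parameters at 4 bits per weight
weights = weight_bytes(405e9, 4)

# How many 128 GB devices are needed just to hold the weights
units_needed = math.ceil(weights / (128 * GB))

# ASSUMPTION: 250 GB/s is a hypothetical bandwidth, not a real device spec
hypothetical_bandwidth = 250 * GB
ceiling = max_tokens_per_second(weights, hypothetical_bandwidth)

print(f"weights: {weights / GB:.1f} GB")
print(f"128 GB units needed: {units_needed}")
print(f"bandwidth-bound ceiling: {ceiling:.2f} tokens/s")
```

The weights alone come to about 202.5 GB, which is why a single 128GB box is not enough and two are. The tokens/s figure is only a ceiling under the stated assumptions; real throughput would be lower.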

kergonath · on Jan 21, 2025

That’s not something I can get, so it’s not really relevant. There is always a better device around the corner.

justincormack · on Jan 21, 2025

Not shipping until May or so.