
If you would buy this, I’d love to know how you’d use it.




I would use it for locally hosted RAG (or whatever tech has supplanted it) instead of paying API fees. We have ~20TB of documents that occasionally need to be scanned and chatted with, and a one-time $4,000 (+ electricity) is chump change compared to the annual API costs we would otherwise be looking at.
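
To make that concrete: the core of a local RAG loop is surprisingly little code. Below is a minimal sketch assuming sentence-transformers for local embeddings and llama-cpp-python for generation; the model names, chunk list, and in-memory index are placeholders (a ~20TB corpus would obviously need a real vector store and a chunking pipeline), not anything specific from this thread.

    import numpy as np
    from sentence_transformers import SentenceTransformer
    from llama_cpp import Llama

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedding model
    llm = Llama(model_path="model.gguf", n_ctx=4096)    # any local GGUF checkpoint

    # Placeholder corpus: in practice, chunk the documents and keep the
    # vectors in a proper vector store rather than a numpy array.
    chunks = ["chunk 1 of a document...", "chunk 2..."]
    chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

    def answer(question, k=4):
        # Cosine similarity reduces to a dot product on normalized vectors.
        q = embedder.encode([question], normalize_embeddings=True)[0]
        top = np.argsort(chunk_vecs @ q)[-k:]
        context = "\n\n".join(chunks[i] for i in top)
        prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
        return llm(prompt, max_tokens=256)["choices"][0]["text"]

The expensive part is embedding the corpus once up front; after that, each query is an embedding lookup plus one local generation, which is where the one-time hardware cost beats per-token API pricing.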

Though the adage “this is the worst it’ll ever be” is parroted daily by AI cultists, the fact is it has yet to be proven that currently available LLMs can be made cost-effective. For now, every AI company is lighting tens of billions of dollars on fire every year and hoping that better algorithms, hardware, and user lock-in will eventually ensure profits. If that doesn’t happen, they will design more and more “features” into the LLM to monetize it: shopping, ads, sponsored replies, who knows? It may get really awful. And these companies will have so much of our data that, eventually, the need to turn a profit will lead them to sell it and generally extract as much from us as they can.

This is why, in the long run, I believe we should all aspire to do LLM inference locally. Unfortunately, local models are not anywhere close to par with the SotA cloud models available today. Something like the DGX Spark would be a decent step in this direction, but that platform appears to be mostly for prototyping and training models meant to eventually run on Nvidia data center hardware.

Personally, I think I will spec out an M5 Max/Ultra Mac Studio once that’s a thing and start trying to do this more seriously. The tools are getting better every day, and “this is the worst it’ll ever be” is much more applicable to locally run models.



