
> OpenAI could open GPT 4 tomorrow and it wouldn’t meaningfully impact their revenue.

I find this very difficult to believe; GPT-4 is still the best public model. If they hand out the weights, other companies will immediately release APIs for it, cannibalizing OpenAI's API sales.




That’s the theory. In practice, running it requires immense infrastructure, let alone all the tooling and sales pipelines surrounding it. Companies are risk-averse by definition, and in practice the risks are usually different from the ones you imagine from first principles.

It’s dumb. The first company to prove this will hopefully set an example that will be noticed.


It didn't take long for Perplexity, Anyscale, Together.ai, Groq, DeepInfra, and Lepton to all host Mistral's 8x7B model, both faster and cheaper than Mistral's own API.

https://artificialanalysis.ai/models/mixtral-8x7b-instruct/h...


Hosting a 7B model is completely different from hosting a 150B+ model. I thought this would be obvious, but I should have been explicit.


It's not, really. And 8x7B is not a 7B model: it's a MoE with roughly 47B parameters that all have to be kept in memory, but only 2 experts are active per token, so it runs at roughly the speed of a 13B model.
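A back-of-envelope count from the published architecture (32 layers, d_model 4096, FFN dim 14336, 8 experts with 2 active per token, grouped-query attention) lands right around the official 46.7B total / 12.9B active figures; rough numbers, not an exact accounting:

  # Approximate parameter count for Mixtral 8x7B from its published architecture.
  layers, d_model, d_ff, n_experts, n_active = 32, 4096, 14336, 8, 2
  d_kv = d_model // 4                                  # 8 KV heads of 128 dims

  expert = 3 * d_model * d_ff                          # gate/up/down projections
  attn   = 2 * d_model * d_model + 2 * d_model * d_kv  # q,o plus k,v per layer
  embed  = 2 * 32_000 * d_model                        # input + output embeddings

  total  = layers * (n_experts * expert + attn) + embed
  active = layers * (n_active * expert + attn) + embed

  print(f"weights to keep in memory: ~{total / 1e9:.1f}B")   # ~46.7B
  print(f"params used per token:     ~{active / 1e9:.1f}B")  # ~12.9B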

All of the current frameworks support MoE and sharding across GPUs, so I don't see what the issue is.
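For instance, vLLM will load the MoE weights and shard them with tensor parallelism in a few lines; a minimal sketch, where the 2-GPU split and the exact model revision are just assumptions:

  # Serve Mixtral 8x7B sharded across two GPUs with vLLM's offline inference API.
  from vllm import LLM, SamplingParams

  llm = LLM(
      model="mistralai/Mixtral-8x7B-Instruct-v0.1",
      tensor_parallel_size=2,   # shard the ~47B of weights across 2 GPUs
      dtype="float16",
  )

  outputs = llm.generate(
      ["Explain mixture-of-experts routing in two sentences."],
      SamplingParams(max_tokens=128, temperature=0.7),
  )
  print(outputs[0].outputs[0].text)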


Ollama makes it pretty easy to run inference on a bunch of model-available releases. If a company is after code/text generation, finding a contractor to fine-tune one of those releases on its source code, and having IT deploy Ollama to employees' M3 MacBooks decked out with 64 GiB of RAM, is well within the abilities of a competent and well-funded IT department.
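Once Ollama is on the laptop, anything on the machine can hit its local HTTP API; a minimal sketch, where "internal-codegen" is a stand-in name for whatever fine-tuned model IT pushes out:

  # Query a locally running Ollama instance over its default API (port 11434).
  import requests

  resp = requests.post(
      "http://localhost:11434/api/generate",
      json={
          "model": "internal-codegen",   # hypothetical fine-tuned model name
          "prompt": "Write a unit test for the invoice parser.",
          "stream": False,
      },
      timeout=120,
  )
  print(resp.json()["response"])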

What recognition has Facebook gotten for their model releases? How has that been priced into their stock price?


That's a completely different scale. You're not going to run GPT-4 like a random Ollama model. At that point you need dedicated external hardware for the service, and proper batching/pipelining to utilise it well. This is way out of "enough RAM in the laptop" territory.
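The arithmetic makes the gap obvious. A rough sketch, where the 150B size and the batch/context shape are just illustrative assumptions:

  # Back-of-envelope memory for serving a dense 150B-parameter model in fp16.
  params  = 150e9
  weights = params * 2 / 2**30                        # fp16 bytes -> GiB of weights alone

  layers, d_model, batch, ctx = 96, 12288, 8, 4096    # illustrative GPT-3-scale shape
  kv_cache = 2 * layers * d_model * batch * ctx * 2 / 2**30   # K+V in fp16

  print(f"weights:  ~{weights:.0f} GiB (vs. 64 GiB in the laptop)")   # ~279 GiB
  print(f"KV cache: ~{kv_cache:.0f} GiB at batch={batch}, ctx={ctx}") # ~144 GiB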



