Hacker News

I don't see how that is evidence of the claim. We are doing all these things because they make existing models work better, but a larger model with RAG etc. is still better than a small one, and everyone keeps working on larger models.





There is a contingent, which I think Marcus is responding to, that has been claiming that all we need to reach AGI or ASI is pure transformer scaling, and that we were very close, needing only perhaps another $10B or $100B of investment to get there. If the last couple of years of research have yielded only incrementally better models, to the point that even the best-funded teams are moving to hybrid approaches, then that's evidence that Marcus is correct.

This website by a former OpenAI employee was arguing that a combination of hardware scaling, algorithmic improvements, etc would all combine to yield AGI in the near future: https://situational-awareness.ai/

Ridiculous. Obviously people will keep working on the architecture and software tricks beyond just scaling, but that doesn't mean scaling doesn't work. All the AI labs are pursuing huge compute ramp-ups to scale training, as they always have. xAI and Meta are bragging about their 100k-H100 clusters and expanding them, and Microsoft is building huge datacenter networks for Blackwell. No, Marcus is not close to being correct.

Insisting that the only way to prove scaling isn't all we need is for the AI labs to stop all work on software optimizations is a nonsensical, non-serious ask.


> AI labs are pursuing huge compute ramp-ups to scale training

Yeah, and many, not just Marcus, are doubtful that the huge ramp-ups and scale will yield proportional gains. If you have evidence otherwise, share it.


The point is that those ramp-ups indicate that quite a few people do believe that they will yield gains, if not proportional, then still large enough to justify the expense. Which is to say, the claim that "even the best funded teams are moving to hybrid approaches" is not evidence of anything.

Believing that something is the case doesn't make it so. And the available evidence points against it more than for it, which is the point. Maybe it so happens that there's another sudden leap with X amount more scaling, but the only thing anyone has regarding that is faith. Faith is all that's maintaining the bubble.

No shit it doesn't offer proportional gains; that was part of the scaling laws from the very beginning. There are of course diminishing returns, but that doesn't mean scaling isn't worth pursuing or that there won't be useful returns from it.
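To make the "diminishing returns were always in the scaling laws" point concrete, here is a minimal sketch of a Chinchilla-style power-law loss curve. The functional form and coefficients below are the fit reported by Hoffmann et al. (2022); the exact numbers should be treated as illustrative, not as a prediction about any particular model.

```python
# Chinchilla-style scaling law: L(N, D) = E + A / N**alpha + B / D**beta
# where N = parameters, D = training tokens. Coefficients are the
# published Chinchilla fit (Hoffmann et al., 2022) -- illustrative only.

E, A, B = 1.69, 406.4, 410.7
ALPHA, BETA = 0.34, 0.28

def loss(n_params: float, n_tokens: float) -> float:
    """Predicted pretraining loss for N parameters trained on D tokens."""
    return E + A / n_params**ALPHA + B / n_tokens**BETA

# Each 10x jump in parameters (holding the Chinchilla-optimal ratio of
# ~20 tokens per parameter) buys a smaller absolute loss reduction than
# the previous jump -- returns diminish, but never hit zero before E.
for n in (1e9, 1e10, 1e11, 1e12):
    print(f"N={n:.0e}  predicted loss={loss(n, 20 * n):.3f}")
```

Running this shows the loss decreasing toward the irreducible floor E with each order of magnitude of scale, with each step smaller than the last: exactly sub-proportional, but still positive, returns.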

Everyone out there is saying these reports are very misleading. Pretty sure it's just sensationalizing the known diminishing returns.


"a larger model with RAG etc is still better than a small one"

This DeepMind paper from a few years ago (RETRO) offers a counterexample to that claim: a 7.5B-parameter retrieval-augmented model performed on par with models many times its size on several benchmarks.

https://arxiv.org/abs/2112.04426



