
I believe that's unrelated. There was recent talk of Altman wanting to raise $7T for some new chip venture, anticipating future industry needs for compute.

Certainly neither OpenAI nor Anthropic has that sort of cash on hand, nor has either been offered any trillion-dollar "train now, pay later" deal. Also, compute is in short supply at the moment (not up 1000x from last year), and even if the money and compute were available, I can't see these companies committing to that kind of model size/spend increase in just one or two generations - they need to see continuing progress at each iteration to justify and guide future direction.

$1B training costs for upcoming (GPT-5, Claude-4) models wouldn't be so surprising though (at least not now that we've got used to these crazy numbers).




Your assertion is that a fundamentally quadratic complexity class is going from the low hundreds of millions to maybe a billion in a new league of capability, and that it’s coincidental that we have word from Altman, the mainstream press, and OG leadership at YC that he’s going for a ten-figure raise led by Riyadh?


Not sure where you are getting quadratic from... Increasing context size had quadratic cost under the original attention mechanism, but it seems everyone has switched to newer, more efficient attention schemes. Claude-3 is being experimented with at 10x the context size of GPT-4 (1M vs. 128K tokens), but surely didn't cost 10^2 x $100M to train!

Note that scaling up model size doesn't necessarily refer to context size anyway - it may also mean things like embedding dimension, number of transformer layers, number of experts (MoE), etc.

In general scaling up 10x in size and/or cost between generations is about as much as makes sense, and about as much as can be achieved in a year. Anthropic have explicitly talked about $1B models coming soon, so this isn't just speculation.


I’m on record as saying that we need to squeeze the water out of these bloated models.

It’s possible to scale better than N^2, for some value of “better”. OpenAI has yet to demonstrate that it has the elusive combination of technical sophistication and institutional health to do so. A Mistral model running on my Mac Studio beats GPT-4 running on an Azure disaggregated rack, as judged by outcomes. Altman seems to understand this.

I’m on record as saying he’s amoral, non-technical, and a clear and present danger to Enlightenment civilization, not that he’s stupid.

I think his math on what it’s going to cost an OpenAI that Karpathy wants nothing to do with to reach the next level is refreshingly candid.


He says he did not ask for $7T - https://news.ycombinator.com/item?id=39747415


There’s an old saying: “If you can only be good at one thing, be good at lying, because then you’re good at everything.”

The Street’s consensus seems to be that we should be making big enough screens to display the number seven trillion in decimal notation.

Altman has said all kinds of things. He said that Green Dot should buy Loopt, he said that Autodesk should buy Socialcam, he said that he contemplates putting ice nine into the glass of people who cross him, he said that Larry Summers should be given authority over anything but a prison cell.

I’m a big believer in aligned incentives and it seems pretty counter-productive to let all that slide and then backpedal when he tells the truth about the price tag. I’m pro-no-filter-Altman.

https://youtu.be/OawnzWtwB58?si=JvdPVjbnab5Li4SQ



