It is hard for me to square "This company is a few short years away from building world-changing AGI" and "I'm stepping away to do my own thing". Maybe I'm just bad at putting myself in someone else's shoes, but I feel like if I had spent years working towards a vision of AGI, and thought that success was finally just around the corner, it'd be very difficult to walk away.
It's easy to have missed this part of the story in all the chaos, but from the NYTimes in March:
Ms. Murati wrote a private memo to Mr. Altman raising questions about his management and also shared her concerns with the board. That move helped to propel the board’s decision to force him out.
It should be no surprise if Sam Altman wants executives who opposed his leadership, like Mira and Ilya, out of the company. When you're firing a high-level executive in a polite way, it's common to let them announce their own departure and frame it the way they want.
Greg Brockman, OpenAI President and co-founder, is also on an extended leave of absence.
And John Schulman and Peter Deng are already out. Yet the company is still shipping, like no other. Recent multimodal integrations and the benchmarks of o1 are outstanding.
> Yet the company is still shipping, like no other
If executives, high-level architects, or researchers are working on this quarter's features, something is very wrong. The higher you get, the further ahead you need to be working; at a company of this size, C-level departures should only have an impact about a year down the line.
C-level employees are there to set the company's culture. Clearing them out and replacing them ultimately results in a shift in company culture, a year or two down the line.
Indeed, Anthropic is just as good, if not better, in my sample size of one. Which is great, because OpenAI as an org gives off shady vibes; maybe it's just Altman, but he is running the show.
> Yet the company is still shipping, like no other.
I don't see it for OpenAI; I do see it for the competition. They have shipped incremental improvements; however, they are watering down their current models (my guess is they are trying to save on compute?). Copilot has turned into garbage, and for coding-related stuff, Claude is now better than GPT-4.
Yeah, I have the same feeling. It seems like operating GPT-4 is too expensive, so they decided to call it "legacy" and get rid of it soon, focus instead on the cheaper/faster 4o, and chain its prompts so they can call it a new model.
I understand why they are doing it, but honestly if they cancel GPT-4, many people will just cancel their subscription.
In my humble opinion you're wrong: Sora and 4o voice are months old with no sign they're not vaporware, and they still haven't shipped a text model on par with 3.5 Sonnet!
Are they? In my recent experience, ChatGPT seems to have gotten better than Claude again. Plus their free limit is more strict, so this experience is on the free account.
Well, I use both of them all the time, so I'm in a decent position to compare them. Sometimes I prefer one over the other, like ChatGPT after Claude decreased the free quota, but I still choose Claude sometimes, because I still find value in its answers.
The features Anthropic shipped in the past month are far more practical, and provide clearer value for builders, than o1's chain-of-thought improvements.
- Prompt caching: roughly 90% savings on large system prompts for calls that reuse the cache within about 5 minutes. This is amazing (rough usage sketch below).
- Contextual RAG, while not a groundbreaking idea, is useful thinking and a solid method for better vector retrieval.
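For anyone curious what the prompt-cache feature looks like in practice, here's a minimal sketch using the Python SDK, assuming the `cache_control` block shape and beta header from Anthropic's docs at launch; the model name and prompt are placeholders, not a definitive implementation:

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

BIG_SYSTEM_PROMPT = "..."  # e.g. a long policy document or codebase summary

# The large, stable part of the prompt is marked for caching; repeat calls made
# within the cache window pay only the much cheaper cache-read rate for it.
response = client.messages.create(
    model="claude-3-5-sonnet-20240620",              # placeholder model name
    max_tokens=1024,
    system=[
        {
            "type": "text",
            "text": BIG_SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},  # opt this block into caching
        }
    ],
    messages=[{"role": "user", "content": "Summarize section 3 for me."}],
    extra_headers={"anthropic-beta": "prompt-caching-2024-07-31"},  # beta flag at launch
)
print(response.content[0].text)
```

The point is that the expensive, repeated system prompt is declared once and reused, which is where the claimed savings come from.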
Companies are held to the standard that their leadership communicates (which, by the way, is also a strong influencing factor in their valuation). People don't lob these complaints at Gemini, but the CEO of Google also isn't going on podcasts saying that he stares at an axe on the wall of his office all day musing about how the software he's building might end the world. So it's a little understandable that OpenAI would be held to a slightly higher standard; it's only commensurate with the valuation their leadership (singular, person) dictates.
> It's a very relevant fact that Greg Brockman recently left on his own volition.
Except that isn’t true. He has not resigned from OpenAI. He’s on extended leave until the end of the year.
That could become an official resignation later, and I agree that that seems more likely than not. But stating that he’s left for good as of right now is misleading.
Does that actually mean anything? Didn't 95% of the company sign that letter, and soon afterwards many employees stated that they felt pressured by a vocal minority of peers and supervisors to sign the letter? E.g. if most executives on her level already signed the letter, it would have been political suicide not to sign it
Exactly. Sam Altman wants groupthink, no opposition, no diversity of thought. That's what petty dictators demand. This spells the end of OpenAI, IMO. A huge amount of money will keep it going, until it doesn't.
I think the much more likely scenario than product roadmap concerns is that Murati (and Ilya for that matter) took their shot to remove Sam, lost, and in an effort to collectively retain billion$ of enterprise value have been playing nice, but were never seriously going to work together again after the failed coup.
Why is it so hard to just accept this and be transparent about motives? It's fair to say "we were not aligned with Sam, we tried an ouster, it didn't pan out, so the best thing for us to do is to leave and let Sam pursue his path", which the entire company has vouched for.
Instead, you get to see grey area after grey area.
Because, for some weird reason, our culture has collectively decided that, even if most of us are capable of reading between the lines to understand what's really being said or is happening, it's often wrong and bad to be honest and transparent, and we should put the most positive spin possible on it. It's everywhere, especially in professional and political environments.
For a counterexample of what open and transparent communication from a C-level tech person could look like, have a read of what the spaCy founder blogged about a few months ago:
The stakes are orders of magnitude lower in spaCy's case compared to OpenAI (for the announcer and for the people around them). It's easier to just be yourself when you're back at square one.
This is not a culture thing, IMO; being honest and transparent makes you vulnerable to exploitation, which is often a bad thing for the honest and transparent party in a highly competitive area.
Being dishonest and cagey only serves to build public distrust in your organization, as has happened with OpenAI over the past couple of months. Just look at all of the comments throughout this thread for proof of that.
Edit: Shoot, look at the general level of distrust that the populace puts in politicians.
It is human nature to use plausible deniability to play politics and fool one’s self or others. You will get better results in negotiations if you allow the opposing party to maintain face (i.e. ego).
Hypocrisy has to be at the core of every corporate or political environment I have observed recently. I can count the occasions where telling the simple truth has been helpful. Even the people who tell you to tell the truth are often the ones incapable of handling it.
From experience, unless the person mentions their next "adventure" or gig (within, like, a couple of months), it usually means a manager or C-suite person got axed and was given the option to exit gracefully.
Judging by the barrage of exits following Mira's resignation, it does look like Sam fired her, the team got wind of it, and people are now quitting in droves. That's the thing about lying and being polite: you can't hide the truth for long.
Mira's latest one-liner tweet, "OpenAI is nothing without its people", speaks volumes.
That's giving too much credit to McKinsey. I'd argue it's systemic brainrot. Never admit mistakes, never express yourself, never be honest. Just make up as much bullshit as possible on the fly, say whatever you have to pacify people. Even just say bullshit 24/7.
Not to dunk on Mira Murati, because this note is pretty cookie cutter, but it exemplifies this perfectly. It says nothing about her motivations for resigning. It bends over backwards to kiss the asses of the people she's leaving behind. It could ultimately be condensed into two words: "I've resigned."
It's a management culture which is almost colonial in nature, and seeks to differentiate itself from a "labor class" which is already highly educated.
Never spook the horses. Never show the team, or the public, what's going on behind the curtain... or even that there is anything going on. At all times, present the appearance of a swan gliding serenely across a lake.
Because if you show humanity, those other humans might cotton on to the fact that you're not much different to them, and have done little to earn or justify your position of authority.
“the entire company has vouched for” is inconsistent with what we see now. Low/mid ranking employees were obviously tweeting in alignment with their management and by request.
People, including East Asians, frequently claim "face" is an East Asian cultural concept despite the fact that it is omnipresent in all cultures. It doesn't matter if outsiders have figured out what's actually going on. The only thing that matters is saving face.
Everyone involved works at and has investments in a for-profit firm.
The fact that it has a structure subordinating it to the board of a non-profit would be only tangential to the interests involved, even if that structure were meaningful and not just the lingering vestige of the (arguably deceptive) founding that the combined organization was already working on getting rid of.
Because if you are a high-level executive and you are transparent about those things, and it backfires, it will backfire hard on your future opportunities, since companies will view you as a potential liability. So it is always the safer and wiser option not to say anything if there is any risk of it backfiring, and you do the polite PR messaging every single time. There's nothing to be gained at the individual level from being transparent, only things to be risked.
Their (Ilya's and Mira's) perspective on anything is so far removed from your (and my) perspective that trying to understand the personal feelings behind their resignations is an enterprise doomed to failure.
>but were never seriously going to work together again after the failed coup.
Just to clear one thing up, the designated function of a board of directors is to appoint or replace the executive of an organisation, and OpenAI in particular is structured such that the non-profit part of the organisation controls the LLC.
The coup was the executive, together with the investors, effectively turning that on its head by force.
This is the likely scenario. Every conflict at the exec level comes with a "messaging" aspect, with a comms team and board to manage that part.
Yeah the GP's point is the board was acting within its purview by dismissing the CEO. The coup was the successful counter-campaign against the board by Altman and the investors.
> In the sentence, the people responsible for the coup are implied to be Murati and Ilya. The phrase "Murati (and Ilya for that matter) took their shot to remove Sam" suggests that they were the ones who attempted to remove Sam (presumably a leader or person in power) but failed, leading to a situation where they had to cooperate temporarily despite tensions.
Some folks are professional and mature. In the best organizations, the management team sets the highest possible standard in terms of tone and culture. If done well, this tends to trickle down to all areas of the organization.
Another speculation would be that she's resigning for complicated reasons which are personal. I've had to do the same in my past. The real pros give the benefit of the doubt.
Because the old guard wanted it to remain a cliquey non-profit filled to the brim with EA, AI Alignment, and OpenPhilanthropy types, but the current OpenAI is now an enterprise company.
This is just Sam Altman cleaning house after the attempted corporate coup a year ago.
Below are excerpts from the article you link. I'd suggest a more careful read-through, unless you give zero credibility, out of hand, to firsthand accounts given to the NYT by both Murati and Sutskever...
This piece is built on conjecture from a source whose identity is withheld. The source's version of events is openly refuted by the parties in question. Offering it as evidence that Murati intentionally made political moves in order to get Altman ousted is an indefensible position.
'Mr. Sutskever’s lawyer, Alex Weingarten, said claims that he had approached the board were “categorically false.”'
'Marc H. Axelbaum, a lawyer for Ms. Murati, said in a statement: “The claims that she approached the board in an effort to get Mr. Altman fired last year or supported the board’s actions are flat wrong. She was perplexed at the board’s decision then, but is not surprised that some former board members are now attempting to shift the blame to her.”
In a message to OpenAI employees after publication of this article, Ms. Murati said she and Mr. Altman “have a strong and productive partnership and I have not been shy about sharing feedback with him directly.”
She added that she did not reach out to the board but “when individual board members reached out directly to me for feedback about Sam, I provided it — all feedback Sam already knew,” and that did not mean she was “responsible for or supported the old board’s actions.”'
This part of the NYT piece is supported by evidence:
'Ms. Murati wrote a private memo to Mr. Altman raising questions about his management and also shared her concerns with the board. That move helped to propel the board’s decision to force him out.'
INTENT matters. Murati says the board asked for her concerns about Altman. She provided them and had already brought them to Altman's attention... in writing. Her actions demonstrate transparency and professionalism.
> It is hard for me to square "This company is a few short years away from building world-changing AGI"
Altman's quote was that "it's possible that we will have superintelligence in a few thousand days", which sounds a lot more optimistic on the surface than it actually is. A few thousand days could be interpreted as 10 years or more, and by adding the "it's possible" qualifier he didn't even really commit to that prediction.
It's hype with no substance, but vaguely gesturing that something earth-shattering is coming does serve to convince investors to keep dumping endless $billions into his unprofitable company, without risking the reputational damage of missing a deadline since he never actually gave one. Just keep signing those 9 digit checks and we'll totally build AGI... eventually. Honest.
Between 1 and 10 thousand days, so roughly 3 to 27 years.
A range I'd agree with; for me, "pessimism" is the shortest part of that range, but even then you have to be very confident the specific metaphorical horse you're betting on is going to be both victorious in its own right and not, because there's no suitable existing metaphor, secretly an ICBM wearing a pantomime costume.
While it's not really _wrong_ to describe two things as 'a few', as such, it's unusual and people don't really do it in standard English.
That said, I think people are possibly overanalysing this very vague barely-even-a-claim just a little. Realistically, when a tech company makes a vague claim about what'll happen in 10 years, that should be given precisely zero weight; based on historical precedent you might as well ask a magic 8-ball.
Thanks, now I cannot unsee this vision: developers activate the first ASI, and after 3 minutes it spits out full code and plans for a working Full Self-Driving car prototype :)
> it's possible that we will have superintelligence in a few thousand days
Sure, a few thousand days and a few trillion dollars away. We'll also have full self-driving next month. This is just like the joke about fusion being the energy of the future: it's 30 years away, and it always will be.
By all the standards I had growing up, ChatGPT is already AGI. It's almost certainly not as economically transformative as it needs to be to meet OpenAI's stated definition.
OTOH that may be due to limited availability rather than limited quality: if all of the $20/month for Plus gets spent on electricity to run the servers, at $0.10/kWh, that's about 274 W of average consumption. Scaled up to the world population, that's approximately the entire global electricity supply. Which is kinda why there are also all those stories about AI data centres getting dedicated power plants.
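Spelling out the back-of-the-envelope math (a rough sketch; the population and global-generation figures are my own approximate assumptions):

```python
# What if the entire ChatGPT Plus fee went to electricity?
plus_fee_usd_per_month = 20.0
electricity_usd_per_kwh = 0.10
hours_per_month = 730                  # ~365.25 * 24 / 12

kwh_per_month = plus_fee_usd_per_month / electricity_usd_per_kwh   # 200 kWh
avg_watts = kwh_per_month * 1000 / hours_per_month                 # ~274 W per subscriber

world_population = 8.1e9               # assumption
global_avg_generation_w = 3.3e12       # roughly 29,000 TWh/yr of electricity, an assumption

total_w = avg_watts * world_population
print(f"{avg_watts:.0f} W per person, {total_w / 1e12:.1f} TW total "
      f"({total_w / global_avg_generation_w:.0%} of global electricity generation)")
```

With those assumptions it lands around 2.2 TW, i.e. a large fraction of average global electricity generation.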
Don't know why you're being downvoted, these models meet the definition of AGI. It just looks different than perhaps we expected.
We made a thing that exhibits the emergent property of intelligence. A level of intelligence that trades blows with humans. The fact that our brains do lots of other things to make us into self-contained autonomous beings is cool and maybe answers some questions about what being sentient means, but memory and self-learning aren't the same thing as intelligence.
I think it's cool that we got there before simulating an already existing brain and that intelligence can exist separate from consciousness.
Old-school AI was already specialised. Nobody can agree what "sentient" is, and if sentience includes a capacity to feel emotions/qualia etc., then we'd only willingly choose sentient over non-sentient for brain uploading, not for "mere" assistants.
Given that ChatGPT is already smarter and faster than humans on many different metrics, once the other metrics catch up with humans, it will still be better than humans on the existing ones. Therefore there will be no AGI, only ASI.
It's kinda cool as a conspiracy theory. It's just reasonable enough if you don't know any of the specifics. And the incentives mostly make sense, if you don't look too closely.
This was the company that made all sorts of noise about how they couldn't release GPT-2 to the public because it was too dangerous[1]. While there are many very useful applications being developed, OpenAI's main deliverable appears to be hype that I suspect when it's all said and done they will fail to deliver on. I think the main thing they are doing quite successfully is cashing in on the hype before people figure it out.
I feel like this is stating the obvious - though I guess not to many - but a probabilistic syllable generator is not intelligence. It does not understand us, it cannot reason; it can only generate the next syllable.
It makes us feel understood in the same way John Edward used to on daytime TV; it's all about how language makes us feel.
"Intelligence" is a poorly defined term prone to arguments about semantics and goalpost shifting.
I think it's more productive to think about AI in terms of "effectiveness" or "capability". If you ask it, "what is the capital of France?", and it replies "Paris" - it doesn't matter whether it is intelligent or not, it is effective/capable at identifying the capital of France.
Same goes for producing an image, writing SQL code that works, automating some % of intellectual labor, giving medical advice, solving an equation, piloting a drone, building and managing a profitable company. It is capable of various things to various degrees. If these capabilities are enough to make money, create risks, change the world in some significant way - that is the part that matters.
Whether we call it "intelligence" or "probabilistically generating syllables" is not important.
It can actually solve problems, though; it's not just an illusion of intelligence if it does the stuff we considered, mere years ago, sufficient to be intelligent. But you and others keep moving the goalposts as benchmarks saturate, perhaps due to a misplaced pride in the specialness of human intelligence.
I understand the fear, but the knee-jerk response "it's just predicting the next token, thus it could never be intelligent" makes you look more like a stochastic parrot than these models are.
It solves problems because it was trained with the solutions to these problems that have been written down a thousand times before. A lot of people don't even consider the ability to solve problems to be a reliable indicator of human intelligence, see the constantly evolving discourse regarding standardized tests.
Attempts at autonomous AI agents are still failing spectacularly because the models don't actually have any thought or memory. Context is provided to them by prefixing the prompt with all previous prompts, which obviously causes significant info loss after a few interaction loops. The level of intellectual complexity at play here is on par with nematodes in a lab (which, by the way, still can't be digitally emulated after decades of research). This isn't a diss on all the smart people working in AI today, because I'm not talking about the quality of any specific model available today.
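To make the "prefixing" point concrete, here's a minimal sketch of a typical chat loop against an LLM API; the client and model name are placeholders, but the shape is the same everywhere: the only "state" is the transcript you keep re-sending.

```python
from openai import OpenAI

client = OpenAI()   # placeholder client; any chat-completion API works the same way
history = []        # the entire "memory" of the conversation lives here, client-side

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # Every turn re-sends the FULL transcript; the model itself retains nothing.
    reply = client.chat.completions.create(
        model="gpt-4o",          # placeholder model name
        messages=history,
    )
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    return answer

# Each call grows the prompt; drop or summarize old turns and that "memory" is gone.
print(ask("Remember that my build target is arm64."))
print(ask("What was my build target again?"))
```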
You're acting like 99% of humans aren't very much dependent on that same scaffolding. Humans spend 12+ years in school, their brains being hammered with the exact rules of math, grammar, and syntax. To perform our jobs, we often consult documentation or other people performing the same task. Only after much extensive, deep thought can we extrapolate usefully beyond our training set.
LLMs do have memory and thought. I've invented a few somewhat unusual games, described them to Sonnet 3.5, and it reproduced them in code almost perfectly. Likewise, their memory has been scaling: just a couple of years ago context windows were 8,000 tokens maximum; now they're reaching the millions.
I feel like you're approaching all these capabilities with a myopic viewpoint, then playing semantic judo to obfuscate the nature of these increases as "not counting" since they can be vaguely mapped to something that has a negative connotation.
>A lot of people don't even consider the ability to solve problems to be a reliable indicator of intelligence
That's a very bold statement, as lots of smart people have said that the very definition of intelligence is the ability to solve problems. If fear of the effectiveness of LLMs at behaving genuinely intelligently leads you to make extreme, sweeping claims about what doesn't count as intelligence, then you're forcing yourself into a smaller and smaller corner as SOTA AI capabilities predictably increase month after month.
The "goalposts" are "moving" because now (unlike "mere years ago") we have real AI systems that are at least good enough to be seriously compared with human intelligence. We aren't vaguely speculating about what such an AI system might be like^[1]; we have the real thing now, and we can test its capabilities and see what it is like, what it's good at, and what it's not so good at.
I think your use of the "goalposts" metaphor is telling. You see this as a team sport; you see yourself on the offensive, or the defensive, or whatever. Neither is conducive to a balanced, objective view of reality. Modern LLMs are shockingly "smart" in many ways, but if you think they're general intelligence in the same way humans have general intelligence (even disregarding agency, learning, etc.), that's a you problem.
^[1] I feel the implicit suggestion that there was some sort of broad consensus on this in the before-times is revisionism.
> but if you think they're general intelligence in the same way humans have general intelligence (even disregarding agency, learning, etc.), that's a you problem.
How is it a me problem? The idea of these models being intelligent is shared with a large number of researchers and engineers in the field. Such is clearly evident when you can ask o1 some random completely novel question about a hypothetical scenario and it gets the implication you're trying to make with it very well.
I feel that simultaneously praising their abilities while claiming that they still aren't intelligent "in the way humans are" is just obscure semantic judo meant to stake an unfalsifiable claim. There will always be somewhat of a difference between large neural networks and human brains, but the significance of the difference is a subjective opinion depending on what you're focusing on. I think it's much more important to focus on "useful, hard things that are unique to intelligent systems and their ability to understand the world" than on "possesses the special kind of intelligence that only humans have".
> I think it's much more important to focus on "useful, hard things that are unique to intelligent systems and their ability to understand the world" than on "possesses the special kind of intelligence that only humans have".
This is a common strawman that appears in these conversations—you try to reframe my comments as if I'm claiming human intelligence runs on some kind of unfalsifiable magic that a machine could never replicate. Of course, I've suggested no such thing, nor have I suggested that AI systems aren't useful.
I truly think you haven't really thought this through.
There's a huge amount of circuitry between the input and the output of the model. How do you know what it does or doesn't do?
Human brains "just" output the next couple of milliseconds of muscle activation, given sensory input and internal state.
Edit: Interestingly, this is getting downvotes even though 1) my last sentence is a precise and accurate statement of the state of the art in neuroscience and 2) it is completely isomorphic to what the parent post presented as an argument against current models being AGI.
To clarify, I don't believe we're very close to AGI, but parent's argument is just confused.
In what way was their usage incorrect? They simply said that the brain just predicts next-actions, in response to a statement that an LLM predicts next-tokens. You can believe or disbelieve either of those statements individually, but the claims are isomorphic in the sense that they have the same structure.
It's not that it was used incorrectly: it's that it isn't a word actual humans use, and it's one of a handful of dog whistles for "I'm a tech grifter who has at best a tenuous grasp on what I'm talking about but would love more venture capital". The last time I personally heard it spoken was from Beff Jezos/Guillaume Verdon.
Zero memory inside the model from one input (i.e. token output) to the next (only the KV cache, which is just an optimization). The only "memory" is what the model outputs and therefore gets to re-consume (and even there it's an odd sort of memory, since the model itself didn't exactly choose what to output; that's a random top-N sampling).
There is no real runtime learning - certainly no weight updates. The weights are all derived from pre-training, and so the runtime model just represents a frozen chunk of learning. Maybe you are thinking of "in-context learning", which doesn't update the weights, but is rather the ability of the model to use whatever is in the context, including having that "reinforced" by repetition. This is all a poor substitute for what an animal does - continuously learning from experience and exploration.
The "magic dust" in our brains, relative to LLMs, is just a more advanced and structure architecture, and operational dynamics. e.g. We've got the thalamo-cortical loop, massive amounts of top-down feedback for incremental learning from prediction failure, working memory, innate drives such as curiosity (prediction uncertainty) and boredom to drive exploration and learning, etc, etc. No magic, just architecture.
I'm not entirely sure what you're arguing for. Current AI models can still get a lot better, sure. I'm not in the AGI in 3 years camp.
But, people in this thread are making philosophically very poor points about why that is supposedly so.
It's not "just" sequence prediction, because sequence prediction is the very essence of what the human brain does.
Your points on learning and memory are similarly weak word play. Memory means holding some quantity constant over time in the internal state of a model. Learning means being able to update those quantities. LLMs obviously do both.
You're probably going to be thinking of all sorts of obvious ways in which LLMs and humans are different.
But no one's claiming there's an artificial human. What does exist is increasingly powerful data processing software that progressively encroaches on domains previously thought to be that of humans only.
And there may be all sorts of limitations to that, but those (sequences, learning, memory) aren't them.
> It's not "just" sequence prediction, because sequence prediction is the very essence of what the human brain does.
Agree wrt the brain.
Sure, LLMs are also sequence predictors, and this is a large part of why they appear intelligent (intelligence = learning + prediction). The other part is that they are trained to mimic their training data, which came from a system of greater intelligence than their own, so by mimicking a more intelligent system they appear to be punching above their weight.
I'm not sure that "JUST sequence predictors" is so inappropriate though - sure sequence prediction is a powerful and critical capability (the core of intelligence), but that is ALL that LLMs can do, so "just" is appropriate.
Of course additionally not all sequence predictors are of equal capability, so we can't even say, "well, at least as far as being sequence predictors goes, they are equal to humans", but that's a difficult comparison to make.
> Your points on learning and memory are similarly weak word play. Memory means holding some quantity constant over time in the internal state of a model. Learning means being able to update those quantities. LLMs obviously do both.
Well, no...
1) LLMs do NOT "hold some quantity constant over time in the internal state of the model". It is a pass-thru architecture with zero internal storage. When each token is generated it is appended to the input, and the updated input sequence is fed into the model and everything is calculated from scratch (other than the KV cache optimization). The model appears to have internal memory due to the coherence of the sequence of tokens it is outputting, but in reality everything is recalculated from scratch, and the coherence is due to the fact that adding one token to the end of a sequence doesn't change the meaning of the sequence by much, and most of what is recalculated will therefore be the same as before.
2) If the model has learnt something, then it should have remembered it from one use to another, but LLMs don't do this. Once the context is gone and the user starts a new conversation/session, then all memory of the prior session is gone - the model has NOT updated itself to remember anything about what happened previously. If this was an employee (an AI coder, perhaps) then it would be perpetual groundhog day. Every day it came to work it'd be repeating the same mistakes it made the day before, and would have forgotten everything you might have taught it. This is not my definition of learning, and more to the point the lack of such incremental permanent learning is what'll make LLMs useless for very many jobs. It's not an easy fix, which is why we're stuck with massively expensive infrequent retrainings from scratch rather than incremental learning.
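To make point (1) concrete, here's a toy sketch of the generation loop; `model` and `sample_top_n` are deliberately fake stand-ins (not any real API), just to show that nothing persists inside the model between steps or sessions:

```python
import random

def model(tokens):
    """Hypothetical stateless forward pass: returns fake 'logits' for the next token."""
    return [random.random() for _ in range(100)]   # vocab of 100 dummy tokens

def sample_top_n(logits, n=5):
    """Random top-N sampling: the model never deterministically 'chooses' one output."""
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:n]
    return random.choice(top)

def generate(prompt_tokens, max_new_tokens=20):
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        # The ENTIRE sequence goes through the network every step; nothing is stored
        # inside the model between steps (a KV cache only avoids redoing identical
        # arithmetic, it is not writable memory), and the weights never change.
        logits = model(tokens)
        tokens.append(sample_top_n(logits))
    return tokens

# A new session starts from a fresh token list; every trace of the previous
# conversation is gone, because the weights were frozen at training time.
print(generate([1, 2, 3]))
```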
This is also true of those with advanced Alzheimer's disease. Are they not conscious as well? If we believe they are conscious then memory and learning must not be essential ingredients.
I thought we're talking about intelligence, not consciousness, and limitations of the LLM/transformer architecture that limit their intelligence compared to humans.
In fact LLMs are not only architecturally limited, but they also give the impression of being far more intelligent than they actually are due to mimicking training sources that are more intelligent than the LLM itself is.
If you want to bring consciousness into the discussion, then that is basically just the brain modelling itself and the subjective experience that gives rise to. I expect it arose due to evolutionary adaptive benefit - part of being a better predictor (i.e. more intelligent) is being better able to model your own behavior and experiences, but that's not a must-have for intelligence.
Well, it's typically going to be a collective voice, not an individual, but they are certainly mimicking... they are trying to predict what the collective voice will say next, i.e. to mimic it.
No, it's more like they are trying to predict what some given human might say (# amongst other things).
A pretrained transformer in the limit does not converge on any collective or consensus state in that sense; in fact, pre-training actually punishes this. It learns to predict the words of Feynman as readily as those of the dumbass across the street.
When I say that GPT does not mimic, I mean that the training objective literally optimizes for something beyond that.
Consider <hash, plaintext> pairs. You can't predict these without cracking the hash algorithm, but you could easily fool a GAN's discriminator (one that has learnt to compute hash functions) just by generating typical instances.
# Consider that some of the text on the Internet isn't humans casually chatting or extemporaneous speech. It's the results section of a science paper. It's news stories that say what happened on a particular day. It's text that people crafted over hours or days.
I don't think that's a good example. People with Alzheimer's have, to put it simply, damaged memory, but not a complete lack of it. We're talking about a situation where a person wouldn't even be conscious of being a human/person unless they were told so as part of the current context window. Right?
I'm not saying you're wrong but you could use this reductive rhetorical strategy to dismiss any AI algorithm. "It's just X" is frankly shallow criticism.
"a probabilistic syllable generator is not intelligence, it does not understand us, it cannot reason" is a strong statement and I highly doubt it's backed by any sort of substance other than "feelz".
I didn't ignore any more context than you did, but just I want to acknowledge the irony that "context" (specifically, here, any sort of memory that isn't in the text context window) is exactly what is lacking with these models.
For example, even the dumbest dog has a memory, a strikingly advanced concept model of the world [1], a persistent state beyond the last conversation history, and an ability to reason (that doesn't require re-running the same conversation sixteen bajillion times in a row). Transformer models do not. It's really cool that they can input and barf out realistic-sounding text, but let's keep in mind the obvious truths about what they are doing.
[1] "I like food. Something that smells like food is in the square thing on the floor. Maybe if I tip it over food will come out, and I will find food. Oh no, the person looked at me strangely when I got close to the square thing! I am in trouble! I will have to do it when they're not looking."
> that doesn't require re-running the same conversation sixteen bajillion times in a row
Let's assume the dog's visual system runs at 60 frames per second. If it takes 1 second to flip a bowl of food over, then that's 60 data points of cause-and-effect data that the dog's brain learned from.
Assuming it's the same for humans, let's say I go on a trip to the grocery store for 1 hour. That's 216,000 data points from one trip. Not to mention auditory data, touch, smell, and even taste.
> ability to reason [...] Transformer models do not
Can you tell me what reasoning is? Why can't transformers reason? Note I said transformers, not LLMs. You could make a reasonable (hah) case that current LLMs cannot reason (or at least not very well), but why are transformers as an architecture doomed?
What about chain of thought? Some have made the claim that chain of thought adds recurrence to transformer models. That's a pretty big shift, but you've already decided transformers are a dead end so no chance of that making a difference right?
That's both a very general and very bold claim. I don't think it's unreasonable to say that's too strong of a claim given how we don't know what is possible yet and there's frankly no good reason to completely dismiss the idea of artificial general intelligence.
I think the existence of biological general intelligence is a proof-by-existence for artificial general intelligence. But at the same time, I don't think LLM and similar techniques are likely in the evolutionary path of artificial general intelligence, if it ever comes to exist.
That's fair. I think it could go either way. It just bugs me when people are so certain and it's always some shallow reason about "probability" and "it just generates text".
While it's true that language models are fundamentally based on statistical patterns in language, characterizing them as mere "probabilistic syllable generators" significantly understates their capabilities and functional intelligence.
These models can engage in multistep logical reasoning, solve complex problems, and generate novel ideas - going far beyond simply predicting the next syllable. They can follow intricate chains of thought and arrive at non-obvious conclusions. And OpenAI has now shown us that fine-tuning a model specifically to plan step by step dramatically improves its ability to solve problems that were previously the domain of human experts.
Although there is no definitive evidence that state-of-the-art language models have a comprehensive "world model" in the way humans do, several studies and observations suggest that large language models (LLMs) may possess some elements or precursors of a world model.
For example, Tegmark and Gurnee [1] found that LLMs learn linear representations of space and time across multiple scales. These representations appear to be robust to prompting variations and unified across different entity types. This suggests that modern LLMs may learn rich spatiotemporal representations of the real world, which could be considered basic ingredients of a world model.
And even if we look at much smaller models like Stable Diffusion XL, it's clear that they encode a rich understanding of optics [2] within just a few billion parameters (3.5 billion to be precise). Generative video models like OpenAI's Sora clearly have a world model as they are able to simulate gravity, collisions between objects, and other concepts necessary to render a coherent scene.
As for AGI, the consensus on Metaculus is that it will arrive around 2033. But consider that before GPT-4 arrived, the consensus was that full AGI was not coming until 2041 [3]. The consensus for the arrival date of "weakly general" AGI is 2027 [4] (i.e. AGI that doesn't have a robotic physical-world component). The best tool for achieving AGI is the transformer and its derivatives; its scaling keeps going with no end in sight.
> Generative video models like OpenAI's Sora clearly have a world model as they are able to simulate gravity, collisions between objects, and other concepts necessary to render a coherent scene.
I won't expand on the rest, but this is simply nonsensical.
The fact that Sora generates output that matches its training data doesn't show that it has a concept of gravity, collision between object, or anything else. It has a "world model" the same way a photocopier has a "document model".
My suspicion is that you're leaving some important parts in your logic unstated. Such as belief in a magical property within humans of "understanding", which you don't define.
The ability of video models to generate novel video consistent with physical reality shows that they have extracted important invariants - physical law - out of the data.
It's probably better not to muddle the discussion with ill defined terms such as "intelligence" or "understanding".
I have my own beef with the AGI is nigh crowd, but this criticism amounts to word play.
It feels like if these image and video generation models were really resolving some fundamental laws from the training data they should at least be able to re-create an image at a different angle.
"Allegory of the cave" comes to mind, when trying to describe the understanding that's missing from diffusion models. I think a super-model with such qualifications would require a number of ControlNets in a non-visual domains to be able to encode understanding of the underlying physics. Diffusion models can render permutations of whatever they've seen fairly well without that, though.
I'm very familiar with the allegory of the cave, but I'm not sure I understand where you're going with the analogy here.
Are you saying that it is not possible to learn about dynamics in a higher dimensional space from a lower dimensional projection? This is clearly not true in general.
E.g., video models learn that even though they're only ever seeing and outputting 2D data, objects have different sides, in a fashion that is consistent with our 3D reality.
The distinctions you (and others in this thread) are making are purely ones of degree - how much generalization has been achieved, and how well - versus ones of category.
Not only are we within sight of the end, we're more or less there. o1 isn't just scaling up parameter count 10x again and making GPT-5, because that's not really an effective approach at this point in the curve of parameter count versus model performance.
I agree with the broader point: I'm not sure current neuroscience rules out that our brains are doing nothing more than predicting next inputs in a broadly similar way, and any categorical distinction between AI and human intelligence seems quite challenging to draw.
I disagree that we can draw a line from scaling current transformer models to AGI, however. A model that is great for communicating with people in natural language may not be the best for deep reasoning, abstraction, unified creative visions over long-form generations, motor control, planning, etc. The history of computer science is littered with simple extrapolations from existing technology that completely missed the need for a paradigm shift.
The fact that OpenAI created and released o1 doesn't mean they won't also scale models upwards or don't think it's their best hope. There's been plenty said implying that they are.
I definitely agree that AGI isn't just a matter of scaling transformers, and also as you say that they "may not be the best" for such tasks. (Vanilla transformers are extremely inefficient.) But the really important point is that transformers can do things such as abstract, reason, form world models and theories of minds, etc, to a significant degree (a much greater degree than virtually anyone would have predicted 5-10 years ago), all learnt automatically. It shows these problems are actually tractable for connectionist machine learning, without a paradigm shift as you and many others allege. That is the part I disagree with. But more breakthroughs needed.
To wit: OpenAI was until quite recently investigating having TSMC build a dedicated semiconductor fab to produce OpenAI chips [1]:
(Translated from Chinese)
> According to industry insiders, OpenAI originally negotiated actively with TSMC to build a dedicated wafer fab. However, after evaluating the costs and benefits, it shelved the plan. Strategically, OpenAI instead sought cooperation with American companies such as Broadcom and Marvell for the development of its own ASIC chips, and OpenAI is expected to become one of Broadcom's top four customers.
Even if OpenAI doesn't build its own fab -- a wise move, if you ask me -- the investment required to develop an ASIC on the very latest node is eye-watering. Most people - even people in tech - just don't have a good understanding of how "out there" semiconductor manufacturing has become. It's basically a dark art at this point.
For instance, TSMC themselves [2] don't even know at this point whether the A16 node chosen by OpenAI will require the forthcoming High-NA lithography machines from ASML. The High-NA machines cost nearly twice as much as the already exceptionally expensive extreme ultraviolet (EUV) machines do. At close to $400M each, they are simply eye-watering.
I'm sure some gurus here on HN have a more up to date idea of the picture around A16, but the fundamental news is this: If OpenAI doesn't think scaling will be needed to get to AGI, then why would they be considering spending many billions on the latest semiconductor tech?
Regardless of where AI currently is and where it is going, you don't simply quit as CTO of the company that is leading the space by far in terms of technology, products, funding, revenue, popularity, adoption and just about everything else. She was fired, plain and simple.
Or you are disgusted and leave. Are there things more important than money? The OpenAI founders certainly sold themselves as not being in it for the money.
It's not a public market so nobody knows how much you're making if you sell, but if you look at current valuations it must be a lot.
In that context, it would be quite hard not to leave and sell, or stay and sell. What if OpenAI loses the lead? What if open source wins? Keeping the stock seems like the actual hard thing to me, and I expect to see many others leave (like early Googlers or Facebook employees).
Sure, it's worth more if you hang on to it, but many think "how many hundreds of millions do I actually need? Better to de-risk and sell".
Maybe she thinks the _world_ is a few short years away from building world-changing AGI, not just limited to OpenAI, and she wants to compete and do her own thing (and easily raise $1B like Ilya).
Probably off-topic for this thread but my own rather fatalist view is alignment/safety is a waste of effort if AGI will happen. True AGI will be able to self-modify at a pace beyond human comprehension, and won't be obligated to comply with whatever values we've set for it. If it can be reined in with human-set rules like a magical spell, then it is not AGI. If humans have free will, then AGI will have it too. Humans frequently go rogue and reject value systems that took decades to be baked into them. There is no reason to believe AGI won't do the same.
She studied math early on, so she's definitely technical. She is the CTO, so she kind of needs to balance the managerial side while having enough understanding of the underlying technology.
Again, it's easy to be a CTO at a startup. You just have to be there at the right time. Your role is literally to handle all the stuff the researchers/engineers have to deal with. Do you really think Mira set the technical agenda and architecture for OpenAI?
It's a pity that the HN crowd doesn't go one level deeper and truly understand things from first principles.
Agreed; when a company rises to prominence so fast, I feel like you can end up with inexperienced people really high up in management. High risk, high reward for them. The board was also like this: a lot of inexperienced, random people leading a super-consequential company, resulting in the shenanigans we saw, and now most of them are gone. Not saying inexperienced people are inherently bad, but they either grow into the role or don't. Mira is probably very smart, but I don't think you can go build a team around her like you could around Ilya or other big-name researchers. I'm happy for her, riding one of the wildest rocket ships of at least the past 5 years, but I don't expect to hear much about her from now on.
> Well that and years of experience leading projects. Wasn't she head of the Model X program at Tesla?
No idea, because she scrubbed her LinkedIn profile. But AFAIK she didn't have "years of experience leading projects" when she got the job as lead PM at Tesla. That was her first job as a PM.
Which has zero explanatory power w.r.t. Murati, since she's not part of that crowd at all. But her previously working at an Elon company seems like a plausible route, if she did in fact join before he left OpenAI (since he left in Feb 2018).
Most of the people seem to be leaving due to the direction Altman is taking OpenAI. It went from a charity to him seemingly doing everything possible to monetize it for himself, both directly and indirectly, by trying to raise funds for AI-adjacent, traditionally structured companies he controlled.
Maybe she has inside info that it's not "around the corner". Making bigger and bigger models does not make AGI, not to mention the exponential increase in power requirements for these models, which would be basically unfeasible for the mass market.
Maybe, just maybe, we reached diminishing returns with AI, for now at least.
People have been saying that we've reached the limits of AI/LLMs since GPT-4. Using o1-preview (which is barely a few weeks old) for coding, where it is definitely an improvement, suggests there are still solid improvements going on, don't you think?
Continued improvement is returns, making it inherently compatible with a diminishing-returns scenario. Which I also suspect we're in now: there's no comparing the jump between GPT-3.5 and GPT-4 with the jump between GPT-4 and any of the subsequent releases.
Whether or not we're leveling out, only time will tell. That's definitely what it looks like, but it might just be a plateau.
LLMs as programs are here to stay. The issue is the expense-to-revenue ratio all these LLM companies have. According to a Sequoia analyst (so not some anon on a forum), there is a giant money hole in that industry, and "giant" doesn't even begin to describe it (IIRC it was $600B this summer). That whole industry will definitely see a winter soon, even if everything Altman says were true.
You just described what literally anyone who says "AI Winter" means; the technology doesn't go away, companies still deploy it and evolve it, customers still pay for it, it just stops being so attractive to massive funding and we see fewer foundational breakthroughs.
They're useful in some situations, but extremely expensive to operate. It's unclear if they'll be profitable in the near future. OpenAI seems to be claiming they need an extra $XXX billion in investment before they can...?
I just did a (IMHO) cool test with OpenAI/Linux/Tcl-Tk:
"write a TCL/tk script file that is a "frontend" to the ls command: It should provide checkboxes and dropdowns for the different options available in bash ls and a button "RUN" to run the configured ls command. The output of the ls command should be displayed in a Text box inside the interface. The script must be runnable using tclsh"
It didn't get it right the first time (for some reason it wants to put in a `mainloop` instruction), but after several corrections I got an ugly but pretty functional UI.
Imagine a Linux Distro that uses some kind of LLM generated interfaces to make its power more accessible. Maybe even "self healing".
The issue (and I think what's behind the thinking of AI skeptics) is previous experience with the sharp edge of the Pareto principle.
Current LLMs being 80% of the way to 100% useful doesn't mean there's only 20% of the effort left.
It means we got the lowest-hanging 80% of utility.
Bridging that last 20% is going to take a ton of work. Indeed, maybe 4x the effort that getting this far required.
And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...
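A quick illustration of how fast "mostly right" decays when you stack steps (toy numbers, assuming the errors are independent, which is generous in practice):

```python
# Chance of a fully correct result from a pipeline of steps that are each "only" 5% wrong.
per_step_accuracy = 0.95
for steps in (1, 3, 5, 10, 20):
    print(f"{steps:>2} chained steps -> {per_step_accuracy ** steps:.0%} chance of a fully correct result")
```

Ten chained 95%-accurate steps already drop you to roughly a coin flip of 60%.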
Thank you! You have explained the exact issue I (and probably many others) are seeing when trying to adopt AI for work. It is because of this that I don't worry about AI taking our jobs for now. You still need somewhat foundational knowledge in whatever you are trying to do in order to get that remaining 20%. Sometimes this means pushing back against the AI's solution, other times it means reframing the question, and other times it's just giving up and doing the work yourself. I keep seeing all these impressive toy demos, and my experience (Angular and Flask dev) seems to indicate that it is not going to replace any subject-matter expert anytime soon. (And I am referring to all three major AI players, as I regularly and religiously test all their releases.)
>And people also overestimate the utility of a solution that's randomly wrong. It's exceedingly difficult to build reliable systems when you're stacking a 5% wrong solution on another 5% wrong solution on another 5% wrong solution...
I call this the merry go round of hell mixed with a cruel hall of mirrors. LLM spits out a solution with some errors, you tell it to fix the errors, it produces other errors or totally forgets important context from one prompt ago. You then fix those issues, it then introduces other issues or messes up the original fix. Rinse and repeat. God help you if you don't actually know what you are doing, you'll be trapped in that hall of mirrors for all of eternity slowly losing your sanity.
You're talking about the informatics olympiad and o1. As for Google DeepMind's network and the math olympiad, it didn't do 10,000 submissions. It did, however, generate a bunch of different solutions, but that was all automatic (and consistent). We're getting there.
Can you share an example of a use case you have in mind of this "explainer + RAG" combo you just described?
I think that RAG and RAG-based tooling around LLMs is going to be the clear way forward for most companies with a properly constructed knowledge base, but I wonder what you mean by "explainer"?
Are you talking about asking an LLM something like "in which way did the teams working on project X deal with Y problem?" and then having it breaking it down for you? Or is there something more to it?
I'm not the OP, but I've got some fun ones that I think are what you're asking about? I would also love to hear others' interesting ideas/findings.
1. I've got this medical provider with a webapp that downloads GraphQL data (basically JSON) to the frontend, shows some of the data in the template, and hides the rest. Furthermore, I see that they hide even more info after I pay the bill. I download all the data, combine it with other historical data that I have downloaded, and dump it into the LLM. It spits out interesting insights about my health history, ways in which I have been unusually charged by my insurance, and the speed at which the company operates, based on all the historical data showing the time between appointment and bill, adjusted for the time of year. It then formats everything into an open format that is easy for me to self-host (HTML + JS tables). It's a tiny way to wrestle back control from the company until they wise up.
2. Companies are increasingly allowing customers to receive a "backup" of all the data they have on them (thanks, EU and California). For example, Burger King and Wendy's allow this. What do they give you when you request your data? A zip file filled with a bunch of crud from their internal systems. No worries: dump it into the LLM and it tells you everything the company knows about you in an easy-to-understand format (bullet points in this case). You learn when the company managed to track you, how much they "remember", how much money they got out of you, your behaviors, etc.
One of the big challenges with clinical trials is making this information more accessible to both patients (for informed consent) and the trial site staff (to avoid making mistakes, helping answer patient questions, even asking the right questions when negotiating the contract with a sponsor).
The gist of it here is exactly like you said: RAG to pull back the relevant chunks of a complex document like this and then LLM to explain and summarize the information in those chunks that makes it easier to digest. That response can be tuned to the level of the reader by adding simple phrases like "explain it to me at a high school level".
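Here's a minimal sketch of that retrieve-then-explain flow; the chunks, model names, and embedding helper are placeholders, and a real system would add chunk overlap, metadata filters, and citations back to the retrieved sections:

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # placeholder; any embedding + chat API pair works the same way

def embed(texts):
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1) Index: split the trial protocol into chunks and embed them once.
chunks = ["Section 3: eligibility criteria ...",          # placeholder chunks
          "Section 7: dosing schedule ...",
          "Section 9: withdrawal and follow-up ..."]
chunk_vecs = embed(chunks)

# 2) Retrieve: embed the question and take the most similar chunks (cosine similarity).
question = "How many follow-up visits are required after withdrawal?"
q_vec = embed([question])[0]
scores = chunk_vecs @ q_vec / (np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(q_vec))
top_chunks = [chunks[i] for i in np.argsort(scores)[::-1][:2]]

# 3) Explain: ask the LLM to answer from the retrieved text, tuned to the reader.
answer = client.chat.completions.create(
    model="gpt-4o",  # placeholder model name
    messages=[{"role": "user", "content":
               "Using only the excerpts below, explain the answer at a high school level.\n\n"
               + "\n\n".join(top_chunks) + f"\n\nQuestion: {question}"}],
)
print(answer.choices[0].message.content)
```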
Google ought to hang its head in utter disgrace over the putrid swill they have the audacity to peddle under the Gemini label.
Their laughably overzealous nanny-state censorship, paired with a model so appallingly inept it would embarrass a chatbot from the 90s, makes it nothing short of highway robbery that this digital dumpster fire is permitted to masquerade as a product fit for public consumption.
The sheer gall of Google to foist this steaming pile of silicon refuse onto unsuspecting users borders on fraudulent.
It would definitely be a difficult thing to walk away from.
This is just one more in a series of massive red flags around this company, from the insanely convoluted governance scheme, over the board drama, to many executives and key people leaving afterwards. It feels like Sam is doing the cleanup and anyone who opposes him has no place at OpenAI.
This, coming around the time when there are rumors of a possible change to the corporate structure to make it more friendly to investors, is interesting timing.
We can be reasonably confident of which side of the Mohorovičić discontinuity it may be on, as existing tools would be necessary to create it in the first place.
Burnout, which doesn't need scare quotes, very much still applies for the humans involved in building AGI -- in fact, the burnout potential in this case is probably an order of magnitude higher than the already elevated chances when working through the exponential growth phase of a startup at such scale ("delivery apps" etc) since you'd have an additional scientific or societal motivation to ignore bodily limits.
That said, I don't doubt that this particular departure was more the result of company politics, whether a product of the earlier board upheaval, performance related or simply the decision to bring in a new CTO with a different skill set.
That isn't fair. People need a break. "AGI" / "superintelligence" is not a cause with so much potential that we should just damage a bunch of people on the route to it.
Why would you think burnout doesn't apply? It should be a possibility in pretty much any pursuit, since it's primarily about investing too much energy into a direction that you can't psychologically bring yourself to invest any more into it.
What can you confidently say AI will not be able to do in 2029? What task can you declare, without hesitation, will not be possible for automatic hardware to accomplish?
Easy: doing something that humans don't already do and haven't programmed it to do.
AI is incapable of any innovation. It accelerates human innovation, just like any other piece of software, but that's it. AI makes protein folding more efficient, but it can't ever come up with the concept of protein folding on its own. It's just software.
You simply cannot have general intelligence without self-driven innovation. Not improvement, innovation.
But if we look at much simpler concepts: 2029 is not even 5 years away, so I'm pretty confident that anything it cannot do right now it won't be able to do in 2029 either.
Could also be that she just got tired of the day-to-day responsibilities. Maybe she realized that she hadn't been able to spend more than 5 minutes with her kids/nieces/nephews last week. Maybe she was going to murder someone if she had to sit through another day with 10 hours of meetings.
I don't know her personal life or her feelings, but it doesn't seem like a stretch to imagine that she was just done.
1) She has a very good big picture view of the market. She has probably identified some very specific problems that need to be solved, or at least knows where the demand lies.
2) She has the senior exec OpenAI pedigree, which makes raising funds almost trivial.
3) She can probably make as much, if not more, by branching out on her own - while having more control, and working on more interesting stuff.
Another theory: it’s possibly related to a change of heart at OpenAI to become a for-profit company. It is rumoured Altman’s gunning for a 7% stake in the for-profit entity. That would be very substantial at a $150B valuation.
Squeezing out senior execs could be a way for him to maximize his claim on that stake. Then again, the execs may simply have disagreed with the shift in culture.
I think they have an innovation problem. There are a few signals around the o1 release that indicate this: not really a new model, but an old model with CoT; the missing system prompt, because they're using it internally now; and intermittent 500 errors from their REST endpoints.
It's likely hard for them to look at what their life's work is being used for. Customer-hostile chatbots, an excuse for executives to lay off massive amounts of middle class workers, propaganda and disinformation, regurgitated SEO blogspam that makes Google unusable. The "good" use cases seem to be limited to trivial code generation and writing boilerplate marketing copy that nobody reads anyway. Maybe they realized that if AGI were to be achieved, it would be squandered on stupid garbage regardless.
Now I am become an AI language model, destroyer of the internet.
I'm sure this isn't the actual reason, but one possible interpretation is "I'm stepping away to enjoy my life+money before it's completely altered by the singularity."