[flagged] It's all unraveling at OpenAI (again) (businessinsider.com)
42 points by Bluestein 19 days ago | 42 comments




No new information in this article; it just strings together all the recent newsworthy events at OpenAI.



I’m not convinced there isn’t a pretty well-defined ceiling on how clever LLMs can get using current algorithms and hardware. This isn’t how we think, and they all still fall apart on most reasoning tasks. If they haven’t seen lots of examples of something in their training data, they can’t generalise anything, let alone comprehend a book on aerodynamics and design a Formula One car. They are more like a parrot than a doctor right now.


The strangeness of this argument is that it’s unnecessary for any one technique to be capable of everything, because a mixture of techniques is more powerful than any one alone. An LLM forming an abductive “plan” and synthesizing results from agent functions that do the deductive reasoning, optimization, math, and other things LLMs are bad at is remarkably powerful, and it is incredibly flexible in new semantic contexts. Who cares whether an LLM can solve partial differential equations itself, if it can establish the partial differential equation to solve from semantic context, given a sufficient training set of semantically grounded PDEs? The solving can be delegated to the last 70 years of research on numerical methods (a toy sketch of this division of labor is below). Similarly, we don’t expect a recursive descent parser to play chess, but you can absolutely use a parser to build a grammar that exposes a chess-playing solver. The combination of the solver and the grammar that enables its use is more powerful than the sum of its parts.

LLMs neatly solve an enormous number of hitherto unsolved problems and yield untapped solutions to many more. Generative AI in general composes in powerful ways to unlock even more: semantic analysis of visual information, expressible directly in structured language, is powerful on its own, as is exposing generative robotic motion and audio processing, paired with solvers, optimizers, etc. What’s the residual? Synthetic agency? We’ve had multi-level, goal-based agents for a long time. One that’s reprogrammable by the generative model assembly?
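Here is the kind of toy sketch I mean, in Python. To be clear, llm_formulate is a purely hypothetical stand-in for a real model call, not any actual API; the only point is that the generative step picks and parameterizes the problem while a classical solver (SciPy's solve_ivp here) does the actual math.

    # Hypothetical sketch: the "LLM" turns prose into a structured problem spec;
    # a numerical solver does the computation.
    from scipy.integrate import solve_ivp

    def llm_formulate(prompt: str) -> dict:
        # Stand-in for a model call that maps the prompt to a model spec,
        # here a logistic-growth ODE: dy/dt = r * y * (1 - y / K).
        return {"r": 0.5, "K": 100.0, "y0": [1.0], "t_span": (0.0, 30.0)}

    def solve(spec: dict):
        rhs = lambda t, y: spec["r"] * y * (1 - y / spec["K"])
        return solve_ivp(rhs, spec["t_span"], spec["y0"])

    spec = llm_formulate("A population grows logistically from 1 toward 100.")
    print(solve(spec).y[0, -1])  # numerical methods, not the LLM, did the solving

The specific solver doesn't matter; what matters is that the LLM never has to carry out the integration, only set it up.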


Personally, I am partial to the "alien chef" metaphor for AI: AIs are just reacting the way a chef from Earth would if transported to an alien planet where all the vegetables and other ingredients look the same as Earth's but are in reality very different, with different effects, some of them poisonous. The chef thinks he is coming up with dishes fit for consumption, but he has no understanding of what's actually going on ...


A lot of the hype around LLMs, and the unfounded promises of them solving complex tasks autonomously, seems to stem from a lack of understanding of how these models actually work.

The tech is impressive and has its uses, but we shouldn’t pretend like a token predictor can somehow reason and plan.


I don't know how people can still be making this misguided comment. Predicting tokens well requires reasoning and planning.

"Brains are impressive and have their uses, but we shouldn't pretend like a muscle controller can somehow reason and plan."


> Predicting tokens well requires reasoning and planning.

Reasoning and planning regarding token prediction.

It cannot reason about the context of its output. It only infers the most likely token to follow.


Of course it can reason about the context of its output. I'm honestly not sure what you're trying to say.

Can you give me an example of a prompt that requires your definition of "reasoning"?


Have you ever tried to get whatever fancy LLM to write some code for you and had it generate code that was plausible to look at but, at the same time, complete bullshit?

This is what I am talking about. It can reason and plan about what the likely token is, based on its training data. It is completely unable to evaluate the logic of the code it is generating. It cannot reason about the context (in this case, programming).

The same is true for other domains: law, medicine, etc. The stricter the field, the less reliable LLMs are, because they cannot reason about the context of what they are writing.

That said, I like LLMs, and I think they are an interesting productivity tool, just hyped beyond any reasonable expectations. I find them more useful in less strict contexts (for example, creative writing).


I mean, obviously predicting tokens in the context of LLM output requires planning. But planning tokens doesn’t generalize. This is evident when you train an LLM on a small data set and ask it about any unseen variation of that data.

There are many examples of slightly modified popular riddles that are easy to solve by reasoning about them (for instance, river-crossing puzzles restated so the usual constraints no longer apply), but LLMs always fall back to the most likely output from their training data and reproduce the canonical solution anyway.


Sure, the current generation. But what happens if we expand its scope, so instead of predicting one token at a time it is allowed to do more? What happens when you feed it summaries of its own work so it has long- and short-term memory? What happens when you wire several networks together in a dialogue, similar to the theories of a bicameral mind? What if, instead of having the LLM halt after every prompt, it is put in a loop, with its own output plus outside information used as the prompts for the following iteration?
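Something like this toy loop, say. (This is only a sketch of the idea; llm, summarize, and get_outside_info are hypothetical stand-ins, not any real API.)

    # Toy sketch: the model's own output, a rolling summary ("memory"),
    # and outside information become the next prompt, over and over.
    def run_loop(llm, summarize, get_outside_info, steps=10):
        memory = ""            # long-term memory: rolling summary of past work
        output = "Begin."      # short-term memory: the most recent output
        for _ in range(steps):
            prompt = f"{memory}\n{get_outside_info()}\n{output}"
            output = llm(prompt)                        # one turn of the dialogue
            memory = summarize(memory + "\n" + output)  # fold it back in
        return memory

    # Trivial stand-ins just to show the loop runs end to end:
    print(run_loop(lambda p: p[-40:], lambda s: s[-200:], lambda: "headline", steps=3))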


There's no reason they can't. In fact LLMs do seem to be the best at artificial general-purpose reasoning of any attempt so far (pretty much every other attempt has failed miserably). That doesn't mean they're particularly good compared to humans, though. I'd characterize them as very high knowledge, low intelligence.


They absolutely can reason and plan; how do you suppose they predict the next token?

That they’re not autonomously solving complex tasks is a bit of a straw man though, and with a bit of creativity we can easily imagine them being combined with models and modalities that do provide executive function and autonomy.


This is one of those things I’d love to be wrong on. I think your point on combining them with other models is interesting.

Do you think that Markov chains can reason and plan? Dijkstra’s algorithm? Curious where you draw the line


Well, yes, reasoning and planning abilities exist on a spectrum, so it isn’t so much a matter of where to draw the line as a question of degree. As for LLMs, I think their reasoning and planning is some of the most powerful and human-like we’ve seen so far, even if the hidden mechanisms and constraints are different (in some cases, more limited, but in others, vastly superior).

Our brains, however, are highly modular (a “committee of idiots”), so who’s to say some portion of them, even a significant one, doesn’t operate on similar principles?


For thought: Can DNA reason and plan?


Can a collection of around 1.5 billion interconnected cells that predictably respond to signals in their environment using simple rules? How about 86 billion? 36 trillion?

These are ballpark counts of the cells in a crow’s brain, a human’s brain, and a human body. The question is: is it the cells themselves doing the reasoning and planning, or are they just the machinery this disembodied process happens to be running on? I’d argue intelligence is a distributed phenomenon that our DNA is as much a party to as our brains.

Certainly the question of whether humans use DNA to reproduce or DNA uses humans is a matter of perspective.


AI safety measures, like banning AIs from pretending to be humans, require government action and regulation. There is naturally an arms race among AI organizations, companies, and open-source groups to be “the best”, and it is impractical and naive to expect individual organizations to act in humanity’s best interest.


My tin foil hat is telling me that no publicity is bad publicity and LLMs are great at spinning dramatic yarns.


Also, most of the signers of that letter probably still have OpenAI equity. They are incentivized to pump it. I'm not saying that they are doing this in bad faith, just that the incentive is perverse in this case.


I dunno. Quite frankly I’m bored of it and don’t care any more. I suspect it’s just attrition. I have better things to do than deal with some transient technology provider’s existence.

Other than to bitch about this meta point of course :)

Really, the metric is: if they died tomorrow, would my life be materially impacted? Apple, yes; OpenAI, nope!


Their idea of AI safety is pretty heavy-handed anyway, so I’m glad they’re focusing on it less.

LLMs aren’t gonna take over the world; the worst they could do is tell you something you could find with Google.

I just want them to make a useful, reliable product and sell it to me for a reasonable price.


I know Sam Altman is revered as a tech god in this forum and I will likely get downvoted, but there's something deep within his character that doesn't seem genuine or sit well with me. It's almost a tinge of Elizabeth Holmes (the aspirational part), and it likely stems from his desire from a young age to emulate a Bill Gates, an Elon Musk, or the cohort of Silicon Valley VCs and founders that came before him. I believe this deeply drives his speech and motivation. He pulls certain levers that he knows will shape his persona and plays the tech visionary role. As an outsider, it seems manufactured and disingenuous.


> Sam Altman is revered as a tech god in this forum

I don’t think that’s true.

> I will likely get downvoted, but there's something deep within his character that doesn't seem genuine or sit well with me

That’s actually an extremely popular opinion; see for example just about every recent article that’s been posted about him.


Half of this story seems to be about the ScarJo "Her" voice story, which was more or less debunked by the Washington Post: OpenAI hired several voice actors months before reaching out to ScarJo, and the voice suspected of being a clone or performance of ScarJo's own voice was the natural speaking voice of one of those actors, who was not instructed to mimic ScarJo or given any reference to "Her".

The rest of it is about "safety" whistleblowers. That's something that seems like it could be a real problem for OpenAI, if you believe that transformer models are something that raise genuine safety issues, or if you take seriously OpenAI's founding commitments to be something more like a monastic society than a company.

For some time now, I've mentally bucketed OpenAI in with all the other tech companies, and I find that perspective pretty refreshing; there's just not much to think about here, apart from the technology, because, to my eyes, it's not reasonable to expect a lot more of OpenAI than I would from Apple or Google. Is OpenAI becoming less and less like a religion and more like a hyper-rational technology business? To me: that's a good thing. It is fine that we disagree.

I anticipate many responses about the subtle angles of this silly voice actor story that I am not capturing, like Altman's professed love for the movie "Her", and to all that I'll preemptively respond: you do you; we (again) can simply disagree. I would have delighted in a news cycle where a giant tech company actually gets shredded by Rebecca from Ghost World, but I found the WaPo story dispositive.


I don't feel the ScarJo story was debunked. Those are OpenAI's counterpoints, sure, but it's a huge leap to accept that they just completely, coincidentally hired the one voice actor with an eerily similar voice, all the while continuing to reference the movie and continuing to reach out to Scarlett Johansson.


It’s been downgraded to “a preponderance of evidence” from “beyond a reasonable doubt”.


"A preponderance of evidence" is the standard for a civil case which is what this would be, "beyond a reasonable doubt" is the standard for a criminal case.


The issue with the ScarJo case, irrespective of the facts behind the development of OpenAI's voices, is that a) ScarJo's team indicated an intent to sue and b) legal precedent exists that can give the performer ownership if it's uncannily similar.

It's not as clear-cut as non-lawyer tech people say it is, even if it will be an uphill battle for ScarJo, which is what makes it more interesting. She did fight Disney and win, after all.


As far as I can tell, all of that precedent involves the "copier" of the voice giving specific instruction to the "mimic", which didn't happen here.


If the series of events was: they wanted ScarJo, she said no, they found someone similar, then that would be the end of the story. The question, then, is why they reached out to her again immediately before putting it out there. That seems pretty strange to me.


I don't have more to say here other than that I'm aware of the legal precedent being discussed, I've read two of the cases, and in both the fact pattern cited in the decision involved specific instruction for the performer to mimic someone else's performance. That's all.


They made a voice model from Scarlett Johansson recordings and maybe hoped she would change her mind.


They did no such thing. The Washington Post found the voice actor they used, it's her natural speaking voice, she was not asked to perform ScarJo, and the movie "Her" wasn't mentioned.


Sky was one voice model. OpenAI made and released four others. It is plausible they made at least one more and did not release it.

> The Washington Post found the voice actor they used

They found the voice actor? Or OpenAI produced the actor's agent?


Is there reporting about this happening, or is this simply something you think could have happened? Because a lot of things could have happened.


What does maybe mean?

They said it seemed strange OpenAI contacted Johansson when they did. I countered this assumption.

I believe an unreleased Scarlett Johansson voice model would not infringe her publicity rights. Do you disagree?


parallel construction?


> and the voice suspected to be a clone or performance of ScarJo's own voice was the natural speaking voice of one of those actors, who was not instructed to mimic ScarJo

Yeah, but I bet out of all the voice actors available, I can guess why that specific one was chosen. There's more than one way to intentionally copy a voice.


My immediate reaction is that the media just needs something to talk about.



