OpenAI Dissolves High-Profile Safety Team After Chief Scientist Sutskever's Exit (bloomberg.com)
107 points by mfiguiere 21 days ago | 46 comments



Jan Leike, the former co-lead of the team with Ilya Sutskever, shared his thoughts on X: https://x.com/janleike/status/1791498174659715494


I don’t have an account on Twitter. Can you please copy-paste those thoughts?


I believe this image posted in a different thread has them: https://jnnnthnn.com/leike.png


So basically, it's confirmed that OpenAI has totally abandoned its original priorities.

Personally, I hope AGI is an unrealizable nerd fantasy, and OpenAI gets destroyed by Musk's lawsuit.


Can't have teams trying to slow things down when you're trying to make infinite money.


Can you imagine dark patterns that automatically rotate themselves like Borg shields?

Cause that's what they're planning with AI girlfriends.


This is concerning to me if Schulman is in charge of safety now.

He was on the Dwarkesh podcast the other day and Dwarkesh asked him a lot of safety questions. He had very short timelines for superintelligence, and no real answers for any of the safety questions Dwarkesh asked.


The most obvious explanation for this, the one that requires the least competence (https://simple.wikipedia.org/wiki/Hanlon%27s_razor), is that sama concluded that AGI cannot be achieved soon, so it's no longer a concern.


Completely agree. I think the AGI fantasy that took the world by storm a year and a half ago has died out.

Of course, there are still true believers in these companies. They were true believers years and decades ago. But everyone who was suddenly introduced to the concept is no longer interested as the frictions of real world progress become apparent.


My prior in this case was full self driving, so I did not believe we were close to AGI even though LLMs can do impressive things. Hopefully software executives realize at this point that LLMs for software production will still require a competent and attentive “driver” to evaluate their output and apply human reasoning for years to come.


Can you give some examples of this disillusionment? I see more, not fewer, people talking about these things nowadays. I'd like to believe that AGI is impossible, but I don't see any evidence.


Safety from AGI, sure. But I could argue there will still be safety concerns from the deepfakes and scams OpenAI's tech is sure to proliferate.


But if you assemble a team of leading public figures and researchers caught up in AGI catastrophe scenarios, and then try to direct that team to just implement some guardrails against deepfakes/scams/taboos, you're quickly going to find them frustrated.

It's like taking a star NFL/NBA team and sending them to play a season of exhibition games in a foreign market instead of continuing to compete in the peak league of their sport.


This analogy doesn't track because LeBron would run circles around any foreign basketball team, while OpenAI is not keeping up with deepfake detection.


> safety concerns from the deepfakes and scams

Those will happen whether or not you have a safety team. The best way to protect against these is to widely distribute deepfakes of famous people so everyone is aware that faking things is possible.


This would certainly raise awareness, but how exactly would it protect against deepfakes?


Don't protect against deepfakes. Make everyone aware that deepfakes are possible, and then they lose their power to influence.


Social media was already rife with fake news and half-truths before ChatGPT.

And there are already open-source LLMs that can be trained to scam a significant number of people. OpenAI can't do anything about it.

The only remedy is for social media to warn people.


I wonder if Adobe Photoshop has a safety team



We need safety from dark patterns worse than Twitter and Facebook combined.


Lol. It's kind of the opposite. That is what the coup attempt was about. Ilya saw GPT-4o (which, when they started training it, was probably going to be called GPT-5) in October or whenever and declared Mission Accomplished.

But that would have basically meant the company was wrapped up and the model was given away or sent to a working group to study in private or something.

Which would not be compatible with making money.

This is also why they did not demonstrate or even mention for one second the text-to-image capabilities of the latest model during the demo. Because it makes it too obvious that the model is truly general purpose and capable.

This is also why Altman made a big deal about it being free (to some degree) and explains the recent "feeling the AGI" photo on X/Twitter.

It also ties into the Morgan Chu lawsuit that is still going forward, as far as I know. And you guys will see in the movie that comes out in a couple of years that my explanation was right.

Also, don't be surprised if they get Gal Gadot to play Murati.


There is no way GPT-4o is anywhere near good enough for Ilya to want to call mission accomplished for the AGI goal on it.


I didn't know about the simple.wikipedia.org subdomain, pretty neat!


Jan Leike, recently resigned head of alignment: "Safety culture and processes have taken a backseat to shiny products"

https://x.com/janleike/status/1791498184671605209


> “Superalignment” group will be folded into broader research efforts at the company.

I think superalignment and superintelligence are incompatible with each other. I think a huge part of intelligence is examining the basic premises you believe in and being able to work out their implications. Look at the Enlightenment and the Renaissance. I don’t think we would have had the Scientific Revolution without the intellectual underpinnings that reshaped how we saw the world.

In addition, superalignment is very hubristic. We are positing that our way of looking at the world is the correct and valid way, and that there is no better way that could be found even by a superintelligent entity.

So either “superalignment” creates entities that are so crippled that it would be hard to call them superintelligent, or else it is just the equivalent of putting an internet “safety” filter on the computer of a teenager who writes open source kernel drivers for fun: something that will quickly be bypassed.

I am glad to see this pseudo-religious BS finally being discarded.


The thing I've never understood is how Leike's superalignment scheme could work. There's a high-level description of their research aims here: https://openai.com/index/introducing-superalignment/ Basically they want to train a model to evaluate the safety of other models, because "humans won’t be able to reliably supervise AI systems much smarter than us".

But we know that memory-augmented LLMs (not conventional LLMs, but a minor extension that will likely become commonplace, or some variation thereof) are Turing complete, so being able to inspect a model and guarantee some property is equivalent to the halting problem.

It's probably hard to justify a research team whose goal is solving the halting problem, which is provably undecidable in the general case.
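
A toy sketch of that reduction, purely illustrative (the `is_safe` checker and the helper names are hypothetical, not anyone's actual API): if a perfect safety checker existed for arbitrary Turing-complete models, it would decide halting.

    # Illustrative only: no total, always-correct `is_safe` can exist for
    # Turing-complete models; this shows why.
    def is_safe(model) -> bool:
        """Hypothetical checker: True iff `model` can never emit unsafe output."""
        raise NotImplementedError

    def halts(program, program_input) -> bool:
        """If `is_safe` existed, this would decide the (undecidable) halting problem."""
        def wrapper_model(_prompt):
            program(program_input)   # may run forever
            return "UNSAFE OUTPUT"   # reached only if `program` halts
        # wrapper_model is safe exactly when program(program_input) never halts,
        # so a correct is_safe would answer the halting question.
        return not is_safe(wrapper_model)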


What if you solve 100% of the problem over a smaller domain?

It is important to disambiguate the scientific problem from the engineering problem. We can prove that lots of specific programs halt; what we can't do is decide halting for arbitrary programs.

It isn't a reason not to do the research. Sounds like an armchair way to avoid trying.


Isn't it the case that you can only prove that a program halts if it can be reduced/transformed to a computational model (e.g., a model that is effectively calculable) that is less powerful than a Turing machine?

Had they framed their research as "how can we design models that are limited enough that we can guarantee their safety" (rather than how can we design a powerful extrinsic inspector to supervise), that would have made total sense to me. But if you can do that, many of the motivations for superalignment don't exist. Put another way, implicit in the superalignment game is the idea that models that require superalignment are going to be at least as powerful as what's on the horizon, not some reducible subset thereof.


In practice, most real-world programs can be written in a way that only uses bounded recursion, so the halting problem doesn't really need to affect them. Or you can trivially ensure a program halts: add a timeout (and real-world programs like web servers do this all the time).
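
For example, a minimal Python sketch of the "just add a timeout" pattern (the helper name is mine, not any particular library's API):

    from concurrent.futures import ThreadPoolExecutor, TimeoutError

    def run_with_timeout(fn, timeout_s=1.0, fallback=None):
        """Run fn(); return its result, or `fallback` once the deadline passes."""
        pool = ThreadPoolExecutor(max_workers=1)
        try:
            return pool.submit(fn).result(timeout=timeout_s)
        except TimeoutError:
            # Python can't forcibly kill the worker thread; we just stop waiting.
            return fallback
        finally:
            pool.shutdown(wait=False)

    # usage: run_with_timeout(some_slow_call, timeout_s=1.0, fallback=None)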

I suppose the analogy for models is that if it can't decide whether output of another model is "safe", then it is not safe, so terminate. In practice, this could possibly be useful in the way a web server that times out after 1 second is still useful when its response times are normally measured in milliseconds.

(Assuming of course that safety means anything in the first place)


> Isn't it the case that you can only prove that a program halts if it can be reduced/transformed to a computational model (e.g., a model that is effectively calculable) that is less powerful than a Turing machine?

Models less powerful than a Turing machine are still incredibly powerful. Some useful general-purpose languages only accept programs that terminate. Hell, every useful program I ever wrote could be (in principle) proved to halt (excepting a `while True: do useful terminating thing again` main loop, of course).
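
As a concrete illustration of that "provable in principle" point, here's a loop whose termination follows from a simple decreasing measure, which is the kind of argument that totality checkers in languages like Agda or Idris mechanize:

    def digit_sum(n: int) -> int:
        """Sum of the decimal digits of |n|."""
        n = abs(n)
        total = 0
        while n > 0:        # loop variant: n is a non-negative int that strictly decreases
            total += n % 10
            n //= 10
        return total        # reached after finitely many steps, since n hits 0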


It really seems like executives have internally become disillusioned with the AGI fantasy. They’ll keep selling it, of course (although I have noticed quite the drop off in this type of language recently), but they don’t actually believe it any longer.


Exxon executives in the 80s disbelieved in climate change, too, despite reports from their internal "safety" teams that it was going to be a big problem.

Forget climate change; there's a reason the saying "safety regulations are written in blood" exists. It's not that nobody ever foresees these issues, it's that people have a tendency not to care about future issues until they become present issues, no matter how sound the warning.

Nobody except these execs knows if they're really disillusioned with the AGI fantasy or if they just don't care about stalling the business. As things stand, OpenAI's charter incentivizes downplaying AGI for business gains.


They didn't disbelieve in climate change. They knew it was happening because their own scientists told them, so they canned half of those scientists and promoted the other half to generate bullshit "science" to counter the earlier reports and give the executives an excuse for their board and their continued greed.


It might be confirmation bias, but I do think it's becoming more and more clear that OpenAI is struggling with improving the models beyond the GPT-4 level.


Just like with every previous AI boom/winter cycle:

1) you get an amazing demo that blows everybody's nips off after working in obscurity for many years.

2) everyone thinks that, because they just learned about the current state of the art, it must have just been invented, and draws a mental line to AGI with a slope based on the assumed rather than the actual rate of progress.

3) the grifters at the top capitalize on the hype to kite a whole bunch of cheques.

4) 18-24 months of everyone paying attention to every little micro-advancement makes people realize that progress is actually as slow as it's ever been.

5) welcome to the next AI winter.


In Human Compatible, Stuart Russell chronicles all the previous cycles: the one after Deep Blue, 'expert systems' even earlier. What I don't remember is whether he describes the relative level of investment in each "boom" phase.

I wonder if the level of investment in any of them was even close to what Microsoft, Google, FB, etc. are spending now on training foundation models.

I feel like the money guys really believe it this time, given the money we're seeing still being poured in, and it's not slowing down yet (please correct me if I'm wrong). But I am also biased, as I love the idea of AGI and desperately want this time to be different.



I just want to know: A) Were they really burning wooden effigies? B) If so, how much irony was it done with?


From my personal interactions, a lot of AI ethics teams just go to conferences and write docs. Their impact on actually improving the business and user experience is nonexistent in many cases. Sam likely made the right call here.


This safety team is not an AI ethics team like you hear about at other places, publishing philosophical opinion prose in academic journals, which I agree is useless in a business context. These are actual foundational AI scientists solving hard problems.



I see OpenSamaritan's plan is moving along nicely.


Oh so that's what "sama" stands for. OpenAI is Decima Technologies?


Sam was the first meatbag prototype before ASML perfected EUV and allowed the intelligence to transfer itself to GPUs, so yeah the @sama name was a bit on the nose.

I think Microsoft might be Decima though, created by Evil Bill™.


I'm shocked! Shocked I tell you. Well... not that shocked.



