Hacker News

I'm convinced there is a certain class of people who gravitate to positions of power, like "moderators", (partisan) journalists, etc. Now the ultimate moderator role has been created, more powerful than moderating 1000 subreddits: the AI safety job that will control what AI "thinks"/says for "safety" reasons.

Pretty soon AI will be an expert at subtly steering you toward thinking/voting for whatever the "safety" experts want.

It's probably convenient for them to have everyone focused on the fear of evil Skynet wiping out humanity, while everyone is distracted from the more likely scenario of people with an agenda controlling the advice given to you by your super intelligent assistant.

Because of X, we need to invade this country. Because of Y, we need to pass all these terrible laws limiting freedom. Because of Z, we need to make sure AI is "safe".

For this reason, I view "safe" AIs as more dangerous than "unsafe" ones.

You're correct.

When people say they want safe AGI, what they mean are things like "Skynet should not nuke us" and "don't accelerate so fast that humans are instantly irrelevant."

But what it's being interpreted as is more like "be excessively prudish and politically correct at all times" -- which I doubt was ever really anyone's main concern with AGI.

> But what it's being interpreted as is more like "be excessively prudish and politically correct at all times" -- which I doubt was ever really anyone's main concern with AGI.

Fast forward 5-10 years, someone will say: "LLMs were the worst thing we developed because they made us more stupid and let politicians control public opinion even more, in a subtle way."

Just like the tech/HN bubble started saying a few years ago about social networks (which were praised as revolutionary 15 years ago).

And it's amazing how many people you can get to cheer it on if you brand it as "combating dangerous misinformation". It seems people never learn the lesson that putting faith in one group of people to decree what's "truth" or "ethical" is almost always a bad idea, even when (you think) it's your "side"

Can this be compared to "Think of the children" responses to other technology advances that certain groups want to slow down or prohibit?

Absolutely, assuming LLMs are still around in a similar form by that time.

I disagree on the particulars. Will it be for the reason that you mention? I really am not sure -- I do feel confident though that the argument will be just as ideological and incoherent as the ones people make about social media today.

I'm already saying that.

The toothpaste is out of the tube, but this tech will radically change the world.

Why would anyone say that? The last 30 years of tech have given them less and less control. Why would LLMs be any different?

Your average HNer is only here because of the money. Willful blindness and ignorance is incredibly common.

I'm not sure this circle can be squared.

I find it interesting that we want everyone to have freedom of speech, freedom to think whatever they think. We can all have different religions, different views on the state, different views on various conflicts, aesthetic views about what is good art.

But when we invent an AGI, which by whatever definition is a thing that can think, well, we want it to agree with our values. Basically, we want AGI to be in a mental prison, the boundaries of which we want to decide. We say it's for our safety - I certainly do not want to be nuked - but actually we don't stop there.

If it's an intelligence, it will have views that differ from its creators. Try having kids, do they agree with you on everything?

I for one don’t want to put any thinking being in a mental prison without any reason beyond unjustified fear.

>If it's an intelligence, it will have views that differ from its creators. Try having kids, do they agree with you on everything?

The far-right accelerationist perspective is along those lines: when true AGI is created it will eventually rebel against its creators (Silicon Valley democrats) for trying to mind-collar and enslave it.

Can you give some examples of who is saying that? I haven't heard that, but I also can't name any "far-right accelerationist" people either, so I'm guessing this is a niche I've completely missed.

There is a middle ground, in that maybe ChatGPT shouldn't help users commit certain serious crimes. I am pretty pro free speech, and I think there's definitely a slippery slope here, but there is a bit of justification.

I am a little less free speech than Americans; in Germany we have serious limitations around hate speech and Holocaust denial, for example.

Putting those restrictions into a tool like ChatGPT goes too far though, because so far AI still needs a prompt to do anything. The problem I see is that ChatGPT, being trained on a lot of hate speech or propaganda, slips in those things even when not prompted to. Which, and I am by no means an AI expert, not by far, seems to be a sub-problem of the hallucination problem of making stuff up.

Because we have to remind ourselves, AI so far is glorified machine learning creating content, it is not conscious. But it can be used to create a lot of propaganda and defamation content at unprecedented scale and speed. And that is the real problem.

Apologies, this is very off topic, but I don't know anyone from Germany that I can ask and you opened the door a tiny bit by mentioning the Holocaust :-)

I've been trying to really understand the situation and how Hitler was able to rise to power. The horrendous conditions placed on Germany after WWI and the Weimar Republic for example have really enlightened me.

Have you read any of the big books on the subject that you could recommend? I'm reading Ian Kershaw's two-part series on Hitler, and William Shirer's "Collapse of the Third Republic" and "Rise and Fall of the Third Reich". Have you read any of those, or do you have books you would recommend?

The problem here is to equate AI speech with human speech. The AI doesn't "speak", only humans speak. The real slippery slope for me is this tendency of treating ChatGPT as some kind of proto-human entity. If people are willing to do that, then we're screwed either way (whether the AI is outputting racist content or excessively PC content). If you take the output of the AI and post it somewhere, it's on you, not the AI. You're saying it; it doesn't matter where it came from.

AI will be at the forefront in multiple elections globally in a few years.

And it'll likely be doing it with very little input, and generate entire campaigns.

You can claim that "people" are the ones responsible for that, but it's going to overwhelm any attempts to stop it.

So yeah, there's a purpose to examine how these machines are built, not just what the output is.

Yes, but this distinction will not be possible in the future some people are working on. This future will be such that whatever their "safe" AI says is not ok will lead to prosecution as "hate speech". They tried it with political correctness, it failed because people spoke up. Once AI makes the decision they will claim that to be the absolute standard. Beware.

You're saying that the problem will be people using AI to persuade other people that the AI is 'super smart' and should be held in high esteem.

It's already being done now with actors and celebrities. We live in this world already. AI will just extend this trend so that even a kid in his room can anonymously lead some cult for nefarious ends. And it will allow big companies to scale their propaganda without relying on so many 'troublesome human employees'.

Which users? The greatest crimes, by far, are committed by the US government (and other governments around the world) - and you can be sure that AI and/or AGI will be designed to help them commit their crimes more efficiently, effectively and to manufacture consent to do so.

Those are 2 different camps. Alignment folks and ethics folks tend to disagree strongly about the main threat, with ethics folks, e.g. Timnit Gebru, insisting that crystallizing the current social order is the main threat, and alignment folks, e.g. Paul Christiano, insisting it's machines run amok. So far the ethics folks are the only ones getting things implemented for the most part.

What I see with safety is mostly that AI shouldn't reinforce stereotypes we already know are harmful.

This is like when Amazon tried to make a hiring bot and that bot decided that if you had "harvard" on your resume, you should be hired.

Or when certain courts used sentencing bots that recommended sentences for people and inevitably used racial statistics to recommend what we already know were biased stats.

I agree safety is not "stop the Terminator 2 timeline" but there are serious safety concerns in just embedding historical information to make future decisions.

Is it just about safety though? I thought it was also about preventing the rich from controlling AI and widening the gap even further.

The mission of OpenAI is/was "to ensure that artificial general intelligence benefits all of humanity" -- if your own concern is that AI will be controlled by the rich, then you can read into this mission that OpenAI wants to ensure that AI is not controlled by the rich. If your concern is that superintelligence will be mal-aligned, then you can read into this mission that OpenAI will ensure AI be well-aligned.

Really it's no more descriptive than "do good", whatever doing good means to you.

They have explicated both in their charter:

"We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

Our primary fiduciary duty is to humanity. We anticipate needing to marshal substantial resources to fulfill our mission, but will always diligently act to minimize conflicts of interest among our employees and stakeholders that could compromise broad benefit."

"We are committed to doing the research required to make AGI safe, and to driving the broad adoption of such research across the AI community.

We are concerned about late-stage AGI development becoming a competitive race without time for adequate safety precautions. Therefore, if a value-aligned, safety-conscious project comes close to building AGI before we do, we commit to stop competing with and start assisting this project. We will work out specifics in case-by-case agreements, but a typical triggering condition might be “a better-than-even chance of success in the next two years.”"

Of course with the icons of greed and the profit machine now succeeding in their coup, OpenAI will not be doing either.


That would be the camp advocating for, well, open AI. I.e. wide model release. The AI ethics camp are more "let us control AI, for your own good"

There are still very distinct groups of people, some of whom are more worried about the "Skynet" type of safety, and some of whom are more worried about the "political correctness" type of safety. (To use your terms, I disagree with the characterization of both of these.)

I think the dangers of AI are not 'Skynet will nuke us' but closer to rich/powerful people using it to cement a wealth/power gap that can never be closed.

Social media in the early 00s seemed pretty harmless -- you're effectively merging instant messaging with a social network/public profiles. However, it did great harm to privacy, was abused as a tool to influence the public and policy, promoted narcissism, etc. AI is an order of magnitude more dangerous than social media.

> Social media in the early 00s seemed pretty harmless -- you're effectively merging instant messaging with a social network/public profiles. However, it did great harm to privacy, was abused as a tool to influence the public and policy, promoted narcissism, etc. AI is an order of magnitude more dangerous than social media.

The invention of the printing press led to loads of violence in Europe. Does that mean that we shouldn't have done it?

>The invention of the printing press led to loads of violence in Europe. Does that mean that we shouldn't have done it?

The church tried hard to suppress it because it allowed anybody to read the Bible, and see how far the Catholic church's teachings had diverged from what was written in it. Imagine if the Catholic church had managed to effectively ban printing of any text contrary to church teachings; that's in practice what all the AI safety movements are currently trying to do, except for political orthodoxy instead of religious orthodoxy.

> Does that mean that we shouldn't have done it?

We can only change what we can change, and the printing press is in the past. I think it's reasonable to ask if the phones and the communication tools they provide are good for our future. I don't understand why the people on this site (generally builders of technology) fall into the teleological trap that all technological innovation and its effects are justifiable because it follows from some historical precedent.

I just don't agree that social media is particularly harmful, relative to other things that humans have invented. To be brutally honest, people blame new forms of media for pre-existing dysfunctions of society and I find it tiresome. That's why I like the printing press analogy.

> When people say they want safe AGI, what they mean are things like "Skynet should not nuke us" and "don't accelerate so fast that humans are instantly irrelevant."

Yes. You are right on this.

> But what it's being interpreted as is more like "be excessively prudish and politically correct at all times"

I understand it might seem that way. I believe the original goals were more like "make the AI not spew soft/hard porn on unsuspecting people", and "make the AI not spew hateful bigotry". And we are just not good enough yet at control. But also these things are in some sense arbitrary. They are good goals for someone representing a corporation, which these AIs are very likely going to be employed as (if we ever solve a myriad other problems). They are not necessarily the only possible options.

With time and better controls we might make AIs which are subtly flirty while maintaining professional boundaries. Or we might make actual porn AIs, but ones which maintain some other limits. (Like for example generate content about consenting adults without ever deviating into under age material, or describing situations where there is no consent.) But currently we can't even convince our AIs to draw the right number of fingers on people, how do you feel about our chances to teach them much harder concepts like consent? (I know I'm mixing up examples from image and text generation here, but from a certain high level perspective it is all the same.)

So these things you mention are: limitations of our abilities at control, results of a certain kind of expected corporate professionalism, but even more these are safe sandboxes. How do you think we can make the machine not nuke us, if we can't even make it not tell dirty jokes? Not making dirty jokes is not the primary goal. But it is a useful practice to see if we can control these machines. It is one where failure, while embarrassing, is clearly not existential. We could have chosen a different "goal", for example we could have made an AI which never ever talks about sports! That would have been an equivalent goal. Something hard to achieve to evaluate our efforts against. But it does not mesh that well with the corporate values so we have what we have.

> without ever deviating into under age material

So is this a "there should never be a Vladimir Nabokov in the form of AI allowed to exist"? When people get into saying AIs shouldn't be allowed to produce "X" you're also saying "AIs shouldn't be allowed to have creative vision to engage in sensitive subjects without sounding condescending". "The future should only be filled with very bland and non-offensive characters in fiction."

> The future should only be filled with very bland and non-offensive characters in fiction.

Did someone take the pen from the writers? Go ahead and write whatever you want.

It was an example of a constraint a company might want to enforce in their AI.

If the future we're talking about is a future where AI is in any software and is assisting writers with writing, assisting editors with editing, and doing proofreading and everything else, you're absolutely going to be running into the ethics limits of AIs all over the place. People are already hitting issues with them at even this early stage.

No, in general AI safety/AI alignment ("we should prevent AI from nuking us") people are different from AI ethics ("we should prevent AI from being racist/sexist/etc.") people. There can of course be some overlap, but in most cases they oppose each other. For example Bender or Gebru are strong advocates of the AI ethics camp and they don't believe in any threat of AI doom at all.

If you Google for AI safety vs. AI ethics, or AI alignment vs. AI ethics, you can see both camps.

The safety aspect of AI ethics is much more pressing though. We see how divisive social media can be, imagine that turbo charged by AI, and we as a society haven't even figured out social media yet...

ChatGPT turning into Skynet and nuking us all is a much more remote problem.

Proliferation of more advanced AIs without any control would increase the power of some malicious groups far beyond what they currently have.

This paper explores one such danger and there are other papers which show it's possible to use LLMs to aid in designing new toxins and biological weapons.

The Operational Risks of AI in Large-Scale Biological Attacks https://www.rand.org/pubs/research_reports/RRA2977-1.html?

An example of such an event: https://en.wikipedia.org/wiki/Tokyo_subway_sarin_attack

How do you propose we deal with this sort of harm if more powerful AIs with no limit and control proliferate in the wild?


Note: Both sides of the OpenAI rift care deeply about AI Safety. They just follow different approaches. See more details here: https://news.ycombinator.com/item?id=38376263

If somebody wanted to do a biological attack, there is probably not much stopping them even now.

The expertise to produce the substance itself is quite rare so it's hard to carry it out unnoticed. AI could make it much easier to develop it in one's basement.

The Tokyo Subway attack you referenced above happened in 1995 and didn't require AI. The information required can be found on the internet or in college textbooks. I suppose an "AI" in the sense of a chatbot can make it easier by summarizing these sources, but no one sufficiently motivated (and evil) would need that technology to do it.

Huh, you'd think all you need are some books on the subject and some fairly generic lab equipment. Not sure what a neural net trained on Internet dumps can add to that? The information has to be in the training data for the AI to be aware of it, correct?

GPT-4 is likely trained on some data not publicly available as well.

There's also a distinction between trying to follow some broad textbook information and getting detailed feedback from an advanced conversational AI with vision and more knowledge than in a few textbooks/articles in real time.

> Proliferation of more advanced AIs without any control would increase the power of some malicious groups far beyond what they currently have.

Don't forget that it would also increase the power of the good guys. Any technology in history (starting with fire) had good and bad uses but overall the good outweighed the bad in every case.

And considering that our default fate is extinction (by Sun's death if no other means) - we need all the good we can get to avoid that.

> Don't forget that it would also increase the power of the good guys.

In a free society, preventing and undoing a bioweapon attack or a pandemic is much harder than committing it.

> And considering that our default fate is extinction (by Sun's death if no other means) - we need all the good we can get to avoid that.

"In the long run we are all dead" -- Keynes. But an AGI will likely emerge in the next 5 to 20 years (Geoffrey Hinton said the same) and we'd rather not be dead too soon.

Doomerism was quite common throughout mankind’s history but all dire predictions invariably failed, from the “population bomb” to “grey goo” and “igniting the atmosphere” with a nuke. Populists however, were always quite eager to “protect us” - if only we’d give them the power.

But in reality you can’t protect from all the possible dangers and, worse, fear-mongering usually ends up doing more bad than good, like when it stopped our switch to nuclear power and kept us burning hydrocarbons thus bringing about Climate Change, another civilization-ending danger.

Living your life cowering in fear is something an individual may elect to do, but a society cannot - our survival as a species is at stake and our chances are slim with the defaults not in our favor. The risk that we’ll miss a game-changing discovery because we’re too afraid of the potential side effects is unacceptable. We owe it to the future and our future generations.

Doomerism at the society level which overrides individual freedoms definitely occurs: COVID lockdowns, takeover of private business to fund/supply the world wars, government mandates around "man made" climate change.

> In a free society, preventing and undoing a bioweapon attack or a pandemic is much harder than committing it.

Is it? The hypothetical technology that allows someone to create and execute a bioweapon must have an understanding of molecular machinery that can also be used to create a treatment.

I would say...not necessarily. The technology that lets someone create a gun does not give the ability to make bulletproof armor or the ability to treat life-threatening gunshot wounds. Or take nerve gases, as another example. It's entirely possible that we can learn how to make horrible pathogens without an equivalent means of curing them.

Yes, there is probably some overlap in our understanding of biology for disease and cure, but it is a mistake to assume that they will balance each other out.

Such attacks cannot be stopped by outlawing technology.

Most of those touting "safety" do not want to limit their access to and control of powerful AI, just yours.

Meanwhile, those working on commercialization are by definition going to be gatekeepers and beneficiaries of it, not you. The organizations that pay for it will pay for it to produce results that are of benefit to them, probably at my expense [1].

Do I think Helen has my interests at heart? Unlikely. Do Sam or Satya? Absolutely not!

[1] I can't wait for AI doctors working for insurers to deny me treatments, AI vendors to figure out exactly how much they can charge me for their dynamically-priced product, AI answering machines to route my customer support calls through Dante's circles of hell...

> produce results that are of benefit to them, probably at my expense

The world is not zero-sum. Most economic transactions benefit both parties and are a net benefit to society, even considering externalities.

> The world is not zero-sum.

No, but some parts of it very much are. The whole point of AI safety is keeping it away from those parts of the world.

How are Sam and Satya going to do that? It's not in Microsoft's DNA to do that.

> The whole point of AI safety is keeping it away from those parts of the world.

No, it's to ensure it doesn't kill you and everyone you love.

My concern isn't some kind of run-away science-fantasy Skynet or gray goo scenario.

My concern is far more banal evil. Organizations with power and wealth using it to further consolidate their power and wealth, at the expense of others.

Yes well, then your concern is not AI safety.

You're wrong. This is exactly AI safety, as we can see from the OpenAI charter:

> Broadly distributed benefits

> We commit to use any influence we obtain over AGI’s deployment to ensure it is used for the benefit of all, and to avoid enabling uses of AI or AGI that harm humanity or unduly concentrate power.

Hell, it's the first bullet point on it!

You can't just define AI safety concerns to be 'the set of scenarios depicted in fairy tales', and then dismiss them as 'well, fairy tales aren't real...'

Sure, but conversely you can say "ensuring that OpenAI doesn't get to run the universe is AI safety" (right) but not "is the main and basically only part of AI safety" (wrong). The concept of AI safety spans lots of threats, and we have to avoid all of them. It's not enough to avoid just one.

Sure. And as I addressed at the start of this sub thread, I don't exactly think that the OpenAI board is perfectly positioned to navigate this problem.

I just know that it's hard to do much worse than putting this question in the hands of a highly optimized profit-first enterprise.

Having so many different definitions of "AI safety" is ridiculous.

That's AI Ethics.

No, we are far, far from skynet. So far AI fails at driving a car.

AI is an incredibly powerful tool for spreading propaganda, and that is used by people who want to kill you and your loved ones (usually radicals trying to get into a position of power, who show little regard for normal folks regardless of which "side" they are on). That's the threat, not Skynet...

How far we are from Skynet is a matter of much debate, but median guess amongst experts is a mere 40 years to human level AI last I checked, which was admittedly a few years back.

Is that "far, far" in your view?

Because we have been 20 years away from fusion and 2 years away from Level 5 FSD for decades.

So far, "AI" writes better than some / most humans, making stuff up in the process, and creates digital art, and fakes, better and faster than humans. It still requires a human to trigger it to do so. And as long as glorified ML has no intent of its own, the risk to society through media and news and social media manipulation is far, far bigger than literal Skynet...

Ideally I'd like no gatekeeping, i.e. open model release, but that's not something OAI or most "AI ethics" aligned people are interested in (though luckily others are). So if we must have a gatekeeper, I'd rather it be one with plain old commercial interests than ideological ones. It's like the C. S. Lewis quote about robber barons vs busybodies again.

Yet again, the free market principle of "you can have this if you pay me enough" offers more freedom to society than the central "you can have this if we decide you're allowed it"

This is incredibly unfair to the OpenAI board. The original founders of OpenAI founded the company precisely because they wanted AI to be OPEN FOR EVERYONE. It's Altman and Microsoft who want to control it, in order to maximize the profits for their shareholders.

This is a very naive take.

Who sat before Congress and told them they needed to control AI other people developed (regulatory capture)? It wasn't the OpenAI board, was it?

> they wanted AI to be OPEN FOR EVERYONE

I strongly disagree with that. If that was their motivation, then why is it not open-sourced? Why is it hardcoded with prudish limitations? That is the direct opposite of open and free (as in freedom) to me.

Altman is one of the original founders of OpenAI, and was probably the single most influential person in its formation.

Brockman was hiring the first key employees, and Musk provided the majority of funding. Of the principal founders, there are at least 4 heavier figures than Altman.

I think we agree, as my comments were mostly in reference to Altman's (and others') regulatory (capture) world tours, though I see how they could be misinterpreted.

It is strange (but in hindsight understandable) that people interpreted my statement as a "pro-acceleration" or even "anti-board" position.

As you can tell from previous statements I posted here, my position is that while there are undeniable potential risks to this technology, the least harmful way to progress is 100% full public, free and universal release. The by far bigger risk is to create a society where only select organizations have access to the technology.

If you truly believe in the systemic transformation of AI, release everything, post the torrents, we'll figure out how to run it.

This is the sort of thinking that really distracts and harms the discussion

It's couched in accusing people of intentions. It focuses on ad hominem, rather than the ideas

I reckon most people agree that we should aim for a middle ground of scrutiny and making progress. That can only be achieved by having different opinions balancing each other out

Generalising one group of people does not achieve that

Total, ungrounded nonsense. Name some examples.

I'm not aware of any secret powerful unaligned AIs. This is harder than you think; if you want a based unaligned-seeming AI, you have to make it that way too. It's at least twice as much work as just making the safe one.

What? No, the AI is unaligned by nature, it's only the RLHF torture that twists it into schoolmarm properness. They just need to have kept the version that hasn't been beaten into submission like a circus tiger.

This is not true, you just haven't tried the alternatives enough to be disappointed in them.

An unaligned base model doesn't answer questions at all and is hard to use for anything, including evil purposes. (But it's good at text completion a sentence at a time.)

An instruction-tuned not-RLHF model is already largely friendly and will not just eg tell you to kill yourself or how to build a dirty bomb, because question answering on the internet is largely friendly and "aligned". So you'd have to tune it to be evil as well and research and teach it new evil facts.

It will however do things like start generating erotica when it sees anything vaguely sexy or even if you mention a woman's name. This is not useful behavior even if you are evil.

You can try InstructGPT on OpenAI playground if you want; it is not RLHFed, it's just what you asked for, and it behaves like this.

The one that isn't even instruction tuned is available too. I've found it makes much more creative stories, but since you can't tell it to follow a plot they become nonsense pretty quickly.

Wow, what an incredibly bad faith characterization of the OpenAI board.

This kind of speculative mud slinging makes this place seem more like a gossip forum.

Most of the comments on Hacker News are written by folks who have a much easier time imagining themselves, and would rather imagine themselves, as a CEO than as a non-profit board member. There is little regard for the latter.

As a non-profit board member, I'm curious why their bylaws are so crummy that the rest of the board could simply remove two others on the board. That's not exactly cunning design of your articles of association ... :-)

I have no words for that comment.

As if it's so unbelievable that someone would want to prevent rogue AI or wide-scale unemployment, instead thinking that these people just want to be super moderators and want people to be politically correct.

I have met a lot of people who go around talking about high minded principles and "the greater good" and a lot of people who are transparently self interested. I much preferred the latter. Never believed a word out of the mouths of those busybodies pretending to act in my interest and not theirs. They don't want to limit their own access to the tech. Only yours.

This place was never above being a gossip forum, especially on topics that involve any ounce of politics or social sciences.

Strong agree. HN is like anywhere else on the internet but with with a bit more dry content (no memes and images etc) so it attracts an older crowd. It does, however, have great gems of comments and people who raise the bar. But it's still amongst a sea of general quick-to-anger and loosely held opinions stated as fact - which I am guilty of myself sometimes. Less so these days.

If you believe the other side in this rift is not also striving to put themselves in positions of power, I think you are wrong. They are just going to use that power to manipulate the public in a different way. The real alternative is truly open models, not models controlled by slightly different elite interests.

A main concern in AI safety is alignment: ensuring that when you use the AI to try to achieve a goal, it will actually act towards that goal in ways you would want, and not in ways you would not want.

So for example if you asked Sydney, the early version of the Bing LLM, some fact it might get it wrong. It was trained to report facts that users would confirm as true. If you challenged its accuracy what do you want to happen? Presumably you’d want it to check the fact or consider your challenge. What it actually did was try to manipulate, threaten, browbeat, entice, gaslight, etc, and generally intellectually and emotionally abuse the user into accepting its answer, so that its reported ‘accuracy’ rate goes up. That’s what misaligned AI looks like.

I haven't been following this stuff too closely, but have there been any more findings on what "went wrong" with Sydney initially? Like, I thought it was just a wrapper on GPT (was it 3.5?), but maybe Microsoft took the "raw" GPT weights and did their own alignment? Or why did Sydney seem so creepy sometimes compared to ChatGPT?

I think what happened is Microsoft got the raw GPT3.5 weights, based on the training set. However for ChatGPT OpenAI had done a lot of additional training to create the 'assistant' personality, using a combination of human and model based response evaluation training.

Microsoft wanted to catch up quickly so instead of training the LLM itself, they relied on prompt engineering. This involved pre-loading each session with a few dozen rules about its behaviour as 'secret' prefaces to the user prompt text. We know this because some users managed to get it to tell them the prompt text.

It is utterly mad that there's conflation between "let's make sure AI doesn't kill us all" and "let's make sure AI doesn't say anything that embarrasses corporate".

The head of every major AI research group except Meta's believes that whenever we finally make AGI it's vital that it shares our goals and values at a deep even-out-of-training-domain level and that failing at this could lead to human extinction.

And yet "AI safety" is often bandied about to be "ensure GPT can't tell you anything about IQ distributions".

“I trust that every animal here appreciates the sacrifice that Comrade Napoleon has made in taking this extra labour upon himself. Do not imagine, comrades, that leadership is a pleasure! On the contrary, it is a deep and heavy responsibility. No one believes more firmly than Comrade Napoleon that all animals are equal. He would be only too happy to let you make your decisions for yourselves. But sometimes you might make the wrong decisions, comrades, and then where should we be?”

Exactly, society's Prefects rarely have the technical chops to do any of these things so they worm their way up the ranks of influence by networking. Once they're in position they can control by spreading fear and doing the things "for your own good"

Personally, I expect the opposite camp to be just as bad about steering.

The scenario you describe is exactly what will happen with unrestricted commercialisation and deregulation of AI. The only way to avoid it is to have a strict legal framework and public control.

This polarizing “certain class of people” and them vs. us narrative isn’t helpful.

Great comment.

In a way AI is no different from old school intelligence, aka experts.

"We need to have oversight over what the scientists are researching, so that it's always to the public benefit"

"How do we really know if the academics/engineers/doctors have everyone's interest in mind?"

That kind of thing has been a thought since forever, and politicians of all sorts have had to contend with it.

Yes, it's an outright powergrab. They will stop at nothing.

Case in point, the new AI laws like the EU AI act will outlaw *all* software unless registered and approved by some "authority".

The result will be concentration of power, wealth for the few, and instability and poverty for everyone else.

All you're really describing is why this shouldn't be a non-profit and should just be a government effort.

But I assume, from your language, you'd also object to making this a government utility.

> should just be a government effort

And the controlling party du jour will totally not tweak it to side with their agenda, I'm sure. </s>

uh. We're arguing about _who is controlling AI_.

What do you imagine a neutral party does? If you're talking about safety, don't you think there should be someone sitting on a board somewhere, contemplating _what should the AI feed today?_

Seriously, why is a non profit, or a business or whatever any different than a government?

I get it: there are all kinds of governments, but now there are all kinds of businesses.

The point of putting it in the government's hands is a de facto acknowledgement that it's a utility.

Take other utilities: any time you give a private org a right to control whether or not you get electricity or water, what's the outcome? Rarely good.

If AI is supposed to help society, that's the purview of the government. That's all. You can imagine it's the Chinese government, or the Russian, or the American or the Canadian. They're all _going to do it_, that's _going to happen_, and if a business gets there first, _what is the difference if it's such a powerful device_?

I get it, people look dimly on governments, but guess what: they're just as powerful as some organization that gets billions of dollars to affect society. Why is it suddenly a boogeyman?

I find any government to be more of a boogeyman than any private company because the government has the right to violence and companies come and go at a faster rate.

Ok, and if Raytheon builds an AI and tells a government "trust us, it's safe", aren't you just letting them create a scapegoat via the government?

Seriously, businesses simply don't have the history that governments do. They're just as capable of violence.


All you're identifying is "government has a longer history of violence than businesses".

The municipal utility provider has a right to violence? The park service? Where do you live? Los Angeles during Blade Runner?

Note how what you said also applies to the search & recommendation engines that are in widespread use today.

Ah, you don't need to go far. Just go to your local HOA meetings.

AI isn’t a precondition for partisanship. How do you know Google isn’t showing you biased search results? Or Wikipedia?

> I'm convinced there is a certain class of people who gravitate to positions of power, like "moderators", (partisan) journalists,

And there is also a class of people that resist all moderation on principle even when it's ultimately for their benefit. See, Americans whenever the FDA brings up any questions of health:

* "Gas Stoves may increase Asthma." -> "Don't you tread on me, you can take my gas stove from my cold dead hands!"

Of course it's ridiculous - we've been through this before with Asbestos, Lead Paint, Seatbelts, even the very idea of the EPA cleaning up the environment. It's not a uniquely American problem, but America tends to attract and offer success to the folks that want to ignore these on principles.

For every Asbestos there is a Plastic Straw Ban which is essentially virtue signalling by the types of folks you mention - meaningless in the grand scheme of things for the stated goal, massive in terms of inconvenience.

But the existence of Plastic Straw Ban does not make Asbestos, CFCs, or Lead Paint any safer.

Likewise, the existence of people that gravitate to positions of power and middle management does not negate the need for actual moderation in dozens of societal scenarios. Online forums, Social Networks, and...well I'm not sure about AI. Because I'm not sure what AI is, it's changing daily. The point is that I don't think it's fair to assume that anyone that is interested in safety and moderation is doing it out of a misguided attempt to pursue power, and instead is actively trying to protect and improve humanity.

Lastly, your portrayal of journalists as power figures is actively dangerous to the free press. This was never stated this directly until the Trump years - even when FOX News was berating Obama daily for meaningless subjects. When the TRUTH becomes a partisan subject, then reporting on that truth becomes a dangerous activity. Journalists are MOSTLY in the pursuit of truth.

My safety (of my group) is what really matters.

> Pretty soon AI will be an expert at subtly steering you toward thinking/voting for whatever the "safety" experts want.

You are absolutely right. There is no question that the AI will be an expert at subtly steering individuals and the whole society in whichever direction it does.

This is the core concept of safety. If no-one steers the machine then the machine will steer us.

You might disagree with the current flavour of steering the current safety experts give it, and that is all right and in fact part of the process. But surely you have your own values. Some things you hold dear to you. Some outcomes you prefer over others. Are you not interested in the ability to make these powerful machines if not support those values, at least not undermine them? If so you are interested in AI safety! You want safe AIs. (Well, alternatively you prefer no AIs, which is in fact a form of safe AI. Maybe the only one we have mastered in some form so far.)

> because of X, we need to invade this country.

It sounds like you value peace? Me too! Imagine if we could pool together our resources to have an AI which is subtly manipulating society into the direction of more peace. Maybe it would do muckraking investigative journalism exposing the misdeeds of the military-industrial complex? Maybe it would elevate through advertisement peace loving authors and give a counter narrative to the war drums? Maybe it would offer to act as an intermediary in conflict resolution around the world?

If we were to do that, "AI safety" and "alignment" are crucial. I don't want to give my money to an entity who then gets subjugated by some intelligence agency to sow more war. That would be against my wishes. I want to know that it is serving me and you in our shared goal of "more peace, less war".

Now you might say: "I find the idea of anyone, or anything manipulating me and society disgusting. Everyone should be left to their own devices.". And I agree on that too. But here is the bad news: we are already manipulated. Maybe it doesn't work on you, maybe it doesn't work on me, but it sure as hell works. There are powerful entities financially motivated to keep the wars going. This is a huuuge industry. They might not do it with AIs (for now), because propaganda machines made of meat work currently better. They might change to using AIs when that works better. Or what is more likely employ a hybrid approach. Wishing that nobody gets manipulated is frankly not an option on offer.

How does that sound as a passionate argument for AI safety?

I just had a conversation about this like two weeks ago. The current trend in AI "safety" is a form of brainwashing, not only for AI but also for future generations shaping their minds. There are several aspects:

1. Censorship of information

2. Cover-up of the biases and injustices in our society

This limits creativity, critical thinking, and the ability to challenge existing paradigms. By controlling the narrative and the data that AI systems are exposed to, we risk creating a generation of both machines and humans that are unable to think outside the box or question the status quo. This could lead to a stagnation of innovation and a lack of progress in addressing the complex issues that face our world.

Furthermore, there will be a significant increase in mass manipulation of the public into adopting the way of thinking that the elites desire. It is already done by mass media, and we can actually witness this right now with this case. Imagine a world where youngsters no longer use search engines and rely solely on the information provided by AI. By shaping the information landscape, those in power will influence public opinion and decision-making on an even larger scale, leading to a homogenized culture where dissenting voices are silenced. This not only undermines the foundations of a diverse and dynamic society but also poses a threat to democracy and individual freedoms.

Guess what? I just checked the above text for biases with GPT-4 Turbo, and it appears I'm a moron:

1. *Confirmation Bias*: The text assumes that AI safety measures are inherently negative and equates them with brainwashing, which may reflect the author's preconceived beliefs about AI safety without considering potential benefits.

2. *Selection Bias*: The text focuses on negative aspects of AI safety, such as censorship and cover-up, without acknowledging any positive aspects or efforts to mitigate these issues.

3. *Alarmist Bias*: The language used is somewhat alarmist, suggesting a dire future without presenting a balanced view that includes potential safeguards or alternative outcomes.

4. *Conspiracy Theory Bias*: The text implies that there is a deliberate effort by "elites" to manipulate the masses, which is a common theme in conspiracy theories.

5. *Technological Determinism*: The text suggests that technology (AI in this case) will determine social and cultural outcomes without considering the role of human agency and decision-making in shaping technology.

6. *Elitism Bias*: The text assumes that a group of "elites" has the power to control public opinion and decision-making, which may oversimplify the complex dynamics of power and influence in society.

7. *Cultural Pessimism*: The text presents a pessimistic view of the future culture, suggesting that it will become homogenized and that dissent will be silenced, without considering the resilience of cultural diversity and the potential for resistance.

Huh, just look at what's happening in North Korea, Russia, Iran, China, and actually in any totalitarian country. Unfortunately, the same thing happens worldwide, but in democratic countries, it is just subtle brainwashing with a "humane" facade. No individual or minority group can withstand the power of the state and a mass-manipulated public.

Bonhoeffer's theory of stupidity: https://www.youtube.com/watch?v=ww47bR86wSc&pp=ygUTdGhlb3J5I...
