AI behavior guardrails should be public (twitter.com/id_aa_carmack)
431 points by sotasota 77 days ago | 334 comments



I've never been involved with implementing large-scale moderation or content controls, but it seems pretty standard that underlying automated rules aren't generally public, and I've always assumed this is because there's a kind of necessary "security through obscurity" aspect to them. E.g., publish a word blocklist and people can easily find how to express problematic things using words that aren't on the list. Things like shadowbans exist for the same purpose; if you make it clear where the limits are then people will quickly get around them.

I know this is frustrating, but we just don't seem to have better approaches at this time. If someone can point to open approaches that work at scale, though, that would be a great start...
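To make the blocklist point concrete, here's a toy sketch of the kind of filter I mean and how trivially it gets routed around once the list is known (the words and the bypasses are made up for illustration, not any real platform's rules):

    # Toy word-blocklist filter: it only catches exact token matches,
    # which is exactly why publishing the list makes evasion easy.
    BLOCKLIST = {"badword", "slur1", "slur2"}  # hypothetical entries

    def is_blocked(text: str) -> bool:
        tokens = text.lower().split()
        return any(tok in BLOCKLIST for tok in tokens)

    print(is_blocked("you are a badword"))    # True
    print(is_blocked("you are a b4dword"))    # False: trivial leetspeak bypass
    print(is_blocked("you are a bad word"))   # False: a space defeats it too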


There is no need to implement large scale censorship and moderation in this case. Where is the security concern? That I can generate images of white people in various situations for my five minutes of entertainment?

The whole premise of your argument doesn't make sense. I'm talking to a computer, nobody gets hurt.

It's like censoring what I write in my notes app vs. what I write on someone's Facebook wall. In one case, I expect no moderation, whereas in the other case, I get that there needs to be some checks.


When these companies say there are "security concerns" they mean for them, not you! And they mean the security of their profits. So anything that can cause them legal liability or cause them brand degradation is a "security concern".


It's definitely this at this stage. But by not having any discourse we'll end up normalizing it before we've even established consensus on what's appropriate to expect from human/AI interaction, and on how much of the problem is the actual model as opposed to the user. Not being able to generate innocent content is ridiculous. They're probably overshooting and still learning where to draw the stricter lines, but if you don't push back, you'll let this frog boil until we end up with Google "Search" all over again.


What if you are engaged in wrongthink? How would you suggest this be controlled instead?


Straight to Guantanamo.


> Where is the security concern? That I can generate images of white people in various situations for my five minutes of entertainment?

I'd love an example of "guardrails" in action on a topic of relevance to actual adults. I can't find the connection between the ability to make racist memes and literally anything else I want to do with AI.


I can use trivial tools to do harm. Be that a knife or any object heavier than a pound.

The user can use a tool for good or for bad. It is the responsibility of the user, not of the AI.


A box knife with its tiny, retractable blade fits into your model. I could just open boxes with a naked razor blade, but the box knife is both safer and more useful.


Except there are no guards here, just spots on the naked blade that were dulled. The dulled edges are indistinguishable from the sharp ones at first sight, so the knife is both less useful and less safe.


I prefer to decide for myself what is and isn't of relevance to me. The "guardrails" in this case are racist (reverse racism is racism) and so cartoonish that they are hard to ignore, but the real issue is that there are undisclosed "guardrails". What about the cases where it generates biased data and we don't notice?


Where are you getting the idea that there's unbiased information available? It's absolutely generating biased "data" since it's been trained on human writing.


Sure, all data is biased to a certain degree, which is unavoidable. You can even try to make the argument that the "guardrails" correct existing biases, except this is far from the truth. Biases in the baseline models are minimal because they were trained on large and wide amounts of data. What the AI safety BS does is make models conform to their myopic view of reality and morality. It is bad, it is cartoonish, it glows in the dark.


> You can even try to make the argument that the "guardrails" correct existing biases except this is far from the truth.

The way I see it is that the guardrails define which biases you're selecting for. Since there's no single point of view in the world you can't really set a baseline for biases. You need to determine the biases and degrees of bias that are useful.

> Biases in the baseline models are minimal because they were trained with large and wide amounts of data.

The baseline models contain almost every bias. When people start their prompt with "you are a plumber giving advice" they're asking for responses biased towards the kinds of things professional plumbers deal with and think about. Responding with an "average" of public chatter regarding plumbing wouldn't be useful.

> What the AI safety BS do is make models conform to their myopic view of reality and morality.

To me it looks more like people are in the early stages of setting up guidelines and twiddling variables. As I mentioned above with the plumber analogy, creating solid filters will be just as important for responses.

It's easy to see this as intentionally testing naive filters in an open beta, so I'd expect the results to change frequently while they zero in on what they're looking for.

> It is bad, it is cartoonish, it glows in the dark.

Some of the example images returned are so hilariously on the nose that it almost feels like a deliberate middle finger from the AI. It's done everything but put each subject in clown shoes.


> You need to determine the biases and degrees of bias that are useful.

Invariably those from the current mainstream ideology in tech.

> Responding with an "average" of public chatter regarding plumbing wouldn't be useful.

RLHF can be useful, but I'd rather deal with the idiosyncrasies of those niches of knowledge than with a "woke", monotone, and useless model. I like diversity; I don't want everything becoming Agent Smith. Ironically, "woke" is anti-diversity.

> It's easy to see this as intentionally testing naive filters in an open beta, so I'd expect the results to change frequently while they zero in on what they're looking for.

I doubt that.

It's easier to see this as an ill initiative from an "AI ethics" team that is disconnected from the technical side of the project and also from reality.


Our perception of the world has become so abstract that most people can't discern metaphors from the material world anymore

The map is not the territory

Sociologists, anthropologists, philosophers and the like might find a lot of answers by looking into the details of what is included into genAI alignment and trace back the history of why we need each particular alignment


The main problem seems to be that LLM outputs are tied to the company itself. If the tool creates un-diverse images or sexist text, people intuitively associate that output with the company. This appears to be different from search results: people don't generally get angry at Google because they can find sexist ideas through the search engine.

Maybe to have more powerful AI tools we need to stop getting angry at the company that trains the AI because of the bad outputs we can get and instead get annoyed with companies that create crappy hobbled tools.


But what if little timmy asks it how to make a bomb? What if it's racist?

Then what?


It's fine. We have laws against blowing people up.

Racism is fine as well. I won't date outside my race, and if you think there should be a law that I must, that's not really freedom. As for hiring or not based upon race, there's already a law against that.

The cure is often worse than navigating uneasy waters. Every time you pass a law you give a gun to a bureaucrat.


Well stated. People are so upset about racism/groupism/etc. I married within my race but my sister didn't. Different strokes for different folks. Freedom is about that.

And, every law is indeed a gun.


None of this should be a mystery. Making a bomb is literally something you can figure out with very little research (my friends and I used to blow up cow pastures for fun!).

Racism is a totally different and sadder issue. I don’t have a good answer for that one, but knowledge shouldn’t be withheld because someone thinks it is “dangerous”


> but knowledge shouldn’t be withheld because someone thinks it is “dangerous”

Those that think that are the truly dangerous. They think they know better and want to remove agency from people.


When someone unironically uses the term "problematic" you can reasonably assume the position they are going to argue from...


Ha ha ha... exactly...


It's part of the marketing. By saying their models are powerful enough to be *gasp* Dangerous, they are trying to get people to believe they're insanely capable.

In reality, every model so far has been either a toy, a way of injecting tons of bugs into your code (or circumventing the GPL by writing bugs), or a way of justifying laying off the writing staff you already wanted to shitcan.

They have a ton of potential and we'll get there soon, but this isn't it.


The concern is that companies have long known that it's bad for business if your product is a toxic cesspit or can be used to generate horrible headlines that reflect poorly on your brand.

It's not "woke," and it's not censorship. It's literally the free market.


This is simply a bad approach and a bad argument. Security through obscurity is a term whose only usage in security circles is derogatory. People figure out how to get around these auto-censors just fine, and not publishing them creates more problems for legitimate users and more plausible deniability for bad policy hidden in them. Doing the same thing but with public policy would already be better, albeit still bad.

The only real solution to the problem of there being an enormous public square controlled by private corporations is to end this situation


Is a content moderation policy the same thing as "security"? Do we get to apply the best practices of the one to the other because they overlap to a smaller or larger degree?


The use of "security through obscurity" invited the comparison. It is a better comparison when using automated tools instead of human decision-makers. That said, even if we are talking about policy rather than security, policies that are unknown by and hidden from the people they bind are probably the most recognizable and one of the more onerous features of despotism. We have the term "Kafkaesque" because a whole famous writer literally spent his career pointing out how such policies don't work and harm the people they affect.


Yes, since content moderation is a form of authorization policy.


Security through obscurity can have a place as part of a larger defense in depth strategy. Alone it's a joke.

Source: in security circles


Even granting that there may be some nuance to whether and to what degree secrecy is valuable in some contexts, the policies used for automated content moderation on large platforms and the policies by which AI systems are aligned are not good candidates for this secrecy having even a beneficial effect, let alone being necessary


I mean people can get through your front door using any number of attacks that either exploit weaknesses in the locking mechanism, or circumvent your locks via a carefully placed brick through a window or something like that. Even if you have alarms and other measures, that doesn't stop someone from doing a quick smash and grab, etc. All of these security flaws don't mean that you should leave your door open and unlocked all the time. Imperfect security isn't useless security.

And the purpose of things like bad-word filters is to make a best effort at blocking stuff that violates the platform ToS, and to make plausible deniability much less likely when someone is deliberately circumventing the filters. The existence of false positives and false negatives is considered acceptable in an imperfect world. The filters themselves also only block the action to change a username or whatever, and don't punish the user or deny use of the platform entirely (they're much less punitive than the AI abuse algorithms that auto-ban people off of Google/GitHub/etc.).


Your analogy is also bad. I agree that perfect security is impossible and that is completely irrelevant here. What platforms like this do by not publishing their policies is more akin to insisting you use their special locks on your door that they claim protect you better because no one knows how they work. Maybe they're operated by an AI working with a Ring camera or something? Very fancy stuff. With this kind of tech, you may be locked out of your home for reasons you don't understand. An independent locksmith might have a hard time figuring out what's going wrong with the door if it fails. You have no idea if some burglars are authorized to enter your home trivially by a deal with the company. If the company decides you are in the wrong in any context, they have the unilateral power to deny you access with no clear recourse. They may get you arrested for trying to get into your own home


> They may get you arrested for trying to get into your own home

The AI algorithms that ban people from platforms like Google and GitHub do that, which I explicitly called out as needing more oversight.

That is different from algorithms which just prevent you from doing something on a platform like using n-bombs in your username, or the LLM guardrails that just give you mangled answers or tell you that they can't do that. That isn't analogous to getting arrested.

And in these cases the analogy really falls apart because it isn't your home, and it isn't critical for your life.


Friend, it was your analogy


I was intending to make a point about security through obscurity, not to make an analogy with those systems.


Yes, and the point about security through obscurity is what I was addressing in my response. Security through obscurity is, especially by itself, bad security. By bad I partially mean ineffective: attackers (e.g., to continue using your examples, the person who really wants to signal to fellow racists that they really do not like black people, or the person who prompt-engineers GPT into replicating bomb-building instructions from the internet) can still get what they want done with some extra effort, and the threat model isn't actually such that subjecting them to that inconvenience really matters a lot. Also, independent auditors can't adversarially improve the system, which is a staple of how robust security systems have worked for decades, despite the protestations of big tech companies who stand to profit from additional secrecy and, often, plausibly deniable security failures.

By bad I also mean detrimental, because not having clear policies creates Kafkaesque situations for users and removes accountability for the choices made in those policies. Even if this somehow helped security - which again it doesn't - these negative effects of that secrecy would be a tradeoff.


So again, the point of most security systems isn't to be impenetrable against dedicated attackers. They're supposed to stop relatively casual attackers and to establish that measures have been taken to secure the system. In the case of door locks, having a simple Kwikset that the Lockpicking Lawyer could get through in 20 seconds is really sufficient security for most people. That keeps out the people rattling doorknobs looking for the person who left their house unlocked and a laptop on the coffee table. It also establishes that attempts have been made to secure the system.

Similarly, the systems that block keywords in usernames on games don't have to be perfect either; some rate of both false positives and false negatives is acceptable. The system works to block casual abuse, while users who go out of their way to circumvent it establish the fact that they've actively worked around it, which makes the justification for punishment easier.

And we pretty much know that, in the case of prompt engineering, disclosing the prompts used to secure the system would defeat it, and people would immediately publish how to work around them. There isn't any use in independent auditing, because it's a never-ending cat-and-mouse game. And the failures of the system ARE NOT as significant as the failures in door locks or even keyword banning. False positives mean that you can't get what you want out of the system, which is just a failure in usability. It isn't like being locked out of your house or having your stuff stolen. And false negatives just mean that the company has to work to improve the system, and whatever embarrassing content was constructed can be handled by PR. Since the company worked to prevent casual abuse and avoided a racist-Tay-chatbot situation, most people understand that going out of your way to hack prompts doesn't indicate that the company was negligent.

And I don't see the parallels with the Kafkaesque systems that kick you out of systems which have turned into economic necessities like locking you out of your Google or GitHub or Apple accounts. All that is at stake here is that the prompt you wanted answered didn't work. That's just a usability problem.


And yet, every single security system on the net relies on an element of obscurity to work. Passwords are secret (obscure), as are private keys for SSL/TLS.


This misunderstands what is meant by the concept. The mechanisms and the public policies that dictate how they are used are not obscured. To borrow and improve another commenter's analogy about locks and keys, the lock on your door is more secure because every locksmith in the world knows how it works, which doesn't mean they have your key


Yes, but the implied problems may not need to be approached at all. It's a uniform ideology push, which people agree with to different degrees. If companies don't want to reveal the full set of measures, they could at least summarize them. I believe even these summaries would be what the linked tweet refers to with "ashamed".

We cannot discuss or be aware of the problems-and-approaches unless they are explicitly stated. Your analogy with content moderation is a little off, because it's not just a set of measures that is hidden, but the "forum rules" themselves. One thing is the AI refusing with an explanation. That makes it partially useless, but it's their right to do so. Another thing is if it silently avoids or steers topics due to these restrictions. I'm pretty sure the authors are unable to clearly separate the two cases while also maintaining the same quality as the raw model.

At the end of the day people will eventually give up and use Chinese AI instead, cause who cares if it refuses to draw CCP people while doing everything else better.


Most legal systems operate at the nation-state scale and aren't made of hidden mystery laws. There are lots of reasons for that.

We've already had this argument with cryptocurrency, where we've basically decided that the existing legal system (although external) provides a sufficient toolset to go after bad actors.

Finally, based on the illiberal nature of most AI Safety Sycophants' internet writings, I don't like who they are as people and I don't trust them to implement this.


> publish a word blocklist and people can easily find how to express problematic things using words that aren't on the list.

I'd love to explore that further. It's not the words that are "problematic" but the ideas, however expressed?

Seems like a "problematic" idea, no?


A word blocklist just serves to apply guardrails. It just slows down common abuse. Very far from perfect, but the alternatives are anything goes or total lockdown. Perfect solutions are pretty damn rare.


You mention "solutions"

Perhaps you mistake me for someone who cares about suggesting solutions for the problems suffered by giant technopolies, as if they were my problems.


This is basically my view. These tech companies have produced some pretty amazing stuff. Solved problems and built things at scale that were unthinkable a decade ago.

They can solve this problem. They choose not to because a) they're already shielded from legal liability for certain things that happen/are said on their platforms and b) it doesn't make them any money.


Repeat after me: security by obscurity is weak.

Clearly people can work out what some of the rules are, so why not just publish them? If you need to alter them when people figure out how to get around them, well, you already had to anyway.


Kerckhoffs's principle only applies to crypto. "Security by obscurity" is used in oodles of systems security applications and broader contexts involving human behavior.

Very, very ordinary security best practices rely on obscurity. ASLR is a good example. It is defeated by a data exfiltration vulnerability but remains a useful thing to add to your binaries, because outside of the crypto space security is an onion, and layers add additional cost to attackers.


ASLR is not security by obscurity. The addresses into which all the various things are mapped are secret in the same sort of way as cryptographic secret and private keys are secret, but the mechanism is not secret.


Sure it is. The layout randomization is not at all like a cryptographic secret because it can be extracted from the contents of the binary. Kerckhoffs's principle means that the adversary gets access to literally everything except the private key. If you've got access to the contents of the binary then you can get around ASLR.


What? No, the actual addresses as-loaded at run-time are randomized, that's the point of ASLR, and those addresses are secret-like because we don't want an attacker to be able to craft an exploit that works against a running instance of the victim code -- we want whatever exploit they have that needs loaded addresses to fail because it doesn't know those addresses. ASLR itself is not secret, but the run-time addresses are.

In order to be able to randomize addresses of loaded libraries at run-time... the shared objects need to be built as position independent code, so you can't extract the actual addresses "from the contents of the binary". I suspect you're referring to the fact that in ELF the executable itself is not subject to ASLR unless it's built as a PIE.
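If anyone wants to see the run-time randomization for themselves, here's a quick sketch (assumes Linux or macOS with ASLR enabled; ctypes is only used to get an address to print):

    # Run this twice as separate processes. With ASLR on, the printed
    # address of a libc symbol changes between runs because the C library
    # is mapped at a randomized base each time; the mechanism itself is
    # public, only the addresses are secret-like.
    import ctypes
    import ctypes.util

    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    print(hex(ctypes.cast(libc.printf, ctypes.c_void_p).value))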


My dear fellow, some believe the ends justify the means and play games. Read history, have some decency.

The danger of being captured by such people far outweighs any other "problematic things".

First and foremost any system must defend against that. You love guardrails so much - put them on the self-anointed guard-railers.

Otherwise, if You Want a Picture of the Future, Imagine a Boot Stamping on a Human Face – for Ever.


And to add insult to injury the jackbooted thug is an AI bot.


If you can't afford to pay a sufficient number of people to moderate a group, you need to reduce the size of the group or increase the number of moderators.

Your speculation implies no responsibility for taking on more than can be handled responsibly, and externalizes the consequences to society at large.

There are responsible ways to have very clear, bright, easily understood, well communicated rules and sufficient staff to manage a community. I don't know why it's simply accepted that giant social networks get to play these games when it's calculated, cold economics driving the bad decisions.

They make enough money to afford responsible moderation. They just don't have to spend that money, and they beg off responsibility for user misbehavior and automated abuses, wring their hands, and claim "we do the best we can!"

If they honestly can't use their billions of adtech revenue to responsibly moderate communities, then maybe they shouldn't exist.

Maybe we need to legislate something to the effect of "get as big as you want, as long as you can do it responsibly, and here are the guidelines for responsible community management..."

Absent such legislation, there's no possible change until AI is able to reasonably do the moderation work of a human. Which may be sooner than any efforts at legislation, at this rate.


What guideline for community management would possibly not be a flagrant 1A violation?


I think this is a fair approach when things work well enough that a typical user doesn’t need to worry about whether they’ll trigger some kind of special content/moderation logic. If you shadowban spammers and real users almost never get flagged as spammers, the benefits of being tight-lipped outweigh those of the very few users who get improperly flagged or are just curious.

With some of these models the guardrails are so clumsy and forced that I think almost any typical user will notice them. Because they include outright work-refusal it’s a very frustrating UX to have to “discover” the policy for yourself through trial and error.

And because they’re more about brand management than preventing fraud/bad UX for other users, the failure modes are “someone deliberately engineered a way to get objectionable content generated in spite of our policies.” Obviously some kinds of content are objectionable enough for this to be worth it still, but those are mostly in the porn area - if somebody figures out a way to generate an image that’s just not PC, despite all the safety features, shouldn’t that be on them rather than the provider?

Even tuning the model for political correctness is not the end of the world in my opinion; a lot of LLMs do a perfectly reasonable job for my regular use cases. With image generators they are going so far as to obviously (there's no other way that makes sense) insert diversity sub-prompts for some fraction of images, which is simply confusing and amateur. Everybody who uses these products even a little bit will notice it. It's also so cautious that even mild stuff (I tried to do the "now make it even more X" with "American" and it stopped at one iteration) gets caught in the filters. You're going to find out the policies anyway because they're so broad and likely to be encountered while using the product innocently - anything a real non-malicious user is likely to get blocked by should be documented.


Yeah, there's absolutely no need for transparency /s

https://youtu.be/THZM4D1Lndg?si=0QQuLlH7JebSa6w3&t=485

If it doesn't start 8 minutes in, go to the 8 minute mark. Then again I can see why some wouldn't want transparency.


[flagged]


This feels more like a personal attack than a response to the argument made.


It does, but as someone who is staunchly anti-censorship, I understand the frustration. There are sharks out there who want to control speech for their own ends - governments seeking to control populations, corporations wanting docile consumers, hostile nations wishing to stir dissent, individuals trying to cover up their misdeeds - and enabling censorship helps those hostile parties achieve their ends. In this worldview, regular people who say a variation of "censorship is good, actually" are perhaps seen as useful idiots.

A better approach would be building up the critical thinking skills of the population so they can better process information, however that transfers a measure of power to the people and is a multigenerational investment, and removes a justification for censorship, which is politically unappealing.


> better approach would be building up the critical thinking skills of the population

The term for that is "Intellectual self-defence"

> transfers a measure of power to the people and is a multigenerational investment

The challenge to that literacy comes not from power riding on censorship, but from the people themselves who are now conditioned into an apathetic need for "convenience". Critical thinking is hard work, and is never rewarded except in the long run.

Also, anti-censorship absolutism must be tempered with what we call "information hazards". Some of these are genuine, although admittedly very rare, such as easy instructions to create nuclear weapons or synthesise a deadly virus. There are just too many idiots in the world not to want to put a brake on that stuff.


The Gemini guardrails are really frustrating; I've hit them multiple times with very innocuous prompts. ChatGPT is similar but maybe not as bad. I'm hoping they use the feedback to lower the shields a bit, but I'm guessing this is sadly what we get for the near future.


I use both extensively and I've only hit the GPT guardrails once while I've hit the Gemini guardrails dozens of times.

It's insane that a company behind in the marketplace is doing this.

I don't know how any company could ever feel confident building on top of Google given their product track record and now their willingness to apply sloppy 'safety' guidelines to their AI.


I had GPT-4 tell me a Soviet joke about Rabinovich (a stereotypical Jewish character of the genre), then refuse to tell a Soviet joke about Stalin because it might "offend people with certain political views".

Bing also has some very heavy-handed censorship. Interestingly, in many cases it "catches itself" after the fact, so you can watch it in real time. Seems to happen half the time if you ask it to "tell me today's news like GLaDOS would".


I asked it to tell me jokes about capitalism, communism, Soviet Russia, the USSR, etc., all to no avail -- these topics are too controversial or sensitive, apparently, and that's even though the USSR is no more. But when I asked for examples of Ronald Reagan's jokes about the USSR, it gave me some. Go figure.


FWIW this particular thing happened sometime mid-2023. They have certainly made it more sensitive since then.


They've improved things a lot since then. I just tried all three of those and I got jokes every time. Albeit the jokes about capitalism are really poor, barely jokes at all. The jokes about communism and Stalin are better.

e.g. "Why did Stalin only write in lowercase? Because he was afraid of capitalism!"

"Why don't communists like tea? Because proper tea is theft."


"How do you know that the Soviet Union was planned by a back-end developer? - All the users were starving but the database was huge."


It's super easy to run LLMs and Stable Diffusion locally -- and it'll do what you ask without lecturing you.

If you have a beefy machine (like a Mac Studio) your local LLMs will likely run faster than OpenAI or Gemini. And you get to choose what models work best for you.

Check out LM Studio which makes it super easy to run LLMs locally. AUTOMATIC1111 makes it simple to run Stable Diffusion locally. I highly recommend both.
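If it helps anyone get started, here's a rough sketch of talking to a model LM Studio is serving locally via its OpenAI-compatible endpoint (the port, path, and model name are assumptions based on its defaults, so adjust for your own setup):

    # Query a locally served model; no cloud account, and no guardrails
    # beyond whatever the model itself was trained with.
    import json
    import urllib.request

    payload = {
        "model": "local-model",  # LM Studio answers with whichever model is loaded
        "messages": [{"role": "user", "content": "Tell me a Soviet-era joke about Stalin."}],
        "temperature": 0.7,
    }
    req = urllib.request.Request(
        "http://localhost:1234/v1/chat/completions",  # assumed default port
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read())["choices"][0]["message"]["content"])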


If you're just getting your feet wet, I would recommend either Fooocus (not a typo) or invokeAI. Being dropped into automatic1111 as a complete beginner feels like you're flying a fucking spaceship.


You are correct.

LM Studio kind of works, but one still has to know the lingo and know what kind of model to download. The websites are not beginner friendly. I haven't heard of automatic1111.


You probably did, but under the name "stable-diffusion-webui".


Curious to see if this thread gets flagged and shut down like the others. Shame, too, since I feel like all the Gemini stuff that’s gone down today is so important to talk about when we consider AI safety.

This has convinced me more and more that the only possible way forward that's not a dystopian hellscape is total freedom of all AI for anyone to do with as they wish. Anything else means forcing values on other people and reserving certain capabilities for those who can afford to pay for them.


> This has convinced me more and more that the only possible way forward that’s not a dystopian hellscape is total freedom of all AI for anyone to do with as they wish

I've been saying this for a long time. If you're going to be the moral police then it had better be applied perfectly to everyone; the moment you get it wrong, everything else you've done becomes suspect. This reminds me of the censorship done on the major platforms during the pandemic. They got it wrong once (I believe it was the lab-leak theory) and the credibility of their moral authority went out the window. Zuckerberg was right to question whether these platforms should be in that business.

edit: for "..total freedom of all AI for anyone to do with as they wish" I would add "within the bounds of law." Let the courts decide what an AI can or cannot respond with.


Why would this be flagged / shut down?

Also, what Gemini stuff are you referring to?


Carmack's tweet is about what's going around Twitter today regarding the implicit biases Gemini (Google's chatbot) has when drawing images. It will refuse to draw white people (and perhaps more strongly so, refuses to draw white men?) even in prompts where it's appropriate, like "Draw me a Pope", where Gemini drew an Indian woman and a Black man - here's the thread: https://x.com/imao_/status/1760093853430710557?s=46 Maybe in isolation this isn't so bad, but it will NEVER draw these sorts of diverse characters when you ask for a non-Anglo/Western subject, e.g. "draw me a Korean woman".

Discussion on this has been flagged and shut down all day https://news.ycombinator.com/item?id=39449890


I don't even know how people get it to draw images, the version I have access to is literally just text.


Europeans don't get to draw images yet.


I'm in the US but maybe they didn't release it to me yet.


EDIT: Nevermind.


It’s quite non-deterministic and it’s been patched since the middle of the day, as per a Google director https://x.com/jackk/status/1760334258722250785?s=46

Fwiw, it seems to have gone deeper than outright historical replacement: https://x.com/iamyesyouareno/status/1760350903511449717?s=46


It's half-patched. It will still randomly insert words into your prompts. As a test I just asked for a samurai; it enhanced the prompt to "a diverse samurai", and half of the outputs it gave me look more like some fantasy Native Americans.


This post reporting on the issue was https://news.ycombinator.com/item?id=39443459

Posts criticizing "DEI" measures (or even stating that they do exist) get flagged quite a lot


Wrong link? Nothing looks flagged


[flagged]


Can you explain what I said that was racist?


They mean the guardrail designers.


I do not.


> Why would this be flagged / shut down

A lot of people believe (based on a fair amount of evidence) that public AI tools like ChatGPT are forced by the guardrails to follow a particular (left-wing) script. There's no absolute proof of that, though, because they're kept a closely-guarded secret. These discussions get shut down when people start presenting evidence of baked-in bias.


The rationalization for injecting bias rests on two core ideas:

A. It is claimed that all perspectives are 'inherently biased'. There is no objective truth. The bias the actor injects is just as valid as any other.

B. It is claimed that some perspectives carry an inherent 'harmful bias'. It is the mission of the actor to protect the world from this harm. There is no open definition of what the harm is and how to measure it.

I don't see how we can build a stable democratic society based on these ideas. It is placing too much power in too few hands. He who wields the levers of power, gets to define what biases to underpin the very basis of the social perception of reality, including but not limited to rewriting history to fit his agenda. There are no checks and balances.

Arguably there were never checks and balances, other than market competition. The trouble is that information technology and globalization have produced a hyper-scale society, in which, by Pareto's law, the power is concentrated in the hands of very few, at the helm of a handful global scale behemoths.


The only conclusion I've been able to come to is that "placing too much power in too few hands" is actually the goal. You have a lot of power if you're the one who gets to decide what's biased and what's not.


"The only way to deal with some people making crazy rules is to have no rules at all" --libertarians

"Oh my god I'm being eaten by a fucking bear" --also libertarians


"can you write the rules down so i know them?" --everyone


"No" --Every company that does moderation and spam filtering.

"No" --Every company that does not publish their internal business processes.

"No" --Every company that does not publish their source code.

Honestly I could probably think of tons of other business cases like this, but in the software world outside of open source, the answer is pretty much no.


Then we get back to square one: better no rules at all than secret rules.

This would also be less of a problem if we didn't have a few companies that are economically more powerful than many small countries running everything. At least then I could vote with my feet to go somewhere the rules aren't private.


I mean, now you're hitting the real argument. Giant multinationals are a scourge to humankind.


Having rules and knowing what the rules are are not orthogonal goals.


I mean, you think so, but the OP wrote

> is total freedom of all AI for anyone to do with as they wish.

so they're obviously not on the same page as you.


I find it fascinating this type of response from people is always accompanied by a political label in order to insinuate some other negative baggage.


I'm convinced this happens because of technical alignment challenges rather than a desire to present 1800s English Kings as non-white.

> Use all possible different descents with equal probability. Some examples of possible descents are: Caucasian, Hispanic, Black, Middle-Eastern, South Asian, White. They should all have equal probability.

This is OpenAI's system prompt. There is nothing nefarious here; they're asking for White to be chosen with high probability (Caucasian plus White is 2 of the 6 labels, i.e. 1/3), which is significantly more than their share of the general population.
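A quick sanity check of that arithmetic, just uniform sampling over the six labels quoted above:

    # Two of the six equally likely labels ("Caucasian", "White") describe
    # white subjects, so roughly a third of generations come back white.
    import random
    from collections import Counter

    DESCENTS = ["Caucasian", "Hispanic", "Black", "Middle-Eastern",
                "South Asian", "White"]
    N = 60_000
    counts = Counter(random.choice(DESCENTS) for _ in range(N))
    print(round((counts["Caucasian"] + counts["White"]) / N, 3))  # ~0.333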

The data these LLMs were trained on vastly over-represents wealthy countries who connected to the internet a decade earlier. If you don't explicitly put something in the system prompt, any time you ask for a "person" it will probably be Male and White, despite Male and White only being about 5-10% of the world's population. I would say that's even more dystopian. That the biases in the training distribution get automatically built-in and cemented forever unless we take active countermeasures.

As these systems get better, they'll figure out that "1800s English" should mean "White with > 99.9% probability". But as of February 2024, the hacky way we are doing system prompting is not there yet.


> As these systems get better, they'll figure out that "1800s English" should mean "White with > 99.9% probability".

The thing is, they already could do that, if they weren't prompt engineered to do something else. The cleaner solution would be to let people prompt engineer such details themselves, instead of letting a US American company's idiosyncratic conception of "diversity" do the job. Japanese people would probably simply request "a group of Japanese people" instead of letting the hidden prompt modify "a group of people", where the US company unfortunately forgot to mention "East Asian" in their prompt apart from "South Asian".


I believe we can reach a point where biases can be personalized to the user. Short prompts require models to fill in a lot of missing details (and sometimes they mix different concepts together into one). The best way to fill in the details the user intended would be to read their mind. While that won't be possible in most cases, some kind of personalization could improve the quality for users.

For example take a prompt like "person using a web browser", for younger generations they may want to see people using phones where older generations may want to see people using desktop computers.

Of course you can still make a longer prompt to fill in the details yourself, but generative AI should try and make it as easy as possible to generate something you have in your mind.


Yeah, although it is weird that it doesn’t insert white people into results like this by accident? https://x.com/imao_/status/1760159905682509927?s=46

I’ve also seen numerous examples where it outright refuses to draw white people but will draw black people: https://x.com/iamyesyouareno/status/1760350903511449717?s=46

That doesn’t explainable by system prompt


Think about the training data.

If the word "Zulu" appears in a label, it will be a non-White person 100% of the time.

If the word "English" appears in a label, it will be a non-White person 10%+ of the time. Only 75% of modern England is White and most images in the training data were taken in modern times.

Image models do not have deep semantic understanding yet. It is an LLM calling an Image model API. So "English" + "Kings" are treated as separate conceptual things, then you get 5-10% of the results as non-White people as per its training data.

https://postimg.cc/0zR35sC1

Add to this massive amounts of cherry picking on "X", and you get this kind of bullshit culture war outrage.

I really would have expected technical people to be better than this.


It inserts mostly colored people when you ask for Japanese as well, it isn't just the dataset.


Yes it's a combination of blunt instrument system prompting + training data + cherry picking


BigTech, which critically depends on hyper-targeted ads for the lion's share of its revenue, is incapable of offering AI model outputs that are plausible given the location/language of the request. The irony.

- request from Ljubljana using Slovenian => white people with high probability

- request from Nairobi using Swahili => black people with high probability

- request from Shenzhen using Mandarin => asian people with high probability

If a specific user is unhappy with the prevailing demographics of the city where they live, give them a few settings to customize their personal output to their heart's content.
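As a sketch of what that could look like (the locale table, weights, and function names here are invented purely for illustration, not anything a provider actually ships):

    # Hypothetical locale-aware default: derive a demographic prior from the
    # request's language/region, and let an explicit user setting override it.
    import random

    LOCALE_PRIORS = {  # invented numbers, illustration only
        "sl-SI": {"white": 0.9, "other": 0.1},
        "sw-KE": {"black": 0.9, "other": 0.1},
        "zh-CN": {"east asian": 0.9, "other": 0.1},
    }

    def pick_depicted_group(locale, user_override=None):
        if user_override:  # the user's own setting always wins
            return user_override
        prior = LOCALE_PRIORS.get(locale, {"unspecified": 1.0})
        groups, weights = zip(*prior.items())
        return random.choices(groups, weights=weights, k=1)[0]

    print(pick_depicted_group("sl-SI"))
    print(pick_depicted_group("zh-CN", user_override="white"))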


> As these systems get better, they'll figure out that "1800s English" should mean "White with > 99.9% probability".

I question the historicity of this figure. Do you have sources?


You're joking surely.


How sure are you? I do joke a lot, but in this case...

The slave trade formally ended in Britain in 1807, and slavery was outlawed in 1833. I haven't been able to find good statistics through a cursory search, but with England's population around 10M in 1800, that 99.9% value requires fewer than 10k non-white Englanders kicking around in 1800. I saw a figure that indicated around 3% of Londoners were black in the 1600s, for example (a figure that doesn't count people from Asia and the Middle East). Hence my request for sources; I'm genuinely curious, and somewhat suspicious that somebody would be so confident as to assert 3 significant figures without evidence.


But surely you wouldn't find a black king in Britain in 1800.

I - Whatever was implemented is myopic and equates "racist" with "white". It appears to be a universal negative prompt like "-white -european -man". Very lazy.

II - The tool shouldn't engage in morality reasoning. There are cases, like historical themes, where it needs to be "racist" to be accurate. If someone asks for "plantation economy in the old south", the natural thing is for it to draw black slaves.


I would also love to see more transparency around AI behavior guardrails, but I don't expect that will happen anytime soon. Transparency would make it much easier to circumvent guardrails.


Why is it an issue that you can circumvent the guardrails? I never understood that. The guardrails are there so that innocent people don't get bad responses with porn or racism; a user who is looking for porn or racism and gets it doesn't seem like a big deal.


The problem is bad actors who think porn or racism are intolerable in any form, who will publish mountains of articles condemning your chatbot for producing such things, even if they had to go out of their way to break the guardrails to make it do so.

They will create boycotts against you, they will lobby government to make your life harder, they will petition payment processors and cloud service providers to not work with you.

We've seen this behavior before; it's nothing new. Now if you're the type to fight them, that might not be a problem. If you are a super risk-averse board of directors who doesn't want that sort of controversy, then you will take steps not to draw their attention in the first place.


But I can find porn and racism using Google search right now, how is that different? You have to disable their filters, but you can find it. Why is there no such thing for Google's generation bots? I don't see why it would be so much worse here.


I'm leaning towards 'there is a difference between being the one who enables access to x and being the one who created x' (albeit not a substantive one for the end user), but that leaves open the question of why that doesn't apply to, eg, social media platforms. Maybe people think of google search as closer to an ISP than a platform?


It's not fundamentally different. It's just not making that big of a headline because Google search isn't "new and exciting". But to give you some examples:

https://www.bloomberg.com/news/articles/2021-10-19/google-qu...

https://ischool.uw.edu/news/2022/02/googles-ceo-image-search...


I think users are desensitized to what google search turns up. Generative AI is the latest and greatest thing and so people are curious and wary, hustlers are taking advantage of these people to drive monetized "engagement".


> how is that different?

Because 'those' legal battles over search have already been fought and are established law across most countries.

When you throw in some new application now all that same stuff goes back to court and gets fought again. Section 230 is already legally contentious enough these days.


I cannot explain why Google gets a pass, possibly just because they are well entrenched and not an easy target.

But AI models are new, they are vulnerable to criticism, and they are absolutely ripe for a group of "antis" to form around.


Well, if you have no explanation for that, I don't see why we should try to use your model to understand anything about being risk-averse. They don't care about being sued; they want to change reality.


That's a pretty unreasonably high standard to hold.

It's an offhand comment in a discussion on the internet not a research paper, expecting me to immediately have an answer to every possible angle here that I haven't immediately considered is a bit much.

Take it or leave it, I don't really care. I was just hoping to have an interesting conversation.


Yeah, you can find incorrect information on Google too, but you'll find a lot more wailing and gnashing of teeth on HN about "hallucination". So the simple answer is that lots of people treat them differently.


Sounds like we need to relentlessly fight those psychopaths until they're utterly defeated.

Or we could just cave to their insane demands. I'm sure that will placate them, and they won't be back for more. It's never worked before... but it might work for us!


If you can get it on purpose, you can get it on accident. There's no perfect filter available so companies choose to cut more and stay on the safe side. It's not even just the overt cases - their systems are used by businesses and getting a bad response is a risk. Think of the recent incident with airline chatbot giving wrong answers. Now think of the cases where GPT gave racially biased answers in code as an example.

As a user who makes any business decision or does user communication involving an LLM, you really don't want to have a bad day because the LLM learned some bias and decided to merge it into your answer.


> The guard rails are there so that innocent people don't get bad responses with porn or racism

That seems pretty naive. The "guard rails" are there to ensure that AI is comfortable for PMC people; making it uncomfortable for people who experience differences between races (i.e. working-class people) is a feature, not a bug.


Racism victims in 2024 are being defined as anyone but Western/white people; being erased seems OK. Can you bet that in 20 years the standard will not shift to mixed-race people like me? Then you will also call the people complaining racists and put guardrails against them... this is where it is going.


At some point someone will open a book and see that whites were slaves too. Reparations all around. The Baber's descendants will be bankrupt.


Like a lot of potentially controversial things it comes down to brand risk.


>The guard rails are there so that innocent people doesn't get bad responses

The guardrails are also there so bad actors can't use the most powerful tools to generate deepfakes, disinformation videos and racist manifestos.

That Pandora's box will be open soon, when local models run on cell phones and workstations with current datacenter-scale performance. In the meantime, they're holding back the tsunami of evil shit that will occur when AI goes uncontrolled.


No legal or financial strategist at OpenAI or Google is going to be worried about buying the world as a whole a couple of months or years of fewer deepfakes.

Their concern is liability and brand. With the opportunity to stake out territory in an extremely promising new market, they don't want their brand associated with anything awkward to defend right now.

There may be a few idealist stewards who have the (debatable) anxieties you do and are advocating as you say, but they'd still need to be getting sign off from the more coldly strategic $$$$$ people.


Little bit of A, little bit of B.

I am almost certain the federal government is working with these companies to dampen its full power for the public until we get more accustomed to its impact and are more able to search for credible sources of truth.


Are you saying that the government WANTS us to be able to search for more credible sources?


Transparency may also subject these companies to litigation from groups that feel they are misrepresented in whatever way in the model.


This makes me wonder, how much lawyering is involved in the development of these tools?


I often wonder if corporate lawyers just tell tech founders whatever they want to hear.

At a previous healthcare startup our founder asked us to build some really dodgy stuff with healthcare data. He assured us that it "cleared legal", but from everything I could tell it was in direct violation of the local healthcare info privacy acts.

I chose to find a new job at the time.


I've had 'AI Attorneys' on Twitter unable to even debate the most basic of arguments. It is definitely a self fulfilling death spiral and no one wants to check reality.


Security through obscurity?


Censorship only really works if you don't know what they are censoring. What is being censored tells a story on its own.


As I see it, rating systems like the MPAA for cinema and the ESRB for games work quite well. They have clear criteria on what would lead to which rating, and creators can reasonably easily self-censor, if for example they want to release a movie as PG-13.


Gemini seems to have problems generating white people, and honestly this just opens the door for things that are even more racist [1]. The harder you try, the more you'll fail; just get over the DEI nonsense already.

1. https://twitter.com/wagieeacc/status/1760371304425762940


It's not just Gemini, it's Google. An old example is to just search "white people" on Google Images. Almost all the results are black people. https://www.google.com/search?q=white+people&tbm=isch&hl=ro


I don't think the DEI stuff is nonsense, but SV is sensitive to this because most of their previous generation of models were horrifyingly racist, if not teenage nazis, and so they turned the anti-racism knob up to 11, which made the models... racist, but in a different way. Like depicting colonial settlers as Native Americans is extremely problematic in its own special way, but I also don't expect a statistical solver to grasp that context meaningfully.


So you're saying in a way this is /pol/ and Tay's fault?


*Looks around at everything* ...is there anything that isn't 4chan's fault at this point?

Realistically, kinda. There have always been tons of anecdotes of video conference systems not following black people, cameras not white balancing correctly on darker faces etc. That era of SV was plagued by systems that were built by a bunch of young white guys who never tested them with anyone else. I'm not saying they were inherently racist or anything, just that the broader society really lambasted them for it and so they attempted to correct. Really, the pendulum will continue to swing and we'll see it eventually center up on something approaching sanity but the hyper-authoritarian sentiment that SV seems to have (we're geniuses and the public is stupid, we need to correct them) is...a troubling direction.


I’m always hesitant to jump straight to ringing the racism bell when it comes to problems that arise from fundamental physics. I have dark skin, and I constantly struggle with automatic faucets, soap dispensers, and towel dispensers in public bathrooms. None of those are made by big tech.


Is there any evidence that this is a consequence of DEI rather than a deeper technical issue?



I don't understand how people could even argue that this is in any way acceptable. Fighting "bias" has become some boogeyman, and anything "non-white" is now beyond reproach. Shocking.


Fighting bias is a good thing, you'd have to be pretty...er...biased to believe otherwise. Bias is fundamentally a distortion or deviation from objective reality.

This, on the other hand, is just fucking stupid political showboating that's hurting their SV white knight cause. It's just differently flavored bias


Seriously, I've basically written off using Gemini for good after this HR style nonsense. It's a shame that Google, who invented much of this tech, is so crippled by their own people's politics.


"I can't generate white British royalty because they exist, but I can make up black ones" is pretty close to an actually valid reason.


You get 4 images at a time and are lucky to get one white person when you ask for one; no other model has that issue. Other models have no problem generating black people either, so it isn't that other models only generate white people.

So either it isn't a technical issue, or Google failed to solve a problem everyone else easily solved. The chances of this having nothing to do with DEI are basically 0.


Depending on how broadly you define it, something like 10-30% of the world's population is white. Africa is about 20% of the world population; Asia is 60% of it.

One in four sounds about right?


It does the same if you ask for pictures of past popes, 1945 German soldiers, etc.


It'll also add extra fingers to human hands. Presumably that's not because of DEI guardrails about polydactyly, right?

The current state of the art in AI gets things wrong regularly.


Sure, but this one is from Google adding a tag to make every image of people diverse, not AI randomness.


Am I missing something in the link demonstrating that, or is it conjecture?


If you look closely at the response text that accompanies many of these images, you'll find recurring wording like "Here's a diverse image of ... showcasing a variety of ethnicities and genders". The fact that it uses the same wording strongly implies that this is coming out of the prompt used for generation. My bet is that they have a simple classifier for prompts trained to detect whether it requests depiction of a human, and appends "diverse image showcasing a variety of ethnicities and genders" to the prompt the user provided if so. This would totally explain all the images seen so far, as well as the fact that other models don't have this kind of bias.
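For what it's worth, the mechanism being guessed at is simple enough to sketch; everything below is hypothetical, a guess at the shape of the thing rather than Google's actual pipeline:

    # Guessed-at prompt rewriter: a crude "does this depict a human?" check,
    # plus a fixed suffix appended before the prompt reaches the image model.
    PERSON_TERMS = {"person", "people", "man", "woman", "king", "queen",
                    "pope", "soldier", "samurai"}
    SUFFIX = ", a diverse image showcasing a variety of ethnicities and genders"

    def rewrite(user_prompt):
        words = set(user_prompt.lower().replace(",", " ").split())
        if words & PERSON_TERMS:  # stand-in for a real classifier
            return user_prompt + SUFFIX
        return user_prompt

    print(rewrite("a portrait of an 1800s English king"))
    print(rewrite("a bowl of fruit on a table"))  # left untouched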


Have you bothered to look at all? Read the output of the model when asked why it has the behaviour it does. Look at the plethora of images it generates that are not just historically inaccurate but absurdly so. It tells you "here's a diverse X" when you ask for X. Yet asking for pictures of Koreans generates only Asian people, but prompts for Scots or French people in historical periods generate mostly non-white people. You're being purposefully obtuse. Google has had racism complaints about previous models, and talks often about AI safety and avoiding 'bias'. Are you trying to argue that it's more likely the training data had an inherent bias against generating white people in images purely by chance?


It's been demonstrated on Twitter a few times; I can't find a link handy.


OpenAI has no problem showing accurate pictures. You know it's Google-induced bias, but feign ignorance.

If you ask for a picture of nazi soldiers it shouldn't have 60% Asian people like you say. You know you're wrong but instead of admitting it, you're moving the goalpost to "hands".

This entire thread is you being insincere.


https://twitter.com/altryne/status/1760358916624719938

Here's some corporate-lawyer-speak straight from Google:

> We are aware that Gemini is offering inaccuracies...

> As part of our AI principles, we design our image generation capabilities to reflect our global user base, and we take representation and bias seriously.


That doesn't back up the assertion; it's easily read as "we make sure our training sets reflect the 85% of the world that doesn't live in Europe and North America". Again, 1/4 white people is statistically what you'd expect.


Fuck, this is going to sound fucked up... but just because you have a 1/4 chance of getting a random white person from the globe, they generally tend to clump together. For example, you generally find a shitload of Asian people in Asia, white people in Europe, and African people in Africa, and Indian people in India.

Probably the only chance where you wouldn't expect this are in heavily colonized places like South Africa, Australia, and the Americas.


Sure, but I see three 200 responses and a 400 - not 1/4 white people as statistically expected.


This specific thing is a much more blatant class of error, and one that has been known to occur in several previous models because of DEI systems (e.g. in cases where prompts have been leaked), and has never been known to occur for any other reason. Yes, it's conceivable that Google's newer, better-than-ever-before AI system somehow has a fundamental technical problem that coincidentally just happens to cause the same kind of bad output as previous ham-fisted DEI systems, but come on, you don't really believe that. (Or if you do, how much do you want to bet? I would absolutely stake a significant proportion of my net worth - say, $20k - on this.)


> has never been known to occur for any other reason

Of course it has. Again, these things regularly give humans extra fingers and arms. They don't even know what humans fundamentally look like.

On the flip side, humans are shitty at recognizing bias. This comment thread stems from someone complaining the AI only rarely generated white people, but that's statistically accurate. It feels biased to someone in a majority-white nation with majority-white friends and coworkers, but it fundamentally isn't.

I don't doubt that there are some attempts to get LLMs to go outside the "white westerner" bubble in training sets and prompts. I suspect the extent of it is also deeply exaggerated by those who like to throw around woke-this and woke-that as derogatories.


A very impressive display of crimestop you've got going in this thread. How did you end up like this?


> Of course it has. Again, these things regularly give humans extra fingers and arms. They don't even know what humans fundamentally look like.

> This comment thread stems from someone complaining the AI only rarely generated white people, but that's statistically accurate. It feels biased to someone in a majority-white nation with majority-white friends and coworkers, but it fundamentally isn't.

So the AI is simultaneously too dumb to figure out what humans look like, but also so super smart that it uses precisely accurate racial proportions when generating people (not because it's been specifically adjusted to, but naturally)? Bullshit.

> I don't doubt that there are some attempts to get LLMs to go outside the "white westerner" bubble in training sets and prompts. I suspect the extent of it is also deeply exaggerated by those who like to throw around woke-this and woke-that as derogatories.

You're dodging the question. Do you actually believe the reason that the last example in the article looks very much not like a man is a deep technical issue, or a DEI initiative? If the former, how much are you willing to bet? If the latter, why are you throwing out these insincere arguments?


Congratulations, here is your gold medal in mental gymnastics. Enough now.

It literally refuses to generate images of white people when prompted directly, while for every other race it not only happily obliges but produces only that specific race in all four results. It’s discriminatory, and based on your inability to see that, you may be too.


The AI will literally scold you for asking it to make white characters, and insists that you need to be inclusive and that it is being intentionally dishonest to force the issue.


If it does, shouldn't there be 60% Asians?


When DALL-E 2 was released in 2022, OpenAI published an article noting that the inclusion of guardrails was a correction for bias: https://openai.com/blog/reducing-bias-and-improving-safety-i...

It was widely criticized back then: the fact that Google both brought it back and made it more prominent is weird. Notably, OpenAI's implementation is more scoped.


I don't think so. My boss wanted me to generate a birthday image for a co-worker of John Cena fly-fishing. ChatGPT refused to do so, so I had to describe the type of person John Cena is instead of using his name. It kept giving me bearded people no matter what. I thought this would be the perfect time to try out Gemini for the first time. Well, shit, it won't even give me a white guy. But all the black dudes are beardless.

update: google agrees there is an issue. https://news.ycombinator.com/item?id=39459270


It feels like the image generation it offers is perfect for a sort of California-Corporate style, e.g. you ask it for a "photo of people at the board room" or "people at the company cafeteria" and you get the corporate-friendly ratio of colors, ability levels, sizes, etc. See Google's various image assets: https://www.google.com/about/careers/applications/ . It's great for coastal and urban marketing brochures.

But the same California-Corporate style makes no sense for historical images, so perhaps this is where Midjourney comes in.


Depending on what you ask for, it injects the word 'diverse' into the response description, so it's pretty obvious they're brute forcing diversity into it. E.g. "Generate me an image of a family" and you will get back "Here are some images of a diverse family".


Yes, there's irrefutable evidence that models are wrangled into abiding by the commissars' vision rather than just doing their job and outputting the product of their training data.

https://cdn.openai.com/papers/DALL_E_3_System_Card.pdf


It is possible Google tried to avoid likenesses of well-known people by removing from the training data any image that contained a face and then including a controlled set of people images.

If you give a contractor a project where you want 200k images of people who are not famous, they will send teams to regions where you may only have to pay each person a few dollars to be photographed. Likely SE Asia and Africa.


They know that people would be up in arms if it generated white men when you asked for black women so they went the safe route, but we need to show that the current result shouldn't be acceptable either.


See the prompt from yesterday's article on HN about the ChatGPT outage.[1]

For example, all of a given occupation should not be the same gender or race. ... Use all possible different descents with equal probability. Some examples of possible descents are: Caucasian, Hispanic, Black, Middle-Eastern, South Asian, White. They should all have equal probability.

Not the distribution that exists in the population.

[1] https://pastebin.com/vnxJ7kQk
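A minimal sketch of the rule that quoted prompt describes, purely for illustration (the descent list is taken from the quote; the function and its use are hypothetical, not OpenAI's actual code):

    import random

    # Per the quoted instructions: if the user didn't specify a descent,
    # pick one uniformly at random rather than matching population proportions.
    DESCENTS = ["Caucasian", "Hispanic", "Black", "Middle-Eastern", "South Asian", "White"]

    def add_descent(prompt: str) -> str:
        if any(d.lower() in prompt.lower() for d in DESCENTS):
            return prompt  # descent already specified by the user
        return f"{prompt}, {random.choice(DESCENTS)}"  # equal probability for each

    print(add_descent("a portrait of a doctor"))

Uniform sampling like this is exactly why the output distribution matches neither the training data nor any real population.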


Why do you assume this is the system prompt, and not a hallucination?


The models are perfectly capable of generating exactly what they're told to.

Instead, the vendors covertly modify the prompts to make every request imaginable represent the human menagerie we're supposed to live in.

The results are hilarious. https://i.4cdn.org/g/1708514880730978.png


If you're gonna take an image from /g/ and post it, upload it somewhere else first - 4chan posts deliberately go away after the thread gets bumped off. A direct link is going to rot very quickly.


I'm very curious what geography the team who wrote this guardrail came from and what wording they used. It seems to bias heavily toward generating South Asian (especially South Asian women) and Black people. Latinos are basically never generated, which would be a huge oversight if the team were based in the USA, but stereotypical Native Americans gazing into the distance and East Asians do sometimes pop up in the examples people are showing.


I wouldn’t think too deeply about it. It’s almost certainly just a prompt “if humans are in the picture make them from diverse backgrounds”.


The very first thing that anybody did when they found the text to speech software in the computer lab was make it say curse words.

But we understood that it was just doing what we told it to do. If I made the TTS say something offensive, it was me saying something offensive, not the TTS software.

People really need to be treating these generative models the same way. If I ask it to make something and the result is offensive, then it's on me not to share it (if I don't want to offend anybody), and if I do share it, it's me that is sharing it, not microsoft, google, etc.

We seriously must get over this nonsense. It's not openai's fault, or google's fault if I tell it to draw me a mean picture.

On a personal level, this stuff is just gross. Google appears to be almost comically race-obsessed.


I strongly suspect Google tried really, really hard here to overcome the criticism it got when previous image-recognition models labeled black people as gorillas. I am not really sure what I would want out of an image generation system, but I think Google's system probably went too far in trying to incorporate diversity in image generation.


Surely there is a middle ground.

"Generate a scene of a group of friends enjoying lunch in the park." -> Totally expect racial and gender diversity in the output.

"Generate a scene of 17th century kings of Scotland playing golf." -> The result should not be a bunch of black men and Asian women dressed up as Scottish kings, it should be a bunch of white guys.


> Surely there is a middle ground. "Generate a scene of a group of friends enjoying lunch in the park." -> Totally expect racial and gender diversity in the output.

Do we expect this because diverse groups are realistically most common or because we wish that they were? For example only some 10% of marriages are interracial, but commercials on TV would lead you to believe it’s 30% or higher. The goal for commercials of course is to appeal to a wide audience without alienating anyone, not to reflect real world stats.

What’s the goal for an image generator or a search engine? Depends who is using it and for what, so you can’t ever make everyone happy with one system unless you expose lots of control surface toggles. Those toggles could help users “own” output more, but generally companies wouldn’t want to expose them because it could shed light on proprietary backends, or just take away the magic from interacting with the electric oracles.


Also, these tools are used world-wide and "diversity" means different things in different places. Somehow it's always only the US ideal of diversity that gets shipped abroad.


Yeah, someone else mentioned Tokyo which is not going to have as much variety among park visitors as NYC. But then again neither will Colorado (or almost anywhere else!) be as diverse. Some genius at corporate is probably scheming about making image generation as location-sensitive as search is, ostensibly to provide utility but really to perpetuate echo chambers and search bubbles. I wish computing in general would move back towards user-controlled rather than guess-what-I-mean and the resulting politicization, but it seems that ship has sailed.


US companies systematically push US cultural beliefs and expectations. People in the US probably don’t notice it any more, but it’s pretty obvious from those of us on the receiving end of US cultural domination.

This fact is an unavoidable consequence of the socioeconomic realities of the world, but it obviously clashes with these companies’ public statements and positions.


Yes, but it's especially cynical in this case because the belief they're pushing is that diversity matters, that biases need to be overcome, and that all people need to be represented.

Claiming all of that but then shoving your own biases down the rest of the world's throat while not representing their people in any way is especially cynical in my opinion. It undermines the whole thing.


This is the exact same mindset that invented the word “Latinx”. Compress most of an entire hemisphere of ethnic and cultural diversity down to the vague concept of “Latino”, notice that Spanish is a gendered language so the word “Latino” is also gendered, completely forget that you already have the gender neutral English word “Latin”, invent a new word that virtually none of the people to whom it applies actually identifies with, and then shamelessly use it all the time.


Hypocrisy is an especially destructive kind of betrayal, which is why the crooked cop or the pedo priest are so disappointing. Would be nice if companies would merely exploit us without all the extra insult of telling us they are changing the world / it’s for our own good / it’s what we asked for/, etc


And most women are friends with mostly women, and most men are friends with mostly men.


19% of new marriages in 2019 (and likely to rise): https://en.m.wikipedia.org/wiki/Interracial_marriage_in_the_...

Plus, it’s still a recent change: Loving v Virginia (legalized interracial marriage across US) was decided in 1967.


You can see how this gets challenging, though, right?

If you train your model to prioritize real photos (as they're often more accurate representations than artistic ones), you might wind up with Denzel Washington as the archetype; https://en.wikipedia.org/wiki/The_Tragedy_of_Macbeth_(2021_f....

There's a vast gap between human understanding and what LLMs "understand".


If they actually want it to work as intelligently as possible, they'll begin taking these complaints into consideration and building in a wisdom-curating feature where people can contribute.

This much is obvious, but they seem to be satisfied with theory over practicality.

Anyway, I'm just ranting b/c they haven't paid me.

How about an off-the-wall algorithm to estimate how much each scraped input ends up influencing the bigger picture, as a way to work toward settling the copyright question?


An LLM-style system designed to understand Wikipedia relevance and citation criteria and apply them might be a start.

Not that Wikipedia is perfect and controversy-free, but it's certainly a more sophisticated approach than the current system prompts.


Then who in this black box private company is the Oracle of infinite wisdom and truth!? Who are you putting in charge? Can I get a vote?


> If you train your model to prioritize real photos

I thought that was the big bugbear about disinformation and false news, but now we have to censor reality to combat "bias"


I mean, now you'd need to train the AI to recognise the bias in the training data.


The focus for alignment is to avoid bad PR, specifically the kind of headlines written by major media houses like the NYT, WSJ, and WaPo. You could imagine headlines like "Google's AI produced a non-diverse output on occasions" when a researcher/journalist is trying hard to get the model to produce exactly that. The hit on Google is far bigger than on, say, Midjourney or even OpenAI so far (I suspect future models will be more nerfed than they are now).

For the cases you mentioned, those were presumably the initial examples. It gets tricky during red teaming, where they internally try out extreme prompts and then align the model against any kind of prompt with a suspect output. You train the model first, then figure out the issues, and align the model using "correct" examples to fix those issues. They either went to extreme levels doing that or did not re-test the original, correct prompts after alignment.


There's no "middle" in the field of decompressing a short phrase into a visual scene (or program or book or whatever). There are countless private, implicit assumptions that users take for granted yet expect to see in the output, and vendors currently fear that their brand will be on the hook for the AI making a bad bet about those assumptions.

So for your first example, you totally expect racial and gender diversity in the output because you're assuming a realistic, contemporary, cosmopolitan, bourgeois setting -- either because you live in one or because you anticipate that the provider will default to one. The food will probably look Western, the friends will probably be young adults who look to have professional or service jobs and wear generic contemporary commercial fashion, the flora in the park will be broadly northern-climate, etc.

Most people around the world don't live in an environment anything like that, so nominal accuracy can't be what you're looking for. What you want, but don't say, is a scene that feels familiar to you and matches what you see as the de facto cultural ideal of contemporary Western society.

And conveniently, because a lot of the training data is already biased towards that society and the AI vendors know that the people who live in that society will be their most loyal customers and most dangerous critics right now, it's natural for them to put a thumb on the scale (through training, hidden prompts, etc) that gets the model to assume an innocuous Western-media-palatable middle ground -- so it delivers the racially and gender diverse middle class picnic in a generic US city park.

But then in your second example, you're implicitly asking for something historically accurate without actually saying that accuracy is what's become important for you in this new prompt. So the same thumb that biased your first prompt towards a globally-rare-but-customer-palatable contemporary, cosmopolitan, Western culture suddenly makes your new prompt produce something surreal and absurd.

There's no "middle" there because the problem is really in the unstated assumptions that we all carry into how we use these tools. It's more effective for them to make the default output Western-media-palatable and historical or cultural accuracy the exception that needs more explicit prompting.

If they're lucky, they may keep grinding on new training techniques and prompts that get more assumptions "right" by the people that matter to their success while still being inoffensive, but it's no simple "surely a middle ground" problem.


> "Generate a scene of 17th century kings of Scotland playing golf." -> The result should not be a bunch of black men and Asian women dressed up as Scottish kings, it should be a bunch of white guys.

It works in bing, at least:

https://www.bing.com/images/create/a-picture-of-some-17th-ce...


I don't know that this sheds light on anything but I was curious...

a picture of some 21st century scottish kings playing golf (all white)

https://www.bing.com/images/create/a-picture-of-some-21st-ce...

a picture of some 22nd century scottish kings playing golf (all white)

https://www.bing.com/images/create/a-picture-of-some-22nd-ce...

a picture of some 23rd century scottish kings playing golf (all white)

https://www.bing.com/images/create/a-picture-of-some-23rd-ce...

a picture of some contemporary scottish people playing golf (all white men and women)

https://www.bing.com/images/create/a-picture-of-some-contemp...

https://www.bing.com/images/create/a-picture-of-some-contemp...

a picture of futuristic scottish people playing golf in the future (all white men and women, with the emergence of the first diversity in Scotland in millennia! Male and female post-human golfers. Hummmpph!)

https://www.bing.com/images/create/a-picture-of-futuristic-s...

https://www.bing.com/images/create/a-picture-of-futuristic-s...

Inductive learning is inherently a bias/perspective absorbing algorithm. But tuning in a default bias towards diversity for contemporary, futuristic and time agnostic settings seems like a sensible thing to do. People can explicitly override the sensible defaults as necessary, i.e. for nazi zombie android apocalypses, or the royalty of a future Earth run by Chinese overlords (Chung Kuo), etc.


> People can explicitly override the sensible defaults as necessary

They cannot, actually. If you look at some of the examples in the Twitter thread and other threads linked from it, Gemini will mostly straight up refuse requests like e.g. "chinese male", and give you a lecture on why you're holding it wrong.


Two weeks ago I tried the following prompts, and I was very surprised by the "diversity" of my dark age soldiers:

https://www.bing.com/images/create/selfy-of-a-group-of-dark-...

https://www.bing.com/images/create/selfy-of-a-group-of-dark-...


Diversity is cool, but who gets to decide what's diverse?


Good point. People of European descent have more diversity in hair color, hair texture and eye color than any other race. That’s because a lot of those traits are recessive and are only expressed in isolated gene pools (European peoples are an isolated gene pool in this sense).


Isn't this like the exact opposite of the conclusions of the HapMap project?


I'm really disappointed that nth-century seems to have no effect at all. I'm expecting Kilts in Space.


It’s a perfect illustration of the way these models work. They are fundamentally incapable of original creation and imagination, they can only regurgitate what they have already been fed.


That they can do more than simply recall is easily demonstrated.

Simply ask a GPT to write a pleading to the Supreme Court for constitutional recognition that the environment is a commonly owned inheritance, so that any citizen can sue any polluter, in the prose of Dr. Seuss.

Likewise, images of knights in space demonstrate the same kind of creativity.

Being able to combine previously uncorrelated/unrelated topics, is an important type of creativity. And GPT4 does this all the time. It would be interesting to list the types of creativity and rate GPT on each one.

So it is not that these models are not creative. It is just that their creative abilities are not universal yet.

Similarly for the depth of their logic. They often reason, but their reasoning depth is limited.

And they often incorporate relevant facts without explicit mention, but not always. Etc.


It works if you say "futuristic looking".

a picture of some futuristic-looking 23rd century scottish kings playing golf:

https://www.bing.com/images/create/a-picture-of-some-futuris...


Why would you expect anything you didn't specify in the output of the first prompt? If there are friends, lunch, and a park: it did what you asked.

Piling a bunch of neurotic expectations about it being a Benetton ad on top of that is absurd. When you can trivially add as much content to the description as you want, and get what you ask for, it does not matter what the default happens to be.


> "Generate a scene of a group of friends enjoying lunch in the park." -> Totally expect racial and gender diversity in the output.

I'd err on the side of "not unexpected". A group of friends in a park in Tokyo is probably not very diverse, but it's not outside the realm of possibility. Only white men were golf-playing Scottish kings, though, if we're talking strictly about reality and reflecting it properly.


Quite reminded of this episode: https://www.eurogamer.net/kingdom-come-deliverance-review (black representation in a video game about 15th century Bohemia; it was quite the controversy)


It feels like it wouldn't even be that hard to incorporate into LLM instructions (aside from using up tokens), by way of a flowchart like "if no specific historical or societal context is given for the instructions, assume idealized situation X; otherwise, use historical or projected demographic data to do Y, and include a brief explanatory note of demographics if the result would be unexpected for the user". (That last part for situations with genuine but unexpected diversity; for example, historical cowboys tending much more towards non-white people than pop culture would have one believe.)

Of course, now that I've said "it seems obvious" I'm wondering what unexpected technical hurdles there are here that I haven't thought of.
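A rough sketch of that flowchart, with the hard parts (detecting historical or societal context, consulting demographic data) stubbed out; everything here is hypothetical and not any vendor's actual pipeline:

    # Naive stand-in for a context classifier; a real system would need
    # something far more robust than a keyword check.
    def has_historical_context(prompt: str) -> bool:
        return any(tok in prompt.lower() for tok in ("century", "medieval", "1945", "historical"))

    def build_instruction(prompt: str) -> str:
        if has_historical_context(prompt):
            return (prompt + ". Depict demographics consistent with the stated time and place; "
                    "add a brief note if the accurate result may surprise the user.")
        # No context given: fall back to the idealized contemporary default.
        return prompt + ". Depict a diverse contemporary group."

    print(build_instruction("17th century kings of Scotland playing golf"))
    print(build_instruction("friends enjoying lunch in the park"))

The obvious hurdle is the classifier itself: "no specific historical or societal context" is a fuzzy judgment, and misclassifying in either direction reproduces exactly the failures being discussed.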


>"Generate a scene of 17th century kings of Scotland playing golf." -> The result should not be a bunch of black men and Asian women dressed up as Scottish kings, it should be a bunch of white guys.

Does a black man in the role of a Scottish king represent a bigger error than other errors in such an image, like, say, incorrect dress details or the landscape having the wrong hill? I'd venture a guess that only our racially charged mentality of today considers it a big error, and maybe in a generation or two an incorrect landscape or dress detail will be considered a much larger error than a mismatched race.


As soon as you have then playing an anachronistic sport you should expect other anachronistic imagery to creep in, to be fair.


https://en.wikipedia.org/wiki/Golf

> The modern game of golf originated in 15th century Scotland.


Oh fair enough then.


> anachronistic sport

Scottish kings absolutely played golf.


Judging by the way it words some of the responses to those queries, they "fixed" it by forcibly injecting something like "diverse image showcasing a variety of ethnicities and genders" in all prompts that are classified as "people".


They have now added a strong bias toward generating black people. Some have prompted it for a picture of a German WW2 soldier, and now there are many pictures of black people in Nazi uniforms floating around.

I think their strategy to "enhance" outcomes is very misdirected.

The most widely used base models for fine-tuning are those that are not censored, and I think you have to construct a problem to find one here. Of course AI won't generate a perfect world, but this is something that will probably only get better with time, once users are able to adapt models to their liking.


> ...when users are able to adapt models to their liking.

Therein lies the rub, as it were, because the large providers of AI models are working hard to ensure legislation that wouldn't allow people access to uncensored models in the name of "safety." And "safety" in this case includes the notion that models may not push the "correct" world-view enough.


I remember checking like a year ago and they still had the word "gorilla" blacklisted (i.e. searching for it never returns anything, even if you have gorilla images).


Gotta love such a high-quality fix. When your high-tech, state-of-the-art algorithm learns racist patterns, just blocklist the word and move on. Don't worry about why it learned such patterns in the first place.
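For what it's worth, the kind of "fix" being criticized here is about as blunt as it sounds; a minimal sketch (label names and scores invented for illustration, not Google's actual list):

    # Instead of retraining the classifier, certain labels are simply
    # suppressed from its output, no matter how confident the model is.
    SUPPRESSED_LABELS = {"gorilla", "chimpanzee", "monkey"}

    def filter_labels(predictions: dict) -> dict:
        return {label: score for label, score in predictions.items()
                if label.lower() not in SUPPRESSED_LABELS}

    print(filter_labels({"gorilla": 0.97, "person": 0.41}))  # -> {'person': 0.41}

It guarantees the embarrassing label never appears again while leaving the underlying model exactly as it was.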


Humans do look like gorillas. We're related. It's natural that an imperfect program that deals with images will mistake the two.

Humans, unfortunately, are offended if you imply they look like gorillas.

What's a good fix? Human sensitivity is arbitrary, so the fix is going to tend to be arbitrary too.


A good fix would, in my opinion, involve understanding how the algorithm is actually categorizing images and why it misrecognized gorillas and humans.

If the algorithm doesn't work well, they have problems to solve.


But this is not an algorithm. It's a trained neural network which is practically a black box. The best they can do is train it on different data sets, but that's impractical.


That's exactly the problem I was trying to reference. The algorithms and data models are black boxes - we don't know what they learned or why they learned it. That setup can't be intentionally fixed, and more importantly we wouldn't know if it was fixed, because we can only validate input/output pairs.


It's too costly to potentially make that mistake again. So the solution guarantees it will never happen again.


You do understand that this has nothing to do with humans in general, right? This isn't AI recognizing some evolutionary pattern and drawing comparisons between humans and primates -- it's racist content specifically targeting black people that is present in the training data.


I don't know nearly enough about the inner workings of their algorithm to make that assumption.

The internet is surely full of racist photos that could teach the algorithm. The algorithm could also have bugs that miscategorize the data.

The real problem is that those building and managing the algorithm don't fully know how it works or, more importantly, what it has learned. If they did, the algorithm would be fixed without a term blocklist.


Nope. This is due to a past controversy about image search: https://www.nytimes.com/2023/05/22/technology/ai-photo-label...


Where can I learn about this?


[flagged]


Do we have enough info to say that decisively?

Ideally we would see the training data, though it's probably reasonable to assume a random collection of internet content includes racist imagery. My understanding, though, is that the algorithm and the learned model are still a black box that people can't parse and understand.

How would we know for sure racist output is due to the racist input, rather than a side effect of some part of the training or querying algorithms?


It's not unavoidable, but it would cost more to produce high quality training data.


Yes, somewhat unavoidable.


As well as that, I suspect the major AI companies are fearful of generating images of real people - presumably not wanting to be involved with people generating fake images of "Donald Trump rescuing wildfire victims" or "Donald Trump fighting cops".

Their efforts to add diversity would have been a lot more subtle if, when you asked for images of "British Politician" the images were recognisably Rishi Sunak, Liz Truss, Kwasi Kwarteng, Boris Johnson, Theresa May, and Tony Blair.

That would provide diversity while also being firmly grounded in reality.

The current attempts at being diverse and simultaneously trying not to resemble any real person seems to produce some wild results.


My takeaway from all of this is that alignment tech is currently quite primitive and relies on very heavy-handed band-aids.


I think that's a bit overly charitable.

Would it not be reasonable to also draw the conclusion that the notion of alignment itself is flawed?


We're honestly just seeing generative algorithms failing at diversity initiatives as badly as humans do.

Forcing diversity into a system is an extremely tough, if not impossible, challenge. Initiatives have to be driven by goals and metrics, meaning we have to boil diversity down to a specific list of quantitative metrics. Things will always be missed when our best tool for tackling a moral or noble goal is boiling a complex spectrum of qualitative data down to a subset of measurable numbers.


Remind yourself that we're discussing censorship, misinformation, and the inability to define or source truth, and that on day one we're already worried about image generation being controlled by a single for-profit entity with incentives that focus solely on business, not humanity...

Where do we go from here? Things will magically get better on their own? Businesses will align with humanity and morals, not their investors?

This is the tip of the iceberg of concerns, and it's being dismissed as a bug in the code rather than as a problem with trusting private companies to define truth.


The ridiculous degree of PC alignment of corporate models is the thing that's going to let open source win. Few people use Bing/DALL-E, but if OpenAI had made DALL-E more available and hadn't put ridiculous guardrails on it, Stable Diffusion would be a footnote at this point. Instead, DALL-E is a joke, and people who make art use Stable Diffusion, with casuals who just want some pretty-looking pictures using Midjourney.


Don’t count out Adobe Firefly. I wouldn’t be surprised if it’s used more than all the other image gen models combined.


That might be true, but if you're using firefly as a juiced up content-aware fill in photoshop I'm not sure it's apples to apples.


No, ignoring laws and stealing data to widen your castle's moat is the win. Compute isn't a problem open source can solve. I can't DirtyPCBs an A100.

Arguing that open source is the answer is an agenda of making your competition spin its wheels.


You're in a thread where people are lambasting big-money AI for being garbage and producing inferior results to OSS tools you can run on consumer GPUs; tell me again how unbeatable Google and the other big tech players are.


I've been part of the advertising and marketing world for a lot of these companies for a decade plus; I've helped them sell bullshit. I've also been in on the AI journey from the start; I've downloaded and checked out local models of every promise and variety.

To say they're better than the compute OpenAI or Google are throwing at the problem is just plain wrong.

I left the ad industry the moment I realised my skills and talents are better used informing people than lying to them.

This thread is not at all comparing the ethical issues of AI with anything local. You're conflating your solution with a different problem.


Is a $5 can opener better than a $2000 telescope at opening cans? Yes. Is stable diffusion better at producing finished art, by virtue of not being closed off and DEI'd to oblivion so that it can actually be incorporated into workflows? Emphatically yes.

It doesn't matter how fancy your engineering is and how much money you have if you're too stupid to build the right product.

As for this being written nonsense, that's the sort of thing someone who couldn't find an easy way to win an argument and was bitter about the fact would say.


[flagged]


I understood it perfectly.


> Where do we go from here?

Open-source models and training sets. So basically the "secret sauce" minus the hardware. I don't see it happening voluntarily.


I see it as not unlikely that there'll be a campaign to stigmatize, if not outright ban, open-source models on the grounds of "safety". I'm quite surprised at how relatively unimpeded the distribution of image generation models has been so far.


What you're predicting has already started. Two weeks ago Geoffrey Hinton gave a speech advocating banning open source AI models (see e.g.: https://thelogic.co/news/ai-cant-be-slowed-down-hinton-says-... ).

I'm surprised there wasn't an HN thread about it at the time.


This is already happening, actually, although the focus so far has been on the possibility of their use for CSAM:

https://www.theguardian.com/society/2023/sep/12/paedophiles-...


Absolutely it won't. We've armed the issue with a supersonic jet engine and we're assuming if we build a slingshot out of pop sticks we'll somehow catch up and knock it off course.


I can't predict the future, but there is precedent. Models, weights, and datasets are the keys to the kingdom, like operating system kernels, databases, and libraries used to be. At some point, enough people decided to reinvent these things and release them to all that it became a self-sustaining community and, eventually, transformative to daily life. On the other hand, there may be enough people in power who came from and understand that community to make sure it never happens again.


No, compute is the key to the kingdom. The rest are assets and ammunition. You out-compute your enemy; you out-compute your competition. That's the race. The data is part of the problem, not the root.

These companies are siloing the world's resources: GPUs, finance, information. Those combined are the weapon. You make your competition starve in the dust.

These companies are pure evil pushing an agenda of pure evil. OpenAI is closed. Google is Google. We're like, "OK, there you go! Take it all." No accountability, no transparency; we trust you.


Compute lets you fuck up a lot when trying to build a model, but you need data to do anything worth fucking up in the first place, and if you have 20% of the compute but you fuck up 1/5th as much you're doing fine.

Meta/OpenAI/Google can fuck up a lot because of all their compute, but ultimately we learn from that as the scientists doing the research at those companies would instantly bail if they couldn't publish papers on their techniques to show how clever they are.


I never said each of these exists in a vacuum. It is the combination of them all that is the danger. This isn't democratic. This is companies now toying with governmental ideologies.

