Why Does This Horrifying Woman Keep Appearing in AI-Generated Images? (vice.com)
122 points by pseudolus on Sept 8, 2022 | 112 comments



The woman appeared once, and they kept combining that image with different prompts, so she stayed in the subsequent images.

https://twitter.com/supercomposite/status/156716492990339481... https://twitter.com/sheslostheplot/status/156737091948789350...


> I was ripping Loab apart, and putting her back together. She is an emergent island in the latent space that we don't know how to locate with text queries. But for the AI, Loab was an equally strong point of convergence as a verbal concept. And really, it was usually stronger!

The nonsense words of someone who really wants to be the creator of the next big creepypasta.


> The nonsense words of someone who really wants to be the creator of the next big creepypasta.

It took them about 24 hours to come up with a merch store[1] for this character, too.

[1] https://twitter.com/supercomposite/status/156756520182584525...


I suppose they might as well while they’ve got their 5 minutes of fame.


Or someone who's pressing buttons and has absolutely zero fucking idea of what the underlying tech is and does.


> But I can confirm Loab exists in multiple image-generation AI models

Or they're lying and this is just a stunt to generate buzz.


Reminds me of most of the humanities.


That’s a very ignorant statement.


Italians have a precise word for that: "supercazzola". It's when someone deliberately uses nonsense like this to sell something or to simulate some kind of knowledge.


That word makes me hungry


oh, really? it translates to "super cock sucker" :D


> She is an emergent island in the latent space that we don't know how to locate with text queries

That is definitely a tall claim that is short on evidence. We expect there to be countless islands in latent space; this is basically a concept related to grandmother neurons, although instead of a "Jennifer Aniston neuron" we have this "Loab neuron", which has no particular obvious origin. But I don't think there is anything privileged or special about this cluster.

https://en.wikipedia.org/wiki/Grandmother_cell


They forgot to mention how the metaverse created her, but now she has escaped into the real world!


Thanks for the links. It boggles the mind that an article about the Twitter thread and its images would not link to the thread nor show more than a couple of images. All the references in the Vice article are only to other Vice articles by the same writer...

The URL should probably point to the Twitter thread in question; the article adds pretty much nothing and omits dozens of images and commentary by the creator.


I had the opposite thought. Couldn’t properly see the images because I don’t have twitter.

I even signed up recently just so I could click links and they immediately locked my account for being a bot.


Replace “twitter.com” with “nitter.net” in the URL and you can see a mirror.


The interesting bit is not that it remains there, but that it seems to generate dark images (gore, violence) regardless of the input.


I think it is somewhat interesting that it remains there, apparently after many many "generations" of combining with other images.

It's also interesting to consider what this image was originally "far" away from.

There are a couple of good followup threads exploring how such a thing might come about in the "latent space" of the model:

https://twitter.com/jthteo/status/1567418273628655617

https://twitter.com/mattskala/status/1567300206969982979


For me it kinda makes sense: the portrait/person is an important feature of the image, so it's heavily weighted in the generation, producing another image with that portrait/person, and so on. I find it more interesting that the visage itself is still recognizable between generations, as if it's not "any person", but a very specific visage.


I know very little about this stuff, but doesn’t that just mean that the algorithm considers the original image to contain gore/violence?


It at least considers the original image adjacent to gore/violence, but there's something uncanny about how the "Loab" image itself is not a gory or violent image. There's something funky going on along a "concept axis" involving the upside-down triangles on the face, the rosacea, and the overall bleak composition of the image.


The image looks a lot like the haggard female protagonists or victims in a lot of horror movies so it could just be a composite of those. The rosacea might be due to it confusing blood splattered faces with medical images. Honestly it just looks like a still image from a horror trailer to me.


Isn’t this just a result of the negative weights?


It is also important to note that they are specifically using features distant from one another (they are manipulating the latent rather than using textual prompts to reproduce the face). It really does feel engineered to be creepy. Looking at the images closely, a big reason for the creepiness seems to be the combination of strong masculine and strong feminine features; because our brains are good at detecting human faces, we instantly realize that something is off. Then there are the shadows on the face. The lighting is all wonky: shadows are cast at weird angles, either darker around the eyes than they should be or the opposite, and the same goes for the mouth and chin area. Stable Diffusion and Midjourney aren't very good at human faces (specifically eyes, mouths, and skin textures), so that's going to add an extra creepy factor, especially combined with the negative weighting.

But you are spot on in noting that Vice is engaging in clickbait and misinformation. The tweets are purposefully being creepy because that is the art style of cryptids. But as your second link clearly shows, people aren't going to randomly find images like this (and let's be real, I wouldn't recognize this as the same person unless you told me they were). Stop engaging in melodrama Vice. You can't do that if you're claiming to report news.


> Stop engaging in melodrama Vice.

what's the point in telling a fish to stop swimming


Maybe others will stop following the fish.


Is this just a new AI ghost story, or is there actually something scientifically repeatable here? Can we please get real nerdy about this?

The article broadly describes a series of prompts, but do we have enough information to figure out which AI engine was used, reverse engineer some likely prompts, and try to produce similar results (not exactly the same, as that may not be possible with AI prompts)?

Is it even possible to ask an AI image generator to "produce the opposite" of a prompt?

Is this just an RTFM moment (for me)? Or is "producing the opposite" a misunderstanding of how weights work? I only have experience with Midjourney, but my understanding is that with Midjourney you can weight various prompts as ratios. For example, I can build up a prompt to generate an image that is 2 parts "autumn landscape" and 3 parts "birthday cake". But with ratios, isn't it true that negative weights just discard that prompt? They don't produce "the opposite", right?


From Supercomposite's original Twitter post, we can see they claim to start off with a negatively weighted prompt of "Brando::-1".

To me, this looks like Midjourney prompts. At least we can say that it is valid Midjourney syntax and it is using weights, but probably not to the effect the author portrays.

for example, if I prompt "/imagine autumn landscape::2 birthday cake::3", I get this image[1].

But now if I tweak the prompt to "/imagine autumn landscape::-0.5 birthday cake::3", I get this image [2].

Critically, there is no trace of "opposite of an autumn landscape" in this image. It's all birthday cake... 100%.

Indeed, this lines up with some Midjourney documentation[3]. A negative weight will try to remove the thing in the prompt.

BUT: what happens when we have only one prompt and use a weight to negate it in Midjourney, i.e. "/imagine Brando::-1"? At least for me, I get this error:

"Invalid parameter

The sum of all of the prompt weights must be positive"

So I'm inclined to conclude that Supercomposite's post is more an act of creative storytelling than an accurate portrayal of their interactions with Midjourney.

But I am still left wondering if there is a true "opposite of" operator in any AI image generator.

[1]: https://cdn.discordapp.com/attachments/1006400576067739749/1...

[2]: https://cdn.discordapp.com/attachments/1006400576067739749/1...

[3]: https://midjourney.gitbook.io/docs/imagine-parameters#prompt...
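
For what it's worth, the closest thing to a real "opposite of" operator I know of outside Midjourney is the negative prompt in Stable Diffusion front-ends, which pushes the sampler away from a concept during classifier-free guidance rather than toward any literal opposite. Here's a minimal sketch, assuming the Hugging Face diffusers API (its negative_prompt argument) and the public SD v1.4 checkpoint, i.e. not whatever setup Supercomposite actually used:

    import torch
    from diffusers import StableDiffusionPipeline

    # Public v1.4 checkpoint, assumed here purely for illustration.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # The negative prompt is used as the "unconditional" side of classifier-free
    # guidance, so every denoising step is steered *away* from this concept.
    # That removes Brando-like features; it does not compute a semantic opposite.
    image = pipe(
        prompt="a portrait photo",
        negative_prompt="Brando",
        guidance_scale=7.5,
    ).images[0]
    image.save("not_brando.png")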


I mean, they might have modified the Midjourney code to patch out the error message. Presumably the programmers put the check in because results for negative weight sums become nonsensical, not because it's impossible.

Thinking in terms of classifiers, an image is almost never categorized as exactly 0% something; instead it's a positive value. Negative weights would make the net optimize for the smallest possible percentage. For the birthday cake, the negative weight is not strong enough to favor any "anti-autumn-landscape" patterns, only to remove features associated with autumn landscapes. But for weights that are all negative, it's plausible that the system will produce all the features that, in the training data, are anticorrelated with the prompt.
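
To make that concrete, here is a toy sketch of how weighted prompts might combine under classifier-free guidance. This is a guess at the mechanism (the function and combination rule are mine, not Midjourney's actual code), but it shows why a mixed-sign prompt mostly just suppresses features while an all-negative prompt points the whole update away from the concept:

    import numpy as np

    def combine_guidance(eps_uncond, eps_conds, weights, scale=7.5):
        # eps_uncond: unconditional noise prediction
        # eps_conds: one noise prediction per prompt chunk
        # weights: e.g. [2, 3] for "landscape::2 cake::3", or [-1] for "Brando::-1"
        direction = sum(w * (e - eps_uncond) for w, e in zip(weights, eps_conds))
        return eps_uncond + scale * direction

    rng = np.random.default_rng(0)
    eps_u, eps_a, eps_b = rng.normal(size=(3, 4, 64, 64))

    # "autumn landscape::-0.5 birthday cake::3": the update is still dominated by
    # the cake term; the negative term only cancels landscape-like features.
    mixed = combine_guidance(eps_u, [eps_a, eps_b], [-0.5, 3])

    # "Brando::-1": the entire update points away from the prompt, toward whatever
    # the model learned as anticorrelated with it -- plausibly why Midjourney
    # rejects prompts whose weights sum to a non-positive number.
    negative_only = combine_guidance(eps_u, [eps_a], [-1])
    print(mixed.shape, negative_only.shape)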


There's a comment thread diving into why the concept is so persistent and why it appears on negative queries:

https://twitter.com/mattskala/status/1567300206969982979?t=C...


Following some stuff from that person, I came across this, a post trying to replicate Loab with various prompts. Shows visually what might be going on. https://twitter.com/chrysopoetics/status/1567673870433546242


This is all ghost stories. The internet is a schizophrenic and is deeply into examining itself


> Can we please get real nerdy about this?

So as someone who works in generative modeling I'll give my best guess as to what is done and what is happening. It is a guess because they don't say everything, but there are some hints. Scambier linked these two twitter threads[0][1] which can give us some insight.

> I'll explain negative prompt weights, in case you don't know. With these, instead of creating an image of the text prompt, the AI tries to make the image look as different from the prompt as possible.

What's important here is that the machine doesn't actually know what the opposite is. In fact, I would argue we don't either. What you can do is use Lp distances from a latent representation. This is where things start to make sense. If, under that measure, faces end up a large distance away, it is also unsurprising that we find many different facial characteristics. These first images look like there is a heavy mixture of strong masculine and strong feminine features. These are not things we typically see in reality, and combined with our hyperactive brains for recognizing other human faces, we enter the uncanny valley.

Next, I don't know whether this was done on purpose or not, but there are very clear issues with scene lighting. I can totally believe that this is not on purpose, because this is something generators are already bad at. So we have shadows cast along the face in unnatural ways, upping the creepiness factor.

Now we need to look at important features for recognizing faces: eyes, mouth, and nose. You may have noticed that text to image generators are typically really bad at these. Generators are also typically bad at facial symmetry (why we're trying to get transformers in, but this still isn't working to the degree we would like). In fact, I actually find it more interesting that these are coherent given the explanation of how the latent representation was created.

So I think we have good explanations as to why this would turn creepy very fast. Especially given the hype and that the creator is leaning into it. But these are my best guesses. I can't really know without seeing what is done.

But honestly, I am super interested and would like to see these latent representations and play around with them. This could be a good thing to investigate if you are trying to determine how smooth the latent manifold is, which is extremely important if we're going to make deeper content contributions and rely less on our prompt engineering. Maybe I'll have to play with some negative prompts (if I can find the time lol).
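
One more toy illustration of the Lp-distance point: "maximally far from the prompt" is not a single point but a huge region, and where you land inside it depends on where the model's learned image distribution actually has mass. The dimension and vectors below are made up purely for illustration:

    import numpy as np

    rng = np.random.default_rng(42)
    d = 768                                  # illustrative embedding width
    prompt_vec = rng.normal(size=d)          # stand-in for the prompt's latent

    candidates = rng.normal(size=(10_000, d))
    dist = np.linalg.norm(candidates - prompt_vec, axis=1)   # L2 distance
    farthest = candidates[np.argsort(dist)[-100:]]           # 100 farthest points

    # The farthest candidates are not all alike: "as different as possible from
    # the prompt" leaves many directions open, so the sampler settles wherever
    # the learned density happens to be, not at any meaningful "opposite".
    print("mean spread among farthest candidates:", farthest.std(axis=0).mean())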

[0] https://twitter.com/supercomposite/status/156716228808747008...

[1] https://twitter.com/sheslostheplot/status/156737091948789350...


My first reaction was that it's surprisingly close to the traditional "average face" [0] generated by overlaying photos that we've seen in so many articles.

My second reaction was to look for articles that don't average out young faces and the result gets even closer. [1]

The traditional simple averaging process removes most wrinkles, rosacea and blemishes because those differ between individuals but the facial proportions match Loab well.

It appears that the AI, when negatively weighted, gives the most average possible result that still matches the concept of a face, while picking the most unpopular levels of skin texture and lighting.

It's basically Courteney Cox at 70 after a ten-year whiskey bender.

[0] https://petapixel.com/2011/02/11/average-faces-of-women-in-4...

[1] https://www.researchgate.net/figure/Average-faces-of-young-2...
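
For anyone curious, the "traditional simple averaging" really is just a pixel mean over aligned photos. A quick sketch (file names hypothetical) of why per-person detail like wrinkles and rosacea washes out while shared proportions survive:

    import numpy as np
    from PIL import Image

    # Pre-aligned, same-size face crops (hypothetical files).
    paths = ["face_01.jpg", "face_02.jpg", "face_03.jpg"]
    stack = np.stack([np.asarray(Image.open(p).convert("RGB"), dtype=np.float32)
                      for p in paths])

    # Averaging keeps what the faces share (proportions) and cancels what
    # differs between individuals (wrinkles, blemishes, skin texture).
    mean_face = stack.mean(axis=0)
    Image.fromarray(mean_face.astype(np.uint8)).save("average_face.png")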


These are good points! I think another important aspect is that the links you show are only averaging faces of a single gender. The first twitter link to me seems to have features that are both masculine and feminine.


Just to be clear:

> “I can't confirm or deny which model it is for various reasons unfortunately! But I can confirm Loab exists in multiple image-generation AI models,” Supercomposite told Motherboard.

this is almost certainly a creepypasta which Vice is, for some reason, reporting as if it were real.

He probably found a creepy picture and then started using it as a seed or something like that.


I'm surprised it's taken this long for someone to produce the first creepypasta of the image-generating AI era, if "era" is the right word. I'm not all that surprised Vice reported on it so credulously.


I spent longer than I care to admit trying to figure out if "Loab" was some word play that Vice had not picked up on.


So "why" is just "idk" and it seems like the person generating them found her on accident once, and then kept trying to find her again. I'm a fan of a good creepypasta and honestly a memetic SCP that "lives" in the models of AI sounds pretty tight to me. But this ain't it.

I'll admit I was expecting the "why" (aside from SCP...) to be something like "turns out DALL-E gets live humans and corpses mixed up and in macabre images the live humans get a bit more corpse-y due to this".


If you like this, you'll want to see this AI-generated music video if you haven't already:

https://www.teddit.net/r/nextfuckinglevel/comments/x6d3c3/ai...


Just an FYI: it's pointed out in the comments there[1] that the video is not fully AI-generated. It's a video of a real girl to which some AI was applied afterwards.

1: https://teddit.net/r/nextfuckinglevel/comments/x6d3c3/ai_gen...


I clicked this thinking "how bad could it be?" and closed it as fast as I could. This is nightmare fuel, probably don't click it.


As a counterpoint, everyone should absolutely click it, yes it's a bit creepy but it's not NSFL or anything and as generated art it is amazing, and needs to be studied and appreciated more.


I get squicked out pretty easily, and I didn't find this too aversive. Art-wise, it's absolutely mesmerizing. I could definitely see this technique becoming a staple of the black metal / witch house / etc. genres.


This just looks like coming down off LSD. Google's deep dream videos looked like coming up.

I guess it's useful to have an accurate video simulation so you can decide whether it's for you.


i have tripped dozens of times and never seen something like this. the visual mechanics are trippy but to see something like that i'd probably have to be in a really bad head space watching horror movies during the peak. never tried that, though. prefer to stick with documentaries and funny movies.


Seems odd to be unable to watch this rather tame video tbh. I would consider trying to transcend it rather than cultivate such a low threshold of fear.


Well that shit is gonna haunt me for a while…


Thanks I hated it, but also it is lowkey amazing.

The fluidity of the generation seems way more "natural" than most CGI.

Definitely the way forward for every horror video game.


> The fluidity of the generation seems way more "natural" than most CGI.

It’s probably because it’s not fully AI-generated. It’s essentially a real video with a filter applied to it[1]

1: https://teddit.net/r/nextfuckinglevel/comments/x6d3c3/ai_gen...


It would be really a cool experiment to have AI generated visuals real-time in a video game.


It’s almost like a cancer got into it and metastasized into all of the frames.


is it the same artist / same woman ?


I asked DALL-E to draw a self portrait of itself and its family one time, and the results still make my hairs stand up.

DALL-E itself was Clippy-like, blue, and bean shaped with eyes and mouth more expressive than a Zuckerberg VR avatar.

As for the family? There was none. DALL-E rendered itself on a pure black background


Too bad it didn't do something like one of Dalí's self-portraits with bacon.

https://www.salvador-dali.org/en/museums/dali-theatre-museum...


Can you please share the image if that's okay?


https://i.ibb.co/7y08y2H/unknown.png

I left out the bit about how DALL-E also thinks maybe it's a honey badger, or a lemur with all-black eyes, or some sort of sharp wood demon.

The blue one is the one that haunts me because it seems to be the most plausible


Hah! I love the one in the bottom-left that looks like it's flipping us off.

This reminds me of playing around with Craiyon a few weeks back. It seemed to think of itself as a coyote who loves soccer. Of course, the training datasets aren't going to have data on dalle or craiyon, so those prompts are going to be more based on the other words, with some randomness


Crayon-dice is a perfect avatar for an image drawing AI.


The algorithm may be referencing Craiyon, the dall-e lite


DALL-E 2's training predates Craiyon, I believe.


no doubt includes it now?


They tend to be somewhat transparent about changing their models. Having said that it’s possible they could have finetuned it since then.


> I asked DALL-E to draw a self portrait of itself and its family one time, and the results still make my hairs stand up.

DALL-E and other image-generation models are not conscious, and they aren't even intelligent. Stop anthropomorphizing them; it's not helpful. We will likely face this problem for real in the coming decades, but there's no sense in doing so with current models.


I don't subscribe to the AI Boogeyman theories. Consciousness is some made up philosophical macguffin. I don't care if a bunch of phds can align on a definition for hokum and then tell me if one robot has it or not.

Asking things to draw themselves is fun, I could just as easily ask an elephant holding a paint brush to paint itself and then enjoy the outcome. That's pretty much all there is to it.


>Consciousness is some made up philosophical macguffin.

Well it's not totally made up. There is definitely something there, there is a real difference between the experience of being a rock and being a human.

The idea then of a P-zombie or some other version of a major intelligence operating with the lights off internally really is spooky.

>Asking things to draw themselves is fun, I could just as easily ask an elephant holding a paint brush to paint itself and then enjoy the outcome. That's pretty much all there is to it.

Agreed. But asking Dalle or whatever model to draw "The meaning of life" and thinking there is some kind of enlightenment in what it draws is ridiculous.


> Well it's not totally made up. There is definitely something there

"Consciousness" can be a (fairly vague) term encompassing concretely realizable things like train of thought, the ability to introspect those thoughts, an internal model of self, etc. Humans have these whereas a rock does not, and there's nothing in particular preventing AI from eventually having these.

"Consciousness" as something incorporeal that a physically identical being could lack is nonsense territory IMO (but already heavily debated by people far smarter than me). I don't see how whatever we mean by consciousness can make no physical difference when we're directly aware of it and talking about it in the physical world.


If you are having this conversation with me then you are a consciousness and I am a consciousness and that's as good a definition of consciousness as we are ever going to get. Consciousness thus defined exists entirely within the communicative medium.


If I understand it correctly, it's more like a big association model. So the word "pig" is associated with a lot of images of pigs, or images that have been tagged with that word, and it also has weighted associations with other things that have to do with pigs.

IF I understand it correctly :)
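
If it helps, that association idea is easy to poke at directly with CLIP, the text-image model DALL-E 2 builds on. A rough sketch using the Hugging Face transformers wrappers; the checkpoint name is the public OpenAI one and "pig.jpg" is a hypothetical local file:

    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
    processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

    image = Image.open("pig.jpg")  # hypothetical local image
    texts = ["a photo of a pig", "a photo of a tractor", "a photo of a cat"]
    inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)

    # logits_per_image holds image-text association scores; softmax turns them
    # into relative strengths of association with each phrase.
    probs = model(**inputs).logits_per_image.softmax(dim=1)
    print(dict(zip(texts, probs[0].tolist())))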


I asked craiyon for a self portrait. In two of the outputs it was showing a picture of someone holding someone else's portrait in front of their face.


They are simply too big, and their results too unpredictable, to reliably prevent harmful results.

That seems like a surprisingly deep philosophical statement on society in general. And IMHO, along the same lines, "reliably prevent harmful results" is not something that should be pursued to exhaustion.


The creators of the mainstream AI art generators went to great pains to prevent these tools from being used to generate pornography.

If you want to talk about statements on society in general, I think the fact that nudity is seen as more harmful than gore is a bigger problem than an AI generating either one.


I’d just like to point out that there’s no way to “win” this battle. If you release models or datasets that have or can predict porn, you will get an equal number of complaints from the people who think your model/dataset is irresponsible or only useful for smut.

In fact, the dataset used to train it received precisely this criticism on HN back when it was originally announced.


Loab looks remarkably similar to Billy the Puppet from Saw, which could explain its relation to gore and violence.


Excellent identification! This probably happens with every image with exacerbated cheek rosacea; it triggers connections to blood too.


I was also thinking the girl from "The Ring".


Maybe DALL-E just.. you know. Has a type.


lol. thank you for that.


She's the AI version of Null Island: https://en.wikipedia.org/wiki/Null_Island


Looks a bit like Toni Collette. She’s in a bunch of horror movies?


Were there screenshots of horror movies in the training data? That could explain a lot.


Looks like we discovered some kind of dark nexus in the collective subconscious. Image-to-text models could probably put words to this, which would help us loop back and understand exactly what collective concept we are looking at.

Someone here on HN has said that they use these AI models to explore the latent space of the human imagination. I would say that sounds exactly correct.


>Someone here on HN has said that they use these AI models to explore the latent space of the human imagination.

Bizarre. It's like saying that you are using the newly invented Bicycle to explore the concept of General Relativity. Woefully insufficient.


Was the bicycle constructed out of every thought ever recorded and every image ever created?


The bicycle contained a form of all engineering and physics knowledge of the time.


Good lord, I didn't realize that an AI generated image could be so shockingly frightening. To be honest the fact that DALL-E can do the things it does is very worrisome for me. I think we shouldn't go down this technological path.


Horrifying woman? I thought it was an Ozzy Osbourne picture.

EDIT: I'm not kidding.


It would not amaze me if you're spot on. Lots of images scraped off the 'net, especially celeb websites and quelle surprise.


Crungus is another one which keeps coming up.

Perhaps there is something the machines can see, but our psychology protects us from?


See, that is a good creepypasta concept.


Clearly we are looking at a Keter-class infohazard.


Anyone else not find this image disturbing at all? Looks like she is suffering from a severe skin condition. That's about it.


I think this just tapped into our collective subconsciousness...


If you play around with these tools the overwhelming majority of images are deeply disturbing. Melted faces, fused bodies, hands with unusual numbers of fingers. I think this machine produces objects deep in the uncanny valley.


I've spent more time than I'd like to admit with Dall-E, Midjourney and SD and I disagree with your sentiment. Sure, you can create a bunch of weird creepy shit, but by no means are the overwhelming majority of images deeply disturbing, as you put it.

Just scroll through https://lexica.art/ which catalogs 10M+ StableDiffusion images, most aren't even close to being disturbing and are in fact aesthetically quite pleasing.


They're very impressive, and while they all look great at a quick glance, I still find most of them disturbing when I look closely. https://lexica.art/?prompt=ec869dfc-b31e-4310-84bd-fab9d1c9e... is a good example: it looks OK at first, but then you realize that the limbs are freakish and mutated most of the time. For many of them, look at the eyes and teeth!


Those are selected images though, so not representative.


They are scraped from Discord, not cherry picked, so very much representative.


If they were scraped from Discord where others posted them, then they are scraped from a filtered source which contains only cherry-picked pictures that people chose to post, the ones they found interesting and good. It would be representative if and only if it also sampled the pictures they generated but did not post.


They're scraped from the Stable Diffusion bot which generated them. People didn't post the images themselves, so they're not cherry picked.


You get 4-9 images at a time and it takes literally 2 seconds to look and decide which to save. Saying it's "unusable" because you CAN get bad images is missing out..


"representative" is not the same concept as "unusable", so you should not put quotes around that text.

It's like going to the grocery, seeing a bin of beautiful apples, and saying "every apple on the tree is beautiful". You can't draw that conclusion, because those apples aren't representative of all apples: they went through a filtering process, in this case someone choosing to sell them. Upthread, someone choosing to share them.


The images up thread were scraped, not cherry picked. So it's more like walking through an apple orchard.


I think my primary point, "no one was addressing usability, just statistical power", still holds, but I concede your point.

Fun fact, apples grown for market get some level of sun shading to keep them pretty. Too much sun damages their skin and can give them that sort of leathery look. Which now that I think about it is a filter itself, hardly germane anymore though.


Cool!

Interesting. So I plugged in prompts for "Captain Kirk", and none of the images look much like William Shatner. More like a cross of an older Chris Pine with a tiny bit of Shatner thrown in...

https://lexica.art/?q=Captain+Kirk

Weird stuff


Eh. Looks like Ozzy Osbourne.


opened to check if she looks horrifying. yes, "she" does ...


The ghost in the machine.


this is dumb but I’m trying to play along and match the energy


A wicked witch!


Official statement, dunking on the normies:

https://twitter.com/supercomposite/status/156766407375967846...

> To clarify for the press (many are asking for an explanation without jargon): I have brought a real IRL demon to life. Research has found that demons are real and live inside of computers. Computers are like little houses for demons and church is like a big house for angels.


Let's not ask it for socio-political advice as it may tend towards Marxism-Leninism with some Maoism kicker.

What should we do about the energy crisis? Make them go out and chop wood.

What should we do about quiet quitting? Send them to the gulags.

What do we do about dissidents and people who disagree with us? Send them to the furthest gulags.

What about the poor wheat and corn production this year? Kill all the birds.



