Hacker News
Free AI-generated headshots put stock photo companies on notice (theverge.com)
189 points by alanwong 28 days ago | 78 comments

I used to fancy myself a starving artist (emphasis on "starving").

These pictures suffer from a common newbie-artist problem!

One pitfall of early-stage drawing I noticed is "symbol drawing". This is where a person knows how an eye should look, and where an eye should go on the face, and so draws an eye on a face-shaped oval. They repeat this for the other eye, a nose, a mouth, etc. They are often sad with the results.

A person who practices incorrectly will try to do photorealistic eyes, noses, mouths, etc but use the same 2D composition technique. No matter how realistically they can render components, their composition is off because they are ignoring the structure of a face that causes subtle differences in the shape of each of the components when projected onto a sheet of paper.

Our mind is so capable of 3D modelling that when we look at a face we don't notice the changes in perspective leading to changes in component shape and shading. The great artist skill is to build up those 3D to 2D projections and avoid the 2D composition (except in stylistic choices).
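To make the projection point concrete, here's a small numpy sketch (my own toy numbers, nothing from the article): the same 3D landmarks produce differently shaped 2D compositions as the head rotates, which is exactly the structural variation a "symbol" drawer ignores.

```python
import numpy as np

# Hypothetical face landmarks in 3D (x, y, z); values are made up.
landmarks_3d = np.array([
    [-0.3,  0.2, 0.1],   # left eye
    [ 0.3,  0.2, 0.1],   # right eye
    [ 0.0,  0.0, 0.3],   # nose tip
    [ 0.0, -0.3, 0.1],   # mouth
])

def rotate_y(points, degrees):
    """Rotate points about the vertical axis (a head turn)."""
    t = np.radians(degrees)
    r = np.array([[ np.cos(t), 0, np.sin(t)],
                  [ 0,         1, 0        ],
                  [-np.sin(t), 0, np.cos(t)]])
    return points @ r.T

def project(points, focal=2.0, camera_z=2.0):
    """Simple pinhole projection onto the image plane."""
    depth = camera_z - points[:, 2]
    return focal * points[:, :2] / depth[:, None]

frontal = project(landmarks_3d)
turned = project(rotate_y(landmarks_3d, 30))

# The projected eye-to-eye distance shrinks when the head turns,
# even though the 3D geometry is unchanged.
eye_dist = lambda p: np.linalg.norm(p[0] - p[1])
print(eye_dist(frontal), eye_dist(turned))
```

Someone drawing "symbolically" would place both eyes at the frontal spacing regardless of the turn, which is the composition error described above.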

I suspect that the networks rendering these faces, and the underlying biological phenomena they are designed against, are "symbol" recognizers, rather than "structural" recognizers.

Furthermore, I'd go so far as to say that the reason these faces "feel" weird sometimes is that your mind is reconstructing the structure of the fictitious persons' bodies and warning you that they have bones that are misshapen and lumps under their eyes, and teeth that point strangely, etc.

A biological response might be "misshapen eye ... likely parasite detected, avoid this person" or "mismatched emotional cues ... likely brain damage / person unpredictable, avoid"

> likely parasite detected, avoid this person

Definitely the case for this guy https://cdn.vox-cdn.com/uploads/chorus_asset/file/19216620/0...

There's no parasite that causes facial deformities, and mismatched emotional cues are a fundamental part of humour/acting. It makes as much sense to rationalise the unbidden disgust at unexpected facial structure with disease justifications as it does with the far more likely “person appears to have been damaged during youth/juvenile period by violence/poor nutrition & is of a low social class”.

Thanks for leading a vanguard of anti-racism against deepfakes.

Moving your left eye, nose, mouth, and right ear independently of each other is fundamental to acting?

"Unexpected" is a milder term than "never before seen on a human, nor interpolated".

> Thanks for leading a vanguard of anti-racism against deepfakes.

Buddy, I appreciate it. You know how hard it is when you’re surrounded by doofuses who can’t even read well enough to tell the difference between race and class and who don’t even understand how to read in good faith much less do it, and stomp all over themselves trying to epic own people with ‘facts’ that are just straw men created from their flawed understanding of existence outside of their bubble? It’s tough, not gonna lie.

Getting glasses for the first time at an adult age can have a similar distorting effect on facial perception for the first few months.

I think your point is that people will recognize these faces as not being real, or maybe at least feel like they are "off" in some way, but can't put their finger on it...

That being said, if you scroll through thousands of these, what you are saying does happen often, but even more often, the faces look completely real and I would have no problem believing they were real.

Maybe I missed your point though....

> [...] the faces look completely real and I would have no problem believing they were real.

I suppose that depends on who the viewer is. For example, everyone at work was blown away with how hard it was to pick the real face here: http://www.whichfaceisreal.com/index.php

However, I went through about 50 in under a minute and got every one correct.

Programmers spend their time thinking in abstractions and models, rather than appreciating things as they are, as an artist must (see Drawing on the Right Side of the Brain -- https://www.drawright.com/theory). I wouldn't be surprised if more artsy types have a much easier time recognizing these fake photos, despite all the component pieces roughly being where they should be.

Funny how 95% of the time the most obvious "wrong" thing is unrealistic teeth. Teeth have just as much variety as hair and skin, but the fake ones all have this mannequin-style bridgework (spoken as the son of a dentist).

For me I figured out how to get 100% just looking at the backgrounds, the fake images have monochrome backgrounds or subtle artifacts.

It used to be easy by comparing reflection on eyes - one was completely different than the other. Or teeth line. I can see they've improved it a lot. I can't tell the difference now.

Just about all of these images have a characteristic "swirl" somewhere. Weird ears and earrings, impossible hair textures, confused "hats", and nightmarish hands/other faces in the image are also clear tells, but that little swirl is usually there even in otherwise "perfect" images.

I wonder what causes it.

It's hard to learn high frequency gradients. It's easy to learn a single edge, but not a large pile of overlapping gradients like the ends of hair.

I don't know why it's so common to have an arbitrary nonsense "sticker" on a face, though. I wonder if it's because input photos were truncated at edges and the learner tried to model the edge as a facial feature instead of pruning its knowledge to a safe interior of the photo.

The Matrix of course!

If you scrolled through thousands of actual stock photos I'm sure you'd find plenty that seem "off" in some way. Those just cost a lot more.

Is my observation correct that this is most apparent when the face is not perfectly perpendicular to the display? I feel like even some tiny fraction of a degree, and I sense something is not quite right, while the images that seem most believable are perfectly perpendicular to the display.

Is it also difficult to get the face to align with the teeth? Nearly all of their pearly whites feel eerily disconnected to me.

That seems to be an artifact of the 3D model being non-existent: all the projected angles are wrong unless the output angle matches the common input angle.

That was my impression as well! It looks like those projects where a kid did the initial sketch and a skilled artist did a photo-real rendering.

Convolutional neural nets operate on low-level structure and do high-level structure largely by stacking together low-level structural processing.

The "right way" to do high-level structure for images is a fruitful active area of investigation which I expect will probably be tackled in the next 18 months.
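The "stacking" point can be made concrete: the region of input a unit can see (its receptive field) grows only layer by layer, so high-level structure emerges slowly from piles of local filters. A small sketch of the standard arithmetic (my own illustration, not from the thread):

```python
# Receptive field of a stack of identical convolution layers.
# With stride-1 3x3 convs, the field grows by 2 pixels per layer,
# so "seeing" a whole face requires many stacked local operations.
def receptive_field(num_layers, kernel=3, stride=1):
    rf = 1    # one output pixel sees this many input pixels
    jump = 1  # spacing of adjacent output pixels in input space
    for _ in range(num_layers):
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

for n in (1, 2, 5, 10):
    print(n, receptive_field(n))  # 3, 5, 11, 21
```

Strided or pooled layers grow the field much faster, which is one reason real architectures interleave them.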

I don't think it's possible to do without learning a hidden 3D model from 2D data.

It's nonsense to learn 2D projections directly from 2D projections, because small errors in the 2D projection can become catastrophic errors in the 3D model that humans use to interpret a 2D photo.

Learn, maybe. But it's easy to construct 3D information from 2D observations. See structure from motion, shape from shadow, voxel carving, and even orbit determination.
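As a concrete instance of recovering 3D from 2D, here's a minimal two-view triangulation (the core step of structure from motion), sketched with numpy. The cameras and the point are made up for illustration:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: find the 3D point whose projections
    through the 3x4 camera matrices P1, P2 match observations x1, x2."""
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                 # null vector of A, homogeneous point
    return X[:3] / X[3]        # back to inhomogeneous coordinates

# Two toy cameras: one at the origin, one translated along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])

truth = np.array([0.5, 0.2, 4.0])

def project(P, X):
    h = P @ np.append(X, 1.0)
    return h[:2] / h[2]

recovered = triangulate(P1, P2, project(P1, truth), project(P2, truth))
print(recovered)
```

With noise-free observations the recovered point matches the original; real pipelines add many views and bundle adjustment on top of exactly this step.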

>18 months.

Bold prediction. I wouldn't be surprised if this is an open problem in 18 years.

I'll go further:

For each problem there are a few sub-problems and a "right way" to do each of them where a "right way" corresponds to an algorithm which efficiently tackles some element of the implicit structure of the problem.

I think within 5 years we will have made great progress on algorithmically discovering these "right ways" across many problem domains. I think both differentiable neural architecture search[0] and the MAML with implicit gradients[1] stuff is very interesting from this perspective.



Also, as a starving artist who isn't starving because I have a non-artist job, I agree wholeheartedly. I've also done some research into this, and capsule networks are the closest SOTA we have to solving this problem. There isn't any 2D-projection-of-3D-space work with these networks AFAIK, but the purpose and intuition point in this direction.

Also, quality data is always the hard part here, so we might be able to bootstrap some of this structure learning with 3D models created for games, projected as 2D, and use that as transfer learning somehow.

In a sense, 3D models are the ultimate compression of 2D information, and therefore the most amenable to 2D transfer learning?

There are many models that are capable of generating 3D faces even from single photos.

For example:


or use the rotation of a real face as a guide:


It's not hyper-realistic, but it's getting better and better.

At many points in my life I have tried to do sketching and recognize my problems in your description of symbol drawing. Since you sound knowledgeable about it could you possibly recommend something that expands on that topic, and gives advice for learning to draw that is centered on that concept?

Learn to sculpt! For me, a few hours making faces from clay was all it took to teach my mind the mapping from 3d to 2d. This forces you to think from 2d to 3d by taking an image you have in your mind and forcing you to build 3d support for it.

Then, do still life sketches of what you sculpt, no matter how bad. This seemed to reinforce the other direction for me.

That and play Descent 2.

I've always struggled with drawing organic 3D things, and this is probably part of why.

> A person who practices incorrectly

So how does one practice correctly?

Life model drawing: drawing what you see instead of what you think it should look like.

Ahh. Is this also what makes a lot of high school art--and even adult art--look amateurish?


Looks almost perfect to me, it's just the eyes and hands that consistently seem messed up.

The eyes look really creepy somehow. It's like it is two different eyes.

This is about headshots, a very specific category of stock photos. I have a question about stock photos more generally:

If an AI company looked at thousands of examples of commercial, copyrighted stock photos, then created an AI that would make similar stock photos, but had a method that prunes or selects generated photos that are sufficiently distant from any example so as to not be obviously derivative, could they conceal their “theft” and sell their stock photos free of legal risk?

More generally, can AI wash itself clean of copyright infringement by showing that it was “inspired” but not derivative? I guess a judge could compel you to reveal your training set, but at some point do you think there will be general AI that can make the argument that it only seeks “inspiration” and does not “knock off”?
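Mechanically, the "sufficiently distant" pruning could look something like this hypothetical sketch: reject any generated sample whose nearest training example is too close. (The function names and threshold are mine; a real system would compare perceptual features rather than raw vectors, and none of this settles the legal question.)

```python
import numpy as np

def prune_derivative(generated, training, threshold):
    """Keep only generated samples whose distance to every training
    sample is at least `threshold` (a crude anti-derivative filter)."""
    kept = []
    for g in generated:
        dists = np.linalg.norm(training - g, axis=1)
        if dists.min() >= threshold:
            kept.append(g)
    return np.array(kept)

rng = np.random.default_rng(0)
training = rng.normal(size=(100, 8))        # stand-in feature vectors
generated = np.vstack([
    training[0] + 0.01,                     # near-copy: gets pruned
    rng.normal(size=(3, 8)),                # fresh samples
])
kept = prune_derivative(generated, training, threshold=0.5)
print(len(kept), "of", len(generated), "kept")
```

The interesting legal question is whether passing such a filter would actually mean anything in court, or whether the use of the training set itself is what matters.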

You've hit on a very confusing and debated topic :)

I tried to do some research into this a while ago. What I found is that you (generally, kind of) are able to use copyrighted work to make another work as a "transformative work". For example: you can look at an image of a person and use it as _inspiration_ to draw the same person (as long as you aren't tracing). However! That's still kind of a gray area.

ALSO: how does that apply if you are using the exact pixels of 1,000,000 images to make new ones? I don't think anyone has a definitive answer yet.

My guess is that it will have to be decided in some high profile court case before we get real answers :)

Yeah, I think that while this seems like a small question right now, it will be incredibly impactful how it is decided for the future.

Imagine if Michael Jackson made one album, but a week later AIs made thousands of albums that sounded just as good and original (or better!) but were categorized as inspirations. Imagine how that would change the incentives of creation to know it will be consumed by the hive mind in mere moments.

Imagine the great benefit to humanity by having all those Michael Jackson quality albums.

I don’t think it would change incentives much. Michael Jackson is a brand. Having procedurally generated Michael Jackson-ish music might be nice for elevator companies, but won’t impact his ticket sales that much. There are always lots of similar bands trying to emulate the most popular. Sometimes they succeed (Creed v. Pearl Jam), but many times the only way I would learn about them is by finding them in the dollar bin. I’m not sure what the present day equivalent is of the dollar CD box.

This is a political question not a scientific one.

We'll need to train a copyright-evaluating AI based on human judgments.

Funny enough, the patent office has similar questions about AI:



It might depend on if you have deep enough pockets to prove you are not violating copyright.

> itself clean of copyright infringement by showing that it was “inspired” but not derivative?

if a human did it, it would obviously be clean of copyright infringement. So why is it any different if an AI did it?

I think there is a perception that pure functions applied to input data produce derivatives. Humans assume that machines make derivatives, but humans are likely to apply their own experiences and interpretations. It is easier to convince people that a machine was not being “creative”.

AI can capture the skill and tone of the human and can replicate it en masse faster than the human could.

And the problem is?

Copyright was created as a response to the printing press being able to copy and distribute. However, with the advent of AI, this may no longer be effective. But to use existing copyright laws to control the output of an AI is wrong. Maybe new laws are needed. Until then, I don't think copyright infringement applies to the output of an AI if the output would not already infringe copyright had it been done by a human.

You can “accidentally” lose your training set.

I think we should err on the side of non-infringement. After all, this is how our real brains work. As long as the works are perceived as nonrelated, it shouldn't matter if they come from someone else's dataset. The similarity could be put to the test by doing a double-blind test with a group of humans.

I think the method of creation is sometimes relevant. Like if you can prove that you wrote a poem before someone else did, but they registered the same work coincidentally I don’t think you will be treated the same as if it is shown that you copied. So it’s more than if the results are sufficiently novel to a blinded human... perhaps in the future this will be an area where you could be audited - forced to show your training sets or you will be assumed guilty???

I think there have been many, many similar copyright cases in music. E.g. composers have performed their composition method to prove originality. It would be surprising if judges started claiming that dataset companies have exclusivity to all NN-derived works, and it would seriously hold back progress.

Looks like someone has rebranded 'thispersondoesnotexist.com', thrown together an advert and turned it into a startup.

Why not?

This is useful, makes promotional/demo material more fun


Little tutorial previously posted on HN teaching you how to spot the fakes. The tell is usually around the hairline for me.

There was a post about some "Could you determine if this person is AI-generated or a stock image?" quiz, and it turns out that, yes, it's very possible.

Most noticeably in 3/4 of the images in the article, the subject's gaze does not work out. Their eye lines never converge.

It would be interesting to mix in real photos then have another AI try to pick out fakes from real ones. They can learn from each other.

Apologies if you know this already, but ...

That's likely exactly how these pictures were generated.

Generative adversarial neural networks (the typical approach for this type of problem) have two nets that compete against each other.

One net tries to generate images that look like the sample data, the other one tries to tell them apart.

This is how adversarial networks work. You have a network trying to build images of humans and another one trying to guess what image is from that network and what image is real.
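The adversarial loop described above can be sketched in miniature (my own toy illustration, not the system from the article): a linear "generator" maps noise to 1D samples, and a logistic-regression "discriminator" scores real vs. fake. Real GANs use deep networks on images; only the training dynamic is the same.

```python
import numpy as np

rng = np.random.default_rng(1)

def real_samples(n):
    """The 'training data': draws from N(4, 0.5)."""
    return rng.normal(4.0, 0.5, size=n)

g_w, g_b = 1.0, 0.0   # generator: z -> g_w * z + g_b
d_w, d_b = 0.1, 0.0   # discriminator: sigmoid(d_w * x + d_b)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

lr = 0.01
for step in range(2000):
    z = rng.normal(size=32)
    fake = g_w * z + g_b
    real = real_samples(32)

    # Discriminator step: push D(real) toward 1 and D(fake) toward 0.
    for x, label in ((real, 1.0), (fake, 0.0)):
        err = label - sigmoid(d_w * x + d_b)  # log-likelihood gradient
        d_w += lr * np.mean(err * x)
        d_b += lr * np.mean(err)

    # Generator step: push D(fake) toward 1 ("non-saturating" loss).
    p = sigmoid(d_w * fake + d_b)
    g_w += lr * np.mean((1.0 - p) * d_w * z)
    g_b += lr * np.mean((1.0 - p) * d_w)

# The generator's output mean (= g_b, since z is zero-mean) should
# have drifted from 0 toward the real mean of 4.
print(g_w, g_b)
```

This is only the shape of the loop; the networks behind these headshots stack the same idea with deep convolutional generators and a long list of stabilization tricks.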

I wonder how long it'll be before we see AI photos being used in fake social media accounts.

The use of these specific ones would be easily detected, since the dataset is available to everyone and companies can cross-reference. However, if the program becomes commercially available, unique fakes can easily be created.

Read an article about it, that's already happening

We joked at work that these are perfect for bot-farms looking to take Twitter trolling to the next level. Will be really interesting to see how these crop up in usage.

That is already happening, AI bots are in video chat rooms as well.

What makes you think that isn’t already happening?

Fixed your link[0]

PS: Linking from your already downloaded url will generally be tied to your account session.

0: https://drive.google.com/file/d/1JKoYoqMIbsaj0jZNvpUzFQCHrWP...

That girl's face looks ok...OMG What's that monster next to her!?

Just goes to show that it's extremely trivial to clean up the final product with a few passes for quality assurance.

Single handedly, even. This is something one person could do on their own.

And yet they didn't even bother to try.

If they really intend to sell such images, it's extremely unprofessional to leave the botched examples in the mix, despite efforts to rationalize why they should be included.

Based on that, I suspect it's one person acting alone, and it's a get rich quick effort to take the money and run.

Is there any work out that is able to generate human bodies alongside the faces? The only thing I’ve seen close to this is the X-ray app. https://www.theverge.com/2019/6/27/18760896/deepfake-nude-ai...

There's also "This Person Does Not Exist"[1], which has been discussed here a few times before[2].

[1] https://thispersondoesnotexist.com/ [2] https://news.ycombinator.com/item?id=19144280

There's an artifact that looks like a scar on lots of these, e.g. : https://drive.google.com/drive/folders/1WPsVkdt4qDxjV2itBgw_...

Are these photos derivative works?

In the article:

> Zhabinskiy is keen to emphasize that the AI used to generate these images was trained using data shot in-house, rather than using stock media or scraping photographs from the internet. “Such an approach requires thousands of hours of labor, but in the end, it will certainly be worth it!” exclaims an Icons8 blog post.

FYI - they are not free for commercial use per the "Terms and Condition" link, even though the homepage says "for any use." It is for personal use only.

There's something really frightening about the way they look, like Frankenstein images patched together from dozens of others.

Some are spot on - but most have a tiny bit of uncanny valley going on. Unmatched eye direction, wide faces or artifacts.

My thought too, but I'm still pretty impressed. Usually, the "uncanny valley" problem sticks out like a sore thumb, but these, on average, are very good.
