This person does not exist (thispersondoesnotexist.com)
872 points by bpierre on Feb 13, 2019 | hide | past | favorite | 236 comments

Thanks for all the upvotes! Since I made this site, people have already started to train on datasets beyond just real faces. Turns out it can disentangle pretty much any set of data.

Gwern has applied this to an anime dataset https://twitter.com/gwern/status/1095131651246575616

Cyril at Google has applied it to artwork https://twitter.com/kikko_fr/status/1094685986691399681

This was to raise awareness of what a talented group of researchers at Nvidia built over the course of 2 years, the latest state of the art for GANs. https://arxiv.org/pdf/1812.04948.pdf (https://github.com/NVlabs/stylegan)

Rani Horev wrote up a nice description of the architecture here. https://www.lyrn.ai/2018/12/26/a-style-based-generator-archi...

Feel free to experiment with the generations yourself at a colab I made https://colab.research.google.com/drive/1IC0g2oDQenrDmwbtkKo...

I'm currently working on a project to map BERT embeddings of text descriptions of the faces directly to the latent space embedding (which is just a 512 dimensional vector). The goal is to control the image generation with sentences, once the mapping network is trained. Will definitely post on hacker news again if that succeeds. The future is now!
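A minimal numpy sketch of that mapping idea, with random stand-in weights (a real mapping network would be trained on paired description embeddings and latents; the hidden layer size here is an arbitrary assumption):

```python
import numpy as np

# Hypothetical mapping network: BERT text embedding (assumed 768-d) ->
# StyleGAN latent (512-d). The weights below are random stand-ins; a real
# network would be trained on (description embedding, latent) pairs.
rng = np.random.default_rng(0)
W1 = rng.standard_normal((768, 1024)) * 0.02  # hidden layer weights
W2 = rng.standard_normal((1024, 512)) * 0.02  # output layer weights

def text_to_latent(bert_embedding):
    """Map a 768-d sentence embedding to a 512-d latent vector."""
    h = np.maximum(bert_embedding @ W1, 0.0)  # ReLU hidden activation
    return h @ W2

z = text_to_latent(rng.standard_normal(768))
print(z.shape)  # (512,)
```

Once trained, the output vector would be fed straight into the generator in place of a randomly sampled latent.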

I love this part of the system req's (from the stylegan repo):

"One or more high-end NVIDIA GPUs with at least 11GB of DRAM. We recommend NVIDIA DGX-1 with 8 Tesla V100 GPUs."

I'm sure my wife will understand why I took out that second mortgage on our home...no problem.

Or use on the cloud, much cheaper https://cloud.google.com/nvidia/

They recommend the 8 Tesla V100s, but would a 2080 Ti work? That's not too bad. I might try this at home with my gaming PC.

I like the fact that the server returns an image directly without any HTML or other content. It makes the loading experience fun too.

How many faces are generated? This can't be real time.

Thanks! Initially, I thought about just generating a big batch and cycling through it. Then I thought it would be more dramatic if the machine was "dreaming" up a face every 2 seconds in real time. I went for the dramatic approach just so I could phrase it that way to my non-tech friends!

Are the old images discarded? It would be interesting to use these as references for hyper-realist drawings. You could have something completely authentic that could never be traced back to its source.

Oh wow so it really is. Great work

Google image search must be getting smashed by this. Is there a canonical exif tag, perhaps isBot?

Looks like someone else has already used another neural net to find our president in the latent space. https://github.com/Puzer/stylegan

That artwork demo has really piqued my interest regarding using techniques like these for creating animation more easily.

Love the work you guys are doing in the progressive GAN space. Last year I did something similar to make a face-aging network, which involved training an encoder to get an initial guess of a latent vector for someone's face in the PGAN space, then relied on BFGS optimization to fine-tune the latent vector, followed by further fine-tuning of some intermediary layers of the generator network to really match the input pixels. I also snuck an affine transform layer in there, allowing the network to shift the image around to better fit the target.

The results were .... eh .... okay, at least on my ugly face. https://twitter.com/RustBot/status/1044120159022022658

But overall, I'm still tweaking. In the meantime, I've been focusing on static image analysis for aging research, but I hope to find better encoding schemes down the road.


Cool page, and great job.

> Turns out it can disentangle pretty much any set of data.

All the examples I have seen (including your links) are variants of face generation algorithms. Any ideas on how this could be useful beyond image generation in some style? Specifically for (data) science?

Sorry if this is a naive question.

Edit: By "variants of face generation algorithms" I mean any image generation really.

The original Karras et al 2018 paper did both cars and cats, which aren't faces. Worked very well, unsurprisingly. (ProGAN also did well on those, though it was the faces everyone paid attention to.) Look at the samples in the paper or the Google Drive dumps, or at the interpolation videos people have posted on Twitter.

Aside from the original work, on Twitter, people have done Gothic cathedrals very well, graffiti very well, fonts very well, and WikiArt oil portraits not so well. On Danbooru2017 full anime images (linked in my thread), one person has... suggestive blobs but has only put 2-3 GPU-days into it and we aren't expecting much so early into training. skylion has been running StyleGAN on a whole-body anime character dataset he has, and the results overnight (on 4 Titans) are pretty impressive but he hasn't shared anything publicly yet.

Great job on the Danbooru training! I've been following you on twitter and machinelearning for the longest time haha

Thanks! The wait on training is killing me, though. I've been doing large minibatch training to try to fix the remaining issues in the anime face StyleGAN and it's frustrating having to wait days to see clear improvement. Checking GAN samples is so addictive and undermines my ability to focus & get anything else done. I'm also eager to get started on full Danbooru image training, which I intend to initialize from skylion's model - whenever that finishes training...

(Who says we aren't compute-limited these days?!)

Haha, having to work around the computation limits is welcome! It feels like building web apps back in the late 90's again. These days we have so much memory and disk space at hand it doesn't even feel like a challenge anymore.

That is, until Graphcore delivers their IPU.

I forgot one failure case: a few hundred/thousand 128px pixel art Pokemon sprites. StyleGAN seems to just weakly memorize them and the interpolations are jerky garbage, indicating overfitting. (No GAN has worked well on it and IMO the dataset is too small & abstract to be usable.)

no not naive at all. this method isn't specific to just extracting features from faces. it can disentangle features from any kind of images. in fact, the next dataset i might train on is flowers (or birds)


OK, my point is what could be done beyond generating images in some style? Can we generate interesting mock data given a database for instance (of course this is exactly what you did in a way, but I have in mind e.g. a database containing some numerical/categorical features known to a specific accuracy)?

You can use GANs to generate fake data based on stuff like particle accelerator data or electronic health records. Whether you can use StyleGAN specifically is unclear. What's the equivalent of progressive growing on tabular/numeric data? Or style transfer?

Could be used to generate building plans or other schematics (pretty sure of no use though). Could certainly be put to good use generating pornographic images.

Hey, you might want to consider bert-as-service[0] for deep feature extraction from a BERT model. It will give you a 768 dimensional representation of the description, then you can embed that in the 512-dim latent space? I've been thinking of something similar.

It's not that hard to do it yourself, but it's a really clean package, and it gives you nice CLI flags for most things like pooling strategy, and what layer you want to get the activations from.

[0] https://github.com/hanxiao/bert-as-service

Some enterprising developer could use images from Tinder (or better) OkCupid tagged with data coming from the individuals profile data, then interpolate based on abstract factors such as risk taking, gender bias etc. Well... you get the picture.

I think this is a very dangerous game we are playing here but I guess it is going to be done.

@lucidrains - this is pretty amazing. Every time you refresh the page, is it a real-time generation, or does it draw from a pool/DB of real-time images generated previously? I got the exact same image twice, which is why I am asking, which kind of dampened the "cool" factor just a notch.

So does this make it trivial to input a source portrait, and then visualize different hair styles?

you'll have to build an encoder to encode someone's face into the latent space. then you'll have to dive into the latent space and find the dimension(s) that controls for hair style (just fork the colab and start experimenting with interpolations)

then yes, it should be possible
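The interpolation part of that experiment is just linear blending of latent vectors; a numpy sketch (the generator call itself is omitted, see the StyleGAN repo):

```python
import numpy as np

rng = np.random.default_rng(42)
z_a = rng.standard_normal(512)  # latent for one face
z_b = rng.standard_normal(512)  # latent for another

def lerp(z1, z2, t):
    """Linear interpolation in latent space: t=0 gives z1, t=1 gives z2."""
    return (1.0 - t) * z1 + t * z2

# Decode each step with the generator to watch one face morph into the
# other; a direction that consistently changes hair style while leaving
# the rest of the face alone is a candidate "hair" direction.
steps = [lerp(z_a, z_b, t) for t in np.linspace(0.0, 1.0, 8)]
```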

What license are the generated images under? Could you release them under creative commons?

They're probably public domain [1,2]. Generally in the USA and Europe, you can't copyright computer generated images or images created by nonhuman entities.

"To qualify as a work of 'authorship' a work must be created by a human being": https://www.copyright.gov/comp3/chap300/ch300-copyrightable-... [PDF], see section 313.2 "Works that lack human authorship"

Monkey selfie case: https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...

From the Wiki article of the "Monkey Selfie Case"

>>> On 23 April, the court issued its ruling in favor of Slater, finding that animals have no legal authority to hold copyright claims [1]

[1] https://petapixel.com/2018/04/24/photographer-wins-monkey-se...

I feel this is wrong. If I make random generative art, do I instantly lose copyright? Or how about Photoshop? Really, I'm asking myself where the line is drawn.

You don’t “lose” copyright, you never had it in the first place.

Copyright is (read the law!) a temporary monopoly granted for works meeting certain criteria, being creative is one of them. You’d hold copyright for the code you wrote to generate the “art”. If you download somebody else’s code (as this site uses Nvidia’s), you lack the creative element.

You can though own the algorithm you used to generate the art, as in the case of Fractal Flame[0] created by Scott Draves.

[0] https://en.wikipedia.org/wiki/Fractal_flame

more details since none were provided:

> Recently a talented group of researchers at Nvidia released the current state of the art generative adversarial network, StyleGAN, over at https://github.com/NVlabs/stylegan

> I have decided to dig into my own pockets and raise some public awareness for this technology.

> Faces are most salient to our cognition, so I've decided to put that specific pretrained model up. Their research group have also included pretrained models for cats, cars, and bedrooms in their repository that you can immediately use.

> Each time you refresh the site, the network will generate a new facial image from scratch from a 512 dimensional vector.
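That per-refresh step can be sketched as follows; the commented-out generator call mirrors the TF1-era example script in the StyleGAN repo (the `Gs` name is that script's convention) and is not executed here:

```python
import numpy as np

def sample_latents(n, dim=512, seed=None):
    """Draw n latent vectors from N(0, I) -- one per page refresh."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n, dim))

latents = sample_latents(1)
# images = Gs.run(latents, None, truncation_psi=0.7, randomize_noise=True)
print(latents.shape)  # (1, 512)
```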


Googling "nvidia face generator" lead me to "A Style-Based Generator Architecture for Generative Adversarial Networks," (6 Feb 2019) a paper showing the faces on the site.


If you reload this page enough, you will eventually find your own face. And then things start to get weird....

Luckily the dataset is photogenic people so no danger of that for me.



if we train a reverse encoder of the output of the GAN back to its own latent space, it's probably possible to find your face and mess with it.

and it seems like someone already did it as of 9 hours ago https://www.reddit.com/r/MachineLearning/comments/aq6jxf/p_s...

I did not know what the site was and the very first image that loaded was very similar to Ross Ulbricht. With the name of the site I thought it was a page dedicated to him. Very odd.

Can I upvote this twice?

For all of these “this is a simulated face”-claims I wish they would show the 2-3 most similar faces in their training set. For all you know, it could just be spitting out a random training image. How would you know the difference?

take a look at "Figure 8" from the original paper:


we can smoothly interpolate between faces, so it seems impossible to me that these are just memorised from the training set

I just realized that the eyes, nose, and mouth are always in the same place in these images... even though the head might rotate around them.

That might be because they cleaned the dataset thoroughly. I vaguely recall there was something about 'facial landmarks' and alignment in the ProGAN work which was presumably carried over to StyleGAN. Doubtless helps the final quality.

However, aligned faces are definitely not required - I didn't do any kind of alignment for my anime faces and you can see the eyes/nose/mouth in all sorts of positions in the samples & videos.

This is usually done: most GAN papers show in the appendix a list of generated images with their distances to images in the dataset; for example, check pages 14 to 16 in this GAN paper [1]. Note that measuring distances between images is not trivial and some measurement space must be chosen, typically cosine distance on the last ResNet feature map.

[1] https://openreview.net/pdf?id=B1xsqj09Fm
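A sketch of that nearest-neighbor check, assuming feature vectors have already been extracted (e.g. from the last ResNet feature map) for both the generated sample and the training images:

```python
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def nearest_training_examples(sample_feat, train_feats, k=3):
    """Indices of the k training images most similar to the sample,
    by cosine similarity in feature space. If the top hit is nearly
    identical, the GAN may just be memorizing that image."""
    sims = np.array([cosine_sim(sample_feat, f) for f in train_feats])
    return list(np.argsort(-sims)[:k])
```

Displaying the sample next to its top-k neighbors is exactly the appendix figure the parent comment describes.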

you can download the training set here https://github.com/NVlabs/stylegan#resources it's based on 70k high resolution flickr images. try interpolation on the colab link above so you can be convinced it's capturing the features rather than just memorizing

I've been wondering what is going to happen when bots start using these images as profile pictures?

Previously it wasn't trivial to do a GAN image generator; now, as this site shows, it's, if not trivial, at least not particularly hard.

I'm seeing some characteristic artifacts in most of these pictures. A hair halo floating just outside of the head is pretty common, and there's a sort of rainbow fringing that was very common in the deep dream postings that I'm still seeing popup...

... but all, or almost all, of these would be irrelevant at a profile pic size. At that size, assuming these aren't just recapitulations of the training data (and I assume they aren't) this technique appears to be 99%+ successful.

Also, look at the non-face stuff. Some backgrounds are just "incredibly blurred vaguely landscapy stuff", which is plenty realistic, but I've seen the algorithm attempt wood grain, which went poorly. I've seem some bizarre patchwork backgrounds, and one picture had a person cut off to the right like a single photo trimmed from a family photo, and the cut-off person was some sort of SCP-monstrosity mercifully cut off by the edge of the photo. Still, the success is impressive. The failures are definitely going from "in your face" to "easy to ignore/miss".

Fix up the training data a bit and this'd be a profile pic machine.

Every one of these pictures has something 'off' with it. But it takes a while to notice. In my first three, the teeth were odd: small, regular rectangles seemed cut out of them. Then it was the eyes. Specifically, the iris was the wrong shape and had too much glare from what I assume is a camera.

Every single picture, if you really look at it, is disturbing for reasons you can't pick out. It's definitely hitting the uncanny valley. It's juuuust human enough to blend, but not human enough to avoid the creeping feeling of dread.

But that being said, if I'm cruising forums and see this in a thumbnail size, I'm not going to be able to pick out that it's not a real person.

"It's definitely hitting the uncanny valley."

I'd say you're correct that it is still in it, but it is clearly climbing up the other side now. We're past the minimum of verisimilitude.

they're locally plausible but globally everything's slightly wrong: the measurements of features, the mixture between high-resolution and sudden blurriness, the occasional warping effect, the shifting perspective, the eyes don't sit in the skull correctly, the hair seems to be intersecting the forehead sometimes, every area seems to be located in a different space & there's no three-dimensional coherence to it

With some of them you notice it straight away like the woman with something sticking out of a hole in her cheek: https://fb.pics/image/38yjt and the woman with the mutilated left ear: https://fb.pics/image/380UC and the child with the adult eye bags: https://fb.pics/image/38Jva One thing it almost always does wrong is glasses. https://fb.pics/image/38NNu And apart from pictures of young children, most of them have strange vertical wrinkles under the eyes, even when everything else is relatively convincing.

I saw perfect glasses, but I've never seen perfect earrings.

Bonus points for pics of smiling grinning people with sad or angry eyes. The algorithm does not understand musculature. Eugh.

right off the bat the two photos [in each post] have some sort of artefact on the right temple, which gave it away as a manipulated image at first glance. and there are halos and blur. i'm also wondering just how far these fakes could go? there is such high resolution with very common cameras now that the reflection of what a person is looking at is visible in the lens of their eye. [it's even a zero-day] the AI is going to need some way of creating a fake setting to go with the face, and fake EXIF data that matches the fake camera model that would have taken the picture.

ever look at people. most people have something off about them and the longer you look the weirder they appear

How difficult would it be for DL to take a badly done 3d rendering, and turn it into a realistic scene?

Deep Learning AI Generates Realistic Game Graphics by Learning from Videos:


What's the practical difference between a synthetic fake profile photo and a stolen filtered real one?

Stolen ones are easier to do reverse image detection on and expose as fraudulent. I see this a lot on Twitter - bots pull a mix of stolen profile pics, bios, etc.

Speed of content generation. You can create thousands of synth ones a second.

Since it uses a deep neural network, I don't think it's "thousands a second". Also, you can download an image and crop a face out of it in several seconds. You could even automate the process.

The biggest problem is transferring faces to existing photos. It was hard to do manually. Now it's much easier. Also, people are generally trained to ignore various artifacts by CGI-ridden movies and compression algorithms. So much of our notion of how the world looks comes from digital imagery, it's kind of scary.

I think we need to change the threshold of quality for an image/video to constitute "proof" of any kind. You can hide most of the weird artifacts by scaling things down or passing them through heavy compression.

> Since it uses a deep neural network, I don't think it's "thousands a second".

The generator is like 150MB. The forward pass is <0.1s. Hypothetically, on a decent GPU like a 1080 Ti with 11GB VRAM at full utilization, you should be able to generate up to ~730 images per second. Use a few GPUs and you're at thousands per second.

Where is this data from? I see 300MB model on their Google Drive. And if I understand correctly, you also need source and destination images to transfer styles from and to, so it's not like the model generates photos out of thin air.

The 300MB model covers both the G and D. You only need G to generate. The style transfer is just noise. And I time my own 512px anime StyleGANs at ~21 images per second per model; half that throughput to account for the increased model size and depth of a 1024px. No matter how you tweak the numbers - halve it again if you wish! - it's clear that thousands per second is entirely attainable with a few GPUs at low cost. (For comparison, 8 V100s is ~$7/hr on AWS; 10x1080ti is ~$1.3/hr on Vast.ai.)

Possibly ethics and law (personal rights to the photo)?

I should hope not!


The presentation is very interesting. That's always what amazes me with those GAN outputs. These people do not actually exist. Obviously, there are some funky examples. Nothing wrong with mine at first, although his buddy should probably see a doctor: https://imgur.com/a/dkS8Ux5

I noticed that whenever there was more than one person, or something touched the face, the results look horrific.

Yep, scary stuff :O https://imgur.com/PIOlhOu

The Accidental Teratoma Compendium

Some results are nauseating examples of the uncanny valley at its best (or worst)!

Found Martin Short's brother


Mine did a circle game on me but when I tried to download it I got another image.

A Lorem Ipsum for faces... It looks like a great way to build an employee profile page for a site.

What's funny is "Like Lorem Ipsum, but for people" is the tagline for randomuser.me, which currently uses stock photos with random user data. Combine these 2 and you got yourself a party.

Run it several billion times, create an Earth-sized social network, then give it content with a meme generation system like Dank Learning https://arxiv.org/abs/1806.04510

i never miss an opportunity to link to this wonderful startup generator http://tiffzhang.com/startup/

Facum Ipsum sounds like an entertaining side-project. Coupling one of these images with a random profile generator for each employee and outputting to JSON. One button press would populate your app with relevant data.
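A toy sketch of that pairing; the name and title lists are made up, and the `faces/{i}.jpg` URLs are placeholders for generated images:

```python
import json
import random

# Toy vocab for the hypothetical profile generator.
FIRST = ["Alex", "Sam", "Jordan", "Taylor"]
LAST = ["Rivera", "Chen", "Okafor", "Novak"]
TITLES = ["Engineer", "Designer", "Analyst", "Manager"]

def fake_employee(face_url, rng=random):
    """Pair one generated face with random profile fields."""
    return {
        "name": f"{rng.choice(FIRST)} {rng.choice(LAST)}",
        "title": rng.choice(TITLES),
        "photo": face_url,
    }

# One button press -> a JSON blob ready to seed an app's team page.
team = [fake_employee(f"faces/{i}.jpg") for i in range(5)]
print(json.dumps(team, indent=2))
```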

Interesting that the Vehicle Record lists cars that are only available in Europe and Asia despite generating exclusively US addresses/personas.

Related: the technology behind this has a harder time with cats than it does people, and the results are hilarious:


People keep saying this'd be great for fake profile photos, but seems to me that's not realistic yet, at least not as demonstrated here.

A social media profile with a single picture is pretty suspicious.

To be convincing you'd need a steady stream of pictures of the "same" fictitious person, doing typical social media thing -- selfies with friends, vacation pics, appearances in other peoples' pictures, etc.

Not necessarily, if you're talking about platforms like LinkedIn where people hardly change their profile pic.

In the right context you'd have no reason to doubt that some of the good ones from this set are real people.

I almost never post selfies. They're pretty rare among most of the people I follow, too. Some people like being in front of the camera, some don't.

This is literally the first AI image processing project I've seen posted here that actually shows high resolution images.

Every single other one I've seen has a bunch of tiny low res thumbnails on a github page that serve to completely obscure any potential artifacts or issues with the system.

(I could clone their code and run it, but that's not the output that any of the discussion they've prompted is operating on, and that's kind of the point of hacker news).

So thanks for doing the bare minimum for an image processing project, finally.

Most of the projects you're thinking of probably never even generated high-resolution images. Until recently, the actual output layer for most of these systems would be something like a 256x256 array of pixels.

Bare minimum, how come? Most standardized image processing datasets have very low resolution images.

Yes, my point is that that is not good enough.

Two problems that I can see:

1. What use cases are there for a photo processing algorithm that only spits out tiny thumbnails?

2. If it can output higher resolution images, why are all of the examples tiny thumbnails? You can hide a lot of otherwise obvious flaws with a tiny thumbnail.

Because of computing power issues - training a good model at a significantly higher resolution becomes a lot more expensive. If you're doing a proof of concept or analyzing algorithms, you stick to lower resolutions; at larger resolutions the algorithm (and its effects) are the same, but you need ten or a hundred or a thousand times more hardware and/or time.

GAN creating faces purely at the pixel level still seems a strange approach to me. In some years it will feel very restrictive. I guess it's the only tractable method at the moment.

Is anyone working on a GAN to generate bone structure then flesh and skin/mouth/eyes textures and pipe the result in a ray tracer?

It's incredible what can be done in 2D solving directly for the result, but imagine where this goes when this works in volume and multiple levels more driven by physics.

I guess finding training data for bone structure, flesh, and skin textures is harder than finding pictures of faces...

Oddly, one's bone structure likely possesses more intrinsic privacy protections despite being, arguably, less personally identifiable than one's face.

Of course, this is also why training data is more difficult to acquire, as you mention.

Wouldn't being less personally identifiable mean it also offers more privacy protection?

He means that there are rules preventing the data from being shared, making it harder to find a training set.

Upon further reflection, my observation's mostly a straw man. HIPAA covers a patient's face when a dermatologist takes an image, just as it covers the bone structure in an x-ray.

Privacy is hard to talk and reason about without defining everything specifically.

This project's images were sourced from Flickr. It turns out you can find medical imagery on Flickr reasonably easily as well.

That defeats the purpose. The purpose of deep learning is to let the machine solve the problem using only goal data. If you are going to program it with bones and flesh models then you don't need a neural net.

I disagree. If the purpose is simply to get a single-use photograph, then yes. But for this type of application I think we want to create a virtual human that can, at the very least, be photographed from two different angles and ideally put into motion. To create a virtual entity.

You wouldn't "program" it with flesh and bones, you would generate a life-like but original new skeleton in the same way we generate these images, except the space of solution is the space of possible skeletons instead of possible pixel configurations. And then generate soft tissues that are also original, conforming to biology constraints and also constrained by the underlying skeleton. Same for skin, created from scratch but believable and driven by the underlying tissue.

But what if you can train other feature lines that will generate backgrounds and other humans? Then add in another feature line that can morph an existing human line into different views, poses and/or animations.

With enough training data, I would think you could create alternate views from the first view via ML methods, rather than doing skeletal structure by ML and then physics modeling to get views.

*This person is a composite of several people who do exist

I hear ya... Don't you think you're stretching the meaning of the words "composite" and "several" a little far though?

Asking as an ML layman, isn't it mixing something like 6 maybe 10 features of different people?

With tens of thousands of training images, and hundreds of dimensions in the latent space, I don't think I would assume this is true. You may be thinking one feature is "eyes", but the eyes may instead be built out of 10 sub-features. Those sub-features may be inextricably linked to other, identifiable, macro-features.

Maybe, maybe not. GANs are not necessarily interpretable so it's not easy to know the answer to your question.

But how synthetic are they? I.e., what does the "most similar", however you define it, sample in the training set look like?

Good question! I'm guessing not very synthetic, with high similarity to examples in the training set.

This issue is covered in this Two Minute Papers video - https://www.youtube.com/watch?v=1ct_P3IZow0 (relevant section around 2:27)

Hmm, it does babies and infants too. I was not expecting that.

Other commentators mention that Ashley Madison, stock-photo companies, and other spammers will take this and run with it. Honestly, I suspect that has already happened for a while now and may explain the issues that FB and Twitter are having [0].

Though I can't find the thread, there was a discussion here on HN a while back about the 'Inversion' issue. Briefly, Youtube uses some ML and RNN stuff to help determine spammers vs. real-people (after pre-processing and cleaning things up a fair bit). However, if the number of spammers becomes too high, such that the spammers that DO make it through the various filters become over 51%, then the filters will 'Invert'. Meaning that the MLs and RNNs will start to classify the spammers as 'real' and the real-people will likely be told they are spammers.

I can imagine that this site will quickly exacerbate that issue.

Honestly, in reading Cal Newport's new book [1], I can't say I'm all that sad about it. The 'casino' like design of the modern web is obviously bad for us. In moderation, yes, but to the level we are at currently? Not a chance. Driving users away from these sites and devices isn't a bad thing for anyone that isn't earning a paycheck via the FAANGs.

Hopefully this kind of tech will cause a bit of a restructuring of the modern web in the long haul. I doubt it, but one can hope.

[0] Not that Jack can even properly use twitter to begin with: https://danluu.com/karajack/

[1] http://calnewport.com/blog/

I'm waiting for the time where all this neural technology will start to be used for something good, like making better games with more NPCs who can really talk. Maybe never. The life-cycle seems to be research -> malicious use -> hipster nonsense -> memes -> abandonment.

Edit: the problems with these images look very much like application of anisotropic smoothing. G'MIC has filters like that. You can make this stuff look more realistic by blurring it (gaussian) and adding noise (uniform). Blurring hides small-scale irregularities, while noise makes blurring less obvious by adding small-scale "grains" that you perceive as detail/texture.
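That blur-then-grain trick can be sketched in numpy on a grayscale image (a 2-D float array in 0-255); the sigma and noise amplitude are arbitrary choices:

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D normalized Gaussian kernel with radius ~3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def blur_and_grain(img, sigma=1.0, noise=8.0, seed=0):
    """Gaussian-blur away small-scale GAN artifacts, then add uniform
    noise as fake grain so the blur itself is less obvious."""
    k = gaussian_kernel(sigma)
    # Separable blur: convolve rows, then columns.
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    out = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, out)
    rng = np.random.default_rng(seed)
    out = out + rng.uniform(-noise, noise, img.shape)
    return np.clip(out, 0, 255)
```

Note the "same"-mode convolution zero-pads, so borders darken slightly; a real pipeline would pad with edge values first.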

Things do get applied. If you want applications of GANs to better games, did you notice the past few weeks discussion of using ESRGAN & other superresolution GANs to upscale artwork of old games to make them far prettier and highres?

I was thinking more along the lines of an RPG with a generated town population. Indie devs usually don't have resources to draw hundreds of portraits and record tens of hours of dialog, but if they could use ML to generate portraits and synthesize decent speech, it would allow them to create much bigger, more immersive worlds.

Also, this stuff could be used to take a hand-drawn portrait and animate it with different expressions without rework.

"The future is already here, it's just unevenly distributed." Stuff like that will come, eventually. Like any cutting-edge R&D, there's a long valley of death from a lab demo to a globalized real-world product. It's a lot of work to package things up so reliably & cleanly that harried game devs can easily & usefully incorporate them into games.

>Stuff like that will come, eventually.

Well, I'm not so sure. On one hand, games use graphics techniques that were developed a couple of years prior to their development. On the other hand, they fail to capitalize even on relevant AI research of the 60s and 80s. When they do, it looks amazing (e.g. FEAR AI), but it's very rare.

The excuse I always hear is that those techniques might be smarter but not as fun. That doesn't appear near as applicable to graphics stuff, where game developers have always been eager to apply and extend the cutting edge.

The FEAR AI is actually simpler than you might think, though with some really good insights in term of applying behavioral design to gameplay. https://alumni.media.mit.edu/~jorkin/gdc2006_orkin_jeff_fear...

> As much as we like to pat ourselves on the back, and talk about how smart our A.I. are, the reality is that all A.I. ever do is move around and play animations! Think about it. An A.I. going for cover is just moving to some position, and then playing a duck or lean animation. An A.I. attacking just loops a firing animation. Sure there are some implementation details; we assume the animation system has key frames which may have embedded messages that tell the audio system to play a footstep sound, or the weapon system to start and stop firing, but as far as the A.I.’s decision-making is concerned, he is just moving around or playing an animation.

> Now let’s look at our complex behaviors. The truth is, we actually did not have any complex squad behaviors at all in F.E.A.R. Dynamic situations emerge out of the interplay between the squad level decision making, and the individual A.I.’s decision making, and often create the illusion of more complex squad behavior than what actually exists!

> Imagine we have a situation similar to what we saw earlier, where the player has invalidated one of the A.I.’s cover positions, and a squad behavior orders the A.I. to move to the valid cover position. If there is some obstacle in the level, like a solid wall, the A.I. may take a back route and resurface on the player’s side. It appears that the A.I. is flanking, but in fact this is just a side effect of moving to the only available valid cover he is aware of.

Yes, I've read that.

Brb, going to train a GAN with all my selfies so it can generate new selfies.

As this stuff gets better and better, I wonder if it will actually increase our distrust in digital imagery and in doing so, increase the demand for in person interaction, at least in matters of consequence.

Either that, or digital communication will have to include defensive fake detection features, and the rest of that thought is a Philip K Dick novel ;)

Photos have been faked since the beginning of photography. Mathew Brady (famous for photographing the American Civil War) moved dead bodies to stage shots, had live soldiers lie down and pretend to be dead, etc.

Agreed, but the difference today is the scale.

Absolutely, and the ease of creating a convincing fake.

Note: I'm not trying to be an ass, or start a fight, I'm literally just asking...

Does it only generate white people? I've been refreshing for a while but don't see any people of color.

Edit: No, I finally got someone who wasn't white, it just seems to have a helluva bias.

The dataset is compiled from Flickr portraits, and the model was trained to generate samples from that distribution, so there will definitely be a bias towards the sort of people whose portraits end up on Flickr. Per the paper:

>The images were crawled from Flickr (thus inheriting all the biases of that website) and automatically aligned and cropped. Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally Mechanical Turk allowed us to remove the occasional statues, paintings, or photos of photos.

It's still much more diverse than previous datasets (which used U.S. celebrity photos), but would require some additional work to match the actual world population distribution.
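To illustrate what "additional work to match the actual world population distribution" could look like: one common approach is importance-resampling the crawled dataset before training. This is a toy numpy sketch with made-up group labels and target shares, not anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical: each crawled image gets a (noisy, auto-detected) group
# label, and the crawl over-represents group 0 relative to a target mix.
groups = rng.choice([0, 1, 2], size=10_000, p=[0.7, 0.2, 0.1])
target = np.array([0.4, 0.35, 0.25])  # desired population shares

# Importance weight per sample: target share / empirical share of its group.
empirical = np.bincount(groups, minlength=3) / len(groups)
weights = target[groups] / empirical[groups]

# Resample dataset indices with those weights (with replacement).
idx = rng.choice(len(groups), size=len(groups), replace=True,
                 p=weights / weights.sum())
resampled = np.bincount(groups[idx], minlength=3) / len(idx)

# The resampled group shares now track the target mix.
assert np.allclose(resampled, target, atol=0.02)
```

The math is simple: drawing samples with probability proportional to target/empirical makes each group's expected share equal to its target share.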

Second one I got was an Indian man.

I used my anime face StyleGAN to make a similar website: http://www.thiswaifudoesnotexist.net/index.html

Advertising companies are going to be stoked that they don't have to pay real models anymore. Stock photos especially.

Not as stoked as "blockchain" ICO scammers will be that they don't have to steal pictures for their "staff" pages anymore.

And yet, every few page refreshes or so... "I'm in love".

I'm in love with this toupee. https://imgur.com/a/EnkHWnc

I had a similar one ... and yet, not quite enough to outright convince me it was a fake. Because some people do wear weird hats and toupees.

The generator couldn't decide whether it's drawing hair or a hat, so it sort of did both.

oh man, that is funny. took me a second to see what is going on there.

There was a study done on faces and beauty. They created faces based on global averages of features and found these (synthetically invented) faces to rank even more highly on the beauty scale.

Are all of these nonexistent people of consensual age though?

I've never read any benchmarks around render time on sophisticated GANs, so maybe the answer to this is obvious, but: Is this showing a random selection from a set of offline-generated images, or is the GAN actually generating these on each request?

This is what always bothers me about demonstrations of the technology. There's almost always an extra hidden layer of human curation of the output, so we only see the examples that are most interesting (90% amazing results, with a mixture of hilarious/horrible bad results for flavor).

This work is impressive, I don't mean to take anything away from it, but if the author had to filter through 1000 images to select the 5 I saw, that's ... disappointing?

It's definitely not generating an image for each page load. If you refresh the page you get the same image more often than you get a new image.

It looks like there is just throttling on requests: if you request a new one too quickly, it will just give you the same one.

It's constantly generating new images and always serves the freshest one.

The author posted elsethread to say that it is generating a new image every 2 seconds, and serving the latest one.
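For the curious, that serve-the-freshest pattern is easy to sketch. This is a toy Python mock, not the site's actual code; the real `generate` call would be a StyleGAN sampling step, stubbed out here with a counter:

```python
import threading
import time

class LatestImageCache:
    """Background worker regenerates on a fixed interval; every request
    just reads the latest copy. This is why rapid refreshes can return
    the identical image."""

    def __init__(self, generate, interval=2.0):
        self._generate = generate      # e.g. a StyleGAN sampling call
        self._interval = interval
        self._lock = threading.Lock()
        self._latest = generate()      # have something ready at startup
        worker = threading.Thread(target=self._loop, daemon=True)
        worker.start()

    def _loop(self):
        while True:
            time.sleep(self._interval)
            img = self._generate()     # slow GPU work happens off-request
            with self._lock:
                self._latest = img

    def latest(self):
        with self._lock:
            return self._latest

# Stand-in generator: returns an incrementing counter instead of image bytes.
counter = {"n": 0}
def fake_generate():
    counter["n"] += 1
    return counter["n"]

cache = LatestImageCache(fake_generate, interval=0.05)
first = cache.latest()
time.sleep(0.3)                 # let the worker refresh a few times
assert cache.latest() > first   # the served "image" has changed
```

The nice property is that request latency is decoupled from generation time: the GPU can take its 2 seconds per image while the web server stays instant.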

It's worth noting that although at a first glance the face looks extremely realistic, there are some details that don't quite make sense and hint at a randomly-generated face.

This is what I got: https://i.imgur.com/iCfzjkZ.jpg

In no specific order:

- weird hair above the person's right eye, that doesn't match with the overall hairstyle (the patch of short hair) or realistic hair behaviour (straight bit of hair)

- what seems like beard on the chin, with unrealistic lighting

- hair turns into leaves at the bottom

- weird reflection in the left lens

- mismatching shapes for the glasses (there's a small bump only on the right lens)

Hrm, your example was quite glaring in its flaws, most of the images I saw, on the other hand, looked quite flawless. I actually came here to disagree with someone else who stated the images he saw had alien-like alarming characteristics, or something along those lines. I can’t tell most of these are fake even at full size on an XS Max.

There are some weird and disturbing artifacts if in one of the composite images, the person was touching their face: https://imgur.com/a/ptgKbh1

More disturbing examples: https://imgur.com/a/UoaYha8

Also a LOT more in this subreddit: https://old.reddit.com/r/SyntheticNightmares/

It's also worth noting that a couple of years ago, most GAN-generated faces looked obviously wrong at 128x128px. It's entirely plausible that this approach is ultimately a dead end, but it's also plausible that we're at a crucial inflection point in the development of computing.

I noticed that there's generally odd texturing and hair placement. Wrinkles on an otherwise smooth face, or in weird places / directions. Hair of a mismatched color. The unusual facial texturing seems to occur more on the right side of these images.

- hair turns into leaves at the bottom

Clearly, this algorithm has captured a dryad.

Most of them are CLOSE but still pretty Uncanny Valley, then there's https://i.imgur.com/A2ThWBE.jpg which was...shocking to say the least. Crazy how far this stuff has come though, and how many more applications it has. Also interesting how the oddities of the images look a lot like how some of the visual effects of psychedelics manifest. Hair blending into an ear, the lines around the eyes trailing off, the "hairiness" of some of them, and of course the nightmare fuel I linked above.

The teeth seem to be the give-away. There seems to be a bias of having the two top front incisors facing the camera, even when the head is turned.

Glasses seem to do weird things, especially if they're frameless.

I wonder why it doesn't generate any people with African features

It does, I got this image: https://i.imgur.com/oSh8DsS.jpg

I got one after about 15 tries. Also a person with Indian features and Hispanic features.

I got numerous.

Looks like a perfect tool for generating profile photos for troll bots. TinEye will soon no longer be able to help us.

Wouldn't it be possible to add real human feedback on each face presented on thispersondoesnotexist.com, in the form of simple "Fake" / "Not fake" buttons, that would help the discriminator in its analysis with thousands of inputs?

This is a strong, clickbaity claim, which seems quite hard to prove :)

But seriously: it may be essential for legal reasons to be 100% sure that an automatically generated face does in fact not depict a real-life person.

No more than it is essential to be 100% sure that a painting does not depict a real-life person.

A painting is usually immediately recognizable as a work of art / fiction. Do you want to appear as a team member on an escort service site which uses automatically generated placeholder images to protect their employees? Do you want your face on a billboard ad for Viagra?

There is a huge difference between "hey, that painting looks like you" and "hey, that is a photograph of you".

That almost certainly isn't needed for a project like this, but I wonder if it would be needed to try to make money off this. It might be like the "The events depicted in this movie are fictitious..." warning you see at the end of almost every movie, regardless of whether it is a realistic and plausible story or something like Avatar.

Why? Is it illegal to look like someone else? I look like my sister and my dad.

This will soon get good enough to be indistinguishable from real faces. What's more, there will be collisions with real faces too, which could be amusing or disturbing when it triggers a conflict.

Very interesting. I'm sure this technology will get better. It's in its infancy.

But what's "off" to me about these pictures are the eyes, every single picture. I don't get that feeling of human connection. In some of the pictures, the "person" has two different eyes. In some others, the eyes just make me feel sick to my stomach if I look at them. They're "off" at best and super creepy at worst.

That being said, as others have pointed out, in profile pic size I'm sure I couldn't tell.

It's the uncanny valley in action! About 80% of these faces look a bit off to me but some of the others I find really hard to tell it's a fake.

So is now not the time for me to be starting my modeling career then?

The biggest weakness of this system seems to be generating realistic backgrounds; the faces look amazing, but around the edges some photos appear to "swirl" into the background.

Some of these have really odd and interesting errors https://imgur.com/a/oFBqRws

That last one is funny. Basically: expect anyone photographed at this angle to usually be wearing one of those TED-talk wireless microphone things. Rendered probabilistically?

Really cool stuff, but aside from the obvious nightmare fuel downthread, there are often still some interesting artifacts. I doubt I'd have noticed them before reading a Medium piece on the subject, but they are there: https://medium.com/@kcimc/how-to-recognize-fake-ai-generated....

This guy wants to sell me a SaaS


The heterochromia and slit pupils are really what sell me on the product...

Not to go all Black-Mirror here, but imagine training a model based on individualized user preferences to generate some sort of idyllic "person", leveraged for surreptitious advertising. For instance, analyzing what kind of Instagram models one engages with regularly and using that data to generate individualized "influencers". Just a thought :)

I feel like there's probably already a department at Alphabet working on this.

this is quite disturbing: https://imgur.com/JkBte93

This is likely drawing from the set of pre-generated faces. If so it is violating the attribution requirement of the license.

(Attribution-NonCommercial 4.0 International) https://drive.google.com/open?id=1TKGTq6XgMBzA29EfOGD6RB9jjP...

Are you sure? I would think the pre-generated list would weed out the obviously incorrect/malformed ones that you sometimes get.

Karras et al 2018 dumped a random sample, not a screened sample. (This avoids cherrypicking accusations.)

Agreed, I got the EXACT same image twice, so I think that must be what is happening.

I apologize if this is a silly question, but could someone provide some context for what this site is a demonstration of?

Hmm. You could take a subset of photos, label them on attractiveness, figure out the vector for your personal sense of human aesthetic, and then generate pictures of people that you specifically find beautiful. Sounds like a good business idea for... uh... ads.

(Or a service that makes you prettier in social media photos. Yay dystopia!)
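A back-of-the-envelope numpy sketch of that idea, assuming you already have each face's 512-dimensional latent code plus your own like/dislike labels (the labels and codes here are random stand-ins; none of this is from the StyleGAN codebase):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 512-dim latent codes for faces you've rated,
# plus a binary "I find this attractive" label for each.
latents = rng.normal(size=(200, 512))
labels = rng.integers(0, 2, size=200).astype(bool)

# The classic attribute-vector trick: the direction between the mean
# latent of liked faces and the mean latent of the rest.
direction = latents[labels].mean(axis=0) - latents[~labels].mean(axis=0)
direction /= np.linalg.norm(direction)

# Nudge any latent code along that direction before generating.
z = rng.normal(size=512)
z_prettier = z + 2.0 * direction   # the strength 2.0 is a tunable knob

assert z_prettier.shape == (512,)
```

With real labels, the difference-of-means direction tends to encode whatever attribute separates the two groups, which is exactly the "vector for your personal sense of aesthetic" being described.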

Found one with a microphone embedded in their cheek. Another with half a pair of glasses embedded in an eye socket.

Awesome job. any chance of adding a filter for Gender or Age?

e.g. https://thispersondoesnotexist.com/?$gender=Male&MinAge=30&M...

The first 3 I viewed were all incredible except for a weird little flaw around the edge of the hair/background line, first 2 had an unnatural notch out of their hair and the third was a weird discolouration/thinning of hair.

There's often some issues around hair, glasses, and... other people in frame. https://imgur.com/2CjLFEP

Absolutely awesome! Some faces seem to have some weird horizontal asymmetry, though (I'm not saying that people are perfectly symmetric, but in these pictures the two halves of the face seem to belong to different people).

The first time I heard of GANs creating faces of people who never existed was from the show "Person of Interest", where "the machine" (an AI) creates a face and assigns it to its identity.

I made a clone of the original site, check it out: https://thispersondoesnotexist.scholarguider.com

Is there any service that lets you pay to play with GANs? I have some art projects but don't want to go through the trouble of setting it all up. Surely someone would rent me their GPUs?

So does Google with collab, right?

I could use those but was hoping for something more turnkey.

I've had a small (nerdy) dream of converting current generation Pokemon sprites into older palette and size constrained sprites using techniques similar to this.

Anyone have any suggestions on how I might start?

This image generator does not understand the concept of earrings. Or stretched earlobes with plugs in them. Now and then I get a person who has these weird jewel-sores on their cheeks.

Kinda nifty.

Spent some good minutes on this. Scary as hell. Thx for sharing it

Uh, somebody want to explain this to us riff-raff? All I see is a web page that generates faces. No menus, no "About", nada. WTF is going on?

Either his GAN hasn't figured out that people's front teeth are usually symmetrical, or I only know people with good dentists.

I got the same exact face twice, so what's the deal? Is this not real-time generation and just grabbed from some pre-generated face database?

Could be improved by "This person does not exist... and never has before."

Some might just think they are deceased with the current title.

I only looked at a few. I was afraid I was going to see a woman and have love at first sight and be forever heartbroken.

This looks like it will be extremely useful for people generating plausible-looking Twitter sock puppet accounts.

Interesting to imagine this as an npc generator in a vr video game. I could imagine that feeling pretty lifelike

One thing I think I noticed is that the children's faces have older looking skin than would be expected.

Ted was right.

When there's another person on the side, their face is really messed up for some reason.

Ears always give it away. In every single picture I got so far the ears are malformed.

If a generated "random" person happens to be identical to a real person, then you must say the person exists. It's very likely that some of these images depict real people, and therefore the domain name makes no sense. Like if a process happened to generate a real equation: it's not the case that the equation doesn't exist.

Simultaneously the coolest and most frightening thing I've seen. Good work.

Seriously, these photos are really creeping me out. They've _almost_ escaped uncanny valley but the little details (wonky ears, unnatural hair, weird artifacts) make them super creepy.

The worst I've gotten is this guy: https://imgur.com/A2SsmVn. He's mostly normal but some glitch (presumably a partial pair of glasses) makes it look like his robot face mask is coming unpeeled. Extremely unsettling.

EDIT: Holy crap, I might have found one worse. This woman is smiling away as her mutated hand stabs a piece of glass into her face and I am so uncomfortable: https://imgur.com/z6UKVWn

That second one is terrifying! Did someone feed the system thalidomide and PCP or what? Some of them though, if I didn’t know I was looking for flaws, I would probably never think twice about them being humans.

So how long before we have a dataset to say if an image was generated or not?

There's at least one doppelganger that looks exactly like you in the world. It's interesting to think that some of these GAN rendered images of people could actually be people that exist in the real world.

Yeah, clearly. It showed me a boy with widespread stubble.

Can you get multiple photos of a "person"?

Yes. You can take a particular latent point and generate slight variants of it. You could also apply different style/noises to change the overall appearance like orientation. Take a look at the StyleGAN paper's figures, the StyleGAN video, or the various interpolation videos people have made.
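A minimal numpy sketch of what "slight variants" and interpolation mean here. The 512-dimensional latent codes match the StyleGAN setup, but no model is included, so these are just vectors:

```python
import numpy as np

rng = np.random.default_rng(42)
z = rng.normal(size=512)   # the latent point for "this person"

# Slight variants: small Gaussian perturbations of the same point,
# which the generator would render as near-identical faces.
variants = [z + 0.1 * rng.normal(size=512) for _ in range(4)]

# Spherical interpolation (slerp) toward another identity — commonly
# used for latent-space videos, since Gaussian latents concentrate
# near a sphere and linear interpolation cuts through its interior.
def slerp(a, b, t):
    a_n = a / np.linalg.norm(a)
    b_n = b / np.linalg.norm(b)
    omega = np.arccos(np.clip(np.dot(a_n, b_n), -1.0, 1.0))
    return (np.sin((1 - t) * omega) * a + np.sin(t * omega) * b) / np.sin(omega)

z2 = rng.normal(size=512)
frames = [slerp(z, z2, t) for t in np.linspace(0.0, 1.0, 8)]

# Endpoints are exactly the two identities.
assert np.allclose(frames[0], z) and np.allclose(frames[-1], z2)
```

Each frame's latent would be fed through the generator to produce one image of the morph; the style-mixing trick from the paper (swapping coarse vs. fine style inputs) is a separate mechanism on top of this.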

Looking to fill out your dating profile?

Well there goes the Image Turing Test....

Would this work with ASCII-art faces?

Ashley Madison should acquire this tech.

What kind of viral marketing campaign is this for?

Some ML face generator I assume

You aren't killing them as we click the photos, are you?

You joke but the first image I got was a Chinese looking guy. I was expecting to scroll and see a story about how he "disappeared".

ha ha, classic.

No, only when you unload the page.

GANs are fascinating, but I'd love to see a post-processing effect that removes some of the pixel fringe created during GAN composition.

Why doesn't this show up on the main page? It has 696 upvotes in 18 hours! It's not on the main page, it's not on Show HN.

This would be so much cooler and useful if you could generate several pictures of the same person in different angles/lighting.

Please add a way to share a generated face.

These people all definitely exist. The algorithm is breaking up photographs of real faces and recombining them. It’s portrait soup.
