Gwern has applied this to an anime dataset
Cyril at Google has applied it to artwork
This was to raise awareness of what a talented group of researchers at Nvidia built over the course of two years: the latest state of the art for GANs. https://arxiv.org/pdf/1812.04948.pdf (https://github.com/NVlabs/stylegan)
Rani Horev wrote up a nice description of the architecture here. https://www.lyrn.ai/2018/12/26/a-style-based-generator-archi...
Feel free to experiment with the generations yourself in a Colab notebook I made.
I'm currently working on a project to map BERT embeddings of text descriptions of the faces directly to the latent space embedding (which is just a 512-dimensional vector). The goal is to control the image generation with sentences once the mapping network is trained. Will definitely post on Hacker News again if that succeeds. The future is now!
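The mapping idea can be sketched as a small regression network from text-embedding space to latent space. This is only an illustration of the shape of the problem, not the poster's actual code: the dimensions (768 for BERT-base, 512 for StyleGAN's z) are real, but the architecture, initialization, and the stand-in input are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions: BERT-base sentence embeddings are 768-d,
# StyleGAN's latent z is 512-d.
BERT_DIM, LATENT_DIM = 768, 512

def init_mapper(hidden=1024):
    """Two-layer MLP mapping a text embedding to a latent vector.
    Weights are random here; in practice you'd train them on
    (description embedding, latent) pairs."""
    return {
        "W1": rng.normal(0, 0.02, (BERT_DIM, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0, 0.02, (hidden, LATENT_DIM)),
        "b2": np.zeros(LATENT_DIM),
    }

def map_to_latent(params, text_embedding):
    h = np.maximum(0, text_embedding @ params["W1"] + params["b1"])  # ReLU
    return h @ params["W2"] + params["b2"]

params = init_mapper()
fake_bert_vec = rng.normal(size=BERT_DIM)  # stand-in for a real BERT embedding
z = map_to_latent(params, fake_bert_vec)   # feed z to the generator
```

Once trained, any sentence embedding would land somewhere in latent space, and the generator turns that point into a face.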
"One or more high-end NVIDIA GPUs with at least 11GB of DRAM. We recommend NVIDIA DGX-1 with 8 Tesla V100 GPUs."
I'm sure my wife will understand why I took out that second mortgage on our home...no problem.
How many faces are generated? This can't be real time.
The results were... eh... okay, at least on my ugly face.
But overall, I'm still tweaking. In the meantime, I've been focusing on static image analysis for aging research, but I hope to find better encoding schemes down the road.
> Turns out it can disentangle pretty much any set of data.
All the examples I have seen (including your links) are variants of face generation algorithms. Any ideas on how this could be useful beyond image generation in some style? Specifically for (data) science?
Sorry if this is a naive question.
By "variants of face generation algorithms" I mean any image generation really.
Aside from the original work, on Twitter, people have done Gothic cathedrals very well, graffiti very well, fonts very well, and WikiArt oil portraits not so well. On Danbooru2017 full anime images (linked in my thread), one person has... suggestive blobs but has only put 2-3 GPU-days into it and we aren't expecting much so early into training. skylion has been running StyleGAN on a whole-body anime character dataset he has, and the results overnight (on 4 Titans) are pretty impressive but he hasn't shared anything publicly yet.
(Who says we aren't compute-limited these days?!)
That is, until Graphcore delivers their IPU.
It's not that hard to do it yourself, but it's a really clean package, and it gives you nice CLI flags for most things, like the pooling strategy and which layer you want to get the activations from.
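For context, "pooling strategy" just means how per-token activations get collapsed into one fixed-size vector. A minimal sketch of the common options (the function name and the fake activations are illustrative, not the package's API):

```python
import numpy as np

def pool_tokens(token_activations, strategy="mean"):
    """Collapse per-token activations (n_tokens x dim) from a chosen
    layer into a single fixed-size sentence vector."""
    if strategy == "mean":
        return token_activations.mean(axis=0)
    if strategy == "max":
        return token_activations.max(axis=0)
    if strategy == "cls":   # first token ([CLS]) only
        return token_activations[0]
    raise ValueError(f"unknown strategy: {strategy}")

# Fake activations for a 5-token sentence from a 768-d layer.
acts = np.arange(5 * 768, dtype=float).reshape(5, 768)
vec = pool_tokens(acts, "mean")  # one 768-d vector per sentence
```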
I think this is a very dangerous game we are playing here but I guess it is going to be done.
then yes, it should be possible
"To qualify as a work of 'authorship' a work must be created by a human being":
https://www.copyright.gov/comp3/chap300/ch300-copyrightable-... [PDF], see section 313.2 "Works that lack human authorship"
Monkey selfie case:
On 23 April, the court issued its ruling in favor of Slater, finding that animals have no legal authority to hold copyright claims 
Copyright is (read the law!) a temporary monopoly granted for works meeting certain criteria; being creative is one of them. You’d hold copyright for the code you wrote to generate the “art”. If you download somebody else’s code (as this site uses Nvidia’s), you lack the creative element.
> Recently a talented group of researchers at Nvidia released the current state of the art generative adversarial network, StyleGAN, over at https://github.com/NVlabs/stylegan
> I have decided to dig into my own pockets and raise some public awareness for this technology.
> Faces are most salient to our cognition, so I've decided to put that specific pretrained model up. Their research group has also included pretrained models for cats, cars, and bedrooms in their repository that you can immediately use.
> Each time you refresh the site, the network will generate a new facial image from scratch from a 512 dimensional vector.
We can smoothly interpolate between faces, so it seems impossible to me that these are just memorised from the training set.
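The interpolation itself is simple: sample two 512-d latent vectors and walk between them, feeding each intermediate point to the generator. A sketch (spherical interpolation is often preferred over linear for Gaussian latents; the generator call itself is omitted):

```python
import numpy as np

def lerp(z0, z1, t):
    return (1 - t) * z0 + t * z1

def slerp(z0, z1, t):
    """Spherical interpolation between two latent vectors."""
    z0n, z1n = z0 / np.linalg.norm(z0), z1 / np.linalg.norm(z1)
    omega = np.arccos(np.clip(np.dot(z0n, z1n), -1.0, 1.0))
    so = np.sin(omega)
    if so < 1e-8:                     # vectors nearly parallel
        return lerp(z0, z1, t)
    return (np.sin((1 - t) * omega) / so) * z0 + (np.sin(t * omega) / so) * z1

rng = np.random.default_rng(1)
z_a, z_b = rng.normal(size=512), rng.normal(size=512)
frames = [slerp(z_a, z_b, t) for t in np.linspace(0, 1, 10)]
# Feeding each frame to the generator yields a smooth morph from face A to face B.
```

If the network had merely memorised training images, these in-between points would produce garbage rather than plausible intermediate faces.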
However, aligned faces are definitely not required - I didn't do any kind of alignment for my anime faces and you can see the eyes/nose/mouth in all sorts of positions in the samples & videos.
Previously it wasn't trivial to build a GAN image generator; now, as this site shows, it's, if not trivial, at least not particularly hard.
... but all, or almost all, of these would be irrelevant at a profile pic size. At that size, assuming these aren't just recapitulations of the training data (and I assume they aren't) this technique appears to be 99%+ successful.
Also, look at the non-face stuff. Some backgrounds are just "incredibly blurred vaguely landscapy stuff", which is plenty realistic, but I've seen the algorithm attempt wood grain, which went poorly. I've seen some bizarre patchwork backgrounds, and one picture had a person cut off to the right like a single photo trimmed from a family photo, and the cut-off person was some sort of SCP-monstrosity mercifully cut off by the edge of the photo. Still, the success is impressive. The failures are definitely going from "in your face" to "easy to ignore/miss".
Fix up the training data a bit and this'd be a profile pic machine.
Every single picture, if you really look at it, is disturbing for reasons you can't pick out. It's definitely hitting the uncanny valley. It's juuuust human enough to blend, but not human enough to avoid the creeping feeling of dread.
But that being said, if I'm cruising forums and see this in a thumbnail size, I'm not going to be able to pick out that it's not a real person.
I'd say you're correct that it is still in it, but it is clearly climbing up the other side now. We're past the minimum of verisimilitude.
The biggest problem is transferring faces to existing photos. It was hard to do manually. Now it's much easier. Also, people are generally trained to ignore various artifacts by CGI-ridden movies and compression algorithms. So much of our notion of how the world looks comes from digital imagery, it's kind of scary.
I think we need to change the threshold of quality for an image/video to constitute "proof" of any kind. You can hide most of the weird artifacts by scaling things down or passing them through heavy compression.
The generator is around 150MB and the forward pass takes <0.1s, so hypothetically a decent GPU like a 1080ti with 11GB of VRAM at full utilization should give you at least 10 images per second unbatched, and far more if you batch the forward pass. Use a few GPUs and you're plausibly into the hundreds or thousands per second.
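A quick back-of-the-envelope check on those numbers (the batch size and the linear multi-GPU scaling are assumptions, not measurements):

```python
# Rough throughput estimate for a single-GPU StyleGAN generator.
# Assumptions: ~0.1 s per forward pass, a batch of 16 fitting in
# 11 GB of VRAM at roughly the same per-pass cost, near-linear
# scaling across GPUs.
seconds_per_pass = 0.1
images_per_second_single = 1 / seconds_per_pass            # ~10/s, one image per pass
batch_size = 16                                            # assumed
images_per_second_batched = batch_size / seconds_per_pass  # ~160/s if batching is ~free
num_gpus = 4
total_per_second = images_per_second_batched * num_gpus    # ~640/s across 4 GPUs
```

Even with conservative assumptions, generating millions of fake faces per day is within reach of one workstation.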
Run it several billion times, create an Earth-sized social network, then give it content with a meme generation system like Dank Learning https://arxiv.org/abs/1806.04510
A social media profile with a single picture is pretty suspicious.
To be convincing you'd need a steady stream of pictures of the "same" fictitious person, doing typical social media thing -- selfies with friends, vacation pics, appearances in other peoples' pictures, etc.
Every single other one I've seen has a bunch of tiny low res thumbnails on a github page that serve to completely obscure any potential artifacts or issues with the system.
(I could clone their code and run it, but that's not the output that any of the discussion they've prompted is operating on, and that's kind of the point of hacker news).
So thanks for doing the bare minimum for an image processing project, finally.
Two problems that I can see:
1. What use cases are there for a photo processing algorithm that only spits out tiny thumbnails?
2. If it can output higher resolution images, why are all of the examples tiny thumbnails? You can hide a lot of otherwise obvious flaws with a tiny thumbnail.
Is anyone working on a GAN to generate bone structure, then flesh and skin/mouth/eyes textures, and pipe the result into a ray tracer?
It's incredible what can be done in 2D solving directly for the result, but imagine where this goes when it works in volume, at multiple levels, driven more by physics.
Of course, this is also why training data is more difficult to acquire, as you mention.
Privacy is hard to talk and reason about without defining everything specifically.
This project's images were sourced from Flickr. You can find medical imagery on Flickr reasonably easily as well, it turns out.
You wouldn't "program" it with flesh and bones; you would generate a life-like but original new skeleton in the same way we generate these images, except the solution space is the space of possible skeletons instead of possible pixel configurations. Then generate soft tissues that are also original, conforming to biological constraints and also constrained by the underlying skeleton. Same for skin: created from scratch but believable, and driven by the underlying tissue.
Other commenters mention that Ashley Madison, stock-photo companies, and other spammers will take this and run with it. Honestly, I suspect that has already happened for a while now and may explain the issues that FB and Twitter are having.
Though I can't find the thread, there was a discussion here on HN a while back about the 'inversion' issue. Briefly, YouTube uses ML and RNN models to help distinguish spammers from real people (after pre-processing and cleaning things up a fair bit). However, if the number of spammers that DO make it through the various filters grows past 50%, the filters will 'invert': the models will start to classify the spammers as 'real', and the real people will likely be told they are spammers.
I can imagine that this site will quickly exacerbate that issue.
Honestly, in reading Cal Newport's new book, I can't say I'm all that sad about it. The 'casino'-like design of the modern web is obviously bad for us. In moderation, yes, but to the level we are at currently? Not a chance. Driving users away from these sites and devices isn't a bad thing for anyone that isn't earning a paycheck via the FAANGs.
Hopefully this kind of tech will cause a bit of a restructuring of the modern web in the long haul. I doubt it, but one can hope.
Not that Jack can even properly use Twitter to begin with: https://danluu.com/karajack/
Edit: the problems with these images look very much like application of anisotropic smoothing. G'MIC has filters like that. You can make this stuff look more realistic by blurring it (gaussian) and adding noise (uniform). Blurring hides small-scale irregularities, while noise makes blurring less obvious by adding small-scale "grains" that you perceive as detail/texture.
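The blur-then-noise trick described above is easy to sketch. This is an illustration of the idea, not any particular G'MIC filter: a separable Gaussian blur to hide small-scale irregularities, then uniform noise to add fake "grain" that makes the blur less obvious. The function name, kernel size, and parameter values are all assumptions.

```python
import numpy as np

def disguise_artifacts(img, sigma=1.5, noise_amp=8.0, seed=0):
    """Gaussian-blur a grayscale image, then add uniform noise."""
    rng = np.random.default_rng(seed)
    # Separable Gaussian blur via a 1-D kernel (no SciPy needed).
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    # Uniform noise reads as film grain / sensor noise and masks the blur.
    noisy = blurred + rng.uniform(-noise_amp, noise_amp, img.shape)
    return np.clip(noisy, 0, 255)

img = np.random.default_rng(1).uniform(0, 255, (64, 64))  # stand-in grayscale image
out = disguise_artifacts(img)
```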
Also, this stuff could be used to take a hand-drawn portrait and animate it with different expressions without rework.
Well, I'm not so sure. On one hand, games use graphics techniques that were developed a couple of years prior to their development. On the other hand, they fail to capitalize even on relevant AI research of the '60s and '80s. When they do, it looks amazing (e.g. the F.E.A.R. AI), but it's very rare.
> As much as we like to pat ourselves on the back, and talk about how smart our A.I. are, the reality is that all A.I. ever do is move around and play animations! Think about it. An A.I. going for cover is just moving to some position, and then playing a duck or lean animation. An A.I. attacking just loops a firing animation. Sure there are some implementation details; we assume the animation system has key frames which may have embedded messages that tell the audio system to play a footstep sound, or the weapon system to start and stop firing, but as far as the A.I.’s decision-making is concerned, he is just moving around or playing an animation.
> Now let’s look at our complex behaviors. The truth is, we actually did not have any complex squad behaviors at all in F.E.A.R. Dynamic situations emerge out of the interplay between the squad level decision making, and the individual A.I.’s decision making, and often create the illusion of more complex squad behavior than what actually exists!
> Imagine we have a situation similar to what we saw earlier, where the player has invalidated one of the A.I.’s cover positions, and a squad behavior orders the A.I. to move to the valid cover position. If there is some obstacle in the level, like a solid wall, the A.I. may take a back route and resurface on the player’s side. It appears that the A.I. is flanking, but in fact this is just a side effect of moving to the only available valid cover he is aware of.
Either that, or digital communication will have to include defensive fake detection features, and the rest of that thought is a Philip K Dick novel ;)
Does it only generate white people? I've been refreshing for a while but don't see any people of color.
Edit: No, I finally got someone who wasn't white, it just seems to have a helluva bias.
> The images were crawled from Flickr (thus inheriting all the biases of that website) and automatically aligned and cropped. Only images under permissive licenses were collected. Various automatic filters were used to prune the set, and finally Mechanical Turk allowed us to remove the occasional statues, paintings, or photos of ...
It's still much more diverse than previous datasets (which used U.S. celebrity photos), but would require some additional work to match the actual world population distribution.
This work is impressive, I don't mean to take anything away from it, but if the author had to filter through 1000 images to select the 5 I saw, that's ... disappointing?
This is what I got: https://i.imgur.com/iCfzjkZ.jpg
In no specific order:
- weird hair above the person's right eye, that doesn't match with the overall hairstyle (the patch of short hair) or realistic hair behaviour (straight bit of hair)
- what seems like beard on the chin, with unrealistic lighting
- hair turns into leaves at the bottom
- weird reflection in the left lens
- mismatched shapes for the glasses (there's a small bump only on the right lens)
Also a LOT more in this subreddit: https://old.reddit.com/r/SyntheticNightmares/
Clearly, this algorithm has captured a dryad.
But seriously: it may be essential for legal reasons to be 100% sure that an automatically generated face does in fact not depict a real-life person.
There is a huge difference between "hey, that painting looks like you" and "hey, that is a photograph of you".
But what's "off" to me about these pictures are the eyes, every single picture. I don't get that feeling of human connection. In some of the pictures, the "person" has two different eyes. In some others, the eyes just make me feel sick to my stomach if I look at them. They're "off" at best and super creepy at worst.
That being said, as others have pointed out, in profile pic size I'm sure I couldn't tell.
(Attribution-NonCommercial 4.0 International) https://drive.google.com/open?id=1TKGTq6XgMBzA29EfOGD6RB9jjP...
(Or a service that makes you prettier in social media photos. Yay dystopia!)
I could use those but was hoping for something more turnkey.
Anyone have any suggestions on how I might start?
Some might just think they are deceased with the current title.
The worst I've gotten is this guy: https://imgur.com/A2SsmVn. He's mostly normal but some glitch (presumably a partial pair of glasses) makes it look like his robot face mask is coming unpeeled. Extremely unsettling.
EDIT: Holy crap, I might have found one worse. This woman is smiling away as her mutated hand stabs a piece of glass into her face and I am so uncomfortable: https://imgur.com/z6UKVWn