On a side note, the section "A word about ethics" was a welcome addition. I found this:
> If we can generate realistic-looking faces of any type, what are the implications for our ability to trust in what we see?
resonated with another article posted on HN (https://news.ycombinator.com/item?id=18309305) talking in part about the ethical impact of engineered addiction.
I guess very soon we will be able to generate "super-attractive" (as in "superstimuli") faces for virtual personas, according to targeted demographics and purpose (advertisement, youtube videos for kids, political messages and so on).
Devil's advocate: I doubt that. "Super-stimulating" ourselves into oblivion requires a level of willful complacency that, on the whole, we manage to avoid as a species. Otherwise things like vegan diets wouldn't be fads, because no one would willfully choose crappier food options for the sake of abstract reasons like "ethics" or "morality" that have zero impact on our daily lives.
There is solid evidence against vegan diets being a fad, unless you regard a >3% yearly sales growth of vegan-labeled food, or a roughly 600% increase in Google searches since 2004, as a fad, in addition to the roughly 500M-1B people who are on a mostly plant-based diet for cultural or practical reasons. I'd wager that only a small percentage of people are in it solely for ethical or moral reasons.
Everyone likes to be super stimulated. That's what the phrase means. It's tautological. You may just like to be super stimulated by different things, but at the end of the day, it's just dopamine in your Ventral Tegmental Area.
GP does not deny being super-stimulatable, just not by a beautiful face. Once AI reaches the level where it can be an intellectual mentor, people like GP may be sucked into infinite pointless learning just for the kick of it.
It’s already happening without AI (I read HN for mostly pointless stimulation; there is much more fascinating knowledge on the internet than you can understand in your lifetime)
The first fact is free. That's how confident they are that you'll become a paying member.
I know many brilliant people with zero drive who hedonistically seek out the greatest stimuli they can find. When I talk with them about this, they say it is partly out of a feeling of depression/helplessness alongside feeling like you can’t actually change anything in the world so you might as well just enjoy the ride.
I’ve found driven people have more impulse control regardless of “intelligence” (however we may choose to measure this).
Imagine a version of Star Trek where it turns out that every planet is Risa, only Vulcan is Space Buddhist Risa, Qo’noS is Kinky Risa, Kzin is Furry Risa etc., and all the Vulcans, Klingons, and Kzinti etc. have just been extinct for thousands of years leaving bots behind.
There is no doubt that technology will revive the old hedonistic traps and lead to some degree of decadence, but the stoic mindset should survive and keep mankind from falling completely.
"...big data and your love life after this..."
You misspelled "profitable"
We have done that already, except it was not using AI. I would classify most anime under that label of unrealistic super-attractive images.
As for the ethical impact, Akihabara seems to be one end result of this, so it would be the same, just on a larger scale.
How is it really different, in an essential way, from using Eigenfaces for the same task?
There's a paper from year 2000 that does exactly that; you can see the results on page 5.
Now, the results in the linked article are way more impressive than the Eigenfaces-generated random faces, but from skimming the text, it seems like the principles are the same: dimensionality reduction from a high-dimensional space into a space where the axes are "meaningful" components (in Eigenfaces, the reduction is done using SVD).
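For reference, here is a minimal NumPy sketch of the Eigenfaces idea: center the data, take the SVD, treat the top right-singular vectors as "eigenfaces", and sample coordinates along them to get a random face. The random matrices stand in for real face data; all shapes and names are illustrative, not the article's.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in data: 100 flattened 8x8 "face" images.
X = rng.normal(size=(100, 64))
mean_face = X.mean(axis=0)
Xc = X - mean_face                       # center: subtract the mean face

# SVD of the centered data; the rows of Vt are the eigenfaces.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
k = 10
eigenfaces = Vt[:k]                      # top-k principal directions

# Project one face into the k-dim space and reconstruct it.
coords = Xc[0] @ eigenfaces.T            # k "meaningful" coordinates
reconstruction = mean_face + coords @ eigenfaces

# Generate a "random face": sample coordinates at the right scale
# along each axis and map them back to pixel space.
scales = S[:k] / np.sqrt(len(X) - 1)     # per-axis standard deviations
random_face = mean_face + (rng.normal(size=k) * scales) @ eigenfaces
```

The paper's page-5 faces come from exactly this kind of sampling, just with real face images instead of noise.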
Eigenfaces is a linear approach; there's also a multilinear version, TensorFaces by Vasilescu et al. I wonder if similar quality images can be obtained using this approach.
Also, deep-learning-based approaches don't require much fiddling with features; the main work is coming up with the NN architecture and the loss function.
For variational autoencoders (one baseline technique for this sort of thing) if you make your neural network have one layer without a nonlinearity, and train it, it ends up minimizing the same objective as PCA (i.e. finding the eigenfaces).
I believe this is also true of GANs where you similarly restrict the generator and discriminator to be very simple.
I bet there is a nonlinear non-NN approach that could perform well, but we may not have the investment in hardware, well-optimized algorithms, etc to train big models fast.
edit: here's a paper that connects GAN to PCA in a simple case, among many other things. not the easiest to follow, though.
Unlikely in the near future, highly probable in the longer term. Imagine writing a script (movie) and having the AI generate the characters... taking storytelling to a new level.
Avatar Digitization From a Single Image For Real-Time Rendering
(from the people at Pinscreen)
There's a lot of work being done on using deep learning for computer graphics in general as well.
It's not AI, but MakeHuman has parametric features for human meshes, and you can hit "Random". It also has sliders for age and gender parameters. I last used it a while ago, I'm sure I'm underselling it.
This approach doesn't work very well because simple linear morphs don't account for all the correlations and multidimensional variations of an actual human body shape, resulting in similar or unrealistic body shapes. It's like nudging a face in a photo editor. Different characters in Daz Studio are still hand-crafted in a separate 3D modeling software, and just adjusted with the morphs.
Images of real people have been scored along the adjustable metrics, and then as you click the +/- adjustments, the source images are very craftily "blended" to produce a finite spectrum of results.
Yes, no output image is itself a "real" human (although I wonder whether, if you tweaked the settings just right to exactly match one of the input images, it would just spit it back out).
It does not appear to me that we are seeing an AI that has learned what humans look like and can now generate arbitrary fictional humans. The magic trick is exposed by lossolo's comment below. 
EDIT: Apparently I'm just agreeing with what femto113 said 3 hours ago down-thread 
 - https://news.ycombinator.com/item?id=18310461
 - https://news.ycombinator.com/item?id=18311377
Imagine if you could watch Breaking Bad season 9, for example: Walter White Jr. breaks bad.
Two networks--a "generator" and a "discriminator"--play a minimax game where the generator maps random vectors into images, and the discriminator attempts to distinguish between these images and real images of celebrities. When the discriminator is (close to) optimal, image-gradients based on the features it uses to predict are passed to the generator, so that the generator can learn the distribution in feature space that makes up human faces.
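The minimax objective described above can be written down in a few lines. Below is a deliberately tiny 1-D sketch (a linear "generator", a logistic "discriminator", synthetic data), not the article's network: it computes the discriminator's loss, takes one analytic gradient step on it, and computes the non-saturating generator loss that practical GANs use.

```python
import numpy as np

rng = np.random.default_rng(1)
sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# Real samples from a 1-D "data distribution"; latent noise z.
x_real = rng.normal(3.0, 1.0, size=256)
z = rng.normal(size=256)

# Tiny generator G(z) = a*z + c; logistic discriminator D(x) = sigmoid(w*x + b).
a, c = 1.0, 0.0
w, b = 0.1, 0.0
x_fake = a * z + c

def d_loss(w, b):
    d_real = sigmoid(w * x_real + b)
    d_fake = sigmoid(w * x_fake + b)
    # D maximizes log D(x_real) + log(1 - D(G(z))); we minimize the negative.
    return -np.mean(np.log(d_real)) - np.mean(np.log(1.0 - d_fake))

before = d_loss(w, b)

# One analytic gradient step for D
# (d/dlogit of -log sigmoid(logit) is sigmoid(logit) - 1, and so on).
d_real = sigmoid(w * x_real + b)
d_fake = sigmoid(w * x_fake + b)
grad_w = np.mean((d_real - 1.0) * x_real) + np.mean(d_fake * x_fake)
grad_b = np.mean(d_real - 1.0) + np.mean(d_fake)
lr = 0.01
w, b = w - lr * grad_w, b - lr * grad_b
after = d_loss(w, b)

# Non-saturating generator loss: G maximizes log D(G(z)),
# so the image-gradients from D tell G how to move toward "real".
g_loss = -np.mean(np.log(sigmoid(w * x_fake + b)))
```

In the real system both players are deep networks and the gradient through D's features, not a hand-derived formula, is what teaches the generator the face distribution.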
> Images of real people have been scored along the adjustable metrics, and then as you click the +/- adjustments, the source images are very craftily "blended" to produce a finite spectrum of results.
There is no reason to believe the source images are actually embedded in the latent space.
The descriptors that you can edit in the gui don't necessarily span an orthogonal basis in that latent space, so some of them are correlated, which is why editing one value can change others. Additionally, there is no a priori reason to believe that the manifold of "human face-like images" of 628x1024 is 512-dimensional, so there are areas of the space that still don't map well to real images. The network's ability to cover this space is limited by the number of unique training images it sees, how long it is trained, and its architecture.
I think both you and the author of the article are making the same mistake here. (Although at least you use "orthogonal" and "correlated," whereas the author calls nonorthogonal vectors "entangled" for some reason.)
If you have a nonlinear function f on a vector space, there's no reason why an orthogonal basis for that space will give a better parameterization than a nonorthogonal basis. Even if you have a linear function, there's no reason why that should make a difference.
(For example, take f(x,y) = (x-y,y). Then f(x,0)=(x,0) and f(y,y)=(0,y), so "correlated" input directions (1,0) and (1,1) are mapped to "independent" or orthogonal outputs.)
I think it is a bit of a mystery why Gram-Schmidt orthogonalization makes a difference here. Perhaps the author should experiment more with different inner products.
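For concreteness, this is the Gram-Schmidt procedure under discussion, sketched with the standard dot product; swapping in a different inner product, as suggested, would just change the projection step.

```python
import numpy as np

def gram_schmidt(vectors):
    """Classical Gram-Schmidt on row vectors, using the standard
    dot-product inner product."""
    basis = []
    for v in vectors:
        v = np.asarray(v, dtype=float)
        for q in basis:                   # subtract components along accepted directions
            v = v - (v @ q) * q
        norm = np.linalg.norm(v)
        if norm > 1e-10:                  # drop (nearly) linearly dependent inputs
            basis.append(v / norm)
    return np.array(basis)
```

For example, `gram_schmidt([[1, 1, 0], [1, 0, 0], [0, 1, 1]])` returns three orthonormal rows spanning the same space as the inputs.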
>If you have a nonlinear function f on a vector space, there's no reason why an orthogonal basis for that space will give a better parameterization than a nonorthogonal basis.
I don't think I made that claim. Here's all I'm saying: To whatever degree the features of interest are linearized in the latent space (and there's really no guarantee that they are), we don't have any guarantee that those linear features are orthogonal to one another, so tuning the latent representation along one feature will also impact others.
> (For example, take f(x,y) = (x-y,y). Then f(x,0)=(x,0) and f(y,y)=(0,y), so "correlated" input directions (1,0) and (1,1) are mapped to "independent" or orthogonal outputs.)
That's true, but remember that the nonlinear mapping is from our latent space (spanned by uniformly random 512-element input vectors) to pixel space. We really don't care about linear algebra in pixel space. I have zero expectation that we would preserve orthogonality from latent to pixel space.
I don't think any part of the GAN objective requires that these interesting features actually be linearized in the latent space (obviously they are not in pixel space), but the approach is to use a GLM to find the latent vectors that best fit the features anyway. Whether or not the vectors you identify with the GLM really retain their semantic meaning through the latent space, they're also clearly not orthogonal, so changing the latent representation along one dimension also changes others.
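The GLM step and the non-orthogonality claim can be illustrated with synthetic latent codes. Everything here is invented for the sketch (the names `d_smile` and `d_age`, the labels, the 512-dim codes): two attribute directions are deliberately constructed to overlap, least squares recovers them from labeled codes, and their cosine similarity comes out far from zero.

```python
import numpy as np

rng = np.random.default_rng(2)
# Hypothetical setup: 512-dim latent codes with two scalar attribute scores.
Z = rng.normal(size=(1000, 512))
d_smile = rng.normal(size=512)                        # "true" smile direction
d_age = 0.6 * d_smile + 0.8 * rng.normal(size=512)    # deliberately correlated
y_smile = Z @ d_smile + rng.normal(scale=0.1, size=1000)
y_age = Z @ d_age + rng.normal(scale=0.1, size=1000)

# The GLM step: ordinary least squares from latent code to attribute score.
w_smile, *_ = np.linalg.lstsq(Z, y_smile, rcond=None)
w_age, *_ = np.linalg.lstsq(Z, y_age, rcond=None)

# Cosine similarity between the fitted directions: a nonzero value means
# the axes are not orthogonal, so moving along "smile" also moves "age".
cos = w_smile @ w_age / (np.linalg.norm(w_smile) * np.linalg.norm(w_age))
```

Whether real GAN latents linearize attributes this cleanly is exactly the open question in the comment above; the sketch only shows why correlated directions produce coupled sliders.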
Here, take a look at this video: one hour of high-res generated faces. Can you still say AI hasn't learned to generate arbitrary identities?
I've got a similar algorithm https://bit.ly/2ELVG50
To use, upload a photo with the #ageme command. This is beta though, so will take a while to return. The other thing this bot does (the #showage command) runs instantly.
Your model seems to produce much higher-quality output. I think I paid a penalty by putting the user's face into latent space, which is a difficult optimization to perform.
On second thought, I don't mean that the controls behave randomly. I understand they affect related parts of the system: if you increase "baldness" on a woman's face, it will obviously increase the "male" factor and the "gray hair" factor. I understand that these faces are being generated from a continuous space. Fascinating.
"Images of faces manipulated to make their shapes closer to the average are perceived as more attractive."
Basically, there's just one level of cognition. In this case, the AI would only achieve that expected fidelity if the system is layered with more and more models that aim for correctness and accuracy (does this look like a woman, does this look like a mouth, does this look like a nose, etc). The problem with this approach is that it becomes incredibly hard to determine what's needed to be 100% successful at a complex task.
This is the reason why I think we are still far far away from a fully cognitive AI and is the same reason why you only see AI used for very narrow use cases.
Self-driving cars seem to be the first real attempt to have a broad AI system applied to a super-complex and unpredictable field, but I always see conflicting information regarding the progress and challenges in this area.
In fact, that network is probably already part of the original GAN training phase.
Expecting that this works all the time (or expecting that all points in this 512-dim space result in a beautiful person) is probably a bit too much to ask. :-)
> Training on photos of celebrities of Hollywood means that our model will be very good at generating photos of a predominantly white and attractive demographic... If we were to deploy this as an application, we would want to make sure that we have augmented our initial dataset to take into account the diversity of our users.
Could this system generate credible faces of people of color? If it has a "gender" axis, could it have a "melanin" adjustment axis? Or various ethnic axes?
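In principle, yes, given labeled examples: a common way to add such an axis is to average the latent codes of generated faces labeled positive and negative for the attribute and take the difference of the means. The sketch below uses random stand-in codes and an invented attribute name; nothing here comes from the article's model.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical: latent codes of generated faces, hand-labeled for the
# attribute of interest (an invented "melanin" tag for illustration).
z_pos = rng.normal(loc=0.3, size=(200, 512))
z_neg = rng.normal(loc=-0.3, size=(200, 512))

# The attribute axis: the difference of the two group means, normalized.
melanin_axis = z_pos.mean(axis=0) - z_neg.mean(axis=0)
melanin_axis /= np.linalg.norm(melanin_axis)

# Editing: slide any latent code along the axis before decoding it.
z = rng.normal(size=512)
strength = 1.5                           # free parameter, tuned by eye
z_shifted = z + strength * melanin_axis
```

Whether the resulting faces are credible still depends on the training data actually containing enough diversity, which is the article's point about dataset augmentation.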
High cost to generate the data, but the first thing I’d do if I was setting up a designer baby startup.
And the same is true for what's presented here: as "fake photos", once you hit photorealistic, you're done. There is no "until humans start seeing the pixels".
Maybe once, but not over repeat occurrences. Then you have an arms race between consumers tuning their aesthetic preferences to certain quality signals, producers trying to exploit those preferences without actually delivering quality, consumers re-adjusting their preferences after getting burned, etc.
A closer comparison might be dating profile pics, which as I understand dating "shoppers" quickly learn to distrust, at least for certain angles or types of shots. This AI enhancement stuff would presumably cause some rapid evolution in that particular arms race.
The only thing keeping us safe is that we don’t yet know how our minds work.
On the other hand, this could be useful for generating NPC/character portraits in RPGs. Fat chance this will ever happen, though. No one wants to deal with the Python mess of libraries, and no one seems to care about putting neural-net stuff in self-contained, reusable packages.
I guess one needs to train a model from a dataset that contains a picture of oneself and many images of other people, some with beards (possibly CelebA-HQ). According to the README this would take about 2 weeks on a NVIDIA Tesla V100.
Given that we already have a model from CelebA-HQ would it be possible to use that as a pre-trained model and just train it a bit more with an additional image? If possible would that speed things up enough to do it on cheaper hardware in less than a day?
Actually I think this could do the job: https://arxiv.org/abs/1702.04782
Is it the clothes, or gait, or size? Is it hair style or style of shoes? And at what age does this female/male separation begin? Generally, I can't assess newborns very well, but by the age of, say, 5 it becomes much easier.
This software seems to answer some of these questions by doing things like presenting males with stronger jawlines or females with higher cheekbones. Does this same thing happen with other parts of the body (barring the obvious)? I know that females usually have wider hips and as a result walk differently than males. Are there other examples of differences that we intrinsically know but don't consciously realize?
NOTE: I'm speaking in terms of traditional gender presentation. Obviously I can't tell if someone is gender dysphoric from a distance.
Probably everything from https://en.wikipedia.org/wiki/Sex_differences_in_human_physi....
Still very cool but not as cool as it initially appeared when I saw it.
The AI should generate a face which will be most beautiful and trusted in all cultures.
It will use my present face and morph things from there within those limitations, using the fewest incisions in the least risky areas of the face. After that, we get a surgeon to perform plastic surgery as per this AI.
I am looking for ideas on how to accomplish it.