It struck me how this technology could be used to obfuscate and hide information. Imagine it churning out an unlimited number of images with generated descriptions that are indistinguishable from what a human would produce. For someone just casually searching the Internet, there would be no way to verify what the correct information is. (This could apply to almost any type of information...)
Yeah, the listings on this page don't quite reach a human level of coherence:
"Thus serve a 24 hour security pournising. Also hype is requested as much as you please empty and have a prestige less um be restriction, day or night."
It's close, though! But maybe not close enough for people to worry -- the last 10% of the security pournising always takes 90% of the development time.
If that were going to be a problem, people would already be generating fake numerical data (material properties, physical constants, statistics, and so on) that's subtly wrong. You don't need machine learning to do that.
This fear of fake information from ML misleading everyone is ridiculous and kind of arrogant. It assumes that the world is full of "other people" who are too stupid to make decisions for themselves, and it usually concludes that "us smart people" have to somehow control what they see to protect them from themselves. We've had fake information forever, and we've developed systems to deal with it: citing sources, trusting reputable organizations, checking that multiple sources agree, people pointing out mistakes, Google favoring popular sites, confirming things yourself, etc. Some fake information still gets through, and that's a problem, but it already happens and the world keeps turning. As for casual internet searchers who don't care how reliable their information is: let them believe whatever nonsense satisfies them. They aren't trying to be right, they're just entertaining themselves.
At the very least, I think we need to train people in a lot more media literacy. But traditional approaches to that relied on media being scarce and expensive to create, which gave people enough time to carefully vet what they were consuming. As it becomes cheaper to create media than to vet it, we'll have the same problem as spam: it'll be impossible for humans to effectively filter it manually.
I think the real solution is automated vetting tools, so that no information is presented without provenance. Basically, any time somebody sees an image or a video, there should be a link that lets you find out about the source, the editing, and who, specifically, is vouching for it. And there should be warnings for things that lack that. That still gives the viewer agency, but brings the problem back to human scale.
I wonder if this could be used as an identification tool for beetles, like a police composite sketch. You could move some sliders to get it closer to the bug you're thinking of, or the one you have in front of you.
Of course, this method would have to compete with a related model just trying to classify a photo.
It could be used like that.
Technically, the model has already learned the features of the dataset (although it was unlabelled).
There are some implementations that find features in StyleGAN's latent space and enhance/modify them, e.g. StyleGAN Encoder:
https://github.com/spiorf/stylegan-encoder
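To make the slider idea above concrete, here's a minimal sketch. The generate_image function below is a toy stand-in (a fixed random linear map); a real pretrained StyleGAN generator would replace it, and direction would be a learned feature direction rather than a random vector:

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy stand-in for a pretrained generator: maps a 512-dim latent
    # vector to an 8x8 grayscale image. A real StyleGAN generator
    # would go here instead.
    W = rng.standard_normal((64, 512)) * 0.05

    def generate_image(z):
        return (W @ z).reshape(8, 8)

    z = rng.standard_normal(512)          # the beetle you're "thinking of"
    direction = rng.standard_normal(512)  # in practice, a learned feature direction
    direction /= np.linalg.norm(direction)

    # The "slider": step along one latent direction and re-render.
    variants = [generate_image(z + alpha * direction)
                for alpha in np.linspace(-3.0, 3.0, 7)]

Each slider in a UI would just be one such direction with alpha bound to the slider position.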
This is a beautiful write-up, and I don't wish to detract from the author's work. But seeing Nature's result side-by-side with the Machine's makes one feel as though we've taken a step backwards from Alan Turing's "The Chemical Basis of Morphogenesis".
Consider the "morphogenetic puzzle" of a bivalve seashell that shuts with a perfectly watertight seal. There is a constraint on this design: survival!
Reminds me of video game art: every game has two sets of art, the “real” art which is part of the mechanics of the game world, and the “pretty bits” which dangle off the game objects and try to trick you into believing there’s more to the game world than there is.
A lot of gameplay involves testing for this boundary... Trying to figure out whether you can actually do things that are implied by the art.
Are there any modern games where 100% of the art exists inside the game world?
Excellent. I did something similar a few months ago using a dataset of zoological silhouettes, resulting in a menagerie of mammals, bugs, spiders and other mutant wonders.
Was this model tested for overfitting? I do not have any sense of whether the beetles that I'm seeing match some source pictures exactly.
I noticed that the transformations seem to move quickly through a transition and then pause. Is this intentional, or does it have something to do with the model?
To the creator: since most (all?) of the beetles are symmetrical, couldn't you generate left halves and then reflect them to create the right halves? This could help prevent asymmetric generations.
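For what it's worth, the mirroring step itself is nearly a one-liner once you have the half-images; a sketch, assuming generated images come out as NumPy arrays of shape (height, width, channels):

    import numpy as np

    # Stand-in for a generated left half (64 pixels tall, 32 wide, RGB);
    # in practice this would come from a generator trained on left halves.
    left = np.random.rand(64, 32, 3)

    right = left[:, ::-1, :]                        # reflect across the midline
    beetle = np.concatenate([left, right], axis=1)  # perfectly symmetric (64, 64, 3)

Symmetry is then guaranteed by construction rather than learned.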
The network should just learn that they all have symmetry and only encode the unique information. Once you start hand-coding priors like this, where do you stop? Maybe also constrain the range of colors? Size? Other geometric features? Eventually you're just doing old-fashioned programming, not ML. And since generating beetle images isn't really the important goal anyway, why would you use dirty tricks to achieve it?
What I never understand about these things is: what actually does the drawing? At what level of abstraction does the AI decide what the beetle looks like? When and how does it go from "beetle idea" to pixels? Does this network "know" what the beetle's "leg" is, or does it just "know" that this pixel here should be this color?
From what I understand, there are two networks in a GAN like this one.
One (the discriminator) is trained on a bunch of images showing what beetles can look like. Its job is to tell a real image of a beetle from a fake one.
The other (the generator) just generates images with a convolutional neural network. The generator optimizes itself based on how close it comes to passing the discriminator's test - that is its "loss function".
So over time, the generator gets better and better at making things that look like beetles. The process takes a very long time and is aided by many GPUs (as mentioned in the article).
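Roughly, in code, the whole adversarial loop looks like this. A deliberately tiny sketch, assuming PyTorch; random noise stands in for the beetle dataset, and the real model uses much larger convolutional networks:

    import torch
    import torch.nn as nn

    latent_dim, img_dim = 64, 28 * 28

    # Generator: latent vector -> image pixels in [-1, 1].
    G = nn.Sequential(
        nn.Linear(latent_dim, 256), nn.ReLU(),
        nn.Linear(256, img_dim), nn.Tanh(),
    )
    # Discriminator: image -> real/fake logit.
    D = nn.Sequential(
        nn.Linear(img_dim, 256), nn.LeakyReLU(0.2),
        nn.Linear(256, 1),
    )

    opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
    opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
    bce = nn.BCEWithLogitsLoss()

    for step in range(1000):
        real = torch.rand(32, img_dim) * 2 - 1  # stand-in for real beetle images
        fake = G(torch.randn(32, latent_dim))

        # Discriminator step: label real images 1, generated images 0.
        d_loss = bce(D(real), torch.ones(32, 1)) + \
                 bce(D(fake.detach()), torch.zeros(32, 1))
        opt_d.zero_grad()
        d_loss.backward()
        opt_d.step()

        # Generator step: its loss is how badly it fools the discriminator.
        g_loss = bce(D(fake), torch.ones(32, 1))
        opt_g.zero_grad()
        g_loss.backward()
        opt_g.step()

Each network's improvement is the other's training signal, which is why the generator ends up producing beetle-like images without ever being told what a beetle is.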
The machine here doesn't even know that these are beetles (because nobody told it); it is "just" arranging pixels in a manner similar to the pixels in the source images. It has, however, picked up that each generated image must have "legs", "eyes", "shells"... and other features that it detected are common in the original images.