The ones with text are so surreal. They feel readable and yet everything is just slightly off. I can read Japanese and this feeling is very different from reading a script that I completely don't know (like Arabic or something). Maybe when people have a stroke and can't comprehend language it feels something like this?
The AI-generated characters remind me of hentaigana, which are the obsolete variant forms of kana used before 1900: they're clearly Japanese writing, but weird and basically illegible to the modern reader.
As pointed out in the comments to the Know Your Meme article, this was originally generated by a GAN and posted on Instagram by @busyrotting, who isn't credited in the article at all.
Unfortunately their profile [1] is gone, and the Internet Archive doesn't seem to have saved the post [2] either. Nevertheless, we've looped back to neural networks again.
There's a lot of GAN monsters here. One of the big improvements from StyleGAN->2 was the introduction of Perceptual Path Length Loss (PPL). Has the author thought about adjusting that? Additionally there is StyleGAN2-ADA (recent) which deals better with smaller datasets (I'm not sure how big this dataset is but I'm sure it isn't as big as ImageNet). I'd imagine this would help significantly because most of the monsters are having issues with the body and multiple people, so there's probably not as many samples with that feature in the dataset. I'd be interested to see what improvements these would make.
PPL was one of the problems in scaling up StyleGAN2. It contributes to the over-regularization that made diverse images impossible to learn. There might be some very weak value of PPL which is beneficial (probably the optimal value of the PPL regularization is not exactly 0) but when the runs take a month or two, it's not easy to experiment with them!
The dataset is actually ~3x bigger than ImageNet, FWIW.
> The dataset is actually ~3x bigger than ImageNet, FWIW.
Oh that is very interesting. Thanks. Do you know anything else about the dataset? I was looking at the link for some quick stats and didn't see them. I'm curious if there are metrics to support or disprove my hypothesis about body features. If so ADA would still help because there isn't an even distribution of the features.
Not far from the truth! The source dataset was 2.4 million images from Danbooru. If you go to the site, you’ll see that male characters are pretty rare.
They seem to be frequent, but only because there are an ungodly number of submissions to danbooru. Think of it more like “for every male character, there are at least 10+ female characters.”
And it doesn’t help that a lot of the male characters are drawn in a female style...
That was the problem with my attempt at a 'This Husbando Does Not Exist' model: https://www.gwern.net/Faces#anime-faces-male-faces There weren't that many male faces, and the ones that I could extract turned out to be super-feminine looking anyway, or outright crossplay/traps, so it didn't wind up looking all that different!
Sites like Danbooru have a very strong bias toward female characters, often in poses which at least slightly imply sexual content.
This explains a lot about why the generated images are the way they are (having a lot of borderline NSFW pictures and even if not having poses which are sexualized and/or imply sexual content).
While anime tends to be more open to such things and often has quite a bit of fan service, the generated images are way too biased compared to the "normal" anime/manga content you find on platforms like Crunchyroll or similar.
I mean, a non-negligible part of the pictures they used likely have hentai variations if you disable safe mode (if Danbooru is in any way similar to a site I used a few years ago, in safe mode, to get nice screen backgrounds; and this seems to be the case)...
But then, for practical (and maybe copyright) reasons, it's very viable to use Danbooru as a source, but not so much to get snapshots of anime or pages from "real" manga.
This is the front page of Danbooru for me, as of posting:
https://i.imgur.com/9eO6ZpU.png [standard warning that it might contain nudity and violence exceeding your personal threshold]
It seems to me that if it truly used Danbooru, it also filtered for specific tags, which is also indicated by the art style that it uses. The ratio of ostensibly male to female characters there is more like 7:18.
One page as a data point is way too small to conclude anything. It could easily be that some uploader just happened to dump a batch of illustrations with males because they like them.
There are tags for male/female which will be a better representation.
1girl: 3038k
1boy: 559k
(Note: these tags are used when there is only 1 character of that gender present. But it's good enough.)
Your idea that they filtered the images could still be true, to be clear.
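To put the two data points side by side, here is a minimal sketch of the arithmetic. It uses only the numbers quoted in this thread (the `1girl`/`1boy` tag counts in thousands and the 7:18 front-page estimate); the variable and function names are made up for illustration. The two tags are not mutually exclusive and ignore multi-character images, so treat the result as a rough indicator rather than a measurement.

```python
# Tag counts quoted above, in thousands of posts.
tag_counts_k = {"1girl": 3038, "1boy": 559}

def female_to_male(female: float, male: float) -> float:
    """Return how many female-tagged posts there are per male-tagged post."""
    return female / male

# Site-wide estimate from the tag counts.
tag_ratio = female_to_male(tag_counts_k["1girl"], tag_counts_k["1boy"])

# Front-page estimate from the 7:18 male:female observation.
front_page_ratio = female_to_male(18, 7)

# Share of "1boy" among the two tags combined.
boy_share = tag_counts_k["1boy"] / sum(tag_counts_k.values())

print(f"tags:       {tag_ratio:.2f} female per male")
print(f"front page: {front_page_ratio:.2f} female per male")
print(f"1boy share of the two tags: {boy_share:.1%}")
```

On these numbers the tags suggest a skew roughly twice as strong as the single front page did, which is consistent with the point that one page is too small a sample.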
Sorry. These would be ok where I live (tabloids routinely show much more), but some places in the world (and some automatic content filters) can be quite sensitive, so I thought it fair to add a warning.
These generated "anime" pictures look overfitted to me. It's not convincing me that it's in any way original. I bet large parts of these pictures actually do exist, and it's just mashing its memories of these pictures together, instead of actually having internalized how to draw manga pictures.
Maybe the source dataset was too small, and it was not trained strictly enough, or it should have actually trained on anime stills instead of these Moe manga style drawings.
The state of the art, for example that site that generates images from sentences is much more impressive to me.
> The state of the art, for example that site that generates images from sentences is much more impressive to me.
Come on, DALL-E came out like last week - give us a break! That was a major research project contributed to by at least 11 fulltime OpenAI researchers using 400 million images; and this was like 3 hobbyists in a Discord occasionally.
I mean obviously it's very interesting to look at and no small feat. I suppose I wrote my comment in the wrong context, my apologies. If 3 hobbyists moonlighting in a Discord could casually match Dall-E it would definitely spell utter doom for us all :P
Anyway, I did the most hated thing on HN, casually dismissing someone's hard work or careful thoughts. I'll go chew on some sand now.
At the very least it's mashing them together in such a way that about 50% of the time they form anatomically correct humans. So at the very least it is rotating and shrinking limbs in a three dimensional way and internalized that.
The other 50% of the time it goes wrong, and the result is either a human that is not anatomically correct, or something that doesn't even seem to be a two-dimensional projection of three-dimensional Euclidean space.
Take into account that the training set is very heterogeneous compared to e.g. the very well-lit, centered (celebrity) faces or whatever went into thisfacedoesnotexist.
From the detailed writeup:
> Broadly, his StyleGAN2-ext increases the model size and disables regularizations, which are useful for restricted domains like faces but fail badly on more complex and multimodal domains.
"I am a freelance American writer & researcher. (To make ends meet, I have a Patreon, benefit from Bitcoin appreciation thanks to some old coins, and live frugally.)" - from the "about Gwern" page https://www.gwern.net/Links
There's some anime I'm really into, and some that make me really uncomfortable, like Made In Abyss, that has specific sexualization of children around fetishes. As cool as this technology is, a lot of these images seem to draw on pedophilia and sexualizing children. This is an example of a "neutral" training set with results that I wouldn't want to use to represent the entire genre.
I have no idea if this comment is off limits for HN. I'm surprised to see no one else voicing their discomfort with the results of this StyleGAN output.
Quite a bit of child nudity and a camera that occasionally lingers in uncomfortable places. It's not strictly pornographic, but it is nonetheless (unfortunately) popular in less scrupulous circles for reasons beyond the plot.
I'm curious, if trained with mostly English text in the images rather than Japanese, if it would produce mostly recognizable roman characters (but not necessarily words) or whether instead the characters would be largely unrecognizable.
More reference images with what looks convincingly (to a nonspeaker's eyes) like text:
I should introduce this site to HN. "Let's make porn image by Genetic Algorithm" is a very interesting project, started recently, that aims to evolve a porn image just by having visitors select which candidate is more ecchi. Initially the image was just a mosaic, but by now the hentai people have made boobs, a face and a body.
Is something like this done in real time? I see there are sliders, but is this type of inference something that can be done cheaply for a couple thousand users?
The images were all generated in advance and are served statically, so it's not done in real time. It would be theoretically possible, but just a lot more work and money.
There are some similar sites that are currently a better fit for that. If you want to customize one on the page with live inference, check out https://waifulabs.com/
Not necessarily, but the source used for the images displayed (there are others on the website explaining what it does) has a bias in that direction which is stronger than the bias you find in e.g. anime on Crunchyroll or similar.
EDIT: It's also a very reasonable source, as it's very hard to get a large amount of anime/manga images which are mostly in color, which more importantly are tagged, and which you know you will most likely not get in trouble for using.
Faces mostly work. Limbs and feet are way more hit-and-miss, lots of disturbing messes. It's still intriguing, especially since I expect further improvements to the tech over time.
edit: well it's translating a div but still pretty neat
This is an uninformed thought, but it is a kind of extrapolation: if we took the base knowledge of math/physics, could AI/ML stumble onto the next undiscovered stuff?
Somewhat tangent, like how you can conceptualize something in your brain, you could pre-load your neurons randomly to figure out some new idea. I imagine most of it would just be noise.
What kind of computer configuration does it require to train a model like this? Speaking as noob, is it possible to create models like this on your average Ryzen 5 / GTX 1060 6GB type PC? If so, how much time would it take?
I feel really excited to experiment on such projects in my spare time, but reading how it costs researchers hundreds and thousands of dollars, I don't even try.
If you don't have TFRC access, this is out of the question as a hobbyist. It's >32 GPU-months. Sorry. However, you can still transfer learn this particular model with <1 GPU-month, if you have some anime or artwork dataset you want (eg a specific character).
The good news is that there's been huge progress in GAN efficiency over the past few years, and we are rapidly approaching the point where ImageNet-scale datasets can be reasonably trained with 1 GPU-month or less.
So, if you feel left out, don't worry. In the near future, between the RTX GPUs and optimization and new VAE/GANs, I expect things like Danbooru2020 https://www.gwern.net/Danbooru2020 to be totally doable on hobbyist hardware.
Amazing project and thank you so much for the detailed answer. In case you read it and I know it's too much to ask of you, but you seem to be an expert at ML, so what resources would you recommend me to read up on for a hobbyist who wants to create a project like this (I mean this x does not exist type or similar).
Somebody recommended Andrew Ng's courses, but I wasn't able to complete them and got bored. Then I tried watching sentdex and it was amazing, and I'm able to fully comprehend what he does since he teaches more practical aspects like running GTA V on autopilot etc.
The "deformed-ness" isn't because of the ML; anime/manga girls drawn by humans are already deformed enough (just look at those huge eyes, if they were real you probably couldn't fit a brain in that head). The difference is that human artists have the sensitivity required to keep their drawings from looking too weird, while the AI drives straight over the edge of the uncanny valley (or rather, creepy canyon)...
Human-drawn "deformities" are things like super-long legs, missing noses, or mouths so wide they look like a Muppet.
This AI creates tumor-like deformities, where limbs are missing or melded together. As an example, here's an image that couldn't decide if it wanted the girl to have a large chest or a dislocated shoulder: https://thisanimedoesnotexist.ai/results/psi-0.8/seed9182.pn...
This problem is famously solved by Progressive GANs - which are trained with images of increasing size, starting from very small; it helps eliminate the deformities by learning shape at multiple scales.
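As a rough illustration of the "images of increasing size" idea mentioned above, the sketch below builds the doubling resolution schedule used in progressive growing and average-pools an image down to a coarser stage. It is only the curriculum, not a GAN: the function names, the 512-pixel target and the toy image are assumptions for the example, and nothing here reflects how the model discussed in this thread was actually trained.

```python
def resolution_schedule(start: int = 4, final: int = 512) -> list[int]:
    """Resolutions visited by progressive growing: start, 2*start, ... up to final."""
    resolutions = []
    res = start
    while res <= final:
        resolutions.append(res)
        res *= 2
    return resolutions

def downsample(image: list[list[float]], factor: int) -> list[list[float]]:
    """Average-pool a square grayscale image (list of rows) by an integer factor."""
    size = len(image) // factor
    return [
        [
            sum(
                image[y * factor + dy][x * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            ) / factor ** 2
            for x in range(size)
        ]
        for y in range(size)
    ]

# Hypothetical 8x8 "image": a simple gradient, just to have something to shrink.
toy_image = [[float(x + y) for x in range(8)] for y in range(8)]

for res in resolution_schedule(4, 8):
    stage_input = downsample(toy_image, 8 // res)
    print(res, len(stage_input), len(stage_input[0]))
```

The early, tiny stages can only express overall layout (where the head, torso and limbs go), which is the intuition behind the claim that learning shape at multiple scales reduces melted-together limbs.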
This is where the AI needs to be trained on the "attractiveness" of different deformities (of course this will depend on whether you are aiming for something weird-looking or not).
I think it's interesting because it accurately highlights what many of us males focus on. (Face+hair && breasts) || (butt && legs)
With everything else and how it mashes together being an afterthought.
We can patch our behavior and pretend we are looking at other things, but the GAN aggregates what all of the artists did and has an aesthetically pleasing result even when they don't make sense upon closer inspection, because it is including the features we look at.
I am not sure what images you get, but for me it is more wtf than porn. It gets the faces mostly right, it gets the feet sometimes right, the arms are consistently between nightmare fuel and Deadpool regrowing a lost limb.
It reminds me of an old article from the Journal of Irreproducible Results that posited that the missing limbs from ancient Greek statues could be found having been transplanted (somehow) to Hindu statues.
For those who might not realise, Anime [1] are expensive to make.
Hopefully someday top-quality anime will be cheaper to produce with the help of machine learning and AI. And hopefully those Japanese companies [2] will take the worldwide market more seriously.
The top voted reply states: "So anime are not really that expensive to make, in terms of the usual cost of producing a professional level half-hour animated television show with top talent."
Regardless, outside of videogames autogenerated content turns me off (and sometimes even in videogames, you can tell if something was handcrafted or autogenerated). If the way to make animé "survive" means removing meaningful human authorial input, then maybe it's best if it dies?
Well respected sci-fi authors of all times have seen this scenario as a dystopia. I remember one short story -- was it by PKD -- where an author inputs a novel "high concept" into a computer, a second later says "wait, I have a minor correction" and the computer replies "too late, I've already written the novel and its sequels based on your concept, it's already being distributed".
Surely the answer to "this requires top talent and human artistry, and this stuff costs money" cannot be "replace them all with Machine Learning", can it?
>outside of videogames autogenerated content turns me off (and sometimes even in videogames, you can tell if something was handcrafted or autogenerated). If the way to make animé "survive" means removing meaningful human authorial input, then maybe it's best if it dies?
What if it simply generated the frames between the ones drawn by hand? The artist wouldn't have to draw every frame, just key frames, say every second or half second, and the AI would fill in the in-between frames while maintaining consistency.
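For a sense of what "filling in" between key frames means mechanically, here is a deliberately naive sketch: a linear cross-fade between two key frames stored as grayscale pixel grids. The function name and the toy frames are hypothetical. A cross-fade blends pixels rather than moving lines, so it would ghost badly on real animation; a learned in-betweener would have to do much more than this.

```python
Frame = list[list[float]]

def inbetween(key_a: Frame, key_b: Frame, n_between: int) -> list[Frame]:
    """Return n_between frames linearly blended between two same-sized key frames."""
    frames = []
    for i in range(1, n_between + 1):
        t = i / (n_between + 1)  # blend weight, strictly between 0 and 1
        frames.append([
            [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(key_a, key_b)
        ])
    return frames

# Two hypothetical 1x2 key frames: a dark frame and a bright frame.
key_a = [[0.0, 0.0]]
key_b = [[1.0, 1.0]]

for frame in inbetween(key_a, key_b, 3):
    print(frame)
```

The key frames themselves are not returned, only the frames between them, which matches the workflow described above where the artist supplies the keys and the tool supplies the rest.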
I hope your prediction never comes true. It's terrifying to think of the human element in art as niche. It's a kind of dystopia our scifi authors warned against decades ago...
I don't think anyone ever suggests that weird extreme outside ai-as-art-itself. Usually the idea with ai creation is "would your work be easier if I provided something 90% there that you fix up as needed". People will not get replaced soon, but rather provided with tools like clippy: "hey, you're drawing a person, would you like 100 auto-generated running poses that you don't have to draw manually?"
For something like in-betweeners? Maybe. Like someone else pointed out, it's already the case in animation that the pros do the keyframes or even concept art, and the interns/low-paid workers do the rest of the frames. Maybe this could be automated and it'd be a win.
In the general case I'm still skeptical though. Consider the case of comics, with the jobs of penciller and inker. Is the inker an automatable job? Do they not add important artistic decisions to what at first sight would seem like a merely mechanical job, to the point some inkers have their own fan base? (And, it is argued, a good inker can "make or break" the pencil art). And if this is true, could it be also the case that even the lowly in-betweener brings his/her small grain of artistic input to the final piece?
In the context of anime, I'm talking about a hypothetical future (hinted at by the person I was replying to) where characters are designed by ML/computer automation, with minimal human input -- or in a nightmarish scenario, even entire plot lines and episodes! -- to drive costs down.
In the context of videogames, I mean the battle-tested technique of sandbox and roguelikes (and the games they inspired) of generating missions, scenarios and levels programmatically and randomly, so that you don't get the same game twice. It's cute, and for roguelikes it works, but it's also very different to levels created by hand.
It speaks in terms of salary of animators and actors and compares them.
Of course, all salaries and in general production of almost everything is cheaper in most nations than in the U.S.A. simply because very few taxes are paid in the U.S.A., so employees receive more salary, but every product also costs more.
For instance, in 2019, the average net compensation for a salaried worker in the U.S.A. was 51 916 U.S.D., with the median being 34 248 U.S.D.
I found those numbers for Japan to be 39 851 U.S.D. converted, and median 62 9475 U.S.D., actually making median higher due to the top-heavy nature of U.S.A. incomes.
This top-heavy nature might also be why it's not fair to compare absolute star shows with each other.
I, frankly, do not really understand how The Simpsons could ever take more resources to produce than anything which involves choreographed, three-dimensional fighting scenes with characters dashing and jumping through buildings:
GAN is definitely growing super fast- there was the app by Rosebud AI recently that allows users to upload photos and they’ll just sing Christmas carols lmao.
Visiting the website I feel as if there's something to be learned about the current state of anime. Or maybe it's just their training data and I'm reading too much into it, dunno.
The model has no idea about what arms are supposed to look like [1] but has no problem making sure that the resulting abominations are wearing underwear only. Also missing: men.
That's sampling bias - danbooru2019 is mostly old touhou fanart plus whatever a subset of horny men who like tagging things have uploaded from pixiv.
Worldwide anime and related media have a mostly female audience and certainly don't lack male characters. In Japan women write more fanfiction relative to men who draw more fanart, I think, but I've never counted.
Women who make R-18 fan art tend to avoid having it tagged with the original names; this is also a big factor. They also do search prevention (検索除け), for example by inserting random characters into the names.
That's true but since danbooru relies on manual uploads by fans who liked the pictures anyway, it isn't much of an issue.
There are other sites with more female users (eshuushuu and zerochan iirc) but danbooru's data is better because of a combination of 1. not allowing horrible poorly drawn fetish art or selfposts 2. ignoring the fact that most of the good artists don't want to be posted in the first place.
That's because each and every one of them is a stylistic genre called moe, that emphasizes the cuteness and sexual appeal of the characters.
The a.i. was clearly trained using moe only.
Moe and fan-service tend to walk as one. You will also notice that the overwhelming majority of the characters look female. A high to absolute number of female characters is also common to moe.
I'd actually be interested what turns out if an a.i. actually be trained from a completely random selection of last year's releases in animation from Japan and whether it would manage to keep different art styles apart, or somehow blend them.
I don't really believe in the common claim that it is easy to recognize animation from Japan from stylistic elements, it just so happens that it seems that moe is often what is being talked about:
> I'd actually be interested what turns out if an a.i. actually be trained from a completely random selection
Something I noticed is that AIs appear to have trouble differentiating perspective from art style. What I mean by that is that characters are often in 3/4 view, which naturally means one eye will be smaller than the other due to perspective. But some art styles have bigger eyes than others. What I see sometimes is that the AI then produces an image where each eye comes from a different art style.
It doesn't appear to account for the concept of a three-dimensionally spherical head and how perspective works with respect to such shapes.
It most certainly would not produce good results unless you specify that stuff like shirokuma cafe[0] characters are not human.
Given enough input, it should be able to learn on its own what a humanoid shape is to a human.
Was the This Person Does Not Exist-a.i. simply trained with pictures of human heads? As it seems to be quite capable of keeping many things apart that are commonly kept apart.
T Person DNE also centered all faces in exactly the same way, which makes it a lot easier to avoid e.g. adding a third eye (because there are only two positions where eyes appear in the training data.)
T Waifu DNE uses similarly centered and cropped portraits, which is why its output is consistently of higher quality than T Anime DNE.
There is a pretty large fanservice/borderline NSFW/smut and similar Anime industry.
But there are also quite a lot of anime which don't have the problem, or at most have a very small amount of fan service "sprinkled in". But in this generator the borderline NSFW stuff seems to be the main topic.
I let it run for a minute or so and every image appeared to be fully clothed or at least “bikini clothed”. Psychedelically distorted? Yes. NSFW? I guess it would depend on your W.
Ok sometimes most characters are in implicit poses + likely in underwear/swimsuit and sometimes just some characters are in implicit poses + likely in underwear/swimsuit.
"This Rental Does Not Exist" would be useful for game world building. Fill up entire cities with plausible furnished rooms.
A storefront generator would be useful. There are procedural city generators that generate blank buildings in a reasonable way, but can't fill in the details.
I don't know much about anime, but why are so many anime women depicted in an overly sexualised manner? Didn't we want to leave this kind of sexism behind as a society? It's like watching a sexist ad from the 1950s, but worse.
Mostly because that's what one encounters and instantly recognizes as “anime”.
Most Japanese animation is entertainment targeting young teenagers and younger. Japan does however have a considerable adult animation market. The a.i. is not trained with any “anime”, however, but with still promotional art which is of far higher quality and clearly trained with the adult market where sexualization is, as one might expect, more common.
The paradox is that the unrealistically cute “moe” designs are typically targeted at adults and adolescents, and most of the children's entertainment does not look like that.
Often in Japan, one speaks of four principal demographics: young males, young females, older males, and older females. The a.i. is quite clearly entirely trained with the “older males” demographic, which should be understood to target around 18-30 years of age, and even within that demographic it is trained almost exclusively with moe art.
Most “anime” would not even be instantly recognized as such, which is why I think the term is fairly useless. — it seems when people speak of “anime” more often than not what they mean is “moe art” whether it be animated or not.
It's trained on fanart, mostly from pixiv via another site, which is tagged safe/sort-of-nsfw/nsfw but many things are tagged safer than they actually are.
No one knows why, but it’s often said (with anecdata) that >50% of the creators of this content are biologically female.
So girls draw sexualized girls, and some men do as well. Some say the high proportion of female artists comes from the fact that moe art is an extension of the Shoujo Manga market, where predominantly female characters are portrayed by predominantly female artists for predominantly younger female audiences.
Whatever the actual reason might be, it’s probably more complicated than “person with polar opposite sexuality of the subject consume porn”.
Well, this is not representative of all anime, but admittedly of a large subsegment.
> but why are so many anime women depicted in an overly sexualised manner?
Because it sells very well to young males, especially the stereotypical basement-dwelling sexually frustrated nerd population. And Japan is the land of the hikikomori, after all.
But it's not like this kind of thing doesn't exist in the USA either. Just take a look at some superhero comics.
As others have noted, this was clearly trained on a small subset of anime imagery. This sort of thing is definitely a thing in anime, but there is also a huge amount of content that does not feature highly sexualized female character designs.
I don't know much about anime, but isn't at least a portion of the anime market basically porn? Women are definitely still portrayed in a sexualized manner in the porn industry, generally speaking.
Neither condoning nor condemning, but: Japan is the same country where shows featuring brother-sister incest are aired on television, albeit late at night. [0] I'm not Japanese, just an anime fan, but from my understanding they (Japanese anime fans) could not give less of a shit about Western taboos and sensibilities.
Why is the term “western” so often used whenever Japan be mentioned to mean either “Anglo-Saxon culture” or even more specifically “U.S.A. culture”?
I do not believe that many of these “western taboos”, invariably dealing with sexual themes, would be so taboo in most of continental Europe. — it is mostly the Anglo-Saxon who is known for his staunch moral control on sexuality, swearwords, and nudity.
It's not like Japan does a lot of incest in real life, they just seem to think romances are boring without violating taboos. Japanese people are so against incest they won't have sex after marriage 'cause you're related now.
> Watch some porn on xvideos.com and lose all hope in humanity.
There's a difference between sexism and sexualization in entertainment. Say one could enjoy some kinky porn and at the same time not treat all women as objects in day to day life. The same way you could enjoy a violent movie and not go on a killing spree afterwards.
There's clearly a demand for this kind of stuff and I don't see any point in trying to shame it out of view - it won't change the fact that people enjoy it (they'll just enjoy it in secret).
_edit_
Not sure why op deleted the above comment, he did have some valid (altho a bit dramatic) points. Looking back the line that I quoted might have been sarcasm.
> If it makes you feel better, anime men get the exact same treatment.
I find this to be a big difference that sets the U.S.A. apart from the other two media giants: India and Japan.
India and Japan are far better it seems at tapping the market of male sexualization and U.S.A. media has never quite done that.
I would in fact argue that this non-sexualization of males seems to be a very Anglo-Saxon cultural property, but many other cultures do not have the large media market to show it.
I've seen quite a few comments coming from the Anglo-Saxon world about the Internationally released Swedish series Love & Anarchy, about how surprisingly attractive and sexualized the males were.
https://thisanimedoesnotexist.ai/results/psi-1.1/seed0300.pn...
https://thisanimedoesnotexist.ai/results/psi-1.5/seed7458.pn...
https://thisanimedoesnotexist.ai/results/psi-1.2/seed1289.pn...