
Hi everyone. I just published a new blog post about the evolution of GANs over the last few years. You can check it out here.

I think it's fascinating to see sample images generated from these models side by side. It really does give a sense of how fast this field has progressed. In just five years, we've gone from blurry, grayscale pixel arrays that vaguely resemble human faces to thispersondoesnotexist, which can easily fool most people at first glance.

Apart from image samples, I've also included links to papers, code, and other learning resources for each model. So this article could be an excellent place to start if you're a beginner looking to catch up with the latest GAN research.

I hope you enjoy it!


Cool! But one thing I'd like to see discussed is to what extent the images in various publications have been cherry-picked.


Cherry-picking is no longer necessary with recent advancements. The images on https://thispersondoesnotexist.com/ are random and have a few artifacts (particularly in the backgrounds and hair), but if you weren't looking for them you'd be unlikely to notice anything.


I'm curious about that too, and I'd like to know if there's been much work on GANs that can generate videos.


This was a very enjoyable read, thank you! You do a great job of making these concepts understandable.

The self-attention mechanism caught my eye. Going to look into implementing something like that for a toy dataset. Thanks for the inspiration!
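
If it helps anyone else who wants to try the same thing, here's a rough sketch of a SAGAN-style self-attention layer in PyTorch. The module and the toy shapes below are my own, not from the article:

  import torch
  import torch.nn as nn

  class SelfAttention(nn.Module):
      # SAGAN-style self-attention over spatial feature maps.
      def __init__(self, channels):
          super().__init__()
          self.query = nn.Conv2d(channels, channels // 8, 1)
          self.key = nn.Conv2d(channels, channels // 8, 1)
          self.value = nn.Conv2d(channels, channels, 1)
          self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

      def forward(self, x):
          b, c, h, w = x.shape
          q = self.query(x).view(b, -1, h * w).permute(0, 2, 1)  # (b, hw, c//8)
          k = self.key(x).view(b, -1, h * w)                     # (b, c//8, hw)
          attn = torch.softmax(torch.bmm(q, k), dim=-1)          # (b, hw, hw)
          v = self.value(x).view(b, -1, h * w)                   # (b, c, hw)
          out = torch.bmm(v, attn.permute(0, 2, 1)).view(b, c, h, w)
          return self.gamma * out + x  # attention added as a residual

  x = torch.randn(2, 64, 16, 16)     # toy batch of feature maps
  print(SelfAttention(64)(x).shape)  # torch.Size([2, 64, 16, 16])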


I know people have been having trouble adapting these kinds of generative techniques to text. Do you know of anyone making interesting progress there?


I put your question into talktotransformer.com and got this response:

  I know people have been having trouble adapting these kinds of
  generative techniques to text. Do you know of anyone making
  interesting progress there?
  
  
  It is difficult to make progress in the field of generative methods
  with text alone - it takes effort and creativity to get a generative
  system working. A big part of our research focuses on generating
  sequences which correspond to handwritten data, and to improving on
  that method we have developed generative techniques which allow us to
  generate a large range of novel sequences. Our work is still small,
  but we are not stopping there, in fact our next major research project
  is to generate novel sequences for novel languages.
  
  
  What I see is that we are still in an early stage when it comes to
  the technology used in generative methods to create new words, but I
  suspect this is due to a combination of factors. First and foremost is
  the fact that the techniques we've developed to generate novel
  sequences are highly specialized in a particular kind of context -
  we are not going to create random numbers or sequences because that
  just doesn't work. Generating a word, for example, uses very specific
  computational principles and can only be done if you are aware of the
  context in which it is being generated (or "determined" as the
  linguists would say). Even so, the general principle has been around
  so long, that one could quite easily create several different methods
  to create.


GPT-2 is, however, not adversarial at all, and that might be part of why it rambles and lacks consistency or much of a 'point'.


+1. Also, the author of this article wrote a terrific article on GPT-2 as well. I'd definitely recommend checking it out if you're interested in the latest breakthroughs in text generation: https://blog.floydhub.com/gpt2/


Yes!

https://medium.com/capital-one-tech/why-you-dont-necessarily...

GANs simply try to replicate a set of features - you can think of these features as images or text. The designs vary from GAN to GAN, but the general principles are the same.
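
To make the "same general principles" point concrete, here's a minimal sketch of the adversarial training loop in PyTorch. The toy networks and the random stand-in for real data are mine; real designs differ, the loop doesn't:

  import torch
  import torch.nn as nn

  # Toy generator and discriminator.
  G = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 784), nn.Tanh())
  D = nn.Sequential(nn.Linear(784, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1))

  opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
  opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
  bce = nn.BCEWithLogitsLoss()
  real = torch.rand(64, 784) * 2 - 1  # stand-in for a batch of real data

  for step in range(100):
      # 1) Train D to tell real from fake.
      fake = G(torch.randn(64, 32)).detach()
      loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
      opt_d.zero_grad(); loss_d.backward(); opt_d.step()

      # 2) Train G to fool D.
      fake = G(torch.randn(64, 32))
      loss_g = bce(D(fake), torch.ones(64, 1))
      opt_g.zero_grad(); loss_g.backward(); opt_g.step()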


This is really helpful. I've been wanting to do a deep dive into GANs, and this has pushed me to do it.

I like the mix of images and explanations.

Thanks!


Hi everyone. I'm the author of this article.

In case you haven't read it yet: I used StyleGAN to interpolate between a few popular characters from HBO's Game of Thrones series.

Here's something interesting to note: all the results (images and animations) were generated with Nvidia's StyleGAN pretrained on the FFHQ dataset, with absolutely no fine-tuning.

Instead, to make StyleGAN work for Game of Thrones characters, I used another model that maps images onto StyleGAN's latent space. I gave it images of Jon, Daenerys, Jaime, etc. and got latent vectors that, when fed through StyleGAN, recreate the original images.

With the latent vectors for the images in hand, it's really easy to modify them in all the ways described in the StyleGAN paper (style mixing, interpolations, etc.), as well as through simple arithmetic in the latent space (such as shifting a latent vector in the "smiling" direction).
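
In code, these manipulations are just vector math. This sketch uses random vectors and a hypothetical generate() call as stand-ins for the recovered latents and the pretrained model:

  import numpy as np

  # Hypothetical stand-ins for latents recovered by the encoder and for
  # a "smiling" direction learned in latent space.
  w_jon = np.random.randn(512)
  w_daenerys = np.random.randn(512)
  smile_direction = np.random.randn(512)

  # Interpolation: walk from one character's latent to the other's.
  frames = [(1 - t) * w_jon + t * w_daenerys for t in np.linspace(0, 1, 30)]

  # Attribute editing: shift a latent along the "smiling" direction.
  w_jon_smiling = w_jon + 1.5 * smile_direction

  # Each vector would then be fed through the pretrained generator, e.g.
  # image = stylegan.generate(w_jon_smiling)  # hypothetical API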

As a bonus, since there's no StyleGAN training involved, all the steps that I just mentioned can be executed extremely fast.


Hi everyone. I'm the author. There's one more thing I wanted to add: a good reason to try some sort of hyperparameter search, even if you think it's a complete waste of time and compute, is reproducibility.

This probably applies more to open-source academic contributions, where you're trying to help your fellow practitioners recreate and use your models, as opposed to a corporate setting, where reproducibility would be the equivalent of getting fired.

Recently, I was trying to train a ResNet to beat the top Stanford DAWNBench entry (spoiler alert: I did, but by less than a second). Initially, I blindly tried manually tuning the learning rate, batch size, etc. without even reading the original model's guidelines.

After actually going through a blog post written by David C. Page (the guy with the top DAWNBench entry), I saw that he had already varied the hyperparameters himself, and that the defaults set in the code were what he found to be optimal.

That saved me a lot of time and let me focus on other things like what hardware to use.

I think the lesson here is that if more researchers performed and published the results of some basic hyperparameter optimization, it would save the world a whole lot of epochs.
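
To be concrete, even a crude random search like the sketch below, with the results table published alongside the code, tells readers exactly what was tried. train_and_evaluate here is a hypothetical stand-in for a real training run:

  import random

  def train_and_evaluate(lr, batch_size):
      # Hypothetical stand-in: train the model, return validation accuracy.
      return random.random()

  search_space = {"lr": [0.01, 0.03, 0.1, 0.3], "batch_size": [128, 256, 512]}

  trials = []
  for _ in range(10):
      config = {k: random.choice(v) for k, v in search_space.items()}
      trials.append((train_and_evaluate(**config), config))

  best_acc, best_config = max(trials, key=lambda t: t[0])
  print(f"best: {best_acc:.3f} with {best_config}")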


I enjoyed the article, and I know that writing these takes a nontrivial amount of time. That said, I think it would be wise to run these through a spell checker before publishing: it's a less-than-a-minute investment that pays off every time someone reads the piece.


> The heavier the ball, the quicker it falls. But if it’s too heavy, it can get stuck or overshoot the target.

This explanation of momentum is somewhere between misleading and wrong. Momentum is about inertia: the accumulated velocity resists sudden changes in speed and direction, which is what smooths the optimizer's updates.
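
The update rule makes this concrete: gradients accumulate into a velocity term, and that velocity, not the raw gradient, moves the parameters. The toy quadratic loss below is my own:

  import numpy as np

  theta = np.zeros(2)     # parameters
  velocity = np.zeros(2)  # accumulated "inertia"
  lr, beta = 0.1, 0.9

  def grad(theta):
      # Gradient of a toy quadratic loss L(theta) = ||theta - 1||^2 / 2.
      return theta - 1.0

  for _ in range(100):
      velocity = beta * velocity + grad(theta)  # past gradients keep pushing
      theta = theta - lr * velocity             # the heavy ball resists sudden turns

  print(theta)  # converges to [1. 1.]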


> The heavier the ball, the quicker it falls

Wasn't there a famous experiment about this that someone once did?


A misleading experiment: "the heavier the feather, the quicker it falls" is obviously true; steel feathers are useless. The same is true for balls (except in the exceptional case of a perfect vacuum); it's just that drag and air currents don't influence balls all that much at low speeds.

