
My Favorite Deep Learning Papers of 2017 - gjstein
http://cachestocaches.com/2017/12/favorite-deep-learning-2017/
======
hacker_9
Super stuff. Can anyone here who reads these research papers tell me their
process for understanding them? Does anyone try to recreate the models using
the math from the papers? I'm interested in this stuff at a hobby level, but
find my mind glazes over when trying to process what is being talked about. It
would help if these papers were more interactive, or perhaps I just need to
sit down and try to build these models in TensorFlow.

~~~
chillee
Agh, I typed up a response but it got deleted somehow. But basically, I see 3
layers of understanding here. For whatever level you're stuck at, there are
different ways to get "unstuck".

1\. Understand the task they're solving, how they did it, and their results.
Anybody with a basic understanding of the domain should be able to do this by
reading the abstract/intro and conclusion. If you find yourself having trouble
here, you probably just need more background in the field.

2\. Gain some intuition for why their method works. This is one of the hardest
parts to figure out how to do, and it's where most people stumble. Really,
this is basically the entirety of what you're trying to do when you learn
math. There are also varying levels of intuition: "I get why this might work",
"I get why this works", and "I get why it's impossible that this doesn't
work", in order of difficulty. The more background you have, the easier this
intuition is to grasp. Alternatively, you can bootstrap your intuition by
reading other people's blog posts, talking to somebody who understands the
paper, asking the authors, playing around with your own implementations, etc.
I'm not sure anybody has any good answers for this stage. Personally, I really
like posts from bloggers I know are good, but unfortunately, many papers do
not have blog posts attached :(

3\. Finally, there's the strict mathematical rigor. The boundaries between
these levels aren't really strict; oftentimes I'll treat math I'm not familiar
with as a black-box theorem. If you don't have the math background for the
proofs, there's usually no better recourse than learning the subject properly.

Luckily, many ML papers barely have any mathematical proofs :)

Alex Irpan has a great explanation of the Wasserstein GAN paper here:
[https://www.alexirpan.com/2017/02/22/wasserstein-gan.html](https://www.alexirpan.com/2017/02/22/wasserstein-gan.html)
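
For a sense of what that paper actually proposes, here's a minimal,
illustrative PyTorch-style sketch of the WGAN critic update. The `critic`,
`gen`, `real`, and `z_dim` names are assumptions for the example, not
something from the post:

    import torch

    # Sketch of the WGAN critic update. The critic maximizes
    # f(real) - f(fake); weight clipping is the paper's way of keeping
    # f approximately 1-Lipschitz. All model objects are assumed.
    def critic_step(critic, gen, real, opt, clip=0.01):
        z = torch.randn(real.size(0), gen.z_dim)
        fake = gen(z).detach()  # don't backprop into the generator here
        loss = critic(fake).mean() - critic(real).mean()  # minimize the negation
        opt.zero_grad()
        loss.backward()
        opt.step()
        for p in critic.parameters():  # enforce the Lipschitz constraint
            p.data.clamp_(-clip, clip)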

If you're looking for interactive blog posts, distill.pub probably has the
best: [https://distill.pub/2017/research-debt/](https://distill.pub/2017/research-debt/)

And one final note: many papers (especially papers that aren't math papers)
are surprisingly simple to take to step 2; the core idea is just hidden behind
a lot of cruft. I will say that it's wise to be careful about so-called
"intuitive explanations" of a concept. If somebody gives an "intuitive"
explanation for why X is true, but that explanation doesn't also explain why
!X is false, it's not very useful.

~~~
hacker_9
Thanks for the links, I hadn't even heard of Distill, but the papers on that
site are a lot more approachable for sure. Thanks for the other insights too;
I'll keep them in mind for the next paper I read.

------
zengid
Are GANs pretty hot these days, or is that just a coincidence of the author's
preferences? I just received Ian Goodfellow's Deep Learning textbook [1], and
I know he pretty much invented the technique, so I'm wondering how
influential/important GANs are in the field.

[1] [http://www.deeplearningbook.org/](http://www.deeplearningbook.org/)

~~~
317070
GANs are for graphics, not machine learning. They have no test scores, since
they can't be run on a test set, so it's hard to tell how much they overfit.

But they have uses, like lossy compression of images or texture generation:
tasks on the graphical side of things, where GANs outperform machine learning
methods with their crisper samples.

For an overview of this argument:
[https://arxiv.org/abs/1511.01844](https://arxiv.org/abs/1511.01844)

The idea of adversarial training is important and relevant in ML though! It
allows for setting up losses which are hard to formulate otherwise.
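
As a rough sketch of what such an adversarial loss looks like, here's the
standard non-saturating GAN objective (Goodfellow et al., 2014) in PyTorch;
the `disc` and `gen` models, the (batch, 1) logit shape, and `z_dim` are
assumptions for the example:

    import torch
    import torch.nn.functional as F

    # The discriminator acts as a learned loss for the generator: it is
    # trained to tell real from generated samples, while the generator is
    # trained to fool it. Assumes disc(x) returns a (batch, 1) logit.
    def gan_losses(disc, gen, real):
        n = real.size(0)
        fake = gen(torch.randn(n, gen.z_dim))
        ones, zeros = torch.ones(n, 1), torch.zeros(n, 1)
        d_loss = (F.binary_cross_entropy_with_logits(disc(real), ones)
                  + F.binary_cross_entropy_with_logits(disc(fake.detach()), zeros))
        g_loss = F.binary_cross_entropy_with_logits(disc(fake), ones)  # non-saturating
        return d_loss, g_loss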

~~~
chillee
Huh? GANs are for graphics, not machine learning? I mean, first of all, how
are GANs not machine learning? They are definitely "machine learning", and I
don't think anyone would disagree.

Second of all, GANs are most definitely not just for graphics. They've been
applied to text generation, to generating adversarial examples, to data
preprocessing, etc.

Third, I have no idea what you even mean by "test set" in the context of GANs.
It is true that it's hard to measure their performance, but that's a separate
issue from what you're describing. It's hard to evaluate performance because
we're usually judging the quality of the generated images, and we don't have
any good way of measuring "perceptual loss", i.e. how real an image looks.
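
To make that concrete with a toy example (illustrative numbers, not a
rigorous metric): pixel-wise losses like MSE are poor stand-ins for
perceptual similarity. For textured content, a one-pixel shift of the same
image scores about as "different" as a completely unrelated image:

    import numpy as np

    rng = np.random.default_rng(0)
    img = rng.random((64, 64))          # stand-in for a textured image
    shifted = np.roll(img, 1, axis=1)   # the same image, shifted one pixel
    unrelated = rng.random((64, 64))    # a completely different image
    print(np.mean((img - shifted) ** 2))    # ~0.17, large...
    print(np.mean((img - unrelated) ** 2))  # ...and about the same as this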

As for the OP, GANs have been a very hot topic. Perhaps not as hot as this
blog post makes them look (with nearly every paper being about them...), but I
wouldn't really disagree with any of the papers posted. The only one I'm not
familiar with is the "most useful" one; the rest were all pretty great papers
imo. As for Ian Goodfellow, he's a very smart guy who seems to do a pretty
good job explaining things. I saw a couple of YouTube videos of him covering
his DL book at a meetup, and he did a great job teaching.

~~~
argonaut
Although I would agree that GANs are part of machine learning, some people
definitely do disagree, and their concerns are valid. It's definitely an area
of open research.

Your third point is actually the point of those who disagree. It's the same
reason why we have the principle of falsifiability in science.

~~~
chillee
I'm a bit confused about the point you're making. I've never seen anybody not
group GANs under machine learning.

Machine learning is typically split into supervised learning, unsupervised
learning, and reinforcement learning, and GANs are usually considered part of
unsupervised learning. I guess the part I don't understand is what you mean by
"their concerns are valid". What are their concerns about? Whether GANs are a
promising path of research? And if GANs aren't part of machine learning, what
are they?

