
What is wrong with variational autoencoders? - kawera
http://akosiorek.github.io/ml/2018/03/14/what_is_wrong_with_vaes.html
======
stochastic_monk
The equations on this page don't render for me on either Firefox or Chrome,
making reading this a little jarring. See their paper [0] for what I think is
easier to follow.

A more informative title would be something like "Improving Variational Autoencoders with Better Evidence Lower Bound (ELBO) Methods". The "What's wrong with VAEs?" title, followed by paragraphs explaining how VAEs work, made me wonder why I was reading it in the first place; practically improving ELBO methods for VAE implementations is a meaningful contribution.

[0] "Tighter Variational Bounds are Not Necessarily Better",
[https://arxiv.org/abs/1802.04537](https://arxiv.org/abs/1802.04537)

~~~
akosiorek
Author of the post here; I'm not sure what's preventing the equations from
rendering. The blog is just a bunch of static pages generated by Jekyll; the
math is rendered with MathJax.

~~~
stochastic_monk
mijoharas below correctly identified the cause: the HTTPS Everywhere browser
add-on I use was preventing them from showing. I don't know enough about
browser/website design to be able to help fix the problem.

Thank you for your work, the post, and the response!

------
jostmey
Where are the results? At least show me the MNIST samples, either in the blog
post or the paper. And while you're at it, show the closest datapoint to each
sample. Is the model just memorizing the datapoints in the dataset?

~~~
akosiorek
Author of the blog post here; the main results are the convergence curves of
IWAE-64 and log p(x) in the paper, but that's a good point, thanks. We'll
include samples in the camera-ready version of the paper.
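For readers unfamiliar with the IWAE-K bound mentioned above: it averages K importance weights inside the log, which tightens the bound as K grows. A minimal NumPy sketch (function and variable names are my own, not from the paper's code):

```python
import numpy as np

def iwae_bound(log_p_xz, log_q_z, axis=0):
    """Monte Carlo IWAE bound: log (1/K) * sum_k p(x, z_k) / q(z_k | x),
    computed stably with log-sum-exp over the K importance samples."""
    log_w = log_p_xz - log_q_z  # log importance weights, K along `axis`
    m = np.max(log_w, axis=axis, keepdims=True)
    return m.squeeze(axis) + np.log(np.mean(np.exp(log_w - m), axis=axis))

# With K = 1 this reduces to the standard single-sample ELBO estimate:
# iwae_bound(np.array([-2.0]), np.array([0.0]))  ==  -2.0
```

Note that with K samples of equal weight the bound is just that common log-weight; the tightening only comes from the variance of the weights.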

------
visarga
I thought it was that VAEs generate blurry images. Has anyone been able to
make VAEs as good as GANs for image generation?

~~~
twanvl
IIRC there was a NIPS 2016 paper showing that the blurry images are due to the
variational approximation used in the generator/decoder network. They replaced
it with MCMC inference on the encoder network and got sharper images, but much
less efficiently. In general, variational inference will always tend to model
the mean or mode of the data.
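That mean-seeking behaviour is easy to see in one dimension: fit a single Gaussian by maximum likelihood to sharply bimodal data and the fitted mean lands between the modes, the 1-D analogue of a blurry average image. A toy sketch (the numbers are my own illustration, not from any paper):

```python
import numpy as np

# Bimodal "dataset": two sharp clusters centred at -2 and +2.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 0.1, 500),
                       rng.normal(2, 0.1, 500)])

# Maximum-likelihood single Gaussian: sample mean and std of the data.
mu, sigma = data.mean(), data.std()
# mu ends up near 0, between the modes, where there is almost no data;
# sigma inflates to cover both clusters.
```

A mixture model (or a sampler that can sit in one mode) would put mass on the actual clusters instead.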

