
Boltzmann Encoded Adversarial Machines - davelope911
https://arxiv.org/abs/1804.08682
======
cs702
_Very_ interesting.

At a high level (ignoring many details), the main idea is to replace generator
networks in GANs with Restricted Boltzmann Machines, or RBMs, which are easier
and more stable to train. The authors call this kind of architecture "Boltzmann
Encoded Adversarial Machines," or BEAM for short.

The experiments provide persuasive evidence that BEAMs outperform GANs. Figure
3 in particular I find compelling -- it compares the ability of different
architectures to learn low-dimensional mixtures of Gaussians, with BEAMs very
clearly outperforming GANs. The results in higher-dimensional applications
such as image generation also suggest that BEAMs outperform GANs, though there
the improvement is harder to judge objectively, given the nature of
high-dimensional data. Obviously, these results need to be replicated by
others.

It looks promising to me. That said, it's been _years_ since I've touched an
RBM -- I only have a vague recollection of how they work and how they're
trained, layer by layer, as proposed by Hinton in 2006 or so. Time to re-read
old papers!

~~~
drams
To clarify: in the case of a BEAM, both the generator and all but the top
layer of the discriminator are replaced with an RBM. The adversary in this
case operates on features encoded by the RBM, not on raw data samples. Second,
the RBM is trained with a combined loss involving the log-likelihood and an
adversarial term.
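To make the "combined loss" idea concrete, here is a rough numpy sketch with a toy binary RBM. The CD-1 part is the standard log-likelihood gradient estimate; the adversarial term below is only a moment-matching stand-in for the paper's critic on encoded features -- the exact form, the `gamma` weight, and all parameters are my own illustrative assumptions, not the authors' objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy binary RBM with energy E(v, h) = -v @ W @ h - a @ v - b @ h
n_vis, n_hid = 8, 4
W = 0.01 * rng.standard_normal((n_vis, n_hid))
a = np.zeros(n_vis)
b = np.zeros(n_hid)

def hidden_probs(v):
    """p(h_j = 1 | v) for each hidden unit."""
    return sigmoid(v @ W + b)

def gibbs_step(v):
    """One block-Gibbs step: v -> h -> v' (a one-step 'fantasy' sample)."""
    h = (rng.random(n_hid) < hidden_probs(v)).astype(float)
    p_v = sigmoid(h @ W.T + a)
    return (rng.random(n_vis) < p_v).astype(float)

def combined_gradient_W(v_data, gamma=0.1):
    """CD-1 estimate of the log-likelihood gradient for W, plus a crude
    adversarial-style term that pushes the hidden (encoded) statistics of
    model samples toward those of the data. The real critic is trained;
    this fixed moment-matching signal is purely illustrative."""
    v_model = gibbs_step(v_data)
    p_h_data, p_h_model = hidden_probs(v_data), hidden_probs(v_model)
    grad_ll = np.outer(v_data, p_h_data) - np.outer(v_model, p_h_model)
    grad_adv = np.outer(v_model, p_h_data - p_h_model)  # hypothetical critic signal
    return grad_ll + gamma * grad_adv

v = (rng.random(n_vis) < 0.5).astype(float)  # one fake binary "data" vector
W += 0.05 * combined_gradient_W(v)           # one ascent step on the combined objective
```

The only point of the sketch is the last line: a single update that mixes a likelihood gradient with an adversarial-style correction, rather than training on either signal alone.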

~~~
cs702
Yes. For simplicity's and brevity's sake, I ignored many important details in
my summary.

~~~
drams
No worries! :)

~~~
cs702
Thanks. Have you made any code available online?

~~~
drams
Yes. The following recent review article actually provides code samples:
[https://arxiv.org/abs/1803.08823](https://arxiv.org/abs/1803.08823), which
use an open-source version of our software called 'paysage'
([https://github.com/drckf/paysage](https://github.com/drckf/paysage)). It
hasn't been updated very recently, but we expect to put out a new update quite
soon. The update will clean up the code, docs, and features, but might not yet
contain the BEAM training code; the latter is pending some decisions about IP,
etc.

~~~
cs702
Thank you. I'll take a look!

------
w-m
It's amazing to see deep learning blast through all the benchmarks, for
example in computer vision, over the last couple of years. At the same time
something starts to feel off about having all these single-use asymmetric
feedforward networks solving their own little task. Being trained in one
direction, then used in the other, then thrown away. Maybe being chained
together for a more complex task, but that seems to be about it for the
average (real-world application) use case of deep learning nets.

I'm sure there's plenty of interesting work being done in ML to improve on
this situation and come up with new architectures. Yet I was moderately
surprised when I rediscovered Boltzmann machines recently, and found not much
work seemed to be going on there at all (very little at NIPS 2017 for
example?).

This BEAM seems intriguing, here's hoping it opens the door to a better
understanding and modeling of our world.

~~~
nafizh
RBMs went out of fashion after 2010-2011, as other architectures worked better
than them in almost all of the tasks in vision.

------
rememberlenny
Can someone explain the basic implications of this against current GANs and
also provide a practical ML application?

~~~
drams
I can try (I am a coauthor of this paper). First off, Unlearn.ai is a startup
working to build new tools that make precision medicine a reality. We needed
to be able to build generative models that allow us to:

1. model multimodal data easily (consider medical datasets with categorical,
binary, and continuous data, with various bounds etc., all mixed together);

2. answer counterfactual questions about data (for example, if I down-regulate
a gene, how does this affect the rest of the gene expression?);

3. build models that handle time-series data (give me a likely progression of
this person's cognitive scores given their current scores and other
indicators).

RBMs are natural candidates for models that handle these kinds of issues quite
well:

1. Although people have done work trying to get GANs to work well with
multimodal data, it's pretty kludgy.

2. GANs do not provide a means of inference (contrast VAEs, which can satisfy
this demand).

3. We have built a solid extension of RBMs to temporal models, which works
quite well.

However, as explained in this paper, stock RBMs have significant training
issues. This paper attempts to improve the situation.

~~~
tlarkworthy
RBMs have a native probabilistic output (the output is a distribution you can
slice), but vanilla neural networks don't (the output is a vector). Is that
right?

~~~
drams
It's best to say that an RBM is an undirected NN that models a probability
distribution over some variables. You can sample from that distribution (a
stochastic process). There are other models that use feed-forward NNs to do
something similar -- such as GANs, VAEs, and others. Their generation process
is also stochastic, but the difference is that you sample from a noise
distribution and then feed that through the NN. In all cases the generated
samples are still vectors.
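A minimal numpy sketch of that contrast, with random toy weights (everything here is illustrative, not from the paper): the RBM's randomness lives in the Gibbs chain itself, while the feed-forward generator injects all of its randomness up front and then applies a deterministic map.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy 6-visible / 3-hidden binary RBM with random weights.
W = 0.5 * rng.standard_normal((6, 3))
a, b = np.zeros(6), np.zeros(3)

def rbm_sample(n_steps=50):
    """RBM generation: run block Gibbs sampling, alternating
    h ~ p(h|v) and v ~ p(v|h). Every step is stochastic."""
    v = (rng.random(6) < 0.5).astype(float)
    for _ in range(n_steps):
        h = (rng.random(3) < sigmoid(v @ W + b)).astype(float)
        v = (rng.random(6) < sigmoid(h @ W.T + a)).astype(float)
    return v

# Feed-forward generation (GAN/VAE style): sample noise once,
# then make a deterministic pass through the network.
G = rng.standard_normal((4, 6))

def ff_sample():
    z = rng.standard_normal(4)   # all randomness enters here
    return np.tanh(z @ G)        # deterministic map of the noise

v, x = rbm_sample(), ff_sample()  # both generated samples are plain vectors
```

Note that both procedures end in an ordinary vector, matching the point above; only where the stochasticity enters differs.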

------
MarkMMullin
I'm wondering if the work on adversarial systems, this one being quite
interesting, can help us with our giant bugaboo of "OMG, it's overfitted :-(".
Right now we model, train, test, fail, and start all over again, usually
fiddling with the hyperparameters to boot. What would happen if we turned
training into a two-phase approach, with a BEAM/GAN or whatnot used on each
cycle to measure how 'brittle' the backprop is? The idea being to round down
the spikes in the learned model by penalizing the backprop when it is too
narrow. Training would take longer, but we'd throw away fewer sets, I'd think.

------
babak_ap
Is there a reference, open source, implementation available? (on Github or
similar)

~~~
TheAnig
I too was interested in this.

~~~
drams
See the comment above.

------
bra-ket
can this be applied to sequence learning?

------
johnfactorial
Just what the AI/ML crowd needs in the midst of burgeoning fear of AI: a new
technology with "Adversarial Machines" right there in the name.

