
Bayes and Big Data: The Consensus Monte Carlo Algorithm - Anon84
http://research.google.com/pubs/pub41849.html
======
bazzargh
Any hint as to how this differs from other parallel Monte Carlo algorithms? Or
is it an application of those methods to big data? (People were doing parallel
Monte Carlo at least as far back as the early 90s - my own research topic on
exact nuclear calculations was basically obsoleted by the superior scalability
of that approach)

~~~
tlarkworthy
Its first author is a Googler, and it's posted on a Google domain.

EDIT: haha, I am downvoted for speaking the truth. The parent and I have both
read lots of papers like these; this paper makes only a modest contribution.
It basically says "we fudged the inference and the end results look similar,
so it's a good optimization." However, there will probably be lots of
specific models that "need" the mixing freedom the approximation removes, and
hence the algorithm will only work for a specific subspace of MCMC problems.
MCMC is basically impossible to debug, so we dunno how well it works overall.
THIS IS THE SAME CONCLUSION AS EVERY OTHER MCMC APPROXIMATION PAPER. The only
reason this is on HN is because of its heritage. I do not think this paper is
revolutionary (unlike some other papers coming out of Google).

EDIT 2: evidence.

Modern: "Fully Parallel Inference in Markov Logic Networks" (Max-Planck),
"Hybrid Parallel Inference for Hierarchical Dirichlet Process",
"A split-merge MCMC algorithm for the hierarchical Dirichlet process".

Old: "Parallel Implementations of Probabilistic Inference", a 1996 review
paper!

You might say these papers are not exactly the same, ok, but the final
justification for the given paper is:

"Depending on the model, the resulting draws can be nearly indistinguishable"

NOTE the keywords: "depending" and "nearly"

~~~
Demiurge
I have no idea if you're right or not, but I really hate how people downmod on
HN instead of stating their disagreements.

~~~
ColinWright
In general so do I, but in this case the initial reply was a content-free
snipe. I'm not surprised it got a downvote rather than the time investment of
a more complete reply.

The subsequent edit was useful, although the attitude is, well, "sharp".

------
mathattack
Can anyone explain cases where the Monte Carlo optimizations wouldn't work in
a distributed manner? Most of the examples of Monte Carlo that come to mind
are path-dependent pricing of financial instruments. That seems an easy one
for distributing the computations: as long as each path gets its own
processor, you can average the prices at the end.

What's a good example of where this breaks down? I'm thinking weather
simulations, but I don't know the domain well enough.
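The embarrassingly parallel case described above can be sketched in a few lines. This is a hypothetical illustration (the option parameters and batch sizes are made up for the example): each "worker" simulates its own independent batch of paths under geometric Brownian motion, and the price is just the average of all discounted payoffs, so the batches could run on separate processors with no communication.

```python
import math
import random

# Hypothetical sketch: embarrassingly parallel Monte Carlo pricing of a
# European call under geometric Brownian motion. Each worker's batch is
# fully independent, so the only cross-worker step is the final average.

def simulate_batch(n_paths, s0, strike, rate, vol, t, seed):
    rng = random.Random(seed)          # per-worker seed for independence
    payoffs = []
    for _ in range(n_paths):
        z = rng.gauss(0.0, 1.0)
        s_t = s0 * math.exp((rate - 0.5 * vol**2) * t + vol * math.sqrt(t) * z)
        payoffs.append(max(s_t - strike, 0.0))
    return payoffs

def parallel_price(n_workers, paths_per_worker, **params):
    # In a real system each call below would run on its own processor;
    # here they run sequentially, but the batches never touch each other.
    all_payoffs = []
    for w in range(n_workers):
        all_payoffs.extend(simulate_batch(paths_per_worker, seed=w, **params))
    discount = math.exp(-params["rate"] * params["t"])
    return discount * sum(all_payoffs) / len(all_payoffs)

price = parallel_price(8, 5000, s0=100.0, strike=100.0, rate=0.01, vol=0.2, t=1.0)
```

For a path-dependent payoff (Asian, barrier) the inner loop would simulate the whole path, but the structure stays the same: paths are independent, so the distribution stays trivial.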

~~~
tristanz
Most Monte Carlo methods used to calculate Bayesian posterior distributions
are iterative: you can't parallelize the draws, since each one depends on the
previous draw. Moreover, each draw depends on a function of all the data, so
if your data is larger than can be stored on a single computer, it gets
complicated.
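Both coupling points are visible in even the simplest sampler. A toy random-walk Metropolis chain for a Gaussian mean (synthetic data and tuning values are made up for illustration):

```python
import math
import random

# Toy sketch of why plain MCMC resists parallelization: random-walk
# Metropolis for the mean of a Gaussian. Two couplings to notice:
#   1. each proposal starts from the current theta -> draws are serial
#   2. log_post sums over every data point -> each step needs all the data

def log_post(theta, data, sigma=1.0):
    # Flat prior; the likelihood term touches the *entire* dataset.
    return -sum((x - theta) ** 2 for x in data) / (2 * sigma**2)

def metropolis(data, n_draws, step=0.2, seed=0):
    rng = random.Random(seed)
    theta = 0.0
    draws = []
    for _ in range(n_draws):
        proposal = theta + rng.gauss(0.0, step)   # depends on current theta
        log_accept = log_post(proposal, data) - log_post(theta, data)
        if math.log(rng.random()) < log_accept:
            theta = proposal
        draws.append(theta)
    return draws

data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(200)]   # synthetic data
draws = metropolis(data, 2000)
posterior_mean = sum(draws[500:]) / len(draws[500:])    # discard burn-in
```

If the 200 points lived on two machines, every single iteration would need a round trip to evaluate `log_post`, which is the problem the sharding schemes try to avoid.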

This paper doesn't look like it has a magic solution, though. It seems to
propose an ad hoc sharding approach and then shows that it sometimes works.

~~~
nyan_sandwich
MCMC isn't strictly serial; you can run multiple chains in parallel and get
decent results (provided your burn-in period isn't too big), and this is
commonly done for multicore.

The "all the data" problem is interesting. It seems solvable if each node
summarizes the likelihood function for part of the data. I'll have to read the
paper and see if that's what they did.
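A toy version of that per-node idea, in the spirit of the paper's shard-and-combine scheme (everything here is a hypothetical sketch, not the paper's actual algorithm): each "node" draws from the sub-posterior of its own data shard, and the per-shard draws are merged by a precision-weighted average. For a Gaussian posterior this combination happens to be exact; in general it is only an approximation, which is where the "depending on the model" caveat comes from.

```python
import math
import random

# Hypothetical shard-and-combine sketch. Each node sees only its shard;
# the only communication is shipping draws back for the weighted average.

def shard_posterior_draws(shard, n_draws, sigma=1.0, rng=None):
    # Conjugate shortcut: with a flat prior, the sub-posterior of the mean
    # is N(shard_mean, sigma^2 / len(shard)), so we draw from it directly
    # instead of running a full per-node MCMC chain.
    m = sum(shard) / len(shard)
    sd = sigma / math.sqrt(len(shard))
    return [rng.gauss(m, sd) for _ in range(n_draws)], 1.0 / sd**2

def consensus_draws(data, n_shards, n_draws, seed=0):
    rng = random.Random(seed)
    size = len(data) // n_shards
    shards = [data[i * size:(i + 1) * size] for i in range(n_shards)]
    all_draws, weights = [], []
    for shard in shards:
        draws, precision = shard_posterior_draws(shard, n_draws, rng=rng)
        all_draws.append(draws)
        weights.append(precision)   # weight each node by its precision
    total_w = sum(weights)
    # Consensus draw t = precision-weighted average of the shards' t-th draws.
    return [sum(w * d[t] for w, d in zip(weights, all_draws)) / total_w
            for t in range(n_draws)]

data_rng = random.Random(42)
data = [data_rng.gauss(3.0, 1.0) for _ in range(1000)]   # synthetic data
draws = consensus_draws(data, n_shards=10, n_draws=2000)
```

The Gaussian case works out because sub-posterior precisions add up to the full-data precision; for multimodal or strongly non-Gaussian posteriors the weighted average can land somewhere no single chain ever visited, which is presumably the "specific subspace of MCMC problems" objection upthread.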

~~~
tristanz
It's all definitely solvable; it's just harder than the many Monte Carlo
problems that are embarrassingly parallel.

