Hamiltonian Monte Carlo explained (arogozhnikov.github.io)
210 points by adamnemecek 7 days ago | 17 comments





Also check out the introduction by Michael Betancourt (one of the developers of Stan): https://arxiv.org/abs/1701.02434

Hey, I read this article while doing my course exercises. Coming from Metropolis-Hastings, HMC seemed a bit like magic. Here are some notes I wrote about it; could somebody tell me whether I was even remotely close:

>In other words, as explained by my TA: when you have a really complex multi-dimensional distribution, normal MCMC will take forever to explore it. HMC, on the other hand, adds momentum that helps the chain explore areas where the probability is high. Imagine the probability being translated into a 3D landscape where high probability corresponds to deep areas and low probability to high ones. A ball with gravity will follow those curvatures and not needlessly jump over the walls into regions where the probability is low.

>Also, HMC is the current state-of-the-art MCMC algorithm if you have very high-dimensional data. Regular MCMC can also be applied if the distribution is much simpler. However, instead of MCMC, VI is commonly used since it can give really good results with very little work. I mean, sure, you have to choose your approximating distribution, but after that it's dead simple. Only if you need really high accuracy might you choose something like HMC.
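
To make the "ball with momentum" picture concrete, here is a minimal sketch of a single HMC transition using leapfrog integration. The names log_p, grad_log_p and the step-size/trajectory-length values are purely illustrative (not from the article), and a real sampler would tune them:

    import numpy as np

    def hmc_step(log_p, grad_log_p, q0, step_size=0.1, n_leapfrog=20):
        """One HMC transition targeting the density exp(log_p(q))."""
        q = q0.copy()
        p = np.random.normal(size=q.shape)          # random momentum "kick"
        current_h = -log_p(q0) + 0.5 * np.dot(p, p) # Hamiltonian = potential + kinetic

        # Leapfrog integration: the ball rolls over the potential surface -log_p(q)
        p = p + 0.5 * step_size * grad_log_p(q)
        for i in range(n_leapfrog):
            q = q + step_size * p
            if i < n_leapfrog - 1:
                p = p + step_size * grad_log_p(q)
        p = p + 0.5 * step_size * grad_log_p(q)

        # Metropolis accept/reject corrects for discretization error
        proposed_h = -log_p(q) + 0.5 * np.dot(p, p)
        if np.log(np.random.rand()) < current_h - proposed_h:
            return q      # accept the end of the trajectory
        return q0         # reject, stay where we were

    # Example: sample a 2D standard normal
    log_p = lambda q: -0.5 * np.dot(q, q)
    grad_log_p = lambda q: -q
    samples = [np.zeros(2)]
    for _ in range(1000):
        samples.append(hmc_step(log_p, grad_log_p, samples[-1]))

Compared to a random-walk Metropolis proposal, the long deterministic trajectory is what lets each step travel far while keeping the acceptance rate high.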


What's VI?

Variational Inference. If you google "variational inference mcmc" you'll find more information about both, if you're interested.

Excellent! I've always believed that being able to "play" with the Hamiltonian particle would be a great motivator for understanding physics ;)

This is why I get so excited about probabilistic programming in general: likelihood estimation on million-dimension data sets in reasonable time, right on your laptop. Real-world samples are usually sparse and heterogeneous. Abstracting out your analysis not only reduces the chance of human error, but also allows for ingestion of even more disparate archival and de novo data sources. I have little doubt this will lead to more nuanced theories and higher reproducibility of results.

A recent example of the state-of-the-art: predicting rare events in pediatric transplant surgeries.

https://arxiv.org/abs/1809.04407


Note that many people still use HMC without a closed form for the gradient, via approximation. In fact, Stan (http://mc-stan.org/) automatically approximates the gradient by default if none is given.
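
As a toy illustration of running HMC without an analytic gradient, one could plug a central-difference approximation in for grad_log_p in the sketch above. The name finite_diff_grad is just illustrative, and looping over dimensions like this is far too slow for genuinely high-dimensional problems:

    import numpy as np

    def finite_diff_grad(log_p, q, eps=1e-5):
        """Central-difference approximation of the gradient of log_p at q."""
        grad = np.zeros_like(q, dtype=float)
        for i in range(q.size):
            dq = np.zeros_like(q, dtype=float)
            dq[i] = eps
            grad[i] = (log_p(q + dq) - log_p(q - dq)) / (2 * eps)
        return grad

    # Usage with the earlier sketch:
    # grad_log_p = lambda q: finite_diff_grad(log_p, q)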

This is another excellent visualization of MCMC algorithms: https://chi-feng.github.io/mcmc-demo/app.html#NaiveNUTS,bana...

Nice looking page. The future of education is here!

Yeah. This looks like a "modern textbook" presentation.

OK, I somewhat understand Markov chains and I somewhat understand Monte Carlo simulations (at least in the context of financial modeling), but this is quite over my head!

I don't follow the literature on sampling-based inference very closely. Could anyone tell me what the state of the art is for confirming and debugging convergence problems?

I'm not an expert, but I think R-hat is pretty commonly used. There are a whole bunch of diagnostic plots in the Bayesian community within R (bayesplot and tidybayes are useful here). Not sure if that's state of the art (R-hat definitely isn't, as it's in Gelman's PhD thesis).
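
For what it's worth, the classic (non-split) R-hat is short enough to sketch directly. This assumes chains is an (n_chains, n_samples) array of draws for a single scalar parameter; modern implementations (e.g. recent Stan releases) use a refined split, rank-normalized version instead:

    import numpy as np

    def r_hat(chains):
        """Classic Gelman-Rubin potential scale reduction factor.
        Values near 1 suggest the chains have mixed."""
        m, n = chains.shape
        chain_means = chains.mean(axis=1)
        chain_vars = chains.var(axis=1, ddof=1)

        B = n * chain_means.var(ddof=1)      # between-chain variance
        W = chain_vars.mean()                # within-chain variance
        var_hat = (n - 1) / n * W + B / n    # pooled variance estimate
        return np.sqrt(var_hat / W)

    # Example: four well-mixed chains should give R-hat close to 1
    chains = np.random.normal(size=(4, 1000))
    print(r_hat(chains))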

The folk theorem of computational statistics suggests that if the model has convergence problems, it's a bad model ;)


Very cool Kanye

Why is the title of the site "Brilliantly Wrong"? They post a variety of such methods. Is this an example of a sophisticated method that is incorrect somehow?

Maybe it's a play off Less Wrong?


It's just the name of his blog. Self-deprecating humor, maybe?


