
Nothing is universal and guaranteed, of course. YMMV, and all that.

For example, the robust linear regression from chapter 17, which fits 4 parameters to 300 points (easy, but far from trivial), runs in 180 seconds in JAGS and 485 seconds in Stan, in parallel with 4 chains, taking 20,000 samples.

Bayadera takes 276,297,912 samples in 300 milliseconds, giving much more fine-grained estimates.

So, depending on how you count the difference, it would be 500-1000 times faster for this particular analysis, while the per-sample ratio is something like 7,000,000 (compared to JAGS).
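For anyone who wants to redo the arithmetic, here is a minimal sketch that uses only the timings and sample counts quoted above. It assumes the 20,000 JAGS samples are the total across all 4 chains, which the comment does not state; the variable names are just for illustration. Under that assumption the wall-clock speedups come out at roughly 600 (JAGS) and 1,600 (Stan), and the per-sample ratio against JAGS at roughly 8.3 million, the same order of magnitude as the figures quoted.

```python
# Figures quoted in the comment above.
jags_seconds = 180.0
stan_seconds = 485.0
bayadera_seconds = 0.3          # 300 milliseconds
bayadera_samples = 276_297_912

# Assumption: 20,000 is the total sample count across the 4 chains.
jags_samples = 20_000

# Wall-clock speedup for this one analysis.
speedup_vs_jags = jags_seconds / bayadera_seconds
speedup_vs_stan = stan_seconds / bayadera_seconds

# Ratio of sampling throughput (samples per second), Bayadera vs JAGS.
bayadera_rate = bayadera_samples / bayadera_seconds
jags_rate = jags_samples / jags_seconds
per_sample_ratio = bayadera_rate / jags_rate

print(f"speedup vs JAGS: {speedup_vs_jags:.0f}x")
print(f"speedup vs Stan: {speedup_vs_stan:.0f}x")
print(f"per-sample ratio vs JAGS: {per_sample_ratio:,.0f}")
```

If the 20,000 samples were per chain instead, the per-sample ratio would be four times smaller.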

Of course, JAGS and Stan are mature software packages, while Bayadera is still pre-release...

Thanks. About the second part of my question - are you doing much the same stuff as JAGS/Stan? Like, they do a lot of work to make sure that their MCMC is validly converging to the posterior - does Bayadera make similar guarantees?

Is the speedup coming from a better implementation, or because GPUs are just way faster, or because it cuts statistical corners? If it's cutting corners, are they sensible?

It uses a different MCMC algorithm: affine invariant ensemble MCMC. The difference comes from the fact that this algorithm is parallelizable, while the ones JAGS and Stan use aren't. So the many GPU cores are the main factor. But the algorithm is also a factor, in the sense that the parallel chains always inform each other.
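To make the "chains inform each other" point concrete, here is a minimal CPU-only sketch of one common form of affine invariant ensemble sampling, the stretch move with the ensemble split into two halves. This is not Bayadera's code; the target density, the function names, and the parameter values are all assumptions chosen for illustration. Each walker proposes a move along the line through itself and a randomly chosen walker from the other half, so every update within a half is independent of the others and could run on its own core.

```python
import math
import random

def log_density(x):
    # Hypothetical target: unnormalized standard normal in every dimension.
    return -0.5 * sum(v * v for v in x)

def stretch_update(active, partners, log_p, a=2.0, rng=random):
    """Stretch-move update of each walker in `active`, using `partners`.

    The loop iterations do not depend on each other, which is what
    makes this step parallelizable.
    """
    dim = len(active[0])
    updated = []
    for x in active:
        other = rng.choice(partners)
        # Draw z with density proportional to 1/sqrt(z) on [1/a, a].
        z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
        proposal = [o + z * (xi - o) for xi, o in zip(x, other)]
        # Acceptance ratio: z^(dim - 1) * p(proposal) / p(x), in log form.
        log_accept = (dim - 1) * math.log(z) + log_p(proposal) - log_p(x)
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_accept:
            updated.append(proposal)
        else:
            updated.append(x)
    return updated

def sample(n_walkers=64, dim=2, n_steps=500, seed=0):
    rng = random.Random(seed)
    walkers = [[rng.uniform(-1.0, 1.0) for _ in range(dim)]
               for _ in range(n_walkers)]
    half = n_walkers // 2
    draws = []
    for _ in range(n_steps):
        first, second = walkers[:half], walkers[half:]
        # Update one half against the other, then swap roles.
        first = stretch_update(first, second, log_density, rng=rng)
        second = stretch_update(second, first, log_density, rng=rng)
        walkers = first + second
        draws.extend(walkers)
    return draws

draws = sample()
mean_x0 = sum(d[0] for d in draws) / len(draws)
print(f"{len(draws)} draws, mean of first coordinate: {mean_x0:.3f}")
```

On a GPU the inner loop would become one thread per walker; the sketch keeps it serial so it runs with nothing but the standard library.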

They may do a lot of work to make sure that MCMC is validly converging, and Bayadera also does its part on that front, but the truth is (and you'll find it in any book on MCMC, Gelman included) that you can never guarantee MCMC convergence.

Looks very nice. I wonder if the upcoming Xeon Phi will make the task of parallel sampling simpler. Or at least compiling and optimising automatic parallel samplers on the fly. Macros might be great for this. That's the ultimate probabilistic programming goal. Write the model and get efficient sampling for free.

Thanks. I doubt that Xeon Phi would be any faster than my old AMD R9 290X, and the 10x price tag is not inviting either.

Can you point me to any good documentation on parallel MCMC algorithms and any info you might have written down on how you parallelized it? This sounds extremely worth porting over to some other probabilistic programming languages.

I'll be glad to send you the paper once it gets accepted.

