
From a statistics perspective, a few things are missing, most importantly a discussion of statistical power and an explanation of why they test until a statistically significant difference is found. With a big enough sample, or enough repeated looks at the data, you are likely to find a statistically significant difference whether or not a meaningful one actually exists. Using significance alone as the heuristic for deciding that one algorithm is better than another is therefore not very useful.
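
For reference, here is a rough sketch of what a power calculation looks like for a plain two-arm A/B test on conversion rates. It uses the standard normal-approximation formula for comparing two proportions; the function name `sample_size_per_arm` and the example rates are made up for illustration and do not come from the article.

```python
import math
from statistics import NormalDist


def sample_size_per_arm(p1, p2, alpha=0.05, power=0.8):
    """Approximate visitors needed per arm to detect p1 vs p2.

    Normal-approximation formula for a two-sided two-proportion test.
    Hypothetical helper for illustration only.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Assumed example: 10% baseline conversion, hoping to detect 12%.
    print(sample_size_per_arm(0.10, 0.12))
```

The point is that the sample size is fixed before the test starts, based on the smallest effect you care about, rather than chosen by stopping when significance shows up.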

Agreed that statistical power wasn't used to calculate how long the test should have run. I know statistical significance can be found earlier if you are "looking" for it, which is why I ran the simulations until it was found at least 10 times. (I know this is not the most scientific way, but I used it as a heuristic; I don't know how to apply statistical power in the case of MAB. The concept of statistical power may not be valid for MAB at all. I need to study this more.)
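
To make the "looking for it" problem concrete, here is a small simulation sketch. It assumes a plain A/A test (both arms have the same true conversion rate), not a bandit, and all names and parameters (`false_positive_rates`, 400 runs, a peek every 100 visitors) are arbitrary choices for illustration. It compares how often a pooled two-proportion z-test looks significant at any peek versus only at the final, pre-planned sample size.

```python
import random
from statistics import NormalDist


def z_significant(conv_a, conv_b, n, alpha=0.05):
    """Two-sided pooled two-proportion z-test with n visitors per arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = (2 * pooled * (1 - pooled) / n) ** 0.5
    if se == 0:  # no conversions (or all conversions): nothing to test
        return False
    z = abs(conv_a - conv_b) / n / se
    return z > NormalDist().inv_cdf(1 - alpha / 2)


def false_positive_rates(runs=400, visitors=2000, peek_every=100,
                         rate=0.1, seed=0):
    """Return (rate with peeking, rate with one final test) for an A/A test."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        ever_significant = False
        for n in range(1, visitors + 1):
            # Both arms share the same true rate, so any "win" is a false positive.
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % peek_every == 0 and z_significant(conv_a, conv_b, n):
                ever_significant = True
        peeking_hits += ever_significant
        fixed_hits += z_significant(conv_a, conv_b, visitors)
    return peeking_hits / runs, fixed_hits / runs


if __name__ == "__main__":
    peeking, fixed = false_positive_rates()
    print(f"false positives with peeking: {peeking:.2%}")
    print(f"false positives with one final test: {fixed:.2%}")
```

Under these assumptions the peeking rate should come out well above the nominal 5%, while the single final test stays close to it. Whether the same reasoning carries over to MAB is a separate question that this sketch does not address.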
