Hacker News
Confidence Games (justindomke.wordpress.com)
34 points by justindomke 14 days ago | 6 comments



If you have a hard time understanding a confidence interval, just treat it as a Bayesian credible interval (from an analysis with a flat and irrelevant prior). Now you can use the easy interpretation… “95% chance the parameter is between a and b”. You can do this because many frequentist methods are approximately Bayesian methods with flat priors.
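To make that correspondence concrete, here's a small sketch (my own example, not from the article): for normal data with known sigma, a flat prior makes the posterior for the mean N(x̄, σ²/n), so the 95% credible interval has exactly the same endpoints as the textbook 95% confidence interval:

```python
import math
import random
from statistics import NormalDist, mean

# Sketch: normal data with known sigma (assumed setup, for illustration).
random.seed(0)
n, sigma, mu_true = 50, 2.0, 1.0
xs = [random.gauss(mu_true, sigma) for _ in range(n)]
xbar = mean(xs)

# Frequentist 95% CI: xbar +/- z * sigma / sqrt(n)
z = NormalDist().inv_cdf(0.975)
ci = (xbar - z * sigma / math.sqrt(n), xbar + z * sigma / math.sqrt(n))

# Flat prior => posterior for the mean is N(xbar, sigma^2 / n);
# the 95% credible interval is that posterior's central 95%.
posterior = NormalDist(xbar, sigma / math.sqrt(n))
credible = (posterior.inv_cdf(0.025), posterior.inv_cdf(0.975))

print(ci)
print(credible)  # same endpoints: confidence and credibility coincide here
```

In this particular model the two intervals are numerically identical; as discussed downthread, that coincidence is not guaranteed in general.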

A confidence interval by itself doesn’t mean much of anything unless you use some hack to convert to a probability interval. Why? Because confidence intervals are utterly insane. Here is an example of a valid 95% confidence interval. Draw a random number between 0 and 1. If greater than 0.05, take (-inf, inf) to be the confidence interval, otherwise take the empty set.
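You can check that this degenerate procedure really does have 95% coverage with a quick simulation (a sketch; I represent the empty set as None and the full line as (-inf, inf)):

```python
import random

def silly_ci():
    # A "valid" 95% CI that ignores the data entirely:
    # with probability 0.95 return the whole real line, else the empty set.
    if random.random() > 0.05:
        return (float('-inf'), float('inf'))  # always covers the parameter
    return None  # empty set: never covers

TRIALS = 1_000_000
covered = sum(silly_ci() is not None for _ in range(TRIALS))
coverage = covered / TRIALS
print(coverage)  # ~0.95, no matter what parameter you pretend to estimate
```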


It may be an "easy interpretation," but it doesn't seem like a valid one: "The 95% probability relates to the reliability of the estimation procedure, not to a specific calculated interval." - https://en.wikipedia.org/wiki/Confidence_interval#Misunderst...

Similarly, the given example does not seem to represent how statistical confidence intervals work. As just mentioned, the 95% refers to the reliability, not to the probability that a specific interval contains the true parameter value! (In fact, the example is missing which parameter is being estimated.)

---

Let's see if we can rework the example to show that the 95% CI is actually well-defined and not arbitrary. To be specific about the problem:

- First, the population is given to be the uniform distribution between 0 and 1.

- Second, we must choose a parameter to estimate. For simplicity, let's suppose we're estimating the population mean. (Note the true population mean is 0.5.)

- Third, each sample is a single number drawn independently and identically from the population (technically we also compute its mean, which is, trivially, the number itself).

---

Here is the challenge: Given an arbitrary sample, can we provide a confidence interval (CI) such that 95% of all CIs computed in this way contain the population mean 0.5?

Here is a Python program showing that given a sample mean x, the 95% CI is [x - 0.475, x + 0.475]:

  from random import random
  
  NUM_SAMPLES = 1000000
  ERROR = 0.475                 # half-width of the interval
  POP_MEAN = (1 - 0) / 2        # true mean of Uniform(0, 1)
  
  # Each "sample" is one draw from Uniform(0, 1); its mean is itself.
  means = [random() for _ in range(NUM_SAMPLES)]
  # Count how many intervals [x - ERROR, x + ERROR] cover the true mean.
  num_correct = sum(x - ERROR <= POP_MEAN <= x + ERROR for x in means)
  
  print(f'Fraction of CIs containing the pop mean: {num_correct / NUM_SAMPLES}')
Experimentally, you can see that as NUM_SAMPLES grows, the fraction of intervals that contain 0.5 approaches 0.95.


You are correct that the "easy interpretation" is not valid for confidence intervals. However, it is the correct interpretation for Bayesian credible intervals. Re-read my comment. I think I'm pretty clear that you need to use a method to convert confidence intervals into probability / credible intervals to make them useful.

You can treat many frequentist statistical methods as shortcuts to Bayesian methods with flat priors, thus allowing you to exchange confidence (a meaningless concept in practice) for probability.

You demonstrated in your simulation what confidence means, but in applications, we only have one confidence interval per parameter, not 1,000,000, so the "reliability of a procedure" interpretation is not useful.


I agree that often in practice a confidence interval will end up being similar to a Bayesian credible interval. However, having a flat prior is not enough to guarantee this: the post gives an example with a uniform prior and a valid 70% confidence set that only captures 37.5% of posterior mass, and even an example with a uniform prior and a valid 70% confidence interval that captures 0% of posterior mass (because, similarly to your example, the set is empty).


Thanks, you are right that a flat prior does not guarantee numerical correspondence. I probably should have said 'matching prior'.


What really made confidence intervals click for me is the explanation that the '90%' in a 90% confidence interval indicates that 90% of the 90% confidence intervals will contain their true parameter.

There is no reason to restrict this to repeating the same experiment many times. The guarantee applies to any and all constructed confidence intervals pooled together, even those estimating completely unrelated parameters.
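A quick sketch of that pooling claim (my own example: randomly mixing a standard normal-mean 90% CI with the single-draw Uniform(0, 1) 90% CI from upthread):

```python
import math
import random
from statistics import mean

def covers_normal_mean(n=20, mu=3.0, sigma=1.0):
    # Standard 90% CI for a normal mean with known sigma.
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = mean(xs)
    half = 1.645 * sigma / math.sqrt(n)
    return xbar - half <= mu <= xbar + half

def covers_uniform_mean():
    # Single draw from Uniform(0, 1), true mean 0.5:
    # P(|x - 0.5| <= 0.45) = 0.9, so [x - 0.45, x + 0.45] is a 90% CI.
    x = random.random()
    return x - 0.45 <= 0.5 <= x + 0.45

TRIALS = 200_000
# Flip a coin each trial to decide which unrelated experiment to run.
hits = sum(covers_normal_mean() if random.random() < 0.5 else covers_uniform_mean()
           for _ in range(TRIALS))
pooled = hits / TRIALS
print(pooled)  # ~0.90 across the pooled, unrelated experiments
```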

Relevant xkcd: https://xkcd.com/882/



