Why is the "wrong" interpretation of confidence intervals still seemingly correct?

Alice Chen

2022-11-11

Why is the "wrong" interpretation of confidence intervals still seemingly correct?
According to online sources, a 95% confidence level means that if you repeated the sampling process many times and computed a 95% confidence interval from each sample, about 95% of those intervals would contain the true population mean.
But then they say that this is NOT the same as saying "you can be 95% confident that the interval you computed contains the population mean."
Isn't it, though? For instance, I do my first experiment and get a 95% confidence interval. Then a second. Third, fourth, ..., 100th. About 95 of those intervals should contain the population mean. Is this correct so far?
If so, then why isn't it the same as saying, from the moment I did the very first test, "this particular interval has a 95% chance of being one of the intervals that contain the true population mean"?
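
As a quick numerical check of that repeated-sampling reading, here is a minimal simulation sketch in Python. The normal population, its parameters, and the sample size are illustrative assumptions, and the known-sigma z-interval is used for simplicity:

```python
# Repeated-sampling check: draw many samples, build a 95% z-interval from
# each, and count how often the interval brackets the true mean.
import numpy as np

rng = np.random.default_rng(0)
true_mean, sigma, n, trials = 10.0, 2.0, 25, 100_000

samples = rng.normal(true_mean, sigma, size=(trials, n))
means = samples.mean(axis=1)
half_width = 1.96 * sigma / np.sqrt(n)  # known-sigma 95% z-interval

covered = (means - half_width <= true_mean) & (true_mean <= means + half_width)
print(f"coverage: {covered.mean():.3f}")  # prints roughly 0.950
```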

Answer & Explanation

Raven Hawkins

2022-11-12

Explanation:
Certainly the probability that the interval you will get contains the population mean is 0.95, but the conditional probability given the numbers that you got can be different. Here are four examples, one of which (the second) is realistic.
- One instance where that would obviously happen is when you know the population mean, so that the conditional probability given what you know about both the population and the sample would be either 0 or 1.
- A less extreme example is when you have a prior probability distribution for the population mean and your confidence interval falls in a region where, under that prior, the mean is unlikely to be. This can happen in practical situations; see the first sketch after this list.
- A more disturbing case goes like this: suppose you have a sample of size 3 from a uniform distribution on the interval [0, A], and your confidence interval for the population mean A/2 is [B·X̄, C·X̄], where X̄ is the sample mean. I leave it as an exercise to find values of B and C that make this a 95% confidence interval. Now suppose the sample you get is 1, 2, 99, so that X̄ = 34 and the confidence interval is [34B, 34C]. Since no observation can exceed A, the data tell you that A ≥ 99, and hence that the mean is at least 99/2 = 49.5; if C·X̄ falls below 49.5, the data alone tell you that this is one of the other 5%. (As it happens, at the 95% level no valid choice of B and C makes [34B, 34C] exclude 49.5, because the required C is too small to sustain 95% coverage; the trap is real at lower confidence levels, though, and in any case the coverage conditional on the scale-free ratio max/X̄ swings well away from 95%; see the second sketch after this list. The construction is also easily remedied by observing that the minimal sufficient statistic is the maximum observed value and using an interval of the form [B′·max, C′·max]; for example, B′ = 1/2 and C′ = 1/(2 · 0.05^(1/3)) ≈ 1.36 give exact 95% coverage, since the mean can never be below max/2.)
- A case that is perhaps even more disturbing is this: two independent observations are uniformly distributed between A - 1/2 and A + 1/2. Call the larger of these max and the smaller min. Clearly [min, max] is a 50% confidence interval for A. But if max - min = 0.0001, you would be a fool not to find it highly improbable that A is between them, and if max - min = 0.9999, you would be a fool not to be nearly 100% sure that A is between them. This technique gives you a 50% coverage rate, but the data tell you whether the instance you've got is likely to be among that 50% or not; see the third sketch after this list. (This one also has a standard remedy: Ronald Fisher's technique of conditioning on an ancillary statistic, which in this case is max - min. You get a more reasonable 50% confidence interval; I don't remember the details, but it is the same as the posterior interval obtained from an improper uniform prior on the real line.)
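
The second example can be checked numerically. In the sketch below, the standard normal prior, the standard error, and the tail cutoff are all illustrative assumptions: the mean is drawn from the prior, the usual 95% interval is computed, and overall coverage is compared with coverage conditional on the estimate landing in the prior's tail.

```python
# Second example: mu ~ N(0, 1) prior, observe xbar ~ N(mu, se^2), and use
# the usual 95% interval xbar +/- 1.96*se. Unconditionally it covers 95%
# of the time, but conditionally on |xbar| being far out in the prior's
# tail it covers noticeably less often.
import numpy as np

rng = np.random.default_rng(1)
trials, se = 1_000_000, 0.5

mu = rng.normal(0.0, 1.0, size=trials)  # means drawn from the prior
xbar = rng.normal(mu, se)               # one estimate per mean
covered = np.abs(xbar - mu) <= 1.96 * se

print(f"overall coverage: {covered.mean():.3f}")  # ~0.95
tail = np.abs(xbar) > 2.5               # estimates out in the prior's tail
print(f"coverage when |xbar| > 2.5: {covered[tail].mean():.3f}")  # well below 0.95
```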
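
For the third example, the sketch below calibrates B and C by Monte Carlo using equal tails (one assumed choice among many) and then looks at coverage conditional on the observable, scale-free ratio max/X̄. The overall rate is 95% by construction, but the conditional rate drifts well away from it, lowest when the three observations are nearly equal.

```python
# Third example: samples of size 3 from Uniform(0, A), interval
# [B*xbar, C*xbar] for the mean A/2. Coverage holds iff
# 1/(2C) <= xbar/A <= 1/(2B), and xbar/A is distributed as the mean of
# three Uniform(0,1) draws, so B and C come from that distribution's
# 97.5% and 2.5% quantiles (equal-tailed calibration, an assumed choice).
import numpy as np

rng = np.random.default_rng(2)
n, trials = 3, 500_000

u_means = rng.uniform(0, 1, size=(trials, n)).mean(axis=1)
q_lo, q_hi = np.quantile(u_means, [0.025, 0.975])
B, C = 1 / (2 * q_hi), 1 / (2 * q_lo)

A = 100.0  # true endpoint, unknown to the analyst; the mean is A/2 = 50
x = rng.uniform(0, A, size=(trials, n))
xbar, xmax = x.mean(axis=1), x.max(axis=1)
covered = (B * xbar <= A / 2) & (A / 2 <= C * xbar)
print(f"overall coverage: {covered.mean():.3f}")  # ~0.95 by construction

# Condition on the observable, scale-free ratio max/xbar (between 1 and 3):
ratio = xmax / xbar
for r_lo, r_hi in [(1.0, 1.1), (1.1, 1.5), (1.5, 2.5), (2.5, 3.0)]:
    sel = (ratio >= r_lo) & (ratio < r_hi)
    print(f"max/xbar in [{r_lo}, {r_hi}): coverage {covered[sel].mean():.3f}")
```

In runs of this sketch the lowest bin sits well below 95%, so the data do tell you something about whether your particular interval is likely to be one of the good ones.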
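
Finally, the last example: the sketch below confirms the 50% overall rate for [min, max] and shows how strongly coverage depends on the observed width. For this setup the conditional coverage given width w works out to min(1, w/(1 - w)), so it runs from near 0 for tiny widths to exactly 1 for widths above 1/2. The value of A and the bin edges are arbitrary choices.

```python
# Fourth example: two observations uniform on [A - 1/2, A + 1/2], with
# [min, max] used as a 50% confidence interval for A. Overall coverage is
# 50%, but coverage conditional on the width max - min varies from ~0 to 1.
import numpy as np

rng = np.random.default_rng(3)
A, trials = 7.0, 500_000

x = rng.uniform(A - 0.5, A + 0.5, size=(trials, 2))
lo, hi = x.min(axis=1), x.max(axis=1)
covered = (lo <= A) & (A <= hi)
width = hi - lo
print(f"overall coverage: {covered.mean():.3f}")  # ~0.50

# Conditional coverage given width w is min(1, w / (1 - w)) for this setup.
for w_lo, w_hi in [(0.0, 0.1), (0.2, 0.3), (0.4, 0.5), (0.5, 1.0)]:
    sel = (width >= w_lo) & (width < w_hi)
    print(f"width in [{w_lo}, {w_hi}): coverage {covered[sel].mean():.3f}")
```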
