Confidence interval calculation

Zackary Duffy

2022-09-06

I need to evaluate models that return, for every sample, the probability of reaching the goal state. For some models the probability sequence is 0,0,0,0,0,0,0,0,0… and for others it is 1,1,1,1,1,1,1,1,1,1,1…
I need to compute confidence intervals for these extreme cases as well as for mixed cases, where the sequence contains both zeros and ones.
I can't use the general formula
$$CI = \left(\mu - z\frac{\sigma}{\sqrt{n}},\ \mu + z\frac{\sigma}{\sqrt{n}}\right)$$
here, nor the t-distribution formula, because the sample standard deviation is zero, so it returns the interval $[\mu - 0,\ \mu + 0]$ immediately after the first sample.

Answer & Explanation

darkflamexivcr

2022-09-07

Step 1
You are using the so-called Wald interval, based on normal approximation and other asymptotics. It is OK for huge n (e.g., nationwide public opinion polls). But it is known to have problems (including the one you mention) for small n.
Agresti-Coull Interval: Let $\tilde{n} = n + 4$ and estimate $\tilde{p} = (x + 2)/\tilde{n}$. Then a 95% CI for $p$ is of the form
$$\tilde{p} \pm 1.96\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{\tilde{n}}}.$$
This amounts to appending two successes and two failures to the actual data before computing the CI.
For x = 0 successes and n = 10 trials, this gives (−.04,.33), which you can interpret as (0,.33). [Similarly for upper endpoints above 1.]
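As a minimal sketch of the calculation above, the "add two successes and two failures" rule can be written as a small R function. The name `agresti_coull` is hypothetical (it is not a base R function), and clipping the endpoints to [0, 1] follows the interpretation suggested in the answer.

```r
# Agresti-Coull 95% interval: add two successes and two failures,
# then apply the normal-approximation formula to the adjusted counts.
# `agresti_coull` is an illustrative name, not part of base R.
agresti_coull <- function(x, n) {
  n_tilde <- n + 4
  p_tilde <- (x + 2) / n_tilde
  half <- 1.96 * sqrt(p_tilde * (1 - p_tilde) / n_tilde)
  # Clip to [0, 1], since a proportion cannot lie outside that range.
  c(lower = max(0, p_tilde - half), upper = min(1, p_tilde + half))
}

agresti_coull(0, 10)   # all failures: roughly (0, 0.33)
agresti_coull(10, 10)  # all successes: roughly (0.67, 1)
agresti_coull(3, 10)   # mixed sequence of zeros and ones
```

Here `x` is the number of ones in the sequence and `n` is its length, so the all-zero and all-one sequences from the question no longer produce a zero-width interval.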
The Agresti interval closely approximates the more accurate (but messier) Wilson interval. The Wilson interval 'inverts the test' for $H_0: p = p_0$ vs $H_a: p \ne p_0$, rejecting when $|Z| \ge 1.96$, where
$$Z = \frac{x/n - p_0}{\sqrt{p_0(1 - p_0)/n}}.$$
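Solving $|Z| \le 1.96$ for $p_0$ gives a closed form for the Wilson interval. The following is a sketch under that derivation. The function name `wilson` is hypothetical, and the cross-check assumes that base R's `prop.test` with `correct = FALSE` reports the same score interval (it uses the exact normal quantile rather than 1.96, so the results agree only to about three decimals).

```r
# Wilson score interval: the set of p0 values not rejected by the Z test.
# `wilson` is an illustrative name, not part of base R.
wilson <- function(x, n, z = 1.96) {
  p_hat  <- x / n
  denom  <- 1 + z^2 / n
  centre <- (p_hat + z^2 / (2 * n)) / denom
  half   <- (z / denom) * sqrt(p_hat * (1 - p_hat) / n + z^2 / (4 * n^2))
  # Clip to guard against tiny floating-point overshoot at 0 or 1.
  c(lower = max(0, centre - half), upper = min(1, centre + half))
}

wilson(0, 10)   # roughly (0, 0.28)
wilson(10, 10)  # roughly (0.72, 1)

# Cross-check against base R (continuity correction switched off).
suppressWarnings(prop.test(0, 10, correct = FALSE)$conf.int)
```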
Step 2
Bayesian Interval from Uniform Prior. Using $\mathrm{Unif}(0, 1) \equiv \mathrm{Beta}(1, 1)$ as prior and the binomial likelihood $p^x(1 - p)^{n - x}$, the Bayesian posterior is $\mathrm{Beta}(x + 1,\ n - x + 1)$ and the 95% posterior probability interval uses quantiles .025 and .975 of the posterior distribution. For $x = 0$ and $n = 10$, this interval is (.002,.285). The Bayesian interpretation differs in philosophy, but the numerical results are sometimes used as a confidence interval by frequentists. This interval can never produce endpoints outside of (0,1).
> round(qbeta(c(.025,.975), 1, 11), 3)
[1] 0.002 0.285
Of course, the Bayesian approach has the advantage of being able to take substantive prior information, if available, into account by using an 'informative' prior.
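To illustrate how a different prior enters the calculation, here is a small sketch that wraps `qbeta` for a general Beta(a, b) prior. The function name `beta_interval` and its argument names are hypothetical. The default `a = 1, b = 1` is the uniform prior used above, and the Beta(2, 8) prior in the last call is an arbitrary example, not a recommendation.

```r
# Equal-tailed posterior interval for a binomial proportion under a
# Beta(a, b) prior. `beta_interval` is an illustrative name.
beta_interval <- function(x, n, a = 1, b = 1, level = 0.95) {
  alpha <- 1 - level
  # Posterior is Beta(a + x, b + n - x).
  qbeta(c(alpha / 2, 1 - alpha / 2), a + x, b + n - x)
}

beta_interval(0, 10)                # uniform prior, all failures
beta_interval(10, 10)               # uniform prior, all successes
beta_interval(0, 10, a = 2, b = 8)  # example informative prior
```

With the uniform prior, the all-failure and all-success cases are mirror images of each other, which is a quick sanity check on the implementation.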
