# The situation in the question: we have a number of scatter plots, each showing an estimated regression line (based on a valid model) and associated individual 95% confidence intervals (CI) for the regression function at each x-value, as well as the observed data. A professor asks, 'I don't understand how 95% of the observations fall outside the 95% CI as depicted in the figures.' Briefly explain how it is entirely possible that 95% of the observations fall outside the 95% CI as depicted in the figures. (We weren't given actual figures.)

Anyway, I thought it might be because a lot of outliers affected the fitted regression line, so a confidence interval formed from a bad regression line would itself be bad, resulting in 95% of observations falling outside the 95% CI.

Kaiden Stevens
Step 1
In a classical frequentist setting, the probability statements regarding a confidence interval relate to the (random) bounds of the interval. For example, take the common confidence interval for the mean, $\mu$, of some normal data-generating process with known $\sigma$. We have
$P\left(\overline{y}-1.96\frac{\sigma }{\sqrt{n}}<\mu <\overline{y}+1.96\frac{\sigma }{\sqrt{n}}\right)=0.95$
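As a short sketch of where the 0.95 comes from, assuming normal data with known $\sigma$ as in the example above, and writing $Z$ for the standardized sample mean (a symbol introduced here, not in the original answer):
$Z=\frac{\overline{y}-\mu }{\sigma /\sqrt{n}}\sim N(0,1)\phantom{\rule{1em}{0ex}}⇒\phantom{\rule{1em}{0ex}}P\left(-1.96<Z<1.96\right)\approx 0.95$
Rearranging the inequality inside the probability to isolate $\mu$ gives the interval above. The randomness sits in $\overline{y}$, not in $\mu$.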
Step 2
Notice that $\mu$ is not treated as random; it is 'fixed', as there is only one true mean. The probability statement concerns the lower and upper bounds of the interval, that is, $\overline{y}\pm 1.96\frac{\sigma }{\sqrt{n}}$. Since these bounds depend on $\overline{y}$ (let's assume for the moment that we know $\sigma$), it is entirely possible (due to sheer 'luck') that, for a specific sample, we obtain a value of $\overline{y}$ for which the interval excludes the majority of the observations. What the confidence interval does say is that, under repeated sampling, we should expect the interval to contain the true mean 95% of the time. It makes no claim about individual observations: the half-width $1.96\frac{\sigma }{\sqrt{n}}$ shrinks as $n$ grows while the observations keep a spread of $\sigma$, so for large $n$ most observations fall outside the interval.
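The repeated-sampling argument can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: it uses the known-$\sigma$ interval for a mean rather than the regression setting from the question, and the function name `simulate` and the values of `mu`, `sigma`, `n`, `reps` and `seed` are all hypothetical choices, not taken from the original problem.

```python
import numpy as np


def simulate(mu=10.0, sigma=2.0, n=1000, reps=2000, seed=0):
    """Return (coverage of mu, average fraction of observations inside the CI).

    All default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # Half-width of the 95% CI for the mean when sigma is known.
    half_width = 1.96 * sigma / np.sqrt(n)

    # Each row is one sample of size n from N(mu, sigma^2).
    y = rng.normal(mu, sigma, size=(reps, n))
    ybar = y.mean(axis=1)
    lower = ybar - half_width
    upper = ybar + half_width

    # How often does the interval contain the true mean?
    covers_mu = (lower < mu) & (mu < upper)

    # What share of the individual observations lies inside their own sample's interval?
    inside = (y > lower[:, None]) & (y < upper[:, None])

    return covers_mu.mean(), inside.mean()


coverage, frac_inside = simulate()
print(f"intervals containing mu:      {coverage:.3f}")
print(f"observations inside the CI:   {frac_inside:.3f}")
print(f"observations outside the CI:  {1 - frac_inside:.3f}")
```

With a sample size around 1000, the interval should contain $\mu$ in roughly 95% of the repetitions while only a small share of the individual observations lands inside it, which is the pattern the professor found puzzling. Lowering `n` widens the interval and raises the share of observations inside.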