# Boxplot: whiskers and outliers doubt

I have a question about boxplots.
I'll lay out what I know first, and then ask my question.
$x=\{x_1,x_2,\dots,x_n\}$: the set of samples
$q_1$, $q_3$: the first and third quartiles
$w_l$, $w_u$: the lower and upper whiskers
$IQR=q_3-q_1$
The box extends from $q_1$ to $q_3$
$w_l=\max\left(\min(x),\,q_1-1.5\cdot IQR\right)$
$w_u=\min\left(\max(x),\,q_3+1.5\cdot IQR\right)$
$outliers=\{x_i\in x \mid x_i<w_l \vee x_i>w_u\}$
Observations:
$\text{the whiskers' distances from the box are not symmetric} \iff \left(w_l=\min(x) \vee w_u=\max(x)\right)$
$w_u-q_3<q_1-w_l \implies \nexists\, x_i : x_i\in outliers \wedge x_i>w_u$
$w_u-q_3>q_1-w_l \implies \nexists\, x_i : x_i\in outliers \wedge x_i<w_l$
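For reference, the definitions above can be sketched in Python. This only illustrates the formulas as written in this question. The function name `clamped_whiskers`, the sample values, and the quartile convention (medians of the lower and upper halves, leaving out the overall median when $n$ is odd) are assumptions; other quartile conventions can give slightly different numbers.

```python
from statistics import median

def clamped_whiskers(values, k=1.5):
    """Whiskers and outliers using the definitions stated in the question.

    Assumed quartile convention: medians of the lower and upper halves
    (needs at least two values).
    """
    s = sorted(values)
    half = len(s) // 2
    q1, q3 = median(s[:half]), median(s[-half:])
    iqr = q3 - q1
    # Whiskers are the 1.5*IQR limits, clamped to the data range.
    w_l = max(s[0], q1 - k * iqr)
    w_u = min(s[-1], q3 + k * iqr)
    outliers = [v for v in s if v < w_l or v > w_u]
    return q1, q3, w_l, w_u, outliers

# Hypothetical sample, chosen only for illustration.
q1, q3, w_l, w_u, outliers = clamped_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 30])
print(q1, q3, w_l, w_u, outliers)
```

With these definitions a whisker is shorter than $1.5\cdot IQR$ only when it stops at the minimum or maximum of the data, which is what the observations rely on.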
My question: if everything I laid out above is correct, how do you explain the outliers in this speed of light boxplot (third experiment, lower outliers) and in this plot (see Wednesday, lower outliers)?
If my reasoning is wrong, please provide a simple numeric counterexample.

thatuglygirlyu
Consider the data
$\left\{0,4,5,5,5,6,6,6,6,7,20\right\}.$
The median is $6$, the first quartile is $5$, and the third quartile is $6$. So the IQR is $1$, the limits given by the $1.5\cdot IQR$ rule are $5-1.5=3.5$ and $6+1.5=7.5$, and it follows that $0$ is a lower outlier and $20$ is an upper outlier. What you need to take into account is that the box shows where the middle 50% of the data lies, so if the box is particularly narrow, the IQR is small, and any values outside the range determined by the $1.5\cdot IQR$ rule are outliers. There can be many outliers, or none at all.
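The arithmetic can be checked with a short Python sketch. The helper names `quartiles` and `iqr_outliers` are made up for this illustration, and the quartiles are taken as the medians of the lower and upper halves (an assumed convention that reproduces the values $5$ and $6$ used here).

```python
from statistics import median

def quartiles(values):
    """Return (q1, q3) as medians of the lower and upper halves.

    Assumption: for an odd sample size the overall median is left out
    of both halves. Needs at least two values.
    """
    s = sorted(values)
    half = len(s) // 2
    return median(s[:half]), median(s[-half:])

def iqr_outliers(values, k=1.5):
    """Apply the 1.5*IQR rule and return the limits and the outliers."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    limits = (q1 - k * iqr, q3 + k * iqr)
    low = [v for v in values if v < limits[0]]
    high = [v for v in values if v > limits[1]]
    return limits, low, high

data = [0, 4, 5, 5, 5, 6, 6, 6, 6, 7, 20]
limits, low, high = iqr_outliers(data)
print(limits, low, high)  # (3.5, 7.5) [0] [20]
```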

ttyme411gl
The definitions of ${w}_{l}$ and ${w}_{u}$ in my question were wrong. Referring to Wikipedia:
"whiskers can represent several possible alternative values" such as "the minimum and maximum of all of the data" or "the lowest datum still within 1.5 IQR of the lower quartile, and the highest datum still within 1.5 IQR of the upper quartile", or even "one standard deviation above and below the mean of the data" and finally "the 9th percentile and the 91st percentile" or "the 2nd percentile and the 98th percentile".