# The data for each grade have the same interquartile range (IQR). Which of the following best compares the two test score distributions?
Data distributions
The data for each grade have the same interquartile range (IQR). Which of the following best compares the two test score distributions?
With reference to the line plots, the sixth grade geography test scores are
7 8 8 9 9 9 9 9 10 10 10 11 11 11 12 12 12 14 14 15
The seventh grade geography test scores are
7 10 10 11 11 11 11 12 12 13 13 13 13 13 14 14 14 15 16 17
2021-02-14
The median is the middle value of the ordered data. Each grade has 20 test scores, so the median is the average of the $$10^{\text{th}}$$ and $$11^{\text{th}}$$ data points when the scores are arranged in ascending order.
The median of the sixth grade geography test scores is calculated as shown below.
$$10^{\text{th}}$$ data point = 10
$$11^{\text{th}}$$ data point = 10
Median = $$\frac{10+10}{2}=10$$
The median of the seventh grade geography test scores is calculated as shown below.
$$10^{\text{th}}$$ data point = 13
$$11^{\text{th}}$$ data point = 13
Median = $$\frac{13+13}{2}=13$$
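The middle-pair rule used above can be checked with a short Python sketch. The function name `middle_pair_median` is only illustrative; the scores are the ones listed in the question.

```python
from statistics import median

sixth = [7, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 14, 14, 15]
seventh = [7, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 16, 17]


def middle_pair_median(scores):
    """Median as the middle value, or the average of the two middle values."""
    ordered = sorted(scores)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[n // 2]
    # For n = 20 these are the 10th and 11th data points (indices 9 and 10).
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


sixth_median = middle_pair_median(sixth)
seventh_median = middle_pair_median(seventh)
print(sixth_median, seventh_median)

# Cross-check against the standard library.
print(median(sixth), median(seventh))
```

Both the hand-written rule and `statistics.median` should agree with the values worked out by hand.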
The IQR is the difference between Quartile 3 and Quartile 1.
For the sixth grade geography test scores
7 8 8 9 9 9 9 9 10 10 10 11 11 11 12 12 12 14 14 15
Quartile 1 divides the data distribution so that 25% of the data lie below it and 75% lie above it. Equivalently, it is the median of the lower half of the data.
Quartile 1 is the average of the $$5^{\text{th}}$$ and $$6^{\text{th}}$$ data points when arranged in ascending order.
$$Q_1=\frac{9+9}{2}=9$$
Quartile 3 divides the data distribution so that 75% of the data lie below it and 25% lie above it. Equivalently, it is the median of the upper half of the data.
Quartile 3 is the average of the $$15^{\text{th}}$$ and $$16^{\text{th}}$$ data points when arranged in ascending order.
$$Q_3=\frac{12+12}{2}=12$$
Interquartile range = $$Q_3-Q_1=12-9=3$$
For the seventh grade geography test scores
7 10 10 11 11 11 11 12 12 13 13 13 13 13 14 14 14 15 16 17
Quartile 1 divides the data distribution so that 25% of the data lie below it and 75% lie above it. Equivalently, it is the median of the lower half of the data.
Quartile 1 is the average of the $$5^{\text{th}}$$ and $$6^{\text{th}}$$ data points when arranged in ascending order.
$$Q_1=\frac{11+11}{2}=11$$
Quartile 3 divides the data distribution so that 75% of the data lie below it and 25% lie above it. Equivalently, it is the median of the upper half of the data.
Quartile 3 is the average of the $$15^{\text{th}}$$ and $$16^{\text{th}}$$ data points when arranged in ascending order.
$$Q_3=\frac{14+14}{2}=14$$
Interquartile range = $$Q_3-Q_1=14-11=3$$
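The quartile steps for both grades can be reproduced with a minimal Python sketch of the "median of each half" rule described above. The helper name `quartiles_by_halves` is an assumption made for illustration, and other quartile conventions exist that can give slightly different values on other data sets.

```python
from statistics import median

sixth = [7, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 14, 14, 15]
seventh = [7, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 16, 17]


def quartiles_by_halves(scores):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(scores)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # With an odd count, the middle value is left out of both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)


for name, scores in (("sixth", sixth), ("seventh", seventh)):
    q1, q3 = quartiles_by_halves(scores)
    print(name, "Q1 =", q1, "Q3 =", q3, "IQR =", q3 - q1)
```

With 20 scores per grade, each half holds 10 values, so Q1 and Q3 are the averages of the 5th and 6th values of their halves, matching the positions used in the text.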
Thus the median for seventh grade is 13 and the median for sixth grade is 10, a difference of 3 points, while the interquartile range for both grades is 3. So option 3 is correct:
The median score of the seventh grade class is 3 points greater than the median score of the sixth grade class. The difference is the same as the IQR.
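As a rough check of this conclusion, the gap between the medians can be expressed as a multiple of the IQR. This sketch uses `statistics.quantiles` with its default method, which is a different quartile convention from the one in the answer but gives the same quartiles for these two data sets; the helper name `iqr` is only illustrative.

```python
from statistics import median, quantiles

sixth = [7, 8, 8, 9, 9, 9, 9, 9, 10, 10, 10, 11, 11, 11, 12, 12, 12, 14, 14, 15]
seventh = [7, 10, 10, 11, 11, 11, 11, 12, 12, 13, 13, 13, 13, 13, 14, 14, 14, 15, 16, 17]


def iqr(scores):
    """Interquartile range from the three quartile cut points."""
    q1, _, q3 = quantiles(scores, n=4)
    return q3 - q1


median_gap = median(seventh) - median(sixth)
gap_in_iqrs = median_gap / iqr(sixth)
print("median gap:", median_gap)
print("gap as a multiple of the IQR:", gap_in_iqrs)
```

A ratio of 1 means the difference in medians equals one IQR, which is what the selected option states.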

### Relevant Questions
The pathogen Phytophthora capsici causes bell pepper plants to wilt and die. A research project was designed to study the effect of soil water content and the spread of the disease in fields of bell peppers. It is thought that too much water helps spread the disease. The fields were divided into rows and quadrants. The soil water content (percent of water by volume of soil) was determined for each plot. An important first step in such a research project is to give a statistical description of the data. Soil Water Content for Bell Pepper Study
$$\begin{matrix} 15 & 14 & 14 & 14 & 13 & 12 & 11 & 11 & 11 & 11 & 10 & 11 & 13 & 16 \\ 9 & 15 & 12 & 9 & 10 & 7 & 14 & 13 & 14 & 8 & 9 & 8 & 11 & 13 \\ 15 & 12 & 9 & 10 & 9 & 9 & 16 & 16 & 12 & 10 & 11 & 11 & 12 & 15 \\ 10 & 10 & 10 & 11 & 9 \end{matrix}$$
If you have a statistical calculator or computer, use it to find the actual sample mean and sample standard deviation.

A two-sample inference deals with dependent and independent inferences. In a two-sample hypothesis testing problem, underlying parameters of two different populations are compared. In a longitudinal (or follow-up) study, the same group of people is followed over time. Two samples are said to be paired when each data point in the first sample is matched and related to a unique data point in the second sample.
This problem demonstrates inference from two dependent (follow-up) samples using the data from the hypothetical study of new cases of tuberculosis (TB) before and after the vaccination was done in several geographical areas in a country in sub-Saharan Africa. Conclusion about the null hypothesis is to note the difference between samples.
The problem that demonstrates inference from two dependent samples uses hypothetical data from the TB vaccinations and the number of new cases before and after vaccination.
$$\begin{array}{|c|c|c|} \hline \text{Geographical regions} & \text{Before vaccination} & \text{After vaccination} \\ \hline 1 & 85 & 11 \\ \hline 2 & 77 & 5 \\ \hline 3 & 110 & 14 \\ \hline 4 & 65 & 12 \\ \hline 5 & 81 & 10 \\ \hline 6 & 70 & 7 \\ \hline 7 & 74 & 8 \\ \hline 8 & 84 & 11 \\ \hline 9 & 90 & 9 \\ \hline 10 & 95 & 8 \\ \hline \end{array}$$
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following: Construct a one-sided $$95\%$$ confidence interval for the true difference in population means. Test the null hypothesis that the population means are identical at the 0.05 level of significance.

The article “Anodic Fenton Treatment of Treflan MTF” describes a two-factor experiment designed to study the sorption of the herbicide trifluralin. The factors are the initial trifluralin concentration and the $$\text{Fe}^{2+} : \text{H}_2\text{O}_2$$ delivery ratio. There were three replications for each treatment. The results presented in the following table are consistent with the means and standard deviations reported in the article.
$$\begin{matrix} \text{Initial Concentration (M)} & \text{Delivery Ratio} & \text{Sorption (\%)} \\ 15 & 1:0 & 10.90 \quad 8.47 \quad 12.43 \\ 15 & 1:1 & 3.33 \quad 2.40 \quad 2.67 \\ 15 & 1:5 & 0.79 \quad 0.76 \quad 0.84 \\ 15 & 1:10 & 0.54 \quad 0.69 \quad 0.57 \\ 40 & 1:0 & 6.84 \quad 7.68 \quad 6.79 \\ 40 & 1:1 & 1.72 \quad 1.55 \quad 1.82 \\ 40 & 1:5 & 0.68 \quad 0.83 \quad 0.89 \\ 40 & 1:10 & 0.58 \quad 1.13 \quad 1.28 \\ 100 & 1:0 & 6.61 \quad 6.66 \quad 7.43 \\ 100 & 1:1 & 1.25 \quad 1.46 \quad 1.49 \\ 100 & 1:5 & 1.17 \quad 1.27 \quad 1.16 \\ 100 & 1:10 & 0.93 \quad 0.67 \quad 0.80 \end{matrix}$$
a) Estimate all main effects and interactions.
b) Construct an ANOVA table. You may give ranges for the P-values.
c) Is the additive model plausible? Provide the value of the test statistic, its null distribution, and the P-value.

Consider the rates of children (under 18 years of age) living in New York with grandparents as their primary caretakers. A sample of 13 New York counties yielded the following percentages of children under 18 living with grandparents.
5.9, 4.0, 5.7, 5.1, 4.1, 4.4, 6.5, 4.4, 5.8, 5.1, 6.1, 4.5, 4.9
a) Obtain and interpret the quartiles.
b) Determine and interpret the interquartile range.
c) Find and interpret the five-number summary.

Find the mean, median, mode, and range for each data set given.
a. 7, 12, 1, 7, 6, 5, 11
b. 85, 105, 95, 90, 115
c. 10, 14, 16, 16, 8, 9, 11, 12, 3
d. 10, 8, 7, 5, 9, 10, 7
e. 45, 50, 40, 35, 75
f. 15, 11, 11, 16, 16, 9

Case: Dr. Jung’s Diamonds Selection
With Christmas coming, Dr. Jung became interested in buying diamonds for his wife. After perusing the Web, he learned about the “4Cs” of diamonds: cut, color, clarity, and carat. He knew his wife wanted round-cut earrings mounted in white gold settings, so he immediately narrowed his focus to evaluating color, clarity, and carat for that style earring.
After a bit of searching, Dr. Jung located a number of earring sets that he would consider purchasing. But he knew the pricing of diamonds varied considerably. To assist in his decision making, Dr. Jung decided to use regression analysis to develop a model to predict the retail price of different sets of round-cut earrings based on their color, clarity, and carat scores. He assembled the data in the file Diamonds.xls for this purpose. Use this data to answer the following questions for Dr. Jung.
1) Prepare scatter plots showing the relationship between the earring prices (Y) and each of the potential independent variables. What sort of relationship does each plot suggest?
2) Let X1, X2, and X3 represent diamond color, clarity, and carats, respectively. If Dr. Jung wanted to build a linear regression model to estimate earring prices using these variables, which variables would you recommend that he use? Why?
3) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
4) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
5) Dr. Jung now remembers that it sometimes helps to perform a square root transformation on the dependent variable in a regression problem. Modify your spreadsheet to include a new dependent variable that is the square root of the earring prices (use Excel’s SQRT( ) function). If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
6) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
7) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must actually square the model’s estimates to convert them to price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
8) Dr. Jung now also remembers that it sometimes helps to include interaction terms in a regression model—where you create a new independent variable as the product of two of the original variables. Modify your spreadsheet to include three new independent variables, X4, X5, and X6, representing interaction terms where: X4 = X1 × X2, X5 = X1 × X3, and X6 = X2 × X3. There are now six potential independent variables. If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
9) Suppose Dr. Jung decides to use color (X1), carats (X3) and the interaction terms X4 (color * clarity) and X5 (color * carats) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
10) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must square the model’s estimates to convert them to actual price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?

1. Find each of the requested values for a population with a mean of $$\mu = 40$$ and a standard deviation of $$\sigma = 8$$.
A. What is the z-score corresponding to $$X = 52$$?
B. What is the X value corresponding to $$z = -0.50$$?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 4$$ scores?
E. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 6$$ scores?
2. True or false:
a. All normal distributions are symmetrical
b. All normal distributions have a mean of 1.0
c. All normal distributions have a standard deviation of 1.0
d. The total area under the curve of all normal distributions is equal to 1
3. Interpret the location, direction, and distance (near or far) of the following z-scores:
a. -2.00 b. 1.25 c. 3.50 d. -0.34
4. You are part of a trivia team and have tracked your team’s performance since you started playing, so you know that your scores are normally distributed with $$\mu = 78$$ and $$\sigma = 12$$. Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks’ worth of score data: $$82, 74, 62, 68, 79, 94, 90, 81, 80$$.
5. You get hired as a server at a local restaurant, and the manager tells you that servers’ tips are $42 on average but vary about $12 ($$\mu = 42, \sigma = 12$$). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don’t know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the $$\alpha = 0.05$$ level of significance.

True or False
1. The goal of descriptive statistics is to simplify, summarize, and organize data.
2. A summary value, usually numerical, that describes a sample is called a parameter.
3. A researcher records the average age for a group of 25 preschool children selected to participate in a research study. The average age is an example of a statistic.
4. The median is the most commonly used measure of central tendency.
5. The mode is the best way to measure central tendency for data from a nominal scale of measurement.
6. A distribution of scores has a mean of 55 and a standard deviation of 4. The variance for this distribution is 16.
7. In a distribution with a mean of M = 36 and a standard deviation of SD = 8, a score of 40 would be considered an extreme value.
8. In a distribution with a mean of M = 76 and a standard deviation of SD = 7, a score of 91 would be considered an extreme value.
9. A negative correlation means that as the X values decrease, the Y values also tend to decrease.
10. The goal of a hypothesis test is to demonstrate that the patterns observed in the sample data represent real patterns in the population and are not simply due to chance or sampling error.

Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of $$\alpha = 0.05$$. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)
Lemons and Car Crashes Listed below are annual data for various years. The data are weights (metric tons) of lemons imported from Mexico and U.S. car crash fatality rates per 100,000 population [based on data from “The Trouble with QSAR (or How I Learned to Stop Worrying and Embrace Fallacy),” by Stephen Johnson, Journal of Chemical Information and Modeling, Vol. 48, No. 1]. Is there sufficient evidence to conclude that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates? Do the results suggest that imported lemons cause car fatalities?
$$\begin{matrix} \text{Lemon Imports} & 230 & 265 & 358 & 480 & 530 \\ \text{Crash Fatality Rate} & 15.9 & 15.7 & 15.4 & 15.3 & 14.9 \end{matrix}$$