# Explain whether there are any outliers in these data or not. The 5-number summary for the run times in minutes of the 150 highest grossing movies of 2010 is provided.

Question
Explain whether there are any outliers in these data or not.
The 5-number summary for the run times in minutes of the 150 highest grossing movies of 2010 is provided.

2020-10-21
Calculation:
Interquartile range:
In contrast to the range, which measures only the difference between the extremes, the interquartile range (also called the midspread) is the difference between the third quartile and the first quartile. Thus, it measures the variation in the middle 50 percent of the data and, unlike the range, is not affected by extreme values: $$IQR = Q_{3} - Q_{1}$$.
From the given 5-number summary for the run times in minutes, $$Q_{1} = 98$$ and $$Q_{3} = 116$$.
$$IQR = Q_{3} - Q_{1}$$
$$= 116-98=18$$
Thus, the interquartile range is 18.
Outlier Rule:
If any observation is greater than $$Q_{3} + 1.5 \times IQR$$ or less than $$Q_{1} - 1.5 \times IQR$$, then that observation is considered a high or low outlier, respectively.
$$\text{Upper fence} = Q_{3} + 1.5 \times IQR$$
$$= 116 + 1.5(18) = 116 + 27 = 143$$
Here the maximum value, 160, is greater than the upper fence of 143. Hence, 160 is a high outlier.
$$\text{Lower fence} = Q_{1} - 1.5 \times IQR$$
$$= 98 - 1.5(18)$$
$$= 98 - 27 = 71$$
Here the minimum value, 43, is below the lower fence of 71. Hence, 43 is a low outlier.
Thus, there is at least one high outlier and at least one low outlier in these data.
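The fence calculation above can be sketched in Python. This is a minimal illustration rather than part of the original solution: the function name `tukey_fences` is hypothetical, and the minimum (43) and maximum (160) are taken from the solution text, since the 5-number summary itself is not reproduced here.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the 1.5 x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Values quoted in the solution above.
q1, q3 = 98, 116
minimum, maximum = 43, 160

lower, upper = tukey_fences(q1, q3)
has_low_outlier = minimum < lower    # is the smallest run time below the lower fence?
has_high_outlier = maximum > upper   # is the largest run time above the upper fence?

print(f"IQR = {q3 - q1}, fences = ({lower}, {upper})")
print(f"low outlier: {has_low_outlier}, high outlier: {has_high_outlier}")
```

With only a 5-number summary, this check can confirm that the extremes are outliers but cannot say how many other run times fall outside the fences.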

### Relevant Questions

The table below shows the number of people for three different race groups who were shot by police that were either armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.

| | Black | White | Hispanic | Total |
|---|---|---|---|---|
| Suspect was armed | 543 | 1176 | 378 | 2097 |
| Suspect was unarmed | 60 | 67 | 38 | 165 |
| Total | 603 | 1243 | 416 | 2262 |

Give your answer as a decimal to at least three decimal places.
a) What percent are Black?
b) What percent are Unarmed?
c) For two variables to be independent of each other, we need $$P(A \text{ and } B) = P(A) \cdot P(B)$$.
This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).
Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).
Remember, the previous answer is only correct if the variables are Independent.
d) Now let's get the real percent that are Black and Unarmed by using the table.
If answer c is "significantly different" than answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
Let's compare the percentage of unarmed shot for each race.
e) What percent are White and Unarmed?
f) What percent are Hispanic and Unarmed?
If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.
Why is that?
This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.
Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades.
The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.
g) What percent of blacks shot and killed by police were unarmed?
h) What percent of whites shot and killed by police were unarmed?
i) What percent of Hispanics shot and killed by police were unarmed?
You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
j) Why do you believe this is happening?
Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.
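The joint and conditional percentages discussed in parts a through i can be sketched in Python. This is an illustrative calculation using the counts from the table in this question; the variable names are assumptions, and the results are expressed as decimals rather than percents.

```python
# Counts from the table in the question (armed / unarmed by race).
counts = {
    "Black": {"armed": 543, "unarmed": 60},
    "White": {"armed": 1176, "unarmed": 67},
    "Hispanic": {"armed": 378, "unarmed": 38},
}

grand_total = sum(sum(row.values()) for row in counts.values())
unarmed_total = sum(row["unarmed"] for row in counts.values())

# Marginal proportions (parts a and b) and their product (part c).
p_black = sum(counts["Black"].values()) / grand_total
p_unarmed = unarmed_total / grand_total
p_if_independent = p_black * p_unarmed

# Joint proportions: share of everyone in the table (parts d, e, f).
joint = {race: row["unarmed"] / grand_total for race, row in counts.items()}

# Conditional proportions: share within each race group (parts g, h, i).
conditional = {
    race: row["unarmed"] / sum(row.values()) for race, row in counts.items()
}

print(f"P(Black) = {p_black:.3f}, P(Unarmed) = {p_unarmed:.3f}")
print(f"P(Black) * P(Unarmed) = {p_if_independent:.4f}")
for race in counts:
    print(f"{race}: joint = {joint[race]:.4f}, conditional = {conditional[race]:.4f}")
```

The joint values divide by the grand total, so they grow with the size of each group, while the conditional values divide by each group's own total, which is why they are the fairer comparison.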
Which statement best characterizes the definitions of categorical and quantitative data?
Quantitative data consist of numbers, whereas categorical data consist of names and labels that are not numeric.
Quantitative data consist of numbers representing measurements or counts, whereas categorical data consist of names or labels.
Quantitative data consist of values that can be arranged in order, whereas categorical data consist of values that cannot be arranged in order.
Quantitative data have an uncountable number of possible values, whereas categorical data have a countable number of possible values.
Give the correct choices for the multiple-choice questions in (a) and (b) and explain your choices (for example: why quantitative and not qualitative? Why neither and not discrete or continuous? Why ratio and not nominal, ordinal, or interval?).
a. Question: Birth years of your family? Are these data quantitative or qualitative? Are these data discrete, continuous, or neither? What is the highest level of measurement of birth years? (Nominal, Ordinal, Interval, or Ratio?)
b. Question: Survey responses to the question “what is the gender of your first child?” Are these data quantitative or qualitative? Are these data discrete, continuous, or neither? What is the highest level of measurement associated with the gender measurements? (Nominal, Ordinal, Interval, or Ratio?)
Describe the spread in running times based on quartiles.
The data represents the average drive (in yards) for 186 professional golfers on the men’s PGA tour in 2011. The first quartile is 285.2 yards and third quartile is 297.5 yards.
Geographical Analysis (Oct. 2006) published a study of a new method for analyzing remote-sensing data from satellite pixels in order to identify urban land cover. The method uses a numerical measure of the distribution of gaps, or the sizes of holes, in the pixel, called lacunarity. Summary statistics for the lacunarity measurements in a sample of 100 grassland pixels are $$\bar{x} = 225$$ and $$s = 20$$. It is known that the mean lacunarity measurement for all grassland pixels is 220. The method will be effective in identifying land cover if the standard deviation of the measurements is 10% (or less) of the true mean (i.e., if the standard deviation is less than 22).
a. Give the null and alternative hypotheses for a test to determine whether, in fact, the standard deviation of all grassland pixels is less than 22.
b. A MINITAB analysis of the data is provided below. Locate and interpret the p-value of the test. Use $$\alpha = .10$$.
Test for One Standard Deviation
Method
Null hypothesis: Sigma = 22
Alternative hypothesis: Sigma < 22
The standard method is only for the normal distribution.
Statistics

| N | StDev | Variance |
|---|---|---|
| 100 | 20.0 | 400 |

Tests
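The MINITAB output above stops at the Tests heading, so the reported p-value is not available here. As a rough check, the chi-square statistic for a one-sample test of a standard deviation can be computed from the summary statistics. The sketch below assumes normally distributed data, as the output itself notes, and uses the Wilson-Hilferty normal approximation for the lower-tail p-value so that only the Python standard library is needed. The approximation is a substitute technique, not what MINITAB computes, and the variable names are illustrative.

```python
from statistics import NormalDist

n = 100        # sample size
s = 20.0       # sample standard deviation
sigma0 = 22.0  # hypothesized standard deviation under the null

df = n - 1
chi_sq = df * s**2 / sigma0**2  # test statistic: (n - 1) s^2 / sigma0^2

# Wilson-Hilferty approximation: (X / df)^(1/3) is roughly normal with
# mean 1 - 2/(9 df) and variance 2/(9 df) when X is chi-square with df degrees of freedom.
wh_mean = 1 - 2 / (9 * df)
wh_sd = (2 / (9 * df)) ** 0.5
z = ((chi_sq / df) ** (1 / 3) - wh_mean) / wh_sd

# Lower-tail probability, because the alternative is Sigma < 22.
p_approx = NormalDist().cdf(z)

print(f"chi-square = {chi_sq:.3f} on {df} df")
print(f"approximate lower-tail p-value = {p_approx:.3f}")
```

An exact p-value would come from the chi-square distribution with 99 degrees of freedom; the approximation is only meant to indicate roughly where it falls relative to $$\alpha = .10$$.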
The manager at Publix recently received information that customer satisfaction dropped at noon due to overcrowding in the checkout aisle. As a result, the manager went to the main floor to record the number of customers waiting in aisles 1-10 at noon.
Which of the following choices would be an accurate description of the way the "number of customers" is used in this data set?
a. individuals for the data set
b. continuous qualitative variable for this data set
c. discrete qualitative variable for this data set
d. qualitative variable for this set
e. continuous quantitative variable for this data set
f. discrete quantitative variable for this data set
Iron is very important for babies' growth. A common belief is that breastfeeding helps a baby get more iron than formula feeding. To test this belief, a study followed two groups of babies from birth to 6 months. In one group the babies were breast-fed, and in the other group they were formula-fed without iron supplements. The data below show the iron levels of the two groups of babies.

| Group | Sample size | Mean | Standard deviation |
|---|---|---|---|
| Breast-fed | 23 | 13.3 | 1.7 |
| Formula-fed | 23 | 12.4 | 1.8 |
| DIFF = Breast - Formula | 23 | 0.9 | 1.4 |

(1) There are two groups we need to compare for the study: Breast-Fed and Formula-Fed. Are those two groups dependent or independent? Based on your answer, what inference procedure should we apply for this research?
(2) Please perform the inference you decided on in (1), and make sure to follow the 5-step procedure for any hypothesis test.
(3) Based on your conclusion in (2), what kind of error could you make? Explain the type of error in the context of this research.
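The choice between dependent and independent samples in part (1) changes the test statistic considerably. As a hedged illustration, the sketch below computes the t statistic both ways from the summary values in this question: once treating the babies as matched pairs (using the DIFF row) and once treating the groups as independent (an unpooled two-sample statistic). Which analysis is appropriate is exactly what part (1) asks, so neither is presented as the answer; the variable names are assumptions.

```python
import math

n = 23                           # sample size in each row of the summary
mean_bf, sd_bf = 13.3, 1.7       # breast-fed group
mean_ff, sd_ff = 12.4, 1.8       # formula-fed group
mean_diff, sd_diff = 0.9, 1.4    # DIFF = Breast - Formula

# Assumption 1: matched pairs, so the test uses the differences (df = n - 1).
t_paired = mean_diff / (sd_diff / math.sqrt(n))

# Assumption 2: independent groups, unpooled (Welch-style) standard error.
se_independent = math.sqrt(sd_bf**2 / n + sd_ff**2 / n)
t_independent = (mean_bf - mean_ff) / se_independent

print(f"paired t = {t_paired:.3f} (df = {n - 1})")
print(f"independent-samples t = {t_independent:.3f}")
```

The paired statistic is larger because the standard deviation of the differences (1.4) is smaller than the variability implied by treating the two groups separately.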
Is the gift you purchased for that special someone really appreciated? This was the question investigated in the Journal of Experimental Social Psychology (Vol. 45, 2009). The researchers examined the link between engagement ring price (dollars) and level of appreciation of the recipient (measured on a 7-point scale where 1 = "not at all" and 7 = "to a great extent"). Participants for the study were those who used a popular Web site for engaged couples. The Web site's directory was searched for those with "average" American names (e.g., "John Smith," "Sara Jones"). These individuals were then invited to participate in an online survey in exchange for a \$10 gift certificate. Of the respondents, those who paid really high or really low prices for the ring were excluded, leaving a sample size of 33 respondents.
a) Identify the experimental units for this study.
b) What are the variables of interest? Are they quantitative or qualitative in nature?
c) Describe the population of interest.
d) Do you believe the sample of 33 respondents is representative of the population? Explain.
e) In a second, designed study, the researchers investigated whether the link between gift price and level of appreciation was stronger for birthday gift givers than for birthday gift receivers. The participants were randomly assigned to play the role of gift-giver or gift-receiver. Assume that the sample consists of 50 individuals. Use a random number generator to randomly assign 25 individuals to play the gift-receiver role and 25 to play the gift-giver role.
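The random assignment described in the last part of this question can be sketched in Python. This is one possible approach, assuming the 50 participants are simply labeled 1 through 50; the function name `assign_roles` and the seed value are hypothetical and only make the example reproducible.

```python
import random


def assign_roles(participant_ids, seed=None):
    """Shuffle the participants and split them into two equal-sized role groups."""
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    receivers = sorted(shuffled[:half])  # first half plays gift-receiver
    givers = sorted(shuffled[half:])     # second half plays gift-giver
    return receivers, givers


# Participants labeled 1..50 (an assumed labeling).
receivers, givers = assign_roles(range(1, 51), seed=2009)

print("Gift-receivers:", receivers)
print("Gift-givers:   ", givers)
```

Shuffling the full list and splitting it in half guarantees exactly 25 participants per role, which independent coin flips for each person would not.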