# Using the daily high and low temperature readings at Chicago's O'Hare International Airport for an entire year, a meteorologist made a scatterplot relating y = high temperature to x = low temperature, both in degrees Fahrenheit. After verifying that the conditions for the regression model were met, the meteorologist calculated the equation of the population regression line to be $$\mu_y=16.6+1.02x$$ with $$\sigma=6.64^{\circ}$$ F. About what percent of days with a low temperature of $$40^{\circ}$$ F have a high temperature greater than $$70^{\circ}$$ F?

Question
Normal distributions
Using the daily high and low temperature readings at Chicago's O'Hare International Airport for an entire year, a meteorologist made a scatterplot relating y = high temperature to x = low temperature, both in degrees Fahrenheit.
After verifying that the conditions for the regression model were met, the meteorologist calculated the equation of the population regression line to be $$\displaystyle\mu_{y}=16.6+1.02x$$ with $$\displaystyle\sigma=6.64^{\circ}$$ F.
About what percent of days with a low temperature of $$\displaystyle{40}^{\circ}$$ F have a high temperature greater than $$\displaystyle{70}^{\circ}$$ F?

2020-12-29
Step 1
Given:
$$\displaystyle{\left[\mu_{{y}}={16.6}+{1.02}{x}\right]}$$ (equation of the population regression line)
$$\displaystyle{\left[\sigma={6.64}\right]}$$
The average high temperature on days where the low temperature is $$\displaystyle{40}^{\circ}$$ F can be found from the population regression line by replacing x in the regression line equation with 40 and evaluating.
$$\displaystyle{\left[\mu_{{y}}={16.6}+{1.02}{\left({40}\right)}={16.6}+{40.8}={57.4}\right]}$$
Thus the mean is 57.4 and the standard deviation is 6.64.
Since the conditions are met, the response y varies according to a Normal distribution.
The z-score is the value decreased by the mean, divided by the standard deviation.
$$\displaystyle{\left[{z}={\frac{{{x}-\mu}}{{\sigma}}}={\frac{{{70}-{57.4}}}{{{6.64}}}}\approx{1.90}\right]}$$
Determine the corresponding probability using the standard normal probability table in the appendix. $$\displaystyle{\left[{P}{\left({Z}{<}{1.90}\right)}\right]}$$ is given in the row starting with 1.9 and the column starting with .00 of that table.
P(X>70)=P(Z>1.90)
=1-P(Z<1.90)
=1-0.9713
=0.0287
=2.87%
Thus about 2.87% of the days with a low temperature of $$\displaystyle{40}^{\circ}$$ F are expected to have a high temperature that is greater than $$\displaystyle{70}^{\circ}$$ F.
Result: 2.87%
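
As a quick check, the arithmetic in the solution above can be reproduced in a few lines of Python. This is only a sketch: it takes the intercept, slope, and standard deviation from the solution, uses the standard library's `statistics.NormalDist` (Python 3.8+) in place of the printed table, and the variable names are illustrative choices, not part of the original problem.

```python
from statistics import NormalDist

# Values taken from the worked solution above
intercept, slope = 16.6, 1.02  # population regression line: mu_y = 16.6 + 1.02x
sigma = 6.64                   # standard deviation of y about the line (degrees F)
low_temp = 40.0                # x: the given low temperature
threshold = 70.0               # high temperature of interest

# Mean high temperature on days with this low temperature
mean_high = intercept + slope * low_temp

# Standardize the threshold
z = (threshold - mean_high) / sigma

# Upper-tail probability without rounding z
p_exact = 1 - NormalDist(mean_high, sigma).cdf(threshold)

# Upper-tail probability with z rounded to two decimals, as when reading a table
p_table = 1 - NormalDist().cdf(round(z, 2))

print(f"mean high temperature: {mean_high:.1f}")
print(f"z-score: {z:.4f}")
print(f"P(high > 70), unrounded z: {p_exact:.4f}")
print(f"P(high > 70), z rounded to 2 decimals: {p_table:.4f}")
```

Rounding z to two decimals before the lookup mirrors the table-based calculation; the unrounded probability comes out slightly larger but still rounds to about 3%.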

### Relevant Questions

Using the health records of every student at a high school, the school nurse created a scatterplot relating y = height (in centimeters) to x = age (in years).
After verifying that the conditions for the regression model were met, the nurse calculated the equation of the population regression line to be $$\displaystyle\mu_{y}=105+4.2x$$ with $$\displaystyle\sigma=7$$ cm.
About what percent of 15-year-old students at this school are taller than 180 cm?
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of $$25^{\circ}F$$. However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to $$25^{\circ}F$$. One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a $$5\%$$ level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
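$$H_0: \sigma_{1}^{2}=\sigma_{2}^{2},\ H_1: \sigma_{1}^{2}>\sigma_{2}^{2}$$
$$H_0: \sigma_{1}^{2}=\sigma_{2}^{2},\ H_1: \sigma_{1}^{2}\neq\sigma_{2}^{2}$$
$$H_0: \sigma_{1}^{2}=\sigma_{2}^{2},\ H_1: \sigma_{1}^{2}<\sigma_{2}^{2}$$
$$H_0: \sigma_{1}^{2}>\sigma_{2}^{2},\ H_1: \sigma_{1}^{2}=\sigma_{2}^{2}$$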
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
$$df_{N} = ?$$
$$df_{D} = ?$$
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Use the technology of your choice to do the following tasks. The National Oceanic and Atmospheric Administration publishes temperature and precipitation information for cities around the world in Climates of the World. Data on average high temperature (in degrees Fahrenheit) in July and average precipitation (in inches) in July for 48 cities are on the WeissStats CD. For part (d), predict the average July precipitation of a city with an average July temperature of $$\displaystyle{83}^{{\circ}}{F}$$.
a) Construct and interpret a scatterplot for the data.
b) Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c) Determine and interpret the regression equation.
d) Make the indicated predictions.
e) Compute and interpret the correlation coefficient.
f) Identify potential outliers and influential observations.
1. Find each of the requested values for a population with a mean of $$\mu = 40$$ and a standard deviation of $$\sigma = 8$$.
A. What is the z-score corresponding to $$X = 52$$?
B. What is the X value corresponding to $$z = -0.50$$?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 4$$ scores?
E. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 6$$ scores?
2. True or false:
a. All normal distributions are symmetrical.
b. All normal distributions have a mean of 1.0.
c. All normal distributions have a standard deviation of 1.0.
d. The total area under the curve of all normal distributions is equal to 1.
3. Interpret the location, direction, and distance (near or far) of the following z-scores:
a. $$-2.00$$
b. $$1.25$$
c. $$3.50$$
d. $$-0.34$$
4. You are part of a trivia team and have tracked your team’s performance since you started playing, so you know that your scores are normally distributed with $$\mu = 78$$ and $$\sigma = 12$$. Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks’ worth of score data: $$82, 74, 62, 68, 79, 94, 90, 81, 80$$.
5. You get hired as a server at a local restaurant, and the manager tells you that servers’ tips are $42 on average but vary about $12 ($$\mu = 42$$, $$\sigma = 12$$). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don’t know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the $$\alpha = 0.05$$ level of significance.
The table below shows the number of people for three different race groups who were shot by police that were either armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.
| | Black | White | Hispanic | Total |
|---|---|---|---|---|
| Suspect was armed | 543 | 1176 | 378 | 2097 |
| Suspect was unarmed | 60 | 67 | 38 | 165 |
| Total | 603 | 1243 | 416 | 2262 |
Give your answer as a decimal to at least three decimal places.
a) What percent are Black?
b) What percent are Unarmed?
c) In order for two variables to be independent of each other, $$P(A \text{ and } B) = P(A) \cdot P(B)$$.
This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).
Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).
Remember, the previous answer is only correct if the variables are Independent.
d) Now let's get the real percent that are Black and Unarmed by using the table.
If answer c is "significantly different" than answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
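
The comparison described in parts (c) and (d) can be sketched in Python as follows. The counts are copied from the table in this question; the dictionary layout and variable names are illustrative assumptions, and the sketch only computes the two proportions without judging whether their difference is statistically significant.

```python
# Counts copied from the table in the question (armed / unarmed, by group)
counts = {
    "Black": {"armed": 543, "unarmed": 60},
    "White": {"armed": 1176, "unarmed": 67},
    "Hispanic": {"armed": 378, "unarmed": 38},
}

# Grand total of people in the table
total = sum(sum(row.values()) for row in counts.values())

# Marginal proportions: P(Black) and P(Unarmed)
p_black = sum(counts["Black"].values()) / total
p_unarmed = sum(row["unarmed"] for row in counts.values()) / total

# Part (c): what P(Black and Unarmed) would be IF the variables were independent
expected_if_independent = p_black * p_unarmed

# Part (d): the proportion actually observed in the table
observed = counts["Black"]["unarmed"] / total

print(f"P(Black) = {p_black:.3f}")
print(f"P(Unarmed) = {p_unarmed:.3f}")
print(f"P(Black) * P(Unarmed) = {expected_if_independent:.3f}")
print(f"P(Black and Unarmed) from the table = {observed:.3f}")
```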
Let's compare the percentage of unarmed shot for each race.
e) What percent are White and Unarmed?
f) What percent are Hispanic and Unarmed?
If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.
Why is that?
This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.
Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades.
The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.
g) What percent of blacks shot and killed by police were unarmed?
h) What percent of whites shot and killed by police were unarmed?
i) What percent of Hispanics shot and killed by police were unarmed?
You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
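
The conditional proportions asked for in parts (g) to (i) divide each group's unarmed count by that group's own total rather than by the grand total. A minimal Python sketch, again using the counts from the table in this question with illustrative names:

```python
# Counts copied from the table in the question (armed / unarmed, by group)
counts = {
    "Black": {"armed": 543, "unarmed": 60},
    "White": {"armed": 1176, "unarmed": 67},
    "Hispanic": {"armed": 378, "unarmed": 38},
}

# P(Unarmed | group): restrict attention to one group, then take the unarmed share
conditional = {
    group: row["unarmed"] / (row["armed"] + row["unarmed"])
    for group, row in counts.items()
}

for group, p in conditional.items():
    print(f"P(Unarmed | {group}) = {p:.3f}")

# Ratio behind the "approximately twice" comparison of parts (g) and (h)
ratio = conditional["Black"] / conditional["White"]
print(f"Black-to-White ratio of conditional proportions = {ratio:.2f}")
```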
j) Why do you believe this is happening?
Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.
American automobiles produced in 2012 and classified as “large” had a mean fuel economy of 19.6 miles per gallon with a standard deviation of 3.36 miles per gallon. A particular model on this list was rated at 23 miles per gallon, giving it a z-score of about 1.01. Which statement is true based on this information?
A) Because the standard deviation is small compared to the mean, a Normal model is appropriate and we can say that about 84.4% of “large” automobiles have a fuel economy of 23 miles per gallon or less.
B) Because a z-score was calculated, it is appropriate to use a Normal model to say that about 84.4% of “large” automobiles have a fuel economy of 23 miles per gallon or less.
C) Because 23 miles per gallon is greater than the mean of 19.6 miles per gallon, the distribution is skewed to the right. This means the z-score cannot be used to calculate a proportion.
D) Because no information was given about the shape of the distribution, it is not appropriate to use the z-score to calculate the proportion of automobiles with a fuel economy of 23 miles per gallon or less.
E) Because no information was given about the shape of the distribution, it is not appropriate to calculate a z-score, so the z-score has no meaning in this situation.
Replacement of paint on highways and streets represents a large investment of funds by state and local governments each year. A new, cheaper brand of paint is tested for durability after one month’s time by reflectometer readings. For the new brand to be acceptable, it must have a mean reflectometer reading greater than 19.6. The sample data, based on 35 randomly selected readings, show $$\bar{x} = 19.8$$ and $$s = 1.5$$. Do the sample data provide sufficient evidence to conclude that the new brand is acceptable? Conduct a hypothesis test using $$\alpha = 0.05$$. Use both the traditional approach and the p-value approach to hypothesis testing. Show all of the steps of the hypothesis test for each approach.
Case: Dr. Jung’s Diamonds Selection
With Christmas coming, Dr. Jung became interested in buying diamonds for his wife. After perusing the Web, he learned about the “4Cs” of diamonds: cut, color, clarity, and carat. He knew his wife wanted round-cut earrings mounted in white gold settings, so he immediately narrowed his focus to evaluating color, clarity, and carat for that style earring.
After a bit of searching, Dr. Jung located a number of earring sets that he would consider purchasing. But he knew the pricing of diamonds varied considerably. To assist in his decision making, Dr. Jung decided to use regression analysis to develop a model to predict the retail price of different sets of round-cut earrings based on their color, clarity, and carat scores. He assembled the data in the file Diamonds.xls for this purpose. Use this data to answer the following questions for Dr. Jung.
1) Prepare scatter plots showing the relationship between the earring prices (Y) and each of the potential independent variables. What sort of relationship does each plot suggest?
2) Let X1, X2, and X3 represent diamond color, clarity, and carats, respectively. If Dr. Jung wanted to build a linear regression model to estimate earring prices using these variables, which variables would you recommend that he use? Why?
3) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
4) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
5) Dr. Jung now remembers that it sometimes helps to perform a square root transformation on the dependent variable in a regression problem. Modify your spreadsheet to include a new dependent variable that is the square root of the earring prices (use Excel’s SQRT( ) function). If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
6) Suppose Dr. Jung decides to use clarity (X2) and carats (X3) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
7) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must actually square the model’s estimates to convert them to price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
8) Dr. Jung now also remembers that it sometimes helps to include interaction terms in a regression model—where you create a new independent variable as the product of two of the original variables. Modify your spreadsheet to include three new independent variables, X4, X5, and X6, representing interaction terms where: X4 = X1 × X2, X5 = X1 × X3, and X6 = X2 × X3. There are now six potential independent variables. If Dr. Jung wanted to build a linear regression model to estimate the square root of earring prices using the same independent variables as before, which variables would you recommend that he use? Why?
9) Suppose Dr. Jung decides to use color (X1), carats (X3) and the interaction terms X4 (color * clarity) and X5 (color * carats) as independent variables in a regression model to predict the square root of the earring prices. What is the estimated regression equation? What is the value of the R2 and adjusted-R2 statistics?
10) Use the regression equation identified in the previous question to create estimated prices for each of the earring sets in Dr. Jung’s sample. (Remember, your model estimates the square root of the earring prices. So you must square the model’s estimates to convert them to actual price estimates.) Which sets of earrings appear to be overpriced and which appear to be bargains? Based on this analysis, which set of earrings would you suggest that Dr. Jung purchase?
...