Question

The National Oceanic and Atmospheric Administration publishes temperature information of cities around the world in Climates of the World.

Scatterplots
ANSWERED
asked 2021-03-02
The National Oceanic and Atmospheric Administration publishes temperature information of cities around the world in Climates of the World. A random sample of 50 cities gave the data on average high and low temperatures in January shown.

a) Obtain a scatterplot for the data.

b) Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).

c) Determine and interpret the regression equation for the data.

d) Identify potential outliers and influential observations.

e) In case a potential outlier is present, remove it and discuss the effect.

f) In case a potential influential observation is present, remove it and discuss the effect.

Answers (1)

2021-03-03

Given: \(n = \text{sample size} = 50\)

a) High is on the horizontal axis and Low is on the vertical axis.

(image: scatterplot of Low against High)

b) It is reasonable to find a regression line for the data if there is no strong curvature present in the scatterplot. The scatterplot in part (a) shows no strong curvature, so finding a regression line for the data is reasonable.

c) Let x be the average high temperature and y the average low temperature. First determine the necessary sums:

\(\displaystyle \sum x_i = 2843\)
\(\displaystyle \sum x_i^2 = 181233\)
\(\displaystyle \sum y_i = 2228\)
\(\displaystyle \sum x_i y_i = 144636\)

Next, determine \(S_{xx}\) and \(S_{xy}\):

\(\displaystyle S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = 181233 - \frac{2843^2}{50} = 19580.02\)
\(\displaystyle S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} = 144636 - \frac{2843 \cdot 2228}{50} = 17951.92\)

The estimate b of the slope \(\beta\) is the ratio of \(S_{xy}\) to \(S_{xx}\):

\(\displaystyle b = \frac{S_{xy}}{S_{xx}} = \frac{17951.92}{19580.02} = 0.9168\)

The mean is the sum of all values divided by the number of values:

\(\displaystyle \overline{x} = \frac{\sum x_i}{n} = \frac{2843}{50} = 56.86\)
\(\displaystyle \overline{y} = \frac{\sum y_i}{n} = \frac{2228}{50} = 44.56\)

The estimate a of the intercept \(\alpha\) is the mean of y decreased by the product of the estimated slope and the mean of x:

\(\displaystyle a = \overline{y} - b\,\overline{x} = 44.56 - 0.9168 \cdot 56.86 = -7.5720\)

(The value -7.5720 uses the unrounded slope; the rounded slope 0.9168 gives -7.5692.)

General least-squares equation: \(\hat{y} = \alpha + \beta x\). Replace \(\alpha\) by \(a = -7.5720\) and \(\beta\) by \(b = 0.9168\):

\(\displaystyle \hat{y} = a + bx = -7.5720 + 0.9168x\)

Interpretation: the predicted average January low temperature increases by about 0.92 degrees for each 1-degree increase in the average January high temperature.
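The arithmetic in part (c) can be checked with a short script. This is a minimal sketch that uses only the four sums and the sample size quoted in the answer; the variable names are illustrative and not part of the original solution.

```python
# Summary statistics quoted in the answer (x = High, y = Low, n = 50 cities).
n = 50
sum_x = 2843
sum_x2 = 181233
sum_y = 2228
sum_xy = 144636

# Corrected sum of squares and sum of cross-products.
s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least-squares slope and intercept.
b = s_xy / s_xx
x_bar = sum_x / n
y_bar = sum_y / n
a = y_bar - b * x_bar

print(f"S_xx = {s_xx:.2f}, S_xy = {s_xy:.2f}")
print(f"b = {b:.4f}, a = {a:.4f}")
```

Because the script carries the slope at full precision into the intercept, it reproduces the intercept reported in the answer rather than the slightly different value obtained from the rounded slope.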

d) There appear to be no outliers, because no points in the graph deviate strongly from the general pattern of the other points. There appear to be no influential observations, because all data values lie near the regression line.

e) Not applicable, because we concluded that there are no potential outliers in part (d).

f) Not applicable, because we concluded that there are no potential influential observations in part (d).
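Parts (d)-(f) are judged by eye from the scatterplot. If the raw (High, Low) pairs are available, a numerical screen is one way to support that judgment. The sketch below is an assumption-laden illustration, not the method used in the answer: the function names, the cutoff of 2 for standardized residuals, and the leverage cutoff of 4/n are common rules of thumb chosen here, and the data in the usage lines are hypothetical because the 50 city values are not reproduced in this thread.

```python
import math


def regression_diagnostics(x, y):
    """Return (leverage, standardized residuals) for a simple linear fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom.
    s_e = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each point: large when x_i is far from the mean of x.
    leverage = [1 / n + (xi - x_bar) ** 2 / s_xx for xi in x]
    standardized = [
        e / (s_e * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)
    ]
    return leverage, standardized


def flag_points(x, y, resid_cutoff=2.0):
    """Flag indices of potential outliers and high-leverage points.

    Cutoffs are rules of thumb (an assumption, not from the source):
    |standardized residual| > 2 and leverage > 4/n.
    """
    leverage, standardized = regression_diagnostics(x, y)
    h_cutoff = 4 / len(x)
    outliers = [i for i, r in enumerate(standardized) if abs(r) > resid_cutoff]
    high_leverage = [i for i, h in enumerate(leverage) if h > h_cutoff]
    return outliers, high_leverage


# Hypothetical data: the last point lies far from the others in x.
x_demo = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y_demo = [1.1, 1.9, 3.2, 3.8, 5.1, 6.0, 6.9, 8.2, 8.8, 30.0]
print(flag_points(x_demo, y_demo))
```

A high-leverage point is only a candidate for an influential observation; to confirm it, refit the line without that point and compare the slope and intercept, which is what parts (e) and (f) ask for.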

Relevant Questions

asked 2020-12-30
Use the technology of your choice to do the following tasks. The National Oceanic and Atmospheric Administration publishes temperature and precipitation information for cities around the world in Climates of the World. Data on average high temperature (in degrees Fahrenheit) in July and average precipitation (in inches) in July for 48 cities are on the WeissStats CD. For part (d), predict the average July precipitation of a city with an average July temperature of \(83^\circ F\).

a) Construct and interpret a scatterplot for the data.

b) Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).

c) Determine and interpret the regression equation.

d) Make the indicated predictions.

e) Compute and interpret the correlation coefficient.

f) Identify potential outliers and influential observations.

asked 2021-01-07

The U.S. Census Bureau publishes information on the population of the United States in Current Population Reports. The following table gives the resident U.S. population, in millions of persons, for the years 1990-2009. Forecast the U.S. population in the years 2010 and 2011.

\(\begin{array}{|c|c|} \hline \text{Year} & \text{Population (millions)} \\ \hline 1990 & 250 \\ \hline 1991 & 253\\ \hline 1992 & 257\\ \hline 1993 & 260\\ \hline 1994 & 263\\ \hline 1995 & 266\\ \hline 1996 & 269\\ \hline 1997 & 273\\ \hline 1998 & 276\\ \hline 1999 & 279\\ \hline 2000 & 282\\ \hline 2001 & 285\\ \hline 2002 & 288\\ \hline 2003 & 290\\ \hline 2004 & 293\\ \hline 2005 & 296\\ \hline 2006 & 299\\ \hline 2007 & 302\\ \hline 2008 & 304\\ \hline 2009 & 307\\ \hline \end{array}\)

a) Obtain a scatterplot for the data.

b) Find and interpret the regression equation.

c) Make the specified forecasts.
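One way to approach parts (b) and (c) of this question is sketched below, assuming an ordinary least-squares line with year as the predictor and the population values copied from the table. The variable and function names are illustrative.

```python
# Resident U.S. population in millions, 1990-2009, from the table above.
years = list(range(1990, 2010))
population = [
    250, 253, 257, 260, 263, 266, 269, 273, 276, 279,
    282, 285, 288, 290, 293, 296, 299, 302, 304, 307,
]

n = len(years)
x_bar = sum(years) / n
y_bar = sum(population) / n

# Mean-centered sums keep the arithmetic stable for large year values.
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, population))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar


def forecast(year):
    """Predicted population (millions) from the fitted line."""
    return intercept + slope * year


for year in (2010, 2011):
    print(year, round(forecast(year), 1))
```

For this table the fitted slope is about 3.0 million persons per year, and the line forecasts roughly 311.1 million for 2010 and 314.1 million for 2011. Both forecasts extrapolate just beyond the observed years, so they rely on the linear trend continuing.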

asked 2021-06-24
State whether the investigation in question is an observational study or a designed experiment. Justify your answer in each case. The National Association of Colleges and Employers (NACE) compiles information on salary offers to new college graduates and publishes the results in Salary Survey.
asked 2021-07-04

Using the daily high and low temperature readings at Chicago's O'Hare International Airport for an entire year, a meteorologist made a scatterplot relating y = high temperature to x = low temperature, both in degrees Fahrenheit.

After verifying that the conditions for the regression model were met, the meteorologist calculated the equation of the population regression line to be \(\mu_y = 16.6 + 1.02x\) with \(\sigma = 6.6^\circ F\).

If the meteorologist used a random sample of 10 days to calculate the regression line instead of using all the days in the year, would the slope of the sample regression line be exactly 1.02? Explain your answer.

...