The ability to estimate the volume of a tree based on a simple measurement, such as the tree’s diameter, is important to the lumber industry, ecologists, and conservationists. Data on volume, in cubic feet, and diameter at breast height, in inches, for 70 shortleaf pines were reported in C. Bruce and F. X. Schumacher’s Forest Mensuration (New York: McGraw-Hill, 1935) and analyzed by A. C. Atkinson in the article “Transforming Both Sides of a Tree” (The American Statistician, Vol. 48, pp. 307–312).
a) Obtain a scatterplot for the data.
b) Decide whether finding a regression line for the data is reasonable. If so, then also do parts (c)-(f).
c) Determine and interpret the regression equation for the data.
d) Identify potential outliers and influential observations.
e) In case a potential outlier is present, remove it and discuss the effect.
f) In case a potential influential observation is present, remove it and discuss the effect.


Fatema Sutton · Accepted Answer

Given: sample size $n = 70$.

a) In the scatterplot, diameter is on the horizontal axis and volume is on the vertical axis.

b) It is reasonable to find a regression line for the data if there is no strong curvature present in the scatterplot. There is no strong curvature in the scatterplot of part (a), and thus it is reasonable to find a regression line for the data.

c) Let us first determine the necessary sums:

$$\sum x_i = 782.9, \qquad \sum x_i^2 = 9934.65, \qquad \sum y_i = 2442.7, \qquad \sum x_i y_i = 35376.74$$

Next, we can determine $S_{xx}$ and $S_{xy}$:

$$S_{xx} = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = 9934.65 - \frac{782.9^2}{70} = 1178.4727$$

$$S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} = 35376.74 - \frac{782.9 \cdot 2442.7}{70} = 8056.8853$$

The estimate $b$ of the slope $\beta$ is the ratio of $S_{xy}$ and $S_{xx}$:

$$b = \frac{S_{xy}}{S_{xx}} = \frac{8056.8853}{1178.4727} = 6.8367$$

The mean is the sum of all values divided by the number of values:

$$\bar{x} = \frac{\sum x_i}{n} = \frac{782.9}{70} = 11.1843, \qquad \bar{y} = \frac{\sum y_i}{n} = \frac{2442.7}{70} = 34.8957$$

The estimate $a$ of the intercept $\alpha$ is the average of $y$ decreased by the product of the estimate of the slope and the average of $x$:

$$a = \bar{y} - b\,\bar{x} = 34.8957 - 6.8367 \cdot 11.1843 = -41.5681$$

General least-squares equation:

$$\hat{y} = \alpha + \beta x$$

Replace $\alpha$ by $a = -41.5681$ and $\beta$ by $b = 6.8367$ in the general least-squares equation:

$$\hat{y} = a + bx = -41.5681 + 6.8367x$$

d) There appears to be one outlier, because the point in the top right corner of the scatterplot lies farther to the right than all other points in the scatterplot. There appear to be no influential observations besides the outlier, because all data values lie near the regression line except for the outlier.

e) Let us first determine the necessary sums:

$$\sum x_i = 759.5, \qquad \sum x_i^2 = 9387.09, \qquad \sum y_i = 2279.2, \qquad \sum x_i y_i = 31550.84$$

Next, we can determine
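The computations in parts (c) and (e) follow the same recipe, so they can be checked with a few lines of Python. This is only a sketch working from the summary sums quoted in the answer, not from the raw data. The names `Sums` and `fit_line` are made up for illustration, and the sample size of 69 for part (e) is an assumption based on a single point having been removed.

```python
from dataclasses import dataclass


@dataclass
class Sums:
    """Summary sums for a simple linear regression (hypothetical helper type)."""
    n: int
    sum_x: float
    sum_x2: float
    sum_y: float
    sum_xy: float


def fit_line(s: Sums) -> tuple[float, float]:
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    sxx = s.sum_x2 - s.sum_x ** 2 / s.n          # S_xx
    sxy = s.sum_xy - s.sum_x * s.sum_y / s.n     # S_xy
    b = sxy / sxx                                # slope estimate
    a = s.sum_y / s.n - b * (s.sum_x / s.n)      # intercept: y-bar - b * x-bar
    return a, b


# Sums quoted in part (c), all 70 trees.
full = Sums(n=70, sum_x=782.9, sum_x2=9934.65, sum_y=2442.7, sum_xy=35376.74)

# Sums quoted in part (e); n = 69 is assumed (one tree removed).
reduced = Sums(n=69, sum_x=759.5, sum_x2=9387.09, sum_y=2279.2, sum_xy=31550.84)

a_full, b_full = fit_line(full)
a_red, b_red = fit_line(reduced)

# The removed point can be recovered from the difference of the sums.
removed_x = full.sum_x - reduced.sum_x
removed_y = full.sum_y - reduced.sum_y

print(f"all 70 trees:   y-hat = {a_full:.4f} + {b_full:.4f} x")
print(f"outlier removed: y-hat = {a_red:.4f} + {b_red:.4f} x")
print(f"removed point:  diameter = {removed_x:.1f}, volume = {removed_y:.1f}")
```

The first printed line should agree with the equation from part (c) up to rounding. Comparing it with the second line shows how much the slope and intercept shift when the point in the top right corner is dropped, which is the comparison part (e) asks for.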
