# College Statistics: modeling data distributions solving

# Recent questions in Modeling data distributions

Modeling data distributions
ANSWERED ### 1. The standard error of the estimate is the same at all points along the regression line because we assumed that
A. The observed values of y are normally distributed around each estimated value of y-hat.
B. The variance of the distributions around each possible value of y-hat is the same.
C. All available data were taken into account when the regression line was calculated.
D. The regression line minimized the sum of the squared errors.
E. None of the above.

Modeling data distributions
ANSWERED ### Here’s an interesting challenge you can give to a friend. Hold a \$1 (or larger!) bill by an upper corner. Have a friend prepare to pinch a lower corner, putting her fingers near but not touching the bill. Tell her to try to catch the bill when you drop it by simply closing her fingers. This seems like it should be easy, but it’s not. After she sees that you have released the bill, it will take her about 0.25 s to react and close her fingers, which is not fast enough to catch the bill. How much time does it take for the bill to fall beyond her grasp? The length of a bill is 16 cm.
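The fall time can be sketched numerically. This is a minimal sketch, not a definitive solution: it assumes the bill starts from rest, that air resistance is negligible, that g = 9.8 m/s², and that the bill must fall its full 16 cm length to clear the fingers. None of these assumptions is stated in the question, and the variable names are illustrative.

```python
import math

# Assumptions (not stated in the question): free fall from rest,
# no air resistance, g = 9.8 m/s^2, and the bill must drop its
# full length before it is out of reach.
G = 9.8               # gravitational acceleration, m/s^2
BILL_LENGTH = 0.16    # metres (16 cm, from the question)
REACTION_TIME = 0.25  # seconds (from the question)

# d = (1/2) * g * t^2  =>  t = sqrt(2 * d / g)
t_fall = math.sqrt(2 * BILL_LENGTH / G)

print(f"time to fall 16 cm: {t_fall:.3f} s")
print("bill escapes before the fingers close:", t_fall < REACTION_TIME)
```

Under these assumptions the fall time comes out shorter than the 0.25 s reaction time, which matches the question's claim that the bill cannot be caught.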

Modeling data distributions
ANSWERED ### The following table represents the Frequency Distribution and Cumulative Distributions for this data set: 12, 13, 17, 18, 18, 24, 26, 27, 27, 30, 30, 35, 37, 41, 42, 43, 44, 46, 53, 58

$$\begin{array}{|c|c|c|c|} \hline \text{Class}&\text{Frequency}&\text{Relative Frequency}&\text{Cumulative Frequency}\\ \hline \text{10 but less than 20}&5&&\\ \hline \text{20 but less than 30}&4&&\\ \hline \text{30 but less than 40}&4&&\\ \hline \text{40 but less than 50}&5&&\\ \hline \text{50 but less than 60}&2&&\\ \hline \text{TOTAL}&&&\\ \hline \end{array}$$

What is the Relative Frequency for the class: 20 but less than 30? State your answer as a value with exactly two digits after the decimal, for example 0.30 or 0.35.
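The missing columns of the table can be filled in mechanically. The sketch below counts the values in each class, divides by the sample size for the relative frequency, and keeps a running total for the cumulative frequency. The variable names are illustrative only.

```python
data = [12, 13, 17, 18, 18, 24, 26, 27, 27, 30,
        30, 35, 37, 41, 42, 43, 44, 46, 53, 58]

# Classes are half-open intervals: [10, 20), [20, 30), ..., [50, 60).
rows = []
cumulative = 0
for lower in range(10, 60, 10):
    upper = lower + 10
    freq = sum(1 for x in data if lower <= x < upper)
    cumulative += freq
    rows.append((f"{lower} but less than {upper}", freq, freq / len(data), cumulative))

for label, freq, rel, cum in rows:
    print(f"{label:<22} {freq:>3} {rel:>6.2f} {cum:>4}")
```

For the class "20 but less than 30" the frequency is 4 out of 20 observations, so the relative frequency is 0.20.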

Modeling data distributions
ANSWERED ### Let x be a continuous random variable with a standard normal distribution. Using the accompanying standard normal distribution table, find $$P(x\geq2.26)$$
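In place of a printed table, the upper-tail probability can be checked with the complementary error function from Python's standard library. This is a small sketch; the helper name `standard_normal_upper_tail` is made up for illustration.

```python
import math

def standard_normal_upper_tail(z: float) -> float:
    """Return P(Z >= z) for a standard normal Z using erfc."""
    return 0.5 * math.erfc(z / math.sqrt(2))

p = standard_normal_upper_tail(2.26)
print(f"P(x >= 2.26) = {p:.4f}")
```

This agrees with the table lookup $$1 - 0.9881 = 0.0119$$.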

Modeling data distributions
ANSWERED ### A random sample of $$n_1 = 14$$ winter days in Denver gave a sample mean pollution index $$\bar{x}_1 = 43$$. Previous studies show that $$\sigma_1 = 19$$. For Englewood (a suburb of Denver), a random sample of $$n_2 = 12$$ winter days gave a sample mean pollution index of $$\bar{x}_2 = 37$$. Previous studies show that $$\sigma_2 = 13$$. Assume the pollution index is normally distributed in both Englewood and Denver.

(a) State the null and alternate hypotheses.
- $$H_0:\mu_1=\mu_2;\ H_1:\mu_1>\mu_2$$
- $$H_0:\mu_1<\mu_2;\ H_1:\mu_1=\mu_2$$
- $$H_0:\mu_1=\mu_2;\ H_1:\mu_1<\mu_2$$
- $$H_0:\mu_1=\mu_2;\ H_1:\mu_1\neq\mu_2$$

(b) What sampling distribution will you use? What assumptions are you making?
- The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
- The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
- The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
- The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.

(c) What is the value of the sample test statistic? Compute the corresponding z or t value as appropriate. (Test the difference $$\mu_1 - \mu_2$$. Round your answer to two decimal places.)

(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)

(e) Based on your answers in parts (a) to (d), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?
- At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
- At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are statistically significant.
- At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
- At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are not statistically significant.

(f) Interpret your conclusion in the context of the application.
- Reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
- Reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
- Fail to reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
- Fail to reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.

(g) Find a 99% confidence interval for $$\mu_1 - \mu_2$$. (Round your answers to two decimal places.) lower limit, upper limit

(h) Explain the meaning of the confidence interval in the context of the problem.
- Because the interval contains only positive numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
- Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, we cannot say that the mean population pollution index for Englewood is different from that of Denver.
- Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
- Because the interval contains only negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is less than that of Denver.
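Parts (c), (d), and (g) reduce to a two-sample z computation because both population standard deviations are given. The sketch below assumes a two-tailed alternative ($$\mu_1 \neq \mu_2$$). The question as posted does not state the claim being tested, so this is an inference from the wording of parts (f) and (h). Variable names are illustrative.

```python
import math
from statistics import NormalDist

n1, xbar1, sigma1 = 14, 43, 19   # Denver
n2, xbar2, sigma2 = 12, 37, 13   # Englewood
alpha = 0.01

std_normal = NormalDist()  # mean 0, standard deviation 1

# Standard error of the difference of two independent sample means.
se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Test statistic for H0: mu1 - mu2 = 0.
z = (xbar1 - xbar2) / se

# ASSUMPTION: two-tailed test, so both tails count toward the P-value.
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 99% confidence interval for mu1 - mu2.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
ci_lower = (xbar1 - xbar2) - z_crit * se
ci_upper = (xbar1 - xbar2) + z_crit * se

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print(f"99% CI for mu1 - mu2: ({ci_lower:.2f}, {ci_upper:.2f})")
print("reject H0:", p_value < alpha)
```

The last decimal place of the P-value and interval limits may differ slightly from a table-based answer, since the sketch does not round z or the critical value before using them.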

Modeling data distributions
ANSWERED ### a) To calculate: The least squares regression line for the data points using the table given below. $$\begin{array}{|c|c|c|c|c|c|} \hline \text{Fertilizer} & x & 100 & 150 & 200 & 250 \\ \hline \text{Yield} & y & 35 & 44 & 50 & 56 \\ \hline \end{array}$$ b) To calculate: The approximate yield when 175 pounds of fertilizer were used per acre of land.
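Both parts follow from the usual least-squares formulas, slope $$= S_{xy}/S_{xx}$$ and intercept $$= \bar{y} - \text{slope}\cdot\bar{x}$$. A minimal sketch with illustrative variable names:

```python
xs = [100, 150, 200, 250]   # fertilizer
ys = [35, 44, 50, 56]       # yield

n = len(xs)
x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Sums of squares and cross-products about the means.
s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
s_xx = sum((x - x_mean) ** 2 for x in xs)

slope = s_xy / s_xx
intercept = y_mean - slope * x_mean

# Part b): predicted yield at x = 175.
predicted_yield = intercept + slope * 175

print(f"y = {slope:.3f}x + {intercept:.1f}")
print(f"predicted yield at x = 175: {predicted_yield:.2f}")
```

The fitted line is $$y = 0.138x + 22.1$$. Since 175 happens to be the mean of the x values, the prediction equals the mean yield, 46.25.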

Modeling data distributions
ANSWERED ### An automobile tire manufacturer collected the data in the table relating tire pressure x (in pounds per square inch) and mileage (in thousands of miles). A mathematical model for the data is given by $$f(x)=-0.554x^{2}+35.5x-514.$$

$$\begin{array}{|c|c|} \hline x & \text{Mileage} \\ \hline 28 & 45 \\ \hline 30 & 51\\ \hline 32 & 56\\ \hline 34 & 50\\ \hline 36 & 46\\ \hline \end{array}$$

(A) Complete the table below. (Round to one decimal place as needed.)

$$\begin{array}{|c|c|c|} \hline x & \text{Mileage} & f(x) \\ \hline 28 & 45 & \\ \hline 30 & 51 & \\ \hline 32 & 56 & \\ \hline 34 & 50 & \\ \hline 36 & 46 & \\ \hline \end{array}$$

A. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (28,45), (30,51), (32,56), (34,50), and (36,46). A parabola opens downward and passes through the points (28,45.7), (30,52.4), (32,54.7), (34,52.6), and (36,46.0). All points are approximate.
B. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (43,30), (45,36), (47,41), (49,35), and (51,31). A parabola opens downward and passes through the points (43,30.7), (45,37.4), (47,39.7), (49,37.6), and (51,31). All points are approximate.
C. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (43,45), (45,51), (47,56), (49,50), and (51,46). A parabola opens downward and passes through the points (43,45.7), (45,52.4), (47,54.7), (49,52.6), and (51,46.0). All points are approximate.
D. A coordinate system has a horizontal x-axis labeled from 20 to 60 in increments of 2 and a vertical y-axis labeled from 20 to 60 in increments of 2. Data points are plotted at (28,30), (30,36), (32,41), (34,35), and (36,31). A parabola opens downward and passes through the points (28,30.7), (30,37.4), (32,39.7), (34,37.6), and (36,31). All points are approximate.

(C) Use the modeling function f(x) to estimate the mileage for a tire pressure of 29 lbs/sq in. and for 35 lbs/sq in. (Round to two decimal places as needed.)
The mileage for the tire pressure 29 lbs/sq in. is
The mileage for the tire pressure 35 lbs/sq in. is

(D) Write a brief description of the relationship between tire pressure and mileage.
A. As tire pressure increases, mileage decreases to a minimum at a certain tire pressure, then begins to increase.
B. As tire pressure increases, mileage decreases.
C. As tire pressure increases, mileage increases to a maximum at a certain tire pressure, then begins to decrease.
D. As tire pressure increases, mileage increases.
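The table in part (A) and the estimates in part (C) only require evaluating the given quadratic model. The sketch below does that and also locates the vertex of the parabola, which bears on part (D). Names other than `f` are illustrative.

```python
def f(x: float) -> float:
    """Quadratic mileage model from the question (thousands of miles)."""
    return -0.554 * x**2 + 35.5 * x - 514

pressures = [28, 30, 32, 34, 36]
mileage = [45, 51, 56, 50, 46]

# Part (A): model values rounded to one decimal place.
table = [(x, m, round(f(x), 1)) for x, m in zip(pressures, mileage)]
for x, m, fx in table:
    print(f"x = {x}  mileage = {m}  f(x) = {fx}")

# Part (C): estimates at pressures not in the table.
estimates = {x: round(f(x), 2) for x in (29, 35)}
print(estimates)

# Part (D): the leading coefficient is negative, so the vertex is a maximum.
vertex_x = 35.5 / (2 * 0.554)
print(f"model peaks near x = {vertex_x:.1f}")
```

The rounded model values are 45.7, 52.4, 54.7, 52.6, and 46.0, which are the parabola points listed in graph option A. The estimates are about 49.59 at 29 lbs/sq in. and 49.85 at 35 lbs/sq in.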

Modeling data distributions
ANSWERED ### Let's say the widget maker has developed the following table that shows the highest price p, in dollars, per widget at which N widgets can be sold. $$\begin{array}{|c|c|} \hline \text{Number } N & \text{Price } p\\ \hline 200 & 53.00\\ \hline 250 & 52.50\\ \hline 300 & 52.00\\ \hline 350 & 51.50\\ \hline \end{array}$$ (a) Find a formula for p in terms of N modeling the data in the table. (b) Use a formula to express the total monthly revenue R, in dollars, of this manufacturer in a month as a function of the number N of widgets produced in a month. $$R=$$ Is R a linear function of N? (c) On the basis of the tables in this exercise and using cost, $$C= 35N + 900$$, use a formula to express the monthly profit P, in dollars, of this manufacturer as a function of the number of widgets produced in a month. $$P=$$

Modeling data distributions
ANSWERED ### Define the term Mean.

Modeling data distributions
ANSWERED ### It is estimated that approximately $$8.36\%$$ of Americans are afflicted with diabetes. Suppose that a certain diagnostic evaluation for diabetes will correctly diagnose $$94.5\%$$ of all adults over 40 with diabetes as having the disease and incorrectly diagnoses $$2\%$$ of all adults over 40 without diabetes as having the disease. 1) Find the probability that a randomly selected adult over 40 doesn't have diabetes and is diagnosed as having diabetes (such diagnoses are called "false positives"). 2) Find the probability that a randomly selected adult over 40 is diagnosed as not having diabetes. 3) Find the probability that a randomly selected adult over 40 actually has diabetes, given that he/she is diagnosed as not having diabetes (such diagnoses are called "false negatives"). Note: It will be helpful to first draw an appropriate tree diagram modeling the situation.
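The tree diagram can be written out as a few products and sums. The sketch below assumes the 8.36% prevalence quoted for Americans also applies to adults over 40. The question uses it that way but does not say so explicitly. Variable names are illustrative.

```python
# ASSUMPTION: the 8.36% prevalence applies to adults over 40.
prevalence = 0.0836           # P(D)
sensitivity = 0.945           # P(+ | D)
false_positive_rate = 0.02    # P(+ | not D)

# 1) P(not D and +): follow the "no diabetes" branch, then "positive".
p_false_positive = (1 - prevalence) * false_positive_rate

# 2) P(-): sum over both branches that end in a negative diagnosis.
p_negative = (prevalence * (1 - sensitivity)
              + (1 - prevalence) * (1 - false_positive_rate))

# 3) P(D | -): Bayes' rule, using the negative branch through D.
p_disease_given_negative = prevalence * (1 - sensitivity) / p_negative

print(f"1) P(no diabetes and positive) = {p_false_positive:.4f}")
print(f"2) P(negative)                 = {p_negative:.4f}")
print(f"3) P(diabetes | negative)      = {p_disease_given_negative:.4f}")
```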

Modeling data distributions
ANSWERED ### Decide which of the following statements are true. Answer:
- Normal distributions are bell-shaped, but they do not have to be symmetric.
- The line of symmetry for all normal distributions is x
- On any normal distribution curve, you can find data values more than 5 standard deviations above the mean.
- The x-axis is a horizontal asymptote for all normal distributions.

Modeling data distributions
ANSWERED ### M. F. Driscoll and N. A. Weiss discussed the modeling and solution of problems concerning motel reservation networks in “An Application of Queuing Theory to Reservation Networks” (TIMS, Vol. 22, No. 5, pp. 540–546). They defined a Type 1 call to be a call from a motel’s computer terminal to the national reservation center. For a certain motel, the number, X, of Type 1 calls per hour has a Poisson distribution with parameter $$\lambda=1.7$$. Determine the probability that the number of Type 1 calls made from this motel during a period of 1 hour will be: a) exactly one. b) at most two. c) at least two. (Hint: Use the complementation rule.) d) Find and interpret the mean of the random variable X. e) Determine the standard deviation of X.
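Each part follows from the Poisson probability mass function $$P(X=k)=e^{-\lambda}\lambda^{k}/k!$$. A minimal sketch, with the helper name `poisson_pmf` chosen for illustration:

```python
import math

lam = 1.7  # mean number of Type 1 calls per hour

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with parameter lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

p_exactly_one = poisson_pmf(1, lam)                          # a)
p_at_most_two = sum(poisson_pmf(k, lam) for k in range(3))   # b)
# c) complementation rule: P(X >= 2) = 1 - P(X = 0) - P(X = 1)
p_at_least_two = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

mean = lam                 # d) a Poisson mean equals lambda
std_dev = math.sqrt(lam)   # e) a Poisson variance equals lambda

print(f"a) {p_exactly_one:.3f}  b) {p_at_most_two:.3f}  c) {p_at_least_two:.3f}")
print(f"d) mean = {mean}  e) sd = {std_dev:.3f}")
```

The mean of 1.7 can be read as the long-run average number of Type 1 calls per hour from this motel.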

Modeling data distributions
ANSWERED ### A parks and recreation department is constructing a new bike path. The path will be parallel to the railroad tracks shown and pass through the parking area at the point $$(4, 5).$$ Write an equation that represents the path.

Modeling data distributions
ANSWERED ### In an experiment designed to study the effects of illumination level on task performance (“Performance of Complex Tasks Under Different Levels of Illumination,” J. Illuminating Eng., 1976: 235–242), subjects were required to insert a fine-tipped probe into the eyeholes of ten needles in rapid succession both for a low light level with a black background and a higher level with a white background. Each data value is the time (sec) required to complete the task. $$\begin{array}{|c|c|c|c|c|c|c|c|c|c|} \hline \text{Subject} & (1) & (2) & (3) & (4) & (5) &(6) & (7) & (8) & (9) \\ \hline \text{Black} & 25.85 & 28.84 & 32.05 & 25.74 & 20.89 & 41.05 & 25.01 & 24.96 & 27.47 \\ \hline \text{White} & 18.28 & 20.84 & 22.96 & 19.68 & 19.509 & 24.98 & 16.61 & 16.07 & 24.59 \\ \hline \end{array}$$ Does the data indicate that the higher level of illumination yields a decrease of more than 5 sec in true average task completion time? Test the appropriate hypotheses using the P-value approach.
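Because each subject is measured under both conditions, one reasonable approach is a paired t test on the differences (black minus white) with $$H_0:\mu_D = 5$$ against $$H_a:\mu_D > 5$$. The sketch below takes that approach. It assumes the differences are roughly normal and, to stay within the standard library, it gets the P-value by numerically integrating the Student t density. The helper `t_upper_tail` is a made-up name, not a library function.

```python
import math
from statistics import mean, stdev

black = [25.85, 28.84, 32.05, 25.74, 20.89, 41.05, 25.01, 24.96, 27.47]
white = [18.28, 20.84, 22.96, 19.68, 19.509, 24.98, 16.61, 16.07, 24.59]

# Paired differences: time saved under the higher illumination level.
diffs = [b - w for b, w in zip(black, white)]
n = len(diffs)
d_bar = mean(diffs)
s_d = stdev(diffs)   # sample standard deviation (n - 1 divisor)

# H0: mu_D = 5  versus  Ha: mu_D > 5
t_stat = (d_bar - 5) / (s_d / math.sqrt(n))
df = n - 1

def t_upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t) for t >= 0, via Simpson's rule on the t density over [0, t]."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x: float) -> float:
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = t / steps   # steps must be even for Simpson's rule
    total = pdf(0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3

p_value = t_upper_tail(t_stat, df)
print(f"d_bar = {d_bar:.3f}, s_d = {s_d:.3f}, t = {t_stat:.2f}, df = {df}")
print(f"one-tailed P-value = {p_value:.3f}")
```

The P-value lands close to 0.05, so the conclusion depends on the significance level chosen, which the question does not state.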

Modeling data distributions
ANSWERED ### To determine: The number of luxury home sales S(t) in a major Canadian urban area over a period of 12 years is given by: $$S(t)=5.8t^{2}-81.2t+1200$$

Modeling data distributions
ANSWERED ### Suppose the manufacturer of widgets has developed the following table showing the highest price p, in dollars, of a widget at which N widgets can be sold. $$\begin{array}{|c|c|} \hline \text{Number } N & \text{Price } p\\ \hline 200 & 53.00\\ \hline 250 & 52.50\\ \hline 300 & 52.00\\ \hline 350 & 51.50\\ \hline \end{array}$$ (a) Find a formula for p in terms of N modeling the data in the table. $$p=$$ (b) Use a formula to express the total monthly revenue R, in dollars, of this manufacturer in a month as a function of the number N of widgets produced in a month. $$R=$$ Is R a linear function of N? (c) On the basis of the tables in this exercise and using cost, $$C=35N+900$$, use a formula to express the monthly profit P, in dollars, of this manufacturer as a function of the number of widgets produced in a month. $$P=$$ (d) Is P a linear function of N?
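The price drops by the same amount (0.50) for every 50 additional widgets, so a linear model for p fits the table exactly. The sketch below derives that line from the first two rows, checks it against all four rows, and then builds revenue and profit from it. Function names are illustrative.

```python
table = [(200, 53.00), (250, 52.50), (300, 52.00), (350, 51.50)]

# (a) Line through the first two rows: p = intercept + slope * N.
(n_a, p_a), (n_b, p_b) = table[0], table[1]
slope = (p_b - p_a) / (n_b - n_a)
intercept = p_a - slope * n_a

def price(n: float) -> float:
    return intercept + slope * n

def revenue(n: float) -> float:
    # (b) R = N * p, which is quadratic in N and therefore not linear.
    return n * price(n)

def cost(n: float) -> float:
    return 35 * n + 900   # given in the question

def profit(n: float) -> float:
    # (c) P = R - C, also quadratic in N.
    return revenue(n) - cost(n)

# The linear price model reproduces every row of the table.
fits_all_rows = all(abs(price(n) - p) < 1e-9 for n, p in table)

print(f"p = {intercept:.2f} + ({slope:.2f})N, fits all rows: {fits_all_rows}")
print(f"profit at N = 200: {profit(200):.2f}")
```

In closed form this gives $$p = 55 - 0.01N$$, $$R = 55N - 0.01N^{2}$$, and $$P = -0.01N^{2} + 20N - 900$$, so neither R nor P is a linear function of N.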

Modeling data distributions
ANSWERED ### 1) What factors influence the correspondence between the binomial and normal distributions? 1. Twenty percent of individuals who seek psychotherapy will recover from their symptoms irrespective of whether they receive treatment. A researcher finds that a particular type of psychotherapy is successful with 30 out of 100 clients. Using an alpha level of 0.05 as a criterion, what should she conclude about the effectiveness of this psychotherapeutic approach? 2. How does the size of the data set help cut down on the size of the error terms in the approximation process?
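The psychotherapy item can be checked two ways: with the normal approximation to the binomial, and with the exact binomial tail. The sketch below computes both, assuming a one-tailed test (treatment improves on the 20% baseline) and no continuity correction. The question does not specify either choice.

```python
import math
from statistics import NormalDist

n = 100          # clients
p0 = 0.20        # baseline recovery rate without treatment
successes = 30   # observed recoveries
alpha = 0.05

# Normal approximation: X is roughly N(n*p0, n*p0*(1 - p0)).
mu = n * p0
sigma = math.sqrt(n * p0 * (1 - p0))
z = (successes - mu) / sigma

# ASSUMPTION: one-tailed test, no continuity correction.
p_normal = 1 - NormalDist().cdf(z)

# Exact binomial tail P(X >= 30) for comparison.
p_exact = sum(math.comb(n, k) * p0**k * (1 - p0) ** (n - k)
              for k in range(successes, n + 1))

print(f"z = {z:.2f}, normal-approximation P-value = {p_normal:.4f}")
print(f"exact binomial P-value = {p_exact:.4f}")
print("reject H0 at alpha = 0.05:", p_normal < alpha and p_exact < alpha)
```

Both P-values fall below 0.05, so under these assumptions the observed success rate is significantly higher than the 20% baseline. The gap between the two P-values illustrates the approximation error that the other two items ask about.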

Modeling data distributions
ANSWERED ### An experiment designed to study the relationship between hypertension and cigarette smoking yielded the following data. $$\begin{array}{|c|c|c|c|} \hline \text{Tension level} & \text{Non-smoker} & \text{Moderate smoker} & \text{Heavy smoker} \\ \hline \text{Hypertension} & 20 & 38 & 28 \\ \hline \text{No hypertension} & 50 & 27 & 18 \\ \hline \end{array}$$ Test the hypothesis that whether or not an individual has hypertension is independent of how much that person smokes.
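A chi-square test of independence compares each observed count with the count expected under independence, (row total × column total) / grand total. The sketch below computes the statistic. It gets the P-value from the closed form $$e^{-\chi^{2}/2}$$, which is valid only because this table has exactly 2 degrees of freedom. No significance level is given in the question, so none is assumed here.

```python
import math

observed = [
    [20, 38, 28],   # hypertension: non-, moderate, heavy smoker
    [50, 27, 18],   # no hypertension
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df == 2 only, the chi-square upper tail is exactly exp(-x / 2).
p_value = math.exp(-chi2 / 2)

print(f"chi-square = {chi2:.2f}, df = {df}, P-value = {p_value:.5f}")
```

The statistic is far above the usual 5% critical value of 5.991 for 2 degrees of freedom, so these data argue against independence of hypertension and smoking habits.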

Modeling data distributions
ANSWERED ### The following observations are lifetimes (days) subsequent to diagnosis for individuals suffering from blood cancer ("A Goodness of Fit Approach to the Class of Life Distributions with Unknown Age," Quality and Reliability Engr. Intl., 2012: 761–766): 115, 181, 255, 418, 441, 461, 516, 739, 743, 789, 807, 865, 924, 983, 1025, 1062, 1063, 1165, 1191, 1222, 1222, 1251, 1277, 1290, 1357, 1369, 1408, 1455, 1278, 1519, 1578, 1578, 1599, 1603, 1605, 1696, 1735, 1799, 1815, 1852, 1899, 1925, 1965. a) Can a confidence interval for true average lifetime be calculated without assuming anything about the nature of the lifetime distribution? Explain your reasoning. [Note: A normal probability plot of data exhibits a reasonably linear pattern.] b) Calculate and interpret a confidence interval with a 99% confidence level for true average lifetime. [Hint: mean $$= 1191.6, s = 506.6$$.]
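For part b), a one-sample t interval $$\bar{x} \pm t_{\alpha/2,\,n-1}\, s/\sqrt{n}$$ is a natural choice given the note about the normal probability plot. The sketch below uses the summary values from the hint and n = 43 (the number of listed observations). The critical value 2.698 for 42 degrees of freedom is taken from a t table and hard-coded, which is an assumption of this sketch rather than something computed.

```python
import math

n = 43                # number of lifetimes listed in the question
sample_mean = 1191.6  # from the hint
sample_sd = 506.6     # from the hint

# ASSUMPTION: t critical value for 99% confidence with n - 1 = 42
# degrees of freedom, read from a t table rather than computed.
T_CRIT = 2.698

margin = T_CRIT * sample_sd / math.sqrt(n)
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

print(f"99% CI for true average lifetime: ({ci_lower:.1f}, {ci_upper:.1f}) days")
```

The interval runs from roughly 983 to 1400 days. The interpretation is that, at the 99% confidence level, the true average lifetime after diagnosis lies in that range.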