# A random sample of $$n_{1} = 16$$ communities in western Kansas gave the following information for

Question

A random sample of $$\displaystyle{n}_{{1}}={16}$$ communities in western Kansas gave the following information for people under 25 years of age.
$$\displaystyle{X}_{{1}}:$$ Rate of hay fever per 1000 population for people under 25
$$\begin{array}{|c|c|c|c|c|c|c|c|} \hline 97 & 91 & 121 & 129 & 94 & 123 & 112 & 93 \\ \hline 125 & 95 & 125 & 117 & 97 & 122 & 127 & 88 \\ \hline \end{array}$$
A random sample of $$\displaystyle{n}_{{2}}={14}$$ regions in western Kansas gave the following information for people over 50 years old.
$$\displaystyle{X}_{{2}}:$$ Rate of hay fever per 1000 population for people over 50
$$\begin{array}{|c|c|c|c|c|c|c|} \hline 94 & 109 & 99 & 95 & 113 & 88 & 110 \\ \hline 79 & 115 & 100 & 89 & 114 & 85 & 96 \\ \hline \end{array}$$
(i) Use a calculator to calculate $$\displaystyle\overline{{x}}_{{1}},{s}_{{1}},\overline{{x}}_{{2}},{\quad\text{and}\quad}{s}_{{2}}.$$ (Round your answers to two decimal places.)
(ii) Assume that the hay fever rate in each age group has an approximately normal distribution. Do the data indicate that the age group over 50 has a lower rate of hay fever? Use $$\displaystyle\alpha={0.05}.$$
(a) What is the level of significance?
State the null and alternate hypotheses.
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}<\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}>\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}\ne\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}>\mu_{{2}},{H}_{{1}}:\mu_{{1}}=\mu_{{2}}$$
(b) What sampling distribution will you use? What assumptions are you making?
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
What is the value of the sample test statistic? (Test the difference $$\displaystyle\mu_{{1}}-\mu_{{2}}$$. Round your answer to three decimal places.)
(c) Find (or estimate) the P-value.
P-value $$\displaystyle>{0.250}$$
$$\displaystyle{0.125}<{P}\text{-value}<{0.250}$$
$$\displaystyle{0.050}<{P}\text{-value}<{0.125}$$
$$\displaystyle{0.025}<{P}\text{-value}<{0.050}$$
$$\displaystyle{0.005}<{P}\text{-value}<{0.025}$$
P-value $$\displaystyle<{0.005}$$
Sketch the sampling distribution and show the area corresponding to the P-value.

2020-10-24

(i) To calculate each sample mean, add all the data points in the sample and divide by the sample size.
$$\displaystyle\overline{{x}}_{{1}}=\frac{{{97}+{91}+{121}+\cdots+{127}+{88}}}{{16}}=\frac{1756}{16}={109.75}$$
$$\displaystyle\overline{{x}}_{{2}}=\frac{{{94}+{109}+{99}+\cdots+{85}+{96}}}{{14}}=\frac{1386}{14}={99}$$
To calculate each sample standard deviation, we use the following formula, where the sum runs over all data points $$x_{k}$$ in sample $$i$$:
$$\displaystyle{s}_{{i}}=\sqrt{{\frac{{\sum{\left({x}_{{k}}-\overline{{x}}_{{i}}\right)}^{2}}}{{{n}_{{i}}-{1}}}}}$$
$$\displaystyle{s}_{{1}}=\sqrt{{\frac{{\sum{\left({x}_{{k}}-\overline{{x}}_{{1}}\right)}^{2}}}{{{n}_{{1}}-{1}}}}}$$
$$\displaystyle=\sqrt{\frac{{\left({97}-{109.75}\right)}^{2}+{\left({91}-{109.75}\right)}^{2}+\cdots+{\left({127}-{109.75}\right)}^{2}+{\left({88}-{109.75}\right)}^{2}}{{16}-{1}}}=\sqrt{\frac{3539}{15}}$$
$$\displaystyle\approx{15.36}$$
$$\displaystyle{s}_{{2}}=\sqrt{{\frac{{\sum{\left({x}_{{k}}-\overline{{x}}_{{2}}\right)}^{2}}}{{{n}_{{2}}-{1}}}}}$$
$$\displaystyle=\sqrt{\frac{{\left({94}-{99}\right)}^{2}+{\left({109}-{99}\right)}^{2}+{\left({99}-{99}\right)}^{2}+\cdots+{\left({85}-{99}\right)}^{2}+{\left({96}-{99}\right)}^{2}}{{14}-{1}}}=\sqrt{\frac{1766}{13}}$$
$$\displaystyle\approx{11.66}$$
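As a cross-check on the hand calculation above, the four summary statistics can be reproduced with Python's standard `statistics` module. This is only a sketch: the names `under_25` and `over_50` are illustrative labels chosen here, not part of the original problem.

```python
from statistics import mean, stdev

# Hay fever rates per 1000 population, copied from the problem statement.
under_25 = [97, 91, 121, 129, 94, 123, 112, 93,
            125, 95, 125, 117, 97, 122, 127, 88]
over_50 = [94, 109, 99, 95, 113, 88, 110,
           79, 115, 100, 89, 114, 85, 96]

# stdev() divides by n - 1, so it is the sample standard deviation.
x1_bar, s1 = mean(under_25), stdev(under_25)
x2_bar, s2 = mean(over_50), stdev(over_50)

print(f"x1_bar = {x1_bar:.2f}, s1 = {s1:.2f}")
print(f"x2_bar = {x2_bar:.2f}, s2 = {s2:.2f}")
```

Rounded to two decimal places, the printed values should agree with the means and standard deviations worked out by hand.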
a) Level of significance $$\displaystyle={0.05}$$
$$\displaystyle\mu_{{1}}$$ : mean of group under 25
$$\displaystyle\mu_{{2}}$$ : mean of group over 50
The claim generally forms the alternate hypothesis. Because the claim is that the mean rate of hay fever in the group over 50 is lower, the alternate hypothesis reflects that claim, and the null hypothesis is a statement of no difference.
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}}$$
$$\displaystyle{H}_{{1}}:\mu_{{1}}>\mu_{{2}}$$
b)
The Student's t. We assume that both population distributions are approximately normal, and because the population standard deviations are unknown, we use a Student's t test.
$$t=\frac{\bar{x}_{1}-\bar{x}_{2}}{\text{standard error of difference}}=\frac{109.75-99}{\sqrt{\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}}}=\frac{10.75}{\sqrt{14.74583+9.703297}}=2.174$$
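The test statistic can be checked numerically with a short sketch. It assumes the unpooled standard error, which is what the terms 14.74583 and 9.703297 in the calculation above correspond to; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, variance

# Hay fever rates per 1000 population, copied from the problem statement.
under_25 = [97, 91, 121, 129, 94, 123, 112, 93,
            125, 95, 125, 117, 97, 122, 127, 88]
over_50 = [94, 109, 99, 95, 113, 88, 110,
           79, 115, 100, 89, 114, 85, 96]

n1, n2 = len(under_25), len(over_50)

# variance() is the sample variance s^2 (n - 1 denominator).
# Standard error of the difference, without pooling the variances.
se = sqrt(variance(under_25) / n1 + variance(over_50) / n2)

# Test statistic for the difference mu_1 - mu_2.
t_stat = (mean(under_25) - mean(over_50)) / se
print(round(t_stat, 3))
```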
c) The P-value can be determined from the t statistic and the degrees of freedom.
The degrees of freedom are given by
$$\displaystyle df=\frac{\left(\frac{s_{1}^{2}}{n_{1}}+\frac{s_{2}^{2}}{n_{2}}\right)^{2}}{\frac{\left(s_{1}^{2}/n_{1}\right)^{2}}{n_{1}-1}+\frac{\left(s_{2}^{2}/n_{2}\right)^{2}}{n_{2}-1}}\approx 27.5,$$ which is rounded down to $$df = 27$$.
Right-tailed probability: $$\displaystyle P=\text{T.DIST.RT}(2.174, 27)=0.0193$$, which falls in the range $$0.005 < P\text{-value} < 0.025$$.
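For readers without a spreadsheet, the degrees of freedom and the right-tail probability can be approximated with the Python standard library alone. The sketch below plugs in the rounded values from the answer above and integrates the Student's t density with Simpson's rule; the helper names `t_pdf` and `t_right_tail` are hypothetical, written for this illustration, and the result is a numerical approximation rather than an exact table value.

```python
from math import floor, gamma, pi, sqrt

# Values taken from the worked answer above.
v1 = 14.74583    # s1^2 / n1
v2 = 9.703297    # s2^2 / n2
n1, n2 = 16, 14
t_stat = 2.174

# Degrees of freedom from the formula above, rounded down.
df_exact = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
df = floor(df_exact)


def t_pdf(x, nu):
    """Density of the Student's t distribution with nu degrees of freedom."""
    coeff = gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2))
    return coeff * (1 + x * x / nu) ** (-(nu + 1) / 2)


def t_right_tail(t, nu, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, nu) + t_pdf(t, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3


p_value = t_right_tail(t_stat, df)
print(df, round(p_value, 4))
```

The printed probability should land close to the spreadsheet value quoted above and well below $$\alpha = 0.05$$.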

### Relevant Questions

Would you rather spend more federal taxes on art? Of a random sample of $$n_{1} = 86$$ politically conservative voters, $$r_{1} = 18$$ responded yes. Another random sample of $$n_{2} = 85$$ politically moderate voters showed that $$r_{2} = 21$$ responded yes. Does this information indicate that the population proportion of conservative voters inclined to spend more federal tax money on funding the arts is less than the proportion of moderate voters so inclined? Use $$\alpha = 0.05.$$

a) State the null and alternate hypotheses.

$$H_0:p_{1} = p_{2}, H_{1}:p_{1} > p_2$$

$$H_0:p_{1} = p_{2}, H_{1}:p_{1} < p_2$$

$$H_0:p_{1} = p_{2}, H_{1}:p_{1} \neq p_2$$

$$H_{0}:p_{1} < p_{2}, H_{1}:p_{1} = p_{2}$$

b) What sampling distribution will you use? What assumptions are you making?

The Student's t. The number of trials is sufficiently large.

The standard normal. The number of trials is sufficiently large.

The standard normal. We assume the population distributions are approximately normal.

The Student's t. We assume the population distributions are approximately normal.

c) What is the value of the sample test statistic? (Test the difference $$p_{1} - p_{2}$$. Do not use rounded values. Round your final answer to two decimal places.)

d) Find (or estimate) the P-value. (Round your answer to four decimal places.)

e) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?

At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.

At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.

At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.

At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.

f) Interpret your conclusion in the context of the application.

Reject the null hypothesis; there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.

Fail to reject the null hypothesis; there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.

Fail to reject the null hypothesis; there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.

Reject the null hypothesis; there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.

The following quadratic function in general form, $$\displaystyle S(t)=5.8t^{2}-81.2t+1200$$, models the number of luxury home sales, S(t), in a major Canadian urban area, according to statistical data gathered over a 12-year period. Luxury home sales are defined in this market as sales of properties worth over \$3 million (inflation adjusted). In this case, $$t=0$$ represents 2000 and $$t=11$$ represents 2011. Use a calculator to find the year when the smallest number of luxury home sales occurred. Without sketching the function, interpret the meaning of this function, on the given practical domain, in one well-expressed sentence.

The following table shows the average yearly tuition and required fees, in thousands of dollars, charged by a certain private university in the school year beginning in the given year.
$$\begin{array}{|c|c|}\hline \text{Year} & \text{Average tuition} \\ \hline 2005 & 17.6 \\ \hline 2007 & 18.1 \\ \hline 2009 & 19.5 \\ \hline 2011 & 20.7 \\ \hline 2013 & 21.8 \\ \hline \end{array}$$
What prediction does the formula modeling this data give for average yearly tuition and required fees for the university for the academic year beginning in 2019?

1) A review of voter registration records in a small town yielded the following data on the number of males and females registered as Democrat, Republican, or some other affiliation:

$$\begin{array}{l|cc} & \text{Gender} & \\ \hline \text{Affiliation} & \text{Male} & \text{Female} \\ \hline \text{Democrat} & 300 & 600 \\ \text{Republican} & 500 & 300 \\ \text{Other} & 200 & 100 \\ \hline \end{array}$$

What proportion of all voters is male and registered as a Democrat?

2) A survey was conducted involving 303 subjects concerning their preferences with respect to the size of car they would consider purchasing. The following table shows the count of the responses by gender of the respondents:

$$\begin{array}{l|cccc} & \text{Size of car} & & & \\ \hline \text{Gender} & \text{Small} & \text{Medium} & \text{Large} & \text{Total} \\ \hline \text{Female} & 58 & 63 & 17 & 138 \\ \text{Male} & 79 & 61 & 25 & 165 \\ \text{Total} & 137 & 124 & 42 & 303 \\ \hline \end{array}$$

The data are to be summarized by constructing marginal distributions. In the marginal distribution for car size, what is the entry for medium cars?

An analysis of laboratory data collected with the goal of modeling the weight (in grams) of a bacterial culture after several hours of growth produced the least squares regression line $$\log(\text{weight}) = 0.25 + 0.61 \cdot \text{hours}$$. Estimate the weight of the culture after 3 hours.

A) 0.32 g

B) 2.08 g

C) 8.0 g

D) 67.9 g

E) 120.2 g

The table shows the population of various cities, in thousands, and the average walking speed, in feet per second, of a person living in the city. $$\begin{array}{|c|c|} \hline Population\ (thousands) & Walking Speed\ (feet\ per\ second) \\ \hline 5.5 & 0.6 \\ \hline 14 & 1.0\\ \hline 71 & 1.6\\ \hline 138 & 1.9\\ \hline 342 & 2.2\\ \hline \end{array}$$

The tables show the battery lives (in hours) of two brands of laptops. a) Make a double box-and-whisker plot that represents the data. b) Identify the shape of each distribution. c) Which brand's battery lives are more spread out? Explain. d) Compare the distributions using their shapes and appropriate measures of center and variation.

Determine which of the following functions $$\displaystyle{f{{\left({x}\right)}}}={c}{x},\ {g{{\left({x}\right)}}}={c}{x}^{{{2}}},\ {h}{\left({x}\right)}={c}\sqrt{{{\left|{x}\right|}}},\ \text{and}\ {r}{\left({x}\right)}=\ {\frac{{{c}}}{{{x}}}}$$ can be used to model the data and determine the value of the constant c that will make the function fit the data in the table. $$\begin{array}{|c|c|c|c|c|c|} \hline x & -4 & -1 & 0 & 1 & 4 \\ \hline y & -32 & -2 & 0 & -2 & -32 \\ \hline \end{array}$$

The article “Stochastic Modeling for Pavement Warranty Cost Estimation” (J. of Constr. Engr. and Mgmnt., 2009: 352–359) proposes the following model for the distribution of Y = time to pavement failure. Let $$\displaystyle{X}_{{{1}}}$$ be the time to failure due to rutting, and $$\displaystyle{X}_{{{2}}}$$ be the time to failure due to transverse cracking; these two rvs are assumed independent. Then $$\displaystyle{Y}=\min{\left({X}_{{{1}}},{X}_{{{2}}}\right)}$$. The probability of failure due to either one of these distress modes is assumed to be an increasing function of time t. After making certain distributional assumptions, the following form of the cdf for each mode is obtained: $$\displaystyle\Phi{\left[\frac{{{a}+{b}{t}}}{{\left({c}+{d}{t}+{e}{t}^{{{2}}}\right)}^{{\frac{{1}}{{2}}}}}\right]}$$ where $$\Phi$$ is the standard normal cdf. Values of the five parameters a, b, c, d, and e are -25.49, 1.15, 4.45, -1.78, and .171 for cracking and -21.27, .0325, .972, -.00028, and .00022 for rutting. Determine the probability of pavement failure within $$\displaystyle{t}={5}$$ years and also $$\displaystyle{t}={10}$$ years.
$$\begin{array}{|c|c|c|c|} \hline \text{Tension level} & \text{Non-smoker} & \text{Moderate smoker} & \text{Heavy smoker} \\ \hline \text{Hypertension} & 20 & 38 & 28 \\ \hline \text{No hypertension} & 50 & 27 & 18 \\ \hline \end{array}$$