# How are the smoking habits of students related to their parents' smoking? A two-way table from a survey of students in eight Arizona high schools is given. (a) Write the null and alternative hypotheses for the question of interest. (b) Find the expected cell counts. Write a sentence that explains in simple language what "expected counts" are. (c) Find the chi-square statistic, its degrees of freedom, and the P-value. (d) What is your conclusion about significance?

Question
Two-way tables
How are the smoking habits of students related to their parents' smoking? Here is a two-way table from a survey of students in eight Arizona high schools:
$$\begin{array}{l|c|c|c}&\text{Student smokes}&\text{Student does not smoke}&\text{Total}\\\hline\text{Both parents smoke}&400&1380&400+1380=1780\\\hline\text{One parent smokes}&416&1823&416+1823=2239\\\hline\text{Neither parent smokes}&188&1168&188+1168=1356\\\hline\text{Total}&400+416+188=1004&1380+1823+1168=4371&1004+4371=5375\end{array}$$
(a) Write the null and alternative hypotheses for the question of interest.
(b) Find the expected cell counts. Write a sentence that explains in simple language what "expected counts" are.
(c) Find the chi-square statistic, its degrees of freedom, and the P-value.
(d) What is your conclusion about significance?

2021-03-05

Let us assume:
$$\alpha=0.05=5\%$$
(a) The null hypothesis states that there is no association between the variables, while the alternative hypothesis states that there is an association between the variables.
$$H_0:$$ There is no association between student smoking habit and parent smoking habit
$$H_a:$$ There is an association between student smoking habit and parent smoking habit
(b) Determine the row and column totals of the given table:
$$\begin{array}{l|c|c|c}&\text{Student smokes}&\text{Student does not smoke}&\text{Total}\\\hline\text{Both parents smoke}&400&1380&400+1380=1780\\\hline\text{One parent smokes}&416&1823&416+1823=2239\\\hline\text{Neither parent smokes}&188&1168&188+1168=1356\\\hline\text{Total}&400+416+188=1004&1380+1823+1168=4371&1004+4371=5375\end{array}$$
The expected frequencies E are the product of the column and row total, divided by the table total.
$$E_{11}=\frac{r_1\times c_1}{n}=\frac{1780\times 1004}{5375}\approx332.4874$$
$$E_{12}=\frac{r_1\times c_2}{n}=\frac{1780\times4371}{5375}\approx1447.5126$$
$$E_{21}=\frac{r_2\times c_1}{n}=\frac{2239\times1004}{5375}\approx418.2244$$
$$E_{22}=\frac{r_2\times c_2}{n}=\frac{2239\times4371}{5375}\approx1820.7756$$
$$E_{31}=\frac{r_3\times c_1}{n}=\frac{1356\times1004}{5375}\approx253.2882$$
$$E_{32}=\frac{r_3\times c_2}{n}=\frac{1356\times4371}{5375}\approx1102.7118$$
Expected counts are the counts that we expect based on the row and column totals, when there is no association between the variables.
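The expected-count formula above can be checked with a short script. This is a minimal sketch in plain Python; the names `observed` and `expected_counts` are illustrative and not part of the original answer.

```python
# Observed counts from the two-way table.
# Rows: both parents smoke, one parent smokes, neither parent smokes.
# Columns: student smokes, student does not smoke.
observed = [
    [400, 1380],
    [416, 1823],
    [188, 1168],
]


def expected_counts(table):
    """Return expected counts E_ij = (row total i) * (column total j) / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
for row in expected:
    print([round(e, 4) for e in row])
```

Each expected row sums to the same row total as the observed data, which is a quick sanity check on the arithmetic.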
(c) The chi-square subtotals are the squared differences between the observed and expected frequencies, divided by the expected frequency.
The value of the test-statistic is then the sum of the chi-square subtotals:
$$X^2=\sum\frac{(O-E)^2}{E}$$
$$=\frac{(400-332.4874)^2}{332.4874}+\frac{(1380-1447.5126)^2}{1447.5126}+\frac{(416-418.2244)^2}{418.2244}+\frac{(1823-1820.7756)^2}{1820.7756}+\frac{(188-253.2882)^2}{253.2882}+\frac{(1168-1102.7118)^2}{1102.7118}\approx37.5664$$
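The sum of the six chi-square subtotals can be reproduced numerically. The sketch below is plain Python with illustrative names (`chi_square_statistic` is not from the original answer). It recomputes the expected counts without rounding, so the result may differ from the hand calculation in the last decimal place.

```python
# Observed counts (rows: both / one / neither parent smokes;
# columns: student smokes / student does not smoke).
observed = [
    [400, 1380],
    [416, 1823],
    [188, 1168],
]


def chi_square_statistic(table):
    """Sum of (O - E)^2 / E over all cells, with E = row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat


chi2 = chi_square_statistic(observed)
print(round(chi2, 2))  # about 37.57
```

Most of the statistic comes from the "both parents smoke" and "neither parent smokes" rows. The "one parent smokes" row is very close to its expected counts.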
The degrees of freedom is the product of the number of rows and the number of columns, each decreased by 1.
$$df=(r-1)(c-1)=(3-1)(2-1)=2$$
The P-value is the probability of obtaining the observed value of the test statistic, or a value more extreme, when the null hypothesis is true. Using the chi-square distribution table in the appendix, the P-value is the number (or interval) in the column title containing the $$X^2$$-value in the row $$df=2$$:
$$P<0.001$$
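The table lookup can be cross-checked numerically. For the special case of 2 degrees of freedom, the chi-square upper-tail probability has the closed form $$P(X^2\geq x)=e^{-x/2}$$, so a sketch needs only the standard library. The helper name `chi2_upper_tail_df2` is illustrative. For other degrees of freedom this shortcut does not apply, and a statistics library (for example `scipy.stats.chi2.sf`) would be the usual choice.

```python
import math


def chi2_upper_tail_df2(x):
    """Upper-tail probability P(X^2 >= x) for a chi-square distribution with
    exactly 2 degrees of freedom, using the closed form exp(-x / 2)."""
    return math.exp(-x / 2.0)


chi2 = 37.5664  # test statistic from part (c)
p_value = chi2_upper_tail_df2(chi2)

print(p_value)          # a very small number, far below 0.001
print(p_value < 0.001)  # True, consistent with the table lookup
```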
(d) If the P-value is less than or equal to the significance level, then the null hypothesis is rejected:
$$P<0.05\Rightarrow\text{Reject }H_0$$
There is sufficient evidence to support the claim that there is an association between student smoking habit and parent smoking habit.
Result: (a)
$$H_0:$$ There is no association between student smoking habit and parent smoking habit.
$$H_a:$$ There is an association between student smoking habit and parent smoking habit.
(b) 332.4874, 1447.5126, 418.2244, 1820.7756, 253.2882, 1102.7118
Expected counts are the counts that we expect based on the row and column totals, when there is no association between the variables.
(c) $$X^2=37.5664$$, $$df=2$$, $$P<0.001$$
(d) There is sufficient evidence to support the claim that there is an association between student smoking habit and parent smoking habit.

### Relevant Questions

A random sample of $$n_1 = 14$$ winter days in Denver gave a sample mean pollution index $$\bar{x}_1 = 43$$.
Previous studies show that $$\sigma_1 = 19$$.
For Englewood (a suburb of Denver), a random sample of $$n_2 = 12$$ winter days gave a sample mean pollution index of $$\bar{x}_2 = 37$$.
Previous studies show that $$\sigma_2 = 13$$.
Assume the pollution index is normally distributed in both Englewood and Denver.
(a) State the null and alternate hypotheses.
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1>\mu_2$$
$$H_0:\mu_1<\mu_2;\ H_1:\mu_1=\mu_2$$
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1<\mu_2$$
$$H_0:\mu_1=\mu_2;\ H_1:\mu_1\neq\mu_2$$
(b) What sampling distribution will you use? What assumptions are you making?
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
(c) What is the value of the sample test statistic? Compute the corresponding z or t value as appropriate. (Test the difference $$\mu_1 - \mu_2$$. Round your answer to two decimal places.)
(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)
(e) Based on your answers in parts (a) to (d), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
(f) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
(g) Find a 99% confidence interval for $$\mu_1 - \mu_2$$.
lower limit
upper limit
(h) Explain the meaning of the confidence interval in the context of the problem.
Because the interval contains only positive numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, we can not say that the mean population pollution index for Englewood is different than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains only negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is less than that of Denver.
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of $$25^{\circ}F$$. However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to $$25^{\circ}F$$. One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a $$5\%$$ level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
$$H_0:\sigma_1^2=\sigma_2^2;\ H_1:\sigma_1^2>\sigma_2^2$$
$$H_0:\sigma_1^2=\sigma_2^2;\ H_1:\sigma_1^2\neq\sigma_2^2$$
$$H_0:\sigma_1^2=\sigma_2^2;\ H_1:\sigma_1^2<\sigma_2^2$$
$$H_0:\sigma_1^2>\sigma_2^2;\ H_1:\sigma_1^2=\sigma_2^2$$
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
$$df_{N} = ?$$
$$df_{D} = ?$$
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
The following is a two-way table showing preferences for an award (A, B, C) by gender for the students sampled in a survey. Test whether the data indicate there is some association between gender and preferred award.
$$\begin{array}{|l|c|c|c|c|}\hline &\text{A}&\text{B}&\text{C}&\text{Total}\\\hline \text{Female} &20&76&73&169\\ \hline \text{Male}&11&73&109&193 \\ \hline \text{Total}&31&149&182&362 \\ \hline \end{array}$$
Chi-square statistic=?
p-value=?
Conclusion: (reject or do not reject $$H_0$$)
Does the test indicate an association between gender and preferred award? (yes/no)
Find the expected count and the contribution to the chi-square statistic for the (Group 1, Yes) cell in the two-way table below.
$$\begin{array}{|l|c|c|c|}\hline&\text{Yes}&\text{No}&\text{Total}\\\hline\text{Group 1} &710 & 277 & 987\\ \hline\text{Group 2}& 1175 & 323&1498\\\hline \text{Total}&1885&600&2485 \\ \hline \end{array}$$
Round your answer for the expected count to one decimal place, and your answer for the contribution to the chi-square statistic to three decimal places.
Expected count=?
contribution to the chi-square statistic=?
Find the expected count and the contribution to the chi-square statistic for the (Control, Disagree) cell in the two-way table below.
$$\begin{array}{|l|c|c|c|c|c|}\hline&\text{Strongly Agree}&\text{Agree}&\text{Neutral}&\text{Disagree}&\text{Strongly Disagree}\\\hline\text{Control} &38&47&2&12&11\\ \hline \text{Treatment}&60&45&9&4&2 \\ \hline \end{array}$$
Round your answer for the expected count to one decimal place, and your answer for the contribution to the chi-square statistic to three decimal places.
Expected count=?
Contribution to the chi-square statistic=?
Is there a relationship between gender and relative finger length? To find out, we randomly selected 452 U.S. high school students who completed a survey. The two-way table summarizes the relationship between gender and which finger was longer on the left hand (index finger or ring finger).
$$\begin{array}{l|c|c|c} & \text{Gender} & & \\ \text{Longer finger} & \text{Female} & \text{Male} & \text{Total} \\\hline \text{Index finger} & 78 & 45 & 123 \\\hline \text{Ring finger} & 82 & 152 & 234 \\\hline \text{Same length} & 52 & 43 & 95 \\\hline \text{Total} & 212 & 240 & 452 \end{array}$$
Suppose we randomly select one of the survey respondents. Define events R: ring finger longer and F: female. Given that the chosen student does not have a longer ring finger, what's the probability that this person is male? Write your answer as a probability statement using correct symbols for the events.
1950 randomly selected adults were asked if they think they are financially better off than their parents. The following table gives the two-way classification of the responses based on the education levels of the persons included in the survey and whether they are financially better off, the same as, or worse off than their parents
$$\begin{array}{|l|c|c|c|}\hline &\text{Less Than High School}&\text{High School}&\text{More Than High School}\\\hline \text{Better off} &140&440&430\\ \hline \text{Same as}&60&230&110\\ \hline \text{Worse off}&180&280&80\\ \hline\end{array}$$
Suppose one adult is selected at random from these 1950 adults. Find the following probability.
$$P(\text{more than high school or worse off})=?$$
A survey of 120 students about which sport (baseball, basketball, football, hockey, or other) they prefer to watch on TV yielded the following two-way frequency table. What is the conditional relative frequency that a student prefers to watch baseball, given that the student is a girl? Round the answer to two decimal places as needed.
$$\begin{array}{|l|c|c|c|c|c|c|}\hline &\text{Baseball}&\text{Basketball}&\text{Football}&\text{Hockey}&\text{Other}&\text{Total}\\\hline \text{Boys} &18&14&20&6&2&60\\ \hline \text{Girls}&14&16&13&5&12&60\\ \hline \text{Total}&32&30&33&11&14&120\\ \hline \end{array}$$
a) 11.67%
b) 23.33%
c) 43.75%
d) 53.33%
A 10 kg object experiences a horizontal force which causes it to accelerate at 5 $$\displaystyle\frac{{m}}{{s}^{{2}}}$$, moving it a distance of 20 m horizontally. How much work is done by the force?