# When is the chi-squared statistic used in hypothesis testing? Include the underlying assumptions and the test statistic for testing a hypothesis on a single population. What is the criterion for rejecting the null hypothesis for both non-directional and directional tests? How do you find the p-value in each case?


2021-02-07
Chi-square statistic for testing a single variance:
The chi-square statistic is used to test a hypothesis about the variance (or standard deviation) of a single population.
The necessary assumptions for the chi-square test of a variance:
The sample should be collected using simple random sampling.
The population from which the sample is drawn should follow a normal distribution.
The data should be continuous.
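The chi-square test statistic is:
$$\displaystyle\chi^{2}=\frac{{{\left({n}-{1}\right)}{s}^{2}}}{{\sigma_{0}^{2}}}$$
$$\displaystyle{n}=$$ Sample size
$$\displaystyle{s}^{2}=$$ Sample variance
$$\displaystyle\sigma_{0}^{2}=$$ Population variance stated in the null hypothesis
When $$\displaystyle{H}_{{0}}$$ is true, this statistic follows a chi-square distribution with $$\displaystyle{n}-{1}$$ degrees of freedom.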
Decision rule based on P-value approach for both directional and non-directional tests:
The level of significance is $$\alpha.$$
If P-value $$\displaystyle\le\alpha$$, then reject the null hypothesis $$\displaystyle{H}_{{0}}$$.
If P-value $$\displaystyle>\alpha$$, then fail to reject the null hypothesis $$\displaystyle{H}_{{0}}$$.
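P-value:
The P-value is obtained from the chi-square distribution (table or software) with $$\displaystyle{n}-{1}$$ degrees of freedom, using the observed value of the test statistic and the type of test:
Right-tailed test (the alternative claims the variance is greater than the hypothesized value): the P-value is the area to the right of the observed statistic.
Left-tailed test (the alternative claims the variance is less than the hypothesized value): the P-value is the area to the left of the observed statistic.
Two-tailed (non-directional) test: the P-value is twice the smaller of the two tail areas.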
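The three P-value cases above can be sketched in code. This is a minimal illustration, not a definitive implementation: the helper names `chi2_cdf` and `variance_test` are made up for this example, and the chi-square CDF is computed with a simple series for the regularized incomplete gamma function so that only the standard library is needed. The numbers come from the bar-examination question listed further down this page (n = 20, s = 70, hypothesized standard deviation 60, significance level 0.01).

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function; adequate for the moderate values used in this sketch.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def variance_test(n, s2, sigma0_sq, tail):
    """Chi-square test for one variance; tail is 'right', 'left' or 'two'."""
    df = n - 1
    stat = df * s2 / sigma0_sq
    left = chi2_cdf(stat, df)    # area to the left of the statistic
    right = 1.0 - left           # area to the right of the statistic
    if tail == "right":
        p = right
    elif tail == "left":
        p = left
    elif tail == "two":
        p = min(1.0, 2.0 * min(left, right))
    else:
        raise ValueError("tail must be 'right', 'left' or 'two'")
    return stat, df, p

# Bar-examination data from the related question on this page.
alpha = 0.01
stat, df, p_value = variance_test(n=20, s2=70**2, sigma0_sq=60**2, tail="two")
print(f"chi-square = {stat:.2f}, df = {df}, P-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

If SciPy is available, `scipy.stats.chi2.cdf` and `scipy.stats.chi2.sf` give the same tail areas without a hand-written series.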
Chi-square statistic for testing a distribution:
The chi-square goodness-of-fit test is used to test whether a sample of data comes from a population with a specific hypothesized distribution. It can also be applied to discrete distributions.
The necessary assumptions for the chi-square goodness-of-fit test are given below:
The sample should be collected using simple random sampling.
The variable of interest must be categorical.
The expected frequency of each category should be at least 5.
The chi-square goodness-of-fit test is always a right-tailed test, so it is a directional test: only large values of the statistic, which indicate large gaps between observed and expected frequencies, count as evidence against the null hypothesis.
The chi-square test statistic is:
$$\displaystyle\chi^{2}={\sum_{{{i}={1}}}^{{k}}}\frac{{{\left({O}_{{i}}-{E}_{{i}}\right)}^{2}}}{{{E}_{{i}}}}$$
$$\displaystyle{O}_{{i}}=$$ Observed frequency in category i
$$\displaystyle{E}_{{i}}=$$ Expected frequency in category i
$$\displaystyle{k}=$$ Number of categories
Decision rule based on P-value approach:
The level of significance is $$\alpha.$$
If P-value $$\displaystyle\le\alpha$$, then reject the null hypothesis $$\displaystyle{H}_{{0}}$$.
If P-value $$\displaystyle>\alpha$$, then fail to reject the null hypothesis $$\displaystyle{H}_{{0}}$$.
The P-value is the area to the right of the observed test statistic under the chi-square distribution with $$\displaystyle{k}-{1}$$ degrees of freedom, obtained from a chi-square table or software.
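The goodness-of-fit calculation can be sketched the same way. The data below are hypothetical (a die rolled 60 times, tested against a fair-die distribution) and are not taken from the question; the helper names `chi2_sf` and `goodness_of_fit` are also made up for this illustration. The degrees of freedom are taken as the number of categories minus one, which assumes no distribution parameters were estimated from the sample.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X >= x) for a chi-square variable with df degrees
    of freedom, via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while abs(term) > 1e-15 * abs(total):
        n += 1
        term *= z / (a + n)
        total += term
    cdf = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - cdf

def goodness_of_fit(observed, expected):
    """Return (statistic, degrees of freedom, right-tailed P-value)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Hypothetical data: 60 rolls of a die, fair-die expectation of 10 per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10.0] * 6
alpha = 0.05

gof_stat, gof_df, gof_p = goodness_of_fit(observed, expected)
print(f"chi-square = {gof_stat:.2f}, df = {gof_df}, P-value = {gof_p:.4f}")
print("reject H0" if gof_p <= alpha else "fail to reject H0")
```

Here every expected frequency is 10, so the rule of thumb that each expected count should be at least 5 is satisfied.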

### Relevant Questions

A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of $$25^{\circ}F$$. However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to $$25^{\circ}F$$. One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a $$5\%$$ level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
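$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}>\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}\neq\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}<\sigma_{2}^{2}$$
$$H_{0}:\sigma_{1}^{2}>\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}=\sigma_{2}^{2}$$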
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
$$df_{N} = ?$$
$$df_{D} = ?$$
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Would you rather spend more federal taxes on art? Of a random sample of $$n_{1} = 86$$ politically conservative voters, $$r_{1} = 18$$ responded yes. Another random sample of $$n_{2} = 85$$ politically moderate voters showed that $$r_{2} = 21$$ responded yes. Does this information indicate that the population proportion of conservative voters inclined to spend more federal tax money on funding the arts is less than the proportion of moderate voters so inclined? Use $$\alpha = 0.05.$$
(a) State the null and alternate hypotheses.
$$H_0:p_{1} = p_{2}, H_{1}:p_{1} > p_2$$
$$H_0:p_{1} = p_{2}, H_{1}:p_{1} < p_2$$
$$H_0:p_{1} = p_{2}, H_{1}:p_{1} \neq p_2$$
$$H_{0}:p_{1} < p_{2}, H_{1}:p_{1} = p_{2}$$
(b) What sampling distribution will you use? What assumptions are you making?
The Student's t. The number of trials is sufficiently large.
The standard normal. The number of trials is sufficiently large.
The standard normal. We assume the population distributions are approximately normal.
The Student's t. We assume the population distributions are approximately normal.
(c) What is the value of the sample test statistic? (Test the difference $$p_{1} - p_{2}$$. Do not use rounded values. Round your final answer to two decimal places.)
(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)
(e) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
(f) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Fail to reject the null hypothesis, there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Fail to reject the null hypothesis, there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Reject the null hypothesis, there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
A random sample of $$\displaystyle{n}_{{1}}={16}$$ communities in western Kansas gave the following information for people under 25 years of age.
$$\displaystyle{X}_{{1}}:$$ Rate of hay fever per 1000 population for people under 25
$$\begin{array}{|c|c|c|c|c|c|c|c|} \hline 97 & 91 & 121 & 129 & 94 & 123 & 112 & 93 \\ \hline 125 & 95 & 125 & 117 & 97 & 122 & 127 & 88 \\ \hline \end{array}$$
A random sample of $$\displaystyle{n}_{{2}}={14}$$ regions in western Kansas gave the following information for people over 50 years old.
$$\displaystyle{X}_{{2}}:$$ Rate of hay fever per 1000 population for people over 50
$$\begin{array}{|c|c|c|c|c|c|c|} \hline 94 & 109 & 99 & 95 & 113 & 88 & 110 \\ \hline 79 & 115 & 100 & 89 & 114 & 85 & 96 \\ \hline \end{array}$$
(i) Use a calculator to calculate $$\displaystyle\overline{{x}}_{{1}},{s}_{{1}},\overline{{x}}_{{2}},{\quad\text{and}\quad}{s}_{{2}}.$$ (Round your answers to two decimal places.)
(ii) Assume that the hay fever rate in each age group has an approximately normal distribution. Do the data indicate that the age group over 50 has a lower rate of hay fever? Use $$\displaystyle\alpha={0.05}.$$
(a) What is the level of significance?
State the null and alternate hypotheses.
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}<\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}>\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}\ne\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}>\mu_{{2}},{H}_{{1}}:\mu_{{1}}=\mu_{{2}}$$
(b) What sampling distribution will you use? What assumptions are you making?
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
What is the value of the sample test statistic? (Test the difference $$\displaystyle\mu_{{1}}-\mu_{{2}}$$. Round your answer to three decimal places.)
(c) Find (or estimate) the P-value.
P-value $$\displaystyle>{0.250}$$
$$\displaystyle{0.125}<\text{P-value}<{0.250}$$
$$\displaystyle{0.050}<\text{P-value}<{0.125}$$
$$\displaystyle{0.025}<\text{P-value}<{0.050}$$
$$\displaystyle{0.005}<\text{P-value}<{0.025}$$
P-value $$\displaystyle<{0.005}$$
Sketch the sampling distribution and show the area corresponding to the P-value.
Which possible statements about the chi-squared distribution are true?
a) The statistic X^2, that is used to estimate the variance S^2 of a random sample, has a Chi-squared distribution.
b) The sum of the squares of k independent standard normal random variables has a Chi-squared distribution with k degrees of freedom.
c) The Chi-squared distribution is used in hypothesis testing and estimation.
d) The Chi-squared distribution is a particular case of the Gamma distribution.
e) All of the above.
1. Find each of the requested values for a population with a mean of $$\mu = 40$$ and a standard deviation of $$\sigma = 8$$.
A. What is the z-score corresponding to $$X = 52$$?
B. What is the X value corresponding to $$z = -0.50$$?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 4$$ scores?
E. What is the z-score corresponding to a sample mean of $$M = 42$$ for a sample of $$n = 6$$ scores?
2. True or false:
a. All normal distributions are symmetrical.
b. All normal distributions have a mean of 1.0.
c. All normal distributions have a standard deviation of 1.0.
d. The total area under the curve of all normal distributions is equal to 1.
3. Interpret the location, direction, and distance (near or far) of the following z-scores: a. $$-2.00$$ b. $$1.25$$ c. $$3.50$$ d. $$-0.34$$
4. You are part of a trivia team and have tracked your team's performance since you started playing, so you know that your scores are normally distributed with $$\mu = 78$$ and $$\sigma = 12$$. Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks' worth of score data: $$82, 74, 62, 68, 79, 94, 90, 81, 80$$.
5. You get hired as a server at a local restaurant, and the manager tells you that servers' tips are $42 on average but vary about $12 ($$\mu = 42, \sigma = 12$$). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don't know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the $$\alpha = 0.05$$ level of significance.
Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of $$\alpha = 0.05$$. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)
Lemons and Car Crashes. Listed below are annual data for various years. The data are weights (metric tons) of lemons imported from Mexico and U.S. car crash fatality rates per 100,000 population [based on data from "The Trouble with QSAR (or How I Learned to Stop Worrying and Embrace Fallacy)," by Stephen Johnson, Journal of Chemical Information and Modeling, Vol. 48, No. 1]. Is there sufficient evidence to conclude that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates? Do the results suggest that imported lemons cause car fatalities?
$$\begin{matrix} \text{Lemon Imports} & 230 & 265 & 358 & 480 & 530\\ \text{Crash Fatality Rate} & 15.9 & 15.7 & 15.4 & 15.3 & 14.9\\ \end{matrix}$$
A factor in determining the usefulness of an examination as a measure of demonstrated ability is the amount of spread that occurs in the grades. If the spread or variation of examination scores is very small, it usually means that the examination was either too hard or too easy. However, if the variance of scores is moderately large, then there is a definite difference in scores between "better," "average," and "poorer" students. A group of attorneys in a Midwest state has been given the task of making up this year's bar examination for the state. The examination has 500 total possible points, and from the history of past examinations, it is known that a standard deviation of around 60 points is desirable. Of course, too large or too small a standard deviation is not good. The attorneys want to test their examination to see how good it is. A preliminary version of the examination (with slight modifications to protect the integrity of the real examination) is given to a random sample of 20 newly graduated law students. Their scores give a sample standard deviation of 70 points. Using a 0.01 level of significance, test the claim that the population standard deviation for the new examination is 60 against the claim that the population standard deviation is different from 60.
(a) What is the level of significance?
State the null and alternate hypotheses.
$$H_{0}:\sigma=60,\ H_{1}:\sigma<60$$
$$H_{0}:\sigma>60,\ H_{1}:\sigma=60$$
$$H_{0}:\sigma=60,\ H_{1}:\sigma>60$$
$$H_{0}:\sigma=60,\ H_{1}:\sigma\neq60$$
(b) Find the value of the chi-square statistic for the sample. (Round your answer to two decimal places.)
What are the degrees of freedom?
What assumptions are you making about the original distribution?
We assume a binomial population distribution.
We assume an exponential population distribution.
We assume a normal population distribution.
We assume a uniform population distribution.
For a two-tailed hypothesis test with level of significance $$\alpha$$ and null hypothesis $$H_{0} : \mu = k$$, we reject $$H_{0}$$ whenever k falls outside the $$c = 1 - \alpha$$ confidence interval for $$\mu$$ based on the sample data. When k falls within the $$c = 1 - \alpha$$ confidence interval, we do not reject $$H_{0}$$.
For a one-tailed hypothesis test with level of significance $$\alpha$$ and null hypothesis $$H_{0} : \mu = k$$, we reject $$H_{0}$$ whenever k falls outside the $$c = 1 - 2\alpha$$ confidence interval for $$\mu$$ based on the sample data. When k falls within the $$c = 1 - 2\alpha$$ confidence interval, we do not reject $$H_{0}$$.
A corresponding relationship between confidence intervals and two-tailed hypothesis tests is also valid for other parameters, such as $$p$$, $$\mu_{1} - \mu_{2}$$, and $$p_{1} - p_{2}$$.
(a) Consider the hypotheses $$H_{0} : \mu_{1} - \mu_{2} = 0$$ and $$H_{1} : \mu_{1} - \mu_{2} \neq 0$$. Suppose a 95% confidence interval for $$\mu_{1} - \mu_{2}$$ contains only positive numbers. Should you reject the null hypothesis when $$\alpha = 0.05$$? Why or why not?
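The two-tailed relationship described above can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses a z-test with known standard deviation, the function name `z_test_and_interval` is made up, and the numbers are taken from the server-tips question on this page (hypothesized mean 42, standard deviation 12, n = 16, sample mean 44.50, significance level 0.05). It compares the P-value decision with the confidence-interval decision.

```python
import math
from statistics import NormalDist

def z_test_and_interval(sample_mean, mu0, sigma, n, alpha):
    """Two-tailed z-test for a mean plus the matching 1 - alpha interval."""
    se = sigma / math.sqrt(n)                       # standard error of the mean
    z = (sample_mean - mu0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))    # two-tailed P-value
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)    # critical z for the interval
    ci = (sample_mean - z_crit * se, sample_mean + z_crit * se)
    reject_by_p = p_value <= alpha
    reject_by_ci = not (ci[0] <= mu0 <= ci[1])      # reject when mu0 is outside
    return z, p_value, ci, reject_by_p, reject_by_ci

z, p_value, ci, reject_by_p, reject_by_ci = z_test_and_interval(
    sample_mean=44.50, mu0=42, sigma=12, n=16, alpha=0.05
)
print(f"z = {z:.3f}, P-value = {p_value:.4f}")
print(f"95% confidence interval: ({ci[0]:.2f}, {ci[1]:.2f})")
print("P-value decision:", "reject H0" if reject_by_p else "fail to reject H0")
print("Interval decision:", "reject H0" if reject_by_ci else "fail to reject H0")
```

Both decisions agree because the hypothesized mean falls outside the interval exactly when the absolute z statistic exceeds the critical value used to build it.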