When is the chi-squared statistic used in hypothesis testing? Include the underlying assumptions and the test statistic for testing a hypothesis about a single population. What is the criterion for rejecting the null hypothesis for both non-directional and directional tests? How do you find the P-value in each case?

Question
Modeling data distributions
asked 2021-02-06

Answers (1)

2021-02-07
Chi-square statistic for testing a single variance:
The chi-square statistic is used to test a claim about the variance (or standard deviation) of a single population, based on one sample.
The necessary assumptions for the chi-square test for a variance:
The sample should be collected using simple random sampling.
The population from which the sample is drawn should follow normal distribution.
The data should be continuous.
The chi-square test statistic is obtained as given below:
\(\displaystyle\chi^{2}=\frac{{{\left({n}-{1}\right)}{s}^{2}}}{{\sigma^{2}}}\)
\(\displaystyle{n}=\) Sample size
\(\displaystyle{s}^{2}=\) Sample variance
\(\displaystyle\sigma^{2}=\) Population variance specified by the null hypothesis
Decision rule based on P-value approach for both directional and non-directional tests:
The level of significance is \(\alpha.\)
If P-value \(\displaystyle\le\alpha\), then reject the null hypothesis \(\displaystyle{H}_{{0}}\).
If P-value \(\displaystyle>\alpha\), then fail to reject the null hypothesis \(\displaystyle{H}_{{0}}\).
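The same decision can also be stated with critical values instead of P-values. The sketch below assumes the notation \(\chi^{2}_{p,\,n-1}\) for the value that has area p to its right under the chi-square distribution with n - 1 degrees of freedom, and writes \(\sigma_{0}^{2}\) for the hypothesized variance; both symbols are introduced here for illustration and are not part of the answer above.

```latex
\begin{align*}
\text{Right-tailed } (H_1:\ \sigma^2 > \sigma_0^2):&\quad
  \text{reject } H_0 \text{ if } \chi^2 \ge \chi^2_{\alpha,\,n-1} \\
\text{Left-tailed } (H_1:\ \sigma^2 < \sigma_0^2):&\quad
  \text{reject } H_0 \text{ if } \chi^2 \le \chi^2_{1-\alpha,\,n-1} \\
\text{Two-tailed } (H_1:\ \sigma^2 \ne \sigma_0^2):&\quad
  \text{reject } H_0 \text{ if } \chi^2 \le \chi^2_{1-\alpha/2,\,n-1}
  \ \text{ or }\ \chi^2 \ge \chi^2_{\alpha/2,\,n-1}
\end{align*}
```

Because the chi-square distribution is not symmetric, the two critical values in the two-tailed case have to be looked up separately.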
P-value:
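The P-value is obtained from the chi-square distribution (table or software) using the value of the test statistic, the degrees of freedom \(\displaystyle{\left({n}-{1}\right)}\), and the type of test. For a right-tailed test it is the area to the right of the test statistic; for a left-tailed test it is the area to the left; for a two-tailed (non-directional) test it is twice the smaller of those two tail areas.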
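The P-value lookup for the variance test can be sketched in Python as follows. This is a minimal illustration using only the standard library: the helper names `chi2_cdf` and `variance_test` are hypothetical, the tail area is computed with a simple power series rather than a table, and the numbers (n = 20, s = 70, hypothesized standard deviation 60, level 0.01) are assumed for the example rather than taken from the answer.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for a chi-square variable with df degrees of freedom.

    Uses the power series for the regularized lower incomplete gamma
    function, which is adequate for moderate textbook-sized values.
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def variance_test(n, s2, sigma0_sq, tail):
    """Return (statistic, p_value); tail is 'right', 'left', or 'two'."""
    stat = (n - 1) * s2 / sigma0_sq
    lower = chi2_cdf(stat, n - 1)   # area to the left of the statistic
    upper = 1.0 - lower             # area to the right of the statistic
    if tail == "right":
        p = upper
    elif tail == "left":
        p = lower
    elif tail == "two":
        p = min(1.0, 2.0 * min(lower, upper))
    else:
        raise ValueError("tail must be 'right', 'left', or 'two'")
    return stat, p

# Assumed example values: n = 20, s = 70, hypothesized sigma = 60, two-tailed.
alpha = 0.01
stat, p = variance_test(20, 70 ** 2, 60 ** 2, "two")
print(f"chi-square = {stat:.2f}, P-value = {p:.4f}")
print("reject H0" if p <= alpha else "fail to reject H0")
```

In practice a statistics library or a printed table would replace `chi2_cdf`; the point of the sketch is only how the tail choice changes which area is reported.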
Chi-square statistic for testing distribution:
The chi-square goodness-of-fit test is used to test whether sample data are consistent with a hypothesized distribution, that is, whether the sample comes from a population with a specific distribution. It can also be applied to discrete distributions.
The necessary assumptions for Chi-square test for goodness of fit are given below:
The sample should be collected using simple random sampling.
The variable of interest must be categorical.
The expected frequency of each cell should be at least 5.
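Large differences between observed and expected frequencies, in either direction, make the test statistic large, so only large values of the statistic count as evidence against \(\displaystyle{H}_{{0}}\).
The chi-square goodness-of-fit test is therefore a right-tailed test, and in that sense a directional test.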
The chi-square test statistic is obtained as given below:
\(\displaystyle\chi^{2}={\sum_{{{i}={1}}}^{{n}}}\frac{{{\left({O}_{{i}}-{E}_{{i}}\right)}^{2}}}{{{E}_{{i}}}}\)
\(\displaystyle{O}_{{i}}=\) Observed frequency
\(\displaystyle{E}_{{i}}=\) Expected frequency
\(\displaystyle{n}=\) Number of categories
Decision rule based on P-value approach:
The level of significance is \(\alpha.\)
If P-value \(\displaystyle\le\alpha\), then reject the null hypothesis \(\displaystyle{H}_{{0}}\).
If P-value \(\displaystyle>\alpha\), then fail to reject the null hypothesis \(\displaystyle{H}_{{0}}\).
The P-value is the area to the right of the test statistic under the chi-square distribution with \(\displaystyle{\left({n}-{1}\right)}\) degrees of freedom, where n here is the number of categories. It can be read from the chi-square distribution table for the right-tailed test.
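The goodness-of-fit calculation can be sketched the same way. This is a minimal standard-library illustration, not a reference implementation: the helper names `chi2_sf` and `goodness_of_fit` are hypothetical, the observed counts are made-up die rolls, and the degrees of freedom are taken as the number of categories minus one, which assumes no distribution parameters were estimated from the data.

```python
import math

def chi2_sf(x, df):
    """Upper-tail area P(X >= x) for a chi-square variable with df d.o.f."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    # Power series for the regularized lower incomplete gamma function.
    while term > total * 1e-15:
        term *= z / (a + k)
        total += term
        k += 1
    cdf = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - cdf

def goodness_of_fit(observed, expected):
    """Return (statistic, degrees_of_freedom, right_tailed_p_value)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, chi2_sf(stat, df)

# Assumed example: 120 rolls of a die, H0 says every face is equally likely.
observed = [18, 22, 16, 25, 19, 20]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 per face
alpha = 0.05
stat, df, p = goodness_of_fit(observed, expected)
print(f"chi-square = {stat:.2f}, df = {df}, P-value = {p:.4f}")
print("reject H0" if p <= alpha else "fail to reject H0")
```

Every expected count here is 20, so the rule of thumb that each expected frequency be at least 5 is satisfied.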

Relevant Questions

asked 2021-01-17
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of \(25^{\circ}F\). However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to \(25^{\circ}F\). One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a \(5\%\) level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
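\(H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}>\sigma_{2}^{2}\)
\(H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}\neq\sigma_{2}^{2}\)
\(H_{0}:\sigma_{1}^{2}=\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}<\sigma_{2}^{2}\)
\(H_{0}:\sigma_{1}^{2}>\sigma_{2}^{2},\ H_{1}:\sigma_{1}^{2}=\sigma_{2}^{2}\)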
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
\(df_{N} = ?\)
\(df_{D} = ?\)
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
asked 2020-12-07
Would you rather spend more federal taxes on art? Of a random sample of \(n_{1} = 86\) politically conservative voters, \(r_{1} = 18\) responded yes. Another random sample of \(n_{2} = 85\) politically moderate voters showed that \(r_{2} = 21\) responded yes. Does this information indicate that the population proportion of conservative voters inclined to spend more federal tax money on funding the arts is less than the proportion of moderate voters so inclined? Use \(\alpha = 0.05.\)
(a) State the null and alternate hypotheses.
\(H_0:p_{1} = p_{2}, H_{1}:p_{1} > p_2\)
\(H_0:p_{1} = p_{2}, H_{1}:p_{1} < p_2\)
\(H_0:p_{1} = p_{2}, H_{1}:p_{1} \neq p_2\)
\(H_{0}:p_{1} < p_{2}, H_{1}:p_{1} = p_{2}\)
(b) What sampling distribution will you use? What assumptions are you making?
The Student's t. The number of trials is sufficiently large.
The standard normal. The number of trials is sufficiently large.
The standard normal. We assume the population distributions are approximately normal.
The Student's t. We assume the population distributions are approximately normal.
(c) What is the value of the sample test statistic? (Test the difference \(p_{1} - p_{2}\). Do not use rounded values. Round your final answer to two decimal places.)
(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)
(e) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level alpha?
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are statistically significant.
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are not statistically significant.
(f) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Fail to reject the null hypothesis, there is sufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Fail to reject the null hypothesis, there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
Reject the null hypothesis, there is insufficient evidence that the proportion of conservative voters favoring more tax dollars for the arts is less than the proportion of moderate voters.
asked 2020-10-23
A random sample of \(\displaystyle{n}_{{1}}={16}\) communities in western Kansas gave the following information for people under 25 years of age.
\(\displaystyle{X}_{{1}}:\) Rate of hay fever per 1000 population for people under 25
\(\begin{array}{|c|c|c|c|c|c|c|c|} \hline 97 & 91 & 121 & 129 & 94 & 123 & 112 & 93\\ \hline 125 & 95 & 125 & 117 & 97 & 122 & 127 & 88 \\ \hline \end{array}\)
A random sample of \(\displaystyle{n}_{{2}}={14}\) regions in western Kansas gave the following information for people over 50 years old.
\(\displaystyle{X}_{{2}}:\) Rate of hay fever per 1000 population for people over 50
\(\begin{array}{|c|c|c|c|c|c|c|} \hline 94 & 109 & 99 & 95 & 113 & 88 & 110\\ \hline 79 & 115 & 100 & 89 & 114 & 85 & 96\\ \hline \end{array}\)
(i) Use a calculator to calculate \(\displaystyle\overline{{x}}_{{1}},{s}_{{1}},\overline{{x}}_{{2}},{\quad\text{and}\quad}{s}_{{2}}.\) (Round your answers to two decimal places.)
(ii) Assume that the hay fever rate in each age group has an approximately normal distribution. Do the data indicate that the age group over 50 has a lower rate of hay fever? Use \(\displaystyle\alpha={0.05}.\)
(a) What is the level of significance?
State the null and alternate hypotheses.
\(\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}<\mu_{{2}}\)
\(\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}>\mu_{{2}}\)
\(\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}\ne\mu_{{2}}\)
\(\displaystyle{H}_{{0}}:\mu_{{1}}>\mu_{{2}},{H}_{{1}}:\mu_{{1}}=\mu_{{2}}\)
(b) What sampling distribution will you use? What assumptions are you making?
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
What is the value of the sample test statistic? (Test the difference \(\displaystyle\mu_{{1}}-\mu_{{2}}\). Round your answer to three decimal places.)
(c) Find (or estimate) the P-value.
P-value \(\displaystyle>{0.250}\)
\(\displaystyle{0.125}<\text{P-value}<{0.250}\)
\(\displaystyle{0.050}<\text{P-value}<{0.125}\)
\(\displaystyle{0.025}<\text{P-value}<{0.050}\)
\(\displaystyle{0.005}<\text{P-value}<{0.025}\)
P-value \(\displaystyle<{0.005}\)
Sketch the sampling distribution and show the area corresponding to the P-value.
asked 2021-01-19
Which possible statements about the chi-squared distribution are true?
a) The statistic X^2, that is used to estimate the variance S^2 of a random sample, has a Chi-squared distribution.
b) The sum of the squares of k independent standard normal random variables has a Chi-squared distribution with k degrees of freedom.
c) The Chi-squared distribution is used in hypothesis testing and estimation.
d) The Chi-squared distribution is a particular case of the Gamma distribution.
e) All of the above.
asked 2020-10-23
1. Find each of the requested values for a population with a mean of \(\mu = 40\) and a standard deviation of \(\sigma = 8\).
A. What is the z-score corresponding to \(X = 52\)?
B. What is the X value corresponding to \(z = -0.50\)?
C. If all of the scores in the population are transformed into z-scores, what will be the values for the mean and standard deviation for the complete set of z-scores?
D. What is the z-score corresponding to a sample mean of \(M = 42\) for a sample of \(n = 4\) scores?
E. What is the z-score corresponding to a sample mean of \(M = 42\) for a sample of \(n = 6\) scores?
2. True or false:
a. All normal distributions are symmetrical.
b. All normal distributions have a mean of 1.0.
c. All normal distributions have a standard deviation of 1.0.
d. The total area under the curve of all normal distributions is equal to 1.
3. Interpret the location, direction, and distance (near or far) of the following z-scores:
a. \(-2.00\)
b. \(1.25\)
c. \(3.50\)
d. \(-0.34\)
4. You are part of a trivia team and have tracked your team’s performance since you started playing, so you know that your scores are normally distributed with \(\mu = 78\) and \(\sigma = 12\). Recently, a new person joined the team, and you think the scores have gotten better. Use hypothesis testing to see if the average score has improved based on the following 8 weeks’ worth of score data: \(82, 74, 62, 68, 79, 94, 90, 81, 80\).
5. You get hired as a server at a local restaurant, and the manager tells you that servers’ tips are $42 on average but vary about $12 (\(\mu = 42, \sigma = 12\)). You decide to track your tips to see if you make a different amount, but because this is your first job as a server, you don’t know if you will make more or less in tips. After working 16 shifts, you find that your average nightly amount is $44.50 from tips. Test for a difference between this value and the population mean at the \(\alpha = 0.05\) level of significance.
asked 2020-11-08
Testing for a Linear Correlation. In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of \(\alpha = 0.05\). Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.) Lemons and Car Crashes Listed below are annual data for various years. The data are weights (metric tons) of lemons imported from Mexico and U.S. car crash fatality rates per 100,000 population [based on data from “The Trouble with QSAR (or How I Learned to Stop Worrying and Embrace Fallacy),” by Stephen Johnson, Journal of Chemical Information and Modeling, Vol. 48, No. 1]. Is there sufficient evidence to conclude that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates? Do the results suggest that imported lemons cause car fatalities? \(\begin{matrix} \text{Lemon Imports} & 230 & 265 & 358 & 480 & 530\\ \text{Crash Fatality Rate} & 15.9 & 15.7 & 15.4 & 15.3 & 14.9\\ \end{matrix}\)
asked 2021-01-31
A factor in determining the usefulness of an examination as a measure of demonstrated ability is the amount of spread that occurs in the grades. If the spread or variation of examination scores is very small, it usually means that the examination was either too hard or too easy. However, if the variance of scores is moderately large, then there is a definite difference in scores between "better," "average," and "poorer" students. A group of attorneys in a Midwest state has been given the task of making up this year's bar examination for the state. The examination has 500 total possible points, and from the history of past examinations, it is known that a standard deviation of around 60 points is desirable. Of course, too large or too small a standard deviation is not good. The attorneys want to test their examination to see how good it is. A preliminary version of the examination (with slight modifications to protect the integrity of the real examination) is given to a random sample of 20 newly graduated law students. Their scores give a sample standard deviation of 70 points. Using a 0.01 level of significance, test the claim that the population standard deviation for the new examination is 60 against the claim that the population standard deviation is different from 60.
(a) What is the level of significance?
State the null and alternate hypotheses.
\(H_{0}:\sigma=60,\ H_{1}:\sigma\ <\ 60\)
\(H_{0}:\sigma\ >\ 60,\ H_{1}:\sigma=60\)
\(H_{0}:\sigma=60,\ H_{1}:\sigma\ >\ 60\)
\(H_{0}:\sigma=60,\ H_{1}:\sigma\ \neq\ 60\)
(b) Find the value of the chi-square statistic for the sample. (Round your answer to two decimal places.)
What are the degrees of freedom?
What assumptions are you making about the original distribution?
We assume a binomial population distribution.
We assume an exponential population distribution.
We assume a normal population distribution.
We assume a uniform population distribution.
asked 2020-12-24
Is there a relationship between confidence intervals and two-tailed hypothesis tests? The answer is yes. Let c be the level of confidence used to construct a confidence interval from sample data. Let \(\alpha\) be the level of significance for a two-tailed hypothesis test. The following statement applies to hypothesis tests of the mean: For a two-tailed hypothesis test with level of significance \(\alpha\) and null hypothesis \(H_{0} : \mu = k\), we reject \(H_{0}\) whenever k falls outside the \(c = 1 - \alpha\) confidence interval for \(\mu\) based on the sample data. When k falls within the \(c = 1 - \alpha\) confidence interval, we do not reject \(H_{0}\). For a one-tailed hypothesis test with level of significance \(\alpha\) and null hypothesis \(H_{0} : \mu = k\), we reject \(H_{0}\) whenever k falls outside the \(c = 1 - 2\alpha\) confidence interval for \(\mu\) based on the sample data. When k falls within the \(c = 1 - 2\alpha\) confidence interval, we do not reject \(H_{0}\). A corresponding relationship between confidence intervals and two-tailed hypothesis tests is also valid for other parameters, such as p, \(\mu_{1} - \mu_{2}\), and \(p_{1} - p_{2}\). (b) Consider the hypotheses \(H_{0} : p_{1} - p_{2} = 0\) and \(H_{1} : p_{1} - p_{2} \neq 0\). Suppose a 98% confidence interval for \(p_{1} - p_{2}\) contains only positive numbers. Should you reject the null hypothesis when \(\alpha = 0.05\)? Why or why not?
asked 2021-01-31
Is there a relationship between confidence intervals and two-tailed hypothesis tests? The answer is yes. Let c be the level of confidence used to construct a confidence interval from sample data. Let \(\alpha\) be the level of significance for a two-tailed hypothesis test. The following statement applies to hypothesis tests of the mean:
For a two-tailed hypothesis test with level of significance \(\alpha\) and null hypothesis \(H_{0} : \mu = k\), we reject \(H_{0}\) whenever k falls outside the \(c = 1 - \alpha\) confidence interval for \(\mu\) based on the sample data. When k falls within the \(c = 1 - \alpha\) confidence interval, we do not reject \(H_{0}\).
For a one-tailed hypothesis test with level of significance \(\alpha\) and null hypothesis \(H_{0} : \mu = k\), we reject \(H_{0}\) whenever k falls outside the \(c = 1 - 2\alpha\) confidence interval for \(\mu\) based on the sample data. When k falls within the \(c = 1 - 2\alpha\) confidence interval, we do not reject \(H_{0}\).
A corresponding relationship between confidence intervals and two-tailed hypothesis tests is also valid for other parameters, such as p, \(\mu_{1} - \mu_{2}\), and \(p_{1} - p_{2}\).
(a) Consider the hypotheses \(H_{0} : \mu_{1} - \mu_{2} = 0\) and \(H_{1} : \mu_{1} - \mu_{2} \neq 0\). Suppose a 95% confidence interval for \(\mu_{1} - \mu_{2}\) contains only positive numbers. Should you reject the null hypothesis when \(\alpha = 0.05\)? Why or why not?
asked 2020-12-07
Hypothesis Testing Review
For each problem below, simply identify the null and alternative hypotheses. Use appropriate notation/symbols. You do not have to run any hypothesis tests, although it's good practice and I'll post answers for all of them.
1) A simple random sample of 44 men from a normally distributed population results in a standard deviation of 10.7 beats per minute. The normal range of pulse rates of adults is typically given as 60 to 100 beats per minute. If the range rule of thumb is applied to that normal range, the result is a standard deviation of 10 beats per minute. Use the sample results with a 0.10 significance level to test the claim that pulse rates of men have a standard deviation equal to 10 beats per minute.
2) In 1997, a survey of 880 households showed that 145 of them use e-mail. Use those sample results to test the claim that more than 15% of households use e-mail. Use a 0.05 significance level.
...