# (6%) When we compare two populations, should we always use an experiment with matched pairs? Provide correct and appropriate reasons, whether your answer is yes or no.


2021-02-15
No. The right procedure depends on whether the two samples are dependent (the same subjects, or subjects matched one-to-one) or independent (drawn from two unrelated groups).
For example, suppose the blood pressure (BP) of 10 patients is checked before medication, and the BP of the same 10 patients is checked again 2 hours after the medication to see its effect. This calls for a matched pairs t test, because every "before" value is paired with an "after" value from the same patient. In a matched pairs test, the two samples necessarily have the same size.
When the underlying populations are different, we use a two-sample (t or z) test. For example, a sample of 25 students at AAA college had a mean SAT score of 560, and another sample of 30 students at BB college had a mean SAT score of 495. The populations here are completely different and the samples are not even the same size, so the observations cannot be paired and we do not use a matched pairs t test.
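To make the difference concrete, here is a minimal sketch that computes both test statistics on the same numbers. The blood pressure readings are hypothetical (the answer above gives no data), and the function names `paired_t` and `welch_t` are illustrative only.

```python
import math
import statistics

# Hypothetical BP readings (mmHg) for the same 10 patients; not from the source.
before = [142, 150, 138, 160, 155, 147, 151, 149, 158, 144]
after = [135, 148, 130, 152, 150, 141, 149, 140, 151, 139]


def paired_t(x, y):
    """Matched pairs t statistic: a one-sample t computed on the differences."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))


def welch_t(x, y):
    """Two-sample t statistic for independent samples (unpooled variances)."""
    se = math.sqrt(statistics.variance(x) / len(x) + statistics.variance(y) / len(y))
    return (statistics.mean(x) - statistics.mean(y)) / se


t_paired = paired_t(before, after)
t_independent = welch_t(before, after)
print(f"matched pairs t = {t_paired:.2f} (df = {len(before) - 1})")
print(f"two-sample t on the same numbers = {t_independent:.2f}")
```

With data like these, pairing removes the patient-to-patient variation, so the matched pairs statistic is much larger than the two-sample statistic. That gain is only available when the observations really are paired.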

### Relevant Questions

A random sample of $$n_1 = 14$$ winter days in Denver gave a sample mean pollution index $$\bar{x}_1 = 43$$.
Previous studies show that $$\sigma_1 = 19$$.
For Englewood (a suburb of Denver), a random sample of $$n_2 = 12$$ winter days gave a sample mean pollution index of $$\bar{x}_2 = 37$$.
Previous studies show that $$\sigma_2 = 13$$.
Assume the pollution index is normally distributed in both Englewood and Denver.
(a) State the null and alternate hypotheses.
$$H_0: \mu_1 = \mu_2;\quad H_1: \mu_1 > \mu_2$$
$$H_0: \mu_1 < \mu_2;\quad H_1: \mu_1 = \mu_2$$
$$H_0: \mu_1 = \mu_2;\quad H_1: \mu_1 < \mu_2$$
$$H_0: \mu_1 = \mu_2;\quad H_1: \mu_1 \neq \mu_2$$
(b) What sampling distribution will you use? What assumptions are you making?
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
(c) What is the value of the sample test statistic? Compute the corresponding z or t value as appropriate.
(Test the difference $$\mu_1 - \mu_2$$. Round your answer to two decimal places.)
(d) Find (or estimate) the P-value. (Round your answer to four decimal places.)
(e) Based on your answers in parts (a) to (d), will you reject or fail to reject the null hypothesis? Are the data statistically significant at level $$\alpha$$?
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.01$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
(f) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is insufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
Fail to reject the null hypothesis, there is sufficient evidence that there is a difference in mean pollution index for Englewood and Denver.
(g) Find a 99% confidence interval for $$\mu_1 - \mu_2$$.
lower limit
upper limit
(h) Explain the meaning of the confidence interval in the context of the problem.
Because the interval contains only positive numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, we can not say that the mean population pollution index for Englewood is different than that of Denver.
Because the interval contains both positive and negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is greater than that of Denver.
Because the interval contains only negative numbers, this indicates that at the 99% confidence level, the mean population pollution index for Englewood is less than that of Denver.
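Because both population standard deviations are given, the computation in parts (c), (d) and (g) is a two-sample z procedure. The sketch below works it through with the numbers from the problem. The problem text does not state the claim being tested, so the two-sided alternative $$\mu_1 \neq \mu_2$$ is an assumption here. The variable names are illustrative.

```python
import math
from statistics import NormalDist

# Summary statistics taken from the problem statement.
n1, xbar1, sigma1 = 14, 43, 19  # Denver
n2, xbar2, sigma2 = 12, 37, 13  # Englewood
alpha = 0.01

# Standard error of xbar1 - xbar2 when both sigmas are known.
se = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)
z = (xbar1 - xbar2) / se

std_normal = NormalDist()
# ASSUMPTION: two-sided alternative, since the claim is not given in the text.
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# 99% confidence interval for mu1 - mu2.
z_crit = std_normal.inv_cdf(1 - alpha / 2)
lower = (xbar1 - xbar2) - z_crit * se
upper = (xbar1 - xbar2) + z_crit * se

print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print(f"99% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

Under that assumption the P-value is well above 0.01 and the interval contains both negative and positive values, which is consistent with failing to reject the null hypothesis.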
The table below shows the number of people for three different race groups who were shot by police that were either armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.
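$$\begin{array}{|c|c|c|c|} \hline \text{Race} & \text{Suspect was armed} & \text{Suspect was unarmed} & \text{Total}\\ \hline \text{Black} & 543 & 60 & 603\\ \hline \text{White} & 1176 & 67 & 1243\\ \hline \text{Hispanic} & 378 & 38 & 416\\ \hline \text{Total} & 2097 & 165 & 2262\\ \hline \end{array}$$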
Give your answer as a decimal to at least three decimal places.
a) What percent are Black?
b) What percent are Unarmed?
c) In order for two variables to be independent of each other, we need $$P(A \text{ and } B) = P(A) \cdot P(B).$$
This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).
Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).
Remember, the previous answer is only correct if the variables are Independent.
d) Now let's get the real percent that are Black and Unarmed by using the table?
If answer c is "significantly different" than answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
Let's compare the percentage of unarmed shot for each race.
e) What percent are White and Unarmed?
f) What percent are Hispanic and Unarmed?
If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.
Why is that?
This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.
Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades
The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.
g) What percent of blacks shot and killed by police were unarmed?
h) What percent of whites shot and killed by police were unarmed?
i) What percent of Hispanics shot and killed by police were unarmed?
You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
j) Why do you believe this is happening?
Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.
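The marginal, joint and conditional percentages asked for in parts a) through i) can be checked with a short script. This is a sketch using the counts from the table above. The dictionary names are illustrative.

```python
# Counts from the table in the problem.
armed = {"Black": 543, "White": 1176, "Hispanic": 378}
unarmed = {"Black": 60, "White": 67, "Hispanic": 38}

total = sum(armed.values()) + sum(unarmed.values())

# Parts a) and b): marginal proportions.
p_black = (armed["Black"] + unarmed["Black"]) / total
p_unarmed = sum(unarmed.values()) / total

# Part c): what P(Black and Unarmed) would be under independence.
product_if_independent = p_black * p_unarmed

# Parts d) to f): joint proportions, out of everyone in the table.
joint = {race: unarmed[race] / total for race in unarmed}

# Parts g) to i): conditional proportions, out of each race's own total.
conditional = {race: unarmed[race] / (armed[race] + unarmed[race]) for race in unarmed}

print(f"P(Black) = {p_black:.3f}, P(Unarmed) = {p_unarmed:.3f}")
print(f"P(Black) * P(Unarmed) = {product_if_independent:.3f}")
for race in unarmed:
    print(f"{race}: joint = {joint[race]:.3f}, P(Unarmed | {race}) = {conditional[race]:.3f}")
```

The joint proportions are divided by the grand total, so the largest group dominates them. The conditional proportions are divided by each group's own total, which is why they are the fair comparison.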
When should we use pooled, non-pooled, or paired output to compare the means of two populations? You need to be specific and give the correct answer.
Provide an example of two independent samples and another example of dependent samples, giving your reasoning.
To compare two means or two proportions, one works with two groups. The groups are classified as either independent or matched pairs. Independent groups means that the two samples taken are independent, that is, sample values selected from one population are not related in any way to sample values selected from the other population. Matched pairs consist of two samples that are dependent. The parameter tested using matched pairs is the population mean of the differences. The parameters tested using independent groups are either population means or population proportions.
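For independent samples, the pooled and non-pooled procedures differ only in the standard error and the degrees of freedom. The sketch below shows both, assuming the usual textbook formulas (pooled variance with $$n_1 + n_2 - 2$$ degrees of freedom, and the Welch-Satterthwaite approximation for the non-pooled case). The sample standard deviations are hypothetical, and the function names are illustrative.

```python
import math


def pooled_se_df(s1, n1, s2, n2):
    """Pooled procedure: assumes the two population variances are equal."""
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return math.sqrt(sp2 * (1 / n1 + 1 / n2)), n1 + n2 - 2


def welch_se_df(s1, n1, s2, n2):
    """Non-pooled procedure: no equal-variance assumption."""
    v1, v2 = s1**2 / n1, s2**2 / n2
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return math.sqrt(v1 + v2), df


# Hypothetical sample standard deviations for two independent samples.
s1, n1 = 80, 25
s2, n2 = 95, 30

se_pooled, df_pooled = pooled_se_df(s1, n1, s2, n2)
se_welch, df_welch = welch_se_df(s1, n1, s2, n2)
print(f"pooled:     SE = {se_pooled:.2f}, df = {df_pooled}")
print(f"non-pooled: SE = {se_welch:.2f}, df = {df_welch:.1f}")
```

The pooled version is appropriate only when the two population standard deviations can reasonably be assumed equal. The paired procedure is a different case altogether, since it applies to dependent samples.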

When a gas is taken from a to c along the curved path in the figure (Figure 1), the work done by the gas is W = -40 J and the heat added to the gas is Q = -140 J. Along path abc, the work done by the gas is W = -50 J. (That is, 50 J of work is done on the gas.)
I keep on missing Part D. The answer for Part D is not -150, 150, -155, 108, or 105 (105 was close, but it said "not quite, check calculations").
Part A
What is Q for path abc?
Express your answer to two significant figures and include the appropriate units.
Part B
If $$P_c = \frac{1}{2}P_b$$, what is W for path cda?
Express your answer to two significant figures and include the appropriate units.
Part C
What is Q for path cda?
Express your answer to two significant figures and include the appropriate units.
Part D
What is $$U_a - U_c$$?
Express your answer to two significant figures and include the appropriate units.
Part E
If $$U_d - U_c = 42$$ J, what is Q for path da?
Express your answer to two significant figures and include the appropriate units.
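Parts A and D can be worked from the first law of thermodynamics alone, without the figure. The following is a sketch, assuming the sign convention stated in the problem (W is the work done by the gas and Q is the heat added to the gas), so that $$\Delta U = Q - W$$. Along the curved path from a to c:

$$U_c - U_a = Q_{ac} - W_{ac} = (-140\ \text{J}) - (-40\ \text{J}) = -100\ \text{J} \quad\Rightarrow\quad U_a - U_c = 100\ \text{J}$$

Internal energy is a state function, so the same change applies along path abc:

$$Q_{abc} = (U_c - U_a) + W_{abc} = (-100\ \text{J}) + (-50\ \text{J}) = -150\ \text{J}$$

Parts B, C and E depend on the shape of the paths in the figure, which is not reproduced here.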
Use either the critical-value approach or the P-value approach to perform the required hypothesis test. For several years, evidence had been mounting that folic acid reduces major birth defects. A. Czeizel and I. Dudas of the National Institute of Hygiene in Budapest directed a study that provided the strongest evidence to date. Their results were published in the paper “Prevention of the First Occurrence of Neural-Tube Defects by Periconceptional Vitamin Supplementation” (New England Journal of Medicine, Vol. 327(26), p. 1832). For the study, the doctors enrolled women prior to conception and divided them randomly into two groups. One group, consisting of 2701 women, took daily multivitamins containing 0.8 mg of folic acid, the other group, consisting of 2052 women, received only trace elements. Major birth defects occurred in 35 cases when the women took folic acid and in 47 cases when the women did not. a. At the 1% significance level, do the data provide sufficient evidence to conclude that women who take folic acid are at lesser risk of having children with major birth defects? b. Is this study a designed experiment or an observational study? Explain your answer. c. In view of your answers to parts (a) and (b), could you reasonably conclude that taking folic acid causes a reduction in major birth defects? Explain your answer.
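Part (a) can be approached with a pooled two-proportion z test. The sketch below uses the counts from the problem. The left-tailed alternative $$p_1 < p_2$$ (folic acid group has the lower defect rate) follows from the wording of part (a), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Counts from the problem statement.
x1, n1 = 35, 2701  # folic acid group: major birth defects, group size
x2, n2 = 47, 2052  # trace elements group
alpha = 0.01

p1_hat = x1 / n1
p2_hat = x2 / n2

# Pooled proportion, used because H0 says p1 = p2.
p_pooled = (x1 + x2) / (n1 + n2)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
z = (p1_hat - p2_hat) / se

# Left-tailed test: Ha is p1 < p2.
p_value = NormalDist().cdf(z)

print(f"p1_hat = {p1_hat:.4f}, p2_hat = {p2_hat:.4f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

The test only addresses part (a). Whether a causal conclusion is justified in part (c) depends on the design identified in part (b), namely that the women were divided randomly into the two groups.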

Two-sample inference covers both dependent and independent samples. In a two-sample hypothesis testing problem, underlying parameters of two different populations are compared. In a longitudinal (or follow-up) study, the same group of people is followed over time. Two samples are said to be paired when each data point in the first sample is matched and related to a unique data point in the second sample.
This problem demonstrates inference from two dependent (follow-up) samples, using data from a hypothetical study of new cases of tuberculosis (TB) before and after vaccination in several geographical areas of a country in sub-Saharan Africa. The conclusion about the null hypothesis states whether the before and after samples differ.
The number of new cases before and after vaccination in each region is given below. $$\begin{array}{|c|c|c|} \hline \text{Geographical region} & \text{Before vaccination} & \text{After vaccination}\\ \hline 1 & 85 & 11\\ \hline 2 & 77 & 5\\ \hline 3 & 110 & 14\\ \hline 4 & 65 & 12\\ \hline 5 & 81 & 10\\ \hline 6 & 70 & 7\\ \hline 7 & 74 & 8\\ \hline 8 & 84 & 11\\ \hline 9 & 90 & 9\\ \hline 10 & 95 & 8\\ \hline \end{array}$$
Using the Minitab statistical analysis program to enter the data and perform the analysis, complete the following: Construct a one-sided 95% confidence interval for the true difference in population means. Test the null hypothesis that the population means are identical at the 0.05 level of significance.
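The problem asks for Minitab, but the same paired analysis can be sketched in Python as a cross-check. This is a swap-in, not the requested tool. It assumes the one-sided interval is a lower bound on the mean of (before minus after), and it uses the table value $$t_{0.05,\,9} = 1.833$$ as a hard-coded constant because the standard library has no t distribution.

```python
import math
import statistics

# New TB cases per region, from the table in the problem.
before = [85, 77, 110, 65, 81, 70, 74, 84, 90, 95]
after = [11, 5, 14, 12, 10, 7, 8, 11, 9, 8]

# Paired analysis: work with the per-region differences.
diffs = [b - a for b, a in zip(before, after)]
n = len(diffs)
mean_diff = statistics.mean(diffs)
se = statistics.stdev(diffs) / math.sqrt(n)

# Test statistic for H0: mean difference = 0, with n - 1 = 9 degrees of freedom.
t_stat = mean_diff / se

# ASSUMPTION: upper 5% critical value of t with 9 df, taken from a t table.
T_CRIT = 1.833
lower_bound = mean_diff - T_CRIT * se

print(f"mean difference = {mean_diff:.1f}, SE = {se:.3f}")
print(f"t = {t_stat:.2f} on {n - 1} df (critical value {T_CRIT})")
print(f"one-sided 95% lower bound for the mean difference: {lower_bound:.2f}")
```

Since the test statistic far exceeds the critical value, the null hypothesis of identical means would be rejected at the 0.05 level under these assumptions.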

Give a full and correct answer: Why is it important that a sample be random and representative when conducting hypothesis testing?

Representative Sample vs. Random Sample: An Overview

Economists and researchers seek to reduce sampling bias to near negligible levels when employing statistical analysis. Three basic characteristics in a sample reduce the chances of sampling bias and allow economists to make more confident inferences about a general population from the results obtained from the sample analysis or study:

* Such samples must be representative of the chosen population studied.
* They must be randomly chosen, meaning that each member of the larger population has an equal chance of being chosen.
* They must be large enough so as not to skew the results. The optimal size of the sample group depends on the precise degree of confidence required for making an inference.

Representative sampling and random sampling are two techniques used to help ensure data is free of bias. These sampling techniques are not mutually exclusive and, in fact, they are often used in tandem to reduce the degree of sampling error in an analysis and allow for greater confidence in making statistical inferences from the sample in regard to the larger group.

Representative Sample

A representative sample is a group or set chosen from a larger statistical population or group of factors or instances that adequately replicates the larger group according to whatever characteristic or quality is under study. A representative sample parallels key variables and characteristics of the larger society under examination. Some examples include sex, age, education level, socioeconomic status (SES), or marital status. A larger sample size reduces sampling error and increases the likelihood that the sample accurately reflects the target population.

Random Sample

A random sample is a group or set chosen from a larger population or group of factors or instances in a random manner that allows for each member of the larger group to have an equal chance of being chosen. A random sample is meant to be an unbiased representation of the larger population. It is considered a fair way to select a sample from a larger population since every member of the population has an equal chance of getting selected.

Special Considerations

People collecting samples need to ensure that bias is minimized. Representative sampling is one of the key methods of achieving this because such samples replicate as closely as possible elements of the larger population under study. This alone, however, is not enough to make the sampling bias negligible. Combining the random sampling technique with the representative sampling method reduces bias further because no specific member of the representative population has a greater chance of selection into the sample than any other.

Summarize this article in 250 words.
A new vaccine was tested to see if it could prevent the ear infections that many infants suffer from. Babies about a year old were randomly divided into two groups. One group received vaccinations, and the other did not. The following year, only 328 of 2460 vaccinated children had ear infections, compared to 508 of 2453 unvaccinated children. Complete parts a) through c) below.
a) Are the conditions for inference satisfied?
A. Yes. The data were generated by a randomized experiment, less than 10% of the population was sampled, the groups were independent, and there were more than 10 successes and failures in each group.
B. No. It was not a random sample.
C. No. The groups were not independent.
D. No. More than 10% of the population was sampled.
A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of $$25^{\circ}F$$. However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to $$25^{\circ}F$$. One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a $$5\%$$ level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)
(a) What is the level of significance?
State the null and alternate hypotheses.
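$$H_0: \sigma_1^2 = \sigma_2^2;\quad H_1: \sigma_1^2 > \sigma_2^2$$
$$H_0: \sigma_1^2 = \sigma_2^2;\quad H_1: \sigma_1^2 \neq \sigma_2^2$$
$$H_0: \sigma_1^2 = \sigma_2^2;\quad H_1: \sigma_1^2 < \sigma_2^2$$
$$H_0: \sigma_1^2 > \sigma_2^2;\quad H_1: \sigma_1^2 = \sigma_2^2$$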
(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)
What are the degrees of freedom?
$$df_{N} = ?$$
$$df_{D} = ?$$
What assumptions are you making about the original distribution?
The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.
(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)
(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are not statistically significant.
At the $$\alpha = 0.05$$ level, we reject the null hypothesis and conclude the data are statistically significant.
(e) Interpret your conclusion in the context of the application.
Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
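The F statistic, degrees of freedom and P-value for the thermostat problem can be sketched with the standard library alone. The right-tailed alternative follows from the claim in the problem (old variance larger than new). The helper names `f_pdf` and `f_upper_tail` are illustrative, and the tail area is obtained by numerically integrating the F density with Simpson's rule, which is a convenience choice here, not a standard library routine.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) df; used here with d1 > 2."""
    if x <= 0:
        return 0.0
    log_pdf = (
        math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2) - math.lgamma(d2 / 2)
        + (d1 / 2) * math.log(d1 / d2)
        + (d1 / 2 - 1) * math.log(x)
        - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
    )
    return math.exp(log_pdf)


def f_upper_tail(f, d1, d2, steps=2000):
    """P(F > f), as 1 minus a Simpson's-rule integral over [0, f]; steps is even."""
    h = f / steps
    total = f_pdf(0, d1, d2) + f_pdf(f, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1 - total * h / 3


# Sample variances and sizes from the problem (population 1 = old thermostat).
var_old, n_old = 12.8, 19
var_new, n_new = 5.1, 21
alpha = 0.05

f_stat = var_old / var_new
df_n, df_d = n_old - 1, n_new - 1
p_value = f_upper_tail(f_stat, df_n, df_d)

print(f"F = {f_stat:.2f}, df_N = {df_n}, df_D = {df_d}")
print(f"P-value = {p_value:.4f}")
print("reject H0" if p_value < alpha else "fail to reject H0")
```

A table lookup or statistical software would normally replace the numerical integration. It is written out here only to keep the example self-contained.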
...