The final answer: one categorical question is required, and its answers divide the respondents into groups.
One numerical question is also required, and its result is compared across the groups.
So: one numerical question and one categorical question.

Question

asked 2020-10-23

The table below shows the number of people in three different race groups who were shot by police, split by whether they were armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.

\(\begin{array}{|l|c|c|c|c|}\hline & \text{Black} & \text{White} & \text{Hispanic} & \text{Total} \\ \hline \text{Suspect was armed} & 543 & 1176 & 378 & 2097 \\ \hline \text{Suspect was unarmed} & 60 & 67 & 38 & 165 \\ \hline \text{Total} & 603 & 1243 & 416 & 2262 \\ \hline \end{array}\)

Give your answer as a decimal to at least three decimal places.

a) What percent are Black?

b) What percent are Unarmed?

c) In order for two variables to be independent of each other, \(P(A \text{ and } B) = P(A) \cdot P(B)\).

This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).

Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).

Remember, the previous answer is only correct if the variables are Independent.

d) Now let's get the real percent that are Black and Unarmed by using the table.

If answer c is "significantly different" than answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
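The arithmetic in parts a through d can be checked with a short script. The following is a minimal Python sketch that uses the counts from this version of the table; the variable names are illustrative and not part of the original problem. Because each student's numbers differ slightly, the results apply only to these counts.

```python
# Counts taken from this version of the table (other students' tables differ).
black_total = 603
unarmed_total = 165
black_unarmed = 60
grand_total = 2262

p_black = black_total / grand_total                 # part a
p_unarmed = unarmed_total / grand_total             # part b
p_if_independent = p_black * p_unarmed              # part c: valid only under independence
p_black_and_unarmed = black_unarmed / grand_total   # part d: read directly from the table

print(f"a) P(Black)                      = {p_black:.3f}")
print(f"b) P(Unarmed)                    = {p_unarmed:.3f}")
print(f"c) P(Black) * P(Unarmed)         = {p_if_independent:.3f}")
print(f"d) P(Black and Unarmed) observed = {p_black_and_unarmed:.3f}")
```

With these counts the product in part c comes out near 0.019, while the observed value in part d is near 0.027. Whether that gap counts as "significantly different" is a question for a formal test, which this sketch does not perform.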

Let's compare the percentage of unarmed shot for each race.

e) What percent are White and Unarmed?

f) What percent are Hispanic and Unarmed?

If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.

Why is that?

This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.

Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades.

The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.

g) What percent of blacks shot and killed by police were unarmed?

h) What percent of whites shot and killed by police were unarmed?

i) What percent of Hispanics shot and killed by police were unarmed?

You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
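The contrast between the joint percentages (parts d, e, f) and the conditional percentages (parts g, h, i) can be sketched in Python as follows. This is an illustration using the counts from this version of the table; the dictionary layout and names are assumptions made for the example.

```python
# (unarmed, total shot) for each group, from this version of the table.
counts = {
    "Black": (60, 603),
    "White": (67, 1243),
    "Hispanic": (38, 416),
}
grand_total = 2262

joint = {}        # P(group and unarmed): divide by everyone in the table
conditional = {}  # P(unarmed | group): divide by that group's total only
for group, (unarmed, total) in counts.items():
    joint[group] = unarmed / grand_total
    conditional[group] = unarmed / total
    print(f"{group:8s} joint = {joint[group]:.3f}   conditional = {conditional[group]:.3f}")

# Ratio referred to in the comparison of parts g and h.
ratio = conditional["Black"] / conditional["White"]
print(f"Black / White conditional ratio = {ratio:.2f}")
```

The joint percentage is largest for the group with the most people in the table, while the conditional percentage removes that group-size effect by changing the denominator.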

j) Why do you believe this is happening?

Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.

asked 2021-02-25

Give a full and correct answer
Why is it important that a sample be random and representative when conducting hypothesis testing?
Representative Sample vs. Random Sample: An Overview
Economists and researchers seek to reduce sampling bias to near negligible levels when employing statistical analysis. Three basic characteristics in a sample reduce the chances of sampling bias and allow economists to make more confident inferences about a general population from the results obtained from the sample analysis or study:
* Such samples must be representative of the chosen population studied.
* They must be randomly chosen, meaning that each member of the larger population has an equal chance of being chosen.
* They must be large enough so as not to skew the results. The optimal size of the sample group depends on the precise degree of confidence required for making an inference.
Representative sampling and random sampling are two techniques used to help ensure data is free of bias. These sampling techniques are not mutually exclusive and, in fact, they are often used in tandem to reduce the degree of sampling error in an analysis and allow for greater confidence in making statistical inferences from the sample in regard to the larger group.
Representative Sample
A representative sample is a group or set chosen from a larger statistical population or group of factors or instances that adequately replicates the larger group according to whatever characteristic or quality is under study.
A representative sample parallels key variables and characteristics of the larger society under examination. Some examples include sex, age, education level, socioeconomic status (SES), or marital status. A larger sample size reduces sampling error and increases the likelihood that the sample accurately reflects the target population.
Random Sample
A random sample is a group or set chosen from a larger population or group of factors or instances in a random manner that allows for each member of the larger group to have an equal chance of being chosen. A random sample is meant to be an unbiased representation of the larger population. It is considered a fair way to select a sample from a larger population since every member of the population has an equal chance of getting selected.
Special Considerations:
People collecting samples need to ensure that bias is minimized. Representative sampling is one of the key methods of achieving this because such samples replicate as closely as possible elements of the larger population under study. This alone, however, is not enough to make the sampling bias negligible. Combining the random sampling technique with the representative sampling method reduces bias further because no specific member of the representative population has a greater chance of selection into the sample than any other.
Summarize this article in 250 words.

asked 2021-02-25

Iron is very important for babies' growth. A common belief is that breastfeeding will help the baby to get more iron than formula feeding. To test this belief, a study followed 2 groups of babies from birth to 6 months. The babies in one group were breast-fed, and the babies in the other group were formula-fed without iron supplements. The data below show the iron levels of those two groups of babies.
\(\begin{array}{|c|c|c|c|}\hline \text{Group} & \text{Sample size} & \text{mean} & \text{Standard deviation} \\ \hline \text{Breast-fed} & 23 & 13.3 & 1.7 \\ \hline \text{Formula-fed} & 23 & 12.4 & 1.8 \\ \hline \text{DIFF = Breast - Formula} & 23 & 0.9 & 1.4 \\ \hline \end{array}\)
(1) There are two groups we need to compare for the study: Breast-Fed and Formula-Fed. Are those two groups dependent or independent? Based on your answer, what inference procedure should we apply for this research?
(2) Please perform the inference you decided on in (1), and make sure to follow the 5-step procedure for any hypothesis test.
(3) Based on your conclusion in (2), what kind of error could you make? Explain the type of error in the context of this research.
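The choice in part (1) changes the test statistic considerably. As a hedged illustration, the Python sketch below computes the statistic both ways from the summary table: a paired t statistic from the DIFF row, and an unpooled two-sample t statistic from the two group rows. It does not decide which design applies; that judgment is what the question asks for. All variable names are assumptions made for the example.

```python
import math

n = 23
mean_breast, sd_breast = 13.3, 1.7
mean_formula, sd_formula = 12.4, 1.8
mean_diff, sd_diff = 0.9, 1.4   # DIFF = Breast - Formula row

# If the groups are dependent (matched pairs): one-sample t on the differences.
t_paired = mean_diff / (sd_diff / math.sqrt(n))
df_paired = n - 1

# If the groups are independent: unpooled two-sample t on the group summaries.
se_indep = math.sqrt(sd_breast**2 / n + sd_formula**2 / n)
t_indep = (mean_breast - mean_formula) / se_indep

print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}")
```

The same mean difference of 0.9 gives a noticeably larger statistic under the paired calculation, because the standard deviation of the differences (1.4) is smaller than the combined spread of the two groups.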

asked 2021-01-22

The Kroger Company is one of the largest grocery retailers in the United States, with over 2000 grocery stores across the country. Kroger uses an online customer opinion questionnaire to obtain performance data about its products and services and learn about what motivates its customers (Kroger website, April 2012). In the survey, Kroger customers were asked if they would be willing to pay more for products that had each of the following four characteristics.

The four questions were: Would you pay more for:

products that have a brand name?

products that are environmentally friendly?

products that are organic?

products that have been recommended by others?

For each question, the customers had the option of responding Yes if they would pay more or No if they would not pay more.

a. Are the data collected by Kroger in this example categorical or quantitative?

asked 2020-12-29

The presidential election is coming. Five survey companies (A, B, C, D, and E) are conducting surveys to forecast whether or not the Republican candidate will win the election. Each company randomly selects a sample of between 1000 and 1500 people. All five companies interview people over the phone on Tuesday and Wednesday. Each interviewee is asked whether he or she is 18 years old or above and a U.S. citizen who is registered to vote. If yes, the interviewee is further asked: will you vote for the Republican candidate? On Thursday morning, the five companies announce their survey samples and results at the same time in the newspapers. The results show that a% (from A), b% (from B), c% (from C), d% (from D), and e% (from E) will support the Republican candidate. The margin of error is plus/minus 3% for all results. Suppose that \(c > a > d > e > b\). When you see these results in the newspapers, can you identify exactly which result(s) is (are) not reliable and not accurate? That is, can you identify which estimation interval(s) does (do) not include the true population proportion? If you can, explain why you can; if not, explain why you cannot and what information you would need. Discuss and explain your reasons. You must provide your statistical analysis and reasons.
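As background for the stated margin of error, the sketch below computes the worst-case margin for a sample proportion at the sample sizes mentioned. It assumes simple random sampling and a 95% confidence level (z = 1.96); neither is stated in the question, and the function name is hypothetical.

```python
import math

def worst_case_margin(n, z=1.96):
    """Largest margin of error for a sample proportion, reached at p = 0.5.

    Assumes simple random sampling; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(0.25 / n)

for n in (1000, 1250, 1500):
    print(f"n = {n}: margin of error <= {worst_case_margin(n):.3f}")
```

Under those assumptions the margins fall at or slightly below about 3%, which is consistent with the plus/minus 3% reported by all five companies.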

asked 2021-01-17

A new thermostat has been engineered for the frozen food cases in large supermarkets. Both the old and new thermostats hold temperatures at an average of \(25^{\circ}F\). However, it is hoped that the new thermostat might be more dependable in the sense that it will hold temperatures closer to \(25^{\circ}F\). One frozen food case was equipped with the new thermostat, and a random sample of 21 temperature readings gave a sample variance of 5.1. Another similar frozen food case was equipped with the old thermostat, and a random sample of 19 temperature readings gave a sample variance of 12.8. Test the claim that the population variance of the old thermostat temperature readings is larger than that for the new thermostat. Use a \(5\%\) level of significance. How could your test conclusion relate to the question regarding the dependability of the temperature readings? (Let population 1 refer to data from the old thermostat.)

(a) What is the level of significance?

State the null and alternate hypotheses.

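\(H_0: \sigma_{1}^{2} = \sigma_{2}^{2},\ H_1: \sigma_{1}^{2} > \sigma_{2}^{2}\)
\(H_0: \sigma_{1}^{2} = \sigma_{2}^{2},\ H_1: \sigma_{1}^{2} \neq \sigma_{2}^{2}\)
\(H_0: \sigma_{1}^{2} = \sigma_{2}^{2},\ H_1: \sigma_{1}^{2} < \sigma_{2}^{2}\)
\(H_0: \sigma_{1}^{2} > \sigma_{2}^{2},\ H_1: \sigma_{1}^{2} = \sigma_{2}^{2}\)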

(b) Find the value of the sample F statistic. (Round your answer to two decimal places.)

What are the degrees of freedom?

\(df_{N} = ?\)

\(df_{D} = ?\)

What assumptions are you making about the original distribution?

The populations follow independent normal distributions. We have random samples from each population.
The populations follow dependent normal distributions. We have random samples from each population.
The populations follow independent normal distributions.
The populations follow independent chi-square distributions. We have random samples from each population.

(c) Find or estimate the P-value of the sample test statistic. (Round your answer to four decimal places.)

(d) Based on your answers in parts (a) to (c), will you reject or fail to reject the null hypothesis?

At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we fail to reject the null hypothesis and conclude the data are statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are not statistically significant.
At the \(\alpha = 0.05\) level, we reject the null hypothesis and conclude the data are statistically significant.

(e) Interpret your conclusion in the context of the application.

Reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is sufficient evidence that the population variance is larger in the old thermostat temperature readings.
Fail to reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
Reject the null hypothesis, there is insufficient evidence that the population variance is larger in the old thermostat temperature readings.
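The calculations in parts (b) and (c) can be sketched with the Python standard library alone. The sketch below forms the F statistic with the old thermostat as population 1, as the problem specifies, and estimates the right-tail P-value by numerically integrating the F density with Simpson's rule. The helper functions `f_pdf` and `simpson` are written for this example and are not from any library; a statistics package or an F table would normally be used instead.

```python
import math

var_old, n_old = 12.8, 19   # population 1 (old thermostat)
var_new, n_new = 5.1, 21    # population 2 (new thermostat)

f_stat = var_old / var_new  # larger hypothesized variance in the numerator
df_n = n_old - 1            # numerator degrees of freedom
df_d = n_new - 1            # denominator degrees of freedom

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom."""
    if x <= 0:
        return 0.0
    log_c = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
             - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    return math.exp(log_c + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2))

def simpson(f, a, b, steps=2000):
    """Composite Simpson's rule; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Right-tail area beyond the observed statistic.
p_value = 1 - simpson(lambda x: f_pdf(x, df_n, df_d), 0.0, f_stat)

print(f"F = {f_stat:.2f}, df_N = {df_n}, df_D = {df_d}, P-value = {p_value:.4f}")
```

The numerical integral is an approximation, so the last digit of the printed P-value may differ slightly from a table or software value.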

asked 2020-11-26

Use either the critical-value approach or the P-value approach to perform the required hypothesis test. For several years, evidence had been mounting that folic acid reduces major birth defects. A. Czeizel and I. Dudas of the National Institute of Hygiene in Budapest directed a study that provided the strongest evidence to date. Their results were published in the paper “Prevention of the First Occurrence of Neural-Tube Defects by Periconceptional Vitamin Supplementation” (New England Journal of Medicine, Vol. 327(26), p. 1832). For the study, the doctors enrolled women prior to conception and divided them randomly into two groups. One group, consisting of 2701 women, took daily multivitamins containing 0.8 mg of folic acid; the other group, consisting of 2052 women, received only trace elements. Major birth defects occurred in 35 cases when the women took folic acid and in 47 cases when the women did not.
a. At the 1% significance level, do the data provide sufficient evidence to conclude that women who take folic acid are at lesser risk of having children with major birth defects?
b. Is this study a designed experiment or an observational study? Explain your answer.
c. In view of your answers to parts (a) and (b), could you reasonably conclude that taking folic acid causes a reduction in major birth defects? Explain your answer.
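Part (a) is commonly handled with a pooled two-proportion z-test. The following is a minimal Python sketch of that calculation using the counts given in the question; the choice of the pooled z-test and the variable names are assumptions for illustration, and the question itself allows either the critical-value or the P-value approach.

```python
import math
from statistics import NormalDist

x_folic, n_folic = 35, 2701   # birth defects / women taking folic acid
x_trace, n_trace = 47, 2052   # birth defects / women taking trace elements only

p_folic = x_folic / n_folic
p_trace = x_trace / n_trace

# Pooled proportion under H0: the two population proportions are equal.
p_pooled = (x_folic + x_trace) / (n_folic + n_trace)
se = math.sqrt(p_pooled * (1 - p_pooled) * (1 / n_folic + 1 / n_trace))

z = (p_folic - p_trace) / se
p_value = NormalDist().cdf(z)  # left-tailed: Ha is p_folic < p_trace

print(f"p_folic = {p_folic:.4f}, p_trace = {p_trace:.4f}")
print(f"z = {z:.2f}, P-value = {p_value:.4f}")
```

The P-value is then compared with the 1% significance level stated in part (a).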

asked 2020-12-28

Is statistical inference intuitive to babies? In other words, are babies able to generalize from sample to population? In this study, 8-month-old infants watched someone draw a sample of five balls from an opaque box. Each sample consisted of four balls of one color (red or white) and one ball of the other color. After observing the sample, the side of the box was lifted so the infants could see all of the balls inside (the population). Some boxes had an “expected” population, with balls in the same color proportions as the sample, while other boxes had an “unexpected” population, with balls in the opposite color proportion from the sample. Babies looked at the unexpected populations for an average of 9.9 seconds (sd = 4.5 seconds) and the expected populations for an average of 7.5 seconds (sd = 4.2 seconds). The sample size in each group was 20, and you may assume the data in each group are reasonably normally distributed. Is this convincing evidence that babies look longer at the unexpected population, suggesting that they make inferences about the population from the sample?
Let group 1 and group 2 be the time spent looking at the unexpected and expected populations, respectively.
A) Calculate the relevant sample statistic.
Enter the exact answer.
Sample statistic: _____
B) Calculate the t-statistic.
Round your answer to two decimal places.
t-statistic = ___________
C) Find the p-value.
Round your answer to three decimal places.
p-value =
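Parts A and B can be sketched in Python from the summary statistics. The sketch assumes the unpooled standard error for a difference of two means, which is a common choice for this kind of problem but is not stated in the question. The P-value in part C is omitted because it depends on the degrees-of-freedom convention used in the course.

```python
import math

n1, mean1, sd1 = 20, 9.9, 4.5   # group 1: unexpected population
n2, mean2, sd2 = 20, 7.5, 4.2   # group 2: expected population

# Part A: the sample statistic is the difference in sample means.
diff = mean1 - mean2

# Part B: t-statistic using the unpooled standard error (an assumption).
se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
t_stat = diff / se

print(f"sample statistic = {diff:.1f}")
print(f"t-statistic      = {t_stat:.2f}")
```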

asked 2021-02-02

Potential buyers for a new car were randomly divided into two groups. One group was shown the "A" version of an ad for the car, while the other group was shown the "B" version of the ad. All were then tested on their recall of key points made in the ad. The researcher should run a hypothesis test based upon a comparison of means for?

In another study, a healthcare insurance company took measures of subscribers’ cardiac (heart) health. The people were then provided an app for their phones which provided "nudges" and reminders about heart-healthy behaviors, such as eating more vegetables and less fried or fatty food, taking walks and breaks from sitting too long, and getting enough sleep. After 4 months of having the app, the cardiac health measures were taken again, with the objective of seeing if nudges from the app would result in decreased cardiac risk. The researcher should run a hypothesis test based on a comparison of means for?

asked 2021-01-27

Provide an example of two independent samples and another example of dependent samples, giving your reasoning.
To compare two means or two proportions, one works with two groups. The groups are classified either as independent or as matched pairs. Independent groups mean that the two samples taken are independent, that is, sample values selected from one population are not related in any way to sample values selected from the other population. Matched pairs consist of two samples that are dependent. The parameter tested using matched pairs is the population mean. The parameters tested using independent groups are either population means or population proportions.