# Would you expect distributions of these variables to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why. a) Ages of people at a Little League game. b) Number of siblings of people in your class. c) Pulse rates of college-age males. d) Number of times each face of a die shows in 100 tosses.

Question
Normal distributions
Would you expect distributions of these variables to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why.
a) Ages of people at a Little League game.
b) Number of siblings of people in your class.
c) Pulse rates of college-age males.
d) Number of times each face of a die shows in 100 tosses.

2020-10-22
a) Bimodal and skewed right. There will be two distinct groups: Little League players and spectators. The spectators' ages will vary widely, since parents and grandparents may be present, so the distribution is more likely to be skewed right.
b) Unimodal and skewed right. Most people will have 0 or 1 sibling, with a few outliers who have larger families.
c) Unimodal and symmetric. Most college-age males should have pulse rates close to the national mean for males their age. However, there may be a second mode if athletes are present in the data.
d) Uniform, with no skew. Assuming that the die is not loaded, each face should come up about the same number of times.
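The reasoning in part d) can be checked with a quick simulation. The sketch below is only an illustration: it assumes a fair six-sided die, and the function name `toss_counts` and the fixed seed are hypothetical choices, not part of the original answer.

```python
import random
from collections import Counter

def toss_counts(n_tosses=100, seed=0):
    """Simulate n_tosses of a fair six-sided die and count each face."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    tosses = [rng.randint(1, 6) for _ in range(n_tosses)]
    tally = Counter(tosses)
    # Report every face, even one that never appeared.
    return {face: tally.get(face, 0) for face in range(1, 7)}

counts = toss_counts()
for face, count in counts.items():
    # A crude text histogram: bars of roughly equal length suggest uniformity.
    print(face, "#" * count)
```

With only 100 tosses the bars will not be exactly equal, but no face should dominate, which is what "roughly uniform" means here.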

### Relevant Questions

Would you expect distributions of these variables to be uniform, unimodal, or bimodal? Symmetric or skewed? Explain why.
a) The number of speeding tickets each student in the senior class of a college has ever had.
b) Players’ scores (number of strokes) at the U.S. Open golf tournament in a given year.
c) Weights of female babies born in a particular hospital over the course of a year.
d) The length of the average hair on the heads of students in a large class.
The table below shows the number of people in three race groups who were shot by police, classified by whether they were armed or unarmed. These values are very close to the exact numbers. They have been changed slightly for each student to get a unique problem.

| | Black | White | Hispanic | Total |
|---|---|---|---|---|
| Suspect was armed | 543 | 1176 | 378 | 2097 |
| Suspect was unarmed | 60 | 67 | 38 | 165 |
| Total | 603 | 1243 | 416 | 2262 |

Give your answer as a decimal to at least three decimal places.
a) What percent are Black?
b) What percent are Unarmed?
c) In order for two variables to be independent of each other, $$P(A \text{ and } B) = P(A) \cdot P(B).$$
This just means that the percentage of times that both things happen equals the individual percentages multiplied together (Only if they are Independent of each other).
Therefore, if a person's race is independent of whether they were killed being unarmed then the percentage of black people that are killed while being unarmed should equal the percentage of blacks times the percentage of Unarmed. Let's check this. Multiply your answer to part a (percentage of blacks) by your answer to part b (percentage of unarmed).
Remember, the previous answer is only correct if the variables are Independent.
d) Now let's get the real percent that are Black and Unarmed by using the table.
If answer c is "significantly different" from answer d, then that means that there could be a different percentage of unarmed people being shot based on race. We will check this out later in the course.
Let's compare the percentage of unarmed shot for each race.
e) What percent are White and Unarmed?
f) What percent are Hispanic and Unarmed?
If you compare answers d, e and f it shows the highest percentage of unarmed people being shot is most likely white.
Why is that?
This is because there are more white people in the United States than any other race and therefore there are likely to be more white people in the table. Since there are more white people in the table, there most likely would be more white and unarmed people shot by police than any other race. This pulls the percentage of white and unarmed up. In addition, there most likely would be more white and armed shot by police. All the percentages for white people would be higher, because there are more white people. For example, the table contains very few Hispanic people, and the percentage of people in the table that were Hispanic and unarmed is the lowest percentage.
Think of it this way. If you went to a college that was 90% female and 10% male, then females would most likely have the highest percentage of A grades. They would also most likely have the highest percentage of B, C, D and F grades.
The correct way to compare is "conditional probability". Conditional probability is getting the probability of something happening, given we are dealing with just the people in a particular group.
g) What percent of blacks shot and killed by police were unarmed?
h) What percent of whites shot and killed by police were unarmed?
i) What percent of Hispanics shot and killed by police were unarmed?
You can see by the answers to part g and h, that the percentage of blacks that were unarmed and killed by police is approximately twice that of whites that were unarmed and killed by police.
j) Why do you believe this is happening?
Do a search on the internet for reasons why blacks are more likely to be killed by police. Read a few articles on the topic. Write your response using the articles as references. Give the websites used in your response. Your answer should be several sentences long with at least one website listed. This part of this problem will be graded after the due date.
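The difference between the joint percentages in parts d through f and the conditional percentages in parts g through i can be made concrete with a short calculation. The sketch below uses the counts from the table in this problem. The variable names are illustrative assumptions, and the output is a set of decimals for comparison, not an official answer key.

```python
# Counts copied from the table in the problem statement.
counts = {
    "Black":    {"armed": 543,  "unarmed": 60},
    "White":    {"armed": 1176, "unarmed": 67},
    "Hispanic": {"armed": 378,  "unarmed": 38},
}

grand_total = sum(sum(row.values()) for row in counts.values())
unarmed_total = sum(row["unarmed"] for row in counts.values())

# Parts a to c: marginal proportions and their product under independence.
p_black = sum(counts["Black"].values()) / grand_total
p_unarmed = unarmed_total / grand_total
expected_if_independent = p_black * p_unarmed

# Parts d to f: joint proportions, each divided by the grand total.
joint = {race: row["unarmed"] / grand_total for race, row in counts.items()}

# Parts g to i: conditional proportions, each divided by that race's own total.
conditional = {race: row["unarmed"] / sum(row.values())
               for race, row in counts.items()}

print(f"P(Black) = {p_black:.3f}, P(Unarmed) = {p_unarmed:.3f}")
print(f"P(Black) * P(Unarmed) = {expected_if_independent:.4f}")
for race in counts:
    print(f"{race}: joint = {joint[race]:.4f}, "
          f"conditional = {conditional[race]:.4f}")
```

Dividing by the grand total rewards the largest group, while dividing by each group's own total removes that effect, which is why the two comparisons rank the groups differently.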
The presidential election is coming. Five survey companies (A, B, C, D, and E) are conducting surveys to forecast whether or not the Republican candidate will win the election. Each company randomly selects a sample of between 1000 and 1500 people. All five companies interview people over the phone on Tuesday and Wednesday. Each interviewee is asked whether he or she is 18 years old or older and a U.S. citizen who is registered to vote. If yes, the interviewee is further asked: will you vote for the Republican candidate? On Thursday morning, the five companies announce their survey samples and results at the same time in the newspapers. The results show that a% (from A), b% (from B), c% (from C), d% (from D), and e% (from E) will support the Republican candidate. The margin of error is plus/minus 3% for all results. Suppose that $$\displaystyle{c}{>}{a}{>}{d}{>}{e}{>}{b}$$. When you see these results in the newspapers, can you identify exactly which result(s) is (are) not reliable and not accurate? That is, can you identify which estimation interval(s) does (do) not include the true population proportion? If you can, explain why you can; if not, explain why you cannot and what information you would need to identify them. Discuss and explain your reasons. You must provide your statistical analysis and reasons.
Marks will be awarded for accuracy in the rounding of final answers where you are asked to round. To ensure that you receive these marks, take care in keeping more decimals in your intermediate steps than what the question is asking you to round your final answer to.
A fair 7-sided die with the numbers 1 through 7 is rolled five times. Express each of your answers as a decimal rounded to 3 decimal places.
(a) What is the probability that exactly one 3 is rolled?
(b) What is the probability that at least one 3 is rolled?
(c) What is the probability that exactly four of the rolls show an even number?
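Each part of the die problem is a binomial probability with five independent rolls. The sketch below is one way to set it up, assuming the even faces are 2, 4, and 6, so that the chance of an even number on one roll is 3/7. The helper `binom_pmf` is a hypothetical name introduced for illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n_rolls = 5
p_three = 1 / 7  # one face out of seven shows a 3
p_even = 3 / 7   # faces 2, 4, 6 out of 1..7

p_exactly_one_three = binom_pmf(1, n_rolls, p_three)
# "At least one" is the complement of "none".
p_at_least_one_three = 1 - binom_pmf(0, n_rolls, p_three)
p_exactly_four_even = binom_pmf(4, n_rolls, p_even)

print(round(p_exactly_one_three, 3))
print(round(p_at_least_one_three, 3))
print(round(p_exactly_four_even, 3))
```

Rounding happens only in the final `print` calls, in line with the instruction to keep extra decimals in intermediate steps.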
1. A researcher is interested in finding a 98% confidence interval for the mean number of times per day that college students text. The study included 144 students who averaged 44.7 texts per day. The standard deviation was 16.5 texts.
a. To compute the confidence interval, use a (z or t) distribution.
b. With 98% confidence, the population mean number of texts per day is between ____ and ____ texts.
c. If many groups of 144 randomly selected members are studied, then a different confidence interval would be produced from each group. About ____ percent of these confidence intervals will contain the true population mean number of texts per day and about ____ percent will not contain the true population mean number of texts per day.
2. You want to obtain a sample to estimate how much parents spend on their kids' birthday parties. Based on a previous study, you believe the population standard deviation is approximately $$\displaystyle\sigma={40.4}$$ dollars. You would like to be 90% confident that your estimate is within 1.5 dollar(s) of average spending on the birthday parties. How many parents do you have to sample? n = ____
3. You want to obtain a sample to estimate a population mean. Based on previous evidence, you believe the population standard deviation is approximately $$\displaystyle\sigma={57.5}$$. You would like to be 95% confident that your estimate is within 0.1 of the true population mean. How large a sample size is required?
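For problems 2 and 3, the usual approach with a known population standard deviation is $$n = \left(\frac{z^{*}\sigma}{E}\right)^{2}$$, rounded up to the next whole number. The sketch below assumes that formula and takes the critical value from Python's standard library. The function name `required_sample_size` is hypothetical. Problem 1 is not covered, because it needs a t critical value that the standard library does not provide.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, margin, confidence):
    """Smallest n with z* * sigma / sqrt(n) <= margin (sigma assumed known)."""
    # Two-sided critical value, e.g. about 1.645 for 90% confidence.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z_star * sigma / margin) ** 2)

n_parties = required_sample_size(sigma=40.4, margin=1.5, confidence=0.90)
n_mean = required_sample_size(sigma=57.5, margin=0.1, confidence=0.95)

print(n_parties)
print(n_mean)
```

A course that uses a rounded table value such as 1.96 may arrive at a slightly different n for problem 3, since the result is very sensitive to the critical value when the margin is this small.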
1) A review of voter registration records in a small town yielded the following data on the number of males and females registered as Democrat, Republican, or some other affiliation:
$$\begin{array}{|l|c|c|} \hline & \text{Gender} & \\ \hline \text{Affiliation} & \text{Male} & \text{Female} \\ \hline \text{Democrat} & 300 & 600 \\ \text{Republican} & 500 & 300 \\ \text{Other} & 200 & 100 \\ \hline \end{array}$$
What proportion of all voters is male and registered as a Democrat?
2) A survey was conducted involving 303 subjects concerning their preferences with respect to the size of car they would consider purchasing. The following table shows the count of the responses by gender of the respondents:
$$\begin{array}{|l|c|c|c|c|} \hline & & \text{Size of Car} & & \\ \hline \text{Gender} & \text{Small} & \text{Medium} & \text{Large} & \text{Total} \\ \hline \text{Female} & 58 & 63 & 17 & 138 \\ \text{Male} & 79 & 61 & 25 & 165 \\ \text{Total} & 137 & 124 & 42 & 303 \\ \hline \end{array}$$
The data are to be summarized by constructing marginal distributions. In the marginal distribution for car size, the entry for medium cars is ?
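Both questions come down to dividing a count by the right total. The sketch below assumes the table values as reconstructed from the problem text, and assumes that the marginal distribution is reported as proportions rather than raw counts. The variable names are illustrative.

```python
# Problem 1: joint proportion, male and Democrat out of all voters.
voters = {
    "Democrat":   {"Male": 300, "Female": 600},
    "Republican": {"Male": 500, "Female": 300},
    "Other":      {"Male": 200, "Female": 100},
}
total_voters = sum(sum(row.values()) for row in voters.values())
p_male_democrat = voters["Democrat"]["Male"] / total_voters

# Problem 2: marginal distribution of car size uses the column totals only.
car_totals = {"Small": 137, "Medium": 124, "Large": 42}
n_respondents = sum(car_totals.values())
marginal_car_size = {size: count / n_respondents
                     for size, count in car_totals.items()}

print(p_male_democrat)
print(round(marginal_car_size["Medium"], 3))
```

If the course reports marginal distributions as counts, the entry for medium cars is simply the column total, 124.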
Several models have been proposed to explain the diversification of life during geological periods. According to Benton (1997), The diversification of marine families in the past 600 million years (Myr) appears to have followed two or three logistic curves, with equilibrium levels that lasted for up to 200 Myr. In contrast, continental organisms clearly show an exponential pattern of diversification, and although it is not clear whether the empirical diversification patterns are real or are artifacts of a poor fossil record, the latter explanation seems unlikely. In this problem, we will investigate three models for diversification. They are analogous to models for population growth; however, the quantities involved have a different interpretation. We denote by N(t) the diversification function, which counts the number of taxa as a function of time, and by r the intrinsic rate of diversification.
(a) (Exponential Model) This model is described by $$\displaystyle{\frac{{{d}{N}}}{{{\left.{d}{t}\right.}}}}={r}_{{{e}}}{N}\ {\left({8.86}\right)}.$$ Solve (8.86) with the initial condition N(0) at time 0, and show that $$\displaystyle{r}_{{{e}}}$$ can be estimated from $$\displaystyle{r}_{{{e}}}={\frac{{{1}}}{{{t}}}}\ {\ln{\ }}{\left[{\frac{{{N}{\left({t}\right)}}}{{{N}{\left({0}\right)}}}}\right]}\ {\left({8.87}\right)}$$
(b) (Logistic Growth) This model is described by $$\displaystyle{\frac{{{d}{N}}}{{{\left.{d}{t}\right.}}}}={r}_{{{l}}}{N}\ {\left({1}\ -\ {\frac{{{N}}}{{{K}}}}\right)}\ {\left({8.88}\right)}$$ where K is the equilibrium value. Solve (8.88) with the initial condition N(0) at time 0, and show that $$\displaystyle{r}_{{{l}}}$$ can be estimated from $$\displaystyle{r}_{{{l}}}={\frac{{{1}}}{{{t}}}}\ {\ln{\ }}{\left[{\frac{{{K}\ -\ {N}{\left({0}\right)}}}{{{N}{\left({0}\right)}}}}\right]}\ +\ {\frac{{{1}}}{{{t}}}}\ {\ln{\ }}{\left[{\frac{{{N}{\left({t}\right)}}}{{{K}\ -\ {N}{\left({t}\right)}}}}\right]}\ {\left({8.89}\right)}$$ for $$\displaystyle{N}{\left({t}\right)}\ {<}\ {K}.$$
(c) Assume that $$\displaystyle{N}{\left({0}\right)}={1}$$ and $$\displaystyle{N}{\left({10}\right)}={1000}.$$ Estimate $$\displaystyle{r}_{{{e}}}$$ and $$\displaystyle{r}_{{{l}}}$$ for both $$\displaystyle{K}={1001}$$ and $$\displaystyle{K}={10000}.$$
(d) Use your answer in (c) to explain the following quote from Stanley (1979): There must be a general tendency for calculated values of $$\displaystyle{\left[{r}\right]}$$ to represent underestimates of exponential rates, because some radiation will have followed distinctly sigmoid paths during the interval evaluated.
(e) Explain why the exponential model is a good approximation to the logistic model when $$\displaystyle\frac{{N}}{{K}}$$ is small compared with 1.
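The estimates asked for in part (c) follow directly from formulas (8.87) and (8.89). The sketch below evaluates them numerically. The function names are hypothetical, and the code is a check on the arithmetic rather than a substitute for the derivations in parts (a) and (b).

```python
from math import log

def exponential_rate(n0, nt, t):
    """r_e from (8.87): (1/t) * ln(N(t) / N(0))."""
    return log(nt / n0) / t

def logistic_rate(n0, nt, t, k):
    """r_l from (8.89); requires N(0) < K and N(t) < K."""
    return (log((k - n0) / n0) + log(nt / (k - nt))) / t

n0, nt, t = 1, 1000, 10

r_e = exponential_rate(n0, nt, t)
r_l_small_k = logistic_rate(n0, nt, t, k=1001)
r_l_large_k = logistic_rate(n0, nt, t, k=10000)

print(f"r_e = {r_e:.4f}")
print(f"r_l (K = 1001)  = {r_l_small_k:.4f}")
print(f"r_l (K = 10000) = {r_l_large_k:.4f}")
```

In both cases the logistic rate exceeds the exponential rate, and the gap shrinks as K grows, which is the pattern parts (d) and (e) ask you to explain.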
Which would you expect to have a density curve that is higher at the mean: the standard normal distribution or a normal distribution with standard deviation 0.5? Explain.
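One way to check the intuition is to evaluate each density at its mean, where a normal density equals $$\frac{1}{\sigma\sqrt{2\pi}}$$. The sketch below assumes both distributions have mean 0. That choice is arbitrary, since the peak height depends only on the standard deviation.

```python
from statistics import NormalDist

standard = NormalDist(mu=0, sigma=1)    # standard normal
narrow = NormalDist(mu=0, sigma=0.5)    # same mean, smaller spread

peak_standard = standard.pdf(0)
peak_narrow = narrow.pdf(0)

print(round(peak_standard, 4))
print(round(peak_narrow, 4))
print(round(peak_narrow / peak_standard, 4))  # ratio of the two peak heights
```

Because the total area under each curve must equal 1, halving the spread doubles the height at the mean.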
A random sample of $$\displaystyle{n}_{{1}}={16}$$ communities in western Kansas gave the following information for people under 25 years of age.
$$\displaystyle{X}_{{1}}:$$ Rate of hay fever per 1000 population for people under 25
$$\begin{array}{|c|c|c|c|c|c|c|c|} \hline 97 & 91 & 121 & 129 & 94 & 123 & 112 & 93 \\ \hline 125 & 95 & 125 & 117 & 97 & 122 & 127 & 88 \\ \hline \end{array}$$
A random sample of $$\displaystyle{n}_{{2}}={14}$$ regions in western Kansas gave the following information for people over 50 years old.
$$\displaystyle{X}_{{2}}:$$ Rate of hay fever per 1000 population for people over 50
$$\begin{array}{|c|c|c|c|c|c|c|} \hline 94 & 109 & 99 & 95 & 113 & 88 & 110 \\ \hline 79 & 115 & 100 & 89 & 114 & 85 & 96 \\ \hline \end{array}$$
(i) Use a calculator to calculate $$\displaystyle\overline{{x}}_{{1}},{s}_{{1}},\overline{{x}}_{{2}},{\quad\text{and}\quad}{s}_{{2}}.$$ (Round your answers to two decimal places.)
(ii) Assume that the hay fever rate in each age group has an approximately normal distribution. Do the data indicate that the age group over 50 has a lower rate of hay fever? Use $$\displaystyle\alpha={0.05}.$$
(a) What is the level of significance?
State the null and alternate hypotheses.
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}<\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}>\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}=\mu_{{2}},{H}_{{1}}:\mu_{{1}}\ne\mu_{{2}}$$
$$\displaystyle{H}_{{0}}:\mu_{{1}}>\mu_{{2}},{H}_{{1}}:\mu_{{1}}=\mu_{{2}}$$
(b) What sampling distribution will you use? What assumptions are you making?
The standard normal. We assume that both population distributions are approximately normal with known standard deviations.
The Student's t. We assume that both population distributions are approximately normal with unknown standard deviations.
The standard normal. We assume that both population distributions are approximately normal with unknown standard deviations.
The Student's t. We assume that both population distributions are approximately normal with known standard deviations.
What is the value of the sample test statistic? (Test the difference $$\displaystyle\mu_{{1}}-\mu_{{2}}$$. Round your answer to three decimal places.)
(c) Find (or estimate) the P-value.
P-value $$\displaystyle>{0.250}$$
$$\displaystyle{0.125}<{P}\text{-value}<{0.250}$$
$$\displaystyle{0.050}<{P}\text{-value}<{0.125}$$
$$\displaystyle{0.025}<{P}\text{-value}<{0.050}$$
$$\displaystyle{0.005}<{P}\text{-value}<{0.025}$$
P-value $$\displaystyle<{0.005}$$
Sketch the sampling distribution and show the area corresponding to the P-value.
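The sample statistics in part (i) and the test statistic in part (b) can be computed directly from the two data sets. The sketch below assumes the unpooled two-sample t statistic, $$t = \frac{\overline{x}_1 - \overline{x}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$, which is a common textbook choice when the population standard deviations are unknown. A course that pools the variances would get a somewhat different value. The P-value is left to a t table, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hay fever rates per 1000 population, copied from the two tables.
under_25 = [97, 91, 121, 129, 94, 123, 112, 93,
            125, 95, 125, 117, 97, 122, 127, 88]
over_50 = [94, 109, 99, 95, 113, 88, 110,
           79, 115, 100, 89, 114, 85, 96]

# Sample means and sample standard deviations (n - 1 in the denominator).
x1_bar, s1 = mean(under_25), stdev(under_25)
x2_bar, s2 = mean(over_50), stdev(over_50)

# Unpooled standard error of the difference in sample means.
standard_error = sqrt(s1**2 / len(under_25) + s2**2 / len(over_50))
t_stat = (x1_bar - x2_bar) / standard_error

print(f"x1_bar = {x1_bar:.2f}, s1 = {s1:.2f}")
print(f"x2_bar = {x2_bar:.2f}, s2 = {s2:.2f}")
print(f"t = {t_stat:.3f}")
```

To bracket the P-value, compare the printed t with a one-tailed t table, using the degrees of freedom convention from your course, for example the smaller of $$n_1 - 1$$ and $$n_2 - 1$$.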