# If we add a constant (say, d) to all data values, how will this affect the geometric mean? Give an example.


2021-02-04

The geometric mean of a data distribution of positive values $$\displaystyle a_1, a_2, a_3, \ldots, a_n$$ is the $$n$$th root of the product of all data points. It is calculated using the formula $$\displaystyle GM = \sqrt[n]{a_1 \cdot a_2 \cdot \ldots \cdot a_n} = \left(a_1 \cdot a_2 \cdot \ldots \cdot a_n\right)^{\frac{1}{n}}$$
When a positive constant d is added to every data value, each factor in the product becomes larger, so the product increases and the geometric mean of the data distribution increases as well. Unlike the arithmetic mean, the geometric mean does not in general increase by exactly d. (A negative d decreases the geometric mean, provided all values stay positive.)
Let us consider an example data distribution: 2, 4, 8, 16, 32.
The geometric mean is calculated as shown below:
$$\displaystyle{G}{M}={\sqrt[{{n}}]{{{a}_{{1}}\cdot{a}_{{2}}\cdot\ldots\cdot{a}_{{n}}}}}$$
$$\displaystyle={\sqrt[{{5}}]{{{2}\cdot{4}\cdot{8}\cdot{16}\cdot{32}}}}$$
$$\displaystyle={\sqrt[{{5}}]{{{32768}}}}$$
$$=8$$
If we add the constant 2 to all data points, the data distribution becomes 4, 6, 10, 18, 34. The geometric mean is calculated as shown below:
$$\displaystyle{G}{M}={\sqrt[{{n}}]{{{a}_{{1}}\cdot{a}_{{2}}\cdot\ldots\cdot{a}_{{n}}}}}$$
$$\displaystyle={\sqrt[{{5}}]{{{4}\cdot{6}\cdot{10}\cdot{18}\cdot{34}}}}$$
$$\displaystyle={\sqrt[{{5}}]{{{146880}}}}$$
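$$\approx 10.7992$$
The geometric mean rises from 8 to about 10.80, so adding a positive constant to all data values increases the geometric mean of the data distribution. Note that the increase (about 2.80) is not equal to the added constant 2.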
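The calculation above can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the helper name `geometric_mean` is a hypothetical choice, and it computes the nth root through logarithms, which assumes all values are positive.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of the values.

    Hypothetical helper for illustration; it works through logarithms,
    so every value must be positive.
    """
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs a non-empty list of positive numbers")
    return math.exp(sum(math.log(v) for v in values) / len(values))

data = [2, 4, 8, 16, 32]          # example data from the answer
d = 2                             # constant added to every value
shifted = [x + d for x in data]   # 4, 6, 10, 18, 34

gm_before = geometric_mean(data)
gm_after = geometric_mean(shifted)

print("GM before:", round(gm_before, 4))
print("GM after: ", round(gm_after, 4))
print("increase: ", round(gm_after - gm_before, 4))
```

Changing `d` or `data` in this sketch makes it easy to see that the size of the increase depends on the data, not only on the constant.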
