link223mh 2022-10-26 Answered
Sample Standard Deviation vs. Population Standard Deviation
I have an HP 50g graphing calculator and I am using it to calculate the standard deviation of some data. In the statistics calculation there is a "Type" setting which can have two values: Sample or Population.
I didn't change it, but I kept getting the wrong results for the standard deviation. When I changed it to "Population" type, I started getting correct results!
Why is that? As far as I know, there is only one type of standard deviation which is to calculate the root-mean-square of the values!
Did I miss something?

Answers (1)

Ostrakodec3
Answered 2022-10-27 Author has 18 answers
Step 1
There are, in fact, two different formulas for standard deviation here: The population standard deviation σ and the sample standard deviation s.
If $x_1, x_2, \ldots, x_N$ denote all $N$ values from a population, then the (population) standard deviation is
$$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \mu)^2},$$
where $\mu$ is the mean of the population.
If $x_1, x_2, \ldots, x_N$ denote $N$ values from a sample, however, then the (sample) standard deviation is
$$s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2},$$
where $\bar{x}$ is the mean of the sample.
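To make the difference between the two formulas concrete, here is a minimal Python sketch that implements both by hand. The function names and the data values are made up for illustration (they are not from the question); the only assumption is that a calculator's "Population" setting corresponds to dividing by N and its "Sample" setting to dividing by N - 1.

```python
import math
import statistics


def population_std(values):
    """Standard deviation with divisor N (all values of a population)."""
    n = len(values)
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Standard deviation with divisor N - 1 (values are a sample)."""
    n = len(values)
    xbar = sum(values) / n
    return math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))


# Illustrative data only: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(population_std(data))  # sqrt(32 / 8) = 2.0
print(sample_std(data))      # sqrt(32 / 7), slightly larger

# The standard library offers the same two variants.
print(statistics.pstdev(data), statistics.stdev(data))
```

The two results differ only through the divisor, so the gap shrinks as N grows. If a textbook answer matches the "Population" result, the textbook is most likely using the divide-by-N formula.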
Step 2
The reason for the change in formula with the sample is this: When you're calculating $s$ you are normally using $s^2$ (the sample variance) to estimate $\sigma^2$ (the population variance). The problem, though, is that if you don't know $\sigma$ you generally don't know the population mean $\mu$ either, and so you have to use $\bar{x}$ in the place in the formula where you normally would use $\mu$. Doing so introduces a slight bias into the calculation: Since $\bar{x}$ is calculated from the sample, the values of $x_i$ are on average closer to $\bar{x}$ than they would be to $\mu$, and so the sum of squares $\sum_{i=1}^{N} (x_i - \bar{x})^2$ turns out to be smaller on average than $\sum_{i=1}^{N} (x_i - \mu)^2$. It just so happens that this bias can be corrected by dividing by $N-1$ instead of $N$. (Proving this is a standard exercise in an advanced undergraduate or beginning graduate course in statistical theory.) The technical term here is that $s^2$ (because of the division by $N-1$) is an unbiased estimator of $\sigma^2$.
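The bias described here can be seen in a small simulation. The sketch below is an illustration, not a proof: it assumes samples of size 5 drawn from a standard normal distribution (so the true variance is 1), and the function name, sample size, trial count, and seed are all arbitrary choices.

```python
import random


def average_variance_estimates(sample_size, trials, seed=0):
    """Average the divide-by-N and divide-by-(N-1) variance estimates
    over many samples drawn from a normal distribution with variance 1."""
    rng = random.Random(seed)
    total_div_n = 0.0
    total_div_n_minus_1 = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        xbar = sum(sample) / sample_size
        # Sum of squared deviations from the SAMPLE mean, not the true mean.
        ss = sum((x - xbar) ** 2 for x in sample)
        total_div_n += ss / sample_size
        total_div_n_minus_1 += ss / (sample_size - 1)
    return total_div_n / trials, total_div_n_minus_1 / trials


div_n, div_n_minus_1 = average_variance_estimates(sample_size=5, trials=20000)
print(div_n, div_n_minus_1)
```

With N = 5 the divide-by-N average should land near (N - 1)/N = 0.8, while the divide-by-(N - 1) average should land near the true variance of 1.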
Another way to think about it is that with a sample you have $N$ independent pieces of information. However, since $\bar{x}$ is the average of those $N$ pieces, if you know $x_1 - \bar{x}, x_2 - \bar{x}, \ldots, x_{N-1} - \bar{x}$, you can figure out what $x_N - \bar{x}$ is. So when you're squaring and adding up the residuals $x_i - \bar{x}$, there are really only $N-1$ independent pieces of information there. So in that sense perhaps dividing by $N-1$ rather than $N$ makes sense. The technical term here is that there are $N-1$ degrees of freedom in the residuals $x_i - \bar{x}$.
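The degrees-of-freedom argument rests on the fact that the residuals about the sample mean always sum to zero, so the last one is fixed by the others. A short sketch with made-up numbers shows this:

```python
import math

# Illustrative sample; any numbers would do.
sample = [3.0, 7.0, 8.0, 12.0]
xbar = sum(sample) / len(sample)

# Residuals about the sample mean.
residuals = [x - xbar for x in sample]

# The residuals sum to zero, so the last one can be recovered
# from the first N - 1 without looking at the last data value.
last_from_others = -sum(residuals[:-1])

print(residuals)
print(last_from_others, residuals[-1])
```

Residuals taken about the true population mean $\mu$ would not be constrained in this way, which is why no correction is needed when $\mu$ is known.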

You might be interested in

asked 2022-11-02
Solve PDE using method of characteristics with non-local boundary conditions.
Given the population model by the following linear first order PDE in u(a,t) with constants b and μ:
$$u_a + u_t = -\mu t u, \quad a, t > 0$$
$$u(a, 0) = u_0(a), \quad a \ge 0$$
$$u(0, t) = F(t) = b \int_0^{\infty} u(a, t)\, da$$
We can split the integral in two with our non-local boundary data:
$$F(t) = b \int_0^{t} u(a, t)\, da + b \int_t^{\infty} u(a, t)\, da$$
Choosing the characteristic coordinates $(\xi, \tau)$ and re-arranging the expression to form the normal to the solution surface we have the following equation with initial conditions:
$$(u_a, u_t, -1) \cdot (1, 1, -\mu t u) = 0$$
$$a(0) = \xi, \quad t(0) = 0, \quad u(0) = u_0(\xi)$$
Characteristic equations:
$$\frac{da}{d\tau} = 1, \quad \frac{dt}{d\tau} = 1, \quad \frac{du}{d\tau} = -\mu t u$$
Solving each of these ODEs in $\tau$ gives the following:
$$(1)\ da = d\tau \qquad (2)\ dt = d\tau \qquad (3)\ du = -\mu t u\, d\tau$$
$$a = \tau + F(\xi), \quad t = \tau + F(\xi)$$
$$a = \tau + \xi, \quad t = \tau$$
$$du = -\mu \tau u\, d\tau$$
$$\frac{1}{u}\, du = -\mu \tau\, d\tau$$
$$\ln u = -\frac{1}{2} \mu \tau^2 + F(\xi)$$
$$u = G(\xi)\, e^{-\frac{1}{2} \mu \tau^2}$$
$$u = u_0(\xi)\, e^{-\frac{1}{2} \mu \tau^2}$$
Substituting back the original coordinates we can re-write this expression with a coordinate change:
$$\xi = a - t, \quad \tau = t$$
$$u(a, t) = u_0(a - t)\, e^{-\frac{1}{2} \mu t^2}$$
Now this is where I get stuck, how do I use the boundary data to come up with a well-posed solution?
$$u(0, t) = u_0(-t)\, e^{-\frac{1}{2} \mu t^2} = b \int_0^{t} u(a, t)\, da + b \int_t^{\infty} u(a, t)\, da$$
asked 2022-11-22
Is the posterior always a compromise between the prior and the data?
Suppose that we are interested in learning the proportion θ of the population with a particular property (for instance, the fraction of the population who are male). Suppose that we randomly sample n members of this population (with replacement, to make things easier) and observe that y of them have the property (so the fraction of the sample with the property is y/n). We start with a continuous prior p(θ) with full support on [0,1] and update this using Bayes' rule.
Question: does the expected value of the posterior always lie between the prior expectation and the sample fraction y/n?
asked 2022-11-04
Finding the $\sum (X - \bar{X})^2$ of first 5 data of dataset given mean and population variance
The mean and population variance of the dataset $x_1, x_2, \ldots, x_{10}$ are 19 and 49 respectively. If $\sum_{i=6}^{10} x_i^2 = 1900$, what is the value of $\sum_{i=1}^{5} x_i^2$?
I've solved it as follows and it is wrong:
Population variance:
$$S^2 = \frac{\sum (x_i - \bar{x})^2}{n}$$
$$49 = \frac{\sum_{i=1}^{5} x_i^2}{10} + \frac{1900}{10}$$
$$49 = \frac{\sum_{i=1}^{5} x_i^2}{10} + 190$$
$$-190 + 49 = \frac{\sum_{i=1}^{5} x_i^2}{10}$$
$$\sum_{i=1}^{5} x_i^2 = -141 \cdot 10 = -1410$$
And this solution is wrong. How should this problem be solved?
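One possible way to approach this question, offered as a sketch rather than the intended textbook solution: the attempt replaces $\sum (x_i - \bar{x})^2$ with $\sum x_i^2$, which drops the mean term. Using the identity $\sigma^2 = \frac{1}{n} \sum x_i^2 - \bar{x}^2$ for the population variance instead, the arithmetic can be checked in a few lines. The variable names are chosen here for illustration.

```python
# Values given in the question.
n = 10
mean = 19
variance = 49           # population variance (divisor n)
sum_sq_last_five = 1900

# Population variance identity: variance = (sum of squares) / n - mean^2,
# so the sum of squares over all ten values is n * (variance + mean^2).
total_sum_sq = n * (variance + mean ** 2)

# Subtract the known part to get the sum of squares of the first five values.
sum_sq_first_five = total_sum_sq - sum_sq_last_five

print(total_sum_sq)       # 4100
print(sum_sq_first_five)  # 2200
```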
asked 2022-10-28
Modelling wealth with a Pareto distribution: how do I estimate the parameters?
I wish to create a function that will estimate the wealth of a person in the United States. It would be used to make a table with each decile and their estimated wealth.
This estimate will be based on very rudimentary data, and is only for personal interest. The data is:
- The total wealth of the bottom 90% is equal to the total wealth of the top 0.1%.
- Both proportions have 22% of the total wealth.
- The total wealth under the distribution is $80 trillion.
- The total population is 160 million households.
Given this data, how would I create parameter estimates for the exponent and scale of a Pareto distribution? What would be f(x), where x is in (0,1) and the value is the wealth of someone richer than that proportion of people? For example, f(0.1) is someone richer than or equal to exactly 10% of the least wealthy, and could equal 1,000 dollars. f(0.5) is the median wealth, and could be 200,000 dollars. f(0.9999) is richer than 99.99% and would be somewhere in the tens or hundreds of millions of dollars.
asked 2022-11-19
Degree of freedom and corrected standard deviation
It is often said that degrees of freedom are the reason the standard deviation formula needs to be corrected. When explaining degrees of freedom, it is often said that once the mean is known, only $n-1$ data points are actually needed, as the last data point can be determined from the mean and the other $n-1$ data points. However, I see the same thing occurring in a population, not just in a sample. So what's going on here, and how does this justification really work?
For example, in the simple linear regression model, the variance of the error terms is often given as the sum of the variances of each data point divided by $n-2$. This is justified as described above. But if this justification also holds for a population, not just a sample, how does it really work?
asked 2022-11-07
Analytical solution to mixed distribution fit to failure time data - Lambert W perhaps?
I have a set of $n$ device failure times $\{t_i > 0\}$ for $i = 1, \ldots, n$ and $N - n$ devices which have not yet failed. Using maximum likelihood I am attempting to find a closed-form analytical solution to fit the data to the following cumulative distribution function:
$$F(t \mid \lambda, p) = p\,(1 - e^{-\lambda t})$$
where $0 < p < 1$ is the asymptotic fraction of units to eventually fail and $\lambda > 0$ the sub-population failure rate. The likelihood for this MLE attempt is given by:
$$L = (1 - F(t_n))^{N - n} \prod_{i=1}^{n} f(t_i)$$
and
$$\ln L = (N - n) \ln(1 - p + p e^{-\lambda t_n}) + n \ln(\lambda p) - \lambda \sum_{i=1}^{n} t_i$$
with pdf $f(t) = dF/dt = \lambda p e^{-\lambda t}$. Here we take $\nabla_{\lambda, p} L = 0$ or $\nabla_{\lambda, p} \ln L = 0$ to solve for $p$ and $\lambda$ at max likelihood (or log likelihood). I've just recently learned a smidgen about the Lambert W function and was hoping that someone with a more nimble mind than mine might be able to derive a closed form solution using this and/or other cleverness.
asked 2022-10-23
The exercise statement (roughly): Assume there is a terrorist prevention system that has a 99% chance of correctly identifying a future terrorist and a 99.9% chance of correctly identifying someone who is not a future terrorist. Suppose there are 1000 future terrorists in a population of 300 million people, and one individual is chosen randomly from the population, processed by the system, and deemed a terrorist. What is the chance that the individual is a future terrorist?
Attempted exercise solution:
I use the following event labels:
A -> The person is a future terrorist
B -> The person is identified as a terrorist
Then, some other data:
$$P(A) = \frac{10^3}{3 \cdot 10^8} = \frac{1}{3 \cdot 10^5}$$
$$P(\bar{A}) = 1 - P(A)$$
$$P(B \mid A) = 0.99$$
$$P(\bar{B} \mid A) = 1 - P(B \mid A)$$
$$P(\bar{B} \mid \bar{A}) = 0.999$$
$$P(B \mid \bar{A}) = 1 - P(\bar{B} \mid \bar{A})$$
What I need to find is the chance that someone identified as a terrorist is actually a terrorist. I express that through $P(A \mid B)$ and use Bayes' Theorem to find its value.
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{P(B \mid A) P(A)}{P(B \mid A) P(A) + P(B \mid \bar{A}) P(\bar{A})}$$
The answer I get after plugging in all the values is $3.29 \cdot 10^{-3}$; the book's answer is $3.29 \cdot 10^{-4}$.
Can someone help me identify what I'm doing wrong? Also, in either case, I find that it is very unintuitive that the probability of success is so small. If someone could explain it to me in more intuitive terms I'd be very grateful.
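As a quick check of the arithmetic, the sketch below plugs the numbers from the exercise statement into Bayes' Theorem and also restates the result as expected head counts, which may help with the intuition. The variable names are illustrative, and the sketch assumes the figures quoted in the exercise (1000 future terrorists, 300 million people, 99% and 99.9% accuracy) are the ones the book intends.

```python
# Numbers as quoted in the exercise statement.
population = 300_000_000
future_terrorists = 1000
p_b_given_a = 0.99          # P(B | A): a future terrorist is flagged
p_b_given_not_a = 1 - 0.999  # P(B | not A): an innocent person is flagged

p_a = future_terrorists / population

# Bayes' Theorem: P(A | B) = P(B | A) P(A) / P(B).
p_b = p_b_given_a * p_a + p_b_given_not_a * (1 - p_a)
posterior = p_b_given_a * p_a / p_b
print(f"{posterior:.3e}")

# The same ratio as expected counts if everyone were screened:
true_positives = p_b_given_a * future_terrorists
false_positives = p_b_given_not_a * (population - future_terrorists)
print(true_positives, false_positives)
print(true_positives / (true_positives + false_positives))
```

Under these numbers the posterior comes out at roughly $3.29 \cdot 10^{-3}$, which agrees with the asker's value rather than the book's. That suggests, without confirming, that the book's figure rests on different inputs or contains a misprint. The count view explains why the probability is so small: about 990 correctly flagged people are swamped by roughly 300,000 false alarms.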
