Improve Your Understanding of College Level Statistics Problems

atgnybo4fq, 2022-11-04

Determining sample size of a set of boolean data where the probability is not 50%
I'll lay out the problem as a simplified puzzle of what I am attempting to calculate. I imagine some of this may seem fairly straightforward to many but I'm starting to get a bit lost in my head while trying to think through the problem.
Let's say I roll a 1000-sided die until it lands on the number 1. Let's say it took me 700 rolls to get there. I want to prove that the first 699 rolls were not number 1 and obviously the only way to deterministically do this is to include the first 699 failures as part of the result to show they were in fact "not 1".
However, that's a lot of data I would need to prove this. I would have to include all 700 rolls, which is a lot. Therefore, I want to probabilistically demonstrate the fact that I rolled 699 "not 1s" prior to rolling a 1. To do this, I decide I will randomly sample my "not 1" rolls to reduce the set to a statistically significant, yet more wieldy number. It will be good enough to demonstrate that I very probably did not roll a 1 prior to roll 700.
Here are my current assumptions about the state of this problem:
- My initial experiment of rolling until success follows a geometric distribution.
- However, my goal for this problem is to demonstrate to a third party that I am not lying. The skeptical third party is therefore not concerned with the geometric distribution and would view this simply as a binomial distribution problem.
A lot of sample size calculators exist on the web. They are all based around binomial distribution from what I can tell. So here's the formula I am considering:
$$n = \frac{N \times X}{X + N - 1}$$
$$X = \frac{Z_{\alpha/2}^2 \times p \times (1 - p)}{\mathrm{MOE}^2}$$
n is sample size
N is population size
Z is the critical value (α is 1 − confidence level, expressed as a probability)
p is sample proportion
MOE is margin of error
As an aside, the website where I got this formula says it implements "finite population correction". Is this desirable for my requirements?
Here is the math executed on my above numbers. I will use $Z_{\alpha/2} = 2.58$ for $\alpha = 0.01$, $p = 0.001$ and $\mathrm{MOE} = 0.005$. As stated above, $N = 699$ on account of there being 699 failure cases that I would like to sample with a certain level of confidence.
Based on my understanding, what this math will do is recommend a sample size that will show, with 99% confidence, that the sample result is within 0.5 percentage points of reality.
Doing the math, X = 265.989744 and n = 192.8722086653 ≈ 193, implying that I can have a sample size of 193 to fulfill this confidence level and interval.
My main question is whether my assumption about p = 1/1000 is valid. If it's not, and I use the conservative p = 0.5, then my sample size shoots up to 692. So I would like to know if my assumptions about what sample proportion actually is are correct.
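For anyone wanting to check the arithmetic, here is a minimal Python sketch of the formula as I read it from the calculator site. The function name `sample_size` and its parameter names are my own, not taken from any library. It only reproduces the numbers above for both values of p and says nothing about whether the formula suits this problem.

```python
import math


def sample_size(N, z, p, moe):
    """Return (X, n) for the finite-population sample size formula.

    X = z^2 * p * (1 - p) / moe^2   (uncorrected sample size)
    n = N * X / (X + N - 1)         (with finite population correction)
    """
    x = z ** 2 * p * (1 - p) / moe ** 2
    n = N * x / (x + N - 1)
    return x, n


# Numbers from the question: 699 failed rolls, 99% confidence (z = 2.58),
# margin of error of 0.5 percentage points.
for p in (0.001, 0.5):
    x, n = sample_size(N=699, z=2.58, p=p, moe=0.005)
    print(f"p = {p}: X = {x:.6f}, n = {n:.4f}, rounded up to {math.ceil(n)}")
```

With p = 0.001 this rounds up to 193, and with p = 0.5 it rounds up to 692, matching the figures quoted above.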
More broadly, am I on the right track at all with this? From my attempt at demonstrating this probabilistically to my current thought process, is any of this accurate at all?
