 Baardegem3Gw

2022-11-24

How do I know if a Binomial model is appropriate?
I have a question about the number of weeks out of 5 in which an event occurs. I have a frequency table for a sample of 40, with $x=0,1,2,3,4,5$ and frequencies 2, 7, 11, 12, 6, 2.
I have worked out the unbiased estimate of the population mean, but I'm not sure whether a binomial is what I need or not. I have to decide if a binomial model is appropriate.
I can see that the data is discrete, but it's not binary like "event happens" or "event does not happen". It seems relatively symmetrical, and almost normally distributed? I'm not really sure how to work this out. Is a binomial model right or not?

Henry Arellano

Expert

Step 1
If this is your first chi-squared test, the clues in the comments may be a bit too sparse. Without working the problem for you, I offer the following more complete outline: (Use it along with whatever examples your text or class notes may have to offer.)
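It is appropriate to try a binomial model, and obviously $n=5$ (the five weeks are the trials, and each week the event either occurs or does not). From the given data you can find the sample mean of the 40 observations, $\overline{x}=99/40=2.475.$ Because a binomial mean is $np,$ the estimate of $p$ is $\hat{p}=\overline{x}/5=0.495.$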
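As a rough sketch of that first step, the R lines below compute the sample mean from the frequency table and turn it into an estimate of $p$. The variable names (`x`, `f`, `n_obs`, `xbar`, `p_hat`) are illustrative choices, not part of the original answer.

```r
# Frequency table from the question: x = weeks with the event, f = count
x <- 0:5
f <- c(2, 7, 11, 12, 6, 2)

n_obs <- sum(f)             # total number of observations (40)
xbar  <- sum(x * f) / n_obs # sample mean, 99 / 40
p_hat <- xbar / 5           # a binomial mean is n * p, with n = 5

c(n_obs = n_obs, xbar = xbar, p_hat = p_hat)
```

This gives a mean of 2.475 and an estimate of 0.495 for $p$, the value used in the rest of the outline.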
By looking at the PMF of $\mathsf{B}\mathsf{i}\mathsf{n}\mathsf{o}\mathsf{m}\left(5,0.495\right)$ you can find the expected counts ${E}_{i}$ (multiply the probabilities by 40). Your observed counts are $F=\left(2,7,11,12,6,2\right).$
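One way to get the expected counts, assuming the estimate $p=0.495$ from the sample mean, is to let `dbinom` supply the binomial probabilities and scale them by the sample size. The names `probs` and `E` are illustrative.

```r
# Observed frequencies for x = 0, 1, ..., 5
f <- c(2, 7, 11, 12, 6, 2)

# Binomial probabilities P(X = 0), ..., P(X = 5) for n = 5, p = 0.495
probs <- dbinom(0:5, size = 5, prob = 0.495)

# Expected counts: probabilities scaled by the sample size (40)
E <- sum(f) * probs
round(E, 2)  # roughly 1.31 6.44 12.62 12.37 6.06 1.19
```

The two outer expected counts come out well below 5, which matters for the validity of the chi-squared approximation later on.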
Step 2
Next, you can find the chi-squared statistic $Q=\sum _{i=0}^{5}\frac{{\left({F}_{i}-{E}_{i}\right)}^{2}}{{E}_{i}},$ which is approximately distributed as $\mathsf{C}\mathsf{h}\mathsf{i}\mathsf{s}\mathsf{q}\left(\nu =4\right).$ [Ordinarily, a chi-squared test with 6 categories would have $\nu =6-1=5,$ but you have used the data to estimate the parameter $p,$ so you 'lose' a degree of freedom for that and $\nu =4.$]
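The statistic can be computed in a couple of lines. This is a minimal sketch that assumes the six original categories and $p=0.495$; `f`, `E` and `Q` are illustrative names.

```r
f <- c(2, 7, 11, 12, 6, 2)                         # observed counts F_i
E <- sum(f) * dbinom(0:5, size = 5, prob = 0.495)  # expected counts E_i

# Chi-squared goodness-of-fit statistic: sum of (F_i - E_i)^2 / E_i
Q <- sum((f - E)^2 / E)
Q  # about 1.18
```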
I got $Q=1.1815.$ The critical value for a chi-squared test with $\nu =4$ at the 5% level is the 95th percentile $c=9.487$ of $\mathsf{C}\mathsf{h}\mathsf{i}\mathsf{s}\mathsf{q}\left(\nu =4\right).$ You can find this number in printed tables of the chi-squared distribution or using software (as with R below).
qchisq(.95, 4)
9.487729
This means that you would reject the null hypothesis that the data are consistent with $\mathsf{B}\mathsf{i}\mathsf{n}\mathsf{o}\mathsf{m}\left(n=5,p=0.495\right)$ only if $Q\ge c=9.487.$ Since $Q=1.1815$ is far below $c,$ you do not reject.
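The decision rule can also be checked directly in R, along with a p-value, which the outline itself does not use. The sketch takes $Q=1.1815$ from the answer as given. The names `crit`, `reject` and `p_value` are illustrative.

```r
Q    <- 1.1815                 # statistic reported in the answer
crit <- qchisq(0.95, df = 4)   # 5% critical value, 9.487729

reject <- Q >= crit            # FALSE: Q is far below the critical value

# Upper-tail probability of Chisq(4) beyond Q (the p-value)
p_value <- pchisq(Q, df = 4, lower.tail = FALSE)
c(crit = crit, reject = reject, p_value = p_value)
```

A p-value near 0.88 tells the same story as the critical-value comparison: the data give no reason to doubt the binomial model.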
There is one remaining difficulty. The chi-squared test is usually deemed to be accurate only if all expected counts exceed 5. Your first and last expected counts (for $x=0$ and $x=5$) are too small. One cure for this is to combine 'categories' 0 and 1, and 'categories' 4 and 5. In each tail, combine categories by adding the two observed frequencies and adding the two expected frequencies.
You will now have four categories and $\nu =4-1-1=2$ degrees of freedom. Re-compute $Q$ and find the new $c$ (as below). [According to my computations, you will still not reject ${H}_{0}.$]
qchisq(.95, 2)
 5.991465
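The pooled version of the test can be sketched as follows, assuming the same data and $p=0.495$. The grouping vector `grp` and the names `f_pooled`, `E_pooled` and `Q_pooled` are illustrative.

```r
f <- c(2, 7, 11, 12, 6, 2)
E <- sum(f) * dbinom(0:5, size = 5, prob = 0.495)

# Pool x = 0 with x = 1, and x = 4 with x = 5
grp      <- c(1, 1, 2, 3, 4, 4)
f_pooled <- as.vector(tapply(f, grp, sum))  # 9 11 12 8
E_pooled <- as.vector(tapply(E, grp, sum))  # all expected counts now exceed 5

Q_pooled <- sum((f_pooled - E_pooled)^2 / E_pooled)

# Four categories, minus 1, minus 1 for the estimated p
crit <- qchisq(0.95, df = length(f_pooled) - 1 - 1)
c(Q_pooled = Q_pooled, crit = crit, reject = Q_pooled >= crit)
```

The pooled statistic comes out at roughly 0.5, well under 5.991, so the conclusion does not change.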
