How do I know if a Binomial model is appropriate?

I have a question about the number of weeks out of 5 in which an event occurs. I have a frequency table for a sample of 40, with x = 0, 1, 2, 3, 4, 5 and frequencies 2, 7, 11, 12, 6, 2. I have worked out the unbiased population mean and estimate, but I'm not sure whether a binomial model is what I need. I have to decide if a binomial model is appropriate. I can see that the data is discrete, but it's not binary like "event happens" or "event does not happen". It seems relatively symmetrical and almost normally distributed? I'm not really sure how to work this out. Is a binomial model right or not?


Henry Arellano · Accepted Answer

Step 1

If this is your first chi-squared test, the clues in the comments may be a bit too sparse. Without working the problem for you, I offer the following more complete outline. (Use it along with whatever examples your text or class notes may have to offer.)

It is appropriate to try a binomial model, and obviously $n = 5$. From the given data you can find the sample mean of the 40 observations; dividing it by $n$ gives the estimate $\hat{p} = 0.495$. By looking at the PMF of $\mathsf{Binom}(5, 0.495)$, you can find the expected counts $E_i$. (Multiply the probabilities by 40.) Your observed counts are $F = (2, 7, 11, 12, 6, 2)$.

Step 2

Next, you can find the chi-squared statistic $Q = \sum_{i=0}^{5} \frac{(F_i - E_i)^2}{E_i}$, which is approximately distributed as $\mathsf{Chisq}(\nu = 4)$. [Ordinarily, a chi-squared test with 6 categories would have $\nu = 6 - 1 = 5$, but you have used the data to estimate the parameter $p$, so you 'lose' a degree of freedom for that and $\nu = 4$.]

I got $Q = 1.1815$. The critical value for a chi-squared test with $\nu = 4$ at the 5% level is the 95th percentile $c = 9.488$ of $\mathsf{Chisq}(\nu = 4)$. You can find this number in printed tables of the chi-squared distribution or using software (as with R below).

    qchisq(.95, 4)
    [1] 9.487729

This means that you would reject the null hypothesis that the data are consistent with $\mathsf{Binom}(n = 5, p = 0.495)$ only if $Q \ge c = 9.488$.

There is one remaining difficulty. The chi-squared test is usually deemed to be accurate only if all expected counts exceed 5. Your first and last expected counts are too small. One cure for this is to combine 'categories' 0 and 1, and 'categories' 4 and 5.
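As a rough sketch of the computation described so far (before any pooling), the R lines below estimate $p$, build the expected counts, and compute $Q$. The variable names (`obs`, `p_hat`, `expected`, `Q`, `crit`) are my own choices for illustration and are not part of the original question.

```r
# Observed frequencies for x = 0, 1, ..., 5 (taken from the question)
x       <- 0:5
obs     <- c(2, 7, 11, 12, 6, 2)
n_weeks <- 5

# Estimate p as (sample mean) / n
p_hat <- sum(x * obs) / sum(obs) / n_weeks

# Expected counts under Binom(5, p_hat): PMF times the sample size
expected <- sum(obs) * dbinom(x, size = n_weeks, prob = p_hat)

# Pearson chi-squared statistic over all six categories
Q <- sum((obs - expected)^2 / expected)

# df = 6 categories - 1 - 1 estimated parameter = 4
crit <- qchisq(0.95, df = 4)

print(round(expected, 2))  # inspect which expected counts fall below 5
print(Q)
print(Q >= crit)           # TRUE would mean rejecting the binomial model
```

Printing `expected` makes the difficulty with small expected counts visible: the first and last entries come out well below 5.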
In each tail, combine categories by adding the two observed frequencies and adding the two expected frequencies. You will now have four categories and $\nu = 4 - 1 - 1 = 2$ degrees of freedom. Re-compute $Q$ and find the new $c$ (as below). [According to my computations, you will still not reject $H_0$.]

    qchisq(.95, 2)
    [1] 5.991465
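The pooling step could be sketched in R as follows. This is only one way to do it, and the names `obs_pooled`, `exp_pooled`, `Q_pooled`, and `crit_pooled` are hypothetical labels of mine. The block recomputes the expected counts so that it runs on its own.

```r
# Observed frequencies and expected counts under Binom(5, p_hat)
x        <- 0:5
obs      <- c(2, 7, 11, 12, 6, 2)
p_hat    <- sum(x * obs) / sum(obs) / 5
expected <- sum(obs) * dbinom(x, size = 5, prob = p_hat)

# Pool categories {0, 1} and {4, 5}; keep 2 and 3 as they are
obs_pooled <- c(sum(obs[1:2]), obs[3], obs[4], sum(obs[5:6]))
exp_pooled <- c(sum(expected[1:2]), expected[3], expected[4],
                sum(expected[5:6]))

# Chi-squared statistic on the four pooled categories
Q_pooled <- sum((obs_pooled - exp_pooled)^2 / exp_pooled)

# df = 4 categories - 1 - 1 estimated parameter = 2
crit_pooled <- qchisq(0.95, df = 2)
p_value     <- pchisq(Q_pooled, df = 2, lower.tail = FALSE)

print(round(exp_pooled, 2))     # all pooled expected counts should exceed 5
print(Q_pooled)
print(Q_pooled >= crit_pooled)  # TRUE would mean rejecting H0
```

With these data the pooled statistic is roughly 0.5, far below the critical value, which agrees with the conclusion that $H_0$ is not rejected.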
