I have heteroskedastic data with unequal sample sizes and would like to run a two-way Welch ANOVA.

1.) Is this appropriate? Why or why not?
2.) How do you do this in R?
3.) What are other ways of dealing with this situation?


Deanna Sweeney · Accepted Answer

(1) If group variances differ, it is probably better to use the Welch ANOVA than the standard ANOVA. For a two-sample t test, it is clear from many simulation studies that the Welch test is better than the 'pooled' test. For an ANOVA with only three treatment groups, there are many simulation studies to do. In my view, not enough of them have been done to be sure yet whether the Welch ANOVA should be used as the default method.

(2) See brief demo below. More extensive demonstrations on various Internet sites show more detail, including diagnostics and multiple-comparison procedures.

(3) Depending on the nature of the data, variances might be made more nearly the same by transforming the data. Two examples: if data are exponential, using logs of the data tends to make variances more nearly equal; if data are Poisson, taking square roots of the counts makes variances more nearly equal, but multiple comparisons are not straightforward, and it is not clear that the transformation gives better power.

Illustration of Welch test in R for a one-factor ANOVA design. Heteroscedastic data. For the Welch test notice that denominator df for the F-test is about 18, not 27. For the particular simulated data used, there is little difference in the P-value.

    # Simulated data: 3 groups, 10 replications per group
    set.seed(1214)  # use same seed for same data
    x1 = rnorm(10, 100, 15);  x2 = rnorm(10, 105, 20);  x3 = rnorm(10, 110, 15)
    x = c(x1, x2, x3);  gp = as.factor(rep(1:3, each=10))

    # Welch ANOVA
    oneway.test(x ~ gp)

        One-way analysis of means (not assuming equal variances)

    data:  x and gp
    F = 3.8698, num df = 2.00, denom df = 17.91, p-value = 0.0401

    # standard ANOVA
    summary(aov(x ~ gp))
                Df Sum Sq Mean Sq F value Pr(>F)
    gp           2   2129  1064.7    3.63 0.0402 *
    Residuals   27   7919   293.3
    ---
    Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
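The claim in (3) about Poisson counts can be checked with a small simulation. The sketch below is only illustrative: the group means (5, 20, 50), the sample size, and the seed are assumptions chosen for the demonstration, not values from the answer. It compares group variances before and after the square-root transformation, using base R only.

```r
# Illustrative simulation; the means 5, 20, 50 and n = 1000 are assumed values.
set.seed(101)
n <- 1000
counts <- list(a = rpois(n, 5), b = rpois(n, 20), c = rpois(n, 50))

# Variance of raw counts tracks the group mean (a Poisson property).
raw_var <- sapply(counts, var)

# Variance of square-rooted counts should be far more similar across groups.
sqrt_var <- sapply(counts, function(y) var(sqrt(y)))

# Ratio of largest to smallest group variance, before and after transforming.
raw_ratio  <- max(raw_var) / min(raw_var)
sqrt_ratio <- max(sqrt_var) / min(sqrt_var)

print(round(rbind(raw = raw_var, sqrt = sqrt_var), 2))
print(c(raw_ratio = raw_ratio, sqrt_ratio = sqrt_ratio))
```

If the transformed variances come out nearly equal, a standard ANOVA on sqrt(counts) may be reasonable, although the caveats in (3) about multiple comparisons and power still apply.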


