I have heteroskedastic data with unequal sample sizes and would like to run a two-way Welch ANOVA. 1.) Is this appropriate? Why or why not? 2.) How do you do this in R? 3.) What are other ways of dealing with this situation?

spasiocuo43

2022-11-04

I have heteroskedastic data with unequal sample sizes and would like to run a two-way Welch ANOVA.
1.) Is this appropriate? Why or why not?
2.) How do you do this in R?
3.) What are other ways of dealing with this situation?

Answer & Explanation

Deanna Sweeney

Beginner | 2022-11-05 | Added 14 answers

(1) If group variances differ, the Welch ANOVA is probably a better choice than the standard ANOVA. For the two-sample t test, many simulation studies make it clear that the Welch test is better than the 'pooled' test. For an ANOVA, even with only three treatment groups, there are many combinations of group sizes and variances to simulate. In my view, not enough such studies have been done to be sure yet whether the Welch ANOVA should be the default method.
(2) See the brief demo below. Note that it covers a one-factor design, because oneway.test in R handles a single grouping factor. More extensive demonstrations on various Internet sites show more detail, including diagnostics and multiple-comparison procedures.
(3) Depending on the nature of the data, a transformation might make the variances more nearly equal. Two examples: if the data are exponential, taking logs tends to make the variances more nearly equal; if the data are Poisson, taking square roots of the counts makes the variances more nearly equal. With transformed data, however, multiple comparisons are not straightforward, and it is not clear that the transformation gives better power.
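The transformations mentioned in (3) can be checked with a small simulation. This is only a sketch: the seed, group sizes, rates, and Poisson means below are arbitrary values chosen for illustration, not taken from the question.

```r
# Sketch: compare group standard deviations before and after a transformation.
# All numeric settings here are illustrative assumptions.
set.seed(2022)  # arbitrary seed, only for reproducibility

# Exponential data: the SD equals the mean, so raw group SDs differ a lot.
e1 <- rexp(200, rate = 1/5)    # group with mean 5
e2 <- rexp(200, rate = 1/50)   # group with mean 50
sd_raw_exp <- c(sd(e1), sd(e2))
sd_log_exp <- c(sd(log(e1)), sd(log(e2)))

# Poisson counts: the variance equals the mean.
p1 <- rpois(200, lambda = 10)
p2 <- rpois(200, lambda = 100)
sd_raw_pois  <- c(sd(p1), sd(p2))
sd_sqrt_pois <- c(sd(sqrt(p1)), sd(sqrt(p2)))

# One row per scale, one column per group
print(round(rbind(sd_raw_exp, sd_log_exp, sd_raw_pois, sd_sqrt_pois), 2))
```

In each pair of rows, the raw SDs should differ substantially between the two groups, while the SDs on the transformed scale should be much closer to each other.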
Illustration of the Welch test in R for a one-factor ANOVA design with heteroscedastic data. For the Welch test, notice that the denominator df for the F test is about 18, not 27 as in the standard ANOVA. For the particular simulated data used, there is little difference in the P-value (0.0401 versus 0.0402).
# Simulated data: 3 groups, 10 replications per group
set.seed(1214) # use same seed for same data
x1 = rnorm(10, 100, 15); x2 = rnorm(10, 105, 20); x3 = rnorm(10, 110, 15)
x = c(x1, x2, x3); gp = as.factor(rep(1:3, each=10))
# Welch ANOVA
oneway.test(x ~ gp)
One-way analysis of means (not assuming equal variances)
data: x and gp
F = 3.8698, num df = 2.00, denom df = 17.91, p-value = 0.0401

# standard ANOVA
summary(aov(x ~ gp))
            Df Sum Sq Mean Sq F value Pr(>F)
gp           2   2129  1064.7    3.63 0.0402 *
Residuals   27   7919   293.3
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
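
The kind of simulation study mentioned in (1) can be sketched in a few lines. The example below estimates how often each test rejects at the 5% level when all group means are actually equal. The group sizes, standard deviations, seed, and number of runs are assumptions made for illustration; they put the largest variance in the smallest group, a setting where the standard ANOVA is generally expected to reject too often.

```r
# Sketch: Type I error of Welch vs. standard one-way ANOVA under H0.
# Group sizes, SDs, seed, and n_sim are illustrative assumptions.
set.seed(101)
n     <- c(5, 10, 30)    # unequal group sizes
s     <- c(30, 15, 10)   # smallest group has the largest SD
gp    <- factor(rep(1:3, times = n))
n_sim <- 2000            # number of simulated data sets

one_run <- function() {
  # All groups share the same mean, so every rejection is a false positive
  x <- rnorm(sum(n), mean = 100, sd = rep(s, times = n))
  c(welch  = oneway.test(x ~ gp)$p.value,
    pooled = oneway.test(x ~ gp, var.equal = TRUE)$p.value)
}

pvals <- replicate(n_sim, one_run())     # 2 x n_sim matrix of P-values
reject_rate <- rowMeans(pvals < 0.05)    # estimated Type I error rates
print(reject_rate)
```

A rejection rate near 0.05 indicates that a test holds its nominal level. Changing `n` and `s` shows how strongly the comparison depends on the particular combination of sizes and variances, which is why a single configuration settles little.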
