This document will show how to perform an ANOVA across multiply imputed data sets.

The ‘miceadds’ package provides a function, ‘mi.anova()’, that fits an analysis of variance to each multiply imputed data set and pools the results.

Let’s begin by reading in the data set.

data_AcadAchiev = read.csv('/Users/jhelm/Desktop/data_AcadAchiev.csv')

Performing an ANOVA

We will need the car package to calculate type 3 sums of squares.

library(car)

As an initial step, we need to make sure we are using an effect coding strategy (sum-to-zero contrasts). This detail is specific to ANOVA, not multiple imputation.

options(contrasts = c('contr.sum', 'contr.poly'))
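To see what effect coding means in practice, the short sketch below prints the contrast matrix R uses for a three-level factor (Guardian has three levels, given its 2 degrees of freedom). The object name effect_codes is only illustrative.

```r
# Effect (sum-to-zero) coding for a factor with 3 levels.
# Each column sums to zero, so the intercept is the unweighted
# mean of the group means rather than the mean of a reference group.
effect_codes <- contr.sum(3)
effect_codes
##   [,1] [,2]
## 1    1    0
## 2    0    1
## 3   -1   -1

# For comparison, R's default dummy (treatment) coding, in which
# the first level serves as the reference group.
contr.treatment(3)
```

With effect coding, the last level is not a reference group; its effect is the negative sum of the other effects.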

Now let's fit the model that tests for differences in first-semester Math scores across Guardian groups.

model.01 = lm(Math01 ~ Guardian, data = data_AcadAchiev)

Anova(model.01, type = 3)
## Anova Table (Type III tests)
## 
## Response: Math01
##              Sum Sq  Df  F value  Pr(>F)    
## (Intercept) 10295.6   1 948.5357 < 2e-16 ***
## Guardian       69.5   2   3.2033 0.04221 *  
## Residuals    2865.5 264                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Performing an ANOVA with Multiply Imputed Data Sets

First, we need to create the multiply imputed data sets.

library(mice)
library(miceadds)

Create the imputed data sets

imp_data = mice(data_AcadAchiev, m = 40, seed = 142)

    # This will create 40 imputed data sets to fill in the missing
    # values from the data set 'data_AcadAchiev'

    # If we set the seed value (Jon recommends this), then we will
    # reproduce the results if we rerun the imputation 
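It can help to confirm that the imputation did what we expect. The sketch below is a minimal check that uses the small 'nhanes' example data shipped with 'mice' (an assumption made only so the code runs without our CSV file) and 5 imputations to keep it fast. The same calls should apply to 'imp_data'.

```r
library(mice)

# Impute the example data twice with the same seed
# (printFlag = FALSE suppresses the iteration log)
imp_a <- mice(nhanes, m = 5, seed = 142, printFlag = FALSE)
imp_b <- mice(nhanes, m = 5, seed = 142, printFlag = FALSE)

# Number of imputed data sets stored in the object
imp_a$m

# Extract the first completed data set and check for leftover missing values
first_completed <- complete(imp_a, 1)
anyNA(first_completed)

# Same seed, same imputations
identical(complete(imp_a, 1), complete(imp_b, 1))
```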

Fit the model to each of the imputed data sets and pool the results with the ‘mi.anova()’ function from the ‘miceadds’ package.

mi.anova(mi.res = imp_data, 
        formula = "Math01 ~ 1 + Guardian", 
        type = 3)
## Univariate ANOVA for Multiply Imputed Data (Type 3)  
## 
## lm Formula:  Math01 ~ 1 + Guardian
## R^2=0.0147 
## ..........................................................................
## ANOVA Table 
##                 SSQ df1     df2 F value  Pr(>F)    eta2 partial.eta2
## Guardian   62.95282   2 58409.1  2.7323 0.06508 0.01472      0.01472
## Residual 4214.45732  NA      NA      NA      NA      NA           NA
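The pooled F value summarizes a separate ANOVA from every imputed data set. According to the 'miceadds' documentation, 'mi.anova()' combines these with the D2 statistic. The sketch below does not reproduce that pooling; it only shows the per-imputation F values that go into it, so we can see how much they vary. It uses the 'nhanes2' example data from 'mice' and the model bmi ~ age as stand-ins (assumptions, since our CSV file is not bundled here).

```r
library(mice)
library(car)

# Effect coding, as in the rest of this document
options(contrasts = c('contr.sum', 'contr.poly'))

# Small example imputation (5 data sets to keep it quick)
imp_example <- mice(nhanes2, m = 5, seed = 142, printFlag = FALSE)

# Fit the same one-way ANOVA to each completed data set
# and keep the type 3 F value for the factor 'age'
f_values <- sapply(seq_len(imp_example$m), function(i) {
  completed <- complete(imp_example, i)
  fit <- lm(bmi ~ age, data = completed)
  Anova(fit, type = 3)['age', 'F value']
})

f_values         # one F value per imputed data set
range(f_values)  # spread across imputations
```

A wide spread across imputations reflects more uncertainty from the missing data, which the pooled test accounts for.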

Performing a Two-Way ANOVA

We will need the car package to calculate type 3 sums of squares.

library(car)

As an initial step, we need to make sure we are using an effect coding strategy (sum-to-zero contrasts). This detail is specific to ANOVA, not multiple imputation.

options(contrasts = c('contr.sum', 'contr.poly'))
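The coding choice matters most once the model includes an interaction. The sketch below uses R's built-in 'warpbreaks' data (a stand-in, not our data) to fit the same two-way model under effect coding and under R's default treatment coding. The type 3 test of a main effect changes with the coding, while the interaction test does not.

```r
library(car)

# Same model, two coding schemes (set per model, so the
# global options() setting is left alone)
fit_sum <- lm(breaks ~ wool * tension, data = warpbreaks,
              contrasts = list(wool = 'contr.sum', tension = 'contr.sum'))
fit_trt <- lm(breaks ~ wool * tension, data = warpbreaks,
              contrasts = list(wool = 'contr.treatment',
                               tension = 'contr.treatment'))

tab_sum <- Anova(fit_sum, type = 3)
tab_trt <- Anova(fit_trt, type = 3)

# Main effect of wool: differs between coding schemes
c(sum = tab_sum['wool', 'F value'], treatment = tab_trt['wool', 'F value'])

# Interaction: identical under both coding schemes
c(sum = tab_sum['wool:tension', 'F value'],
  treatment = tab_trt['wool:tension', 'F value'])
```

Under treatment coding, the "main effect" row tests the effect of wool at the reference level of tension only, which is usually not the question we mean to ask.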

Now let's fit the model that tests for differences in first-semester Math scores by relationship status, biological sex, and their interaction.

model.01 = lm(Math01 ~ Rel_status + Sex + Rel_status * Sex, data = data_AcadAchiev)

Anova(model.01, type = 3)
## Anova Table (Type III tests)
## 
## Response: Math01
##                 Sum Sq  Df   F value  Pr(>F)    
## (Intercept)    28064.8   1 2562.9967 < 2e-16 ***
## Rel_status        13.2   1    1.2056 0.27320    
## Sex               35.3   1    3.2277 0.07355 .  
## Rel_status:Sex     8.6   1    0.7840 0.37673    
## Residuals       2879.9 263                      
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Performing a Two-Way ANOVA with Multiply Imputed Data Sets

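First, we need to create the multiply imputed data sets. If 'imp_data' already exists from the one-way analysis, rerunning this step with the same seed recreates the same object.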

library(mice)
library(miceadds)

Create the imputed data sets

imp_data = mice(data_AcadAchiev, m = 40, seed = 142)

    # This will create 40 imputed data sets to fill in the missing
    # values from the data set 'data_AcadAchiev'

    # If we set the seed value (Jon recommends this), then we will
    # reproduce the results if we rerun the imputation 

Fit the model to each of the imputed data sets and pool the results with the ‘mi.anova()’ function from the ‘miceadds’ package.

mi.anova(mi.res = imp_data, 
        formula = "Math01 ~ 1 + Rel_status + Sex + Rel_status * Sex", 
        type = 3)
## Univariate ANOVA for Multiply Imputed Data (Type 3)  
## 
## lm Formula:  Math01 ~ 1 + Rel_status + Sex + Rel_status * Sex
## R^2=0.0179 
## ..........................................................................
## ANOVA Table 
##                       SSQ df1        df2 F value  Pr(>F)    eta2
## Rel_status        5.84013   1  19297.050  0.4583 0.49842 0.00137
## Sex              69.98155   1   9003.745  5.8390 0.01569 0.01642
## Rel_status:Sex    0.60642   1 105463.481  0.0333 0.85525 0.00014
## Residual       4186.17380  NA         NA      NA      NA      NA
##                partial.eta2
## Rel_status          0.00139
## Sex                 0.01644
## Rel_status:Sex      0.00014
## Residual                 NA
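The effect sizes in the table above can be checked by hand from the SSQ column. The sketch below uses the standard definitions (eta squared as an effect's SSQ over the total SSQ, partial eta squared as an effect's SSQ over that SSQ plus the residual SSQ). That 'mi.anova()' uses exactly these formulas is an assumption, but they reproduce the printed values to five decimals.

```r
# SSQ values copied from the two-way mi.anova() table
ssq <- c(Rel_status       = 5.84013,
         Sex              = 69.98155,
         `Rel_status:Sex` = 0.60642)
ss_resid <- 4186.17380

ss_total <- sum(ssq) + ss_resid

# Eta squared: share of the total sum of squares
eta2 <- ssq / ss_total
round(eta2, 5)

# Partial eta squared: effect relative to effect + residual
partial_eta2 <- ssq / (ssq + ss_resid)
round(partial_eta2, 5)

# Share of the total explained by all effects together
# (compare with the printed R^2)
round(sum(ssq) / ss_total, 4)
```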