Analysis (with)
Once you have a Mids object containing imputed data, you can use it to perform repeated analyses.
Inspecting imputed data
If you just want to inspect the outcome of the imputation process, you can use the complete or listComplete function to fill in the missing values in the original data frame.
Mice.complete — Function
complete(
    mids::Mids,
    imputation::Int
)
Produces a data table with missings replaced with imputed values from a multiply imputed dataset (Mids) object.
The Mids object must be supplied first.
The imputation argument is an integer identifying which specific imputation is to be used to fill in the missing values.
Mice.listComplete — Function
listComplete(
    mids::Mids
)
Summarises the outputs of all imputations in a multiply imputed dataset (Mids) as a list of completed datasets.
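To make the idea of "filling in" concrete, here is a minimal sketch in plain Julia of what completing a single column involves. It is not Mice.jl's implementation: the helper fill_missing, the toy cholesterol column and the imputations vector are all hypothetical names and values invented for illustration.

```julia
# Hypothetical helper, NOT part of Mice.jl: replaces each missing entry of
# `column` with the next value from `imputed`, in order of appearance.
function fill_missing(column::AbstractVector, imputed::AbstractVector)
    n_missing = count(ismissing, column)
    n_missing == length(imputed) ||
        throw(ArgumentError("expected $n_missing imputed values, got $(length(imputed))"))

    filled = Vector{nonmissingtype(eltype(column))}(undef, length(column))
    next = 1
    for i in eachindex(column)
        if ismissing(column[i])
            filled[i] = imputed[next]   # take the next imputed value
            next += 1
        else
            filled[i] = column[i]       # observed values are kept unchanged
        end
    end
    return filled
end

# Toy column with two missing entries (made-up values)
cholesterol = [261.0, missing, 176.0, missing]

# Two hypothetical imputations, each supplying one value per missing entry
imputations = [[302.0, 244.0], [198.0, 280.0]]

# Analogous in spirit to complete(mids, 1): one completed column
completed_1 = fill_missing(cholesterol, imputations[1])

# Analogous in spirit to listComplete(mids): one completed column per imputation
all_completed = [fill_missing(cholesterol, imp) for imp in imputations]
```

Observed values are identical across the completed columns; only the entries that were originally missing differ between imputations.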
Data analysis
To perform a data analysis procedure on each imputed dataset in turn, use the with function. The with function returns the results of the analyses wrapped in a Mira object.
Mice.Mira — Type
Mira
A multiply imputed repeated analyses object.
The analyses are stored as a vector of analyses of individual imputations.
Mice.with — Function
with(
    mids::Mids,
    func::Function
)
Conducts repeated analyses of a multiply imputed dataset (Mids).
The with function takes two arguments: first the Mids object itself, then a function (func). func should take the form data -> analysisFunction(arguments, data, moreArguments...), where data marks the position of the data argument in the analysis function.
For example: with(mids, data -> lm(@formula(y ~ x1 + x2), data))
The with function requires the use of a closure, which then permits the function to run the specified analysis procedure on each imputed dataset in turn. For example:
using CategoricalArrays, CSV, DataFrames, GLM, Mice, Random, Statistics
myData = CSV.read("test/data/cirrhosis.csv", DataFrame, missingstring = "NA");
myData.Stage = categorical(myData.Stage); # Making the Stage variable categorical
myPredictorMatrix = makePredictorMatrix(myData);
myPredictorMatrix[:, ["ID", "N_Days"]] .= false;
Random.seed!(1234); # Set random seed for reproducibility
imputedData = mice(myData, predictorMatrix = myPredictorMatrix);
analysesMeans = with(imputedData, data -> mean(data.Cholesterol));
# returns Mira of the mean of Cholesterol in each imputed dataset
analysesLMs = with(imputedData, data -> lm(@formula(N_Days ~ Drug + Age + Stage + Bilirubin), data));
# returns Mira of linear model outputs from each imputed dataset
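
The closure pattern used above can also be illustrated without Mice.jl. The following is a minimal sketch under stated assumptions: apply_to_each is a hypothetical stand-in for with (it returns a plain vector, not a Mira), and the two "completed datasets" are made-up named tuples rather than real imputations.

```julia
using Statistics

# Hypothetical stand-in for `with`, NOT Mice.jl's implementation:
# runs `func` on each completed dataset and collects the results in a vector.
function apply_to_each(datasets::AbstractVector, func::Function)
    return [func(data) for data in datasets]
end

# Two toy completed datasets (made-up values), each with one column
completed = [
    (Cholesterol = [261.0, 302.0, 176.0, 244.0],),
    (Cholesterol = [261.0, 198.0, 176.0, 280.0],),
]

# `data` marks where each dataset is slotted into the analysis call
means = apply_to_each(completed, data -> mean(data.Cholesterol))

# The closure can capture extra arguments from the surrounding scope
threshold = 200.0
counts = apply_to_each(completed, data -> count(>(threshold), data.Cholesterol))
```

Writing the analysis as data -> ... leaves every other argument fixed and lets the caller decide where the dataset goes, which is why the same pattern works for both a simple mean and a model-fitting call such as lm.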
