Multithreading
Mice.jl supports multithreading. To start with, you need to make sure that your Julia session is started with multiple threads. See the multi-threading section of the Julia manual for information on how to do this.
Once this is done, you can run mice in multithreaded mode by setting the threads argument to true. This causes the imputations to be computed in parallel.
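Before setting the threads argument, it can help to confirm that the session really has more than one thread. The following is a minimal sketch, assuming Julia was launched with the --threads flag or with the JULIA_NUM_THREADS environment variable set. The helper name hasMultipleThreads is made up for illustration and is not part of Mice.jl.

```julia
# Launch Julia with several threads first, for example:
#   julia --threads 4
# or set the JULIA_NUM_THREADS environment variable before starting Julia.

# Hypothetical helper (not part of Mice.jl): true if more than one thread is available
hasMultipleThreads() = Threads.nthreads() > 1

if hasMultipleThreads()
    println("Running with ", Threads.nthreads(), " threads")
else
    @warn "Julia is running with a single thread, so threads = true is unlikely to give a speed-up"
end
```

If the check passes, a call to mice with threads = true should be able to spread the imputations across the available threads.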
If you instead want to run the entire mice function in parallel, you can do something like this:
using CSV, DataFrames, Mice, Random
myData = CSV.read("test/data/cirrhosis.csv", DataFrame);
# Defining missing values
colsWithMissings = ["Drug", "Ascites", "Hepatomegaly", "Spiders", "Cholesterol", "Copper", "Alk_Phos", "SGOT", "Tryglicerides", "Platelets", "Prothrombin", "Stage"];
myData[!, colsWithMissings] = allowmissing(myData[!, colsWithMissings]);
for i in colsWithMissings
    replace!(myData[!, i], "NA" => missing)
end
for i in ["Cholesterol", "Copper", "Alk_Phos", "SGOT", "Tryglicerides", "Platelets", "Prothrombin"]
    myData[!, i] = passmissing(x -> parse(Float64, x)).(myData[!, i])
end
myMethods = makeMethods(myData);
myMethods[["ID", "N_Days"]] .= "";
myPredictorMatrix = makePredictorMatrix(myData);
myPredictorMatrix[:, ["ID", "N_Days"]] .= false;
Random.seed!(1234); # Set random seed for reproducibility
imputedData = Vector{Mids}(undef, 10); # Initialise vector of Mids outputs
Threads.@threads for i in 1:10 # Number of parallel runs
    # Produces 5 x 10 = 50 imputed datasets in 10 separate Mids objects
    imputedData[i] = mice(myData, m = 5, predictorMatrix = myPredictorMatrix, methods = myMethods, threads = false, progressReports = false)
end
imputedData = bindImputations(imputedData) # Binds the separate Mids objects into a single output
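
The example above follows a common pattern: preallocate one output slot per run, let each loop iteration write only to its own slot, and combine the results once the loop has finished. The sketch below shows the same pattern without Mice.jl so that it can be run on its own. The function simulate is a hypothetical stand-in for an expensive, independent job such as a single mice call, and the final sum plays the role that bindImputations plays above.

```julia
using Base.Threads

# Hypothetical stand-in for an expensive, independent job (e.g. one mice run)
simulate(i) = sum(abs2, 1:i)

results = Vector{Int}(undef, 10)   # one preallocated slot per parallel run

@threads for i in 1:10
    results[i] = simulate(i)       # each iteration writes only to its own slot
end

combined = sum(results)            # combine the results after the loop has finished
```

Because no two iterations write to the same index, the loop needs no locks. Anything that mutates shared state, such as pushing to a common vector, would need extra care.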