missMDA tutorials
See the section on missing values to learn more about handling missing values.
You will find some tutorials here.
Videos
- Handling missing values in:
- continuous data sets using PCA (See this video)
- categorical data sets using MCA (See this video)
- Multiple imputation for:
- continuous data sets using PCA (See this video from 11'7 to the end)
Steps to perform PCA with missing values
- estimate the number of dimensions used in the reconstruction formula with the estim_ncpPCA function
- impute the data set with the imputePCA function using the number of dimensions previously calculated (by default, 2 dimensions are chosen)
- perform the PCA on the completed data set using the PCA function of the FactoMineR package
Example
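library(missMDA)
library(FactoMineR)  # provides PCA()
data(orange)
# Estimate the number of dimensions (stored in nb$ncp)
nb = estim_ncpPCA(orange,ncp.max=5)
# Impute the missing values (2 dimensions are used here)
res.comp = imputePCA(orange,ncp=2)
# Run the PCA on the completed data set
res.pca = PCA(res.comp$completeObs)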
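The "reconstruction formula" in the steps above refers to rebuilding the data from a small number of principal components. As a rough illustration of the idea, the base R sketch below alternates between a rank-ncp SVD fit and replacing the missing cells with the fitted values. The function name iterative_pca_impute and the toy data are made up for this example. It is not missMDA's implementation, which by default also scales the variables and uses a regularized version of the algorithm.

```r
# Hypothetical helper: plain (unregularized) iterative PCA imputation.
iterative_pca_impute <- function(X, ncp = 2, max_iter = 1000, tol = 1e-6) {
  X <- as.matrix(X)
  miss <- is.na(X)
  # Start by filling each missing cell with its column mean.
  col_means <- colMeans(X, na.rm = TRUE)
  Xhat <- X
  Xhat[miss] <- col_means[col(X)[miss]]
  for (i in seq_len(max_iter)) {
    # Center the current completed matrix and fit a rank-ncp SVD.
    mu <- colMeans(Xhat)
    Xc <- sweep(Xhat, 2, mu)
    s <- svd(Xc, nu = ncp, nv = ncp)
    fitted <- s$u %*% diag(s$d[seq_len(ncp)], ncp) %*% t(s$v)
    fitted <- sweep(fitted, 2, mu, "+")
    # Update only the cells that were missing; observed cells are kept.
    Xnew <- Xhat
    Xnew[miss] <- fitted[miss]
    delta <- sum((Xnew - Xhat)^2)
    Xhat <- Xnew
    if (delta < tol) break
  }
  Xhat
}

set.seed(1)
# Made-up rank-2 data with three cells removed.
scores <- matrix(rnorm(40), 20, 2)
loadings <- matrix(rnorm(8), 2, 4)
toy <- scores %*% loadings
toy[cbind(c(1, 5, 9), c(2, 3, 1))] <- NA
completed <- iterative_pca_impute(toy, ncp = 2)
```

In practice imputePCA should be preferred; the sketch is only meant to show why the number of dimensions has to be chosen before imputing.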
Steps to perform MCA with missing values
- estimate the number of dimensions used in the reconstruction formula with the estim_ncpMCA function
- impute the data set with the imputeMCA function using the number of dimensions previously calculated (by default, 2 dimensions are chosen); this step imputes the disjunctive matrix used in MCA
- perform the MCA with the MCA function of the FactoMineR package, passing the completed disjunctive matrix through the tab.disj argument
Example
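library(missMDA)
library(FactoMineR)  # provides MCA()
data(vnf)
# Estimate the number of dimensions (stored in nb$ncp)
nb = estim_ncpMCA(vnf,ncp.max=5)
# Impute the disjunctive matrix (4 dimensions are used here)
tab.disj = imputeMCA(vnf, ncp=4)$tab.disj
# Run the MCA on the original data, using the imputed disjunctive matrix
res.mca = MCA(vnf,tab.disj=tab.disj)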
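To make the disjunctive matrix mentioned in the steps above more concrete, here is a small base R sketch that builds one by hand from made-up data. The helper name disjunctive and the toy data frame are assumptions for illustration only. A missing category shows up as NA in every indicator column of that variable, and these are the entries that imputeMCA replaces (the imputed table is described as "fuzzy", so its entries are not restricted to 0 and 1).

```r
# Made-up categorical data with one missing value in `colour`.
toy <- data.frame(
  colour = factor(c("red", "blue", NA, "red")),
  size   = factor(c("S", "L", "L", "S"))
)

# Hypothetical helper: one 0/1 indicator column per category of each variable.
disjunctive <- function(df) {
  blocks <- lapply(names(df), function(v) {
    lev <- levels(df[[v]])
    m <- vapply(lev, function(l) as.numeric(df[[v]] == l), numeric(nrow(df)))
    colnames(m) <- paste(v, lev, sep = "_")
    m
  })
  do.call(cbind, blocks)
}

tab <- disjunctive(toy)
print(tab)  # row 3 has NA in both colour columns
```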
Steps to generate multiple imputed data sets (with continuous variables)
- estimate the number of dimensions used in the reconstruction formula with the estim_ncpPCA function
- generate the imputed data sets with the MIPCA function using the number of dimensions previously calculated (by default, 2 dimensions are chosen)
- visualize the imputed data sets with the plot.MIPCA function
Example
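library(missMDA)
data(orange)
# Estimate the number of dimensions (stored in nb$ncp)
nb = estim_ncpPCA(orange,ncp.max=5)
# Generate the multiple imputed data sets (2 dimensions are used here)
resMI = MIPCA(orange,ncp=2)
# Visualize the imputed data sets (plot dispatches to plot.MIPCA)
plot(resMI)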
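Multiple imputation is useful because the spread between the imputed data sets reflects how uncertain each imputed value is. The base R sketch below summarizes that spread as a per-cell standard deviation across a list of completed data sets. The helper imputation_sd and the simulated list are made up for this example. As far as we know, the imputed data sets returned by MIPCA are stored as a list in resMI$res.MI, so the same helper could presumably be applied there, but that is an assumption to check against the package documentation.

```r
# Hypothetical helper: per-cell standard deviation across completed data sets.
imputation_sd <- function(imputed_list) {
  arr <- simplify2array(lapply(imputed_list, as.matrix))
  apply(arr, c(1, 2), sd)
}

set.seed(42)
# Made-up stand-in for multiple imputation: cell [2, 1] was missing,
# so it is the only cell that differs between the three completed copies.
base <- matrix(c(1, NA, 3, 4, 5, 6), nrow = 3)
imputed_list <- lapply(1:3, function(i) {
  m <- base
  m[2, 1] <- rnorm(1, mean = 2, sd = 0.5)
  m
})

sds <- imputation_sd(imputed_list)
print(sds)  # non-zero only where a value was imputed
```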