# F.A.Q.

Here you will find answers to various questions about **R** and **FactoMineR**, as well as more specific ones about graphical options.

# Various questions

How do I install the R software for the first time?

Click here to see an animated tutorial.

How do I install the FactoMineR Rcmdr plug-in with Rcmdr?

Download the package **RcmdrPlugin.FactoMineR** to add the **FactoMineR** GUI in Rcmdr:

- download the **FactoMineR** package (from CRAN or from the **FactoMineR** website)
- download the **Rcmdr** package (from CRAN)
- download the **RcmdrPlugin.FactoMineR** package (from CRAN or from the **FactoMineR** website)
- open an R session, then type: `library(FactoMineR)`
- open an Rcmdr session: `library(Rcmdr)`
- click on *Tools* -> *Load Rcmdr plug-in(s)* and choose **RcmdrPlugin.FactoMineR**

How are missing values taken into account?

By default, **FactoMineR** replaces missing values with the mean of each variable. This is a crude way to handle missing data, especially when the dataset contains many missing values. The **missMDA** package was developed to handle missing values in PCA, MCA and MFA.
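As a rough sketch of how imputation with **missMDA** can precede a PCA, the example below uses the `orange` dataset shipped with **missMDA**, which contains missing values. The choice `ncp = 2` is an assumption made for illustration; the function `estim_ncpPCA()` from the same package can be used to estimate the number of dimensions instead.

```r
library(FactoMineR)
library(missMDA)

# 'orange' ships with missMDA and contains missing values
data(orange)

# Impute the missing entries with a PCA model.
# ncp = 2 is an assumption made for illustration only.
res.comp <- imputePCA(orange, ncp = 2)

# Run the PCA on the completed table
res.pca <- PCA(res.comp$completeObs, graph = FALSE)
```

The completed table is stored in `res.comp$completeObs`; the original `orange` object is left unchanged.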

How does PCA behave in high dimension?

For the moment, **FactoMineR** is not an efficient tool for very high-dimensional datasets, and its graphical representations are not designed to cope with such datasets. However, it will soon be possible to extract only a few scores and loadings from a large dataset, as a preprocessing step for big data.

What is a supplementary variable?

A supplementary variable is a variable that is not taken into account when constructing the factorial axes, *i.e.* when computing the distances between individuals.

Whatever method you use, only active variables will be taken into account for the construction of the factorial plane.

Where do I find scores and loadings in res.pca?

Scores (*i.e.* principal coordinates) are in: `res.pca$ind$coord`

The variance of the individuals' coordinates for a dimension corresponds to the eigenvalue of this dimension.

Loadings (*i.e.* standard coordinates) are not given by **FactoMineR**'s methods. They return principal coordinates.

You can calculate them by dividing the variables' coordinates on each dimension by the square root of that dimension's eigenvalue.

Just type: `sweep(res.pca$var$coord,2,sqrt(res.pca$eig[1:ncol(res.pca$var$coord),1]),FUN="/")`
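The relations above can be checked numerically. The sketch below assumes the `decathlon` dataset shipped with **FactoMineR**, restricted to its first ten (quantitative) columns, and uniform weights on the individuals, so that the variance is computed with a divisor of *n*.

```r
library(FactoMineR)

data(decathlon)
res.pca <- PCA(decathlon[, 1:10], graph = FALSE)

# Eigenvalues of the dimensions kept in the output (5 by default)
eig <- res.pca$eig[seq_len(ncol(res.pca$var$coord)), 1]

# Scores: their variance (divisor n, coordinates are centered)
# should match the eigenvalue of each dimension
scores <- res.pca$ind$coord
score.var <- colMeans(scores^2)

# Loadings: variable coordinates divided by the square root
# of the eigenvalue of each dimension
loadings <- sweep(res.pca$var$coord, 2, sqrt(eig), FUN = "/")

# Each column of loadings should have unit norm
loading.norm <- colSums(loadings^2)
```

Comparing `score.var` with `eig`, and `loading.norm` with 1, gives a quick sanity check of the computation.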

What are contributions?

The contribution of a point to the inertia of an axis is the ratio between the inertia of its projection on this axis and the inertia of the projection of the whole cloud of points on this axis.
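As an illustration, the contributions of the individuals in a PCA can be recomputed by hand. This sketch assumes uniform weights (1/*n*) on the individuals and the `decathlon` dataset shipped with **FactoMineR**; the package reports contributions as percentages.

```r
library(FactoMineR)

data(decathlon)
res.pca <- PCA(decathlon[, 1:10], graph = FALSE)

n <- nrow(res.pca$ind$coord)
eig <- res.pca$eig[seq_len(ncol(res.pca$ind$coord)), 1]

# Inertia of each projected point (weight 1/n times squared coordinate),
# divided by the inertia of the axis (its eigenvalue), in percent
contrib <- 100 * sweep(res.pca$ind$coord^2 / n, 2, eig, FUN = "/")

# Largest gap with the values returned by the package
max.gap <- max(abs(contrib - res.pca$ind$contrib))

# Contributions on an axis add up to 100
total <- colSums(res.pca$ind$contrib)
```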

I deleted some individuals, which removed categories that were taken only by those individuals. R still keeps these categories in memory with 0 individuals. How do I recode the variables?

Suppose your variable of interest is variable *X* with three levels: *A*, *B* and *C*. After you delete individuals, *B* has 0 individuals left.

To delete level *B* from R, type the following code: `dataset[,"X"] <- factor(as.character(dataset[,"X"]))`
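A minimal worked example on a toy data frame (the data and the column `y` are made up for illustration) shows the empty level before and after recoding. Base R's `droplevels()` is an alternative that treats every factor of the data frame at once.

```r
# Toy data: X has three levels A, B and C
dataset <- data.frame(X = factor(c("A", "B", "C", "A", "C")), y = 1:5)

# Delete the only individual taking category B
dataset <- dataset[dataset$X != "B", ]
levels.before <- levels(dataset$X)   # B is still a level, with 0 individuals

# Rebuild the factor so that only observed categories remain
dataset[, "X"] <- factor(as.character(dataset[, "X"]))
levels.after <- levels(dataset$X)

# Alternative: drop unused levels of all factors in the data frame
dataset <- droplevels(dataset)
```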

Do I have to standardize the variables when doing a PCA?

If the variables are not measured in the same units, it is essential to standardize them.

If the variables are measured in the same units, their influence on the analysis depends on their standard deviation. Standardizing gives them all the same importance. With that in mind, whether or not to standardize is your choice.

This package seems to perform PCA based only on the correlation matrix. Is it possible to use the covariance matrix instead?

When you choose to perform an unscaled PCA, a covariance matrix is used instead of a correlation one. Just choose *scale.unit=FALSE* when typing *PCA(...)*.
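The difference between the two analyses shows up in the eigenvalues. The sketch below assumes the `decathlon` dataset shipped with **FactoMineR** and uniform weights on the individuals: with `scale.unit=TRUE` the eigenvalues add up to the number of variables, and with `scale.unit=FALSE` they add up to the sum of the variances (computed with a divisor of *n*).

```r
library(FactoMineR)

data(decathlon)
X <- decathlon[, 1:10]
n <- nrow(X)

# Standardized PCA (correlation matrix) and unscaled PCA (covariance matrix)
res.scaled   <- PCA(X, scale.unit = TRUE,  graph = FALSE)
res.unscaled <- PCA(X, scale.unit = FALSE, graph = FALSE)

# Total inertia of each analysis
inertia.scaled   <- sum(res.scaled$eig[, 1])
inertia.unscaled <- sum(res.unscaled$eig[, 1])

# Sum of the variances with divisor n (var() uses n - 1)
total.var <- sum(apply(X, 2, var)) * (n - 1) / n
```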

In the MFA function, what do the types "c", "n" and "s" mean?

"c" and "s" are for quantitative variables: with "s", variables are scaled to unit variance; with "c", they are only centered.

"n" is for qualitative variables.

By default, all quantitative variables are scaled to unit variance.
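For instance, a call mixing one qualitative group with several scaled quantitative groups could look like the sketch below. It assumes the `wine` dataset shipped with **FactoMineR** (2 qualitative variables followed by 29 quantitative ones); the group names are illustrative labels, and treating the first and last groups as supplementary is a choice made for this example.

```r
library(FactoMineR)

data(wine)

# Six groups of 2, 5, 3, 10, 9 and 2 variables.
# The first group is qualitative ("n"); the others are quantitative
# and scaled to unit variance ("s").
res.mfa <- MFA(wine,
               group = c(2, 5, 3, 10, 9, 2),
               type = c("n", rep("s", 5)),
               name.group = c("orig", "olf", "vis", "olfag", "gust", "ens"),
               num.group.sup = c(1, 6),   # supplementary groups (example choice)
               graph = FALSE)
```

Replacing an "s" with "c" in `type` would keep the corresponding group centered but unscaled.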

# Graphical options

How can I add a title to my graph? Can I change the range of the axes in my graph?

All the graphs are drawn with the functions *plot.PCA()*, *plot.MCA()*, *plot.CA()*, ... To change the graphical options, see the help pages of these functions.

For example, to add a title to a PCA graph and change the range of the x-axis, run the PCA with the option *graph=FALSE*, then draw the graph with the *plot.PCA()* function:

`res.pca = PCA(mydata, graph=FALSE)`

`plot(res.pca, main="Title of my graph", xlim=c(-2,3))`

How do I gather several graphs into one single plot?

Use the relevant plotting function (*plot.PCA* if you are doing a PCA, otherwise the corresponding plotting function) with the argument *new.plot = FALSE*.

For example:

`data(decathlon)`

`res.pca <- PCA(decathlon, quanti.sup = 11:12, quali.sup=13, graph=FALSE)`

`par(mfrow=c(1,2))`

`plot(res.pca, choix="ind", new.plot=FALSE)`

`plot(res.pca, choix="var", new.plot=FALSE)`

I have too many variables to represent and cannot see anything on my graph. How do I display only the best-represented variables?

Use the *lim.cos2.var* option of the function *graph.var()*. It lets you choose the value of the squared cosine below which variables are not drawn.
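A possible usage is sketched below with the `decathlon` dataset shipped with **FactoMineR**. The threshold 0.5 is arbitrary, and the sketch assumes that the threshold is compared with the quality of representation on the plotted plane, that is, the sum of the squared cosines on the two axes. The same option is passed here through the generic `plot()` call on a PCA result.

```r
library(FactoMineR)

data(decathlon)
res.pca <- PCA(decathlon[, 1:10], graph = FALSE)

# Quality of representation of each variable on the first plane
cos2.plane <- rowSums(res.pca$var$cos2[, 1:2])

# Variables expected to remain with a threshold of 0.5 (arbitrary value)
well.represented <- names(cos2.plane)[cos2.plane >= 0.5]

# Draw only the variables whose squared cosine reaches the threshold
plot(res.pca, choix = "var", lim.cos2.var = 0.5)
```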

I would like to represent only supplementary individuals on the graph, how do I remove active ones?

Use the *invisible* option of *plot.PCA()* (or *plot.MCA()*, ...). For example: `plot.PCA(res.pca, choix="ind", invisible="ind")`

For more details, see: `help(plot.PCA)`