Loading Plot in R (8 Examples)

 

In this tutorial, I’ll explain how to draw loading plots for principal component analysis (PCA) using the R programming language.

Let’s dive into it!

 

Relevant Packages & Dataset

First, we need to install and load the necessary libraries. If the packages are already installed, you can skip this step and go directly to the next code chunk.

install.packages("factoextra") # install packages
install.packages("MASS")
install.packages("ggplot2")

To load the libraries into your environment, run the following lines of code.

library(factoextra) # load libraries
library(MASS)
library(ggplot2)

Once all libraries are installed and loaded, we can load our sample data. In this tutorial, we will use the Boston dataset, which contains information collected by the U.S. Census Service concerning housing in the area of Boston, Massachusetts.

It’s a suitable dataset for PCA since most of the variables are continuous. The non-continuous ones, chas and rad, will be dropped after loading the data.

data("Boston") # load dataset
Boston_cont <- Boston[, !(names(Boston) %in% c("chas", "rad"))] # drop non-continuous variables

Let’s take a look at the first rows of the final data, as printed by head(Boston_cont).

#      crim zn indus   nox    rm  age    dis tax ptratio  black lstat medv
# 1 0.00632 18  2.31 0.538 6.575 65.2 4.0900 296    15.3 396.90  4.98 24.0
# 2 0.02731  0  7.07 0.469 6.421 78.9 4.9671 242    17.8 396.90  9.14 21.6
# 3 0.02729  0  7.07 0.469 7.185 61.1 4.9671 242    17.8 392.83  4.03 34.7
# 4 0.03237  0  2.18 0.458 6.998 45.8 6.0622 222    18.7 394.63  2.94 33.4
# 5 0.06905  0  2.18 0.458 7.147 54.2 6.0622 222    18.7 396.90  5.33 36.2
# 6 0.02985  0  2.18 0.458 6.430 58.7 6.0622 222    18.7 394.12  5.21 28.7

Now all libraries and the dataset are loaded. The next step is to perform PCA.

 

Principal Component Analysis

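To run the PCA, we will use the prcomp() function with scaling enabled, so that each variable is standardized to unit variance before the components are computed. For details, see Principal Component Analysis (PCA) in R.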

pca_result <- prcomp(Boston_cont,  # perform PCA on standardized variables
                     scale. = TRUE)

Now that we have performed the PCA, we can visualize the computed loadings via loading plots, also referred to as correlation circles.
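Before drawing anything, it can help to look at the numbers behind the plot. The following is a minimal sketch using only base R and MASS. The object names loadings, var_coord, and var_cor are illustrative and not part of the tutorial. It assumes, in line with my reading of the factoextra documentation, that the arrows drawn for a prcomp object are the loadings scaled by the component standard deviations, which for standardized data equal the correlations between variables and components.

```r
library(MASS)  # provides the Boston data

data("Boston")
Boston_cont <- Boston[, !(names(Boston) %in% c("chas", "rad"))]
pca_result <- prcomp(Boston_cont, scale. = TRUE)

# Loadings (eigenvectors): one column per principal component
loadings <- pca_result$rotation
round(loadings[, 1:2], 3)

# Assumed arrow coordinates: loadings scaled by the component standard deviations
var_coord <- sweep(loadings, 2, pca_result$sdev, `*`)

# For standardized data, these match the variable-component correlations
var_cor <- cor(Boston_cont, pca_result$x)
all.equal(var_coord, var_cor, check.attributes = FALSE)
```

Because the coordinates are correlations, every arrow lies inside the unit circle, which is where the name correlation circle comes from.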

 

Example 1: Basic Loading Plot

Let’s start with a basic loading plot. We simply need to pass our PCA object, pca_result, to the fviz_pca_var() function.

basic_lp <- fviz_pca_var(pca_result) # draw loading plot
basic_lp # print basic_lp

The execution of the previous R code returns the following output.

 
[Figure: basic loading plot]
 

It already looks nice. However, there are many customizations we can apply to improve the representation. In the next step, we will see how the color of the loading vectors can be changed.

 

Example 2: Loading Plot Colored

To color the loading vectors in the loading plot, we will use the col.var argument. We will set green4 as the color, but you can use any color you want.

col_lp <- fviz_pca_var(pca_result, # plot colored loading plot
                       col.var = "green4")
col_lp # print col_lp

The execution of the code leads to the following change.

 
[Figure: loading plot with green loading vectors]
 

Well done! The loading vectors are now drawn in green. The good news is that there are more advanced ways to color the loading vectors. Let’s move on to the next example!

 

Example 3: Loading Plot Colored by Contribution

In this example, we will color the loading vectors by their contributions (in percent) to the principal components. To do this, we will set the col.var argument to "contrib".

col_cont_lp <- fviz_pca_var(pca_result, # color by contribution
                            col.var = "contrib")
col_cont_lp # print col_cont_lp

See the visual below for the result.

 
[Figure: loading plot colored by contribution]
 

As you can see, longer vectors are associated with higher contributions. For further details, see Loading Plot Explained.
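The values behind this coloring can be approximated by hand. The sketch below uses base R and MASS only. The names var_coord, cos2, contrib, and arrow_len2 are illustrative, and the formulas follow my reading of the factoextra documentation (cos2 as squared coordinates, contribution as the percentage share of cos2 within a component), so treat it as an approximation of what the package computes rather than its exact internals.

```r
library(MASS)  # provides the Boston data

data("Boston")
Boston_cont <- Boston[, !(names(Boston) %in% c("chas", "rad"))]
pca_result <- prcomp(Boston_cont, scale. = TRUE)

# Variable coordinates: loadings scaled by the component standard deviations
var_coord <- sweep(pca_result$rotation, 2, pca_result$sdev, `*`)

# cos2: squared coordinates (quality of representation on each component)
cos2 <- var_coord^2

# Contribution (%) of each variable to each component
contrib <- sweep(cos2, 2, colSums(cos2), `/`) * 100

# Variables ranked by their contribution to PC1
round(sort(contrib[, "PC1"], decreasing = TRUE), 2)

# Squared arrow length on the PC1-PC2 plane
arrow_len2 <- rowSums(cos2[, 1:2])
round(sort(arrow_len2, decreasing = TRUE), 2)
```

Since the squared arrow length on the plotted plane is the sum of the cos2 values on PC1 and PC2, variables with long arrows are also the ones with large cos2 and, in turn, large contributions.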

One may also want to change the color palette used in coloring by contribution. Let’s explore how this is done in the next example!

 

Example 4: Loading Plot Colored by Contribution (Customized)

In order to change the colors used in coloring by contribution, we need to use the gradient.cols argument.

col_cont_cus_lp <- fviz_pca_var(pca_result, # change contribution colors
                                col.var = "contrib",
                                gradient.cols = c("white", "steelblue", "red3"))
col_cont_cus_lp # print col_cont_cus_lp

We have defined white, steelblue, and red3 as gradient colors. See below how this intervention affected our graph.

 
[Figure: loading plot colored by contribution with a customized gradient]
 

As seen, variables with lower contributions are colored pale blue, while variables with higher contributions are colored red.

If you are only interested in the top contributing variables, you can also filter out the rest. Let’s demonstrate it with an example!

 

Example 5: Loading Plot Filtered by Contribution

Here we will show how to filter variables based on their contributions. In this example, the select.var argument will help us to visualize the top 6 variables with the highest contribution and discard the rest.

filt_cont_lp <- fviz_pca_var(pca_result, # filter by contribution
                             select.var = list(contrib = 6),
                             col.var = "red3")
filt_cont_lp # print filt_cont_lp

You can see the filtered graph below.

 
[Figure: loading plot filtered by contribution]
 

Only the six variables with the highest contributions to the principal components remain in the plot. The rest have been filtered out.

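If you would like to filter variables by a threshold value rather than by rank, it is better to filter based on cos2, which is a related measure. For instance, select.var = list(cos2 = 0.5) keeps only the variables with cos2 values higher than 0.5. The two measures are related: a variable’s contribution to a component is its cos2 on that component, expressed as a percentage of the component’s total cos2 over all variables.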
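A cos2-based filter could look like the following sketch. It relies on the select.var behavior described in the factoextra documentation, where a cos2 value between 0 and 1 acts as a threshold and a value above 1 selects the top variables by cos2. The object name filt_cos2_lp and the choice of 0.5 as the threshold are illustrative.

```r
library(factoextra)
library(MASS)  # provides the Boston data

data("Boston")
Boston_cont <- Boston[, !(names(Boston) %in% c("chas", "rad"))]
pca_result <- prcomp(Boston_cont, scale. = TRUE)

# Keep only variables whose cos2 on the plotted plane exceeds 0.5
filt_cos2_lp <- fviz_pca_var(pca_result,
                             select.var = list(cos2 = 0.5),
                             col.var = "cos2") # color by cos2 as well
filt_cos2_lp # print filt_cos2_lp
```

Coloring by cos2 is optional here. It simply makes the filtering criterion visible in the plot.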

Next, we will select the variables to display by name. Let’s jump into the example!

 

Example 6: Loading Plot Filtered by Variable

If you want to show only certain variables in your loading plot, you can use the following code as a template.

filt_var_lp <- fviz_pca_var(pca_result, # filter variables
                            select.var = list(name = c("zn",
                                                       "indus",
                                                       "age")),
                            col.var = "green4")
filt_var_lp # print filt_var_lp

We have selected the "zn", "indus" and "age" variables to visualize in our graph. See the output below.

 
[Figure: loading plot filtered by variable]
 

In the next section, we will turn to the layout of the plot. As you may have noticed in the previous plots, some vector labels overlap. In the next example, we will learn how to avoid this overlap, along with other ways to increase the readability of the graph.

 

Example 7: Loading Plot with Increased Readability

To increase the readability of the graph, we will increase the size of vectors and labels, and implement label repulsion to prevent overlap.

arsz_inc_lp <- fviz_pca_var(pca_result, # increase arrow and label size
                            repel = TRUE,
                            arrowsize = 0.8,
                            labelsize = 6)
arsz_inc_lp # print arsz_inc_lp

As shown, the repel argument has been set to TRUE, and arrowsize and labelsize have been increased to 0.8 and 6, respectively. See the adjusted graph below.

 
[Figure: loading plot with increased readability]
 

Now we have a loading plot in which the vectors and labels are emphasized. You should experiment with the sizes to find the best appearance for your plot. Note also that the vector labels no longer overlap.

In the next section, we will relabel the main title and the axis titles. Without further ado, let’s jump into the example!

 

Example 8: Loading Plot with Customized Titles

We have covered different visual adjustments of the loading vectors in the previous examples. Here, in contrast, we will modify the main title and axis labels.

cus_ttl_lp <- fviz_pca_var(pca_result, # change axis and plot titles
                           title = "Principal Component Analysis (PCA) Loadings",
                           xlab = "PC1 (49.9%)",
                           ylab = "PC2 (12.6%)")
cus_ttl_lp # print cus_ttl_lp

As you can see, the arguments title, xlab, and ylab have been used to redefine our plot title and axis labels. Let’s take a look at the result.

 
[Figure: loading plot with customized titles]
 

By modifying the titles in your plot, you can present the information in a way that is more comprehensible and relatable to your audience.
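Rather than typing the explained-variance percentages into the axis labels by hand, they can be derived from the PCA object. The sketch below uses base R and MASS only. The names var_pct, x_label, and y_label are illustrative, and the one-decimal rounding is simply a formatting choice.

```r
library(MASS)  # provides the Boston data

data("Boston")
Boston_cont <- Boston[, !(names(Boston) %in% c("chas", "rad"))]
pca_result <- prcomp(Boston_cont, scale. = TRUE)

# Share of the total variance explained by each component, in percent
var_pct <- pca_result$sdev^2 / sum(pca_result$sdev^2) * 100

# Axis labels built from the computed percentages
x_label <- sprintf("PC1 (%.1f%%)", var_pct[1])
y_label <- sprintf("PC2 (%.1f%%)", var_pct[2])
x_label
y_label
```

The resulting strings can then be passed to the xlab and ylab arguments of fviz_pca_var(), so the labels stay correct if the set of variables changes.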
 

Video, Further Resources & Summary

Do you need more explanations on how to visualize loadings after performing PCA in R? Then you might check out the accompanying video tutorial on the Statistics Globe YouTube channel, which explains how to draw loading plots in PCA using R.

 

 

This article has demonstrated how to create loading plots in R. If you have further questions, you may leave a comment below.

 

Rana Cansu Kebabci Statistician & Data Scientist

This page was created in collaboration with Cansu Kebabci. You might have a look at Cansu’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.

 
