Draw Autoplot of PCA in Python (2 Examples)


On this page you’ll learn how to create an autoplot of a Principal Component Analysis (PCA) in the Python programming language.

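The table of contents is structured as shown below:

1) Example Data and Add-On Libraries
2) Scale the Data and Perform the PCA
3) Example 1: Basic Autoplot of PCA
4) Example 2: Autoplot of PCA as Biplot
5) Video, Further Resources & Summary

Let’s get started!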


Example Data and Add-On Libraries

To draw an autoplot of a PCA in Python, we need a few libraries for data handling, scaling, fitting the PCA, and plotting. Please load them before we start.

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
from sklearn.decomposition import PCA
from sklearn.datasets import load_wine

Now, it’s time to prepare our data. For this tutorial, we will use the wine dataset from the scikit-learn library. To import it, we will use the load_wine() function.

wine = load_wine()

After this, we will convert it into a pandas DataFrame, so we can run our PCA. We will also define our target, which corresponds to the type of wine in our data.

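df = pd.DataFrame(wine.data,
                  columns = wine.feature_names)  # one column per feature
target = pd.Series(wine.target,
                   name = "Class")               # type of wine for each row
df.iloc[:, 0:3].head(6)                          # first 6 rows of the first 3 columns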

Wine DataFrame

Our dataset has 178 rows and 13 columns. Above, we can see the first 6 rows of the first 3 columns, which we selected using the .iloc[] indexer and the head() method.
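
As a quick sanity check on the figures above, the following sketch prints the shape of the DataFrame and the number of wines per class. It rebuilds the data so that it runs on its own; the variable name class_counts is only illustrative.

```python
import pandas as pd
from sklearn.datasets import load_wine

wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)
target = pd.Series(wine.target, name="Class")

print(df.shape)                                     # (rows, columns)

# Count how many wines belong to each class label
class_counts = target.value_counts().sort_index()
print(class_counts)
```

If the classes turn out to be unevenly sized, keep that in mind when reading the colored scatterplots later on.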

Now, we can work on the PCA.


Scale the Data and Perform the PCA

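Before performing the PCA, our data must be standardized so that each variable has a mean of 0 and a standard deviation of 1. For this, we will create a StandardScaler() object and then fit it to our DataFrame and transform the data in one step using the fit_transform() method.

scaler = StandardScaler()
wine_scaled = scaler.fit_transform(df)  # fit the scaler, then standardize each column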
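
To confirm that the scaling worked, we can check the column means and standard deviations of the scaled array. This is a minimal sketch, assuming the scaler is fitted and applied in one step with fit_transform(); the names col_means and col_stds are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

wine = load_wine()
df = pd.DataFrame(wine.data, columns=wine.feature_names)

wine_scaled = StandardScaler().fit_transform(df)

col_means = wine_scaled.mean(axis=0)
col_stds = wine_scaled.std(axis=0)   # NumPy's default ddof=0 matches StandardScaler

print(np.allclose(col_means, 0))     # every column is centered at 0
print(np.allclose(col_stds, 1))      # every column has unit standard deviation
```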

Next, we define the number of components we want to keep and perform the PCA on the scaled data using the fit_transform() method. The resulting component scores are stored in a new DataFrame.

pca = PCA(n_components=2)
PC = pca.fit_transform(wine_scaled)     # component scores for each wine
pca_wine = pd.DataFrame(data = PC,
               columns = ['PC1', 'PC2'])
pca_wine.head(6)

Wine PCA

Above, we can see the first 6 rows of our PCA DataFrame. Let’s see what the autoplot looks like!
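
Before plotting, it can help to know how much of the total variance the two components capture. The sketch below reads this from the explained_variance_ratio_ attribute of the fitted PCA object; it repeats the scaling and PCA steps so that it runs on its own, and the name explained is illustrative.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

wine = load_wine()
wine_scaled = StandardScaler().fit_transform(wine.data)

pca = PCA(n_components=2)
PC = pca.fit_transform(wine_scaled)

# Share of the total variance captured by each component
explained = pca.explained_variance_ratio_
print(explained.round(3))
print(round(float(explained.sum()), 3))   # share captured by PC1 and PC2 together
```

The closer the combined share is to 1, the more faithfully a two-dimensional plot represents the original 13 variables.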


Example 1: Basic Autoplot of PCA

Now, we will see how to draw an autoplot by plotting one principal component on each axis using the scatter() function.

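fig, ax = plt.subplots(figsize=(14, 9))
ax.scatter(pca_wine['PC1'], pca_wine['PC2'],
           c = target, cmap = 'cool', s = 60)  # color by class; s sets the point size (illustrative value)
ax.set_xlabel('PC1', fontsize = 20)
ax.set_ylabel('PC2', fontsize = 20)
ax.set_title('Figure 1', fontsize = 20)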

Autoplot of PCA in Python

Note that we have colored our data by its target using the c= argument and used the “cool” colormap, although there are plenty of colormaps you can choose to color your plot. Also, the s= argument helped us to change the size of the points in our plot.
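
To illustrate how the cmap= and s= arguments change the plot, here is a small sketch that wraps the scatterplot in a function. Note that plot_pca_scatter() and its parameters are hypothetical names introduced for this example; they are not part of matplotlib or scikit-learn.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display; drop this line for interactive use
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def plot_pca_scatter(scores, labels, cmap="cool", point_size=50):
    """Hypothetical helper: scatterplot of PC1 vs. PC2 colored by class label."""
    fig, ax = plt.subplots(figsize=(14, 9))
    points = ax.scatter(scores[:, 0], scores[:, 1],
                        c=labels, cmap=cmap, s=point_size)
    ax.set_xlabel("PC1", fontsize=20)
    ax.set_ylabel("PC2", fontsize=20)
    return fig, ax, points

wine = load_wine()
scores = PCA(n_components=2).fit_transform(
    StandardScaler().fit_transform(wine.data))

# Same plot with a different colormap and larger points
fig, ax, points = plot_pca_scatter(scores, wine.target,
                                   cmap="viridis", point_size=120)
```

Any colormap name known to matplotlib can be passed as cmap, so it is easy to try several and keep the one that separates the classes most clearly.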

You can also add the feature vectors to this plot. Take a look at the following example.


Example 2: Autoplot of PCA as Biplot

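To turn the autoplot into a biplot, we draw the loadings of each feature as an arrow starting at the origin. Because the loadings are much smaller than the raw component scores, we first rescale the scores by their range so that points and arrows fit on the same axes. Take a look at the following code.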

xs = PC[:,0]                              # scores on PC1
ys = PC[:,1]                              # scores on PC2
scalex = 1.0/(xs.max() - xs.min())        # rescale the scores to a range of 1
scaley = 1.0/(ys.max() - ys.min())
fig, ax = plt.subplots(figsize=(14, 9))
for i, feature in enumerate(wine.feature_names):
    # one arrow per feature, from the origin to its loadings on PC1 and PC2
    ax.arrow(0, 0, pca.components_[0, i], 
             pca.components_[1, i], 
             head_width = 0.03,           # arrowhead size (illustrative values)
             head_length = 0.03)
    ax.text(pca.components_[0, i] * 1.15, 
            pca.components_[1, i] * 1.15, 
            feature, fontsize = 18)
scatter = ax.scatter(xs * scalex,ys * scaley, 
                     c = target,          # color the points by wine class
                     cmap = 'cool')
ax.set_xlabel('PC1', fontsize=20)
ax.set_ylabel('PC2', fontsize=20)
ax.set_title('Figure 2', fontsize=20)
legend1 = ax.legend(*scatter.legend_elements(),
                    loc="lower left", title="Wine Target")

Biplot-Autoplot of PCA in Python

As shown in the previous Python output, we have represented the variables as vectors by using a for loop and the arrow() and text() functions. Moreover, it’s possible to modify the size and length of the arrowheads using the head_width= and head_length= arguments. Finally, we’ve also added a legend via the matplotlib legend() function. If you want to see other examples of biplots of a PCA in Python, you can check our tutorial: Draw Biplot of PCA in Python.
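
The arrows are drawn from the rows of pca.components_, so it can be useful to view the same numbers as a table. The following sketch, which repeats the scaling and PCA steps so that it runs on its own, stores the loadings in a DataFrame with one row per feature; the names loadings, norms, and top_pc1 are illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

wine = load_wine()
wine_scaled = StandardScaler().fit_transform(wine.data)
pca = PCA(n_components=2).fit(wine_scaled)

# Rows of components_ are the components; transpose to get one row per feature
loadings = pd.DataFrame(pca.components_.T,
                        index=wine.feature_names,
                        columns=["PC1", "PC2"])
print(loadings.round(2))

# Each component is a unit-length vector, so its squared loadings sum to 1
norms = (loadings ** 2).sum(axis=0)
print(norms.round(3))

# Features with the largest absolute loading on PC1
top_pc1 = loadings["PC1"].abs().sort_values(ascending=False).head(3)
print(top_pc1)
```

Features with long arrows along an axis in the biplot are the ones with large absolute loadings in the corresponding column of this table.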


Video, Further Resources & Summary

Do you need more explanations on how to create an autoplot of PCA in Python? Then you should have a look at the following YouTube video of the Statistics Globe YouTube channel.


The YouTube video will be added soon.


If you want to learn more, you could take a look at some other tutorials available on Statistics Globe:

In this post, you had the opportunity to learn how to create an autoplot of a PCA in Python. In case you have further questions, you may leave a comment.


Paula Villasante Soriano Statistician & R Programmer

This page was created in collaboration with Paula Villasante Soriano. Please have a look at Paula’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.

