# Draw Autoplot of PCA in Python (Example)

On this page you’ll learn how to create an autoplot of a Principal Component Analysis (PCA) in the Python programming language.

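The table of contents is structured as shown below:

- Example Data and Add-On Libraries
- Scale the Data and Perform the PCA
- Create the Autoplot of the PCA
- Video, Further Resources & Summary

Let’s get started!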

## Example Data and Add-On Libraries

To draw an autoplot of a PCA in Python, we need a few libraries for data handling, scaling, model fitting, and visualization. Please load them before we start:

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from sklearn import decomposition
from sklearn.decomposition import PCA
from sklearn.datasets import load_wine
```

Now, it’s time to prepare our data. For this tutorial, we will use the Wine data set that ships with the scikit-learn library. To import it, we will use the `load_wine()` function from `sklearn.datasets`:

```python
wine = load_wine()
```

After this, we will convert it into a pandas DataFrame, so we can run our PCA. We will also define our targets, which correspond to the class in our data:

```python
DF_data = pd.DataFrame(wine.data, columns=wine.feature_names)
targets = pd.Series(wine.target, name="Class")
DF_data.iloc[:, 0:3].head(6)
```

Our data set has 178 rows and 13 columns. The final statement in the code above displays the first 6 rows of the first 3 columns, using the `.iloc[]` indexer and the `.head()` method.
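As a quick sanity check of the dimensions mentioned above, the following sketch rebuilds the DataFrame and inspects its shape, the number of samples per class, and the count of missing values. The names `class_counts` and `n_missing` are introduced here for illustration only and are not part of the original tutorial.

```python
import pandas as pd
from sklearn.datasets import load_wine

# Rebuild the objects from the tutorial so this block runs on its own
wine = load_wine()
DF_data = pd.DataFrame(wine.data, columns=wine.feature_names)
targets = pd.Series(wine.target, name="Class")

# Dimensions of the feature table as (rows, columns)
print(DF_data.shape)

# Number of samples per class label (0, 1, 2)
class_counts = targets.value_counts().sort_index()
print(class_counts)

# Missing values would need to be handled before running a PCA
n_missing = int(DF_data.isna().sum().sum())
print(n_missing)
```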

Now, we can work on the PCA.

## Scale the Data and Perform the PCA

Before performing the PCA, our data should be standardized, because PCA is sensitive to the scale of the variables. For this, we will apply the `fit_transform()` method of a `StandardScaler()` object to our data and store the result in a new DataFrame:

```python
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
                           index=DF_data.index,
                           columns=DF_data.columns)
```
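To confirm that the scaling step behaved as expected, one option is to check that every column now has a mean close to 0 and a standard deviation close to 1. This is only a minimal sketch; `col_means` and `col_stds` are illustrative names. Note that `StandardScaler` divides by the population standard deviation, so the check uses `ddof=0`.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

# Rebuild the standardized DataFrame from the tutorial
wine = load_wine()
DF_data = pd.DataFrame(wine.data, columns=wine.feature_names)
DF_standard = pd.DataFrame(StandardScaler().fit_transform(DF_data),
                           index=DF_data.index,
                           columns=DF_data.columns)

# Column means should be numerically zero after standardization
col_means = DF_standard.mean()

# StandardScaler uses the population standard deviation (ddof=0),
# whereas pandas defaults to the sample version (ddof=1)
col_stds = DF_standard.std(ddof=0)

print(col_means.abs().max())
print(col_stds.min(), col_stds.max())
```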

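Now, we can define the number of components. Here, `m` is the total number of features (13), which we use to fit the full PCA, and `K` is the number of components we want to keep for the plot. In our case, two components: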

```python
m = DF_standard.shape[1]
K = 2
```

Then we use them to perform the PCA on our DataFrame with the `fit_transform()` method, which fits the model and projects the data in one step. Afterwards, we keep only the first `K` columns:

```python
Mod_PCA = decomposition.PCA(n_components=m)
wine_PCA = pd.DataFrame(Mod_PCA.fit_transform(DF_standard),
                        columns=["PC%d" % k for k in range(1, m + 1)]).iloc[:, :K]
wine_PCA.head(6)
```

The final statement in the code above displays the first 6 rows of our PCA DataFrame. Now, let’s see what it looks like in an autoplot.
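Before plotting, it can help to check how much of the total variance the first two components retain, since a 2D plot only shows that share. The sketch below reads the `explained_variance_ratio_` attribute of the fitted scikit-learn model. It also compares the tutorial's approach (fit all components, keep the first `K`) with fitting only `K` components directly. The variable names `ratio`, `scores_full`, `scores_k`, and `same` are assumptions made for this example, and the comparison uses absolute values because the sign of a principal component is arbitrary.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Standardize the Wine features, as in the tutorial
X = StandardScaler().fit_transform(load_wine().data)

# Fit all components and inspect the share of variance each one explains
pca_full = PCA(n_components=X.shape[1]).fit(X)
ratio = pca_full.explained_variance_ratio_
print(ratio[:2])        # shares for PC1 and PC2
print(ratio[:2].sum())  # combined share shown in a 2D plot

# Fitting only K components should give the same scores, up to sign
K = 2
scores_full = pca_full.transform(X)[:, :K]
scores_k = PCA(n_components=K).fit_transform(X)
same = np.allclose(np.abs(scores_full), np.abs(scores_k))
print(same)
```

Fitting all components first is convenient when you also want the full variance profile; if only the plot is needed, fitting `K` components directly is a simpler alternative.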

## Create the Autoplot of the PCA

Earlier, we defined our targets. To plot each class in a different color, we can create a color list that maps every class label to a color:

```python
color_list = [{0: "b", 1: "purple", 2: "violet"}[x] for x in targets]
```

Now, we can see what our autoplot looks like, plotting one principal component on each axis:

```python
fig, ax = plt.subplots()
ax.scatter(x=wine_PCA["PC1"], y=wine_PCA["PC2"], color=color_list)
ax.set_xlabel("Principal Component 1", fontsize=15)
ax.set_ylabel("Principal Component 2", fontsize=15)
ax.set_title("PCA Plot", fontsize=16)
plt.show()  # display the figure when running as a script
```

And that’s how we get an autoplot of our PCA using Python.
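As an optional variation, the plot can carry a legend so that readers can tell which color belongs to which class. The sketch below is one possible way to do this, not the tutorial's original code: it draws one `scatter()` call per class with a `label` and then calls `ax.legend()`. The `colors` dictionary reuses the colors from above, and the class names come from `wine.target_names`.

```python
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Rebuild the two-component scores so this block runs on its own
wine = load_wine()
scores = PCA(n_components=2).fit_transform(
    StandardScaler().fit_transform(wine.data))
wine_PCA = pd.DataFrame(scores, columns=["PC1", "PC2"])
targets = pd.Series(wine.target, name="Class")

# Same class-to-color mapping as in the tutorial
colors = {0: "b", 1: "purple", 2: "violet"}

fig, ax = plt.subplots()
for label, color in colors.items():
    mask = targets == label  # rows belonging to the current class
    ax.scatter(wine_PCA.loc[mask, "PC1"], wine_PCA.loc[mask, "PC2"],
               color=color, label=wine.target_names[label])

ax.set_xlabel("Principal Component 1", fontsize=15)
ax.set_ylabel("Principal Component 2", fontsize=15)
ax.set_title("PCA Plot", fontsize=16)
ax.legend(title="Class")
```

Call `plt.show()` afterwards to display the figure, or `fig.savefig()` to write it to a file.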

## Video, Further Resources & Summary

Do you need more explanations on how to create an autoplot of PCA in Python? Then you should have a look at the following YouTube video of the Statistics Globe YouTube channel.

*The YouTube video will be added soon.*

If you want to learn more, you could take a look at some other tutorials available on Statistics Globe:

- Draw 3D Plot of PCA in Python
- Append Values to pandas DataFrame in Python
- Change datetime Format in pandas DataFrame in Python
- Create New pandas DataFrame from Existing Data in Python

In this post you had the opportunity to learn how to **make an autoplot in Python**. In case you have further questions, you may leave a comment.

This page was created in collaboration with Paula Villasante Soriano. Please have a look at Paula’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.
