In this tutorial you’ll learn how to create a 3D Principal Component Analysis (PCA) plot in Python.

## Step 1: Add-On Libraries and Data Sample

First, we need to import the libraries we will use to perform and plot our PCA. They cover data handling, numerical computation, model fitting and visualization of our PCA plot in 3D:

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_breast_cancer
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA

    plt.style.use('ggplot')

To create this PCA plot we will use the breast cancer data set that ships with the scikit-learn library. First, we call the load_breast_cancer() function from scikit-learn to load our data set, and then we convert it into a pandas DataFrame:

    b_cancer = load_breast_cancer()
    df = pd.DataFrame(data=b_cancer.data, columns=b_cancer.feature_names)
    df.iloc[:, 0:6].head(6)

Our data set has 569 rows and 30 columns. The code above displays the first 6 rows of the first 6 columns, selected with the .iloc[] indexer and the head() method.

Now let’s see what the PCA looks like.
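As a quick sanity check, the dimensions stated above can be confirmed directly from the DataFrame. The following is a minimal sketch; the variable names n_rows, n_cols and n_missing are introduced here only for illustration:

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# Rebuild the DataFrame exactly as in the step above
b_cancer = load_breast_cancer()
df = pd.DataFrame(data=b_cancer.data, columns=b_cancer.feature_names)

# df.shape is a (rows, columns) tuple
n_rows, n_cols = df.shape
print(n_rows, n_cols)

# PCA cannot handle missing values, so it is worth counting them first
n_missing = int(df.isna().sum().sum())
print(n_missing)
```

If the number of missing values were not zero, the data would need to be cleaned or imputed before moving on to the PCA.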

## Step 2: Standardize the Data and Perform the PCA

Before performing the PCA, we need to standardize our data. We fit a StandardScaler() object to the data and then use it to transform the data:

    scaler = StandardScaler()
    scaler.fit(df)
    Bcancer_scaled = scaler.transform(df)
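To see what the standardization does, we can check that every scaled column has a mean of roughly 0 and a standard deviation of roughly 1. This is a small sketch that uses fit_transform(), which combines the fit and transform calls in one step; the names col_means and col_stds are only illustrative:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

X = load_breast_cancer().data

# fit_transform() is shorthand for calling fit() and then transform()
X_scaled = StandardScaler().fit_transform(X)

# Column-wise mean and (population) standard deviation after scaling
col_means = X_scaled.mean(axis=0)
col_stds = X_scaled.std(axis=0)

print(np.allclose(col_means, 0.0, atol=1e-6))
print(np.allclose(col_stds, 1.0, atol=1e-6))
```

Standardizing matters for PCA because the components are driven by variance: without it, features measured on large scales would dominate the result.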

Now that we have scaled our data, we can perform the PCA on it, using 3 components:

    pca = PCA(n_components=3)
    pca.fit(Bcancer_scaled)
    pca_bcancer = pca.transform(Bcancer_scaled)
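Before plotting, it can be useful to check how much of the total variance the three components capture. The sketch below reads the explained_variance_ratio_ attribute of the fitted PCA object; the variables ratios and cumulative are names chosen for this example:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Same preprocessing as above: standardize, then fit a 3-component PCA
X_scaled = StandardScaler().fit_transform(load_breast_cancer().data)
pca = PCA(n_components=3).fit(X_scaled)

# Share of the total variance explained by each component
ratios = pca.explained_variance_ratio_
cumulative = np.cumsum(ratios)

for i, (r, c) in enumerate(zip(ratios, cumulative), start=1):
    print(f"PC{i}: {r:.3f} (cumulative {c:.3f})")
```

The last cumulative value tells us how much of the information in the 30 original columns is retained in the 3D plot.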

## Step 3: Create the 3D Plot of the PCA

To plot our PCA in 3D, we first have to define some of its attributes. We start with the three axes of our 3D PCA plot:

    Xax = pca_bcancer[:,0]
    Yax = pca_bcancer[:,1]
    Zax = pca_bcancer[:,2]

Each axis will contain one of the components. We will also define the labels, which refer to the diagnosis, and their colors. The diagnosis is stored in the `.target` attribute of our data. We assign it to *y* and use it to color the points in our 3D plot:

    cdict = {0:'m',1:'c'}
    label = {0:'Malignant',1:'Benign'}
    y = b_cancer.target
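The mapping of 0 to malignant and 1 to benign can be verified against the target_names attribute of the loaded data set, and we can count how many samples fall into each class. A short sketch, where counts is an illustrative name:

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

b_cancer = load_breast_cancer()
y = b_cancer.target

# target_names[i] is the class name that belongs to target value i
print(list(b_cancer.target_names))

# np.bincount counts how often each integer label occurs
counts = {str(name): int(n) for name, n in zip(b_cancer.target_names, np.bincount(y))}
print(counts)
```

This confirms that the keys of the cdict and label dictionaries line up with the encoding used in the data set.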

Now we can finally create our PCA plot in 3D. We will use a for loop to plot the points of each label in their own color. To plot our data in 3 dimensions, we pass projection='3d' to the fig.add_subplot() function:

    fig = plt.figure(figsize=(14,9))
    ax = fig.add_subplot(111, projection='3d')

    for l in np.unique(y):
        ix=np.where(y==l)
        ax.scatter(Xax[ix], Yax[ix], Zax[ix], c=cdict[l], s=60, label=label[l])

    ax.set_xlabel("PC1", fontsize=12)
    ax.set_ylabel("PC2", fontsize=12)
    ax.set_zlabel("PC3", fontsize=12)

    ax.view_init(30, 125)
    ax.legend()
    plt.title("3D PCA plot")
    plt.show()

As a result, we get our PCA plot in 3D.
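If you run the code in a script or on a machine without a display, plt.show() may not open a window. In that case, one option is to write the figure to an image file instead. The following self-contained sketch assumes the non-interactive "Agg" backend and a hypothetical file name, pca_3d.png, in the system's temporary folder; adjust the path as needed:

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, set before importing pyplot to be safe
import matplotlib.pyplot as plt
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Same pipeline as in the tutorial: standardize, then project onto 3 components
data = load_breast_cancer()
scores = PCA(n_components=3).fit_transform(StandardScaler().fit_transform(data.data))
y = data.target

fig = plt.figure(figsize=(14, 9))
ax = fig.add_subplot(111, projection="3d")

# One scatter call per class so that each class gets its own color and legend entry
for value, color, name in [(0, "m", "Malignant"), (1, "c", "Benign")]:
    mask = y == value
    ax.scatter(scores[mask, 0], scores[mask, 1], scores[mask, 2],
               c=color, s=60, label=name)

ax.set_xlabel("PC1", fontsize=12)
ax.set_ylabel("PC2", fontsize=12)
ax.set_zlabel("PC3", fontsize=12)
ax.view_init(30, 125)
ax.legend()
ax.set_title("3D PCA plot")

# Hypothetical output location, chosen only for this example
out_path = os.path.join(tempfile.gettempdir(), "pca_3d.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```

Changing the two arguments of ax.view_init() (elevation and azimuth, in degrees) before saving is a simple way to export the plot from different viewing angles.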

## Video, Further Resources & Summary

Do you need more explanations on how to plot a PCA in 3D? Then you should have a look at the accompanying video on the Statistics Globe YouTube channel.

In this post we explained how to **make a PCA plot in 3 dimensions in Python**. If you have any questions, please leave a comment below.

This page was created in collaboration with Paula Villasante Soriano. Please have a look at Paula’s author page to get more information about her academic background and the other articles she has written for Statistics Globe.

