Draw Point Cloud of PCA in Python (2 Examples)
In this tutorial, you’ll learn how to draw a point cloud based on a Principal Component Analysis (PCA) in the Python programming language.
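We will show the following content:
- Data Sample & Add-On Libraries
- Standardize the Data & Perform the PCA
- Example 1: Visualize Point Cloud for PCA in 2D
- Example 2: Visualize Point Cloud for PCA in 3D
- Video, Further Resources & Summary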
Let’s go straight to the code.
Data Sample & Add-On Libraries
The first step is to load the libraries we need. If the PyVista library is not installed yet, you can install it with the following command:
pip install pyvista
Then, we can import the libraries:
import pyvista as pv
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pyvista import examples
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
Now, we will create our point cloud data set. For this, we will use a function that returns a point cloud array:
def generate_cloud(subset=0.02):
    # Download the example lidar data set provided by PyVista
    dataset = examples.download_lidar()
    # Draw random point indices; randint excludes `high`, so n_points covers every index
    ids = np.random.randint(low=0, high=dataset.n_points, size=int(dataset.n_points * subset))
    return dataset.points[ids]

point_cloud = generate_cloud()
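Note that np.random.randint() draws with replacement, so the same point can be selected more than once, and the subset changes on every run. As a minimal sketch, assuming you would like a reproducible subset without duplicate points, the sampling step could look like this. The helper name sample_points, its seed argument, and the synthetic array standing in for the lidar points are assumptions made for illustration; they are not part of the code above.

```python
import numpy as np

def sample_points(points, subset=0.02, seed=0):
    """Return a random subset of rows from an (n, 3) array, without duplicates."""
    rng = np.random.default_rng(seed)        # seeded generator for reproducibility
    n_keep = int(points.shape[0] * subset)   # number of points to keep
    ids = rng.choice(points.shape[0], size=n_keep, replace=False)
    return points[ids]

# Synthetic stand-in for dataset.points (hypothetical data, not the lidar file)
fake_points = np.random.default_rng(1).normal(size=(10_000, 3))
sample = sample_points(fake_points)
print(sample.shape)
```

With the same seed, the function returns the same subset on every call, which makes the resulting plots easier to compare between runs.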
Once we have our data ready, we can perform our PCA.
Standardize the Data & Perform the PCA
Before performing the PCA, we will standardize the data by using the StandardScaler() class:
scaler = StandardScaler()
scaler.fit(point_cloud)
Pcloud_scaled = scaler.transform(point_cloud)
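To make the standardization step more concrete: StandardScaler() subtracts the mean of each column and divides by the column's standard deviation (the population version, i.e. ddof=0). The following NumPy-only sketch reproduces that computation on a small synthetic array. The array and all variable names are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for point_cloud: three columns on very different scales
points = rng.normal(loc=[100.0, -50.0, 3.0], scale=[20.0, 5.0, 0.5], size=(1000, 3))

col_mean = points.mean(axis=0)
col_std = points.std(axis=0)   # ddof=0, the same convention StandardScaler uses
points_scaled = (points - col_mean) / col_std

# After scaling, each column has a mean of about 0 and a standard deviation of about 1
print(points_scaled.mean(axis=0))
print(points_scaled.std(axis=0))
```

Without this step, the column with the largest numeric range would dominate the principal components.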
Now, we can perform the PCA. Since each point has three coordinates, we will keep three components here. For other data sets, you may need some more information, such as the variance explained by each component, in order to choose the optimal number of components for your PCA.
pca = PCA(n_components=3)
PC = pca.fit_transform(Pcloud_scaled)
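One common piece of information for choosing the number of components is the share of the total variance that each component explains. A fitted scikit-learn PCA object exposes this as the explained_variance_ratio_ attribute. The sketch below computes the same quantity with NumPy alone, from the eigenvalues of the covariance matrix, so that the calculation is visible. The synthetic data and the variable names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: the second column is strongly correlated with the first
x = rng.normal(size=1000)
data = np.column_stack([x, 0.9 * x + 0.1 * rng.normal(size=1000), rng.normal(size=1000)])
data = (data - data.mean(axis=0)) / data.std(axis=0)   # standardize the columns

cov = np.cov(data, rowvar=False)               # 3 x 3 covariance matrix
eigenvalues = np.linalg.eigvalsh(cov)[::-1]    # eigvalsh sorts ascending, so reverse
explained_ratio = eigenvalues / eigenvalues.sum()
cumulative = np.cumsum(explained_ratio)

print(explained_ratio)   # share of variance per component, largest first
print(cumulative)        # running total, ending at 1
```

If the first two components already account for most of the variance, a 2D plot of PC1 against PC2 loses little information compared to the 3D plot.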
Then, we will transform the output of the PCA to a DataFrame with three columns (one for each principal component) by using the pandas DataFrame() function. We can see what the first rows of our PCA look like:
pca_cloud = pd.DataFrame(data=PC, columns=['PC1', 'PC2', 'PC3'])
pca_cloud.head(6)
Example 1: Visualize Point Cloud for PCA in 2D
Now, we can plot the point cloud of the PCA in 2D using the seaborn package:
sns.set()
# sns.scatterplot() returns a matplotlib Axes object
ax = sns.scatterplot(
    x='PC1',
    y='PC2',
    s=0.5,
    data=pca_cloud
)
plt.title('2D PCA Point Cloud')
plt.show()
Example 2: Visualize Point Cloud for PCA in 3D
We can also plot the point cloud in 3D. To achieve this, we will first define the axes in our plot, which are the three principal components that we previously defined:
Xax = PC[:, 0]
Yax = PC[:, 1]
Zax = PC[:, 2]
Now, we can plot the point cloud of the PCA by defining the figure and the axes in our 3D scatterplot:
fig = plt.figure(figsize=(14, 9))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(Xax, Yax, Zax, s=0.05, c="magenta")
ax.set_xlabel("PC1", fontsize=14)
ax.set_ylabel("PC2", fontsize=14)
ax.set_zlabel("PC3", fontsize=14)
ax.view_init(30, 125)
plt.title("3D PCA plot")
plt.show()
And that’s how we draw the point cloud of our PCA in 2D and 3D in Python.
Video, Further Resources & Summary
Do you need more explanations on how to draw your principal component analysis’ point cloud in Python? Then you should have a look at the following YouTube video of the Statistics Globe YouTube channel.
The YouTube video will be added soon.
You can also have a look at the other tutorials available on Statistics Globe:
- Draw 3D Plot of PCA in Python
- Can PCA be Used for Categorical Variables?
- Draw Autoplot of PCA in Python
- Imputation Methods (Top 5 Popularity Ranking)
- Learn Python
This post has shown how to draw a point cloud of a PCA in the Python language. In case you have further questions, don’t hesitate to leave a comment below.
This page was created in collaboration with Paula Villasante Soriano. Please have a look at Paula’s author page to get further information about her academic background and the other articles she has written for Statistics Globe.