What are Eigenvalues & Eigenvectors in PCA? (Example)

 

In this tutorial, you’ll learn about two fundamental concepts in Principal Component Analysis (PCA): eigenvalues and eigenvectors.

The tutorial consists of the following:

Eigenvalues & Eigenvectors in PCA
Calculation of Eigenvectors & Eigenvalues in PCA
Calculation Example
Video, Further Resources & Summary

If you want to know more about these contents, keep reading!

 

Eigenvalues & Eigenvectors in PCA

Principal Component Analysis (PCA) is a statistical method used for reducing the dimensionality of large datasets. PCA finds the principal components, or the directions of maximum variance in the data, using the concepts of eigenvectors and eigenvalues.

Eigenvectors are the vectors indicating the direction of the axes along which the data varies the most. Each eigenvector has a corresponding eigenvalue, quantifying the amount of variance captured along its direction.

PCA involves selecting the eigenvectors with the largest eigenvalues. By projecting the original data onto these selected eigenvectors, PCA transforms the dataset into a new, lower-dimensional coordinate system while retaining as much of the data variance as possible. We call these new axes, determined by the eigenvectors, the principal components.
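To see these steps in action, below is a minimal sketch of PCA via eigendecomposition in base R. The data matrix X and the number of retained components k are just assumptions for illustration; the cov() and eigen() functions do the actual work.

# Minimal PCA sketch via eigendecomposition (base R)
set.seed(1)
X <- matrix(rnorm(100 * 3), ncol = 3)                 # simulated data: 100 rows, 3 variables
X_centered <- scale(X, center = TRUE, scale = FALSE)  # center each variable
S <- cov(X_centered)                                  # covariance matrix
ed <- eigen(S)                                        # eigenvalues & eigenvectors
k <- 2                                                # number of components to keep
scores <- X_centered %*% ed$vectors[, 1:k]            # project data onto top eigenvectors
head(scores)                                          # principal component scores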

For further details about PCA, I suggest you visit the tutorial What is Principal Component Analysis (PCA)?. Also, you can watch our YouTube tutorial shared at the bottom of this page.

 

Calculation of Eigenvectors & Eigenvalues in PCA

In PCA, the eigenvectors and eigenvalues are calculated from the covariance matrix (the source of information about data variation) using the method called eigendecomposition. This method decomposes a square matrix into eigenvectors and eigenvalues.

For a given square matrix \(A\), eigendecomposition finds eigenvectors and scalar eigenvalues that satisfy the equation:

\(\large A \cdot v = \lambda \cdot v\)


where \(v\) is an eigenvector and \(\lambda\) is its corresponding eigenvalue. To find the eigenvalues \(\lambda\) and eigenvectors \(v\), we can move all terms to the left-hand side and factor out the eigenvector \(v\), which gives:

\(\large (A - \lambda I) \cdot v = 0\)


where \(I\) is the identity matrix of the same size as \(A\). Since we are interested in non-trivial solutions (\(v \neq 0\)), the matrix \(A - \lambda I\) must be singular, so the eigenvalues \(\lambda\) are found by setting its determinant to zero:

\(\large \det(A - \lambda I) = 0\)


Once we have the eigenvalues \(\lambda\), we can find the eigenvectors \(v\) by substituting each eigenvalue back into the equation:

\(\large (A - \lambda I) \cdot v = 0\)


and solving for \(v\).
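As a quick numerical check, here is a small sketch in base R that verifies the defining equation \(A \cdot v = \lambda \cdot v\). The \(2 \times 2\) matrix is just an assumption for illustration.

A <- matrix(c(2, 1,
              1, 2), nrow = 2, byrow = TRUE)  # a simple symmetric matrix
ed <- eigen(A)                                # eigendecomposition
lambda1 <- ed$values[1]                       # largest eigenvalue (3 for this matrix)
v1 <- ed$vectors[, 1]                         # corresponding eigenvector
A %*% v1                                      # left-hand side: A * v
lambda1 * v1                                  # right-hand side: lambda * v (same vector)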

 

Calculation Example

Let’s illustrate this calculation with an example! Imagine that we have a covariance matrix as given below.

\[
\large
\begin{bmatrix}
4.0 & 2.0 & 0.6 \\
2.0 & 3.0 & 0.9 \\
0.6 & 0.9 & 2.5
\end{bmatrix}
\]

Using the eigendecomposition method, we obtain the eigenvalues \(\large 5.9 \), \(\large 2.3 \), and \(\large 1.3 \) (rounded to one decimal place).

Once we plug the eigenvalues into the equation \((A - \lambda I) \cdot v = 0\) one by one, we obtain the corresponding unit-length eigenvectors below.

\[
\large
\begin{bmatrix}
0.74 \\ 0.61 \\0.29
\end{bmatrix}
\]
\[
\large
\begin{bmatrix}
-0.44 \\ 0.10 \\ 0.89
\end{bmatrix}
\]
\[
\large
\begin{bmatrix}
0.51 \\ -0.79 \\ 0.34
\end{bmatrix}
\]
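You can reproduce this example with base R's eigen() function. Note that the signs of the eigenvectors may be flipped relative to the vectors above, and the decimals may differ slightly due to rounding.

S <- matrix(c(4.0, 2.0, 0.6,
              2.0, 3.0, 0.9,
              0.6, 0.9, 2.5), nrow = 3, byrow = TRUE)
ed <- eigen(S)
round(ed$values, 1)    # 5.9 2.3 1.3
round(ed$vectors, 2)   # columns hold the eigenvectors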

These results indicate that the first principal component, along the first eigenvector, explains \(5.9 / (5.9 + 2.3 + 1.3) \approx 62\% \) of the total variance, whereas the second principal component, along the second eigenvector, explains \(2.3 / (5.9 + 2.3 + 1.3) \approx 24\% \) of the total variance. The same calculation yields \(14\%\) for the third principal component. See my What is Explained Variance in PCA? tutorial for a better understanding of the concept.
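The same proportions can be computed in one line in R, using the eigenvalues from above:

eigenvalues <- c(5.9, 2.3, 1.3)
round(eigenvalues / sum(eigenvalues), 2)   # 0.62 0.24 0.14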

The calculations show that the first two principal components together explain about \(86\%\) of the total variance, hence they can be retained for the analysis. To learn more about deciding the optimal number of components, visit Choose Optimal Number of Components for PCA.

 

Video, Further Resources & Summary

Would you like to know more about eigenvalues and eigenvectors in PCA? Then I recommend taking a look at the following video tutorial on my YouTube channel. I explain the eigendecomposition technique in more detail in the video.

 

 

Furthermore, you might have a look at the other articles on this website.

 

This tutorial has shown the use of eigenvalues and eigenvectors in PCA. Let me know in the comments below, in case you have additional questions and/or comments.

 

Cansu Kebabci R Programmer & Data Scientist

This page was created in collaboration with Cansu Kebabci. Have a look at Cansu’s author page to get more information about her professional background, a list of all her tutorials, as well as an overview of her other tasks on Statistics Globe.

 
