
## PCA (Principal Component Analysis) in Python - ML From Scratch 11

In this Machine Learning from Scratch tutorial, we implement PCA (Principal Component Analysis) using only built-in Python modules and NumPy. We also cover the concept and the math behind this popular ML algorithm.

All algorithms from this course can be found on GitHub together with example tests.

## Implementation
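
As a rough summary of the math the implementation relies on, the steps can be written as follows. The notation ($x_i$, $\mu$, $C$, $v_j$, $\lambda_j$, $W$, $z_i$) is an assumption chosen here for illustration and does not come from the original code; $n$ is the number of samples, $d$ the number of features, and $k$ corresponds to `n_components`.

$$
\begin{aligned}
\mu &= \frac{1}{n}\sum_{i=1}^{n} x_i
&& \text{(feature-wise mean)} \\
C &= \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^{\top}
&& \text{(covariance matrix, as computed by \texttt{np.cov})} \\
C\,v_j &= \lambda_j v_j, \quad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_d
&& \text{(eigenvectors sorted by eigenvalue)} \\
z_i &= W^{\top}(x_i - \mu), \quad W = [\,v_1 \;\cdots\; v_k\,]
&& \text{(projection onto the first } k \text{ components)}
\end{aligned}
$$

Each eigenvalue $\lambda_j$ equals the variance of the data along its eigenvector $v_j$, which is why keeping the eigenvectors with the largest eigenvalues preserves as much variance as possible.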

```python
import numpy as np


class PCA:

    def __init__(self, n_components):
        self.n_components = n_components
        self.components = None
        self.mean = None

    def fit(self, X):
        # Mean centering
        self.mean = np.mean(X, axis=0)
        X = X - self.mean
        # covariance, function needs samples as columns
        cov = np.cov(X.T)
        # eigenvalues, eigenvectors
        eigenvalues, eigenvectors = np.linalg.eig(cov)
        # -> eigenvector v = [:, i] column vector, transpose for easier calculations
        eigenvectors = eigenvectors.T
        # sort eigenvectors by descending eigenvalue
        idxs = np.argsort(eigenvalues)[::-1]
        eigenvalues = eigenvalues[idxs]
        eigenvectors = eigenvectors[idxs]
        # store first n eigenvectors
        self.components = eigenvectors[0:self.n_components]

    def transform(self, X):
        # project data
        X = X - self.mean
        return np.dot(X, self.components.T)
```
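
To see the class in action, here is a minimal usage sketch. The synthetic dataset, the random seed, and the explained-variance calculation are assumptions added for illustration and are not part of the original tutorial. The class is repeated in condensed form only so that the snippet runs on its own.

```python
import numpy as np


class PCA:
    """Condensed copy of the class above so this snippet is self-contained."""

    def __init__(self, n_components):
        self.n_components = n_components
        self.components = None
        self.mean = None

    def fit(self, X):
        self.mean = np.mean(X, axis=0)
        X = X - self.mean
        cov = np.cov(X.T)
        eigenvalues, eigenvectors = np.linalg.eig(cov)
        idxs = np.argsort(eigenvalues)[::-1]
        self.components = eigenvectors.T[idxs][: self.n_components]

    def transform(self, X):
        return np.dot(X - self.mean, self.components.T)


# Synthetic data (assumed for this example): 200 samples, 3 features,
# where the third feature is almost a linear combination of the first two.
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
third = base @ np.array([0.5, -1.5]) + 0.01 * rng.normal(size=200)
X = np.column_stack([base, third])

# Reduce from 3 features to 2 principal components
pca = PCA(n_components=2)
pca.fit(X)
X_projected = pca.transform(X)

print("Shape of X:", X.shape)
print("Shape of transformed X:", X_projected.shape)

# Share of the total variance captured by each kept component
projected_var = np.var(X_projected, axis=0, ddof=1)
total_var = np.var(X, axis=0, ddof=1).sum()
print("Explained variance ratio:", projected_var / total_var)
```

Because the third feature is nearly redundant here, two components should capture almost all of the variance. On real data, the explained variance ratio is a common guide for choosing `n_components`.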
