AdaBoost in Python - ML From Scratch 13
In this Machine Learning from Scratch tutorial, we are going to implement the AdaBoost algorithm using only built-in Python modules and numpy. AdaBoost is an ensemble technique that builds a strong classifier as a weighted combination of several weak classifiers. We will first learn about the concept and the math behind this popular ML algorithm, and then jump to the code.
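As a compact reference, the update rules that the implementation in this tutorial follows can be written as below. This is a sketch of the standard discrete AdaBoost formulation with labels $y_i \in \{-1, +1\}$; the symbols $N$ (number of samples), $T$ (number of weak classifiers), $h_t$ (the $t$-th weak classifier), $\varepsilon_t$ and $\alpha_t$ are notation chosen here for illustration, not taken from the original text.

```latex
\begin{align}
  w_i &= \frac{1}{N}
      && \text{(initial sample weights)} \\
  \varepsilon_t &= \sum_{i \,:\, h_t(x_i) \neq y_i} w_i
      && \text{(weighted error of weak classifier } h_t\text{)} \\
  \alpha_t &= \frac{1}{2} \ln\!\left(\frac{1 - \varepsilon_t}{\varepsilon_t}\right)
      && \text{(weight of } h_t \text{ in the final vote)} \\
  w_i &\leftarrow \frac{w_i \, e^{-\alpha_t \, y_i \, h_t(x_i)}}{\sum_{j=1}^{N} w_j \, e^{-\alpha_t \, y_j \, h_t(x_j)}}
      && \text{(re-weight and normalize)} \\
  H(x) &= \operatorname{sign}\!\left(\sum_{t=1}^{T} \alpha_t \, h_t(x)\right)
      && \text{(final prediction)}
\end{align}
```

Misclassified samples have $y_i h_t(x_i) = -1$, so their weights grow by a factor $e^{\alpha_t}$, while correctly classified samples shrink by $e^{-\alpha_t}$. The next weak classifier therefore focuses on the samples the previous one got wrong.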
All algorithms from this course can be found on GitHub together with example tests.
import numpy as np


# Decision stump used as weak classifier
class DecisionStump:
    def __init__(self):
        self.polarity = 1
        self.feature_idx = None
        self.threshold = None
        self.alpha = None

    def predict(self, X):
        n_samples = X.shape[0]
        X_column = X[:, self.feature_idx]
        predictions = np.ones(n_samples)
        if self.polarity == 1:
            predictions[X_column < self.threshold] = -1
        else:
            # >= makes polarity -1 the exact complement of polarity 1,
            # consistent with the "error = 1 - error" flip in Adaboost.fit
            predictions[X_column >= self.threshold] = -1
        return predictions


class Adaboost:
    def __init__(self, n_clf=5):
        self.n_clf = n_clf

    def fit(self, X, y):
        # y is expected to contain the labels -1 and 1
        n_samples, n_features = X.shape

        # Initialize weights to 1/N
        w = np.full(n_samples, (1 / n_samples))

        self.clfs = []

        # Iterate through classifiers
        for _ in range(self.n_clf):
            clf = DecisionStump()
            min_error = float('inf')

            # Greedy search to find best threshold and feature
            for feature_i in range(n_features):
                X_column = X[:, feature_i]
                thresholds = np.unique(X_column)

                for threshold in thresholds:
                    # Predict with polarity 1
                    p = 1
                    predictions = np.ones(n_samples)
                    predictions[X_column < threshold] = -1

                    # Error = sum of weights of misclassified samples
                    misclassified = w[y != predictions]
                    error = np.sum(misclassified)

                    # A stump that is wrong more than half the time
                    # becomes useful once its decision is flipped
                    if error > 0.5:
                        error = 1 - error
                        p = -1

                    # Store the best configuration
                    if error < min_error:
                        clf.polarity = p
                        clf.threshold = threshold
                        clf.feature_idx = feature_i
                        min_error = error

            # Calculate alpha (EPS avoids division by zero and log(0))
            EPS = 1e-10
            clf.alpha = 0.5 * np.log((1.0 - min_error + EPS) / (min_error + EPS))

            # Calculate predictions and update weights
            predictions = clf.predict(X)
            w *= np.exp(-clf.alpha * y * predictions)
            # Normalize to one
            w /= np.sum(w)

            # Save classifier
            self.clfs.append(clf)

    def predict(self, X):
        # Weighted vote of all stumps, then take the sign
        clf_preds = [clf.alpha * clf.predict(X) for clf in self.clfs]
        y_pred = np.sum(clf_preds, axis=0)
        y_pred = np.sign(y_pred)
        return y_pred
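To make the weight update concrete, here is a small worked example of a single boosting round on a hypothetical one-feature toy dataset. The data values and the hand-picked threshold of 3.0 are assumptions made for illustration; the example does not run the greedy search and only mirrors the error, alpha, and re-weighting steps from the code above.

```python
import numpy as np

# Hypothetical toy data: one feature column, labels in {-1, +1}
X_column = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([-1, -1, 1, -1, 1])

n_samples = len(y)
w = np.full(n_samples, 1 / n_samples)  # uniform initial weights, 0.2 each

# Hand-picked stump with polarity 1: predict -1 where x < 3, else +1
threshold = 3.0
predictions = np.ones(n_samples)
predictions[X_column < threshold] = -1

# Weighted error: only the sample at x = 4 is misclassified
error = np.sum(w[y != predictions])

# Stump weight; EPS guards against division by zero for a perfect stump
EPS = 1e-10
alpha = 0.5 * np.log((1.0 - error + EPS) / (error + EPS))

# Increase the weight of the misclassified sample, shrink the others,
# then normalize so the weights sum to one again
w_new = w * np.exp(-alpha * y * predictions)
w_new /= np.sum(w_new)

print("error:", error)
print("alpha:", alpha)
print("new weights:", w_new)
```

With an error of 0.2, alpha comes out at about ln(2) ≈ 0.693, and after normalization the single misclassified sample carries half of the total weight (0.5), while each of the four correctly classified samples carries 0.125.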