👉 List of all notes for this book. IMPORTANT UPDATE November 18, 2024: I've stopped taking detailed notes from the book and now only highlight and annotate directly in the PDF files/book. With so many books to read, I don't have time to type everything. In the future, if I make notes while reading a book, they'll contain only the most notable points (for me).

<aside> 📔 Jupyter notebook for this chapter: on GitHub, on Colab, on Kaggle.

</aside>

MNIST

The first image in the dataset looks like a 5, and the label confirms it:

y[0] = 5
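
A minimal sketch of how the data gets loaded (assuming scikit-learn’s fetch_openml, as in the notebook linked above; X_train, y_train, and some_digit are reused below):

from sklearn.datasets import fetch_openml

# Download MNIST from OpenML; as_frame=False returns NumPy arrays instead of a DataFrame
mnist = fetch_openml('mnist_784', as_frame=False)
X, y = mnist.data, mnist.target  # 70,000 images of 28×28 = 784 pixels; labels are strings

some_digit = X[0]  # the first image (the 5 above), reused for predictions later

# MNIST is already shuffled and split: first 60,000 images for training, last 10,000 for testing
X_train, X_test, y_train, y_test = X[:60000], X[60000:], y[:60000], y[60000:]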

Training a Binary Classifier

Let’s simplify the problem: “detect only the number 5” ← a binary classifier (2 classes: 5 or non-5).

A good place to start is a stochastic gradient descent (SGD, or stochastic GD) classifier ← SGDClassifier ← deals with training instances independently, one at a time ← handles large datasets efficiently, well suited for online learning.

from sklearn.linear_model import SGDClassifier

y_train_5 = (y_train == '5')  # target vector: True for all 5s, False for every other digit

sgd_clf = SGDClassifier(random_state=42)
sgd_clf.fit(X_train, y_train_5)

sgd_clf.predict([some_digit])  # returns True → the model guesses this image is a 5

Performance Measures

Evaluating a classifier is often significantly trickier than evaluating a regressor!

Measuring Accuracy Using Cross-Validation

Use cross_val_score() ← k-fold cross-validation (here cv=3, i.e., 3 folds).

from sklearn.model_selection import cross_val_score
cross_val_score(sgd_clf, X_train, y_train_5, cv=3, scoring="accuracy")

Wow, ~95% accuracy with SGD, but is it actually good? → Let’s try DummyClassifier ← it classifies every single image as the most frequent class (non-5); then use cross_val_score → 90% accuracy! Why? Because only about 10% of the images are 5s ← if you always guess that an image is not a 5, you’re right about 90% of the time!
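
A sketch of that baseline check, assuming the same X_train and y_train_5 as above (DummyClassifier’s default strategy predicts the most frequent class):

from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

dummy_clf = DummyClassifier()  # default strategy: always predict the most frequent class (non-5)
dummy_clf.fit(X_train, y_train_5)
cross_val_score(dummy_clf, X_train, y_train_5, cv=3, scoring="accuracy")  # ≈ 0.90 on every fold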