The Hundred-Page Machine Learning Book

A concise, practical introduction to core machine learning concepts for engineers and analysts

Master the essential theory and algorithms behind machine learning in the time it takes to finish a single weekend project.

Andriy Burkov

The Hundred-Page Machine Learning Book strips machine learning down to its foundations without sacrificing precision. In roughly 160 pages, Andriy Burkov covers supervised and unsupervised learning, neural networks, model evaluation, and the math you actually need — nothing more, nothing less. Whether you are moving into ML from a neighboring discipline or refreshing your fundamentals before a deep specialization, this is the reference that earns its place on a working practitioner's desk.


About this book

Most machine learning books make one of two mistakes: they bury you in theory you will never use, or they skip the math entirely and leave you unable to reason about why a model fails. This book does neither.

Andriy Burkov distills the subject to its irreducible core. Each page carries weight. The notation is consistent, the explanations are precise, and the selection of topics reflects what a practitioner genuinely encounters, not a textbook committee's idea of completeness. You get supervised learning, unsupervised learning, model evaluation, regularization, ensemble methods, neural networks, and the probability and linear algebra that tie it all together.

The book is structured so that a reader with a basic quantitative background can move linearly from start to finish in a sitting or two, then return to individual sections as a reference. It does not assume you already know machine learning, but it also does not waste your time pretending the math is optional. Equations appear when they clarify; prose appears when it explains.

What makes this book unusual is what it leaves out. There is no filler. There are no lengthy code listings that age poorly. There is no chapter that exists to pad a page count. What remains is a tight, reliable map of the discipline — one that tells you what the territory looks like before you decide which corner to explore in depth.

Supervised learning algorithms including linear and logistic regression, SVM, decision trees, and k-NN
Unsupervised techniques: clustering, dimensionality reduction, and anomaly detection
Neural network fundamentals and how deep architectures differ from shallow ones
Bias-variance tradeoff, regularization, and how to diagnose a struggling model
Practical model selection, cross-validation, and evaluation metrics that match real objectives

Published in January 2019 and widely cited in university syllabi and practitioner reading lists, this book has become a standard orientation text precisely because it respects both your intelligence and your time. Read it once to build a mental model of the field. Read it again when you need to remind yourself why something works.

🎯 What you'll learn

Distinguish supervised, unsupervised, and semi-supervised learning and choose the right framing for a given problem
Explain and apply core algorithms — linear regression, SVM, decision trees, k-means, PCA — from first principles
Interpret the bias-variance tradeoff and apply regularization to prevent overfitting in practice
Evaluate models using metrics that align with actual business or research objectives, not just default accuracy scores
Understand how neural networks learn and where deep learning fits relative to classical methods
Read and reason about ML research papers without getting lost in unfamiliar notation
Build a reliable mental map of the entire ML discipline to guide further self-study or specialization

👤 Who is this book for?

Software engineers transitioning into machine learning who need a fast, rigorous orientation to the field
Data analysts who already work with data but want to understand the algorithms their tools are running under the hood
Graduate students starting an ML specialization who want a compact reference before tackling longer textbooks
Experienced practitioners in adjacent fields — statistics, operations research, signal processing — mapping their existing knowledge onto modern ML terminology
Hiring managers and technical leads who want to speak credibly about ML without committing to a 600-page tome

01

Introduction and Notation

Establishes the mathematical notation and vocabulary used throughout the book, so every subsequent chapter builds on a consistent foundation. You will review vectors, matrices, probability basics, and the core learning problem setup.

02

Supervised Learning Fundamentals

Defines supervised learning formally and introduces the concepts of features, labels, training sets, and loss functions. You will see how a learning algorithm turns labeled data into a predictive model.

03

Core Classification and Regression Algorithms

Walks through linear regression, logistic regression, support vector machines, k-nearest neighbors, and decision trees. For each algorithm you will learn how it works, when to use it, and where it typically breaks down.

04

Anatomy of a Learning Algorithm

Explains what all supervised learning algorithms share under the hood: an objective function, an optimization procedure, and a regularization strategy. You will learn to read any new algorithm through this common lens.

05

Basic Practice

Covers the practical workflow of a machine learning project: data splitting, cross-validation, hyperparameter tuning, and the pipeline from raw data to a deployed model. You will learn how decisions made early in a project constrain your options later.

06

Neural Networks and Deep Learning

Introduces feedforward networks, backpropagation, and activation functions, then explains how deep architectures extend these ideas. You will understand why depth matters and what problems it solves that shallow models cannot.

07

Problems and Solutions in Learning

Examines the bias-variance tradeoff, overfitting, underfitting, and the techniques — regularization, dropout, early stopping — used to address each. You will learn to diagnose a struggling model and choose the right remedy.

08

Unsupervised and Other Learning Paradigms

Covers clustering with k-means, dimensionality reduction with PCA, and introduces semi-supervised and self-supervised learning. You will see how to extract structure from data when labels are absent or scarce.

09

Advanced Topics Overview

Surveys ensemble methods including random forests and gradient boosting, and introduces generative models and transfer learning. You will build a map of where these techniques fit and when they outperform simpler approaches.

10

Conclusion and Further Reading

Synthesizes the material into a coherent picture of the ML landscape and points to authoritative sources for each major area. You will finish with a clear sense of what to study next based on your own goals.
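
To make the "common lens" from chapter 04 concrete, here is a minimal sketch of how an objective function, an optimization procedure, and a regularization strategy fit together. It is not taken from the book, which deliberately avoids code listings. The function name fit_ridge_gd, the toy data, and the learning rate, step count, and penalty values are all assumptions chosen for illustration.

```python
def fit_ridge_gd(xs, ys, l2=0.0, lr=0.05, steps=2000):
    """Fit y = w*x + b by gradient descent on mean squared error + l2 * w**2.

    Hypothetical helper for illustration only; lr and steps are arbitrary
    values that happen to converge on the toy data below.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Objective: mean squared error plus an L2 penalty on the slope w.
        # Gradients of that objective with respect to w and b:
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_w += 2 * l2 * w  # regularization strategy: penalize large w
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Optimization procedure: one plain gradient descent step.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (an assumption for the demo).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w_plain, b_plain = fit_ridge_gd(xs, ys, l2=0.0)  # no regularization
w_reg, b_reg = fit_ridge_gd(xs, ys, l2=1.0)      # with an L2 penalty
print(w_plain, b_plain)
print(w_reg, b_reg)
```

On this toy data the unregularized fit recovers the slope and intercept the data was generated from, while the L2 penalty pulls the slope toward zero. Swapping the loss, the update rule, or the penalty changes the algorithm without changing this three-part structure.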

Frequently asked questions

What mathematical background do I need to get value from this book?

Comfort with basic algebra and some exposure to probability and statistics are enough to follow the core explanations. Linear algebra concepts like vectors and matrices are introduced with the notation the book uses, so you do not need a formal course first.

Does the book include code or programming exercises?

The book focuses on concepts, theory, and algorithms rather than code listings. It is language-agnostic by design, making it a strong complement to hands-on courses or coding-focused resources rather than a replacement for them.

Is this book still relevant given how fast the ML field moves?

The fundamentals covered — supervised learning, neural network basics, model evaluation, regularization — are stable concepts that underpin nearly every modern ML system. The book's focus on theory rather than specific frameworks means it ages well.

Who is this book not for?

If you are already an experienced ML practitioner looking for advanced coverage of a specific area like reinforcement learning, Bayesian methods, or large language models, this book will feel too introductory. It is an orientation text, not a specialization manual.

Is this the complete book or an excerpt?

This is the full book, not an abridged edition of a longer work. The concise length of roughly 160 pages is intentional: every topic included was chosen to earn its place.

How does this compare to longer ML textbooks like Bishop or Murphy?

Those books are reference volumes intended to be consulted rather than read linearly. This book is meant to be read cover to cover first, giving you the conceptual scaffolding to then get more out of a deeper reference when you need it.