Pages: 545

Published: 2013

AI Learning ✨ New

The Elements of Statistical Learning

Data Mining, Inference, and Prediction for Practitioners and Researchers

Build a rigorous foundation in statistical learning — from linear models to neural networks — and understand why each method works, not just how to call it.

Jerome Friedman, Robert Tibshirani, Trevor Hastie

The Elements of Statistical Learning by Hastie, Tibshirani, and Friedman is the standard reference for anyone who wants to understand modern machine learning at a mathematical level. Covering supervised and unsupervised methods, model selection, regularization, ensemble methods, and more, it gives practitioners and researchers the conceptual tools to evaluate, adapt, and apply statistical learning methods with confidence across real problems.

About this book

Most machine learning tutorials teach you which function to call. This book teaches you what the function is doing and when it will fail. That distinction separates practitioners who apply methods from those who understand them — and it is the difference this book is designed to make.

Written by three of the most cited researchers in statistics and machine learning, The Elements of Statistical Learning has been the go-to reference for graduate students, data scientists, and applied researchers for over two decades. It covers the full breadth of statistical learning: from linear and logistic regression through support vector machines, random forests, gradient boosting, and neural networks, always grounding each technique in the statistical theory that explains its behavior.

The book does not assume you will blindly trust a library default. It shows you how the bias-variance tradeoff shapes every modeling decision, how regularization controls model complexity, and how cross-validation and information criteria let you make honest comparisons between competing approaches. You will come away knowing not just what to run, but how to reason about the results.

Supervised learning: linear methods, classification, kernel smoothing, additive models
Model selection and assessment: AIC, BIC, cross-validation, bootstrap
Regularization: ridge regression, the lasso, and elastic net
Tree-based methods: CART, random forests, gradient boosted trees
Support vector machines and kernel methods
Unsupervised learning: clustering, principal components, independent component analysis
Neural networks and deep architecture foundations

At 545 pages, the book is dense by design. Each chapter builds on the last, and the mathematical notation is precise. Readers who engage with it seriously — working through the derivations, not just the prose — consistently describe it as the book that finally made machine learning legible to them at a fundamental level.

Springer publishes this edition, and the authors have made a PDF freely available through Stanford. The physical volume is worth owning for sustained study: the layout is clean, the index is thorough, and having it on your desk signals the kind of seriousness the subject deserves.

🎯 What you'll learn

Derive and interpret linear and logistic regression from first principles, not just from the scikit-learn documentation
Apply the bias-variance decomposition to diagnose overfitting and underfitting in your own models
Select and tune regularization methods — ridge, lasso, elastic net — based on the structure of your data
Understand how random forests and gradient boosting reduce error through ensemble strategies
Evaluate competing models honestly using cross-validation, AIC, BIC, and bootstrap estimates
Recognize the conditions under which support vector machines and kernel methods outperform simpler alternatives
Interpret unsupervised methods including PCA and clustering as formal optimization problems with defined assumptions

👤 Who is this book for?

Data scientists who apply machine learning daily and want to understand the theory behind the tools they use
Graduate students in statistics, computer science, or related fields looking for a rigorous core text
Software engineers transitioning into machine learning who are comfortable with linear algebra and calculus
Applied researchers who need to evaluate, adapt, or extend statistical methods for domain-specific problems
Practitioners preparing for technical interviews or ML research roles where theoretical depth is tested

Table of contents

01

Introduction

Sets the scope and vocabulary of statistical learning, distinguishing prediction from inference and supervised from unsupervised problems. Establishes the notation and framing used throughout the book.
02

Overview of Supervised Learning

Introduces the core ideas of input-output modeling, least squares, and nearest-neighbor methods. Develops the statistical decision theory that underlies all supervised approaches covered later.
03

Linear Methods for Regression

Covers ordinary least squares, subset selection, shrinkage methods including ridge and lasso, and derived input directions. You will understand why regularization works geometrically and statistically.
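
As a rough illustration of the shrinkage idea described for this chapter (not code from the book, which presents its material analytically), the sketch below computes a ridge estimate for the simplest possible case: one predictor and no intercept. The function name `ridge_slope`, the penalty parameter `lam`, and the toy data are all assumptions made for this example.

```python
def ridge_slope(x, y, lam):
    """Ridge estimate for a single predictor with no intercept.

    Minimizes sum((y_i - b * x_i)^2) + lam * b^2, which has the
    closed form b = sum(x_i * y_i) / (sum(x_i^2) + lam).
    """
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)


# Hypothetical toy data, roughly y = 2x with a little noise.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.2, 2.1, 3.9]

# lam = 0 gives the ordinary least squares slope; larger penalties
# pull the estimate toward zero.
for lam in (0.0, 1.0, 10.0, 100.0):
    print(lam, round(ridge_slope(x, y, lam), 4))
```

With `lam = 0` the formula reduces to ordinary least squares, and the slope shrinks steadily toward zero as the penalty grows. That is the one-dimensional version of the complexity control the chapter covers.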
04

Linear Methods for Classification

Examines linear discriminant analysis, logistic regression, and separating hyperplanes. Contrasts the assumptions each method makes and the conditions under which each performs best.
05

Basis Expansions and Regularization

Extends linear models using splines, wavelets, and reproducing kernel Hilbert spaces. Shows how smoothness constraints translate into regularization penalties.
06

Kernel Smoothing Methods

Covers local regression, kernel density estimation, and local likelihood. Develops the idea of locally adaptive fitting as an alternative to global parametric models.
07

Model Assessment and Selection

Formalizes the concepts of generalization error, cross-validation, bootstrap, and information criteria. Gives you a principled framework for comparing models without overfitting the comparison itself.
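
To give a feel for the cross-validation idea mentioned here, the following minimal sketch compares two candidate models by K-fold cross-validation. It is an illustration under assumptions, not the book's own procedure: the helper names (`fit_mean`, `fit_line`, `kfold_mse`), the interleaved fold assignment, and the toy data are all hypothetical.

```python
def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Least squares line with intercept, fitted in closed form."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def kfold_mse(fit, xs, ys, k):
    """Average squared prediction error over k held-out folds."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point, offset by fold
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)  # the fit never sees held-out points
        total += sum((ys[i] - model(xs[i])) ** 2 for i in held_out)
    return total / n


# Hypothetical data: a linear trend plus small fixed perturbations.
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.3, -0.3, 0.1, 0.2, -0.2]
xs = [float(i) for i in range(12)]
ys = [2.0 * x + 1.0 + e for x, e in zip(xs, noise)]

print("mean-only CV MSE:", kfold_mse(fit_mean, xs, ys, 4))
print("line CV MSE:     ", kfold_mse(fit_line, xs, ys, 4))
```

Each model is scored only on points it was not fitted to, so the comparison does not reward a model for memorizing its own training data.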
08

Model Inference and Averaging

Introduces the bootstrap as an inference tool, Bayesian approaches to modeling, and model averaging including bagging. Connects frequentist and Bayesian perspectives on uncertainty.
09

Additive Models, Trees, and Related Methods

Develops generalized additive models, CART decision trees, and PRIM. Shows how these methods balance interpretability and flexibility in practice.
10

Boosting and Additive Trees

Presents AdaBoost, gradient boosting machines, and stochastic gradient boosting as forward stagewise additive modeling. Explains why boosting is one of the most effective off-the-shelf prediction methods available.
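
The forward stagewise view described for this chapter can be sketched in a few lines. The example below is a simplified illustration under assumptions: squared-error loss, one-split regression stumps as the base learner, and a fixed shrinkage rate. The names `fit_stump`, `boost`, and `mse` and the toy data are hypothetical and are not taken from the book.

```python
def fit_stump(xs, residuals):
    """Best single-split regression stump under squared error."""
    best = None
    points = sorted(set(xs))
    for left, right in zip(points, points[1:]):
        threshold = (left + right) / 2
        lo = [r for x, r in zip(xs, residuals) if x <= threshold]
        hi = [r for x, r in zip(xs, residuals) if x > threshold]
        lo_mean = sum(lo) / len(lo)
        hi_mean = sum(hi) / len(hi)
        sse = (sum((r - lo_mean) ** 2 for r in lo)
               + sum((r - hi_mean) ** 2 for r in hi))
        if best is None or sse < best[0]:
            best = (sse, threshold, lo_mean, hi_mean)
    _, threshold, lo_mean, hi_mean = best
    return lambda x: lo_mean if x <= threshold else hi_mean


def boost(xs, ys, rounds, rate):
    """Forward stagewise fit: each stump targets the current residuals."""
    base = sum(ys) / len(ys)  # start from the constant model
    stumps = []
    preds = [base] * len(xs)
    for _ in range(rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        # Add a shrunken copy of the new stump; earlier stumps stay fixed.
        preds = [p + rate * stump(x) for p, x in zip(preds, xs)]
    return lambda x: base + rate * sum(s(x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - model(x)) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical nonlinear target that a single stump cannot fit well.
xs = [float(i) for i in range(10)]
ys = [x * x for x in xs]

for rounds in (0, 5, 50):
    print(rounds, round(mse(boost(xs, ys, rounds, 0.1), xs, ys), 3))
```

Under squared-error loss the residuals are the negative gradient of the loss, which is why fitting each new stump to them counts as a gradient boosting step in this simplified setting. Training error falls as rounds are added; held-out error would need a separate check.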

Frequently asked questions

What mathematical background do I need to get the most out of this book?

You should be comfortable with linear algebra, multivariate calculus, and basic probability and statistics. Readers without this background will find the notation difficult to follow, and a stats or calculus refresher is recommended before starting.

Is this book suitable for practitioners, or is it primarily for researchers?

Both groups use it, but the emphasis is on understanding methods rather than applying them through code. Practitioners who want to move beyond black-box usage will find it invaluable; those looking for implementation tutorials should pair it with a more applied text.

Does the book include code examples or software exercises?

The book focuses on mathematical exposition rather than code. It is not a programming manual, and examples are presented analytically. For code-based exploration of the same methods, you may want to supplement with R or Python resources.

How does this relate to 'An Introduction to Statistical Learning'?

An Introduction to Statistical Learning (ISL), written by Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani, is a gentler, more applied treatment aimed at readers without a heavy math background. The Elements of Statistical Learning (ESL) is the deeper, more rigorous text that ISL draws on. Many readers start with ISL and graduate to ESL.

Is the 2013 edition still current and relevant?

Yes. The core statistical learning methods covered — regularization, tree methods, SVMs, ensemble approaches — remain foundational and in wide use. The field has moved toward deep learning since publication, but this book's content is not outdated for the methods it covers.

Specs

Publisher: Springer Science & Business Media
Published: Nov 2013
Pages: 545
Language: English

About the authors

Jerome Friedman

Robert Tibshirani

Trevor Hastie

You might also like

Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow

A practical, project-driven introduction to machine learning and deep learning with Python

by Aurélien Géron

2022

Designing Machine Learning Systems

An Iterative Process for Production-Ready Machine Learning Applications

by Chip Huyen

2022

Probabilistic Machine Learning

A rigorous foundation in Bayesian reasoning, probabilistic models, and modern machine learning methods

by Kevin P. Murphy

2022

Artificial Intelligence: A Modern Approach, Global Edition

The definitive textbook on intelligent systems, from foundational search and logic to modern machine learning and probabilistic reasoning

by Peter Norvig, Stuart Russell

2021
