Probabilistic Machine Learning

A rigorous foundation in Bayesian reasoning, probabilistic models, and modern machine learning methods

Build the mathematical intuition and practical toolkit to understand and apply probabilistic machine learning from first principles to modern deep models.

Kevin P. Murphy

Probabilistic Machine Learning by Kevin P. Murphy gives you a unified, mathematically grounded treatment of the core ideas behind modern ML systems. Starting from probability theory and Bayesian inference, the book builds through linear models, deep neural networks, latent variable models, and Monte Carlo methods. At 858 pages, it is a serious reference and a teachable text, covering both the theory that explains why methods work and the practical details that matter when you apply them.

About this book

Most machine learning books teach you what to run. This one teaches you why it works. Kevin P. Murphy's Probabilistic Machine Learning builds a coherent framework rooted in probability theory and Bayesian reasoning, then uses that framework to derive, explain, and connect virtually every major class of ML algorithm.

The book opens with the mathematical foundations you actually need: probability, statistics, decision theory, and information theory. From there it moves through linear and logistic regression, feedforward and convolutional networks, recurrent models, attention and Transformers, and then into territory that most introductory texts skip: latent variable models, variational inference, normalizing flows, diffusion models, and Markov chain Monte Carlo.

What distinguishes this treatment is coherence. Each new model is introduced within the same probabilistic language, so you are always building on what came before rather than picking up disconnected techniques. A Gaussian mixture model, a VAE, and a Bayesian neural network all speak the same underlying dialect. That coherence pays off when you encounter a real problem and need to choose or design the right model rather than copy-paste a script.

Murphy writes for readers who are willing to engage with the mathematics. Derivations are shown, not hidden. But the exposition is careful and the notation consistent, so the density is earned rather than gratuitous. Worked examples and figures anchor the abstractions throughout.

Covers both classical probabilistic models and modern deep learning approaches within a single framework
Explains variational inference, EM, and MCMC in enough detail to actually implement them
Includes coverage of normalizing flows, energy-based models, and diffusion models
Freely available online in draft form; the print and digital editions are the polished, final version

Whether you are a graduate student building your theoretical foundation, a researcher who wants to stop treating black-box models as magic, or a senior practitioner who needs a reliable reference, this is the book you reach for when you want the real explanation.

🎯 What you'll learn

Apply Bayesian reasoning to model uncertainty in predictions and parameter estimates
Derive linear and logistic regression from first principles using the probabilistic framework
Understand how deep neural networks, CNNs, RNNs, and Transformers fit into a unified probabilistic picture
Implement latent variable models including mixture models and variational autoencoders
Use variational inference and expectation-maximization to fit models where exact inference is intractable
Sample from complex distributions using Markov chain Monte Carlo methods
Reason about normalizing flows, energy-based models, and diffusion-based generative models
Select and design models for real problems by understanding the assumptions embedded in each approach

👤 Who is this book for?

Graduate students in machine learning, statistics, or a related field who need a rigorous primary text or supplement to a course
ML engineers and researchers who want to move beyond heuristic intuition and understand the probabilistic reasoning behind the models they use
Data scientists with solid Python skills who are ready to engage seriously with the underlying mathematics
Practitioners implementing Bayesian models, variational methods, or generative models who need a trustworthy reference
Self-taught ML practitioners who have worked through an introductory course and want a deeper, more principled treatment

01

Probability: Univariate Models

Establishes the probabilistic language used throughout the book, covering random variables, common distributions, and the rules of probability that underpin every subsequent model.

02

Probability: Multivariate Models

Extends probability theory to joint, conditional, and marginal distributions, and introduces Gaussian distributions and their properties in detail.

03

Bayesian Statistics

Develops Bayesian inference from Bayes' rule through prior and posterior distributions, conjugate models, and the key ideas of credible intervals and model comparison.

04

Linear and Logistic Regression

Derives the two workhorses of supervised learning from the probabilistic framework, showing how maximum likelihood and MAP estimation connect to familiar loss functions.

05

Deep Neural Networks

Covers feedforward networks, backpropagation, regularization, and the practical training decisions that determine whether a network learns, all grounded in the probabilistic view.

06

Beyond Standard Feedforward Networks

Surveys convolutional networks, recurrent networks, attention mechanisms, and Transformers, explaining the architectural choices in terms of the structural assumptions they encode.

07

Latent Variable Models

Introduces mixture models, PCA, and variational autoencoders, showing how latent variables extend the expressiveness of the model class and how the EM algorithm is used to fit such models.

08

Approximate Inference

Covers variational inference and the evidence lower bound in enough mathematical detail to implement them, and shows how mean-field VI scales to large models.

09

Sampling and Monte Carlo Methods

Explains rejection sampling, importance sampling, and Markov chain Monte Carlo including Metropolis-Hastings and Hamiltonian Monte Carlo, with guidance on diagnosing convergence.

10

Generative Deep Learning

Surveys normalizing flows, energy-based models, score matching, and diffusion models, connecting each to the probabilistic framework developed earlier in the book.
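
To make the chapter 04 summary concrete, here is a minimal sketch of what "maximum likelihood and MAP estimation connect to familiar loss functions" can mean for linear regression. It is an illustration written for this page, not code from the book. It assumes Gaussian noise with a known standard deviation `sigma` and an independent zero-mean Gaussian prior with standard deviation `tau` on each weight; the synthetic data and all variable names are hypothetical. Under those assumptions, maximizing the likelihood is the same as minimizing squared error, and the MAP estimate is the ridge regression solution with penalty `sigma**2 / tau**2`.

```python
import numpy as np

# Synthetic data (illustrative only): y = X @ w_true + Gaussian noise.
rng = np.random.default_rng(0)
n, d = 200, 3
X = rng.normal(size=(n, d))
w_true = np.array([1.5, -2.0, 0.5])
sigma = 0.3  # noise standard deviation, assumed known
tau = 1.0    # prior standard deviation on each weight, assumed
y = X @ w_true + sigma * rng.normal(size=n)


def neg_log_likelihood_grad(w):
    # Gradient of (1 / (2 * sigma^2)) * ||y - X w||^2, which is the
    # negative Gaussian log-likelihood up to an additive constant.
    return -(X.T @ (y - X @ w)) / sigma**2


def neg_log_posterior_grad(w):
    # The Gaussian prior adds (1 / (2 * tau^2)) * ||w||^2 to the objective,
    # so its gradient adds w / tau^2.
    return neg_log_likelihood_grad(w) + w / tau**2


# Maximum likelihood: the ordinary least-squares solution.
w_mle, *_ = np.linalg.lstsq(X, y, rcond=None)

# MAP: ridge regression with penalty lam = sigma^2 / tau^2.
lam = sigma**2 / tau**2
w_map = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

print("MLE weights:", w_mle)
print("MAP weights:", w_map)
```

Both gradients vanish at their respective solutions, which is one way to check that the least-squares and ridge estimates are stationary points of the negative log-likelihood and negative log-posterior. The prior pulls the MAP weights toward zero, so their norm is never larger than that of the MLE weights.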

Frequently asked questions

What mathematical background do I need before reading this book?

You should be comfortable with multivariate calculus, linear algebra, and basic probability at the undergraduate level. Murphy includes a concise review of key concepts, but the book is not a first introduction to mathematics.

Is this the same as Murphy's earlier 'Machine Learning: A Probabilistic Perspective' (2012)?

No. This is a substantially new book written for the 2020s, covering deep learning, Transformers, variational autoencoders, diffusion models, and other topics absent from the 2012 text. Treat it as a successor, not an update.

Is the book available for free online?

Murphy has made draft PDFs freely available on his website. The published MIT Press edition is the final, polished version with corrected notation and updated figures.

Does the book include code or programming exercises?

The main text is mathematically focused rather than code-first. Supplementary notebooks and code examples are available through the author's public GitHub repository linked from his website.

Is this book suitable for someone without a graduate-level background?

A strong undergraduate with solid mathematics and some ML exposure can work through it, but it is designed for graduate-level readers. If you are new to ML entirely, an introductory course first will make the material much more tractable.

How does this book relate to Murphy's 'Probabilistic Machine Learning: Advanced Topics'?

This volume covers foundations. The companion 'Advanced Topics' volume extends the treatment into research-level material, including causal inference and reinforcement learning. Reading this book first is the intended path.