Designing Machine Learning Systems
Pages: 388
Published: 2022
An Iterative Process for Production-Ready Machine Learning Applications
Learn to design, build, and maintain ML systems that actually work in production — from data pipelines to model monitoring.
Most ML courses stop at model accuracy. This book starts where they end. Chip Huyen walks you through every layer of a production ML system (data engineering, feature stores, training pipelines, deployment strategies, and monitoring) and gives you the mental models and practical tools to build systems that hold up under real-world conditions. At 388 pages, it covers the full lifecycle without padding, making it a clear, compact practitioner's guide to ML system design.
Training a model is the easy part. Keeping it accurate, reliable, and cost-effective in production is where most ML projects fail. Chip Huyen wrote this book because the gap between a Jupyter notebook and a live system serving millions of requests is enormous, and almost no resource addressed it head-on.
This book gives you a complete mental model of what a production ML system actually looks like. You will learn how data flows through an organization, how features are computed and stored at scale, how training pipelines are structured to support fast iteration, and how deployment decisions affect latency, cost, and reliability. Each chapter builds on the last, so by the end you have a coherent picture of the entire lifecycle rather than a collection of isolated techniques.
Huyen is direct about tradeoffs. Batch inference versus online inference. Feature stores versus on-the-fly computation. Shadow deployment versus canary releases. You will understand not just how to implement each approach but when to choose it and what you are giving up. That kind of reasoning is what separates engineers who ship ML systems from engineers who demo them.
The book also confronts the operational reality that most practitioners face: data drift, model decay, feedback loops, and the organizational friction of keeping a system accurate over time. A chapter dedicated to monitoring and observability shows you what to measure, what to alert on, and how to diagnose degradation before users notice it.
Whether you are the first ML engineer at a startup or moving from research into a platform role at a larger company, this book gives you the vocabulary, the frameworks, and the practical judgment to design systems that survive contact with production.
Establishes what production ML systems are and why they differ fundamentally from one-off model training. You will learn how the components fit together and what makes ML systems uniquely difficult to build and maintain.
Frames the design process around business objectives, requirements, and constraints. You will practice translating a vague product goal into concrete system requirements before writing a single line of code.
Covers the data layer: sources, formats, storage engines, and data flow patterns. You will understand how data moves through an organization and where the common failure points are.
Addresses labeling, sampling strategies, class imbalance, and data augmentation. You will learn how the quality and composition of training data shapes every downstream modeling decision.
Explains how to create, transform, and store features for both batch and real-time use. You will evaluate when a feature store is worth the investment and how to avoid common feature leakage mistakes.
Covers model selection, experiment tracking, hyperparameter tuning, and evaluation metrics that reflect business goals. You will build workflows that make experiments reproducible and comparable.
Walks through batch, online, streaming, and edge deployment patterns and the infrastructure each requires. You will match deployment strategy to product latency and cost constraints.
Defines data drift, concept drift, and feedback loops, then shows you how to detect and respond to each. You will design a monitoring setup that catches model degradation before it reaches users.
Explains how to update models safely using shadow deployment, canary releases, and A/B testing. You will learn when continual retraining is worth the infrastructure cost and when it is not.
Surveys the tooling landscape — orchestration, serving frameworks, feature platforms, and experiment trackers — and gives you criteria for evaluating and selecting them for your team's context.
You should be comfortable with basic ML concepts — supervised learning, model evaluation, training loops — at roughly the level of a university ML course or equivalent self-study. The book does not teach modeling fundamentals; it focuses on systems design around them.
It is primarily conceptual and design-oriented, with code samples used to illustrate specific points rather than as the main vehicle of instruction. If you want line-by-line implementation tutorials, this is not that book — it is focused on decision-making and architecture.
The core concepts — data pipelines, deployment patterns, drift monitoring, and system design tradeoffs — are stable and remain directly applicable. Specific tool names in the MLOps ecosystem evolve quickly, so treat those sections as a framework for evaluation rather than a current vendor guide.
The book is useful for small teams as well. Huyen explicitly addresses resource-constrained environments and explains which practices scale down to small teams. Many readers apply the frameworks working solo or on a team of two or three engineers.
The book includes code snippets throughout the text. Check the publisher's page at O'Reilly for any associated resources or errata the author has released since publication.