Build a Large Language Model (From Scratch) versus Deep Learning with Python.
Both show up on every "best" list. They're not competitors. They're a sequence. Here's which one to read first, and when.
Reviewed by Ashish Sheth · Updated May 2026
Build a Large Language Model (From Scratch) | Deep Learning with Python
Author: Sebastian Raschka | François Chollet, Matthew Watson
Pages: 368 | 1,250
Published: 2024 | 2025
Publisher: Manning Publications | Manning Publications
Level: Intermediate | Intermediate
Amazon Rating: 4.5/5 (445) | 4.5/5 (25)
Goodreads Rating: 4.6/5 (313) | 4.57/5 (1,428)
Build a Large Language Model (From Scratch)
Strengths
+ Clear, step-by-step pedagogy that breaks down complex concepts into manageable pieces
+ Hands-on coding throughout; you build a working model on your laptop
+ Excellent diagrams and visual explanations alongside code
+ Companion GitHub repo has 91,000+ stars with bonus materials
Caveats
− Limited mathematical depth on why certain architectural choices exist
− Focuses only on GPT-style architecture, no coverage of alternatives
− Requires solid Python and basic ML knowledge to follow along
Deep Learning with Python
Strengths
+ Co-written by Keras creator François Chollet, so the framework guidance comes from its designer
+ 3rd edition (2025) adds JAX, PyTorch, generative AI, and Keras 3 multi-backend
+ Clear, code-driven explanations without unnecessary math
+ Develops genuine intuition, not just recipe-following
Caveats
− 1,250 pages, which is a significant time commitment
− Keras-first framing may feel indirect if you live in pure PyTorch
− Goes broad rather than deep on the newest LLM-era techniques (pair with AI Engineering for production LLM work)
The verdict
These books work best in sequence. Start with Deep Learning with Python if you need deep learning fundamentals from first principles. Move to Build a Large Language Model (From Scratch) once you have solid Python and basic ML knowledge and want to build a GPT-style transformer yourself.
Frequently asked
Which is better, Build a Large Language Model (From Scratch) or Deep Learning with Python?
Neither is strictly better, because they cover different ground. Deep Learning with Python teaches deep learning from first principles, while Build a Large Language Model (From Scratch) walks through building a GPT-style transformer and assumes basic ML knowledge.
Do I need a GPU to follow along with Build a Large Language Model (From Scratch)?
No. The model you build is small enough to train on a regular laptop CPU. That's intentional: the goal is understanding, not training a production model.
How is the 3rd edition of Deep Learning with Python different from the 2nd?
The 3rd edition (October 2025) adds Keras 3 multi-backend support, PyTorch and JAX primers, and full coverage of modern generative AI. It's also significantly longer (1,250 pages versus 504). If you read the 2nd edition recently, the new content is the main reason to upgrade.