Build a Large Language Model (From Scratch) versus Generative Deep Learning.
Both show up on every "best" list. They're not competitors. They're a sequence. Here's which one to read first, and when.
Reviewed by Ashish Sheth · Updated April 2026
Attribute | Build a Large Language Model (From Scratch) | Generative Deep Learning
Author | Sebastian Raschka | David Foster
Pages | 368 | 453
Published | 2024 | 2023
Publisher | Manning Publications | Shroff Publishers / O'Reilly Media
Level | intermediate | intermediate
Amazon Rating | 4.5/5 (445) | 4.5/5 (205)
Goodreads Rating | 4.6/5 (313) | 4.3/5 (264)
Build a Large Language Model (From Scratch)
Strengths
+ Clear, step-by-step pedagogy that breaks down complex concepts into manageable pieces
+ Hands-on coding throughout: you build a working model on your laptop
+ Excellent diagrams and visual explanations alongside code
+ Companion GitHub repo (91,000+ stars) includes bonus materials
Caveats
− Limited mathematical depth on why certain architectural choices exist
− Focuses only on GPT-style architecture, no coverage of alternatives
− Requires solid Python and basic ML knowledge to follow along
Generative Deep Learning
Strengths
+ Best single book covering the breadth of generative architectures
+ 2nd edition adds diffusion models — essential for 2026 readers
+ Code-first with Keras implementations you can run
+ Strong theoretical grounding without being math-heavy
Caveats
− Keras/TensorFlow focus when much of generative ML is now PyTorch
− Diffusion chapter is solid but the field has moved fast since 2023
− Less coverage of LLM generation than readers may expect from the title
The verdict
Build a Large Language Model (From Scratch) is the stronger pick overall, but Generative Deep Learning may suit you better if you're a developer exploring generative models.
Frequently asked
Which is better, Build a Large Language Model (From Scratch) or Generative Deep Learning?
Build a Large Language Model (From Scratch) is the stronger pick overall, but Generative Deep Learning may suit you better if you're a developer exploring generative models.
Do I need a GPU to follow along with Build a Large Language Model (From Scratch)?
No. The model you build is small enough to train on a regular laptop CPU. That's intentional. The goal is understanding, not training a production model.
Does Generative Deep Learning cover Stable Diffusion and modern image generation?
The 2nd edition (2023) added a diffusion-models chapter that covers the architecture behind Stable Diffusion. Specific tools and APIs have evolved since, but the architecture explanations hold up.