Pages: 463
Year: 2024
Level: intermediate
Read time: 12h

Louis-François Bouchard, Louie Peters · Towards AI · 2024

Reviewed by Ashish Sheth · Updated May 2026

Building LLMs for Production

Name: Building LLMs for Production
Rating: 4.8 (23 reviews)
ISBN: 9789355427830

Enhancing LLM Abilities and Reliability with Prompting, Fine-Tuning, and RAG

Subjects: ai engineering · llm

What you'll come away with

01. How to select the right LLM for your production use case

02. Building reliable RAG systems with proper retrieval and indexing

03. When to fine-tune vs use RAG vs prompt engineer

04. Practical LangChain and LlamaIndex implementation patterns

05. Quantization and distillation techniques for reducing inference costs

06. How to build AI agents with tool access

Strengths

+Clear explanations with simple analogies for complex concepts

+Practical code examples using LangChain and LlamaIndex

+Strong chapters on fine-tuning, quantization, and distillation

+Endorsed by Jerry Liu (CEO of LlamaIndex) as the most comprehensive textbook on LLM applications

Caveats

−Title says 'Production' but lacks depth on actual hosting and serving infrastructure

−Reads more as applied research than a practical engineering guide

−Better as a reference than a cover-to-cover read

Read this if

→Engineers building their first RAG or LLM-powered application

→ML practitioners wanting a survey of LLM techniques with code

→Teams deciding between fine-tuning, RAG, and prompting strategies

Skip this if

—Senior ML engineers already running LLM systems in production

—People wanting deep transformer theory (see Build a Large Language Model)

—Those looking for cloud infrastructure and DevOps guidance for LLMs

Head-to-head comparisons

Building LLMs for Production vs AI Engineering
Building LLMs for Production vs Designing Machine Learning Systems
Building LLMs for Production vs LLM Engineer's Handbook
Building LLMs for Production vs Hands-On Large Language Models
Building LLMs for Production vs AI Agents in Action

Frequently asked

Is Building LLMs for Production good for beginners?

You need basic Python and some understanding of what LLMs are. It is not meant as a first book on AI, but it starts from fundamentals and builds up.

Does it cover the latest models like GPT-4 and Claude?

It covers LLM concepts and patterns that apply to any model. The frameworks (LangChain, LlamaIndex) work with all major providers.

Does it actually cover deploying LLMs to production servers?

Not deeply. Despite the title, the book leans toward applied techniques like prompting, fine-tuning, RAG, and distillation rather than the infrastructure side of serving models. For hosting and scaling, pair it with a cloud-specific guide or the LLM Engineer's Handbook, which is more deployment-focused.