Definition
Retrieval-Augmented Generation (RAG) is a pattern in which the system retrieves relevant documents from a knowledge base and includes them in the prompt before asking the LLM to generate an answer. This grounds the model's answer in current, domain-specific facts instead of relying on training data alone. Common uses include documentation chatbots, customer support, and internal search. Because updating the knowledge base does not require retraining the model, RAG is typically cheaper and faster to update than fine-tuning.
Example
User asks 'What's our refund policy?' → System retrieves the refund policy doc → Includes it in the prompt → LLM answers grounded in the doc.
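The flow above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the knowledge base contents, the names `KNOWLEDGE_BASE`, `retrieve`, and `build_prompt`, and the keyword-overlap scoring are all assumptions made for the example. Production systems usually retrieve with a search index or embedding similarity, and the final call to an LLM is left out here.

```python
import re

# Hypothetical in-memory knowledge base with placeholder text.
# A real system would typically query a search index or vector store.
KNOWLEDGE_BASE = {
    "refund-policy": "Refunds are available within 30 days of purchase with a receipt.",
    "shipping": "Standard shipping takes 3 to 5 business days.",
    "privacy": "We never sell customer data to third parties.",
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, kb, top_k=1):
    """Rank documents by how many tokens they share with the question.

    Keyword overlap is a stand-in for a real retriever; it is only
    meant to make the retrieval step concrete.
    """
    question_tokens = tokenize(question)
    scored = []
    for doc_id, text in kb.items():
        score = len(question_tokens & tokenize(doc_id + " " + text))
        scored.append((score, doc_id))
    # Highest score first; ties broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def build_prompt(question, kb, top_k=1):
    """Place the retrieved documents in the prompt ahead of the question."""
    doc_ids = retrieve(question, kb, top_k)
    context = "\n".join(f"[{doc_id}] {kb[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )


prompt = build_prompt("What's our refund policy?", KNOWLEDGE_BASE)
print(prompt)
```

The resulting `prompt` string is what would be sent to the LLM. Keeping retrieval and prompt construction as separate functions means the retriever can be swapped for a better one without touching the prompt template.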
When to use
When answers depend on domain knowledge or frequently updated facts, as in customer support and documentation Q&A.
Also known as
retrieval augmented generation