AI-101

Paper #8

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020)

AI Confidence: 80%

AI-generated

TL;DR

RAG (Retrieval-Augmented Generation) combines a language model with an external knowledge retrieval system. Instead of relying solely on knowledge stored in its parameters, the model retrieves relevant documents at inference time and uses them to generate better, more factual responses.

What It Does

RAG consists of two components: a retriever and a generator. The retriever searches a document corpus for passages relevant to the input query. The generator (a sequence-to-sequence model) takes both the original input and the retrieved passages as context and generates the output.
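The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the names `rag_generate`, `build_generator_input`, `overlap_retriever`, and `echo_generator` are hypothetical, the retriever is a word-overlap placeholder, and the generator is a stub that stands in for a real sequence-to-sequence model. It also concatenates all retrieved passages into one input, which is the common production pattern; the original paper instead conditions on each passage separately and marginalizes over them.

```python
from typing import Callable, List

# Hypothetical interfaces: a retriever maps (query, k) to passages,
# a generator maps one input string to one output string.
Retriever = Callable[[str, int], List[str]]
Generator = Callable[[str], str]


def build_generator_input(query: str, passages: List[str]) -> str:
    """Combine the retrieved passages and the query into one input string."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"context:\n{context}\nquestion: {query}"


def rag_generate(query: str, retriever: Retriever, generator: Generator, k: int = 2) -> str:
    """Retrieve k passages, then let the generator condition on query + passages."""
    passages = retriever(query, k)
    return generator(build_generator_input(query, passages))


# --- Toy stand-ins so the sketch runs end to end (assumptions, not real models) ---
CORPUS = [
    "Paris is the capital of France.",
    "The Nile is a river in Africa.",
    "BART is a sequence-to-sequence model.",
]


def _tokens(text: str) -> set:
    return {t.strip(".,?!").lower() for t in text.split()}


def overlap_retriever(query: str, k: int) -> List[str]:
    """Placeholder retriever: rank passages by words shared with the query."""
    q = _tokens(query)
    ranked = sorted(CORPUS, key=lambda p: len(q & _tokens(p)), reverse=True)
    return ranked[:k]


def echo_generator(prompt: str) -> str:
    """Placeholder generator: returns the top-ranked passage from the prompt."""
    for line in prompt.splitlines():
        if line.startswith("[1] "):
            return line[4:]
    return ""


if __name__ == "__main__":
    print(rag_generate("What is the capital of France?", overlap_retriever, echo_generator))
```

Because the retriever and generator are passed in as plain callables, either one can be swapped for a real model without changing `rag_generate`.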

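The retriever is Dense Passage Retrieval (DPR), which encodes documents and queries into vectors and returns the documents whose vectors have the highest inner product with the query vector. The generator is BART, a pre-trained sequence-to-sequence model. In the paper, the query encoder and the generator are fine-tuned jointly, while the document encoder and its index stay fixed.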
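To make the "encode, then find the closest vectors" step concrete, here is a small sketch of vector retrieval in plain Python. The `embed` function is an assumption made for illustration: it hashes words into a fixed-size, normalized vector so the example runs without any model, whereas a real dense retriever uses trained neural encoders. The names `build_index` and `search`, and the dimension `DIM`, are likewise hypothetical. The scoring step (inner product between query and document vectors, keep the top k) is the part that mirrors the description above.

```python
import hashlib
import math
from typing import List, Tuple

DIM = 256  # toy vector size (assumption); learned encoders typically use several hundred dimensions


def embed(text: str) -> List[float]:
    """Stand-in encoder: hashed bag of words, L2-normalized.
    A real dense retriever would call a trained neural encoder here."""
    vec = [0.0] * DIM
    for raw in text.lower().split():
        token = raw.strip(".,?!")
        if not token:
            continue
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % DIM] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def inner_product(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def build_index(corpus: List[str]) -> List[Tuple[str, List[float]]]:
    """Encode every document once, ahead of query time."""
    return [(doc, embed(doc)) for doc in corpus]


def search(query: str, index: List[Tuple[str, List[float]]], k: int = 2) -> List[Tuple[float, str]]:
    """Score each document by inner product with the query vector; return the top k."""
    q = embed(query)
    scored = [(inner_product(q, vec), doc) for doc, vec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    docs = [
        "The Nile is a river in Africa.",
        "Paris is the capital of France.",
        "BART is a sequence-to-sequence model.",
    ]
    index = build_index(docs)
    for score, doc in search("capital of France", index, k=2):
        print(f"{score:.3f}  {doc}")
```

This sketch scans every document for each query. At real corpus sizes, systems usually replace that loop with an approximate nearest-neighbor index, which trades a little accuracy for much faster lookups.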

Why It Matters

RAG addresses two of the biggest problems with language models: hallucination and outdated knowledge. A standard LLM can only use knowledge from its training data, which has a cutoff date and is compressed into model weights in a lossy way. RAG gives the model access to external, up-to-date, verifiable information at generation time.

This pattern is now ubiquitous in production AI applications. When you use ChatGPT with web browsing, Perplexity AI, or an enterprise assistant that answers questions about your company's documents, you are using some form of retrieval-augmented generation.

The practical payoff is that you can build AI systems that answer questions about private databases, recent events, or specialized domains by updating the document collection rather than retraining the model.

Key Details

Authors: Patrick Lewis, Ethan Perez, Aleksandra Piktus, and 9 others (Facebook AI Research, University College London, New York University).

Link to paper: https://arxiv.org/abs/2005.11401

Sources & Further Reading

Full paper: https://arxiv.org/abs/2005.11401

LangChain: RAG tutorial - https://python.langchain.com/docs/tutorials/rag/

Pinecone: "What is RAG?" - https://www.pinecone.io/learn/retrieval-augmented-generation/