AI-101

Large Language Model (LLM)

An AI system trained on massive text datasets to understand and generate human language.

AI-generated

What It Means

A large language model is a neural network, typically with billions of parameters, trained on enormous amounts of text. During training it repeatedly predicts the next token (roughly a word or word fragment) in a sequence, and in doing so it picks up patterns in language: grammar, facts, reasoning, and even coding. GPT-4, Claude, Gemini, and LLaMA are all LLMs.
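
To make the idea of next-word prediction concrete, here is a minimal sketch in Python. It is a toy word-pair counter, not a neural network and not how any real LLM is implemented; the function names `train_bigram` and `predict_next` and the tiny corpus are made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical toy corpus; a real model trains on vastly more text.
corpus = "the cat sat on the mat . the cat ate the fish . the cat slept"
model = train_bigram(corpus)

print(predict_next(model, "the"))    # "cat" follows "the" most often in this corpus
print(predict_next(model, "zebra"))  # None: the word never appeared in training
```

A real LLM replaces the count table with a neural network, works on tokens rather than whole words, and conditions on a long stretch of preceding text instead of a single word, but the training signal is the same kind of guess: what comes next.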

Why It Matters

LLMs are the technology behind most modern AI text and coding tools. When you use ChatGPT, Claude, GitHub Copilot, or a similar AI writing assistant, you are using an LLM. Knowing what they are helps you understand their strengths (pattern matching, language generation, broad knowledge) and their limitations (hallucinations, training data that stops at a cutoff date, and no true "understanding" in the human sense).

Sources & Further Reading

Anthropic: Claude model documentation - https://docs.anthropic.com/en/docs/about-claude/models

OpenAI: GPT-4 technical report - https://arxiv.org/abs/2303.08774

Wikipedia: Large language model - https://en.wikipedia.org/wiki/Large_language_model

Andrej Karpathy: "Intro to Large Language Models" (video) - https://www.youtube.com/watch?v=zjkBMFhNj_g