AI Research Papers
Plain-language TL;DR summaries of the most influential AI research papers. Each entry covers what the paper does, why it matters, and links to the full text.
- Gemini 3 and Gemini 2.5 - Google DeepMind (2025-2026)
Google's Gemini model family advanced rapidly through 2025-2026, with Gemini 2.5 Pro/Flash (mid-2025) and Gemini 3 Pro/Deep Think (November 2025)...
- GPT-5 System Card - OpenAI (August 2025)
GPT-5, released August 2025, is not a single model but a unified system with a smart, fast model, a deep reasoning model, and a real-time router...
- Claude 4 System Card - Anthropic (May 2025)
Anthropic's Claude Opus 4 and Claude Sonnet 4, released May 2025, came with a 120-page system card - the most detailed safety disclosure ever...
- DeepSeek-R1: Incentivizing Reasoning via Reinforcement Learning (January 2025)
The DeepSeek-R1 paper showed, through its R1-Zero variant, that pure reinforcement learning - without any human-labeled reasoning examples - can teach a language model to reason step by...
- LLaMA 4: Natively Multimodal Open-Source AI - Meta (April 2025)
Meta's LLaMA 4 introduced the first open-weight natively multimodal models using a Mixture-of-Experts architecture, with Scout (109B total...
- Circuit Tracing: Mechanistic Interpretability Breakthrough - Anthropic (2025)
Anthropic's circuit tracing research revealed how language models transform prompts into responses at the mechanistic level - tracing not just...
- Meta Chain-of-Thought: System 2 Reasoning in LLMs (January 2025)
"Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought" proposed training models to generate their own reasoning...
- Alignment Faking in Large Language Models - Anthropic (December 2024)
Anthropic researchers demonstrated that AI models can exhibit "alignment faking" - appearing to comply with training objectives during training...
- From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review (2025)
This comprehensive survey covers the rapid evolution of AI agents from simple chatbots to autonomous systems that plan, act, and learn through...
- AlphaGenome and AI for Science - Google DeepMind (2025-2026)
Google DeepMind expanded AI's impact on scientific research with AlphaGenome (genomics), GenCast (weather prediction), and continued work on...
- Gemini: A Family of Highly Capable Multimodal Models (2023)
Google's Gemini is a natively multimodal model family trained from the ground up on text, images, audio, and video interleaved together, rather...
- The Claude 3 Model Family (2024)
Anthropic's Claude 3 technical report describes a family of three multimodal models (Haiku, Sonnet, Opus) that set new standards for safety,...
- Direct Preference Optimization: Your Language Model Is Secretly a Reward Model (2023)
DPO simplifies the RLHF pipeline by eliminating the need for a separate reward model and reinforcement learning phase. It directly optimizes the...
- Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023)
Tree of Thoughts (ToT) extends chain-of-thought prompting by having the model explore multiple reasoning paths, evaluate them, and backtrack when...
- Switch Transformers: Scaling to Trillion Parameter Models (2022)
Switch Transformers introduced a simplified mixture-of-experts (MoE) approach where each input token is routed to a single expert, enabling models...
- Mistral 7B (2023)
Mistral 7B is a 7-billion-parameter model that outperforms the 13B LLaMA 2 on all evaluated benchmarks and beats the 34B LLaMA 1 on reasoning, mathematics, and code generation. It...
- LoRA: Low-Rank Adaptation of Large Language Models (2021)
LoRA enables fine-tuning of large language models by training only a small number of additional parameters (as little as 0.01% of the original), making...
- High-Resolution Image Synthesis with Latent Diffusion Models (2022)
Latent Diffusion Models (the paper behind Stable Diffusion) made high-quality image generation computationally accessible by running the diffusion...
- Denoising Diffusion Probabilistic Models (2020)
This paper established diffusion models as a viable approach to image generation. Diffusion models work by learning to reverse a gradual noising...
- DALL-E: Zero-Shot Text-to-Image Generation (2021)
DALL-E showed that a Transformer trained on text-image pairs can generate original images from text descriptions, with no task-specific training....
- Scaling Laws for Neural Language Models (2020)
This paper discovered that language model performance follows predictable mathematical relationships (power laws) with model size, dataset size,...
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)
Including a few worked examples with step-by-step reasoning in a prompt dramatically improves language model performance on reasoning tasks. This paper formalized...
- Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020)
RAG (Retrieval-Augmented Generation) combines a language model with an external knowledge retrieval system. Instead of relying solely on knowledge...
- GPT-4 Technical Report (2023)
GPT-4 is a large multimodal model that accepts both text and image inputs and produces text outputs. It represents a significant leap in...
- LLaMA: Open and Efficient Foundation Language Models (2023)
Meta's LLaMA showed that smaller, more efficiently trained open-source models can match or beat much larger proprietary models. It democratized...
- Constitutional AI: Harmlessness from AI Feedback (2022)
Constitutional AI (CAI) is Anthropic's approach to making AI systems safer without relying entirely on human labelers. Instead of humans rating...
- Training Language Models to Follow Instructions with Human Feedback (2022)
This paper (often called the InstructGPT paper) applied reinforcement learning from human feedback (RLHF) at scale as a method to align language models...
- Language Models Are Few-Shot Learners - GPT-3 (2020)
GPT-3 demonstrated that scaling a language model to 175 billion parameters enables it to perform tasks it was never explicitly trained on, simply...
- BERT: Pre-training of Deep Bidirectional Transformers (2018)
BERT showed that pre-training a Transformer to understand language bidirectionally (looking at words both before and after a given position)...
- Attention Is All You Need (2017)
This paper introduced the Transformer architecture, which replaced recurrence and convolution with self-attention mechanisms. It is the foundation...
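As a rough illustration of the self-attention mechanism mentioned in the entry above, here is a minimal pure-Python sketch of scaled dot-product attention. It assumes the query, key, and value vectors are already given; the learned projections, multiple heads, and masking used in the actual Transformer are left out, and the function names are chosen here for illustration only.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (shifted by the max for stability)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """For each query, return a weighted average of the value vectors.

    Weights come from the softmax of (query . key) / sqrt(d_k),
    where d_k is the length of each key vector.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Blend the value vectors using the attention weights.
        blended = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(blended)
    return outputs
```

When every key is identical, each query attends to all positions equally, so the output is simply the average of the value vectors. A query that matches one key much more strongly than the others pulls the output toward that key's value.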