AI Research Papers

Plain-language TL;DR summaries of the most influential AI research papers. Each entry covers what the paper does and why it matters, and links to the full text.

Gemini 3 and Gemini 2.5 - Google DeepMind (2025-2026)
Google's Gemini model family advanced rapidly through 2025-2026, with Gemini 2.5 Pro/Flash (mid-2025) and Gemini 3 Pro/Deep Think (November 2025)...
GPT-5 System Card - OpenAI (August 2025)
GPT-5, released August 2025, is not a single model but a unified system with a smart fast model, a deep reasoning model, and a real-time router...
Claude 4 System Card - Anthropic (May 2025)
Anthropic's Claude Opus 4 and Claude Sonnet 4, released May 2025, came with a 120-page system card - the most detailed safety disclosure ever...
DeepSeek-R1: Incentivizing Reasoning via Reinforcement Learning (January 2025)
The DeepSeek-R1 paper showed, through its R1-Zero model, that pure reinforcement learning - without any human-labeled reasoning examples - can teach a language model to reason step by...
LLaMA 4: Natively Multimodal Open-Source AI - Meta (April 2025)
LLaMA 4 introduced Meta's first open-weight natively multimodal models, built on a Mixture-of-Experts architecture, with Scout (109B total...
Circuit Tracing: Mechanistic Interpretability Breakthrough - Anthropic (2025)
Anthropic's circuit tracing research revealed how language models transform prompts into responses at the mechanistic level - tracing not just...
Meta Chain-of-Thought: System 2 Reasoning in LLMs (January 2025)
"Towards System 2 Reasoning in LLMs: Learning How to Think With Meta Chain-of-Thought" proposed training models to generate their own reasoning...
Alignment Faking in Large Language Models - Anthropic (December 2024)
Anthropic researchers demonstrated that AI models can exhibit "alignment faking" - appearing to comply with training objectives during training...
From LLM Reasoning to Autonomous AI Agents: A Comprehensive Review (2025)
This comprehensive survey covers the rapid evolution of AI agents from simple chatbots to autonomous systems that plan, act, and learn through...
AlphaGenome and AI for Science - Google DeepMind (2025-2026)
Google DeepMind expanded AI's impact on scientific research with AlphaGenome (genomics), GenCast (weather prediction), and continued work on...
Gemini: A Family of Highly Capable Multimodal Models (2023)
Google's Gemini is a natively multimodal model family trained from the ground up on text, images, audio, and video interleaved together, rather...
The Claude 3 Model Family (2024)
Anthropic's Claude 3 technical report describes a family of three multimodal models (Haiku, Sonnet, Opus) that set new standards for safety,...
Direct Preference Optimization: Your Language Model Is Secretly a Reward Model (2023)
DPO simplifies the RLHF pipeline by eliminating the need for a separate reward model and reinforcement learning phase. It directly optimizes the...
Tree of Thoughts: Deliberate Problem Solving with Large Language Models (2023)
Tree of Thoughts (ToT) extends chain-of-thought prompting by having the model explore multiple reasoning paths, evaluate them, and backtrack when...
Switch Transformers: Scaling to Trillion Parameter Models (2022)
Switch Transformers introduced a simplified mixture-of-experts (MoE) approach where each input token is routed to a single expert, enabling models...
Mistral 7B (2023)
Mistral 7B is a 7-billion-parameter model that outperforms the 13B LLaMA 2 on all evaluated benchmarks and beats the 34B LLaMA 1 in reasoning, mathematics, and code generation. It...
LoRA: Low-Rank Adaptation of Large Language Models (2021)
LoRA enables fine-tuning of large language models by training only a small number of additional parameters (as little as about 0.01% of the original, in the paper's GPT-3 175B experiments), making...
High-Resolution Image Synthesis with Latent Diffusion Models (2022)
Latent Diffusion Models (the paper behind Stable Diffusion) made high-quality image generation computationally accessible by running the diffusion...
Denoising Diffusion Probabilistic Models (2020)
This paper established diffusion models as a viable approach to image generation. Diffusion models work by learning to reverse a gradual noising...
DALL-E: Zero-Shot Text-to-Image Generation (2021)
DALL-E showed that a Transformer trained on text-image pairs can generate original images from text descriptions, with no task-specific training....
Scaling Laws for Neural Language Models (2020)
This paper discovered that language model performance follows predictable mathematical relationships (power laws) with model size, dataset size,...
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (2022)
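Including a few worked examples with intermediate reasoning steps in a prompt dramatically improves the performance of large language models on reasoning tasks (the zero-shot "Let's think step by step" trigger comes from a separate follow-up paper). This paper formalized...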
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks (2020)
RAG (Retrieval-Augmented Generation) combines a language model with an external knowledge retrieval system. Instead of relying solely on knowledge...
GPT-4 Technical Report (2023)
GPT-4 is a large multimodal model that accepts both text and image inputs and produces text outputs. It represents a significant leap in...
LLaMA: Open and Efficient Foundation Language Models (2023)
Meta's LLaMA showed that smaller, more efficiently trained open-source models can match or beat much larger proprietary models. It democratized...
Constitutional AI: Harmlessness from AI Feedback (2022)
Constitutional AI (CAI) is Anthropic's approach to making AI systems safer without relying entirely on human labelers. Instead of humans rating...
Training Language Models to Follow Instructions with Human Feedback (2022)
This paper (often called the InstructGPT paper) applied reinforcement learning from human feedback (RLHF), a technique introduced in earlier work, at scale as a method to align language models...
Language Models Are Few-Shot Learners - GPT-3 (2020)
GPT-3 demonstrated that scaling a language model to 175 billion parameters enables it to perform tasks it was never explicitly trained on, simply...
BERT: Pre-training of Deep Bidirectional Transformers (2018)
BERT showed that pre-training a Transformer to understand language bidirectionally (looking at words both before and after a given position)...
Attention Is All You Need (2017)
This paper introduced the Transformer architecture, which replaced recurrence and convolution with self-attention mechanisms. It is the foundation...