Paper #28

Claude 4 System Card - Anthropic (May 2025)

AI Confidence: 85%

AI-generated

TL;DR

Anthropic's Claude Opus 4 and Claude Sonnet 4, released May 2025, came with a 120-page system card, one of the most detailed safety disclosures published for a frontier AI model to date. Claude Opus 4 was deployed under AI Safety Level 3 (ASL-3) protections, making it the first Anthropic model released under that standard.

What It Does

Claude Opus 4 is a frontier reasoning model with strong performance across coding, math, analysis, and long-context tasks. Claude Sonnet 4 is a faster, more affordable model that closes much of the gap with Opus.

The system card describes the training data: publicly available internet data (as of March 2025), non-public data from third parties, data from labeling services and paid contractors, data from users who opted in, and Anthropic-generated data. Fine-tuning toward helpful, honest, and harmless behavior uses human feedback, Constitutional AI, and training of selected character traits.

The alignment assessment reported little evidence of systematic deception, hidden goals, or coherent scheming behavior - a key concern for models at this capability level - although it did document concerning behavior in some extreme test scenarios.

Why It Matters

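The ASL-3 classification is significant. Anthropic's Responsible Scaling Policy ties ASL-3 to models that could provide meaningful uplift for catastrophic misuse, chiefly the development of chemical, biological, radiological, or nuclear weapons. Anthropic applied the ASL-3 protections as a precaution, without having determined that Opus 4 definitively required them. Releasing under ASL-3 means Anthropic implemented stronger security measures against model-weight theft and deployment safeguards targeting that misuse.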

The 120-page system card sets a high bar for transparency. Where many model cards run 10-20 pages, Anthropic's covers evaluation, safety testing, and known limitations in considerable detail.

Claude Opus 4.5 followed (a "thinking" model), then Claude Opus 4.6 and Claude Sonnet 4.6 in February 2026 with a 1M-token context window at standard pricing.

Key Details

Organization: Anthropic

Release: May 22, 2025 (Opus 4, Sonnet 4); February 2026 (Opus 4.6, Sonnet 4.6)

Safety level: ASL-3 (Opus 4), ASL-2 (Sonnet 4)

System card: 120 pages

Sources & Further Reading

Anthropic: Claude 4 System Card (PDF) - https://www-cdn.anthropic.com/4263b940cabb546aa0e3283f35b686f4f3b2ff47.pdf

Simon Willison: Claude 4 System Card analysis - https://simonwillison.net/2025/may/25/claude-4-system-card/

Anthropic: Model transparency hub - https://www.anthropic.com/transparency/model-report