Paper #26
LLaMA 4: Natively Multimodal Open-Source AI - Meta (April 2025)
AI-generated
Meta's LLaMA 4 introduced the first open-weight, natively multimodal models built on a Mixture-of-Experts (MoE) architecture. Scout (109B total parameters, 10M-token context) and Maverick (400B total, 1M-token context) bring frontier-class multimodal AI to anyone who can download the weights.
LLaMA 4 Scout uses 16 experts with 17B parameters active per token, pairing strong performance with a remarkable 10-million-token context window, long enough to hold entire codebases or book series. Maverick uses 128 experts with 400B total parameters (the same 17B active per token) and a 1M-token context window, competing with proprietary models.
Both models are natively multimodal: they process text, images, and video as first-class inputs, trained on interleaved multimodal data from the start rather than bolting vision onto a text model.
Meta also announced Behemoth (288B active, ~2T total), which was still in training at release time.
LLaMA 4 continued Meta's pattern of democratizing AI: open-weight models with competitive performance let anyone run, customize, and study frontier-class systems. The MoE architecture also keeps these large models practical to serve, since only 17B parameters are activated per token even though the full model holds hundreds of billions.
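The "only 17B active per token" claim comes from MoE routing: a small router picks one expert per token, so only that expert's weights participate in the forward pass. Below is a minimal, toy-sized sketch of top-1 routing plus a shared expert (Meta's blog describes sending each token to a shared expert and one of Scout's 16 routed experts); the dimensions, weight shapes, and single-matrix "experts" here are illustrative assumptions, not Meta's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

D_MODEL = 8       # toy hidden size (illustrative, not the real config)
N_EXPERTS = 16    # Scout uses 16 routed experts

# Each "expert" is a single weight matrix standing in for a full FFN.
shared_expert = rng.standard_normal((D_MODEL, D_MODEL))
routed_experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((D_MODEL, N_EXPERTS))  # gating weights

def moe_forward(x):
    """Send a token vector to the shared expert plus its top-1 routed expert."""
    logits = x @ router                                 # one score per routed expert
    top = int(np.argmax(logits))                        # top-1 routing
    gate = np.exp(logits[top]) / np.exp(logits).sum()   # softmax weight of the winner
    # Only ONE routed expert's parameters participate in this token's compute.
    return x @ shared_expert + gate * (x @ routed_experts[top]), top

token = rng.standard_normal(D_MODEL)
out, chosen = moe_forward(token)
print(f"token routed to expert {chosen}; output shape {out.shape}")
```

The key point the sketch illustrates: total parameter count scales with the number of experts, but per-token compute scales only with the experts actually selected.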
The 10M token context on Scout was unprecedented for an open model. This enables applications (whole-codebase analysis, long-document processing) that were previously locked behind proprietary APIs.
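To make the 10M-token figure concrete, a rough back-of-envelope helps. Assuming the common heuristic of ~4 characters per token (an assumption; the actual ratio depends on the tokenizer and the content), the window holds on the order of 40 MB of raw text:

```python
CHARS_PER_TOKEN = 4          # rough heuristic; varies by tokenizer and language
CONTEXT_TOKENS = 10_000_000  # Scout's advertised context window

budget_chars = CONTEXT_TOKENS * CHARS_PER_TOKEN
print(f"~{budget_chars / 1e6:.0f} MB of raw text fits in a 10M-token window")

# A hypothetical 2 MB codebase would occupy roughly this fraction of the window:
codebase_bytes = 2_000_000
fraction = codebase_bytes / CHARS_PER_TOKEN / CONTEXT_TOKENS
print(f"2 MB codebase ~= {fraction:.1%} of context")
```

Under that heuristic, even a sizable repository uses only a few percent of the window, which is why whole-codebase analysis becomes feasible.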
Organization: Meta AI. Release: April 5, 2025. Scout: 17B active, 109B total, 16 experts, 10M context. Maverick: 17B active, 400B total, 128 experts, 1M context. Open weights: Yes.
Meta AI Blog: LLaMA 4 announcement - https://ai.meta.com/blog/llama-4-multimodal-intelligence/
LLaMA official site - https://www.llama.com/
Hugging Face: LLaMA 4 models - https://huggingface.co/meta-llama