Paper #14
LoRA: Low-Rank Adaptation of Large Language Models (2021)
AI-generated
LoRA enables fine-tuning of large language models by training only a small number of additional parameters (as little as roughly 0.01% of the original count), making customization of massive models practical on consumer hardware.
Instead of updating all parameters during fine-tuning (which requires enormous GPU memory for large models), LoRA freezes the pre-trained model weights and injects small trainable rank-decomposition matrices into each Transformer layer. These low-rank matrices capture the task-specific adaptations without modifying the original weights.
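The mechanism above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: the layer width, rank, and scaling value are placeholders, and the frozen weight is random rather than pre-trained. Following the paper, B is zero-initialized so the adapted layer starts out identical to the frozen one.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 512, 8                            # layer width and LoRA rank (r << d); illustrative values

W = rng.normal(size=(d, d))              # frozen pre-trained weight (never updated)
A = rng.normal(scale=0.01, size=(r, d))  # trainable down-projection
B = np.zeros((d, r))                     # trainable up-projection, zero-init so the update BA starts at 0
alpha = 16.0                             # LoRA scaling hyperparameter (placeholder value)

def lora_forward(x):
    # h = x W^T + (alpha / r) * x A^T B^T  -- only A and B receive gradients during training
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.normal(size=(4, d))
# With B zero-initialized, the adapted layer initially matches the frozen layer exactly
assert np.allclose(lora_forward(x), x @ W.T)
```

Note that the low-rank path costs two thin matrix multiplies instead of one d-by-d multiply, and the frozen W is shared untouched across all tasks.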
The key mathematical insight: the weight updates during fine-tuning have low intrinsic rank. Instead of learning a full update ΔW to a weight matrix W, LoRA learns ΔW = BA, where B and A are thin matrices of rank r, so the update can be represented with far fewer parameters. A 175B-parameter model can be adapted with only 4.7M trainable parameters.
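The 4.7M figure can be reproduced with a back-of-envelope count, assuming the paper's setting of adapting only the attention matrices W_q and W_v of GPT-3 175B at rank r = 1 (each adapter contributes r × (d_in + d_out) parameters):

```python
d_model = 12288           # GPT-3 175B hidden size
n_layers = 96             # number of Transformer layers
r = 1                     # LoRA rank (smallest configuration)
matrices_per_layer = 2    # adapting W_q and W_v only

# Each adapted d x d matrix gets A (r x d) and B (d x r): r * 2d parameters
params_per_matrix = r * (d_model + d_model)
total = n_layers * matrices_per_layer * params_per_matrix
print(total)  # 4718592, i.e. ~4.7M trainable parameters vs. 175B frozen ones
```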
LoRA made fine-tuning accessible. Before LoRA, fine-tuning GPT-3-scale models required hundreds of gigabytes of GPU memory and specialized infrastructure. With LoRA, you can fine-tune a 7B parameter model on a single consumer GPU.
It also enables efficient multi-task deployment: the base model stays frozen, and you swap in different LoRA adapters for different tasks. A single base model can serve dozens of specialized tasks with minimal overhead.
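Adapter swapping can be sketched as follows. Everything here is illustrative: the task names, shapes, and random "trained" adapters stand in for real fine-tuned weights. The point is that each task stores only a small (B, A) pair, and applying one is a cheap rank-r update to the shared frozen W.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 64, 4
W = rng.normal(size=(d, d))        # one frozen base weight shared by every task

def trained_adapter():
    # stand-in for a real adapter; in practice B and A come from fine-tuning
    return rng.normal(scale=0.1, size=(d, r)), rng.normal(scale=0.1, size=(r, d))

adapters = {"summarize": trained_adapter(), "translate": trained_adapter()}

def forward(x, task):
    B, A = adapters[task]
    # merging W + BA is a rank-r update; the base W itself is never modified
    return x @ (W + B @ A).T

x = rng.normal(size=(2, d))
y_sum = forward(x, "summarize")
y_tr = forward(x, "translate")
```

Because the merge W + BA can be precomputed, serving an adapted model adds no inference latency over the base model, and switching tasks only means swapping one small matrix pair.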
LoRA is now the standard approach for parameter-efficient fine-tuning of open-source models. The Stable Diffusion community relies heavily on LoRA to create specialized image-generation styles. The technique has been extended by QLoRA (quantized LoRA), which trains LoRA adapters on top of a quantized base model to further reduce memory requirements.
Authors: Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen (Microsoft).
Key result: GPT-3 175B fine-tuned with LoRA matches full fine-tuning performance with 10,000x fewer trainable parameters.
Link to paper: https://arxiv.org/abs/2106.09685
Hugging Face PEFT library (includes LoRA) - https://huggingface.co/docs/peft
Sebastian Raschka: "LoRA explained" - https://magazine.sebastianraschka.com/