Curiosity · Concept
LoRA (Low-Rank Adaptation)
Low-Rank Adaptation (LoRA), introduced by Hu et al. in 2021, is the dominant method for fine-tuning large models on commodity hardware. Instead of updating all N billion parameters, you freeze them and learn a low-rank update (typical rank 4-64) for each adapted weight matrix, expressed as the product of two small matrices. You end up training millions of parameters instead of billions, with quality that matches full fine-tuning on most tasks.
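The idea above can be sketched in a few lines of NumPy: a frozen weight W plus a trainable low-rank product B·A, scaled by alpha/r. The layer sizes, rank, and scaling value here are illustrative choices, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, r = 512, 512, 8   # hypothetical layer size and rank
alpha = 16                     # common LoRA scaling hyperparameter

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable, small random init
B = np.zeros((d_out, r))                    # trainable, zero init: update starts at 0

def lora_forward(x):
    # y = x W^T + (alpha/r) * x A^T B^T ; only A and B would receive gradients
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

x = rng.standard_normal((4, d_in))
# With B initialized to zero, the adapter is a no-op at step 0:
assert np.allclose(lora_forward(x), x @ W.T)

full_params = W.size              # 262,144 for this 512x512 layer
lora_params = A.size + B.size     # 8,192: ~3% of the single matrix it adapts
```

Zero-initializing B is what lets you bolt the adapter onto a pretrained model without perturbing its behavior before training starts.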
Quick reference
- Proficiency
- Intermediate
- Also known as
- LoRA, low-rank adapter, PEFT (LoRA family)
- Prerequisites
- Fine-tuning, Linear algebra (matrix rank)
Frequently asked questions
What is LoRA?
LoRA (Low-Rank Adaptation) is a fine-tuning method that freezes the base model and adds small trainable low-rank matrices alongside each adapted weight matrix. Only these small matrices are trained, yielding a compact 'adapter' that captures the task-specific changes while using a fraction of the memory.
Why does LoRA work despite training so few parameters?
Empirically, the update that fine-tuning applies to a weight matrix lies in a low-dimensional subspace: you don't need all d x d degrees of freedom of the full matrix to express it. A rank-8 or rank-16 approximation captures most of the signal, which is why LoRA matches full fine-tuning on most downstream benchmarks.
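The low-rank intuition can be made concrete with an SVD. Here the "fine-tuning update" is a synthetic stand-in constructed to be rank-8 (real updates are only approximately low rank); by the Eckart-Young theorem, truncating the SVD to the top 8 singular values gives the best rank-8 approximation, and in this constructed case it recovers the update exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
d, r = 256, 8

# Illustrative stand-in for a fine-tuning update that happens to be rank-8.
delta = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))

U, S, Vt = np.linalg.svd(delta)
# Best rank-8 approximation: keep only the top 8 singular directions.
approx = (U[:, :r] * S[:r]) @ Vt[:r]

rel_err = np.linalg.norm(delta - approx) / np.linalg.norm(delta)
assert rel_err < 1e-8   # rank-8 suffices to represent this update
```

LoRA's B·A factorization is exactly such a rank-r parameterization, learned directly by gradient descent instead of computed post hoc via SVD.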
What is QLoRA?
QLoRA (Dettmers et al., 2023) quantizes the frozen base model to 4 bits (NF4) and attaches LoRA adapters in higher precision. This lets you fine-tune massive models — 65B+ — on a single consumer or prosumer GPU, democratizing fine-tuning of frontier-size models.
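A simplified sketch of the quantization side: blockwise 4-bit absmax quantization of the frozen weights. Note this uses a uniform 16-level grid for clarity; QLoRA's NF4 instead uses a fixed codebook placed at the quantiles of a normal distribution, and the block size and shapes here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def quantize_4bit(w, block_size=64):
    """Blockwise absmax quantization to 15 levels (-7..7).
    Simplified: NF4 uses a normal-quantile codebook, not a uniform grid."""
    blocks = w.reshape(-1, block_size)
    scale = np.abs(blocks).max(axis=1, keepdims=True)  # one fp scale per block
    q = np.round(blocks / scale * 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    return (q / 7.0) * scale

w = rng.standard_normal(1024).astype(np.float32)   # stand-in for frozen weights
q, scale = quantize_4bit(w)
w_hat = dequantize_4bit(q, scale).reshape(-1)

# The base weights are lossy-compressed to ~4 bits each; the LoRA adapter
# stays in 16/32-bit precision and is trained on top of the dequantized weights.
err = np.abs(w - w_hat).max()
```

The key memory win: the N-billion frozen parameters cost ~0.5 bytes each instead of 2, while the trainable adapter remains tiny.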
How is LoRA different from full fine-tuning?
Full fine-tuning updates every parameter, which is expensive in memory (Adam keeps two extra optimizer states per parameter) and produces a full new copy of the model. LoRA trains <1% of the parameters, produces a tiny adapter file, and lets you keep one base model with many swappable task adapters at inference.
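The "<1%" figure is easy to verify with back-of-envelope arithmetic. The shapes below are a hypothetical 7B-style configuration (adapting two projection matrices per layer), chosen for illustration:

```python
# Hypothetical config: rank-8 adapters on q_proj and v_proj across 32 layers.
d_model, n_layers, r = 4096, 32, 8

full = 2 * n_layers * d_model * d_model             # params in the adapted matrices
lora = 2 * n_layers * (r * d_model + d_model * r)   # A (r x d) + B (d x r) each

ratio = lora / full   # 2r/d = 16/4096 ~ 0.4% of the adapted weights,
                      # and an even smaller share of the whole model
```

Because the adapter is just the (A, B) pairs, it serializes to a few megabytes, and multiple adapters can share one resident base model.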
Sources
- Hu et al. — LoRA: Low-Rank Adaptation of Large Language Models — accessed 2026-04-20
- Dettmers et al. — QLoRA: Efficient Finetuning of Quantized LLMs — accessed 2026-04-20
- Hugging Face PEFT documentation — accessed 2026-04-20