Curiosity · Concept

LoRA (Low-Rank Adaptation)

Low-Rank Adaptation (LoRA), introduced by Hu et al. in 2021, is the dominant way to fine-tune large models on commodity hardware. Instead of updating all N billion parameters, you freeze the base weights and learn each weight update as the product of two small low-rank matrices (typical rank 4-64). You end up training millions of parameters instead of billions, with quality that matches full fine-tuning on most tasks.
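The idea fits in a few lines of NumPy. The sketch below (layer sizes and rank are illustrative, not from any particular model) freezes a weight matrix W and routes the trainable update through a rank-8 pair A, B; because B starts at zero, the model initially behaves exactly like the base model:

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, r = 1024, 1024, 8             # illustrative layer sizes and rank

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init: update starts at 0
alpha = 16                                 # LoRA scaling hyperparameter

def lora_forward(x):
    # frozen base path + trainable low-rank update path
    return x @ W.T + (x @ A.T) @ B.T * (alpha / r)

x = rng.standard_normal((2, d_in))
full_params = W.size           # 1,048,576 in the base matrix
lora_params = A.size + B.size  # 16,384 trainable, about 1.6% of the matrix
```

Only A and B receive gradients during training; W stays untouched, which is where the memory savings come from.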

Quick reference

Proficiency
Intermediate
Also known as
LoRA, low-rank adapter, PEFT (LoRA family)
Prerequisites
Fine-tuning, Linear algebra (matrix rank)

Frequently asked questions

What is LoRA?

LoRA (Low-Rank Adaptation) is a fine-tuning method that freezes the base model and adds small trainable low-rank matrices alongside each weight matrix. Only these small matrices are trained, yielding a compact 'adapter' that captures the fine-tuning while using a fraction of the memory.
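One useful consequence of this structure: after training, the adapter can be folded back into the base weight, so serving the fine-tuned model costs nothing extra at inference. A minimal NumPy sketch (all shapes and values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, alpha = 512, 8, 16
W = rng.standard_normal((d, d))         # frozen base weight
A = rng.standard_normal((r, d)) * 0.01  # trained adapter half
B = rng.standard_normal((d, r)) * 0.01  # trained adapter half

# Merge the adapter into the base weight for deployment:
W_merged = W + (alpha / r) * (B @ A)

x = rng.standard_normal((3, d))
out_adapter = x @ W.T + (alpha / r) * (x @ A.T) @ B.T
out_merged = x @ W_merged.T             # same outputs, no extra matmuls
```

Keeping the adapter unmerged instead is what enables one base model with many swappable task adapters.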

Why does LoRA work despite training so few parameters?

Empirically, the 'update' from fine-tuning lies in a low-dimensional subspace — you don't need the full d × k degrees of freedom of each weight matrix to express it. A rank-8 or rank-16 approximation captures most of the signal, which is why LoRA matches full fine-tuning on most downstream benchmarks.
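You can see the effect with a truncated SVD. The sketch below builds a synthetic update matrix whose energy is concentrated in a few directions (a stand-in for the empirically observed low intrinsic rank of real fine-tuning updates) and checks how much of it a rank-8 approximation recovers:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256
# Synthetic "fine-tuning update": a few strong rank-1 components plus noise.
delta = sum(rng.standard_normal((d, 1)) @ rng.standard_normal((1, d))
            for _ in range(4)) + 0.01 * rng.standard_normal((d, d))

U, s, Vt = np.linalg.svd(delta)
r = 8
approx = (U[:, :r] * s[:r]) @ Vt[:r]  # best rank-8 approximation (Eckart-Young)
captured = 1 - np.linalg.norm(delta - approx) / np.linalg.norm(delta)
# captured is close to 1: rank 8 recovers nearly all of the update
```

For a real model the update is not exactly low-rank, but the same measurement (done in the LoRA paper's ablations) shows small ranks capture most of it.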

What is QLoRA?

QLoRA (Dettmers et al., 2023) quantizes the frozen base model to 4 bits (NF4) and attaches LoRA adapters in higher precision. This lets you fine-tune massive models — 65B+ — on a single consumer or prosumer GPU, democratizing fine-tuning of frontier-size models.
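The division of labor is: frozen weights stored in 4 bits, adapter kept in higher precision. The toy sketch below illustrates that split with naive absmax 4-bit quantization — the real QLoRA uses an NF4 normal-float codebook with block-wise scales and double quantization, which this deliberately does not reproduce:

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize_4bit(w):
    # Toy absmax quantization to 15 signed levels (not the real NF4 scheme).
    scale = np.abs(w).max() / 7
    q = np.round(w / scale).astype(np.int8)  # values in [-7, 7]
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

W = rng.standard_normal((256, 256)).astype(np.float32)
q, scale = quantize_4bit(W)                   # frozen base: 4-bit storage
A = np.zeros((8, 256), dtype=np.float32)      # adapter stays in full precision
B = np.zeros((256, 8), dtype=np.float32)

def forward(x):
    # Dequantize the frozen base on the fly; only A and B are trained.
    return x @ dequantize(q, scale).T + (x @ A.T) @ B.T
```

Gradients flow through the dequantized weights into A and B only, so optimizer state exists just for the tiny adapter.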

How is LoRA different from full fine-tuning?

Full fine-tuning updates every parameter — expensive in memory (Adam-style optimizers keep roughly two extra values per parameter) and produces a full new copy of the model. LoRA trains <1% of parameters, produces a tiny adapter file, and lets you keep one base model with many swappable task adapters at inference.
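Back-of-the-envelope arithmetic makes the gap concrete. The numbers below assume a hypothetical 7B model with rank-8 adapters on the four attention projections of each layer (a common but not universal configuration):

```python
# Hypothetical 7B model: 32 layers, hidden size 4096, LoRA rank 8.
n_layers, d_model, r = 32, 4096, 8
full_params = 7_000_000_000

# Four projections (q, k, v, o) per layer, each with A (r x d) and B (d x r).
lora_params = n_layers * 4 * 2 * r * d_model   # 8,388,608 trainable params
fraction = lora_params / full_params           # ~0.12% of the model
adapter_mib = lora_params * 2 / 2**20          # fp16 adapter file: ~16 MiB
```

So the artifact you ship per task is a ~16 MiB adapter instead of a multi-gigabyte checkpoint.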

Sources

  1. Hu et al. — LoRA: Low-Rank Adaptation of Large Language Models — accessed 2026-04-20
  2. Dettmers et al. — QLoRA: Efficient Finetuning of Quantized LLMs — accessed 2026-04-20
  3. Hugging Face PEFT documentation — accessed 2026-04-20