
In-Context Learning vs Fine-Tuning

Two canonical ways to make an LLM do your task. In-context learning (ICL) stuffs examples into the prompt at inference time: fast to iterate, no training run. Fine-tuning changes the model's weights, usually via LoRA or a full fine-tune: slower to set up, cheaper per request, more durable. Pick based on how stable and how high-volume your task is.
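The ICL half of this comparison fits in a few lines: the "training data" is just example pairs assembled into the prompt on every call. A minimal sketch; the classification task, examples, and labels below are hypothetical.

```python
# In-context learning sketch: adaptation lives in the prompt, not the weights.
# The sentiment task and these example pairs are illustrative, not from any dataset.
EXAMPLES = [
    ("The battery died in two days.", "negative"),
    ("Setup took thirty seconds. Love it.", "positive"),
]

def build_few_shot_prompt(query: str) -> str:
    """Assemble a few-shot classification prompt from in-line examples."""
    lines = ["Classify the sentiment of each review as positive or negative.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The model is expected to complete the final, unlabelled slot.
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

print(build_few_shot_prompt("Stopped working after a week."))
```

Note the cost structure this implies: every example ships with every request, which is exactly the "long prompts every call" row in the table below.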

Side-by-side

Criterion              | Fine-Tuning                                             | In-Context Learning
Where adaptation lives | Model weights (persistent)                              | Prompt (ephemeral)
Setup cost             | Training compute + data curation                        | Prompt engineering only
Inference cost         | Same as base model (smaller prompts)                    | Higher: long prompts on every call
Iteration speed        | Hours to days per experiment                            | Minutes per experiment
Data requirement       | Hundreds to thousands of labelled examples              | A handful of examples
Behaviour stability    | Very high                                               | Depends on prompt; drift-prone
Safety / bias risk     | Bakes your data into weights                            | Contained per call; easier to audit
Best fit               | Stable, high-volume tasks with style/output constraints | Prototyping, low-volume, or rapidly changing tasks

Verdict

Start with in-context learning — it's cheaper, faster, and tells you whether the task is learnable at all. Move to fine-tuning when three things are true: the task is stable, your prompts are long and expensive, and you have enough clean labelled data to beat a well-crafted prompt. Prompt engineering + evals is almost always the right first move; fine-tuning is an optimisation, not a default.
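The "long and expensive prompts" condition can be made concrete with a back-of-envelope break-even: fine-tuning pays off once the per-call token saving has repaid the one-off training cost. Every number here (token price, prompt sizes, training spend) is an illustrative assumption, not real vendor pricing.

```python
# Back-of-envelope break-even between ICL and fine-tuning.
# All figures are illustrative assumptions for the sake of the arithmetic.
PRICE_PER_1K_INPUT_TOKENS = 0.0005  # assumed $ per 1K input tokens
ICL_PROMPT_TOKENS = 3_000           # long few-shot prompt sent on every call
FT_PROMPT_TOKENS = 300              # short prompt once behaviour lives in the weights
FT_TRAINING_COST = 500.0            # assumed one-off training spend, $

def cost_per_call(prompt_tokens: int) -> float:
    """Input-token cost of a single request."""
    return prompt_tokens / 1_000 * PRICE_PER_1K_INPUT_TOKENS

saving_per_call = cost_per_call(ICL_PROMPT_TOKENS) - cost_per_call(FT_PROMPT_TOKENS)
break_even_calls = FT_TRAINING_COST / saving_per_call
print(f"fine-tuning pays off after ~{break_even_calls:,.0f} calls")
```

Under these assumptions the break-even lands in the hundreds of thousands of calls, which is why the verdict reserves fine-tuning for stable, high-volume tasks.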

When to choose each

Choose Fine-Tuning if…

  • Your task is stable and runs at high volume.
  • You have hundreds to thousands of clean labelled examples.
  • Your prompts are long (style guides, tool schemas, rubrics).
  • You care about consistency and latency at scale.

Choose In-Context Learning if…

  • You're prototyping and don't know what the task should even look like.
  • Your task changes every week.
  • You have a handful of examples, not thousands.
  • You value auditability of the exact instructions sent on each call.

Frequently asked questions

Is fine-tuning obsolete with long context windows?

Not quite. Long context helps ICL a lot, but for high-volume tasks, fine-tuning still wins on cost and latency. For task-specific behaviour that needs to be rock-solid — JSON schemas, safety constraints — fine-tuning remains useful.

When should I use LoRA vs full fine-tuning?

Use LoRA for most adaptation tasks: it's cheap, fast, and easy to serve. Reserve a full fine-tune for when you're changing the model's core behaviour or creating a derivative model for open-weights distribution.
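One way to see why LoRA is the default: it trains a pair of low-rank matrices per projection instead of the full weight matrices, so the trainable-parameter count collapses. A rough count, using illustrative GPT-2-small-like shapes (not tied to any specific model):

```python
# Rough trainable-parameter comparison: full fine-tune vs LoRA.
# Model dimensions below are illustrative assumptions, loosely GPT-2-small-like.
hidden = 768   # hidden size
layers = 12    # transformer layers
rank = 8       # LoRA rank

# Full fine-tuning updates every attention projection weight (q, k, v, o).
full_params_per_layer = 4 * hidden * hidden
# LoRA trains two low-rank factors (hidden x rank and rank x hidden) per projection.
lora_params_per_layer = 4 * 2 * hidden * rank

full_total = layers * full_params_per_layer
lora_total = layers * lora_params_per_layer
print(f"full: {full_total:,} trainable params")
print(f"LoRA: {lora_total:,} trainable params ({lora_total / full_total:.1%})")
```

At these shapes LoRA touches only a few percent of the attention parameters, which is what makes it cheap to train and easy to hot-swap at serving time.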

Which should VSET students learn first?

Prompt engineering and ICL first — they're the foundation. Fine-tuning becomes valuable in year three or four when students have enough labelled data and compute to make it worthwhile.

Sources

  1. OpenAI — fine-tuning guide — accessed 2026-04-20
  2. Anthropic — prompt engineering overview — accessed 2026-04-20