Capability · Framework — fine-tuning
torchtune
torchtune is PyTorch's answer to the fine-tuning ecosystem: a native library that uses standard PyTorch patterns (no Transformers wrapper) with readable 'recipes' for each training setup. It's preferred by teams who want to understand and modify their training code without fighting framework abstractions, and by researchers who need a clean base for novel methods.
Framework facts
- Category: fine-tuning
- Language: Python / PyTorch
- License: BSD-3-Clause
- Repository: https://github.com/pytorch/torchtune
Install
pip install torchtune
Quickstart
# Download model
tune download meta-llama/Llama-3.1-8B --output-dir ./llama3
# Run LoRA recipe with custom config
tune run lora_finetune_single_device \
  --config llama3/8B_lora_single_device \
  dataset.source=my-jsonl-dataset
Alternatives
- TRL — Hugging Face alternative
- Axolotl — config-driven, HF-based
- Unsloth — optimized for single-GPU speed and memory
- LLaMA-Factory — web-UI and config-driven trainer
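The quickstart above runs a LoRA recipe. As a back-of-envelope illustration of why LoRA fine-tuning is cheap, here is a pure-Python parameter count for a single linear layer; the dimensions and rank are illustrative assumptions, not torchtune defaults:

```python
# Trainable parameters: LoRA adapter vs. full fine-tuning of one linear layer.
# Dimensions below are illustrative, not torchtune defaults.

def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """LoRA trains two small matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * d_in + d_out * rank

d_in = d_out = 4096                          # a typical attention projection size
full = d_in * d_out                          # weights updated in full fine-tuning
lora = lora_params(d_in, d_out, rank=16)     # weights updated with rank-16 LoRA

print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

At rank 16 the adapter trains under 1% of the layer's weights, which is why LoRA recipes fit on a single device.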
Frequently asked questions
Why pick torchtune over TRL?
torchtune has a smaller surface area and avoids the Transformers dependency tree. If you value code clarity or plan to modify training internals, torchtune is easier to fork. TRL has broader model coverage and algorithm support.
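If modifying training internals is the goal, torchtune's CLI can copy a built-in recipe and config into your project for editing. A sketch, reusing the recipe and config names from the quickstart (local file names are hypothetical):

```shell
# Copy the built-in recipe and its config locally for editing
tune cp lora_finetune_single_device my_lora_recipe.py
tune cp llama3/8B_lora_single_device my_config.yaml

# Run the customized copies
tune run my_lora_recipe.py --config my_config.yaml
```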
Does it scale to multi-node?
Yes — torchtune supports FSDP and tensor parallelism out of the box using PyTorch's native distributed primitives, which many practitioners find simpler to debug than Accelerate/DeepSpeed stacks.
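A multi-GPU launch is the same `tune run` command with torchrun-style flags in front of a distributed recipe. A sketch; the recipe and config names here are assumptions patterned on the quickstart:

```shell
# Launch the distributed LoRA recipe across 4 GPUs on one node
tune run --nnodes 1 --nproc_per_node 4 lora_finetune_distributed \
  --config llama3/8B_lora
```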
Sources
- torchtune — docs — accessed 2026-04-20
- torchtune on GitHub — accessed 2026-04-20