Capability · Framework — fine-tuning

torchtune

torchtune is PyTorch's native fine-tuning library: it builds on standard PyTorch patterns (no Transformers wrapper) and ships readable, hackable 'recipes' for each training setup. It's preferred by teams who want to understand and modify their training code without fighting framework abstractions, and by researchers who need a clean base for novel methods.

Framework facts

Category
fine-tuning
Language
Python / PyTorch
License
BSD-3-Clause
Repository
https://github.com/pytorch/torchtune

Install

pip install torchtune
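
torchtune tracks recent PyTorch releases, and its install docs recommend installing torch first so you can pick the right CUDA build; a sketch of that sequence (torchvision/torchao are optional companions, not hard requirements of every recipe):

```shell
# Install PyTorch first (choose the CUDA/CPU build for your system), then torchtune
pip install torch torchvision torchao
pip install torchtune
```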

Quickstart

# Download model weights (Llama weights are gated: authenticate first
# with `huggingface-cli login` or pass --hf-token)
tune download meta-llama/Llama-3.1-8B --output-dir ./llama3

# Run the LoRA recipe; dotted keys on the command line override config values
tune run lora_finetune_single_device \
    --config llama3_1/8B_lora_single_device \
    dataset.source=my-jsonl-dataset
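
The trailing `dataset.source=...` argument is a dotted-key override: it reaches into the nested YAML config and replaces one value. torchtune implements this with OmegaConf; the snippet below is a plain-Python illustration of the merge semantics, not torchtune's own code:

```python
# Illustration of OmegaConf-style dotted-key overrides ("a.b.c=value"),
# as used by `tune run ... key.subkey=value`. Not torchtune's implementation.

def apply_overrides(config: dict, overrides: list[str]) -> dict:
    """Apply 'a.b.c=value' style overrides to a nested dict, in place."""
    for item in overrides:
        dotted_key, _, value = item.partition("=")
        *parents, leaf = dotted_key.split(".")
        node = config
        for key in parents:
            node = node.setdefault(key, {})  # descend, creating levels as needed
        node[leaf] = value
    return config

cfg = {"dataset": {"source": "alpaca", "split": "train"}, "batch_size": 2}
apply_overrides(cfg, ["dataset.source=my-jsonl-dataset", "optimizer.lr=2e-5"])
print(cfg["dataset"]["source"])  # my-jsonl-dataset
```

Keys that don't exist yet (like `optimizer.lr` above) are created on the fly, which is why a single recipe config can be specialized per run without editing the YAML file.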

Alternatives

  • TRL — Hugging Face's trainer library; broader model and algorithm coverage
  • Axolotl — YAML-config-driven, built on the Hugging Face stack
  • Unsloth — custom kernels optimized for fast single-GPU training
  • LLaMA-Factory — config- and web-UI-driven fine-tuning

Frequently asked questions

Why pick torchtune over TRL?

torchtune has a smaller surface area and avoids the Transformers dependency tree. If you value code clarity or plan to modify training internals, torchtune is easier to fork. TRL has broader model coverage and algorithm support.

Does it scale to multi-node?

Yes — torchtune supports FSDP and tensor parallelism out of the box using PyTorch's native distributed primitives, which many practitioners find simpler to debug than Accelerate/DeepSpeed stacks.
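
`tune run` wraps torchrun, so distributed recipes accept the usual torchrun flags. A sketch of a single-node, multi-GPU LoRA run (config name assumed from the Llama 3.1 family; scale out with torchrun's multi-node flags):

```shell
# Single-node, 4-GPU distributed LoRA fine-tune; `tune run` forwards torchrun flags
tune run --nproc_per_node 4 lora_finetune_distributed \
    --config llama3_1/8B_lora
```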

Sources

  1. torchtune — docs — accessed 2026-04-20
  2. torchtune on GitHub — accessed 2026-04-20