Capability · Framework — fine-tuning
Accelerate (Hugging Face)
Accelerate is the glue between PyTorch and every distributed backend that matters: DDP, FSDP, DeepSpeed, Megatron-LM, TPU (XLA), and mixed precision. You write a standard PyTorch loop, Accelerate wraps your model, optimizer, and dataloader, and the same script runs on anything from a MacBook to a 512-GPU cluster. It's the default backbone for Transformers Trainer, TRL, PEFT, and Diffusers training.
Framework facts
- Category
- fine-tuning
- Language
- Python
- License
- Apache-2.0
- Repository
- https://github.com/huggingface/accelerate
Install
pip install accelerate
Quickstart
from accelerate import Accelerator
accelerator = Accelerator(mixed_precision='bf16')
model, optimizer, train_dl = accelerator.prepare(model, optimizer, train_dl)
for batch in train_dl:
    optimizer.zero_grad()
    outputs = model(**batch)
    accelerator.backward(outputs.loss)
    optimizer.step()
# → accelerate launch train.py for multi-GPU
Alternatives
- Lightning Fabric — the same thin-wrapper idea, from the Lightning ecosystem
- DeepSpeed — can also be used natively, without Accelerate in between
- Megatron-LM — tensor/pipeline parallelism for very large models
Frequently asked questions
Do I still need DeepSpeed?
Accelerate can launch DeepSpeed — you get DeepSpeed's memory optimisations (ZeRO-3, CPU offload) behind the same API. You rarely need to talk to DeepSpeed directly anymore.
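For a sense of what that looks like, here is a minimal sketch of an `accelerate config`-style YAML enabling ZeRO-3 with CPU offload. The field names follow Accelerate's config format; the specific values (process count, offload targets) are illustrative assumptions, not recommendations:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3                      # ZeRO-3: shard params, grads, and optimizer state
  offload_optimizer_device: cpu      # push optimizer state to CPU RAM
  offload_param_device: cpu          # push sharded params to CPU RAM
  gradient_accumulation_steps: 1
mixed_precision: bf16
num_processes: 8                     # illustrative: one process per GPU
```

With a config like this saved, the same unmodified training script runs under DeepSpeed via `accelerate launch train.py`.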
Accelerate or FSDP?
FSDP is a PyTorch feature Accelerate wraps. Start with `accelerate config` and pick FSDP; you get sensible defaults and avoid the boilerplate.
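As a rough sketch, an FSDP answer from `accelerate config` could look like the YAML below. Key names match Accelerate's FSDP config section; the chosen strategy and wrap policy are illustrative assumptions for a transformer model:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD           # shard params, grads, optimizer state
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP  # wrap each transformer block
  fsdp_state_dict_type: SHARDED_STATE_DICT     # checkpoint shards instead of a full state dict
mixed_precision: bf16
num_processes: 4                               # illustrative: one process per GPU
```

Because FSDP lives inside PyTorch itself, this path adds no extra dependency beyond Accelerate.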
Sources
- Accelerate docs — accessed 2026-04-20
- Accelerate GitHub — accessed 2026-04-20