Capability · Framework — fine-tuning
Accelerate (Hugging Face)
Accelerate is the glue between PyTorch and every distributed backend that matters: DDP, FSDP, DeepSpeed, Megatron-LM, TPU (XLA), and mixed precision. You write a standard PyTorch loop, Accelerate wraps your model, optimizer, and dataloader, and the same script runs on anything from a MacBook to a 512-GPU cluster. It's the default backbone for Transformers Trainer, TRL, PEFT, and Diffusers training.
Framework facts
- Category
- fine-tuning
- Language
- Python
- License
- Apache-2.0
- Repository
- https://github.com/huggingface/accelerate
Install
pip install accelerate
Quickstart
from accelerate import Accelerator
accelerator = Accelerator(mixed_precision='bf16')
model, optimizer, train_dl = accelerator.prepare(model, optimizer, train_dl)
for batch in train_dl:
    optimizer.zero_grad()
    outputs = model(**batch)
    accelerator.backward(outputs.loss)
    optimizer.step()
# → accelerate launch train.py for multi-GPU
Alternatives
- Lightning Fabric — the same thin-wrapper idea, from the Lightning ecosystem
- DeepSpeed — can also be used natively, without Accelerate in between
- Megatron-LM — tensor/pipeline parallelism for very large models
Frequently asked questions
Do I still need DeepSpeed?
Accelerate can launch DeepSpeed — you get DeepSpeed's memory optimisations (ZeRO-3, CPU offload) behind the same API. You rarely need to talk to DeepSpeed directly anymore.
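For a sense of what that looks like, here is a minimal sketch of an `accelerate config`-style YAML enabling ZeRO-3 with CPU offload. The field names follow Accelerate's config format; the specific values (process count, offload targets) are illustrative assumptions, not recommendations:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 3                      # ZeRO-3: shard params, grads, and optimizer state
  offload_optimizer_device: cpu      # push optimizer state to CPU RAM
  offload_param_device: cpu          # push sharded params to CPU RAM
  gradient_accumulation_steps: 1
mixed_precision: bf16
num_processes: 8                     # illustrative: one process per GPU
```

With a config like this saved, the same unmodified training script runs under DeepSpeed via `accelerate launch train.py`.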
Accelerate or FSDP?
FSDP is a PyTorch feature Accelerate wraps. Start with `accelerate config` and pick FSDP; you get sensible defaults and avoid the boilerplate.
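As a rough sketch, an FSDP answer from `accelerate config` could look like the YAML below. Key names match Accelerate's FSDP config section; the chosen strategy and wrap policy are illustrative assumptions for a transformer model:

```yaml
compute_environment: LOCAL_MACHINE
distributed_type: FSDP
fsdp_config:
  fsdp_sharding_strategy: FULL_SHARD           # shard params, grads, optimizer state
  fsdp_auto_wrap_policy: TRANSFORMER_BASED_WRAP  # wrap each transformer block
  fsdp_state_dict_type: SHARDED_STATE_DICT     # checkpoint shards instead of a full state dict
mixed_precision: bf16
num_processes: 4                               # illustrative: one process per GPU
```

Because FSDP lives inside PyTorch itself, this path adds no extra dependency beyond Accelerate.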
Sources
- Accelerate docs — accessed 2026-04-20
- Accelerate GitHub — accessed 2026-04-20