Colossal-AI
Colossal-AI is the deep-learning training system from HPC-AI Tech (Singapore). Like DeepSpeed, it focuses on scaling very large models, with its own implementations of tensor parallelism, pipeline parallelism, sequence parallelism, and ZeRO-style sharding. Its Gemini heterogeneous memory manager offloads tensors to CPU and NVMe at finer granularity than whole-module offload, and Colossal-Chat was one of the first open reproductions of ChatGPT's RLHF pipeline.
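To illustrate the offload claim, here is a hedged sketch of how Gemini offload is typically configured. The keyword names (`offload_optim_frac`, `offload_param_frac`) follow recent ColossalAI releases and may differ in older versions; verify against your installed version's `GeminiPlugin` documentation.

```python
from colossalai.booster.plugin import GeminiPlugin

# "static" placement with explicit offload fractions keeps part of the
# optimizer state and parameters in CPU memory instead of GPU memory.
# These keyword names are assumptions based on recent releases.
plugin = GeminiPlugin(
    placement_policy="static",
    offload_optim_frac=1.0,  # push all optimizer state to CPU
    offload_param_frac=0.5,  # keep half the parameters on CPU
)
```

With `placement_policy="auto"` (as in the Quickstart below), Gemini instead decides tensor placement dynamically based on observed memory pressure.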
Framework facts
- Category: fine-tuning
- Language: Python / CUDA
- License: Apache-2.0
- Repository: https://github.com/hpcaitech/ColossalAI
Install
```shell
pip install colossalai
```
Quickstart
```python
import colossalai
from colossalai.booster import Booster
from colossalai.booster.plugin import GeminiPlugin

# Initialize distributed state from the env vars set by torchrun.
# Note: recent releases also accept launch_from_torch() with no config argument.
colossalai.launch_from_torch(config={})

plugin = GeminiPlugin(placement_policy='auto')
booster = Booster(plugin=plugin)

# `model`, `optimizer`, and `dataloader` are your own objects; boost()
# wraps them so the plugin can manage placement and sharding.
model, optimizer, _, dataloader, _ = booster.boost(model, optimizer, dataloader=dataloader)
```
Alternatives
- DeepSpeed
- Megatron-LM
- FSDP
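Once boosted, a training loop differs from plain PyTorch mainly in the backward call, which goes through the booster so the plugin can handle sharded or offloaded gradients. A minimal sketch, assuming `model`, `optimizer`, `dataloader`, and `booster` come from the Quickstart above, and `criterion` is a hypothetical loss function of your choosing:

```python
# One epoch of training with boosted objects (sketch, not runnable
# standalone: requires a colossalai.launch_from_torch() environment).
for batch, labels in dataloader:
    outputs = model(batch)
    loss = criterion(outputs, labels)
    # booster.backward replaces loss.backward() so the plugin can
    # intercept gradient computation (e.g. for Gemini's chunked state).
    booster.backward(loss, optimizer)
    optimizer.step()
    optimizer.zero_grad()
```

Checkpointing likewise goes through the booster (`booster.save_model(...)`) rather than `torch.save`, since model state may be sharded across ranks.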
Frequently asked questions
Colossal-AI or DeepSpeed?
DeepSpeed has a wider user base and tighter integration with Hugging Face Transformers. Colossal-AI often comes out ahead in like-for-like benchmarks for tensor-parallel-heavy models and in heterogeneous-memory (Gemini) scenarios. Benchmark both on your own workload before committing.
Is Colossal-Chat still relevant?
As reference code, yes: it walks through the full SFT, reward-model, and PPO pipeline. For production RLHF or DPO, most teams now use TRL or veRL, which are more actively maintained.
Sources
- Colossal-AI docs — accessed 2026-04-20
- Colossal-AI GitHub — accessed 2026-04-20