Modal
Modal lets you take any Python function, add a decorator, and run it on managed GPUs (A100, H100, H200, B200) in the cloud with per-second billing. It's widely used for LLM fine-tuning jobs, batch inference, embedding pipelines, and hosting OpenAI-compatible endpoints without writing Dockerfiles or YAML.
Framework facts
- Category: fine-tuning
- Language: Python
- License: Proprietary (open-source client)
- Repository: https://github.com/modal-labs/modal-client
Install
pip install modal
modal setup
Quickstart
import modal
app = modal.App('finetune')
image = modal.Image.debian_slim().pip_install('axolotl')
@app.function(image=image, gpu='H100', timeout=60*60*4)
def train():
    import subprocess
    subprocess.run(['accelerate', 'launch', 'train.py'], check=True)

@app.local_entrypoint()
def main():
    train.remote()
Run the entrypoint from your machine with modal run finetune.py (assuming the script is saved as finetune.py); Modal builds the image, provisions the GPU, and streams logs back locally.
Alternatives
- Replicate — model-centric hosting
- Runpod — GPU marketplace
- Baseten — model serving platform
- SkyPilot — multi-cloud launcher
Frequently asked questions
What is Modal best at?
GPU workloads with bursty demand — fine-tuning, batch embeddings, scheduled eval jobs — where per-second billing and fast cold starts matter. Also great for serving FastAPI endpoints backed by GPU inference.
Can I run Modal on my own infra?
No. The runtime is a proprietary managed service. The Python client is open source and you can abstract it behind interfaces if portability matters.
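The abstraction advice above can be sketched with a structural Protocol. This is a minimal illustration, not a Modal API: TrainingBackend, LocalBackend, and launch are hypothetical names, and the Modal-backed variant is only described in a comment.

```python
from typing import Protocol


class TrainingBackend(Protocol):
    """Anything that can run a fine-tuning job and return an artifact path."""

    def run(self, config_path: str) -> str: ...


class LocalBackend:
    """Stub backend for tests and local dry runs."""

    def run(self, config_path: str) -> str:
        # Pretend we trained and wrote artifacts locally.
        return f"local-artifacts/{config_path}"


def launch(backend: TrainingBackend, config_path: str) -> str:
    # Call sites depend only on the Protocol. A Modal-backed
    # implementation would satisfy the same interface (e.g. invoking a
    # modal.Function via .remote() inside run()), so only the
    # composition root imports modal and the rest stays portable.
    return backend.run(config_path)


print(launch(LocalBackend(), "axolotl.yml"))  # local-artifacts/axolotl.yml
```

Swapping the stub for a Modal-backed class (or a SkyPilot- or Runpod-backed one) then requires no changes at call sites.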
Sources
- Modal — docs — accessed 2026-04-20
- Modal — GitHub — accessed 2026-04-20