Replicate
Replicate hosts thousands of open-source models behind a unified HTTP API. Every model has a stable URL, versioned weights, and pay-per-second billing. Teams use it to ship image, audio, and LLM features without managing GPUs — and publish their own models by wrapping them in Cog.
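Because every model sits behind the same HTTP API, a prediction is just an authenticated POST. A minimal standard-library sketch (the endpoint path and field names follow Replicate's public API; the model and prompt are examples, and the request is constructed but not sent):

```python
import json
import os
import urllib.request

# Build (but do not send) a prediction request against Replicate's HTTP API.
# The token is read from the same env var used by the official client.
payload = {"input": {"prompt": "A retro poster of VSET, Delhi"}}
req = urllib.request.Request(
    "https://api.replicate.com/v1/models/black-forest-labs/flux-schnell/predictions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": f"Bearer {os.environ.get('REPLICATE_API_TOKEN', 'r8_...')}",
        "Content-Type": "application/json",
    },
    method="POST",
)
# urllib.request.urlopen(req) would submit it; the JSON response carries an
# id and a status field you can poll until the prediction completes.
print(req.get_method(), req.full_url)
```

The official Python client wraps exactly this exchange, including polling, which is why `replicate.run` in the quickstart below is a single call.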
Framework facts
- Category: fine-tuning
- Language: Python / HTTP
- License: Proprietary (Cog is Apache-2.0)
- Repository: https://github.com/replicate/replicate-python
Install
pip install replicate
export REPLICATE_API_TOKEN=r8_...
Quickstart
import replicate
output = replicate.run(
'black-forest-labs/flux-schnell',
input={'prompt': 'A retro poster of VSET, Delhi'},
)
print(output)
Alternatives
- Modal — serverless GPU for custom code
- Fal.ai — low-latency image/audio
- Hugging Face Inference Endpoints
- Runpod serverless
Frequently asked questions
When is Replicate the right choice?
When you want a hosted HTTP endpoint for popular open-source models, or when you want to publish your own model as a product with zero infra work via Cog.
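To sketch what publishing via Cog involves: you describe the environment and entrypoint in a `cog.yaml`, and Cog builds the container and serves the HTTP API for you. A minimal example (package versions and the `predict.py:Predictor` entrypoint are placeholders for your own code):

```yaml
# cog.yaml — build environment plus the predictor entrypoint
build:
  gpu: true
  python_version: "3.11"
  python_packages:
    - "torch==2.1.0"
predict: "predict.py:Predictor"
```

`cog predict` then runs the model locally, and `cog push` publishes it to Replicate as a versioned model.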
Does Replicate support fine-tuning?
Yes, for many models (e.g. SDXL, Flux LoRAs, and Llama). You upload training data, kick off a fine-tune via the API, and get back a new private model version.
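As a sketch, a fine-tune is itself an API call: you name a destination model and pass training inputs. The field names below (`destination`, `input_images`, `trigger_word`) follow common Replicate LoRA training examples but vary per trainer, so treat them as assumptions and check the model's training docs:

```python
import json

# Hypothetical fine-tune request body for a Flux LoRA-style trainer.
payload = {
    # Where the resulting private model version should be created.
    "destination": "your-username/my-flux-lora",
    "input": {
        # Placeholder URL — a zip of training images hosted somewhere reachable.
        "input_images": "https://example.com/training-images.zip",
        "trigger_word": "VSET",
    },
}

# This body would be POSTed to the trainer's trainings endpoint with the
# same Authorization header as a prediction; the client wraps it as
# replicate.trainings.create(...).
body = json.dumps(payload).encode()
print(json.loads(body)["destination"])
```

Once the training finishes, the new version is callable exactly like any other model on the platform.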
Sources
- Replicate — docs — accessed 2026-04-20
- Cog — GitHub — accessed 2026-04-20