GPT-3.5 Turbo
GPT-3.5 Turbo was the model behind the first wave of the generative-AI gold rush: the default ChatGPT model at launch in late 2022, and the cheap API workhorse that powered thousands of startups in 2023-24. With a 16K context window, function calling, and consistently low latency, it still appears in legacy stacks and classroom demos.
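A typical call to the model goes through OpenAI's Chat Completions endpoint. The sketch below only assembles the JSON request body, including a function-calling tool definition; the `get_weather` tool is a hypothetical example for illustration, not part of any real API.

```python
import json

def build_request(user_message: str) -> dict:
    """Assemble the JSON body sent to POST /v1/chat/completions."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": "You are a concise assistant."},
            {"role": "user", "content": user_message},
        ],
        # Function calling: the model may answer with a structured
        # tool call instead of free text.
        "tools": [
            {
                "type": "function",
                "function": {
                    "name": "get_weather",  # hypothetical tool
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                },
            }
        ],
    }

body = build_request("What's the weather in Oslo?")
print(json.dumps(body)[:40])
```

In a live integration this dict would be passed to the official SDK or POSTed with an API key; the shape above is what the endpoint expects for a tool-enabled chat request.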
Model specs
- Vendor: OpenAI
- Family: GPT-3.5
- Released: 2022-11
- Context window: 16,385 tokens
- Modalities: text
- Input price: $0.50 / M tokens
- Output price: $1.50 / M tokens
- Pricing as of: 2026-04-20
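The listed prices make cost estimation a one-liner. A minimal sketch, using the rates above ($0.50 per million input tokens, $1.50 per million output tokens):

```python
INPUT_PRICE_PER_M = 0.50   # USD per million input tokens
OUTPUT_PRICE_PER_M = 1.50  # USD per million output tokens

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${estimate_cost(2_000, 500):.6f}")  # $0.001750
```

At these rates even a million-request workload with short prompts costs only a few hundred dollars, which is why the model stuck around in high-volume pipelines.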
Strengths
- Extremely low price per token
- Fast, predictable latency
- Mature SDK and tooling support
Limitations
- Reasoning quality far below GPT-4o mini or Claude 3.5 Haiku
- No vision or audio modalities
- OpenAI no longer recommends GPT-3.5 Turbo for new development
Use cases
- Legacy chatbot and classification pipelines
- Classroom demos of the original ChatGPT experience
- Non-critical background summarisation jobs
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈70% | 2023-06 |
| HumanEval | ≈48% | 2023-06 |
Frequently asked questions
What is GPT-3.5 Turbo?
GPT-3.5 Turbo is OpenAI's original production-grade chat model, first shipped in late 2022 as the engine behind ChatGPT, and widely used via the API throughout 2023-24 for cheap, fast text generation.
Is GPT-3.5 Turbo deprecated?
OpenAI still serves GPT-3.5 Turbo endpoints for legacy apps, but recommends new projects use GPT-4o mini or GPT-5 nano, which are cheaper and significantly more capable.
What context window does GPT-3.5 Turbo have?
Later revisions of GPT-3.5 Turbo support a 16,385-token context window; earlier snapshots used 4,096 or 8,192 tokens.
Sources
- OpenAI — GPT-3.5 Turbo docs — accessed 2026-04-20
- OpenAI — ChatGPT announcement — accessed 2026-04-20