Curiosity · AI Model

GPT-3.5 Turbo

GPT-3.5 Turbo was the model behind the first wave of the generative-AI gold rush: the ChatGPT default in late 2022, and the cheap API workhorse that powered thousands of startups in 2023-24. With a 16K context window, function calling, and low latency, it still appears in legacy stacks and classroom demos.
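A minimal call follows OpenAI's Chat Completions request shape. The sketch below builds the JSON payload offline rather than performing a network call; the model name and endpoint are the documented ones, while the helper function and its defaults are illustrative.

```python
import json

API_URL = "https://api.openai.com/v1/chat/completions"  # documented Chat Completions endpoint

def build_chat_request(user_message: str,
                       system_prompt: str = "You are a helpful assistant.") -> dict:
    """Assemble a Chat Completions payload for gpt-3.5-turbo (no network call made here)."""
    return {
        "model": "gpt-3.5-turbo",
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.7,
    }

payload = build_chat_request("Summarise the plot of Hamlet in one sentence.")
print(json.dumps(payload, indent=2))
```

In a real application this payload would be POSTed to the endpoint with an `Authorization: Bearer <key>` header, or sent via the official SDK.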

Model specs

Vendor
OpenAI
Family
GPT-3.5
Released
2022-11
Context window
16,385 tokens
Modalities
text
Input price
$0.50/M tok
Output price
$1.50/M tok
Pricing as of
2026-04-20
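At the listed per-million-token rates ($0.50 input, $1.50 output), per-request cost is simple arithmetic, as in this sketch (the function name is illustrative):

```python
# Rates from the pricing table above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 1.50

def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    return (input_tokens / 1_000_000) * INPUT_PRICE_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M

# e.g. a 2,000-token prompt with a 500-token reply:
print(f"${estimate_cost(2_000, 500):.6f}")  # → $0.001750
```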

Strengths

  • Extremely low price per token
  • Fast, predictable latency
  • Mature SDK and tooling support

Limitations

  • Reasoning quality far below GPT-4o mini or Claude 3.5 Haiku
  • No vision or audio modalities
  • OpenAI no longer recommends GPT-3.5 Turbo for new development

Use cases

  • Legacy chatbot and classification pipelines
  • Classroom demos of the original ChatGPT experience
  • Non-critical background summarisation jobs
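For the classification pipelines mentioned above, the usual pattern is to constrain the model to a fixed label set in the prompt and then validate its free-text reply. A minimal sketch, assuming a hypothetical sentiment task and label set:

```python
LABELS = ("positive", "negative", "neutral")

def classification_messages(text: str) -> list:
    """Build a prompt asking for exactly one label from LABELS (illustrative wording)."""
    return [
        {"role": "system",
         "content": f"Classify the user's text. Reply with exactly one of: {', '.join(LABELS)}."},
        {"role": "user", "content": text},
    ]

def parse_label(reply: str) -> str:
    """Normalise a model reply to a known label, falling back to 'neutral' if unrecognised."""
    cleaned = reply.strip().lower().rstrip(".")
    return cleaned if cleaned in LABELS else "neutral"

print(parse_label("Positive."))  # → positive
```

Validating the reply this way matters with GPT-3.5 Turbo, which is less reliable than newer models at following strict output-format instructions.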

Benchmarks

Benchmark    Score   As of
MMLU         ≈70%    2023-06
HumanEval    ≈48%    2023-06

Frequently asked questions

What is GPT-3.5 Turbo?

GPT-3.5 Turbo is OpenAI's original production-grade chat model, first shipped in late 2022 as the engine behind ChatGPT, and widely used via the API throughout 2023-24 for cheap, fast text generation.

Is GPT-3.5 Turbo deprecated?

OpenAI still serves GPT-3.5 Turbo endpoints for legacy apps, but recommends new projects use GPT-4o mini or GPT-5 nano, which are cheaper and significantly more capable.

What context window does GPT-3.5 Turbo have?

Later revisions of GPT-3.5 Turbo support a 16,385-token context window; the original snapshots were limited to 4,096 tokens, with a separate 16k variant offered before the larger window became the default.
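A pre-flight check against the 16,385-token window can be sketched as below. The ~4-characters-per-token heuristic is a common rough rule of thumb for English text, not an exact count; production code would use a real tokenizer such as tiktoken.

```python
CONTEXT_WINDOW = 16_385  # tokens, per the later gpt-3.5-turbo revisions

def rough_token_count(text: str) -> int:
    """Very rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, reserved_for_output: int = 1_024) -> bool:
    """Check whether a prompt leaves room for the reply within the context window."""
    return rough_token_count(prompt) + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("hello " * 100))  # → True
```

Reserving a token budget for the reply matters because the window covers prompt and completion combined.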

Sources

  1. OpenAI — GPT-3.5 Turbo docs — accessed 2026-04-20
  2. OpenAI — ChatGPT announcement — accessed 2026-04-20