GPT-5 Thinking

GPT-5 Thinking is the 'slow, careful' sibling of GPT-5. It runs a native chain-of-thought loop with self-verification, supports longer tool-use chains, and scores substantially higher on frontier math and agentic benchmarks, at the cost of several seconds of extra latency per reply.

Model specs

Vendor: OpenAI
Family: GPT-5
Released: 2025-09
Context window: 400,000 tokens
Modalities: text, vision, code
Input price: $15/M tok
Output price: $60/M tok
Pricing as of: 2026-04-20

Strengths

State-of-the-art scores on AIME 2025 (≈96%), GPQA Diamond (≈84%), and SWE-bench Verified (≈72%); scores as of 2026-02
Self-verification loops reduce confident errors
Handles long tool chains without losing context

Limitations

Much slower than GPT-5 or GPT-5 mini for interactive UIs
Higher effective cost per response, because it generates more tokens per reply
Chain-of-thought is not fully exposed to the caller

Use cases

Frontier math and physics problem solving
Long-horizon agent planning with tool use
Research-grade code review and refactor planning
High-stakes decisions where answer quality outweighs latency

Benchmarks

Benchmark	Score	As of
AIME 2025	≈96%	2026-02
GPQA Diamond	≈84%	2026-02
SWE-bench Verified	≈72%	2026-02

Frequently asked questions

What is GPT-5 Thinking?

GPT-5 Thinking is a deliberate-reasoning mode of OpenAI's GPT-5 model. It spends additional inference tokens self-verifying, planning tool use, and checking answers, yielding stronger scores on frontier benchmarks at higher latency and cost.

When should I use GPT-5 Thinking instead of GPT-5?

Choose GPT-5 Thinking for hard math and science problems, long agent runs, or any workload where answer quality matters more than latency. Default to regular GPT-5 for interactive chat and routine generation.
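
The rule of thumb above can be sketched as a small routing function. This is only an illustration under stated assumptions, not an OpenAI API: the `Task` fields and the labels "gpt-5-thinking" and "gpt-5" are hypothetical names chosen for this example and may not match real API model identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical description of a workload; field names are illustrative."""
    hard_math_or_science: bool = False   # frontier math or science problem
    long_agent_run: bool = False         # long-horizon tool-use chain
    quality_over_latency: bool = False   # answer quality matters more than speed


def choose_model(task: Task) -> str:
    """Return a model label following the guidance in the answer above."""
    if task.hard_math_or_science or task.long_agent_run or task.quality_over_latency:
        return "gpt-5-thinking"  # hypothetical label, not a confirmed API name
    # Interactive chat and routine generation default to the regular model.
    return "gpt-5"               # hypothetical label, not a confirmed API name


print(choose_model(Task(long_agent_run=True)))  # prints: gpt-5-thinking
print(choose_model(Task()))                     # prints: gpt-5
```

In practice the three flags would come from whatever signals an application already has (task type, expected number of tool calls, latency budget); the sketch only encodes the decision itself.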

How is GPT-5 Thinking priced?

GPT-5 Thinking is billed at GPT-5's base per-token rates, but its effective cost per response is higher because it generates more tokens per response. OpenAI publishes the current rates on its pricing page.
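
To make the effect of extra tokens concrete, here is a rough cost estimate using the rates listed under Model specs ($15/M input tokens, $60/M output tokens). It assumes, purely for illustration, that reasoning tokens are billed at the output rate; the token counts are invented examples, and the `Usage` type and `estimate_cost_usd` helper are hypothetical names, not part of any OpenAI SDK.

```python
from dataclasses import dataclass

# Rates copied from the Model specs list (USD per million tokens).
INPUT_USD_PER_MTOK = 15.0
OUTPUT_USD_PER_MTOK = 60.0


@dataclass(frozen=True)
class Usage:
    """Hypothetical token counts for one response."""
    input_tokens: int
    visible_output_tokens: int
    reasoning_tokens: int = 0  # assumption: billed at the output rate


def estimate_cost_usd(usage: Usage) -> float:
    """Estimate the cost of one response in USD under the assumptions above."""
    billed_output = usage.visible_output_tokens + usage.reasoning_tokens
    total = (usage.input_tokens * INPUT_USD_PER_MTOK
             + billed_output * OUTPUT_USD_PER_MTOK)
    return total / 1_000_000


# Same prompt and same visible answer, with and without extra reasoning tokens.
plain = Usage(input_tokens=2_000, visible_output_tokens=500)
thinking = Usage(input_tokens=2_000, visible_output_tokens=500, reasoning_tokens=4_000)

print(f"{estimate_cost_usd(plain):.2f}")     # prints: 0.06
print(f"{estimate_cost_usd(thinking):.2f}")  # prints: 0.30
```

With identical per-token rates, the second response costs five times as much simply because 4,000 additional tokens are generated. Actual billing rules should be checked against OpenAI's pricing page.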

Sources

OpenAI — GPT-5 — accessed 2026-04-20
OpenAI — Pricing — accessed 2026-04-20