OpenAI o3
OpenAI o3 is the reasoning model OpenAI released in April 2025, raising the bar for the "thinking-model" category. Unlike o1, o3 can use tools inside its chain of thought (running Python, browsing the web, and analysing images as part of its reasoning), and it posts large gains over o1 on ARC-AGI, SWE-bench, and graduate-level science evaluations.
Model specs
- Vendor: OpenAI
- Family: o-series
- Released: 2025-04
- Context window: 200,000 tokens
- Modalities: text, vision, code
- Input price: $10 per million tokens
- Output price: $40 per million tokens
- Pricing as of: 2026-04-20
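As a rough illustration of how the 200,000-token context window constrains a request, the sketch below checks whether a prompt plus a reserved output budget fits. It assumes that input, visible output, and reasoning tokens all share the same window; the function names and the size of the reserve are hypothetical and are not part of any OpenAI API.

```python
# Context window from the spec list above, in tokens.
CONTEXT_WINDOW = 200_000


def fits_in_context(prompt_tokens: int, reserved_output_tokens: int,
                    window: int = CONTEXT_WINDOW) -> bool:
    """Return True if the prompt plus the reserved output budget fits the window.

    Assumption: reasoning tokens and visible output are both covered by
    `reserved_output_tokens`.
    """
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return prompt_tokens + reserved_output_tokens <= window


def max_prompt_tokens(reserved_output_tokens: int,
                      window: int = CONTEXT_WINDOW) -> int:
    """Largest prompt that still leaves the reserved output budget free."""
    return max(0, window - reserved_output_tokens)


if __name__ == "__main__":
    # Hypothetical budget: keep 60,000 tokens free for reasoning plus the answer.
    print(max_prompt_tokens(60_000))          # 140000
    print(fits_in_context(150_000, 60_000))   # False
```

Reserving a generous output budget matters more for a reasoning model than for a conventional one, because the reasoning itself consumes tokens before any visible answer is produced.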
Strengths
- Tool use inside the chain of thought, which gives a large lift on real-world tasks
- Strong vision reasoning, not just captioning
- Large gains on ARC-AGI over o1, suggesting better generalisation to unfamiliar problems
- Good calibration of effort: it tends to think longer only when the problem calls for it
Limitations
- Still slower and pricier than GPT-5 mini for equivalent production work
- High reasoning-token consumption inflates real cost
- No audio support; pair it with the Realtime API for voice
- Sometimes over-reasons on simple prompts, wasting tokens
Use cases
- Complex coding agents, such as Cursor- or Devin-style autonomous work
- Graduate-level scientific research assistance
- Data analysis agents that need to run code mid-reasoning
- Multi-step web research with image understanding
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| SWE-bench Verified | ≈69% | 2025-04 |
| ARC-AGI (public) | ≈75% | 2025-04 |
| GPQA Diamond | ≈87% | 2025-04 |
Frequently asked questions
What is OpenAI o3?
OpenAI o3 is the April 2025 reasoning model that succeeded o1. It combines long internal reasoning with tool use — running Python, browsing the web, analysing images — during chain of thought, producing much stronger scores on coding, science, and abstract reasoning benchmarks.
How is o3 different from o1?
The two biggest differences are tool use (o3 can run code and browse during reasoning; o1 could not) and vision (o3 can reason over images as part of its chain of thought, whereas the earlier o1-preview was text-only). o3 also posts significantly higher scores across nearly every benchmark.
How much does o3 cost?
As of April 2026, o3 is priced at roughly USD 10 per million input tokens and USD 40 per million output tokens. Reasoning tokens are billed as output tokens even though they do not appear in the visible answer, so real costs are often higher than the headline price suggests.
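To make the effect of reasoning tokens concrete, here is a minimal cost sketch using the prices quoted above. The token counts in the example are invented for illustration, and `Usage` and `estimate_cost_usd` are hypothetical names, not part of any OpenAI SDK.

```python
from dataclasses import dataclass

# Prices quoted in this article, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 40.0


@dataclass
class Usage:
    input_tokens: int
    visible_output_tokens: int
    reasoning_tokens: int  # hidden from the answer, billed at the output rate


def estimate_cost_usd(usage: Usage) -> float:
    """Estimate the cost of one request, counting reasoning tokens as output."""
    billed_output = usage.visible_output_tokens + usage.reasoning_tokens
    total = (usage.input_tokens * INPUT_PRICE_PER_M
             + billed_output * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


# Illustrative request: a short prompt and answer with heavy reasoning.
example = Usage(input_tokens=2_000, visible_output_tokens=500, reasoning_tokens=4_000)
cost = estimate_cost_usd(example)

# The same request if reasoning tokens were ignored (the "headline" estimate).
naive = estimate_cost_usd(Usage(2_000, 500, 0))

if __name__ == "__main__":
    print(f"with reasoning:    ${cost:.2f}")   # $0.20
    print(f"without reasoning: ${naive:.2f}")  # $0.04
```

In this made-up case the reasoning tokens account for most of the bill, which is why per-request usage is a more reliable basis for cost estimates than the length of the visible answer.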
Should I use o3 or GPT-5 in 2026?
GPT-5 is generally recommended for new builds because its unified router handles easy and hard prompts from one model. Choose o3 when you want predictable reasoning behaviour that does not depend on GPT-5's router, or when o3 is available on a platform where GPT-5 is not.
Sources
- OpenAI — Introducing o3 and o4-mini — accessed 2026-04-20
- OpenAI — Pricing — accessed 2026-04-20