Grok 3

Grok 3 is xAI's flagship LLM from February 2025. It precedes Grok 4 and is still widely used through the xAI API. It introduced 'Think' mode, a reasoning scratchpad that lets the model draft and revise an answer before emitting it, and it supports live grounding in X posts for real-time responses.

Model specs

Vendor: xAI
Family: Grok
Released: 2025-02
Context window: 128,000 tokens
Modalities: text, code
Input price: $2/M tok
Output price: $10/M tok
Pricing as of: 2026-04-20
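
The listed prices translate into a per-request cost with simple arithmetic. The sketch below is illustrative only: the helper name estimate_cost_usd is hypothetical, and it assumes every token is billed at the flat per-million rates listed above, with no discounts or other fees.

```python
# Prices copied from the spec list above (USD per million tokens, as of 2026-04-20).
INPUT_USD_PER_MTOK = 2.00
OUTPUT_USD_PER_MTOK = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD (hypothetical helper).

    Assumes flat per-token billing at the listed rates.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: a 10,000-token prompt with a 2,000-token reply.
print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0400
```

In this example the input and output halves each contribute $0.02, which shows how quickly output tokens dominate at a 5x price ratio.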

Strengths

Strong AIME and math scores for its size
Lower price than Grok 4 for many routine tasks
Mature live grounding in the X firehose
'Think' mode exposes reasoning steps for debugging

Limitations

Superseded by Grok 4 for top-tier reasoning tasks
No native vision — use Grok 2 Vision for image input
Smaller context window than Grok 4 (128k vs 256k)

Use cases

Cost-sensitive real-time Q&A with X grounding
Math tutoring and competition-style problem solving
Reasoning-heavy coding assistants
Chat products that need a 'scratchpad' mode

Benchmarks

Benchmark	Score	As of
AIME 2024	~52%	2026-04
MMLU	~87%	2026-04
HumanEval	~86%	2026-04

Frequently asked questions

What is Grok 3?

Grok 3 is xAI's 2025 flagship LLM. It introduced a visible 'Think' reasoning mode and supports live grounding in X (Twitter) posts. It now sits one tier below Grok 4 in xAI's API.

Should I use Grok 3 or Grok 4?

Use Grok 4 for maximum quality and 256k context. Use Grok 3 when latency or cost matter more than raw benchmark leadership — it remains a strong reasoning model.
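
The guidance above can be sketched as a small routing rule. This is a rough illustration, not xAI's recommendation: the function pick_model and the labels "grok-3" and "grok-4" are placeholders rather than confirmed API identifiers, and the sketch assumes the context window must hold both the prompt and the reserved output.

```python
# Context windows as stated on this page.
GROK3_CONTEXT_TOKENS = 128_000
GROK4_CONTEXT_TOKENS = 256_000


def pick_model(prompt_tokens: int, reserved_output_tokens: int,
               need_max_quality: bool = False) -> str:
    """Return a placeholder model label for a request (hypothetical helper)."""
    total = prompt_tokens + reserved_output_tokens
    if total > GROK4_CONTEXT_TOKENS:
        raise ValueError("request exceeds the 256k context window of Grok 4")
    # Prefer the cheaper model unless quality or context size rules it out.
    if need_max_quality or total > GROK3_CONTEXT_TOKENS:
        return "grok-4"
    return "grok-3"


print(pick_model(100_000, 10_000))   # fits in 128k, so the cheaper model
print(pick_model(150_000, 10_000))   # needs the larger context window
```

Defaulting to the cheaper model and escalating only when necessary matches the cost-sensitive use cases listed earlier.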

What is 'Think' mode?

Think mode has Grok write out a reasoning scratchpad before producing the final answer. It improves accuracy on math, logic, and coding problems at the cost of extra output tokens.
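
To get a feel for that token overhead, the sketch below prices a scratchpad at the Grok 3 output rate listed on this page. The token counts are made-up illustrative values, the helper think_overhead_usd is hypothetical, and it assumes scratchpad tokens are billed like any other output token.

```python
OUTPUT_USD_PER_MTOK = 10.00  # Grok 3 output price from the spec list


def think_overhead_usd(scratchpad_tokens: int) -> float:
    """Extra cost in USD if scratchpad tokens are billed as output (assumption)."""
    return scratchpad_tokens * OUTPUT_USD_PER_MTOK / 1_000_000


# Hypothetical request: a 500-token answer preceded by a 3,000-token scratchpad.
answer_tokens = 500
scratchpad_tokens = 3_000

extra = think_overhead_usd(scratchpad_tokens)
multiplier = (answer_tokens + scratchpad_tokens) / answer_tokens
print(f"extra cost: ${extra:.3f}, output tokens x{multiplier:.0f}")
```

Under these assumed numbers the scratchpad adds $0.03 per request and multiplies output tokens by seven, so the accuracy gain is worth weighing against volume.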

Sources

xAI — Grok 3 announcement — accessed 2026-04-20
xAI API docs — accessed 2026-04-20