
Grok 3

Grok 3 is xAI's 2025 flagship LLM. It precedes Grok 4 and is still widely used via the xAI API. It introduced 'Think' mode, a reasoning scratchpad that lets the model draft and revise an answer before emitting it, and pioneered live X-platform grounding for real-time responses.

Model specs

Vendor: xAI
Family: Grok
Released: 2025-02
Context window: 128,000 tokens
Modalities: text, code
Input price: $2 per million tokens
Output price: $10 per million tokens
Pricing as of: 2026-04-20

Strengths

  • Strong math scores for its size (~52% on AIME 2024)
  • Lower price than Grok 4 for many routine tasks
  • Live X firehose grounding is already mature in Grok 3
  • 'Think' mode exposes reasoning steps for debugging

Limitations

  • Superseded by Grok 4 for top-tier reasoning tasks
  • No native vision — use Grok 2 Vision for image input
  • Smaller context window than Grok 4 (128k vs 256k)

Use cases

  • Cost-sensitive real-time Q&A with X grounding
  • Math tutoring and competition-style problem solving
  • Reasoning-heavy coding assistants
  • Chat products that need a 'scratchpad' mode

Benchmarks

Benchmark    Score    As of
AIME 2024    ~52%     2026-04
MMLU         ~87%     2026-04
HumanEval    ~86%     2026-04

Frequently asked questions

What is Grok 3?

Grok 3 is xAI's 2025 flagship LLM, which introduced a visible 'Think' reasoning mode and live grounding in X (Twitter) posts. It is now one tier below Grok 4 in xAI's API.

Should I use Grok 3 or Grok 4?

Use Grok 4 for maximum quality and 256k context. Use Grok 3 when latency or cost matter more than raw benchmark leadership — it remains a strong reasoning model.
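
The rule of thumb above can be sketched as a small routing helper. This is only an illustration under stated assumptions: the model identifiers "grok-3" and "grok-4" are assumed names, the `Request` type and `needs_top_quality` flag are hypothetical, and the sketch assumes the context window has to hold both the prompt and the reserved output. The context limits are the ones quoted on this page.

```python
from dataclasses import dataclass

# Context limits as quoted on this page.
GROK_3_CONTEXT = 128_000
GROK_4_CONTEXT = 256_000


@dataclass
class Request:
    """Hypothetical description of one call, used only for routing."""
    prompt_tokens: int               # estimated prompt size in tokens
    max_output_tokens: int           # output budget reserved for the reply
    needs_top_quality: bool = False  # caller wants maximum quality


def choose_model(req: Request) -> str:
    """Return an assumed model identifier for the request."""
    # Assumption: prompt and output share the same context window.
    total = req.prompt_tokens + req.max_output_tokens
    if total > GROK_4_CONTEXT:
        raise ValueError("request exceeds the 256k context of Grok 4")
    # Escalate when quality is the priority or Grok 3 cannot hold the request.
    if req.needs_top_quality or total > GROK_3_CONTEXT:
        return "grok-4"
    # Otherwise prefer the cheaper model.
    return "grok-3"


if __name__ == "__main__":
    print(choose_model(Request(prompt_tokens=10_000, max_output_tokens=2_000)))
    print(choose_model(Request(prompt_tokens=150_000, max_output_tokens=2_000)))
```

Defaulting to Grok 3 and escalating only when needed keeps routine traffic on the cheaper model. A real router would likely also weigh latency targets, which this sketch leaves out.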

What is 'Think' mode?

Think mode instructs Grok to emit an internal scratchpad of reasoning before producing the final answer. It improves accuracy on math, logic, and coding problems at the cost of extra output tokens.
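
To make the token trade-off concrete, here is a rough cost sketch using the prices listed on this page ($2 per million input tokens, $10 per million output tokens). It assumes scratchpad tokens are billed at the output rate, as the answer above implies. The function name and the token counts are hypothetical, chosen only for illustration.

```python
# Prices as listed on this page (USD per million tokens, as of 2026-04-20).
INPUT_PRICE_PER_M = 2.00
OUTPUT_PRICE_PER_M = 10.00


def request_cost(input_tokens: int, answer_tokens: int,
                 scratchpad_tokens: int = 0) -> float:
    """Estimate the USD cost of one request.

    Assumption: scratchpad (reasoning) tokens count as output tokens.
    """
    output_tokens = answer_tokens + scratchpad_tokens
    total = (input_tokens * INPUT_PRICE_PER_M
             + output_tokens * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical request: 1,000 prompt tokens and a 500-token answer.
    plain = request_cost(1_000, 500)
    # Same request with a hypothetical 2,000-token scratchpad.
    think = request_cost(1_000, 500, scratchpad_tokens=2_000)
    print(f"without Think: ${plain:.4f}")  # without Think: $0.0070
    print(f"with Think:    ${think:.4f}")  # with Think:    $0.0270
```

In this made-up case the scratchpad roughly quadruples the per-request cost, so Think mode is better reserved for problems where the accuracy gain matters.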

Sources

  1. xAI — Grok 3 announcement — accessed 2026-04-20
  2. xAI API docs — accessed 2026-04-20