
OpenAI Whisper v3 (large-v3)

Whisper large-v3 is OpenAI's open-weight automatic speech recognition (ASR) model. It covers 99 languages, holds up well on accented and noisy audio, and is released under the MIT licence. That combination is why it anchors many open-source transcription pipelines, from podcast tooling to meeting-notes apps.

Model specs

Vendor: OpenAI
Family: Whisper
Released: 2023-11
Context window: 30 seconds of audio
Modalities: audio in, text out
Input price: n/a
Output price: n/a
Pricing as of: 2026-04-20

Strengths

Open-weight and MIT-licensed — fully self-hostable
99 languages with strong accented-speech robustness
Timestamped output (segment-level by default, word-level as an option in common toolchains)
Mature ecosystem — whisper.cpp, faster-whisper, Hugging Face

Limitations

30-second chunking — long audio needs a segmenter
Hallucinates on silent or background-only audio — use VAD upstream
Speaker diarisation must be added separately
Real-time streaming needs faster-whisper or a distilled variant
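
The first two limitations above can be illustrated with a minimal sketch: plan overlapping 30-second windows over a long recording, then skip windows that look silent before they reach the model. The function names, the 1-second overlap, and the RMS threshold of 0.01 are all assumptions chosen for illustration, and the energy check is only a crude stand-in for a real voice activity detector.

```python
import math


def plan_windows(total_s, window_s=30.0, overlap_s=1.0):
    """Return (start, end) times in seconds that cover a recording of total_s seconds."""
    if window_s <= overlap_s:
        raise ValueError("window_s must exceed overlap_s")
    step = window_s - overlap_s
    windows = []
    start = 0.0
    while start < total_s:
        end = min(start + window_s, total_s)
        windows.append((start, end))
        if end >= total_s:
            break
        start += step
    return windows


def rms(samples):
    """Root-mean-square level of float samples in the range [-1, 1]."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def speech_windows(samples, sample_rate, threshold=0.01, **window_kwargs):
    """Yield (start_s, end_s, chunk) for windows whose RMS level reaches the threshold.

    Hypothetical helper: a plain energy gate, not a trained VAD model.
    """
    total_s = len(samples) / sample_rate
    for start, end in plan_windows(total_s, **window_kwargs):
        chunk = samples[int(start * sample_rate):int(end * sample_rate)]
        if rms(chunk) >= threshold:
            yield start, end, chunk
```

Each chunk that survives the gate would then be passed to whichever Whisper runtime is in use. The overlap exists so that a word cut by a window boundary appears whole in at least one window; merging the overlapping transcripts is left out of this sketch.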

Use cases

Podcast and video transcription
Meeting notes and action-item extraction
Subtitle generation across languages
Voice-note capture for productivity tools
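
For the subtitle use case, timestamped segments can be turned into SubRip (SRT) text with a few lines of code. This is a sketch that assumes each segment is a dict with `start` and `end` in seconds and a `text` string; the helper names are hypothetical, and other runtimes may expose segments in a different shape.

```python
def srt_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments):
    """Build SRT text from segments shaped like {'start': s, 'end': s, 'text': str}."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}\n"
            f"{seg['text'].strip()}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

Long segments may still need to be split to fit subtitle length conventions, which this sketch does not attempt.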

Benchmarks

Benchmark	Score	As of
Common Voice multilingual WER	≈10.4%	2023
LibriSpeech test-clean WER	≈2.0%	2023

Frequently asked questions

What is Whisper v3?

Whisper large-v3 is the third major version of OpenAI's open-weight speech-to-text model. It supports 99 languages, and its weights are MIT-licensed, so it can be self-hosted and fine-tuned without licence fees.

How accurate is Whisper?

On clean English audio (LibriSpeech test-clean), Whisper large-v3 reaches a word error rate (WER) of roughly 2%. On multilingual Common Voice it averages around 10–11% WER, which placed it among the strongest open-weight ASR models as of the 2023 benchmark figures.
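
Word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The sketch below computes it with a standard word-level edit distance. The function name is hypothetical, and the code does no text normalisation (casing, punctuation), so its numbers will not match published benchmark figures that normalise first.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, scoring "the cat sat on a mat" against the reference "the cat sat on the mat" counts one substitution over six reference words. Because insertions count as errors, WER can exceed 100% on badly hallucinated output.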

Can Whisper do real-time streaming?

Not out of the box: Whisper is trained on 30-second windows. For near-real-time use, options include faster-whisper, whisper.cpp with low-latency buffering, or a distilled variant such as Distil-Whisper. Hosted services such as Deepgram Nova-3 and AssemblyAI offer purpose-built streaming.
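
One common way to approximate streaming with a window-based model is to keep a rolling buffer of the most recent audio and re-transcribe it at a fixed interval. The sketch below shows only the buffering side. The class name, the 1-second step, and the re-transcribe-the-whole-window strategy are assumptions for illustration, not how any particular runtime implements it.

```python
from collections import deque


class RollingBuffer:
    """Keep the most recent window of audio and signal when to re-transcribe."""

    def __init__(self, sample_rate=16000, window_s=30.0, step_s=1.0):
        self.max_len = int(sample_rate * window_s)
        self.step = int(sample_rate * step_s)
        # A bounded deque silently drops the oldest samples once full.
        self.samples = deque(maxlen=self.max_len)
        self.pending = 0  # samples received since the last snapshot

    def push(self, chunk):
        """Add samples; return a snapshot to transcribe once a step has elapsed, else None."""
        self.samples.extend(chunk)
        self.pending += len(chunk)
        if self.pending < self.step:
            return None
        self.pending = 0
        return list(self.samples)
```

A caller would feed microphone chunks into `push` and hand each returned snapshot to the transcriber. A smaller step lowers latency at the cost of more repeated computation, and successive transcripts overlap, so they still need reconciling before display.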

Is Whisper free to use?

Yes — the model weights are MIT-licensed and self-hostable for free. OpenAI also offers a hosted Whisper endpoint via its audio API for a per-minute fee.

Sources

OpenAI — Whisper repo — accessed 2026-04-20
Hugging Face — openai/whisper-large-v3 — accessed 2026-04-20