AssemblyAI Universal-2

Q: What is AssemblyAI Universal-2?

Universal-2 is AssemblyAI's flagship speech-to-text model, released in October 2024. It targets batch transcription quality (state-of-the-art English WER) and integrates with LeMUR, AssemblyAI's LLM layer, for downstream analysis.

Q: What is LeMUR?

LeMUR is AssemblyAI's LLM layer that operates on transcripts. It produces summaries, chapters, sentiment, and action items, and it supports Q&A over the recording, all in a single API call on top of Universal-2 output.

Q: How does Universal-2 compare with Deepgram Nova-3?

Universal-2 leads on batch transcription quality and bundles LLM analytics; Nova-3 leads on streaming latency, which matters for voice agents with tight latency budgets. Pick Universal-2 for podcast and meeting workflows, and Nova-3 for real-time phone agents.

Q: How is AssemblyAI priced?

AssemblyAI bills per second of audio for transcription, plus per-token fees for LeMUR LLM features. Volume discounts and enterprise plans are available.
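
As a rough sketch of how that two-part bill adds up, the snippet below combines a per-second transcription charge with a per-token LLM charge. The rates are placeholders chosen to keep the arithmetic easy to follow, not AssemblyAI's published prices, and `Rates` and `estimate_cost` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical rates; substitute the figures from your own plan."""
    usd_per_audio_second: float
    usd_per_1k_llm_tokens: float


def estimate_cost(audio_seconds: float, llm_tokens: int, rates: Rates) -> float:
    """Transcription is billed per second of audio, LLM features per token."""
    transcription = audio_seconds * rates.usd_per_audio_second
    llm = (llm_tokens / 1000) * rates.usd_per_1k_llm_tokens
    return round(transcription + llm, 4)


# Placeholder numbers, not real prices.
example_rates = Rates(usd_per_audio_second=0.0001, usd_per_1k_llm_tokens=0.002)
one_hour_meeting = estimate_cost(audio_seconds=3600, llm_tokens=12_000, rates=example_rates)
print(f"Estimated cost: ${one_hour_meeting:.4f}")
```

Keeping the two terms separate makes it easy to see which part of a workload dominates: long recordings with little analysis are driven by the per-second term, while heavy Q&A over short clips is driven by the per-token term.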

Universal-2 is AssemblyAI's flagship ASR model. It is tuned for batch transcription quality — achieving state-of-the-art WER on English meetings and media — and integrates with LeMUR, AssemblyAI's LLM layer, for one-call summaries, chapters, sentiment, and Q&A on top of the transcript.

Model specs

Vendor: AssemblyAI
Family: Universal
Released: 2024-10
Context window: n/a
Modalities: text, audio
Input price: n/a
Output price: n/a
Pricing as of: 2026-04-20

Strengths

State-of-the-art English WER on meeting-style audio
LeMUR LLM features on top of the transcript — summaries, chapters, sentiment, Q&A
Speaker diarisation, auto-chapters, PII redaction built in
Simple REST API with usage-based pricing
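
To give a feel for the REST workflow behind the strengths above, here is a minimal sketch using only the Python standard library. The endpoint (`https://api.assemblyai.com/v2/transcript`), the `authorization` header, the request fields (`audio_url`, `speaker_labels`, `auto_chapters`, `redact_pii`, `redact_pii_policies`), and the `status` values reflect AssemblyAI's public v2 API as understood at the time of writing; treat them as assumptions and check the current API reference. The helper names are hypothetical.

```python
import json
import time
import urllib.request
from typing import Callable

API_BASE = "https://api.assemblyai.com/v2"  # assumed base URL; verify in the docs


def build_transcript_request(audio_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a request that submits audio for batch transcription."""
    payload = {
        "audio_url": audio_url,
        "speaker_labels": True,  # speaker diarisation
        "auto_chapters": True,   # auto-chapters
        "redact_pii": True,      # PII redaction
        "redact_pii_policies": ["person_name", "phone_number"],  # assumed policy names
    }
    return urllib.request.Request(
        url=f"{API_BASE}/transcript",
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


def fetch_job(transcript_id: str, api_key: str) -> dict:
    """GET the transcript job; this performs a real network call."""
    request = urllib.request.Request(
        f"{API_BASE}/transcript/{transcript_id}",
        headers={"authorization": api_key},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_transcript(
    fetch: Callable[[], dict],
    poll_seconds: float = 3.0,
    max_polls: int = 200,
) -> dict:
    """Call `fetch` until the job reports a terminal status, then return the job."""
    for _ in range(max_polls):
        job = fetch()
        if job["status"] == "completed":
            return job
        if job["status"] == "error":
            raise RuntimeError(job.get("error", "transcription failed"))
        time.sleep(poll_seconds)
    raise TimeoutError("transcript was not ready in time")
```

In use, you would send the built request with `urllib.request.urlopen`, read the job `id` from the JSON response, and pass `lambda: fetch_job(job_id, api_key)` to `wait_for_transcript`. Taking the fetch step as a callable keeps the polling loop testable without network access.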

Limitations

Batch-first — streaming latency lags Deepgram Nova-3
Multilingual coverage smaller than Whisper large-v3
Closed API — no self-hostable weights
LeMUR features depend on AssemblyAI's own LLM pipeline, not your choice of model

Use cases

Podcast and media transcription with chapters
Meeting notes with speaker diarisation and summaries
Content moderation and topic detection
Sales-call analytics and Q&A over recordings
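
As an illustration of the meeting-notes use case, the sketch below turns diarised utterances into a speaker-by-speaker log. It assumes each utterance is a dict with `speaker`, `text`, and `start` (in milliseconds) keys, which is how AssemblyAI's transcript JSON is understood to expose speaker labels; verify the field names against the API reference. The sample utterances are invented, and `format_timestamp` and `format_notes` are hypothetical helpers.

```python
def format_timestamp(milliseconds: int) -> str:
    """Render a millisecond offset as MM:SS."""
    minutes, seconds = divmod(milliseconds // 1000, 60)
    return f"{minutes:02d}:{seconds:02d}"


def format_notes(utterances: list[dict]) -> str:
    """Merge consecutive utterances by the same speaker into one line each."""
    lines: list[str] = []
    last_speaker = None
    for utterance in utterances:
        if utterance["speaker"] == last_speaker:
            # Same speaker is still talking: extend the current line.
            lines[-1] += " " + utterance["text"]
        else:
            stamp = format_timestamp(utterance["start"])
            lines.append(f"[{stamp}] Speaker {utterance['speaker']}: {utterance['text']}")
            last_speaker = utterance["speaker"]
    return "\n".join(lines)


# Invented sample data in the assumed shape.
sample = [
    {"speaker": "A", "start": 0, "text": "Let's review the launch plan."},
    {"speaker": "A", "start": 4200, "text": "Marketing goes first."},
    {"speaker": "B", "start": 65000, "text": "We are on track for Monday."},
]
notes = format_notes(sample)
print(notes)
```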

Benchmarks

Benchmark	Score	As of
English WER (meeting audio)	≈6.6%	2024
Streaming p95 latency	<500 ms	2024
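
For readers unfamiliar with the metric in the first row: word error rate (WER) is the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. A minimal sketch follows. The `wer` function and the sample sentences are illustrative only, and this is not a description of how the figure in the table was measured (published benchmarks typically also apply text normalisation, which is reduced here to lowercasing).

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate = (substitutions + deletions + insertions) / reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1  # extra word in the hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


reference = "please send the quarterly report before friday"
hypothesis = "please send a quarterly report before friday"
print(f"WER: {wer(reference, hypothesis):.1%}")
```

Here one of seven reference words is substituted, so the WER is 1/7, or about 14.3%. A score of roughly 6.6% corresponds to about one word error in every fifteen reference words.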

Frequently asked questions

What is LeMUR?