SeamlessM4T v2
SeamlessM4T v2 is Meta AI's November 2023 update to its massively multilingual, multimodal translation model. A single model handles speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation across nearly 100 languages. Its UnitY2 architecture, which uses a non-autoregressive text-to-unit decoder, cuts latency and improves quality over v1. Meta released the model weights under CC-BY-NC alongside a family of companion models: SeamlessExpressive, which preserves the speaker's tone; SeamlessStreaming, which enables real-time interpretation; and Seamless, which combines the two.
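As a rough illustration of the "single model, several tasks" idea, the sketch below drives one checkpoint through text-to-text and text-to-speech translation using the Hugging Face `transformers` port. None of this comes from the page itself: the `AutoProcessor` and `SeamlessM4Tv2Model` classes, the checkpoint id `facebook/seamless-m4t-v2-large`, and the language codes `eng` and `fra` are assumptions about that port, and the helper names (`load`, `text_to_text`, `text_to_speech`) are made up for the example.

```python
# Sketch only: one checkpoint, two translation tasks, via Hugging Face transformers.
# Assumes `transformers`, `torch`, and `sentencepiece` are installed and that the
# checkpoint below can be downloaded (it is several GB).

CHECKPOINT = "facebook/seamless-m4t-v2-large"  # assumed checkpoint id


def load(checkpoint: str = CHECKPOINT):
    """Load the processor and model. Imported lazily to keep module import cheap."""
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = SeamlessM4Tv2Model.from_pretrained(checkpoint)
    return processor, model


def text_to_text(processor, model, text: str, src_lang: str, tgt_lang: str) -> str:
    """T2T: translate text and skip speech synthesis."""
    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")
    tokens = model.generate(**inputs, tgt_lang=tgt_lang, generate_speech=False)
    return processor.decode(tokens[0].tolist()[0], skip_special_tokens=True)


def text_to_speech(processor, model, text: str, src_lang: str, tgt_lang: str):
    """T2S: translate text and return the synthesised waveform as a NumPy array."""
    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")
    waveform = model.generate(**inputs, tgt_lang=tgt_lang)[0]
    return waveform.cpu().numpy().squeeze()


# Usage (downloads the checkpoint on first call):
#   processor, model = load()
#   print(text_to_text(processor, model, "Hello, my dog is cute", "eng", "fra"))
```

The same `generate` call serves both tasks; only the `generate_speech` flag decides whether the result is text tokens or audio. Speech input would follow the same pattern, with audio passed to the processor in place of text.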
Model specs
- Vendor: Meta AI
- Family: Seamless
- Released: 2023-11
- Context window: 4,096 tokens
- Modalities: text, audio
Strengths
- Single model covers speech-to-text (S2T), speech-to-speech (S2S), text-to-speech (T2S), and text-to-text (T2T) translation
- Near-100-language coverage
- Streaming variant for real-time use
- Open weights for research
Limitations
- CC-BY-NC licence blocks direct commercial deployment
- Low-resource language quality still variable
- Larger and heavier than specialised per-language models
- Handles translation and transcription only; it is not a general-purpose language model
Use cases
- Speech-to-speech translation research
- Multilingual captioning and subtitling
- Cross-language voice assistants
- Low-resource language support experiments
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| FLEURS S2T BLEU avg | ≈32 (≈10% better than Whisper-v3 cross-language) | 2023-11 |
| S2S BLASER 2.0 | SOTA at release | 2023-11 |
Frequently asked questions
What is SeamlessM4T v2?
Meta's November 2023 multilingual, multimodal translation model covering speech and text across nearly 100 languages in a single network.
Is SeamlessM4T commercially usable?
Not directly. The public weights are CC-BY-NC, so commercial use requires a separate licence from Meta.
Does it support real-time interpretation?
Yes, via the SeamlessStreaming variant, which decodes incrementally and starts emitting the translation before the speaker has finished, rather than waiting for the full utterance.
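To make "decoding while the input is still arriving" concrete, here is a toy sketch of a simultaneous read/write loop. It uses a fixed wait-k schedule (read k source tokens, then alternate one write per read), which is a deliberately simpler stand-in and not the learned policy SeamlessStreaming actually uses. The lexicon, the `toy_next_token` translator, and the function names are all hypothetical.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical word-for-word lexicon standing in for a real translation model.
LEXICON = {"guten": "good", "morgen": "morning", "welt": "world"}


def toy_next_token(read: List[str], written: List[str]) -> Optional[str]:
    """Return the next target token, or None if every read token is translated."""
    i = len(written)
    return LEXICON.get(read[i], read[i]) if i < len(read) else None


def wait_k_trace(
    source: Iterable[str],
    k: int,
    next_token: Callable[[List[str], List[str]], Optional[str]],
) -> List[Tuple[str, str]]:
    """Interleave READ and WRITE actions, keeping output k tokens behind input."""
    read: List[str] = []
    written: List[str] = []
    trace: List[Tuple[str, str]] = []
    for token in source:
        read.append(token)
        trace.append(("READ", token))
        # Target token t may be written once k + t source tokens have been read.
        if len(read) >= k + len(written):
            out = next_token(read, written)
            if out is not None:
                written.append(out)
                trace.append(("WRITE", out))
    # Source is exhausted: flush whatever is still untranslated.
    while (out := next_token(read, written)) is not None:
        written.append(out)
        trace.append(("WRITE", out))
    return trace


for action, token in wait_k_trace(["guten", "morgen", "welt"], 2, toy_next_token):
    print(action, token)
# READ guten
# READ morgen
# WRITE good
# READ welt
# WRITE morning
# WRITE world
```

A smaller `k` lowers latency but gives the model less context per output token; a `k` longer than the input degenerates to ordinary offline translation.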
Sources
- Meta AI — Seamless Communication — accessed 2026-04-20
- SeamlessM4T v2 paper (arXiv) — accessed 2026-04-20