SeamlessM4T v2

SeamlessM4T v2 is Meta AI's November 2023 update to SeamlessM4T, its massively multilingual, multimodal translation model. A single model handles speech-to-text, speech-to-speech, text-to-speech, and text-to-text translation across nearly 100 languages; v2 moves to the UnitY2 architecture, which cuts latency and improves quality over v1. Meta released the weights under CC-BY-NC alongside the companion Seamless models: SeamlessExpressive, which preserves the speaker's tone and vocal style, and SeamlessStreaming, which enables real-time interpretation.

Model specs

Vendor: Meta AI
Family: Seamless
Released: 2023-11
Context window: 4,096 tokens
Modalities: text, audio

Strengths

Single model covers speech-to-text (S2T), speech-to-speech (S2S), text-to-speech (T2S), and text-to-text (T2T) translation
Near-100-language coverage
Streaming variant for real-time use
Open weights for research
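
The four translation directions listed above amount to choosing an input modality and an output modality for one checkpoint. A minimal sketch of that routing follows. The `TASKS` table and the helper names are hypothetical, and `translate_text` assumes the Hugging Face `transformers` integration (`AutoProcessor`, `SeamlessM4Tv2Model`) and the checkpoint name `facebook/seamless-m4t-v2-large`, none of which this page specifies. Language codes such as `eng` and `fra` are likewise assumptions.

```python
from typing import Dict, Tuple

# Hypothetical routing table: (input modality, output modality) -> task label.
TASKS: Dict[Tuple[str, str], str] = {
    ("speech", "text"): "S2T",
    ("speech", "speech"): "S2S",
    ("text", "speech"): "T2S",
    ("text", "text"): "T2T",
}


def pick_task(src_modality: str, dst_modality: str) -> str:
    """Map an input/output modality pair to one of the four task labels."""
    return TASKS[(src_modality, dst_modality)]


def wants_speech_output(task: str) -> bool:
    """Tasks ending in '2S' produce a waveform; the others produce text."""
    return task.endswith("2S")


def translate_text(text: str, src_lang: str, tgt_lang: str, speech: bool = False,
                   checkpoint: str = "facebook/seamless-m4t-v2-large"):
    """Text-input translation (T2T or T2S) with one checkpoint.

    Assumption: the transformers SeamlessM4Tv2 classes and this checkpoint
    name are available. The import is deferred so the routing helpers above
    work without the library installed.
    """
    from transformers import AutoProcessor, SeamlessM4Tv2Model

    processor = AutoProcessor.from_pretrained(checkpoint)
    model = SeamlessM4Tv2Model.from_pretrained(checkpoint)
    inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")

    if speech:
        # T2S: the first element of the generate() output is the waveform.
        return model.generate(**inputs, tgt_lang=tgt_lang)[0].cpu().numpy().squeeze()

    # T2T: skip the speech synthesis stage and decode the text tokens.
    tokens = model.generate(**inputs, tgt_lang=tgt_lang, generate_speech=False)
    return processor.decode(tokens[0].tolist()[0], skip_special_tokens=True)
```

A call such as `translate_text("Hello", "eng", "fra")` would download the weights on first use, which is why only the routing helpers run at import time.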

Limitations

CC-BY-NC licence blocks direct commercial deployment
Low-resource language quality still variable
Larger and heavier than specialised per-language models
Translation only; not a general-purpose language model

Use cases

Speech-to-speech translation research
Multilingual captioning and subtitling
Cross-language voice assistants
Low-resource language support experiments

Benchmarks

Benchmark	Score	As of
FLEURS S2T BLEU avg	≈32 (≈10% better than Whisper-v3 cross-language)	2023-11
S2S BLASER 2.0	SOTA at release	2023-11
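
The table does not say whether "≈10% better" is a relative or an absolute gain. The arithmetic below is a sketch that assumes a relative gain over the comparison system; under that assumption the implied baseline is a back-of-envelope figure, not a reported Whisper-v3 score. Both function names are hypothetical.

```python
def relative_gain(new: float, base: float) -> float:
    """Relative improvement of `new` over `base`, as a fraction."""
    if base <= 0:
        raise ValueError("baseline must be positive")
    return (new - base) / base


def implied_baseline(new: float, gain: float) -> float:
    """Baseline score that `new` would beat by the relative fraction `gain`."""
    return new / (1.0 + gain)


# Assumption: BLEU of about 32 and a relative gain of about 10%.
baseline = implied_baseline(32.0, 0.10)
print(round(baseline, 1))  # 29.1
```

If the 10% were instead absolute BLEU points, the implied baseline would be about 22, so the distinction matters when comparing against other published numbers.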

Frequently asked questions

What is SeamlessM4T v2?

Meta's November 2023 multilingual, multimodal translation model covering speech and text across nearly 100 languages in a single network.

Is SeamlessM4T commercially usable?

Not under the public release: the weights are licensed CC-BY-NC, so commercial use requires a separate licence from Meta.

Does it support real-time interpretation?

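Yes, via the SeamlessStreaming variant, which decodes incrementally and starts emitting the translation before the speaker has finished, using a learned read/write policy (Efficient Monotonic Multihead Attention).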
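
To make the idea of streaming decoding concrete, the sketch below generates the read/write schedule of a wait-k policy: read k source chunks, then alternate between writing one target token and reading one more chunk. Wait-k is a simple fixed policy swapped in here for illustration; it is not claimed to be the policy SeamlessStreaming uses, and `wait_k_schedule` is a hypothetical helper.

```python
from typing import List


def wait_k_schedule(num_source: int, num_target: int, k: int) -> List[str]:
    """Return the READ/WRITE actions of a wait-k simultaneous decoder.

    The decoder stays at most k source chunks ahead of the output; once the
    source is exhausted it writes the remaining target tokens.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    actions: List[str] = []
    read = written = 0
    while written < num_target:
        if read < min(written + k, num_source):
            actions.append("READ")    # consume one more source chunk
            read += 1
        else:
            actions.append("WRITE")   # emit one target token
            written += 1
    return actions


print(wait_k_schedule(num_source=4, num_target=4, k=2))
# ['READ', 'READ', 'WRITE', 'READ', 'WRITE', 'READ', 'WRITE', 'WRITE']
```

A smaller k lowers latency because output starts sooner, at the cost of translating with less source context; a learned policy makes that trade-off per token instead of fixing it in advance.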

Sources

Meta AI — Seamless Communication — accessed 2026-04-20
SeamlessM4T v2 paper (arXiv) — accessed 2026-04-20