Curiosity · AI Model
BART Large
BART (Bidirectional and Auto-Regressive Transformers), introduced by Facebook AI Research in 2019, combines a BERT-style bidirectional encoder with a GPT-style autoregressive decoder, pretrained on a denoising objective: the input text is corrupted (e.g. by masking spans, deleting tokens, or permuting sentences) and the model learns to reconstruct the original. The 'large' checkpoint (~400M parameters) was the default pick for summarisation and conditional text generation in the pre-LLM era and still turns up in legacy production pipelines.
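To make the denoising objective concrete, here is a minimal, illustrative sketch of BART's text-infilling corruption in plain Python. It is not the fairseq implementation; the 20% span-start probability, the `<mask>` string, and the helper names are assumptions for illustration. The key property it demonstrates is the one the paper describes: each masked span, whatever its length, is replaced by a *single* mask token, so the model must also infer how many tokens are missing.

```python
import math
import random


def sample_poisson(rng, lam):
    """Sample from Poisson(lam) using Knuth's algorithm.
    BART's text infilling draws span lengths from Poisson(lambda=3)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while p > limit:
        k += 1
        p *= rng.random()
    return k - 1


def text_infilling(tokens, mask_ratio=0.3, poisson_lambda=3, seed=0):
    """Illustrative corruption: pick spans whose lengths follow a Poisson
    distribution and replace each whole span with one '<mask>' token.
    (In the paper, a zero-length span simply inserts a mask token.)"""
    rng = random.Random(seed)
    budget = int(len(tokens) * mask_ratio)  # how many tokens to hide
    out, i, masked = [], 0, 0
    while i < len(tokens):
        if masked < budget and rng.random() < 0.2:  # start a span here (prob. is an assumption)
            span = min(sample_poisson(rng, poisson_lambda), budget - masked)
            out.append("<mask>")  # one mask token, regardless of span length
            i += span
            masked += span
        else:
            out.append(tokens[i])
            i += 1
    return out
```

During pretraining the decoder is then trained to regenerate the uncorrupted sequence from this noised input.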
Model specs
- Vendor
- Meta
- Family
- BART
- Released
- 2019-10
- Context window
- 1,024 tokens
- Modalities
- text
Strengths
- Strong summarisation performance with small fine-tuning datasets
- Efficient vs. modern decoder-only LLMs for pure seq2seq tasks
- Widely supported in Hugging Face and fairseq
Limitations
- Pre-LLM era — no instruction-following or chat ability
- 1024-token context is tiny by modern standards
- Deprecated for new work — modern LLMs dominate summarisation quality
Use cases
- Legacy abstractive summarisation pipelines
- Low-resource fine-tuning for domain summarisation
- Classroom baselines for seq2seq transformer architecture
- Paraphrase and data-augmentation tasks
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| CNN/DailyMail ROUGE-1 | ≈44.2 | 2020-01 |
| XSum ROUGE-1 | ≈45.1 | 2020-01 |
Frequently asked questions
What is BART Large?
BART Large is Meta AI's classic 2019 encoder-decoder transformer with about 400 million parameters, pretrained with a denoising objective and best known for state-of-the-art summarisation in its era.
Is BART Large still used?
BART Large is legacy. You will still find it in academic references and older production summarisation pipelines, but modern LLMs produce better summaries with zero-shot prompting.
Where can I download BART Large?
BART Large weights are freely available on Hugging Face under the 'facebook/bart-large' repository, with fine-tuned variants like 'facebook/bart-large-cnn' for CNN/DailyMail summarisation.
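A minimal usage sketch with the Hugging Face `transformers` library, assuming `transformers` and a backend such as `torch` are installed; the checkpoint (~1.6 GB) is downloaded from the Hub on first use. The article text here is a made-up example input.

```python
from transformers import pipeline

# Load the CNN/DailyMail fine-tuned checkpoint as a summarisation pipeline.
summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

article = (
    "BART is a sequence-to-sequence model pretrained by corrupting text "
    "with a noising function and learning to reconstruct the original. "
    "It was widely used for abstractive summarisation before the rise of "
    "instruction-tuned large language models."
)

result = summarizer(article, max_length=40, min_length=10, do_sample=False)
print(result[0]["summary_text"])
```

For the base (not fine-tuned) model, swap in `facebook/bart-large` and fine-tune on your own seq2seq data before expecting useful summaries.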
Sources
- arXiv — BART paper — accessed 2026-04-20
- Hugging Face — facebook/bart-large — accessed 2026-04-20