Stable Diffusion 2.1
Stable Diffusion 2.1, released in December 2022, is the final 2.x-series open-weights text-to-image checkpoint from Stability AI. It uses an OpenCLIP ViT-H/14 text encoder, was trained natively at 768x768, and was widely adopted by community fine-tuners before SDXL and SD3 took over. Today it serves mostly as a legacy baseline and teaching reference.
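To make the 768x768 native resolution concrete, the sketch below works out the latent tensor shape that the denoising network sees. It assumes the 8x spatial downsampling and 4 latent channels of the autoencoder used across the Stable Diffusion 1.x and 2.x family; the constant and function names are illustrative only, not part of any library.

```python
# Assumed autoencoder geometry for the SD 1.x / 2.x family (not read from a checkpoint).
VAE_SCALE_FACTOR = 8   # each latent cell covers an 8x8 pixel patch
LATENT_CHANNELS = 4    # channels in the latent tensor


def latent_shape(height: int, width: int, batch_size: int = 1) -> tuple:
    """Return the (batch, channels, height, width) shape of the latent tensor."""
    if height % VAE_SCALE_FACTOR or width % VAE_SCALE_FACTOR:
        raise ValueError("height and width must be multiples of 8")
    return (
        batch_size,
        LATENT_CHANNELS,
        height // VAE_SCALE_FACTOR,
        width // VAE_SCALE_FACTOR,
    )


def compression_ratio(height: int, width: int) -> float:
    """Ratio of RGB pixel values to latent values for one image."""
    _, c, h, w = latent_shape(height, width)
    return (height * width * 3) / (c * h * w)


print(latent_shape(768, 768))       # (1, 4, 96, 96)
print(compression_ratio(768, 768))  # 48.0
```

Under these assumptions a 768x768 image is denoised as a 4x96x96 latent, roughly 48 times fewer values than the RGB image, which is a large part of why latent diffusion is tractable on modest hardware.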
Model specs
- Vendor: Stability AI
- Family: Stable Diffusion 2
- Released: 2022-12
- Context window: 77 tokens (text-encoder prompt limit)
- Modalities: text input, image output
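The 77-token context window can be illustrated with a small sketch of how a CLIP-style tokenizer output is fitted to a fixed length. This is not the actual tokenizer: it takes already-tokenized IDs, and the special-token IDs below are assumptions based on the CLIP BPE vocabulary (the padding ID in particular should be checked against the real tokenizer config). The helper name `fit_to_context` is hypothetical.

```python
from typing import List

MAX_LENGTH = 77   # context window, including the two special tokens
BOS_ID = 49406    # assumed start-of-text id in the CLIP BPE vocabulary
EOS_ID = 49407    # assumed end-of-text id
PAD_ID = 0        # assumed padding id; verify against the tokenizer config


def fit_to_context(content_ids: List[int], max_length: int = MAX_LENGTH) -> List[int]:
    """Truncate and pad prompt token ids to exactly max_length entries."""
    budget = max_length - 2                  # room left after BOS and EOS
    kept = content_ids[:budget]              # anything past the budget is dropped
    ids = [BOS_ID] + kept + [EOS_ID]
    ids += [PAD_ID] * (max_length - len(ids))
    return ids


long_prompt = list(range(1, 201))            # stand-in for 200 prompt tokens
fitted = fit_to_context(long_prompt)
print(len(fitted))                           # 77
print(fitted[1:-1] == long_prompt[:75])      # True
```

In practice this means only about 75 prompt tokens influence the image; text beyond that point is silently truncated unless the calling tool implements its own chunking.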
Strengths
- Open weights under the CreativeML Open RAIL++-M license
- Runs comfortably on consumer GPUs
- Large ecosystem of community fine-tunes and tooling
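As a rough illustration of running the model on a consumer GPU, the following sketch loads the checkpoint with the Hugging Face `diffusers` library in half precision. It assumes `torch` and `diffusers` are installed, a CUDA device is present, and the `stabilityai/stable-diffusion-2-1` repository is still downloadable; the helper names, step count, and scheduler choice are illustrative, not recommendations from Stability AI.

```python
MODEL_ID = "stabilityai/stable-diffusion-2-1"  # repo id listed under Sources


def build_pipeline(device: str = "cuda"):
    """Load SD 2.1 in fp16 with memory-saving attention slicing enabled."""
    # Imported lazily so the module can be inspected without the heavy dependencies.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(MODEL_ID, torch_dtype=torch.float16)
    # Swap in a multistep solver, which usually needs fewer steps than the default.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)
    pipe.enable_attention_slicing()  # trades some speed for lower peak VRAM
    return pipe


def generate(pipe, prompt: str, size: int = 768, steps: int = 25):
    """Generate one square image at the model's native resolution."""
    result = pipe(prompt, height=size, width=size, num_inference_steps=steps)
    return result.images[0]


# Example usage (downloads several GB of weights on first run):
# pipe = build_pipeline()
# image = generate(pipe, "a lighthouse on a cliff at sunset, oil painting")
# image.save("lighthouse.png")
```

Half precision and attention slicing are the two knobs most likely to matter on cards with limited VRAM; actual memory use depends on the GPU, driver, and library versions.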
Limitations
- Outclassed by SDXL, SD3, and SD 3.5 on prompt adherence and quality
- Single OpenCLIP ViT-H/14 text encoder with a 77-token window limits fidelity on long or compositional prompts
- Known issues with human anatomy and text rendering
Use cases
- Classroom teaching on latent diffusion models
- Legacy ControlNet and LoRA pipelines
- Baselines for newer diffusion research
- Fine-tuning experiments on modest hardware
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| LAION aesthetic score (internal) | improved over SD 2.0 | 2022-12 |
Frequently asked questions
What is Stable Diffusion 2.1?
Stable Diffusion 2.1 is Stability AI's December 2022 open-weights text-to-image latent diffusion model. It is the final revision of the 2.x series, trained natively at 768x768 with an OpenCLIP ViT-H/14 text encoder.
Should I still use Stable Diffusion 2.1?
For new projects, SDXL or Stable Diffusion 3.5 is the better choice, with stronger prompt adherence and image quality. SD 2.1 remains a useful teaching baseline and is still supported by community pipelines and archived LoRAs.
What license covers SD 2.1?
SD 2.1 is distributed under the CreativeML Open RAIL++-M license, which permits commercial and personal use subject to its use-based restrictions.
Sources
- Stability AI — Stable Diffusion 2.1 release — accessed 2026-04-20
- Hugging Face — stabilityai/stable-diffusion-2-1 — accessed 2026-04-20