DALL·E 2

DALL·E 2 was OpenAI's April 2022 breakthrough text-to-image system and the model that pushed diffusion-based image generation into the mainstream. It pairs a CLIP text/image embedding space with an unCLIP prior that maps text embeddings to image embeddings, then decodes those embeddings with a 64×64 diffusion decoder followed by diffusion upsamplers (64→256→1024) to produce 1024×1024 output. Variations, inpainting, and outpainting made it the dominant creative-AI API of 2022–23. It has been superseded by DALL·E 3 (2023) and the GPT-Image family, though the legacy API endpoint remains available.
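The staged pipeline described above (text → CLIP text embedding → prior → CLIP image embedding → decoder → upsamplers) can be sketched with toy stand-ins. Everything here is illustrative, not OpenAI code: the function names and the 8-dimensional embeddings are assumptions made for a runnable example.

```python
# Toy sketch of the unCLIP pipeline with hypothetical stand-in functions.
import numpy as np

EMB_DIM = 8  # real CLIP embeddings are much larger (hundreds of dimensions)

def clip_text_encoder(prompt: str) -> np.ndarray:
    # Stand-in: a deterministic pseudo-embedding derived from the prompt.
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal(EMB_DIM)

def prior(text_emb: np.ndarray) -> np.ndarray:
    # Stand-in for the unCLIP prior, which maps a CLIP text embedding
    # to a CLIP image embedding.
    return np.tanh(text_emb)

def decoder(image_emb: np.ndarray) -> np.ndarray:
    # Stand-in for the 64x64 diffusion decoder conditioned on the image embedding.
    return np.ones((64, 64, 3)) * image_emb.mean()

def upsample(img: np.ndarray, factor: int) -> np.ndarray:
    # Stand-in for a diffusion upsampler (nearest-neighbour here).
    return img.repeat(factor, axis=0).repeat(factor, axis=1)

def generate(prompt: str) -> np.ndarray:
    emb = prior(clip_text_encoder(prompt))
    img64 = decoder(emb)          # 64x64 base sample
    img256 = upsample(img64, 4)   # 64 -> 256
    return upsample(img256, 4)    # 256 -> 1024

img = generate("a corgi playing a trumpet")
print(img.shape)  # (1024, 1024, 3)
```

The point of the sketch is the composition order: the prior and decoder are separate models, which is what distinguishes unCLIP from single-stage text-conditioned diffusion.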

Model specs

Vendor
OpenAI
Family
DALL·E
Released
2022-04
Context window
n/a (image model)
Modalities
text, vision
Input price
n/a
Output price
n/a
Pricing as of
2026-04-20

Strengths

  • First consumer diffusion model at scale
  • Cheap per-image price relative to contemporaries
  • Supports variations, inpainting, outpainting
  • Stable, well-documented API

Limitations

  • Quality well behind DALL·E 3, Midjourney v6, Imagen 3, SDXL
  • Weak on text-in-image rendering
  • 1024×1024 max resolution
  • Style range narrower than modern open models

Use cases

  • Legacy applications still calling the DALL·E 2 API
  • Educational reference for diffusion architecture (unCLIP)
  • Low-cost image variations and inpainting
  • Benchmarking against modern generators

Benchmarks

Benchmark | Score | As of
FID (MS-COCO, zero-shot) | ≈10.4 | 2022-04
Human preference vs DALL·E 1 | preferred ~72% of the time | 2022-04
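For context, FID (Fréchet Inception Distance) compares the distribution of generated images with a reference set by fitting Gaussians to Inception-network features; lower is better. The standard definition is:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

where $(\mu_r, \Sigma_r)$ and $(\mu_g, \Sigma_g)$ are the feature mean and covariance of the real and generated sets, respectively.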

Frequently asked questions

What is DALL·E 2?

DALL·E 2 is OpenAI's 2022 text-to-image diffusion model that uses CLIP-guided priors and cascaded upsamplers to produce 1024×1024 images from text prompts.

Is DALL·E 2 still recommended?

For new work, no — DALL·E 3 and OpenAI's GPT-Image API produce much better results. DALL·E 2 is mainly a legacy option.

Can DALL·E 2 edit existing images?

Yes. It supports variations, inpainting (masked edits), and outpainting (extending beyond the canvas).
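Mechanically, the edit endpoint regenerates fully transparent pixels, so outpainting reduces to padding the source image onto a larger transparent canvas before submitting it. A minimal NumPy sketch of that canvas construction (the helper name is illustrative, not part of any SDK):

```python
import numpy as np

def make_outpaint_canvas(img: np.ndarray, pad: int) -> np.ndarray:
    """Embed an RGB image in a larger RGBA canvas whose border is fully
    transparent. Transparent pixels mark the region the model should fill,
    so this canvas serves as both image and mask for outpainting."""
    h, w, _ = img.shape
    canvas = np.zeros((h + 2 * pad, w + 2 * pad, 4), dtype=np.uint8)
    canvas[pad:pad + h, pad:pad + w, :3] = img
    canvas[pad:pad + h, pad:pad + w, 3] = 255  # opaque: keep these pixels
    return canvas

src = np.full((64, 64, 3), 128, dtype=np.uint8)  # placeholder source image
canvas = make_outpaint_canvas(src, pad=32)
print(canvas.shape)  # (128, 128, 4)
```

The resulting PNG would then be sent to the images edit endpoint with a prompt describing the full extended scene.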

Sources

  1. OpenAI — DALL·E 2 — accessed 2026-04-20
  2. Hierarchical Text-Conditional Image Generation with CLIP Latents — accessed 2026-04-20