Gemini 1.5 Flash

Gemini 1.5 Flash is a small model that Google released in May 2024. It is a distilled sibling of 1.5 Pro that kept the 1M-token context window while cutting cost and latency. During 2024 it became the default low-cost, long-context model for RAG, summarisation, and embedded features on Vertex AI, until 2.0 Flash replaced it.

Model specs

Vendor: Google
Family: Gemini 1.5
Released: 2024-05
Context window: 1,000,000 tokens
Modalities: text, vision, audio, video, code
Input price: $0.075/M tokens
Output price: $0.30/M tokens
Pricing as of: 2026-04-20

Strengths

  • First sub-$0.10/M input-token tier with 1M context
  • Multimodal at launch: handles text, images, audio, and video
  • Great fit for bulk pipelines on Vertex AI
  • Distilled from 1.5 Pro, so its tone and behaviour are similar

Limitations

  • Noticeably weaker reasoning than 1.5 Pro and the Gemini 2.x family
  • Superseded across nearly all tasks by 2.0 Flash and 2.5 Flash
  • No thinking mode

Use cases

  • High-volume RAG and summarisation at low cost
  • Embedded LLM features (classification, extraction)
  • Long-document Q&A where cost matters more than latency
  • Legacy Vertex AI deployments

Benchmarks

Benchmark    Score     As of
MMLU         ≈78.9%    2024-05
Video-MME    ≈62%      2024-05
MATH         ≈54%      2024-05

Frequently asked questions

What is Gemini 1.5 Flash?

Gemini 1.5 Flash is a small, fast model that Google released in May 2024. It is a distilled version of Gemini 1.5 Pro that keeps the 1M-token context window at much lower cost and latency, and it became the default low-cost Gemini tier during 2024.

Is Gemini 1.5 Flash still worth using?

For new projects, Gemini 2.0 Flash or 2.5 Flash is a clear upgrade in capability; check current pricing before switching, since the newer tiers are not necessarily cheaper. Gemini 1.5 Flash remains popular in legacy Vertex AI deployments and for workloads tuned to its specific behaviour.

How much does Gemini 1.5 Flash cost?

As of April 2026, Gemini 1.5 Flash is priced at roughly USD 0.075 per million input tokens and USD 0.30 per million output tokens, which makes it one of the cheapest 1M-context models available.
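
To make the per-million-token prices concrete, here is a minimal Python sketch that turns token counts into an estimated request cost. It assumes the two list prices quoted on this page apply as a flat rate to every token; any tiering by prompt length, caching discount, or batch discount is not modelled. The function name estimate_cost_usd and the constant names are illustrative and not part of any Google SDK.

```python
from decimal import Decimal

# List prices quoted on this page (USD per million tokens).
# Assumption: a flat rate applies to every token in the request.
INPUT_PRICE_PER_M = Decimal("0.075")
OUTPUT_PRICE_PER_M = Decimal("0.30")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request at the flat list prices above."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) / TOKENS_PER_M * INPUT_PRICE_PER_M
    output_cost = Decimal(output_tokens) / TOKENS_PER_M * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # An 800,000-token document summarised into 2,000 output tokens.
    print(estimate_cost_usd(800_000, 2_000))
```

Under these assumptions, input dominates the bill for long-document workloads: in the example, the 800,000 input tokens account for USD 0.06 of the total, while the 2,000 output tokens add USD 0.0006.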

Does Gemini 1.5 Flash support video?

Yes. All Gemini 1.5 models (Pro, Flash, and the smaller Flash-8B variant) support video input and use the 1M-token context window to reason across long clips.
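
How long a clip fits depends on how many tokens each second of video consumes, which this page does not state. The sketch below only shows the arithmetic: given a per-second token rate supplied by the caller, it returns the longest clip that fits in the 1,000,000-token window. The function name max_video_seconds and the rate of 250 tokens per second used in the example are hypothetical placeholders; take the real rate from Google's documentation before relying on the result.

```python
# Context window size as listed in the model specs on this page.
CONTEXT_WINDOW_TOKENS = 1_000_000


def max_video_seconds(tokens_per_second: int, reserved_tokens: int = 0) -> int:
    """Longest clip, in whole seconds, that fits in the context window.

    tokens_per_second: caller-supplied rate (not stated on this page).
    reserved_tokens: tokens set aside for the prompt text and the answer.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 <= reserved_tokens <= CONTEXT_WINDOW_TOKENS:
        raise ValueError("reserved_tokens must be between 0 and the window size")
    return (CONTEXT_WINDOW_TOKENS - reserved_tokens) // tokens_per_second


if __name__ == "__main__":
    # Hypothetical rate of 250 tokens per second, with 100,000 tokens reserved.
    print(max_video_seconds(250, reserved_tokens=100_000))
```

Reserving part of the window for instructions and output is a design choice worth keeping: a clip that exactly fills the context leaves no room for the question or the answer.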

Sources

  1. Google — Gemini 1.5 Flash — accessed 2026-04-20
  2. Google Cloud — Gemini pricing — accessed 2026-04-20