Curiosity · AI Model
Gemini 1.5 Flash
Gemini 1.5 Flash is Google's small model from May 2024: a distilled sibling of 1.5 Pro that kept the 1M-token context while cutting cost and latency. During 2024 it became the default cheap, long-context model for RAG, summarisation, and embedded features on Vertex AI, until 2.0 Flash replaced it.
Model specs
- Vendor: Google
- Family: Gemini 1.5
- Released: 2024-05
- Context window: 1,000,000 tokens
- Modalities: text, vision, audio, video, code
- Input price: $0.075/M tok
- Output price: $0.30/M tok
- Pricing as of: 2026-04-20
Strengths
- First sub-$0.10/M input-token tier with 1M context
- Multimodal: handled text, images, audio, and video at launch
- Great fit for bulk pipelines on Vertex AI
- Distilled from 1.5 Pro, so its tone and behaviour are similar
Limitations
- Noticeably weaker reasoning than 1.5 Pro and the Gemini 2.x family
- Superseded across nearly all tasks by 2.0 Flash and 2.5 Flash
- No thinking mode
Use cases
- High-volume RAG and summarisation at low cost
- Embedded LLM features (classification, extraction)
- Long-document Q&A where cost matters more than latency
- Legacy Vertex AI deployments
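For bulk summarisation or long-document Q&A, a pipeline usually has to decide how many documents fit into one request. The sketch below is one way to plan that, not a description of any Google tooling: it greedily packs documents into batches that stay under the 1,000,000-token window quoted in the specs. The 4-characters-per-token estimate, the `reserve_tokens` headroom for instructions and output, and the function names are all illustrative assumptions; a real pipeline would count tokens with the model's own tokenizer.

```python
CONTEXT_WINDOW_TOKENS = 1_000_000  # context window quoted in this article
CHARS_PER_TOKEN = 4                # rough assumption, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def plan_batches(docs, reserve_tokens=8_000, limit=CONTEXT_WINDOW_TOKENS):
    """Greedily group documents so each batch fits within limit - reserve_tokens.

    reserve_tokens is hypothetical headroom for the prompt and the answer.
    """
    budget = limit - reserve_tokens
    batches, current, used = [], [], 0
    for doc in docs:
        need = estimate_tokens(doc)
        if need > budget:
            raise ValueError("document exceeds the budget; split it first")
        # Start a new batch when adding this document would overflow the budget.
        if current and used + need > budget:
            batches.append(current)
            current, used = [], 0
        current.append(doc)
        used += need
    if current:
        batches.append(current)
    return batches
```

Greedy packing keeps documents in their original order, which is convenient for summarisation, but it does not minimise the number of requests.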
Benchmarks
| Benchmark | Score | As of |
|---|---|---|
| MMLU | ≈78.9% | 2024-05 |
| Video-MME | ≈62% | 2024-05 |
| MATH | ≈54% | 2024-05 |
Frequently asked questions
What is Gemini 1.5 Flash?
Gemini 1.5 Flash is Google's May 2024 small, fast model. It is a distilled version of Gemini 1.5 Pro that keeps the 1M-token context at much lower cost and latency, and became the default low-cost Gemini tier for 2024.
Is Gemini 1.5 Flash still worth using?
For new projects, Gemini 2.0 Flash or 2.5 Flash is a clear upgrade at similar or lower cost. Gemini 1.5 Flash remains popular in legacy Vertex deployments and for workloads tuned to its specific behaviour.
How much does Gemini 1.5 Flash cost?
As of April 2026, Gemini 1.5 Flash is priced at roughly USD 0.075 per million input tokens and USD 0.30 per million output tokens — one of the cheapest 1M-context models available.
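To make those rates concrete, here is a minimal sketch of the arithmetic. It assumes the two rates quoted in this article apply as flat per-token prices to every request; actual billing may differ by prompt length, modality, or platform, so treat the result as a rough estimate. The function name is illustrative.

```python
# Rates quoted in this article (USD per million tokens, as of 2026-04-20).
INPUT_USD_PER_MTOK = 0.075
OUTPUT_USD_PER_MTOK = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, assuming flat per-token rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# A full 1M-token prompt with a 2,000-token answer.
print(f"${estimate_cost_usd(1_000_000, 2_000):.4f}")  # prints $0.0756
```

At these rates the prompt dominates the bill: filling the whole context window costs about 7.5 cents, while the 2,000-token answer adds well under a tenth of a cent.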
Does Gemini 1.5 Flash support video?
Yes. All Gemini 1.5 models — Pro, Flash, and the smaller 8B variant — support video input, leveraging the 1M-token context to reason across long clips.
Sources
- Google — Gemini 1.5 Flash — accessed 2026-04-20
- Google Cloud — Gemini pricing — accessed 2026-04-20