Curiosity · AI Model

Gemini 1.5 Flash

Gemini 1.5 Flash is Google's May 2024 small model — a distilled sibling of 1.5 Pro that kept the 1M-token context but slashed cost and latency. It became the default cheap-and-long model for RAG, summarisation, and embedded features across Vertex AI during 2024 before 2.0 Flash replaced it.

Model specs

Vendor: Google
Family: Gemini 1.5
Released: 2024-05
Context window: 1,000,000 tokens
Modalities: text, vision, audio, video, code
Input price: $0.075/M tok
Output price: $0.3/M tok
Pricing as of: 2026-04-20

Strengths

First sub-$0.10/M input-token tier with 1M context
Multimodal — handles text, images, audio, video at launch
Great fit for bulk pipelines on Vertex AI
Distilled from 1.5 Pro, so behaviour is similar in tone

Limitations

Noticeably weaker reasoning than 1.5 Pro and the Gemini 2.x family
Superseded across nearly all tasks by 2.0 Flash and 2.5 Flash
No thinking mode

Use cases

High-volume RAG and summarisation at low cost
Embedded LLM features (classification, extraction)
Long-document Q&A where cost dominates latency
Legacy Vertex AI deployments

Benchmarks

Benchmark	Score	As of
MMLU	≈78.9%	2024-05
Video-MME	≈62%	2024-05
MATH	≈54%	2024-05

Frequently asked questions

What is Gemini 1.5 Flash?

Gemini 1.5 Flash is Google's May 2024 small, fast model. It is a distilled version of Gemini 1.5 Pro that keeps the 1M-token context at much lower cost and latency, and became the default low-cost Gemini tier for 2024.

Is Gemini 1.5 Flash still worth using?

For new projects, Gemini 2.0 Flash or 2.5 Flash is a clear upgrade at similar or lower cost. Gemini 1.5 Flash remains popular in legacy Vertex deployments and for workloads tuned to its specific behaviour.

How much does Gemini 1.5 Flash cost?

As of April 2026, Gemini 1.5 Flash is priced at roughly USD 0.075 per million input tokens and USD 0.30 per million output tokens — one of the cheapest 1M-context models available.

Does Gemini 1.5 Flash support video?

Yes. All Gemini 1.5 models — Pro, Flash, and the smaller 8B variant — support video input, leveraging the 1M-token context to reason across long clips.

Sources

Google — Gemini 1.5 Flash — accessed 2026-04-20
Google Cloud — Gemini pricing — accessed 2026-04-20