OpenVLA

OpenVLA is an open-source 7B-parameter vision-language-action model from Stanford, Google DeepMind, Toyota Research Institute, and collaborators. It fine-tunes a Prismatic VLM (DINOv2 + SigLIP vision encoders with a Llama 2 7B backbone) on 970k robot trajectories from the Open X-Embodiment dataset and outputs discretised 7-DoF actions as tokens. Weights and training code are fully open under an MIT licence, making it the go-to baseline for academic robotics research on generalist policies.
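The "discretised actions" scheme can be sketched numerically: each of the 7 action dimensions is clipped to a per-dimension range and mapped into 256 uniform bins, and the language model emits one bin token per dimension. The 256-bin count follows the paper; the symmetric [-1, 1] ranges below are illustrative, not OpenVLA's actual per-dimension statistics.

```python
import numpy as np

# Illustrative sketch of per-dimension action tokenisation:
# 7-DoF continuous actions -> 256 uniform bins -> bin-centre decode.
N_BINS = 256

def discretise(action, low, high):
    """Map continuous actions to integer bin indices in [0, N_BINS - 1]."""
    action = np.clip(action, low, high)
    frac = (action - low) / (high - low)
    return np.minimum((frac * N_BINS).astype(int), N_BINS - 1)

def undiscretise(bins, low, high):
    """Recover the bin-centre continuous action from token indices."""
    return low + (bins + 0.5) / N_BINS * (high - low)

# Hypothetical ranges for the 7 dims (x, y, z, roll, pitch, yaw, gripper).
low, high = np.full(7, -1.0), np.full(7, 1.0)
a = np.array([0.1, -0.3, 0.0, 0.5, -1.0, 1.0, 0.0])
tokens = discretise(a, low, high)
recovered = undiscretise(tokens, low, high)
# Round-trip error is bounded by one bin width per dimension.
assert np.all(np.abs(recovered - a) <= (high - low) / N_BINS)
```

The coarseness of this binning is exactly the limitation noted below: decoded actions are bin centres, so the policy cannot produce motion smoother than one bin width.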

Model specs

Vendor
Stanford / Google DeepMind / TRI (collaboration)
Family
OpenVLA
Released
2024-06
Context window
2,048 tokens
Modalities
text, vision, actions

Strengths

  • Fully open weights + training code under permissive licence
  • Outperforms RT-2-X on common benchmarks despite being smaller
  • Efficient LoRA adaptation to new embodiments
  • Active research community and downstream forks
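The LoRA adaptation mentioned above comes down to freezing each pretrained weight matrix W and training only a low-rank correction B·A. A minimal numerical sketch of that idea, with illustrative dimensions and scaling that are not OpenVLA's actual hyper-parameters:

```python
import numpy as np

# LoRA sketch: adapt a frozen weight W (d_out x d_in) with a trainable
# rank-r correction B @ A, where r << min(d_out, d_in).
rng = np.random.default_rng(0)
d_out, d_in, r = 64, 64, 4

W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero init

def forward(x, alpha=8.0):
    """Adapted layer: W x + (alpha / r) * B A x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B initialised to zero, the adapter starts as an exact no-op,
# so fine-tuning begins from the pretrained behaviour.
assert np.allclose(forward(x), W @ x)

# Trainable parameter count: r * (d_in + d_out) vs d_in * d_out for full tuning.
full, lora = d_in * d_out, r * (d_in + d_out)
print(f"trainable params: {lora} vs {full} ({lora / full:.1%})")
```

This parameter ratio is why adapting OpenVLA to a new embodiment is far cheaper than retraining the full 7B model.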

Limitations

  • Action space is tokenised — not as smooth as flow-matching policies like π0
  • Single-arm table-top focus — locomotion and bimanual manipulation are less well covered
  • Requires 24 GB+ GPU for full-precision inference
  • Needs task-specific fine-tuning for most production use

Use cases

  • Open baseline for generalist manipulation research
  • LoRA fine-tuning to new robots with small demonstration sets
  • Comparative evaluation of VLA design choices
  • Teaching and benchmarking embodied-AI curricula

Benchmarks

Benchmark | Score | As of
BridgeV2 / RT-2-X evaluation suite | ≈16 pp higher success than RT-2-X on average | 2024-06
LoRA fine-tune to new embodiment | matches from-scratch RT-2 with ~1% of compute | 2024-06

Frequently asked questions

What is OpenVLA?

OpenVLA is a 7B-parameter open-source vision-language-action model trained on the Open X-Embodiment robot dataset, released by Stanford, Google DeepMind, Toyota Research, and collaborators.

Why use OpenVLA instead of RT-2?

RT-2 weights are closed. OpenVLA is fully open, reproducible, and on common benchmarks slightly outperforms RT-2-X while being easier to fine-tune with LoRA.

What licence is OpenVLA under?

Weights and code are released under an MIT-style licence that allows research and commercial use, subject to the underlying data terms.

Sources

  1. OpenVLA project site — accessed 2026-04-20
  2. OpenVLA paper (arXiv) — accessed 2026-04-20