Milvus
Milvus is an open-source vector database designed for large-scale similarity search. Its cloud-native architecture separates storage from compute so that each can scale independently. It supports HNSW, IVF, DiskANN, and GPU indexes, and it handles the billion-vector workloads run by large AI platforms. Zilliz Cloud is the managed version.
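As a rough illustration of how the index families named above differ in their build-time settings, the sketch below collects one preset per family. The parameter names (`M`, `efConstruction`, `nlist`) follow the Milvus index documentation; the numeric values and the `INDEX_PRESETS` name are assumptions for illustration, not tuned recommendations.

```python
# Illustrative build-time presets; the values are assumptions, not tuned recommendations.
INDEX_PRESETS = {
    # Graph index: M = links per node, efConstruction = build-time candidate list size
    "HNSW": {"index_type": "HNSW", "metric_type": "L2",
             "params": {"M": 16, "efConstruction": 200}},
    # Inverted-file index: nlist = number of clusters the vectors are partitioned into
    "IVF_FLAT": {"index_type": "IVF_FLAT", "metric_type": "L2",
                 "params": {"nlist": 1024}},
    # Disk-based index: no required build parameters
    "DISKANN": {"index_type": "DISKANN", "metric_type": "L2", "params": {}},
    # GPU variant of IVF_FLAT; requires a GPU-enabled Milvus build
    "GPU_IVF_FLAT": {"index_type": "GPU_IVF_FLAT", "metric_type": "L2",
                     "params": {"nlist": 1024}},
}

for name, preset in INDEX_PRESETS.items():
    print(name, preset["params"])
```

In pymilvus, settings of this shape are typically passed to `add_index` on the object returned by `prepare_index_params()`; check the documentation for the version you run before relying on a specific parameter.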
Framework facts
- Category: rag
- Language: Go / C++ / Python SDK
- License: Apache-2.0
- Repository: https://github.com/milvus-io/milvus
Install
pip install pymilvus
# single-node docker
wget https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
docker compose up -d
Quickstart
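from pymilvus import MilvusClient

# Connect to the standalone server on the default port
client = MilvusClient('http://localhost:19530')
# Quick-setup collection: an 'id' primary key plus a 1536-dimensional 'vector' field
client.create_collection('docs', dimension=1536)
client.insert('docs', [{'id': 1, 'vector': [0.1]*1536, 'text': 'hi'}])
# Return the 3 nearest neighbours, including the stored 'text' field
hits = client.search('docs', data=[[0.1]*1536], limit=3, output_fields=['text'])
Alternatives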
- Qdrant — Rust-based, simpler ops
- Weaviate — GraphQL-native
- Vespa — full search engine with tensors
- pgvector — Postgres extension
Frequently asked questions
When is Milvus the right choice?
When you're heading toward hundreds of millions of vectors, need GPU indexing, or run in Kubernetes and want storage/compute separation. For smaller workloads, Qdrant or pgvector is simpler to operate.
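To get a feel for what "hundreds of millions of vectors" means in storage terms, here is a back-of-the-envelope sketch. It assumes float32 values (4 bytes each) and 1536-dimensional vectors, matching the Quickstart, and counts only the raw vectors: index structures, replicas, and metadata add to the total. The function name `raw_vector_bytes` is hypothetical.

```python
def raw_vector_bytes(num_vectors, dim, bytes_per_value=4):
    """Bytes needed for the raw vectors alone (float32 by default)."""
    return num_vectors * dim * bytes_per_value

for n in (1_000_000, 100_000_000, 1_000_000_000):
    gb = raw_vector_bytes(n, 1536) / 1e9  # decimal gigabytes
    print(f"{n:>13,} vectors x 1536 dims = {gb:,.1f} GB")
```

At one million vectors the raw data is about 6 GB and fits on a single machine; at one hundred million it is about 614 GB, which is roughly where a distributed deployment starts to pay for its operational cost.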
Does Milvus support hybrid search?
Yes. Milvus 2.4+ supports hybrid dense+sparse retrieval with SPLADE or BM25-style sparse vectors fused via RRF or weighted sum.
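To show what RRF fusion does with a dense and a sparse result list, here is a minimal pure-Python sketch of reciprocal rank fusion. It is not Milvus's internal implementation: the function name `rrf_fuse`, the 1-based ranks, the constant `k=60`, and the document IDs are all assumptions for illustration.

```python
from collections import defaultdict

def rrf_fuse(rankings, k=60):
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an ID's score."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the order is deterministic
    return sorted(scores, key=lambda d: (-scores[d], d))

dense_hits = [3, 1, 2]   # hypothetical IDs ranked by dense-vector similarity
sparse_hits = [1, 4, 3]  # hypothetical IDs ranked by sparse (BM25-style) score
fused = rrf_fuse([dense_hits, sparse_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, and because only ranks are used, the dense and sparse scores never need to be on the same scale. In pymilvus the two fusion strategies are exposed as `RRFRanker` and `WeightedRanker`, used together with `AnnSearchRequest` objects in a hybrid search call.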
Sources
- Milvus — GitHub — accessed 2026-04-20
- Milvus — docs — accessed 2026-04-20