Milvus

Milvus is an open-source vector database built to scale to billion-vector workloads, and it is used in production at large AI platforms. Its cloud-native architecture separates storage from compute, and it supports HNSW, IVF, DiskANN, and GPU indexes. Zilliz Cloud is the managed version.

Framework facts

Category: rag
Language: Go / C++ / Python SDK
License: Apache-2.0
Repository: https://github.com/milvus-io/milvus

Install

# Python SDK
pip install pymilvus
# Standalone (single-node) server via Docker Compose
wget https://github.com/milvus-io/milvus/releases/download/v2.4.0/milvus-standalone-docker-compose.yml -O docker-compose.yml
docker compose up -d

Quickstart

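from pymilvus import MilvusClient

# Connect to the standalone server on the default port
client = MilvusClient('http://localhost:19530')
# Quick-setup collection: an 'id' primary key plus a 1536-dimensional 'vector' field
client.create_collection('docs', dimension=1536)
# 'text' is not a schema field; quick setup stores it as a dynamic field
client.insert('docs', [{'id': 1, 'vector': [0.1]*1536, 'text': 'hi'}])
# Fetch the three nearest neighbours and return the stored text with each hit
hits = client.search('docs', data=[[0.1]*1536], limit=3, output_fields=['text'])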

Alternatives

  • Qdrant — Rust-based, simpler ops
  • Weaviate — GraphQL-native
  • Vespa — full search engine with tensors
  • pgvector — Postgres extension

Frequently asked questions

When is Milvus the right choice?

When you're heading toward hundreds of millions of vectors, need GPU indexing, or run in Kubernetes and want storage/compute separation. For smaller workloads Qdrant or pgvector are simpler.
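To put "hundreds of millions of vectors" in perspective, the sketch below estimates the raw storage for float32 embeddings alone. The helper name raw_vector_bytes is hypothetical, and the estimate deliberately ignores index overhead, metadata, and replication, so real deployments will need more.

```python
# Rough sizing sketch: raw float32 vector payload only.
# Assumption: 4 bytes per value (float32); index structures, scalar
# fields, and replicas are NOT included.

GIB = 1024 ** 3


def raw_vector_bytes(num_vectors: int, dimension: int, bytes_per_value: int = 4) -> int:
    """Return the bytes needed to hold the vectors themselves."""
    return num_vectors * dimension * bytes_per_value


if __name__ == "__main__":
    for n in (1_000_000, 100_000_000, 1_000_000_000):
        size = raw_vector_bytes(n, 1536)
        print(f"{n:>13,} vectors x 1536 dims: {size / GIB:,.1f} GiB")
```

At 1536 dimensions, a million vectors take under 6 GiB and fit comfortably on one machine, while 100 million already need several hundred GiB before any index is built. That gap is roughly where distributed storage and disk-based or GPU indexes start to matter.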

Does Milvus support hybrid search?

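Yes. Milvus 2.4+ supports hybrid retrieval that searches dense and sparse vectors (for example SPLADE or BM25-style) side by side, then fuses the per-field result lists with reciprocal rank fusion (RRF) or a weighted sum of scores.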
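As an illustration of what rank fusion does, here is a minimal pure-Python sketch of reciprocal rank fusion. It is not Milvus's implementation: the function name rrf_fuse, the sample ID lists, and the smoothing constant k=60 are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, limit=10):
    """Fuse ranked ID lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over every list it appears in,
    with rank starting at 1. k=60 is an assumed smoothing constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for deterministic output.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]


# Hypothetical result lists: IDs ordered best-first by each retriever.
dense_hits = [3, 1, 2]
sparse_hits = [1, 4, 3]

fused = rrf_fuse([dense_hits, sparse_hits], limit=3)
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Because RRF uses only rank positions, it needs no score normalization between the dense and sparse retrievers. A weighted sum works on the raw scores instead, so it depends on the two score scales being comparable.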
