Tantivy

Tantivy is a high-performance, embeddable full-text search library written in Rust, often described as "Lucene for Rust". It powers Quickwit and many bespoke search pipelines, and is increasingly used in RAG stacks as the lexical (BM25) half of a hybrid retriever alongside a vector database. Python bindings (tantivy-py) make it usable from typical LLM toolchains.

Framework facts

Category: rag
Language: Rust (Python bindings)
License: MIT
Repository: https://github.com/quickwit-oss/tantivy

Install

# Rust
cargo add tantivy
# Python
pip install tantivy

Quickstart

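import tantivy

# Declare the schema: two stored text fields.
schema_builder = tantivy.SchemaBuilder()
schema_builder.add_text_field('title', stored=True)
schema_builder.add_text_field('body', stored=True)
schema = schema_builder.build()

# With no path argument the index lives in memory.
index = tantivy.Index(schema)
writer = index.writer()
writer.add_document(tantivy.Document(title='MCP', body='Model Context Protocol'))
writer.commit()

# Reload so that new searchers see the committed document.
index.reload()

searcher = index.searcher()
# Search both fields by default when the query names no field.
query = index.parse_query('mcp', ['title', 'body'])
# Each hit is a (score, doc_address) pair; the score is ignored here.
for _, doc_addr in searcher.search(query, 10).hits:
    print(searcher.doc(doc_addr))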
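
Each hit in the quickstart is a (score, address) pair, and the score comes from BM25 ranking. To give a feel for what that number represents, here is a minimal pure-Python sketch of the textbook BM25 formula. It is an illustration only, not Tantivy's internal code: the function name `bm25_scores`, the toy documents, and the parameter values k1 = 1.2 and b = 0.75 (common defaults) are assumptions made for this example, and it expects a non-empty list of pre-tokenized documents.

```python
import math

K1 = 1.2   # term-frequency saturation (assumed common default)
B = 0.75   # strength of document-length normalization (assumed common default)


def bm25_scores(query_terms, docs):
    """Score each tokenized document in `docs` against `query_terms`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    scores = []
    for doc in docs:
        score = 0.0
        for term in query_terms:
            tf = doc.count(term)          # occurrences of the term in this doc
            if tf == 0:
                continue
            df = sum(1 for d in docs if term in d)   # docs containing the term
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Longer-than-average documents are penalized.
            norm = 1 - B + B * len(doc) / avg_len
            score += idf * tf * (K1 + 1) / (tf + K1 * norm)
        scores.append(score)
    return scores


# Toy corpus, already tokenized and lowercased (hypothetical data).
docs = [
    "model context protocol".split(),
    "protocol buffers encode structured data".split(),
    "rust search library".split(),
]
scores = bm25_scores(["protocol", "context"], docs)
```

The first document matches both query terms and is short, so it scores highest; the second matches one term; the third matches none and scores zero.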

Alternatives

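  • Meilisearch: standalone search server with a REST API and typo-tolerant defaults
  • Elasticsearch: battle-tested distributed search cluster, operationally heavy
  • Lucene: the JVM search library that Tantivy is modeled on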

Frequently asked questions

Is Tantivy a database?

No. It is a library you embed in your application, not a standalone server. Quickwit is the log-search product built on Tantivy; for general-purpose full-text search, use Meilisearch or Elasticsearch unless you specifically want an embedded engine.

Can I use Tantivy for vector search?

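Not natively. Tantivy is a lexical engine (an inverted index with BM25 scoring) and, as of this writing, does not ship an approximate nearest neighbor (ANN) index; its `columnar` crate is column-oriented storage for fast fields, not vector search. Pair it with a dedicated vector DB (Qdrant, LanceDB) and use Tantivy for BM25 in a hybrid retriever.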
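
One common way to combine the two halves of a hybrid retriever is reciprocal rank fusion (RRF), which merges ranked lists using only rank positions, so BM25 scores and vector similarities never need to share a scale. The sketch below is one possible approach, not something Tantivy or tantivy-py provides: the function name `reciprocal_rank_fusion`, the document ids, and the constant k = 60 (a conventional choice) are all assumptions for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank) for every id it contains.
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical ids: in practice the first list would come from Tantivy hits
# and the second from a vector DB query.
bm25_ranking = ["doc_a", "doc_b", "doc_c"]
vector_ranking = ["doc_b", "doc_d", "doc_a"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
print(fused)  # ['doc_b', 'doc_a', 'doc_d', 'doc_c']
```

Documents that appear in both lists rise to the top, while documents found by only one retriever are kept but ranked lower.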

Sources

  1. Tantivy GitHub — accessed 2026-04-20
  2. tantivy-py docs — accessed 2026-04-20