Curiosity · Concept

Semantic Chunking

Semantic chunking, popularised by Greg Kamradt and now built into LangChain and LlamaIndex, replaces arbitrary character or token cutoffs with embedding-driven topic boundaries. The pipeline embeds each sentence, computes similarity between neighbouring sentences (or sliding windows), and cuts at similarity troughs. The result is chunks that correspond to coherent ideas rather than fixed-length blocks — useful for long-form articles, lectures, and any content where topic shifts are meaningful but not syntactically marked.
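The pipeline above can be sketched in a few lines. This is a minimal illustration, not the LangChain or LlamaIndex implementation: `embed` is a placeholder for any sentence-embedding function you supply, and the fixed `threshold` is an assumption (real implementations often pick breakpoints adaptively).

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_chunks(sentences, embed, threshold=0.75):
    """Greedily grow a chunk; start a new one where neighbouring-sentence
    similarity drops below `threshold` (a similarity trough)."""
    vecs = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for prev_vec, cur_vec, sent in zip(vecs, vecs[1:], sentences[1:]):
        if cosine(prev_vec, cur_vec) < threshold:
            chunks.append(" ".join(current))  # topic boundary: close the chunk
            current = [sent]
        else:
            current.append(sent)
    chunks.append(" ".join(current))
    return chunks
```

In practice `embed` would call a real embedding model; comparing sliding windows of several sentences instead of single sentences smooths out noisy similarity scores.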

Quick reference

Proficiency
Intermediate
Also known as
embedding-based chunking
Prerequisites
Chunking strategies, Embeddings

Frequently asked questions

What is semantic chunking?

Semantic chunking is a chunking strategy that places split points where embedding similarity between consecutive sentences or groups of sentences drops below a threshold — letting chunks end naturally at topic boundaries.

Why choose semantic chunking over fixed-size?

It preserves topical coherence inside each chunk, which often improves retrieval and makes retrieved chunks more self-contained. It helps most on long-form content where topics shift without obvious structural markers.

What are the downsides?

Higher indexing cost (every sentence must be embedded), unpredictable chunk sizes, and threshold tuning that is sensitive to the choice of embedding model. It also can't rescue badly structured source material.
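One common response to the threshold-sensitivity problem is to pick breakpoints relative to the document's own similarity distribution rather than using an absolute cutoff. The sketch below is one such heuristic, assuming a list of neighbouring-sentence similarities has already been computed:

```python
import numpy as np

def percentile_breakpoints(similarities, pct=10):
    """Return indices whose similarity falls in the lowest `pct` percent
    of the observed values, so the cutoff adapts to whatever similarity
    range a given embedding model produces."""
    cutoff = np.percentile(similarities, pct)
    return [i for i, s in enumerate(similarities) if s <= cutoff]
```

Because the cutoff is relative, switching embedding models shifts the absolute similarity scale without invalidating the percentile setting, though `pct` itself still needs tuning per corpus.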

How is it different from structural chunking?

Structural chunking splits on syntactic markers (paragraphs, headings, lists); it needs no model. Semantic chunking splits on meaning shifts, which is more robust for unstructured text but requires embeddings at indexing time.
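For contrast, a minimal structural splitter needs no model at all; this sketch simply cuts on blank lines (paragraph breaks):

```python
def structural_chunks(text):
    """Split on paragraph breaks (blank lines) — purely syntactic,
    no embeddings required."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]
```

This is fast and deterministic, but it inherits the source's structure: prose with no blank lines or headings yields one giant chunk, which is exactly the case where semantic chunking helps.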

Sources

  1. Kamradt — 5 Levels of Text Splitting for Retrieval — accessed 2026-04-20
  2. Qu et al. — Is Semantic Chunking Worth the Computational Cost? — accessed 2026-04-20