Capability · Framework — orchestration

Guidance

Guidance predates the current structured-output wave and introduced many ideas now standard across the ecosystem. It lets you interleave Python code, prompt templates, and output constraints in a single program — forcing specific tokens, regex matches, or schemas at chosen points. It shines with local models (Llama, Mistral, Phi) where token-level access enables strong guarantees that pure prompt-based approaches can't match.
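To make the interleaving idea concrete, here is a toy, self-contained sketch of the programming model: a "program" alternates literal text with constrained slots, and a fill is rejected unless it satisfies its constraint. This is an illustration only, not Guidance's API or implementation; `run_program` and the segment tuples are invented for this sketch.

```python
import re

def run_program(segments, fills):
    """Toy interleaved program (NOT Guidance's API).

    segments: list of ("text", str) or ("slot", name, constraint),
      where constraint is a predicate or a regex string.
    fills: dict mapping slot name to a candidate value.
    Returns (rendered_text, captured_values).
    """
    out = []
    captured = {}
    for seg in segments:
        if seg[0] == "text":
            out.append(seg[1])
        else:
            _, name, constraint = seg
            value = fills[name]
            if callable(constraint):
                ok = constraint(value)
            else:  # treat the constraint as a regex that must match fully
                ok = re.fullmatch(constraint, value) is not None
            if not ok:
                raise ValueError(f"slot {name!r} violates its constraint")
            captured[name] = value
            out.append(value)
    return "".join(out), captured

# Mirrors the quickstart: a choice constraint, then a regex constraint.
program = [
    ("text", "Choose a mood: "),
    ("slot", "mood", lambda v: v in {"happy", "sad", "excited"}),
    ("text", "\nAge: "),
    ("slot", "age", r"[0-9]{1,3}"),
]

text, caps = run_program(program, {"mood": "happy", "age": "42"})
```

In real Guidance the fills come from the model's own decoding, with disallowed tokens masked out at each step rather than rejected after the fact; the sketch only shows the shape of a program that mixes fixed text with constrained holes.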

Framework facts

Category
orchestration
Language
Python
License
MIT
Repository
https://github.com/guidance-ai/guidance

Install

pip install guidance

Quickstart

from guidance import models, gen, select

# Load a local model; constrained decoding needs token-level access
lm = models.Transformers('meta-llama/Llama-3.1-8B-Instruct')
# select() forces the output to be exactly one of the listed options
lm += 'Choose a mood: ' + select(['happy', 'sad', 'excited'], name='mood')
# gen() produces free text, capped at 30 tokens, captured as 'reason'
lm += '\nJustify in one sentence: ' + gen(name='reason', max_tokens=30)
print(lm['mood'], lm['reason'])

Alternatives

  • Outlines — more actively developed, similar philosophy
  • SGLang — runtime-level constrained generation
  • LMQL — research-oriented prompt programming language
  • Instructor — retry-based alternative for API models

Frequently asked questions

Guidance or Outlines?

Outlines has broader adoption and more active development in 2026. Guidance still works well and has unique features like the interleaved template syntax, but Outlines is the safer bet for new projects. Both cover the structured-generation core.

Does it work with Claude or GPT?

Limited — true constrained decoding needs logit access, which hosted APIs don't expose. Guidance has API adapters but they fall back to prompt engineering. For hosted APIs, use Instructor or BAML.
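To see why logit access matters, here is a toy greedy decoder (a hypothetical sketch, not Guidance internals; `constrained_greedy` and `fake_logits` are invented names). At each step it masks out every token that would break the target regex, then picks the best remaining token. A hosted chat API returns only final text, so this per-step masking cannot be done against it.

```python
import re

def constrained_greedy(logits_fn, pattern, max_steps=3):
    """Greedy decoding over a toy vocabulary, masked by a regex.

    logits_fn(prefix) returns a dict of token -> score. At each step,
    drop every token whose addition would not fully match `pattern`,
    then append the highest-scoring survivor.
    """
    out = ""
    for _ in range(max_steps):
        scores = logits_fn(out)
        allowed = {t: s for t, s in scores.items()
                   if re.fullmatch(pattern, out + t)}
        if not allowed:  # nothing can extend the match; stop
            break
        out += max(allowed, key=allowed.get)
    return out

def fake_logits(prefix):
    # The model "prefers" the token 'cat', but the regex mask will
    # exclude it, so decoding falls through to the best digit.
    return {"cat": 5.0, "7": 3.0, "1": 2.0, "!": 1.0}

result = constrained_greedy(fake_logits, r"[0-9]{1,3}")
```

Even though `cat` has the highest score at every step, the mask restricts each choice to tokens consistent with `[0-9]{1,3}`, so the output is guaranteed to match the pattern. With a hosted API you would instead have to accept whatever text comes back and retry on validation failure, which is exactly the Instructor/BAML approach.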

Sources

  1. Guidance on GitHub — accessed 2026-04-20