Capability · Framework — orchestration

Guidance

Guidance predates the current structured-output wave and introduced many ideas now standard across the ecosystem. It lets you interleave Python code, prompt templates, and output constraints in a single program — forcing specific tokens, regex matches, or schemas at chosen points. It shines with local models (Llama, Mistral, Phi) where token-level access enables strong guarantees that pure prompt-based approaches can't match.
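To make the interleaving idea concrete, here is a toy, self-contained sketch of the programming model: a "program" alternates literal text with constrained slots, and a fill is rejected unless it satisfies its constraint. This is an illustration only, not Guidance's API or implementation; `run_program` and the segment tuples are invented for this sketch.

```python
import re

def run_program(segments, fills):
    """Toy interleaved program (NOT Guidance's API).

    segments: list of ("text", str) or ("slot", name, constraint),
      where constraint is a predicate or a regex string.
    fills: dict mapping slot name to a candidate value.
    Returns (rendered_text, captured_values).
    """
    out = []
    captured = {}
    for seg in segments:
        if seg[0] == "text":
            out.append(seg[1])
        else:
            _, name, constraint = seg
            value = fills[name]
            if callable(constraint):
                ok = constraint(value)
            else:  # treat the constraint as a regex that must match fully
                ok = re.fullmatch(constraint, value) is not None
            if not ok:
                raise ValueError(f"slot {name!r} violates its constraint")
            captured[name] = value
            out.append(value)
    return "".join(out), captured

# Mirrors the quickstart: a choice constraint, then a regex constraint.
program = [
    ("text", "Choose a mood: "),
    ("slot", "mood", lambda v: v in {"happy", "sad", "excited"}),
    ("text", "\nAge: "),
    ("slot", "age", r"[0-9]{1,3}"),
]

text, caps = run_program(program, {"mood": "happy", "age": "42"})
```

In real Guidance the fills come from the model's own decoding, with disallowed tokens masked out at each step rather than rejected after the fact; the sketch only shows the shape of a program that mixes fixed text with constrained holes.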

Framework facts

Category
orchestration
Language
Python
License
MIT
Repository
https://github.com/guidance-ai/guidance

Install

pip install guidance

Quickstart

from guidance import models, gen, select

# Load a local model; constrained decoding needs token-level access
lm = models.Transformers('meta-llama/Llama-3.1-8B-Instruct')
# select() forces the output to be exactly one of the listed options
lm += 'Choose a mood: ' + select(['happy', 'sad', 'excited'], name='mood')
# gen() produces free text, capped at 30 tokens, captured as 'reason'
lm += '\nJustify in one sentence: ' + gen(name='reason', max_tokens=30)
print(lm['mood'], lm['reason'])

Alternatives

  • Outlines — more actively developed, similar philosophy
  • SGLang — runtime-level constrained generation
  • LMQL — research-oriented prompt programming language
  • Instructor — retry-based alternative for API models

Frequently asked questions

Guidance or Outlines?

Outlines has broader adoption and more active development in 2026. Guidance still works well and has unique features like the interleaved template syntax, but Outlines is the safer bet for new projects. Both cover the structured-generation core.

Does it work with Claude or GPT?

Limited — true constrained decoding needs logit access, which hosted APIs don't expose. Guidance has API adapters but they fall back to prompt engineering. For hosted APIs, use Instructor or BAML.
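To see why logit access matters, here is a toy greedy decoder (a hypothetical sketch, not Guidance internals; `constrained_greedy` and `fake_logits` are invented names). At each step it masks out every token that would break the target regex, then picks the best remaining token. A hosted chat API returns only final text, so this per-step masking cannot be done against it.

```python
import re

def constrained_greedy(logits_fn, pattern, max_steps=3):
    """Greedy decoding over a toy vocabulary, masked by a regex.

    logits_fn(prefix) returns a dict of token -> score. At each step,
    drop every token whose addition would not fully match `pattern`,
    then append the highest-scoring survivor.
    """
    out = ""
    for _ in range(max_steps):
        scores = logits_fn(out)
        allowed = {t: s for t, s in scores.items()
                   if re.fullmatch(pattern, out + t)}
        if not allowed:  # nothing can extend the match; stop
            break
        out += max(allowed, key=allowed.get)
    return out

def fake_logits(prefix):
    # The model "prefers" the token 'cat', but the regex mask will
    # exclude it, so decoding falls through to the best digit.
    return {"cat": 5.0, "7": 3.0, "1": 2.0, "!": 1.0}

result = constrained_greedy(fake_logits, r"[0-9]{1,3}")
```

Even though `cat` has the highest score at every step, the mask restricts each choice to tokens consistent with `[0-9]{1,3}`, so the output is guaranteed to match the pattern. With a hosted API you would instead have to accept whatever text comes back and retry on validation failure, which is exactly the Instructor/BAML approach.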

Sources

  1. Guidance on GitHub — accessed 2026-04-20