Capability · Framework — orchestration

Outlines

Outlines takes a different tack from retry-based parsers: instead of prompting the model for JSON and validating after the fact, it masks invalid tokens at every decoding step, so only structurally valid continuations can be sampled. The output is therefore well-formed and on-schema by construction. It works with local models (vLLM, Transformers, llama.cpp) as well as hosted APIs, and is a staple for agents that need reliable tool calls or extraction at scale.
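The core idea can be sketched in a few lines: before each sampling step, set the logits of every token the grammar forbids to negative infinity. The toy vocabulary and one-token-per-state "grammar" below are invented for illustration; the real library compiles schemas and regexes into finite-state machines over the tokenizer's vocabulary.

```python
import math

# Toy constrained decoding (not the Outlines internals): a tiny "grammar"
# that only permits the five-token sequence {"a":1}
VOCAB = ['{', '"a"', ':', '1', '}', 'oops']
ALLOWED_NEXT = {0: {0}, 1: {1}, 2: {2}, 3: {3}, 4: {4}}  # state -> legal ids

def constrained_greedy(logits_per_step):
    out, state = [], 0
    for logits in logits_per_step:
        # Mask every token the grammar forbids in the current state
        masked = [l if i in ALLOWED_NEXT[state] else -math.inf
                  for i, l in enumerate(logits)]
        tok = max(range(len(masked)), key=masked.__getitem__)
        out.append(VOCAB[tok])
        state += 1
    return ''.join(out)

# Even if the raw model strongly prefers the invalid token 'oops' (id 5),
# masking guarantees well-formed output.
fake_logits = [[0.1, 0.0, 0.0, 0.0, 0.0, 9.0]] * 5
print(constrained_greedy(fake_logits))  # {"a":1}
```

The key property: no retry loop exists because an invalid token can never be sampled in the first place.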

Framework facts

Category
orchestration
Language
Python
License
Apache 2.0
Repository
https://github.com/dottxt-ai/outlines

Install

pip install outlines

Quickstart

import outlines
from pydantic import BaseModel

# Pydantic schema that defines the shape of the structured output
class Character(BaseModel):
    name: str
    age: int
    hobbies: list[str]

# Load a local model through Transformers (downloads weights on first run)
model = outlines.models.transformers('meta-llama/Llama-3.1-8B-Instruct')

# Build a generator whose sampling is constrained to valid Character JSON
generator = outlines.generate.json(model, Character)
result = generator('Describe a fictional wizard')
print(result)  # a Character instance, not a raw string
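Because sampling is constrained, whatever the generator emits parses against the schema. That contract can be exercised without loading a model; the JSON payload below is invented for illustration:

```python
from pydantic import BaseModel

class Character(BaseModel):
    name: str
    age: int
    hobbies: list[str]

# Any string a Character-constrained generator could emit validates cleanly
payload = '{"name": "Elandra", "age": 143, "hobbies": ["alchemy", "chess"]}'
wizard = Character.model_validate_json(payload)
print(wizard.age)  # 143
```

This is the practical payoff: downstream code consumes typed objects and never needs a malformed-JSON error path.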

Alternatives

  • Instructor — retry-based, API-only
  • Guidance — Microsoft's constrained generation library
  • SGLang — includes constrained generation primitives
  • BAML — schema-first DSL

Frequently asked questions

Does constrained decoding hurt quality?

In practice, no: when the schema matches what the model 'wants' to output, quality is preserved or even improves, because the model cannot waste probability mass on invalid structure. Overly restrictive schemas can hurt, though; keep free-text fields for narrative content.
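One way to apply that advice, sketched with a hypothetical schema: constrain the fields where structure genuinely helps, and leave narrative content as a plain string rather than forcing it into an enum or a fixed pattern.

```python
from typing import Literal
from pydantic import BaseModel

class Review(BaseModel):
    # Tightly constrained where structure genuinely helps...
    sentiment: Literal['positive', 'neutral', 'negative']
    score: int
    # ...but free-form where the model needs room to write
    summary: str

r = Review(sentiment='positive', score=4,
           summary='Fast, reliable, and the docs are excellent.')
print(r.sentiment)  # positive
```

The `summary` field still benefits from structural guarantees (it is always a JSON string in the right place) without restricting what the model can say.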

Can I use Outlines with the Anthropic API?

Only partially. With hosted APIs, structured output generally falls back to prompt-based JSON generation; true token-level constraints require logit access, which in practice means open weights. For Claude, rely on Anthropic's native tool use plus schema validation.
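A common shape for that pattern is to declare the schema as a Claude tool and then validate the returned tool input on the client. The validation half can be sketched offline; the `Extraction` schema and the payload below are hypothetical stand-ins for a real `tool_use` block's `input` dict.

```python
from pydantic import BaseModel, ValidationError

class Extraction(BaseModel):
    title: str
    year: int

# Pretend this dict came back as the `input` of a Claude tool_use block
tool_input = {"title": "Dune", "year": 1965}

try:
    record = Extraction.model_validate(tool_input)
    print(record.title, record.year)
except ValidationError as exc:
    # Prompt-level JSON can drift off-schema; validate, then retry on failure
    print('validation failed, retrying:', exc)
```

Unlike token-level constraints, this approach needs the retry branch, but the validation step keeps the failure mode explicit and typed.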

Sources

  1. Outlines — docs — accessed 2026-04-20
  2. Outlines on GitHub — accessed 2026-04-20