The MCP Sampling Pattern

Sampling inverts the usual MCP direction: instead of the client calling the server, the server asks the client to run a completion using whatever LLM the user has configured. This lets a tool reason over its own data ('summarize this PDF I just fetched', say) without bundling API keys or models, while keeping the user in control of cost and privacy.
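
To make the inversion concrete, here is a sketch of the JSON-RPC request a server sends to the client. The field names follow the MCP sampling spec; the prompt text, id, and token numbers are invented for illustration.

```typescript
// Sketch of a server-initiated sampling/createMessage request.
// The client receives this, applies consent and cost policy, picks a model,
// runs the completion, and returns the result to the server.
const samplingRequest = {
  jsonrpc: "2.0",
  id: 42,
  method: "sampling/createMessage",
  params: {
    messages: [
      {
        role: "user",
        content: { type: "text", text: "Summarize this PDF excerpt: ..." },
      },
    ],
    // Preferences are advisory: the client makes the final model choice.
    modelPreferences: { intelligencePriority: 0.8, costPriority: 0.3 },
    systemPrompt: "You are a concise summarizer.",
    maxTokens: 400, // server-declared ceiling; the client may cap it further
  },
};
```

Note that the server never sees an API key or a model name it can rely on; it only states preferences and receives text back.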

MCP facts

  • Kind: pattern
  • Ecosystem: anthropic-mcp
  • Transports: stdio, sse, http

Capabilities

  • sampling/createMessage — server-initiated RPC asking the client to run a completion
  • Client enforces consent, cost caps, and model choice for every request
  • Enables 'smart' MCP servers that reason over fetched data without bundling a model
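
The second capability above is where the client does its work. A minimal sketch of the client-side gate, assuming invented names (`gateSamplingRequest`, `perServerTokenCap`) rather than any real client's API:

```typescript
// Illustrative client-side policy check for an incoming sampling request.
// A real client would also surface a consent UI and select the model here.
interface SamplingParams {
  maxTokens: number;
}

function gateSamplingRequest(
  serverId: string,
  params: SamplingParams,
  consentedServers: Set<string>,
  perServerTokenCap: number,
): { allowed: boolean; reason?: string } {
  // 1. Consent: the user must have approved sampling for this server.
  if (!consentedServers.has(serverId)) {
    return { allowed: false, reason: "no user consent for this server" };
  }
  // 2. Cost cap: reject (or clamp) requests over the per-server budget.
  if (params.maxTokens > perServerTokenCap) {
    return { allowed: false, reason: "maxTokens exceeds per-server cap" };
  }
  // 3. Model choice stays with the client; server preferences are advisory.
  return { allowed: true };
}
```

With this shape, the client is the single enforcement point for spend and consent, regardless of how many servers request completions.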

Frequently asked questions

Why not just have the server call an LLM directly?

Because then the user pays twice, manages two sets of keys, and loses auditability. Sampling lets the user's client be the single point of LLM spend and consent.

Do all clients support sampling?

Support is growing but not universal. Claude Desktop, recent versions of Cursor, and Zed support it; older or simpler clients may reject sampling requests outright. Design your server to fall back gracefully.
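
One way to fall back gracefully is to branch on whether the client advertised the sampling capability during the initialize handshake. The sketch below stubs the LLM path and uses an invented extractive fallback; the function names are assumptions, not SDK calls.

```typescript
// Sketch: branch on the client's declared capabilities before relying on
// sampling. In a real server the capabilities come from the initialize result.
interface ClientCapabilities {
  sampling?: object; // present only if the client supports sampling/createMessage
}

function summarize(text: string, caps: ClientCapabilities): string {
  if (caps.sampling !== undefined) {
    // Client supports sampling: issue sampling/createMessage and return the
    // model's summary. Stubbed here for the sketch.
    return `[LLM summary of ${text.length} chars]`;
  }
  // Graceful fallback: no LLM available, so return a cheap extractive
  // summary (first 25 words) instead of failing the tool call.
  return text.split(/\s+/).slice(0, 25).join(" ");
}
```

The important property is that the tool still returns something useful when sampling is unavailable, rather than erroring out on simpler clients.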

Is sampling a security risk?

Yes, if mishandled. A malicious server can try to abuse the user's LLM budget. Clients typically show a consent UI the first time a server requests sampling and enforce per-server token caps.

What prompts make sense for sampling?

Short, tool-internal reasoning steps: summarize this chunk, classify this ticket, extract fields from this email. Avoid using sampling as a back door to run unbounded, user-facing tasks.

Sources

  1. MCP Sampling spec — accessed 2026-04-20
  2. MCP docs — accessed 2026-04-20