Capability · Framework — observability

Datadog LLM Observability

Datadog LLM Observability sits inside the Datadog APM platform. It captures every LLM call as a span, visualises token and cost usage, runs out-of-the-box quality evaluations (toxicity, PII leakage, hallucination), and joins LLM traces to the rest of your distributed trace so you can debug end-to-end. The `ddtrace` SDK auto-instruments OpenAI, Anthropic, LangChain, LlamaIndex, Bedrock, and more.

Framework facts

Category
observability
Language
Python / Node.js / Java / Go
License
Commercial SaaS

Install

pip install ddtrace
export DD_API_KEY=...
export DD_LLMOBS_ENABLED=1
export DD_LLMOBS_ML_APP=my-llm-app

Quickstart

from ddtrace.llmobs import LLMObs

# Enable programmatically (or rely on the DD_LLMOBS_* env vars set above)
LLMObs.enable(ml_app='my-llm-app', api_key='...')

from openai import OpenAI

# The OpenAI SDK is auto-instrumented once LLMObs is enabled
OpenAI().chat.completions.create(
    model='gpt-4o-mini',
    messages=[{'role': 'user', 'content': 'hi'}],
)
# traces appear in the Datadog LLM Observability UI

Alternatives

  • Arize Phoenix — open-source
  • LangSmith — LangChain-native
  • New Relic AI Monitoring — direct competitor
  • Dynatrace / Splunk O11y — APM peers adding LLM features

Frequently asked questions

Does it work without LangChain?

Yes — `ddtrace` has auto-instrumentation for the major provider SDKs (OpenAI, Anthropic, Bedrock, Vertex) and a `@workflow`/`@llm` decorator for custom code.

How is pricing calculated?

Datadog charges per LLM span indexed, with a monthly free tier. Check the current pricing page for rates; span volumes in production can grow quickly, so estimate costs before a full rollout.
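A back-of-envelope estimator for that per-span model. All rates below are placeholders supplied by the caller, not Datadog's actual prices; plug in the figures from the current pricing page:

```python
def monthly_llmobs_cost(spans_per_day: int,
                        price_per_1k_spans: float,
                        free_spans_per_month: int = 0) -> float:
    """Estimate monthly spend on indexed LLM spans.

    All rates are caller-supplied placeholders; nothing here encodes
    Datadog's real prices.
    """
    monthly_spans = spans_per_day * 30          # rough 30-day month
    billable = max(0, monthly_spans - free_spans_per_month)
    return billable / 1000 * price_per_1k_spans


# Made-up numbers: 50k spans/day, $1.00 per 1k spans, 100k free spans.
print(monthly_llmobs_cost(50_000, 1.00, 100_000))  # 1400.0
```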

Sources

  1. Datadog LLM Observability docs — accessed 2026-04-20
  2. ddtrace GitHub — accessed 2026-04-20