Opik

Opik captures every LLM call as a structured trace, lets you annotate and compare them, and ships LLM-as-judge evaluation suites out of the box. It's designed for engineering teams who want LangSmith-style visibility without being locked into the LangChain ecosystem.

Framework facts

Category
observability
Language
Python / TypeScript
License
Apache-2.0
Repository
https://github.com/comet-ml/opik

Install

pip install opik
opik configure  # choose cloud or self-host

Quickstart

from opik import track
import anthropic

@track  # records this call as a structured trace in Opik
def ask(question: str) -> str:
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    r = client.messages.create(
        model='claude-opus-4-7',  # substitute any model id available to you
        max_tokens=256,
        messages=[{'role': 'user', 'content': question}],
    )
    return r.content[0].text

print(ask('What is VSET?'))

Alternatives

  • Langfuse — open-source, analytics and prompt-management focus
  • LangSmith — LangChain-native, managed service
  • Arize Phoenix — open-source, OpenTelemetry-based
  • Helicone — proxy-based integration

Frequently asked questions

Opik vs Langfuse — which should I pick?

Both are strong open-source observability platforms. Opik leans into evaluation (LLM-as-judge, datasets, experiments); Langfuse leans into analytics and prompt management. Try both on a weekend.
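The evaluation workflow mentioned above follows a common shape: run a task over dataset items, score each output with a judge, and aggregate. The sketch below is a conceptual stand-in, not Opik's API; the dataset, task, and keyword-based judge are all invented for illustration (a real judge would itself be an LLM call).

```python
# Conceptual LLM-as-judge evaluation loop (illustrative, not Opik's API).
dataset = [
    {"question": "What does @track do?", "expected_keyword": "trace"},
    {"question": "Is Opik self-hostable?", "expected_keyword": "yes"},
]


def task(item: dict) -> str:
    # Stand-in for a real model call, so the example runs offline.
    canned = {
        "What does @track do?": "It records each call as a trace.",
        "Is Opik self-hostable?": "Yes, via Docker.",
    }
    return canned[item["question"]]


def judge(item: dict, output: str) -> float:
    # Stand-in judge: 1.0 if the expected keyword appears, else 0.0.
    # A real LLM-as-judge metric would prompt a model to grade the output.
    return 1.0 if item["expected_keyword"] in output.lower() else 0.0


scores = [judge(item, task(item)) for item in dataset]
print(sum(scores) / len(scores))  # mean score across the dataset
```

Opik wraps this loop with persistence: datasets, per-item traces, and experiment comparison across runs.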

Is Opik only for Comet users?

No. It's fully standalone and self-hostable. Comet offers a managed tier for teams that prefer SaaS.

Sources

  1. Opik — GitHub — accessed 2026-04-20
  2. Opik — docs — accessed 2026-04-20