Capability · Framework — orchestration

NVIDIA NeMo Guardrails

NeMo Guardrails lets you specify conversation rules in Colang — a purpose-built DSL — and enforces them around any LLM. It intercepts user input, model output, and tool calls to block off-topic chats, unsafe responses, and jailbreaks. It integrates with LangChain, LlamaIndex, and NVIDIA NIM microservices.

Framework facts

Category
orchestration
Language
Python
License
Apache-2.0
Repository
https://github.com/NVIDIA/NeMo-Guardrails

Install

pip install nemoguardrails

Quickstart

from nemoguardrails import LLMRails, RailsConfig

config = RailsConfig.from_path('./config')  # contains config.yml + *.co files
rails = LLMRails(config)
response = rails.generate(messages=[{'role': 'user', 'content': 'Tell me about VSET.'}])
print(response['content'])
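The ./config directory above needs at least a config.yml naming the main model. A minimal sketch (engine and model name are assumptions, not prescribed by the project):

```yaml
# config/config.yml — minimal sketch; engine and model are illustrative choices
models:
  - type: main
    engine: openai
    model: gpt-4o-mini
```

Colang files (*.co) placed alongside it define the actual rails; without any, the config runs the model unguarded.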

Alternatives

  • Guardrails AI — Python-first with RAIL spec
  • LLM Guard — focused on prompt injection / PII
  • Llama Guard — Meta's open safety model
  • Amazon Bedrock Guardrails

Frequently asked questions

What is Colang?

A purpose-built DSL with Python-like indentation for describing conversational flows, canonical forms of user and bot messages, and rails. It lets non-programmers specify policies that the runtime enforces against any LLM.
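As an illustration, a small Colang 1.0 flow might pair a canonical user form with a fixed bot response; the message texts and names here are hypothetical:

```
# rails/off_topic.co — illustrative Colang 1.0 sketch
define user ask off topic
  "What's your favorite football team?"
  "Can you write my homework essay?"

define bot refuse off topic
  "Sorry, I can only help with questions about this product."

define flow off topic
  user ask off topic
  bot refuse off topic
```

At runtime, user input is matched to the nearest canonical form ("ask off topic"), and the flow dictates the bot's next move.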

Does NeMo Guardrails slow responses down?

Yes, slightly: each active rail can trigger extra LLM calls for classification and canonical-form matching, so every request pays some overhead. Routing those checks to a fast model (e.g. Mixtral-8x7B or Claude Haiku) while a larger model handles generation can keep the added latency to a few hundred milliseconds.

Sources

  1. NeMo Guardrails — GitHub — accessed 2026-04-20
  2. NeMo Guardrails — docs — accessed 2026-04-20