NVIDIA NeMo Guardrails
NeMo Guardrails lets you specify conversation rules in Colang — a purpose-built DSL — and enforces them around any LLM. It intercepts user input, model output, and tool calls to block off-topic chats, unsafe responses, and jailbreaks. It integrates with LangChain, LlamaIndex, and NVIDIA NIM microservices.
Framework facts
- Category
- orchestration
- Language
- Python
- License
- Apache-2.0
- Repository
- https://github.com/NVIDIA/NeMo-Guardrails
Install
pip install nemoguardrails
Quickstart
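The quickstart below loads its rails from a `./config` directory, which holds a `config.yml` naming the main model plus one or more `.co` Colang files. A minimal sketch of `config.yml`, assuming an OpenAI main model (the engine and model name are placeholders, not a recommendation):

```yaml
# config/config.yml — minimal sketch; swap in your own engine/model
models:
  - type: main
    engine: openai
    model: gpt-4o
```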
from nemoguardrails import LLMRails, RailsConfig
config = RailsConfig.from_path('./config') # contains config.yml + *.co files
rails = LLMRails(config)
response = rails.generate(messages=[{'role': 'user', 'content': 'Tell me about VSET.'}])
print(response['content'])
Alternatives
- Guardrails AI — Python-first with RAIL spec
- LLM Guard — focused on prompt injection / PII
- Llama Guard — Meta's open safety model
- Amazon Bedrock Guardrails — AWS's managed guardrail service
Frequently asked questions
What is Colang?
A purpose-built modeling language (shipped in `.co` files alongside a YAML config) for describing conversational flows, canonical forms for user and bot messages, and the rails themselves. Its readable, near-natural-language syntax lets non-programmers specify policies that the runtime then enforces against any LLM.
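The description above can be made concrete with a minimal Colang 1.0 sketch; the canonical form names and example utterances here are invented for illustration:

```
define user ask about product
  "what is VSET?"
  "tell me about the product"

define bot respond about product
  "VSET is our product. What would you like to know?"

define flow
  user ask about product
  bot respond about product
```

At runtime, a user message is matched to its canonical form ("user ask about product"), and the flow dictates which bot response is allowed next.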
Does NeMo Guardrails slow responses down?
Yes, modestly: input and output rails typically issue extra LLM calls for classification, so each guarded turn adds at least one additional round trip. Pairing the rails with a small, fast model for those checks (e.g. Mixtral-8x7B or Claude Haiku) keeps the added latency low, typically in the tens to hundreds of milliseconds depending on the model and the number of rails.
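A quick way to check this overhead yourself is to time a guarded call end to end. The sketch below uses a stub in place of a real `LLMRails` object so it runs offline; in practice you would pass the `rails` instance from the quickstart (`StubRails` and its 10 ms sleep are hypothetical stand-ins):

```python
import time


class StubRails:
    """Hypothetical stand-in for nemoguardrails.LLMRails, so the
    timing harness runs without network access or an API key."""

    def generate(self, messages):
        time.sleep(0.01)  # simulate model + guard-rail latency
        return {"role": "assistant", "content": "ok"}


def timed_generate(rails, messages):
    """Return (response, elapsed_seconds) for one guarded call."""
    start = time.perf_counter()
    response = rails.generate(messages=messages)
    return response, time.perf_counter() - start


response, elapsed = timed_generate(
    StubRails(), [{"role": "user", "content": "hi"}]
)
print(f"guarded call took {elapsed * 1000:.1f} ms")
```

Comparing this figure with and without rails configured isolates the guard-rail overhead from base model latency.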
Sources
- NeMo Guardrails — GitHub — accessed 2026-04-20
- NeMo Guardrails — docs — accessed 2026-04-20