
AI for Automated Runbook Execution in DevOps

On-call engineers run the same 10 runbooks most weeks, often at 3 a.m.: restart a pod, rotate a secret, drain a node, clear a queue. Agentic LLMs with tool use can execute these runbooks under supervision by dry-running destructive commands, diffing state before and after, and surfacing a human-readable summary. The most robust pattern is declarative: runbooks are written in YAML, and the agent acts as the executor, with hard approval gates on high-blast-radius actions.
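A minimal sketch of the declarative pattern, in Python. The runbook is shown as the dict a YAML loader would produce. The field names (`steps`, `blast_radius`), the `Executor` class, and the `approve` callback are illustrative assumptions, not an established schema. No real tool is called. The sketch only shows where the dry-run and the approval gate sit.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical runbook, as it might look after parsing a YAML file.
RUNBOOK = {
    "name": "drain-node",
    "params": {"node": "worker-7"},
    "steps": [
        {"action": "cordon", "blast_radius": "low"},
        {"action": "drain", "blast_radius": "high"},
    ],
}


@dataclass
class StepResult:
    action: str
    status: str  # "dry-run", "executed", or "blocked"


@dataclass
class Executor:
    # approve stands in for a real human approval channel (chat, pager).
    approve: Callable[[str], bool]

    def run(self, runbook: dict, live: bool = False) -> list:
        results = []
        for step in runbook["steps"]:
            action = step["action"]
            if not live:
                # Dry-run: record what would happen, change nothing.
                results.append(StepResult(action, "dry-run"))
            elif step["blast_radius"] == "high" and not self.approve(action):
                results.append(StepResult(action, "blocked"))
                break  # later steps may depend on the blocked one
            else:
                # A real executor would call a tool server here.
                results.append(StepResult(action, "executed"))
        return results


executor = Executor(approve=lambda action: False)  # approver declines everything
plan = executor.run(RUNBOOK)                # both steps reported as "dry-run"
outcome = executor.run(RUNBOOK, live=True)  # "cordon" executed, "drain" blocked
```

Defaulting `live` to `False` means a caller has to opt in explicitly before anything changes.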

Application facts

Domain: Developer & DevOps
Subdomain: Incident response
Example stack: Claude Sonnet 4.7 as the runbook-executor agent · Runbooks in YAML or CUE with typed parameters · kubectl + Terraform + Ansible via MCP tool servers · Temporal.io for durable workflow orchestration · Sigstore for signed action provenance

Data & infrastructure needs

Declarative runbook library (versioned, reviewed)
Service topology and dependency graph
On-call rotation and escalation policies
Incident history for learning loops
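One use of the dependency graph is to estimate which services an action could affect before it runs. The sketch below assumes the topology is available as a mapping from each service to the services that depend on it. The service names and the `affected_services` helper are hypothetical.

```python
from collections import deque

# Hypothetical topology: service -> services that depend on it.
DEPENDENTS = {
    "postgres": ["orders", "billing"],
    "orders": ["checkout"],
    "billing": ["checkout"],
    "checkout": [],
}


def affected_services(graph: dict, start: str) -> set:
    """Return every service reachable from start via dependent edges."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for dependent in graph.get(node, []):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


impact = affected_services(DEPENDENTS, "postgres")
```

An executor could compare the size of `impact` against a threshold to decide whether a step needs human approval.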

Risks & considerations

Prompt injection from log fields influencing tool calls
Blast-radius escalation — unintended mass-scale changes
Drift between documented runbook and actual system state
SOC 2 / ISO 27001 audit failures if action provenance is weak
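The first two risks can be reduced by validating every parameter against a strict schema and capping how many objects a single action may touch. The following is a sketch under stated assumptions: the parameter names, the patterns (modelled loosely on Kubernetes naming rules), and the `MAX_TARGETS` value are illustrative, not taken from any specific tool.

```python
import re

# Hypothetical schema: each parameter value must fully match its pattern.
PARAM_PATTERNS = {
    "namespace": re.compile(r"[a-z0-9]([-a-z0-9]{0,61}[a-z0-9])?"),
    "pod": re.compile(r"[a-z0-9]([-a-z0-9.]{0,251}[a-z0-9])?"),
}
MAX_TARGETS = 5  # assumed cap on objects one action may touch


def validate_params(params: dict) -> dict:
    """Reject unknown keys and values that do not match the schema."""
    for key, value in params.items():
        pattern = PARAM_PATTERNS.get(key)
        if pattern is None:
            raise ValueError(f"unknown parameter: {key}")
        if not isinstance(value, str) or not pattern.fullmatch(value):
            raise ValueError(f"invalid value for {key}: {value!r}")
    return params


def check_blast_radius(targets: list) -> None:
    """Refuse actions whose target count exceeds the cap."""
    if len(targets) > MAX_TARGETS:
        raise ValueError(f"{len(targets)} targets exceeds cap of {MAX_TARGETS}")
```

With this schema, a value lifted from a log line such as `web; kubectl delete ns prod` fails validation because it contains spaces and a semicolon, so it never reaches a tool call.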

Frequently asked questions

Is it safe to let an LLM execute runbooks?

Only with strict guardrails: typed parameters, dry-run for destructive actions, human approval for irreversible operations, signed action logs, and a kill switch. The LLM should choose among approved runbooks, not invent commands.
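To illustrate the signed action log, here is a sketch of a tamper-evident log in which each entry's MAC covers its own content and the previous entry's MAC. HMAC with a shared key stands in for the signing step purely for illustration. This is not Sigstore's API, and the `SECRET` value, field names, and helper functions are assumptions.

```python
import hashlib
import hmac
import json

SECRET = b"demo-key"  # placeholder; a real key would come from a secret store


def _mac(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def append_entry(log: list, action: str, actor: str) -> dict:
    """Append an entry whose MAC covers its content and the previous MAC."""
    prev = log[-1]["mac"] if log else ""
    body = {"action": action, "actor": actor, "prev": prev}
    entry = dict(body, mac=_mac(body))
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Check every MAC and every link to the preceding entry."""
    prev = ""
    for entry in log:
        body = {k: entry[k] for k in ("action", "actor", "prev")}
        if entry["prev"] != prev:
            return False
        if not hmac.compare_digest(entry["mac"], _mac(body)):
            return False
        prev = entry["mac"]
    return True
```

Because each entry is chained to the one before it, editing or deleting an earlier action invalidates verification for the whole log, which is the property an auditor needs from action provenance.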

What model is best for agentic runbook execution?

Claude Sonnet 4.7 leads on reliable tool use and instruction following for operational workflows as of April 2026, and GPT-5 is competitive. Avoid smaller models in production systems: reasoning quality is decisive when a wrong action causes customer impact.

What are the regulatory considerations for automated runbooks?

SOC 2, ISO 27001, and ISAE 3402 are relevant for audit trails. For banking, financial services, and insurance (BFSI) in India, the RBI IT Framework and SEBI CSCRF apply. HIPAA applies to systems that touch PHI. The EU AI Act may classify high-blast-radius automation of critical infrastructure as high-risk.

Sources

NIST SP 800-61r3 incident handling — accessed 2026-04-20
Sigstore project — accessed 2026-04-20