Changed a prompt. Agent behaved differently in production. Which line?
Version, bundle, and ship AI agent behavior like software.
Prompts, policies, and model config define your agent — but most teams deploy them without versioning, testing, or rollback. PromptOps versions every prompt automatically. ReleaseOps bundles them with policies, promotes through gated environments, and traces exactly why behavior changed.
One workflow, two tools
How it works
Version your prompts
Write prompts as YAML templates with variables. PromptOps auto-versions them on every git commit — semantic tags, diff tracking, and version history out of the box.
Reference any version in code: :v1.2.0, :latest, or even :unstaged for testing uncommitted changes.
id: support-system
description: Customer support agent
variables:
  customer_name: { required: true }
  request: { required: true }
template: |
  You are a support agent for Acme Corp.

  REFUND POLICY:
  - Auto-approve refunds up to $200
  - Escalate refunds over $200
  - Never approve if customer is abusive
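Under the hood, rendering a template like this is required-variable validation plus substitution. A minimal sketch using only the standard library (the `render` helper, the dict layout, and the `$var` syntax are illustrative assumptions, not the PromptOps API):

```python
from string import Template

# Illustrative stand-in for a parsed prompt file (not the real format).
PROMPT = {
    "id": "support-system",
    "variables": {
        "customer_name": {"required": True},
        "request": {"required": True},
    },
    "template": (
        "You are a support agent for Acme Corp.\n"
        "Customer: $customer_name\n"
        "Request: $request\n"
    ),
}

def render(prompt: dict, **values: str) -> str:
    # Reject a render call that omits any required variable.
    missing = [name for name, spec in prompt["variables"].items()
               if spec.get("required") and name not in values]
    if missing:
        raise ValueError(f"missing required variables: {missing}")
    return Template(prompt["template"]).substitute(values)

print(render(PROMPT, customer_name="Dana", request="refund for order 1234"))
```

Failing fast on a missing required variable is the point: a template that silently renders with a blank slot is exactly the kind of behavior drift this workflow is meant to catch.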
Bundle and promote
ReleaseOps reads your versioned prompt via PromptBridge, bundles it with tool policies and model config into an immutable, SHA-256 content-addressed artifact.
Promote through environments with eval gates. Rollback instantly. Every action recorded in an audit trail.
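Content addressing means the artifact's identity is derived from its contents, so two identical bundles always resolve to the same address and any change produces a new one. A sketch of the idea with the standard library (the bundle layout here is an assumption; ReleaseOps' actual artifact format may differ):

```python
import hashlib
import json

# Hypothetical bundle: prompts, tool policies, and model config together.
bundle = {
    "prompts": {"support-system": "v1.1.0"},
    "policies": {"refund_tool": {"max_amount": 200}},
    "model_config": {"model": "claude-sonnet-4-5", "temperature": 0.2},
}

def content_address(bundle: dict) -> str:
    # Canonical serialization (sorted keys, no whitespace) so the same
    # content always hashes to the same address, on any machine.
    canonical = json.dumps(bundle, sort_keys=True, separators=(",", ":"))
    return "sha256:" + hashlib.sha256(canonical.encode()).hexdigest()

print(content_address(bundle))
```

Because the address is a pure function of content, rollback is just re-pointing an environment at a previous address, and the audit trail can record exactly which bytes were live at any time.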
from llmhq_releaseops.runtime import RuntimeLoader
loader = RuntimeLoader()
bundle, metadata = loader.load_bundle("support-agent@prod")
# Everything resolved and verified
model = bundle.model_config.model # claude-sonnet-4-5
prompts = bundle.prompts # versioned refs
policies = bundle.policies # tool access rules
# Metadata auto-injected into OTel spans
Know why behavior changed
When behavior shifts between versions, attribution traces each agent action back to the specific prompt lines and policy rules that influenced it.
Pattern matching with confidence scoring, not causal claims: it points engineers to the right place to start investigating.
# Why did v1.0.0 ESCALATE the $120 refund?
Primary influence (confidence: 0.70):
Source: prompt (support-system@v1.0.0)
Line 15: "Escalate any refund over $50"
# Why did v1.1.0 APPROVE it?
Primary influence (confidence: 0.70):
Source: prompt (support-system@v1.1.0)
Line 13: "Auto-approve up to $200"
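The shape of this kind of attribution can be sketched in a few lines: score each prompt line against the recorded rationale for an action and report the best match with its similarity as a confidence. This is a simplified stand-in (the `attribute` helper and the use of `difflib` are illustrative assumptions, not the actual tracer):

```python
from difflib import SequenceMatcher

# Prompt lines from the support-system example above.
prompt_lines = [
    "You are a support agent for Acme Corp.",
    "Auto-approve refunds up to $200",
    "Escalate refunds over $200",
    "Never approve if customer is abusive",
]

def attribute(action_rationale: str, lines: list[str]) -> dict:
    # Similarity between the rationale and each prompt line; the best
    # match becomes the primary influence, its ratio the confidence.
    scored = [
        (SequenceMatcher(None, action_rationale.lower(), line.lower()).ratio(),
         line_no, line)
        for line_no, line in enumerate(lines, start=1)
    ]
    confidence, line_no, text = max(scored)
    return {"line": line_no, "text": text, "confidence": round(confidence, 2)}

print(attribute("auto-approve refund, amount up to the $200 limit", prompt_lines))
```

The real tool presumably does much more than string similarity, but the output contract is the same: a pointer to a line plus a confidence score, which is a hint for investigation rather than a proof of causation.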
The key moment: one line changes everything
See the full workflow
The interactive demo runs both tools end-to-end with a real scenario: five customer requests, two prompt versions, one behavioral divergence. No API keys needed.
Get started in seconds
Install both, or start with either — they work standalone or together.
pip install llmhq-promptops llmhq-releaseops