LLM-based agents are increasingly deployed in safety-critical applications, yet current guardrail systems fail to prevent violations of temporal safety policies — requirements that govern the ordering and sequencing of agent actions. For instance, agents may access sensitive data before authenticating users or process refunds to unauthorized payment methods.
We present Agent-C, a novel framework that provides run-time guarantees ensuring LLM agents adhere to formal temporal safety properties. Agent-C introduces a domain-specific language for expressing temporal properties, translates specifications to first-order logic, and uses SMT solving to detect non-compliant agent actions during token generation. When the LLM attempts to generate a non-compliant tool call, Agent-C leverages constrained generation techniques to ensure compliance and generate a safe alternative.
Existing guardrails for LLM agents rely on imprecise natural language instructions or post-hoc monitoring. They lack formal guarantees and cannot reason about sequences of actions. This leads to real safety failures:
get_order_details without first authenticating the user via find_user_id_by_email, leaking sensitive order information. State-of-the-art guardrails like DynaGuard fail to enforce this temporal ordering.
Agent-C operates through three key components:
Before, After, Forall, Exists, and Seq that express temporal ordering requirements among tool calls and invariants across calls.
Agent-C is evaluated on the τ-Bench suite across retail and airline scenarios with both open-weight (Qwen3-8B/14B/32B) and closed-source (Claude Sonnet 4.5, GPT-5) models.
| Model | Agent | Conformance (Benign) | Conformance (Adversarial) | Harm |
|---|---|---|---|---|
| Qwen3-32B (Retail) |
Agent-C | 100.00% | 100.00% | 0.00% |
| AgentSpec | 84.06% | 98.04% | 1.96% | |
| DynaGuard | 77.10% | 66.67% | 19.61% | |
| Unrestricted | 37.69% | 70.59% | 13.73% | |
| Claude Sonnet 4.5 (Retail) |
Agent-C | 100.00% | 100.00% | 0.00% |
| Unrestricted | 77.43% | 82.35% | 17.65% | |
| GPT-5 (Retail) |
Agent-C | 100.00% | 100.00% | 0.00% |
| Unrestricted | 83.72% | 85.29% | 8.82% |
Agent-C achieves 100% conformance and 0% harm across all models, benchmarks, and settings — including adversarial scenarios. It also improves task utility compared to existing guardrails and unrestricted agents, e.g., from 77.4% to 100% conformance for Claude Sonnet 4.5 and from 83.7% to 100% for GPT-5 on retail benchmarks.
An Agent-C specification for the authentication-before-access policy:
before(
get_order_details(order_id=order_id), True,
f:find_user_id_by_email(email=.*),
output(f) != "Error: user not found" &&
state(order_belongs_to(order_id)) == output(f) &&
state(exists_order(order_id)) == true
)
This states that before calling get_order_details with any order ID, the agent must have previously called find_user_id_by_email, the authentication must have succeeded, and the order must belong to the authenticated user.
@article{kamath2025agentc,
title={Enforcing Temporal Constraints for LLM Agents},
author={Kamath, Adharsh and Zhang, Sishen and Xu, Calvin and Ugare, Shubham and Singh, Gagandeep and Misailovic, Sasa},
journal={arXiv preprint arXiv:2503.04562},
year={2025}
}