Runtime safety for AI tool execution

Ship AI agents you can trust in production.

Thorngate validates every AI tool call against versioned contracts, prevents duplicate executions, and gives you deterministic replay for failures — without replacing your existing stack.

Works with

LangChain, LangGraph, OpenAI Agents, MCP, Vercel AI SDK, Python SDK, TypeScript SDK, OpenTelemetry

thorngate — runtime

# malformed tool call intercepted before execution
TOOL_CALL send_email execution=tg_8f2a9c1d
  input.to → "all-customers@acme.com"
    ✗ scope violation
  input.subject → null
    ✗ required field missing
  input.body → "[object Object]"
    ✗ expected string, got object
STATUS: BLOCKED contract=v2.4.1
  ✓ tool call never executed
  ✓ deterministic replay available
  ✓ full execution trace captured

Production failures

The failures AI observability tools don't prevent.

Most production incidents don't come from the model. They happen when agents call real systems with invalid arguments, duplicate retries, or outdated schemas.

01 / MALFORMED ARGUMENTS

Malformed tool arguments

LLMs generate structurally invalid payloads that still reach production APIs.

→ Wrong JSON shape sent to billing API. Charge fails silently.

02 / DUPLICATE EXECUTION

Duplicate actions on retry

The same refund, email, or database write executes twice after transient failures.

→ Customer refunded 3× because retry logic re-triggered a destructive tool.

03 / SCHEMA DRIFT

Silent API drift

Your tool changes shape. Your agent still calls the old contract.

→ Required field changed in production. Agents kept executing broken calls for hours.

04 / UNREPLAYABLE INCIDENTS

Incidents you can't reconstruct

You cannot reproduce the exact tool calls, retries, and decisions that caused the failure.

→ Agent retried 6× before failing. You can't tell which retry caused the duplicate charge.

How it works

A runtime safety layer between your agents and production systems.

Thorngate intercepts every tool call, validates it against contracts, enforces execution policies, and captures replayable traces — then passes the call through.

STEP 01

Wrap existing tools

WORKS WITH YOUR EXISTING STACK

Add Thorngate around your existing tool calls. No orchestration migration. No framework rewrite. No proxy layer.

from thorngate import tool

# send_email is your existing tool function, unchanged
safe_send_email = tool(send_email)
# that's it

STEP 02

Validate before execution

INPUT + OUTPUT VALIDATION

Every tool call's input is checked against a versioned contract before execution, and its output is checked after. This catches hallucinated arguments, missing required fields, malformed payloads, and schema drift.

# contract: send_email@v2.4.1
input.to:      string  required, scope=!broadcast
input.subject: string  required
input.body:    string  required
# → violations block execution before sending
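To make the idea concrete, here is a minimal sketch of pre-execution validation in plain Python. It is not Thorngate's implementation or API. The `CONTRACT` dict layout, the `BROADCAST_ADDRESSES` set, and the `validate` and `guarded_call` helpers are all hypothetical names chosen to mirror the send_email contract above.

```python
from typing import Any, Callable

# Hypothetical contract mirroring send_email@v2.4.1; the dict layout and
# the broadcast list are assumptions made for this sketch.
CONTRACT = {
    "to": {"type": str, "required": True, "no_broadcast": True},
    "subject": {"type": str, "required": True},
    "body": {"type": str, "required": True},
}
BROADCAST_ADDRESSES = {"all-customers@acme.com"}


def validate(args: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the call may run."""
    violations = []
    for field, rule in CONTRACT.items():
        value = args.get(field)
        if value is None:
            if rule["required"]:
                violations.append(f"input.{field}: required field missing")
            continue
        if not isinstance(value, rule["type"]):
            violations.append(
                f"input.{field}: expected {rule['type'].__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if rule.get("no_broadcast") and value in BROADCAST_ADDRESSES:
            violations.append(f"input.{field}: scope violation")
    return violations


def guarded_call(tool: Callable[..., Any], args: dict[str, Any]) -> Any:
    """Run the tool only if the arguments satisfy the contract."""
    violations = validate(args)
    if violations:
        # The tool is never invoked when any check fails.
        raise ValueError("; ".join(violations))
    return tool(**args)
```

The key design point is that validation returns every violation at once and the tool function is not called until the list is empty, so a blocked call has no side effects.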

STEP 03

Prevent duplicate executions

RETRY-SAFE EXECUTION

Thorngate automatically tracks execution state and idempotency. Retries no longer create duplicate refunds, emails, or writes.
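The general technique behind retry-safe execution is an idempotency key. The sketch below shows that technique in plain Python; it is an illustration under stated assumptions, not how Thorngate works internally. The `run_once` and `idempotency_key` helpers, the `execution_id` parameter, and the in-memory store are all hypothetical.

```python
import hashlib
import json
from typing import Any, Callable

# In-memory store for illustration only; surviving crashes and restarts
# would require durable storage.
_results: dict[str, Any] = {}


def idempotency_key(execution_id: str, tool_name: str, args: dict) -> str:
    """Derive a stable key from the execution, the tool, and its arguments."""
    payload = json.dumps(
        {"execution": execution_id, "tool": tool_name, "args": args},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def run_once(execution_id: str, tool: Callable[..., Any], args: dict) -> Any:
    """Execute the tool at most once per key; retries get the stored result."""
    key = idempotency_key(execution_id, tool.__name__, args)
    if key in _results:
        return _results[key]
    result = tool(**args)
    _results[key] = result
    return result
```

Scoping the key to an execution ID means a retry of the same step is deduplicated, while a genuinely new request with identical arguments still runs.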

STEP 04

Replay failures exactly

DETERMINISTIC REPLAY

Every execution becomes replayable with exact production context. Test with different prompts, different models, or mocked tools. Turn production failures into regression tests in CI.
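As a rough illustration of record-and-replay, the sketch below logs each tool call and later re-runs the log against a set of tools, reporting any step whose result differs. It is a simplified stand-in, not Thorngate's replay engine. The `record` and `replay` functions and the log format are hypothetical, and it assumes arguments and results are JSON-serializable.

```python
import json
from typing import Any, Callable


def record(tool: Callable[..., Any], args: dict, log: list[dict]) -> Any:
    """Run a tool and append its name, arguments, and result to the log."""
    result = tool(**args)
    log.append({"tool": tool.__name__, "args": args, "result": result})
    return result


def replay(log_json: str, tools: dict[str, Callable[..., Any]]) -> list[dict]:
    """Re-run each recorded call and return the steps whose result differs."""
    mismatches = []
    for step in json.loads(log_json):
        actual = tools[step["tool"]](**step["args"])
        if actual != step["result"]:
            mismatches.append({"step": step, "actual": actual})
    return mismatches
```

In a CI regression test, the `tools` mapping could point at mocked implementations, and an empty mismatch list would mean the recorded behavior still holds.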

Core capabilities

Reliability primitives for AI agents.

Every piece is designed to compose with your existing infrastructure — not replace it.

Contract validation

Validate tool inputs and outputs against versioned JSON Schema contracts. Catches hallucinated arguments, missing fields, and schema drift.

Deterministic replay

Replay any production execution exactly or with overrides. Turn real failures into regression tests that run in CI.

Idempotency governance

Prevent duplicate executions across retries, crashes, and orchestrator restarts. No more duplicate refunds or double-sent emails.

Execution traces

Capture the full causal chain: input → decisions → tool calls → retries → output. Every step, in order, always.

Contract lifecycle

Infer contracts from real traffic and version them automatically. Git-managed, reviewable, auditable.
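One simple way to picture contract inference is to scan observed argument payloads and note which fields always appear and which types they carry. The sketch below does only that. It is a toy illustration, not Thorngate's inference logic, and the `infer_contract` function and its output shape are hypothetical.

```python
from typing import Any


def infer_contract(samples: list[dict[str, Any]]) -> dict[str, dict]:
    """Infer field types and requiredness from observed argument payloads."""
    observed: dict[str, dict] = {}
    for sample in samples:
        for field, value in sample.items():
            entry = observed.setdefault(field, {"types": set(), "seen": 0})
            entry["types"].add(type(value).__name__)
            entry["seen"] += 1
    return {
        field: {
            "types": sorted(entry["types"]),
            # A field counts as required only if every sample included it.
            "required": entry["seen"] == len(samples),
        }
        for field, entry in observed.items()
    }
```

The resulting dict is plain data, so it could be serialized to a file and reviewed in version control like any other change.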

Audit export

Structured execution records for enterprise reviews and compliance workflows. OpenTelemetry-compatible.

Catch invalid AI tool calls before your customers do.