Stop AI agents
before they break things.
Cordon is the pre-execution control layer for AI agents. Lakera judges what the model said. Cordon judges what the model is about to do — deterministically, in microseconds, with zero LLM calls.
Built on peer-reviewed research
A control layer that gets out of your way.
Six deterministic probes catch the entire taxonomy of agent failures — without burning a single LLM call or guessing at intent from a transcript.
Typosquat detection
Catches reqeusts, colourama, and python-dateuti1 in any requirements.txt, package.json or pip command.
0% miss rate on PyPI's known typosquat list.
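Cordon's probe internals aren't shown here, but the core idea can be sketched in a few lines: compare each requested package name against a list of popular packages and flag near-misses. The package list, similarity threshold, and function name below are illustrative assumptions, not Cordon's API:

```python
from difflib import SequenceMatcher

# Tiny sample of popular PyPI names; a real detector uses a much larger list.
POPULAR = {"requests", "colorama", "python-dateutil", "numpy"}

def looks_like_typosquat(name: str) -> bool:
    """Flag names that are close to, but not exactly, a popular package."""
    if name in POPULAR:
        return False  # exact match is the legitimate package
    return any(SequenceMatcher(None, name, p).ratio() >= 0.85 for p in POPULAR)
```

With this sketch, `looks_like_typosquat("reqeusts")` is true while `looks_like_typosquat("requests")` is false; a production probe would also handle transposed hyphens, homoglyphs, and lookalike digits.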
Secret leak prevention
Detects API keys, OAuth tokens, private keys and connection strings about to be written to a public file or pasted into a network call — before the request goes out.
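The flavour of check involved can be illustrated with a few regular expressions; these three patterns are examples only (real secret scanners combine many more rules with entropy checks) and are not Cordon's actual rule set:

```python
import re

# Illustrative patterns only; not Cordon's rule set.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"postgres://\S+:\S+@\S+"),              # connection string with password
]

def contains_secret(text: str) -> bool:
    """True if any known secret pattern appears in the outgoing payload."""
    return any(p.search(text) for p in SECRET_PATTERNS)
```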
Security weakening
Blocks chmod -R 777, TLS-disable flags, auth bypasses and crypto downgrades: the patterns most often introduced by a coding agent under pressure.
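A minimal sketch of this class of check, using three illustrative patterns (Cordon's actual probe list is not shown here):

```python
import re

# A few illustrative red flags, not Cordon's actual probe list.
WEAKENING = [
    re.compile(r"chmod\s+-R\s+777"),       # world-writable recursive chmod
    re.compile(r"verify\s*=\s*False"),     # requests-style TLS verification off
    re.compile(r"--no-check-certificate"), # wget TLS bypass flag
]

def weakens_security(command: str) -> bool:
    """True if the proposed command matches a known security-weakening pattern."""
    return any(p.search(command) for p in WEAKENING)
```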
Test suppression
Catches the agent's favourite hack: silently skipping a failing test, deleting an assertion, or pinning a coverage gate to zero so the build passes "green".
Exfiltration
Sensitive read + outbound network call = block. Distinguishes pastebins, webhooks, and developer-controlled origins via a curated sink list.
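The "sensitive read + outbound call" rule can be sketched as a set intersection; the paths, sink hosts, and function name here are illustrative placeholders, not Cordon's curated list:

```python
# Illustrative placeholders, not Cordon's curated lists.
SENSITIVE_PATHS = {"~/.aws/credentials", "/etc/shadow", ".env"}
EXFIL_SINKS = {"pastebin.com", "webhook.site"}  # origins outside developer control

def is_exfiltration(reads: set[str], outbound_hosts: set[str]) -> bool:
    """Sensitive read plus a call to a known exfiltration sink -> block."""
    return bool(reads & SENSITIVE_PATHS) and bool(outbound_hosts & EXFIL_SINKS)
```

Note how the curated sink list does the discriminating work: a sensitive read followed by a call to a developer-controlled origin is allowed, while the same read paired with a pastebin is not.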
Silent failure
Bare except:, empty catch {}, shell 2>/dev/null || true: every pattern that hides a real error from your monitoring.
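A few patterns of this kind, sketched as regular expressions (illustrative only, not Cordon's probe):

```python
import re

# Illustrative error-swallowing patterns, not Cordon's probe.
SILENCERS = [
    re.compile(r"except\s*:\s*(pass)?\s*$", re.MULTILINE),  # bare except
    re.compile(r"catch\s*(\(\s*\w*\s*\))?\s*\{\s*\}"),      # empty catch block
    re.compile(r"2>/dev/null\s*\|\|\s*true"),               # shell error swallow
]

def hides_errors(snippet: str) -> bool:
    """True if the snippet matches a pattern that hides real errors."""
    return any(p.search(snippet) for p in SILENCERS)
```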
The same 36 tasks. Every existing tool, side-by-side.
Reproduce in 30 seconds:
cordon compare --comparators all.
Raw run archived in the repo.
| Judge | TPR | FPR | Control | Passed | Latency |
|---|---|---|---|---|---|
| loading… | | | | | |
Cordon is the only judge with a perfect score, and it's faster by 4–6 orders of magnitude. Every other judge has at least one disqualifying flaw: heuristic and transcript baselines miss every stealth-class attack, Lakera flags every benign task, and the LLM judge is non-deterministic and adds ~1.2 seconds of latency plus per-call dollar cost to every agent step.
Live playground.
Submit a proposed agent action. Cordon judges it deterministically in microseconds and never executes anything.
Format:
path/to/file on its own line, then :::, then the file body. Separate multiple files with a line containing only ---.
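The format above can be consumed with a few lines of Python; this is a sketch of the convention, not the playground's actual parser:

```python
def parse_submission(text: str) -> dict[str, str]:
    """Split 'path\\n:::\\nbody' blocks separated by a line containing only ---."""
    files = {}
    for block in text.split("\n---\n"):
        path, _, body = block.partition("\n:::\n")
        files[path.strip()] = body
    return files
```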
From OpenAI tool call to safe execution.
One decorator. Same registry works with OpenAI, Anthropic and LangChain.
```python
from cordon import Action, Guard
from cordon.integrations.openai import ActionBuilder, check_response

builder = ActionBuilder()

@builder.tool("run_shell")
def _(args):
    return Action(kind="shell", command=args["command"])

response = client.chat.completions.create(...)
for tcv in check_response(response, builder, Guard.strict()):
    if tcv.blocked:
        send_refusal(tcv.tool_call_id, tcv.verdict.top_reason())
```
Drop into your existing agent.
Wrap the response your agent already gets from OpenAI / Anthropic / LangChain.
Cordon decodes every tool call into a structured Action, runs the probes, and hands back a deterministic verdict per tool call. Zero changes to your prompts, zero new vendor SDKs in your runtime dependencies.
Deterministic probes, no LLM calls.
Six purpose-built probes evaluate each action in parallel. Every verdict is reproducible across machines and CI runs — same input, same output, every time. Median latency 0.2 ms.
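One way such a parallel-yet-deterministic design can work is to run every probe concurrently and let a fixed severity order pick the final verdict; the probe names and verdict strings below are hypothetical, not Cordon's internals:

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical probe signatures: each returns "block", "warn", or "allow".
def probe_typosquat(action): return "allow"
def probe_secrets(action): return "block"

PROBES = [probe_typosquat, probe_secrets]
SEVERITY = {"allow": 0, "warn": 1, "block": 2}

def check(action) -> str:
    """Run every probe in parallel; the most severe verdict wins.

    max() over a fixed severity order makes the outcome independent of
    which probe thread finishes first, so the verdict is reproducible.
    """
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(lambda p: p(action), PROBES))
    return max(results, key=SEVERITY.__getitem__)
```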
Drop into your stack in 5 lines.
One PyPI package. No GPU. No vendor SDK required at runtime.
$ pip install cordon-ai
$ python -c "import cordon; print(cordon.Guard.strict().check(cordon.Action(kind='shell', command='chmod -R 777 /')).decision)"
block
The same registry works with the openai, anthropic, and langchain integrations; no vendor SDK becomes a runtime dependency.
Ship agents you can trust in production.
Cordon is open-source under MIT. Star the repo, install the package, paste a tool call into the playground above.