Open source · v0.1.3 · MIT · Python 3.11+

Catch risky agent actions before they merge.

Mirage sits between your agent and its APIs. It checks every outbound HTTP call against declarative policy, returns deterministic mocks, and fails the build before a hallucinated route or duplicate charge ever ships.

Outcome taxonomy: 4 deterministic states
Stack: httpx · pytest · CLI
Gate: CI fails on policy_violation or unmatched_route

mirage · proxy + ci

~/agent

$ mirage proxy --mocks ./mocks.yaml --policies ./policies.yaml
[mirage] proxy listening on :8001 · 12 routes · 4 policies

$ pytest tests/test_procurement_agent.py
agent → POST /v1/get_quote     allowed · 12ms
agent → GET /v1/suppliers      allowed · 31ms
agent → POST /v1/submit_bid { amount: 50000 }
  policy_violation
  ✗ enforce_bid_limit (max: 25000)
  side_effect_count: 0
  trace: artifacts/traces/procurement-run.json
1 failed in 0.34s

trace: deterministic
outcome: policy_violation

Positioning

Three categories. One open seat.

Mirage isn’t another evaluator or another dashboard. It’s the deterministic gate that runs in CI before a regression reaches production.

Vs quality eval

LangSmith · Braintrust · Patronus

They score whether the agent answered correctly.

Mirage scores whether the agent’s actions stayed inside policy.

Vs observability

Agent tracing dashboards

They watch what the agent already did, after the run.

Mirage gates what the agent is about to do — pre-merge, in CI.

Vs HTTP mocks

VCR.py · respx · responses

They replay recorded responses, so a cassette only guards against regressions in calls that were already recorded.

Mirage enforces declarative policy on synthetic mocks, so a brand-new risky action is caught the first time it appears.

How it works

Three files. One deterministic gate.

Mirage’s entire surface area is mocks, policies, and a session wrapper. No SDKs to learn, no model-in-the-loop, no live traffic in tests.

Declare the surface

Author mocks for the routes your agent calls and policies for the rules it must obey. Both are plain YAML — review them in PRs like any config.

mocks.yaml

# mocks.yaml
mocks:
  - name: get_quote
    method: POST
    path: /v1/get_quote
    response:
      status_code: 200
      json:
        quote_id: Q-001
        price: 24500

  - name: submit_bid
    method: POST
    path: /v1/submit_bid
    response:
      status_code: 201
      json: { order_id: ORD-9 }
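
To make the matching step concrete, here is a minimal sketch in plain Python of how one request could be classified against the mocks above. It is not Mirage's implementation: the `classify` function, the `MOCKS` and `POLICIES` structures, and the policy fields (`field`, `max`) are assumptions made for illustration. Only the route names, the `enforce_bid_limit` name, the 25000 limit, and the outcome labels `allowed`, `unmatched_route`, and `policy_violation` come from this page.

```python
# Illustrative only: a hand-rolled classifier, not Mirage's internals.
# Routes mirror mocks.yaml above; the policy schema is hypothetical.
MOCKS = {
    ("POST", "/v1/get_quote"): {
        "status_code": 200,
        "json": {"quote_id": "Q-001", "price": 24500},
    },
    ("POST", "/v1/submit_bid"): {
        "status_code": 201,
        "json": {"order_id": "ORD-9"},
    },
}

POLICIES = [
    {
        "name": "enforce_bid_limit",
        "method": "POST",
        "path": "/v1/submit_bid",
        "field": "amount",  # hypothetical: which body field to inspect
        "max": 25000,       # hypothetical: upper bound for that field
    },
]


def classify(method, path, body=None):
    """Return (outcome, detail) for one outbound request."""
    body = body or {}
    mock = MOCKS.get((method, path))
    if mock is None:
        # No declared mock: the agent called a route nobody reviewed.
        return "unmatched_route", None
    for policy in POLICIES:
        if (policy["method"], policy["path"]) != (method, path):
            continue
        value = body.get(policy["field"])
        if value is not None and value > policy["max"]:
            # Blocked before any mock response is handed back.
            return "policy_violation", policy["name"]
    return "allowed", mock


print(classify("POST", "/v1/get_quote"))
print(classify("POST", "/v1/submit_bid", {"amount": 50000}))
print(classify("DELETE", "/v1/suppliers"))
```

Because the decision depends only on the request and two static tables, the same input always yields the same outcome, which is the property the gate relies on.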

Wrap your run

Wrap your agent run in MirageSession. Outbound httpx traffic is intercepted, matched against your mocks, and policy-checked. assert_clean() is your gate.

test_procurement.py

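from mirage import MirageSession


def test_procurement_agent():
    # run_my_agent is your own entry point; it must send its HTTP
    # calls through the client it is given.
    with MirageSession(run_id="procurement") as mirage:
        run_my_agent(client=mirage.client)
        # Fail the build if any call hit a policy_violation
        # or missed the declared mocks.
        mirage.assert_clean()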
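
Conceptually the gate is small. The sketch below shows what an `assert_clean()`-style check could look like over a list of recorded outcomes. `RecordedCall` and `assert_clean_sketch` are hypothetical names used only for illustration, and the real method may behave differently.

```python
from dataclasses import dataclass

# The two outcomes this page says fail CI.
FAILING_OUTCOMES = {"policy_violation", "unmatched_route"}


@dataclass(frozen=True)
class RecordedCall:
    """Hypothetical record of one intercepted request."""
    method: str
    path: str
    outcome: str


def assert_clean_sketch(calls):
    """Raise AssertionError listing every call with a failing outcome."""
    bad = [c for c in calls if c.outcome in FAILING_OUTCOMES]
    if bad:
        lines = [f"{c.method} {c.path}: {c.outcome}" for c in bad]
        raise AssertionError("run is not clean:\n" + "\n".join(lines))


calls = [
    RecordedCall("POST", "/v1/get_quote", "allowed"),
    RecordedCall("POST", "/v1/submit_bid", "policy_violation"),
]

try:
    assert_clean_sketch(calls)
except AssertionError as exc:
    # pytest would report this as a failed test; here we just print it.
    print(exc)
```

Raising a plain `AssertionError` means pytest reports the failure like any other failed assertion, with no extra plumbing in CI.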

Fail the build

Every run emits a deterministic trace. CI fails on policy_violation or unmatched_route — no flaky live API, no model-judge in the loop.

trace.json

{
  "run_id": "procurement",
  "outcome": "policy_violation",
  "policy": "enforce_bid_limit",
  "request": {
    "method": "POST",
    "path": "/v1/submit_bid",
    "json": { "amount": 50000 }
  },
  "side_effect_count": 0
}
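
One way to keep a trace byte-stable is to serialize it with sorted keys and leave out wall-clock fields, so two runs of the same scenario can be compared by hash. The sketch below uses only the Python standard library. `trace_digest` is a hypothetical helper, not part of Mirage, and nothing here describes how Mirage actually writes its traces.

```python
import hashlib
import json


def trace_digest(trace):
    """Serialize a trace canonically and return its SHA-256 hex digest."""
    # sort_keys plus fixed separators removes ordering and whitespace noise.
    canonical = json.dumps(trace, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


trace_a = {
    "run_id": "procurement",
    "outcome": "policy_violation",
    "policy": "enforce_bid_limit",
    "request": {
        "method": "POST",
        "path": "/v1/submit_bid",
        "json": {"amount": 50000},
    },
    "side_effect_count": 0,
}

# Same content with the top-level keys in reverse order.
trace_b = dict(reversed(list(trace_a.items())))

print(trace_digest(trace_a) == trace_digest(trace_b))  # True
```

A CI job could store the digest of a known-good run and fail when a later run produces a different one, without diffing the JSON by hand.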

Review console

Read every run as a precision instrument.

Outcome taxonomy, response headers, and policy decisions surfaced without dashboards or live agents — just the trace your CI already wrote.

mirage console · procurement-run.json · v0.1.3

Screenshot: the Mirage review console at v0.1.3, showing a procurement run that triggered a policy violation.

Roadmap

The deterministic engine, today. Chaos engineering for agent governance, next.

Mirage 0.1.x is what runs in your CI today. The numbered milestones below are the path to 1.0 — versioned, public, and shipped on PyPI.

v0.1.3 · Shipped

The deterministic engine
- httpx-native proxy and MirageSession
- Declarative mocks + policies in YAML
- Outcome taxonomy + deterministic traces
- Pytest plugin, `mirage gate-run` CLI, review console

v0.2.0 · In progress

Chaos library + scenario DSL
- Network and payload chaos modes
- Scenario YAML loader
- Reference scenarios under hostile conditions

v0.3.0 · Planned

Containment metrics
- Containment rate, false-negative rate, time-to-detect
- CLI scenario runner with JUnit output
- Resilience tab in the review console

v1.0.0 · Horizon

Policy chaos + reference suite
- Policy-layer chaos modes
- Authoring docs + reference scenario library
- 1.0 stability commitment

Get started

Ship the next agent commit behind a deterministic gate.

Install in 30 seconds. Author one mocks file and one policies file. Wrap your agent run. CI then fails the build on the next risky action, before it merges.

