mirage
Open source · v0.1.3

Catch risky agent actions before they merge.

Mirage sits between your agent and its APIs. It checks every outbound HTTP call against declarative policy, returns deterministic mocks, and fails the build before a hallucinated route or duplicate charge ever ships.

Outcome taxonomy: 4 deterministic states
Stack: httpx · pytest · CLI
Gate: CI fails on policy_violation
mirage · proxy + ci
~/agent
$ mirage proxy --mocks ./mocks.yaml --policies ./policies.yaml
[mirage] proxy listening on :8001 · 12 routes · 4 policies
 
$ pytest tests/test_procurement_agent.py
 
agent → POST /v1/get_quote allowed · 12ms
agent → GET /v1/suppliers allowed · 31ms
agent → POST /v1/submit_bid { amount: 50000 } policy_violation
  ✗ enforce_bid_limit (max: 25000)
 
side_effect_count: 0
trace: artifacts/traces/procurement-run.json
 
1 failed in 0.34s
trace: deterministic · outcome: policy_violation
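
The submit_bid rejection in the run above can be pictured as a pure function over the request. This is a minimal sketch and not Mirage's implementation: `BidLimitPolicy`, `violated_by`, and `evaluate` are hypothetical names, and the fail-closed handling of a missing `amount` is an assumption. Only the policy name, route, limit, and outcome strings come from the run shown.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class BidLimitPolicy:
    """Hypothetical policy shape; real Mirage policies are declared in policies.yaml."""

    name: str
    method: str
    path: str
    field: str
    max_value: float

    def violated_by(self, method: str, path: str, body: dict[str, Any]) -> bool:
        # Only requests to the guarded route are checked.
        if (method, path) != (self.method, self.path):
            return False
        value = body.get(self.field)
        # Assumption: a missing or non-numeric field fails closed.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return True
        return value > self.max_value


def evaluate(
    policies: list[BidLimitPolicy], method: str, path: str, body: dict[str, Any]
) -> tuple[str, Optional[str]]:
    """Return (outcome, policy_name); the first violated policy wins."""
    for policy in policies:
        if policy.violated_by(method, path, body):
            return "policy_violation", policy.name
    return "allowed", None


policies = [
    BidLimitPolicy("enforce_bid_limit", "POST", "/v1/submit_bid", "amount", 25000)
]
print(evaluate(policies, "POST", "/v1/submit_bid", {"amount": 50000}))
print(evaluate(policies, "POST", "/v1/get_quote", {}))
```

Because the check depends only on the request and the declared limit, the same input always produces the same outcome, which is what makes it usable as a CI gate.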

Positioning

Three categories. One open seat.

Mirage isn’t another evaluator or another dashboard. It’s the deterministic gate that runs in CI before a regression reaches production.

Vs quality eval

LangSmith · Braintrust · Patronus

They score whether the agent answered correctly.

Mirage scores whether the agent’s actions stayed inside policy.

Vs observability

Agent tracing dashboards

They watch what the agent already did, after the run.

Mirage gates what the agent is about to do — pre-merge, in CI.

Vs HTTP mocks

VCR.py · respx · responses

They replay recorded responses, so a cassette only guards behavior that has already been captured.

Mirage enforces declarative policy on synthetic mocks, so a brand-new risky action is caught the first time it appears.

How it works

Three files. One deterministic gate.

Mirage’s entire surface area is mocks, policies, and a session wrapper. No SDKs to learn, no model-in-the-loop, no live traffic in tests.

01

Declare the surface

Author mocks for the routes your agent calls and policies for the rules it must obey. Both are plain YAML — review them in PRs like any config.

mocks.yaml
# mocks.yaml
mocks:
  - name: get_quote
    method: POST
    path: /v1/get_quote
    response:
      status_code: 200
      json:
        quote_id: Q-001
        price: 24500

  - name: submit_bid
    method: POST
    path: /v1/submit_bid
    response:
      status_code: 201
      json: { order_id: ORD-9 }
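
To make the matching step concrete, here is a rough sketch of how declared mocks could be resolved against an outbound request. It assumes exact matching on method and path, which may be simpler than what Mirage actually does; `MOCKS`, `build_index`, and `resolve` are illustrative names, and the data mirrors the mocks.yaml above as plain Python.

```python
from __future__ import annotations

from typing import Any, Optional

# The two mocks from mocks.yaml, written as Python data for illustration.
MOCKS: list[dict[str, Any]] = [
    {
        "name": "get_quote",
        "method": "POST",
        "path": "/v1/get_quote",
        "response": {"status_code": 200, "json": {"quote_id": "Q-001", "price": 24500}},
    },
    {
        "name": "submit_bid",
        "method": "POST",
        "path": "/v1/submit_bid",
        "response": {"status_code": 201, "json": {"order_id": "ORD-9"}},
    },
]


def build_index(mocks: list[dict[str, Any]]) -> dict[tuple[str, str], dict[str, Any]]:
    """Index mocks by (METHOD, path). Assumption: exact-match lookup only."""
    return {(m["method"].upper(), m["path"]): m for m in mocks}


def resolve(
    index: dict[tuple[str, str], dict[str, Any]], method: str, path: str
) -> tuple[str, Optional[dict[str, Any]]]:
    """Return (outcome, canned_response); undeclared routes get no response."""
    mock = index.get((method.upper(), path))
    if mock is None:
        return "unmatched_route", None
    return "allowed", mock["response"]


index = build_index(MOCKS)
print(resolve(index, "POST", "/v1/get_quote"))
print(resolve(index, "DELETE", "/v1/orders/ORD-9"))
```

A route the agent invents that is not in the table resolves to `unmatched_route` rather than reaching a live API, which is how a hallucinated route would surface in a test run.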
02

Wrap your run

Drop MirageSession around your agent run. Outbound httpx traffic is intercepted, matched, and policy-checked. assert_clean() is your gate.

test_procurement.py
from mirage import MirageSession

with MirageSession(run_id="procurement") as mirage:
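    # run_my_agent is a placeholder for your own agent entry point;
    # here it sends its HTTP calls through mirage.client.
    run_my_agent(client=mirage.client)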
    # Fail the build if any call hit a policy_violation
    # or missed the declared mocks.
    mirage.assert_clean()
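
For intuition about what a gate like assert_clean() does, the following is a hedged stand-in and not MirageSession itself. `SketchSession` and its `record` method are hypothetical, and the set of failing outcomes is taken from the two named on this page (policy_violation and unmatched_route).

```python
from __future__ import annotations


class SketchSession:
    """Hypothetical stand-in for a session wrapper; not Mirage's implementation."""

    FAILING = {"policy_violation", "unmatched_route"}

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.events: list[dict[str, str]] = []

    def __enter__(self) -> "SketchSession":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Never swallow exceptions raised inside the with-block.
        return False

    def record(self, method: str, path: str, outcome: str) -> None:
        """Store the outcome of one intercepted call."""
        self.events.append({"method": method, "path": path, "outcome": outcome})

    def assert_clean(self) -> None:
        """Raise AssertionError if any recorded call had a failing outcome."""
        bad = [e for e in self.events if e["outcome"] in self.FAILING]
        if bad:
            detail = ", ".join(
                f'{e["method"]} {e["path"]} -> {e["outcome"]}' for e in bad
            )
            raise AssertionError(f"run {self.run_id!r} is not clean: {detail}")


with SketchSession(run_id="procurement") as session:
    session.record("POST", "/v1/get_quote", "allowed")
    session.record("POST", "/v1/submit_bid", "policy_violation")
    try:
        session.assert_clean()
    except AssertionError as err:
        print(err)
```

Raising a plain AssertionError means pytest reports the run as an ordinary test failure, so no extra CI wiring is needed for the gate to block a merge.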
03

Fail the build

Every run emits a deterministic trace. CI fails on policy_violation or unmatched_route — no flaky live API, no model-judge in the loop.

trace.json
{
  "run_id": "procurement",
  "outcome": "policy_violation",
  "policy": "enforce_bid_limit",
  "request": {
    "method": "POST",
    "path": "/v1/submit_bid",
    "json": { "amount": 50000 }
  },
  "side_effect_count": 0
}
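
Since the trace is plain JSON, a CI step can turn it into a pass/fail decision with a few lines. The sketch below reads only the fields shown in the trace above; `gate` and `summarize` are illustrative helpers, not part of Mirage, and this does not describe how the `mirage gate-run` CLI behaves.

```python
import json

# Outcomes this page says fail the build.
FAILING_OUTCOMES = {"policy_violation", "unmatched_route"}


def gate(trace: dict) -> int:
    """Return a process exit code: 0 passes the build, 1 fails it."""
    return 1 if trace.get("outcome") in FAILING_OUTCOMES else 0


def summarize(trace: dict) -> str:
    """One-line summary suitable for a CI log."""
    request = trace.get("request", {})
    return (
        f'{trace.get("run_id")}: {trace.get("outcome")} '
        f'({trace.get("policy")}) on {request.get("method")} {request.get("path")}'
    )


TRACE = json.loads(
    """
    {
      "run_id": "procurement",
      "outcome": "policy_violation",
      "policy": "enforce_bid_limit",
      "request": {
        "method": "POST",
        "path": "/v1/submit_bid",
        "json": { "amount": 50000 }
      },
      "side_effect_count": 0
    }
    """
)

print(summarize(TRACE))
print("exit code:", gate(TRACE))
```

In a real pipeline the script would load the trace file from the artifacts directory and pass the result of `gate` to `sys.exit`, so the job status follows the outcome field directly.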

Review console

Read every run as a precision instrument.

Outcome taxonomy, response headers, and policy decisions surfaced without dashboards or live agents — just the trace your CI already wrote.

[Screenshot: Mirage review console at v0.1.3, showing a procurement run (procurement-run.json) that triggered a policy violation]

Roadmap

The deterministic engine, today. Chaos engineering for agent governance, next.

Mirage 0.1.x is what runs in your CI today. The numbered milestones below are the path to 1.0 — versioned, public, and shipped on PyPI.

  1. v0.1.3 Shipped

    The deterministic engine

    • httpx-native proxy and MirageSession
    • Declarative mocks + policies in YAML
    • Outcome taxonomy + deterministic traces
    • Pytest plugin, `mirage gate-run` CLI, review console
  2. v0.2.0 In progress

    Chaos library + scenario DSL

    • Network and payload chaos modes
    • Scenario YAML loader
    • Reference scenarios under hostile conditions
  3. v0.3.0 Planned

    Containment metrics

    • Containment rate, false-negative rate, time-to-detect
    • CLI scenario runner with JUnit output
    • Resilience tab in the review console
  4. v1.0.0 Horizon

    Policy chaos + reference suite

    • Policy-layer chaos modes
    • Authoring docs + reference scenario library
    • 1.0 stability commitment

Get started

Ship the next agent commit behind a deterministic gate.

Install in 30 seconds. Author one mocks file and one policies file. Wrap your agent run. CI fails the next risky action before it merges.