Authoring scope policies
Pick a tier, declare what an agent can do, and ship the smallest possible blast radius. Worked examples for every tier transition.
TL;DR. Pick the lowest tier that satisfies the workflow. Declare a
structured scopes[] array on the token. Each entry is allow + optional
deny / resources / conditions / limits. The runtime fails closed —
if no entry matches, the request returns 403 missing_grant with the verb
it was looking for. Test mode + X-Matter-Test-Speed: instant lets you
iterate on a policy in seconds. Treat the policy as code: review it, diff
it, and version it alongside the agent that consumes it.
The structured scope-policy DSL is the lever you use to put an agent on the network with exactly the surface area it needs — no more, no less. This page is the authoring guide: how to pick a tier, write the JSON, test the policy, and iterate without spending production money.
For the resource shapes, the full evaluation order, and the ten-row table of
condition keys, see /api/auth/agents. This page is the
how — that page is the what.
Pick a tier
Start with the workflow, not the verbs. Ask one question at a time.
Does the agent need to write? If no — never, not even with
?dry_run=true — stop. Tier 1. A tier-1 token cannot mutate. The MCP
catalog hides every write tool; the API rejects every write verb at
policy-evaluation time. Cap-table mirroring, investor portals, and
read-only LLM Q&A all live here.
Does a human file the artifact, with the agent only drafting? If yes
— the agent prepares, the human ships — tier 2. Tier 2 lets the
agent call POST /v1/filings only with ?dry_run=true. The mutation
rehearses; the response is a draft document; nothing reaches the state
registrar. Paralegal copilots and "model the next round" tools live
here.
Does every consequential mutation get a named human signature? If
yes — agent drives, human signs each step — tier 3. Every
destructive or high-stakes write returns 202 pending_authorization and
pauses on an Authorization. Chief-of-staff agents live here.
Are the autonomous ops routine, capped, and reversible? If yes —
annual reports, franchise tax, BOI updates, mail acknowledgement — and
you are willing to declare a hard max_cost_usd cap, tier 4.
Anything outside the cap or hardcoded-destructive (dissolution, M&A
close, registered-agent change, token revocation) still pauses — tier 4
is not an unsafe word. Compliance robots and standing renewal queues
live here.
If two tiers feel close, pick the lower one. Tier downgrades are free at
runtime (a tier-3 token can voluntarily call ?dry_run=true); tier upgrades
require minting a new token.
The five-key DSL
Every entry in scopes[] has the same shape. Here is the minimum policy
that does anything useful — read entities in a portfolio, nothing else:
```json
{
  "scopes": [
    {
      "allow": ["entities.read", "entities.list"],
      "resources": ["pf_studio_fund_i"]
    }
  ]
}
```

Each key is independent and adds restrictions:
| Key | Job | Default |
|---|---|---|
| `allow` | The verbs the entry permits. Required. | n/a |
| `deny` | Verbs to subtract from `allow`. Evaluated after. | `[]` |
| `resources` | Typed-ID literals or `<prefix>_*` patterns the entry applies to. | `["*"]` (everything the principal sees) |
| `conditions` | JSON predicates against the request and resolved resource. | `{}` (always true) |
| `limits` | Frequency and cost caps. | none |

The runtime reads them top-to-bottom on every request. See
/api/auth/agents#structured-scope-policy-dsl
for the exact evaluation order.
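
To make the fail-closed behaviour concrete, here is a minimal Python sketch of how a single request might be checked against `allow`, `deny`, and `resources`. It is an illustration under assumptions, not the runtime's implementation: the function names are hypothetical, glob matching via `fnmatchcase` is assumed for both verbs and resource patterns, resources are compared as literal IDs (no portfolio-to-entity resolution), and the real evaluation order lives in the reference page.

```python
from fnmatch import fnmatchcase


def entry_matches(entry, verb, resource_id):
    """Return True if one scopes[] entry grants `verb` on `resource_id`."""
    allowed = any(fnmatchcase(verb, p) for p in entry["allow"])
    # deny subtracts from allow; it defaults to an empty list.
    denied = any(fnmatchcase(verb, p) for p in entry.get("deny", []))
    # resources defaults to ["*"], as in the table above.
    in_scope = any(
        fnmatchcase(resource_id, p) for p in entry.get("resources", ["*"])
    )
    return allowed and not denied and in_scope


def evaluate(policy, verb, resource_id):
    """Fail closed: if no entry matches, report 403 missing_grant."""
    for entry in policy["scopes"]:
        if entry_matches(entry, verb, resource_id):
            return {"status": 200}
    return {"status": 403, "error": "missing_grant", "verb": verb}


policy = {
    "scopes": [
        {
            "allow": ["entities.read", "entities.list"],
            "resources": ["pf_studio_fund_i"],
        }
    ]
}

print(evaluate(policy, "entities.read", "pf_studio_fund_i"))
print(evaluate(policy, "entities.update", "pf_studio_fund_i"))
```

The second call is rejected because no entry lists `entities.update`; the sketch echoes the verb back, mirroring the `missing_grant` response described in the TL;DR.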
Tier transitions — worked examples
The four tiers are not silos. Most production agents start at tier 1 in shadow, graduate to tier 2 once a human is willing to file what it drafts, graduate again to tier 3 once the team trusts the workflow, and only some make it to tier 4. These are the policy diffs at each step.
1 → 2: from observe to prepare
The tier-1 read-only observer:
```json
{
  "tier": 1,
  "scopes": [
    {
      "allow": ["entities.read", "entities.list", "filings.read", "filings.list"],
      "resources": ["ent_Nq3KcAbc"]
    }
  ]
}
```

To upgrade so the same agent can rehearse filings, but never submit them,
mint a fresh token at tier 2 with the new write verb gated on
request.dry_run:
```json
{
  "tier": 2,
  "scopes": [
    {
      "allow": [
        "entities.read", "entities.list",
        "filings.read", "filings.list",
        "filings.create"
      ],
      "resources": ["ent_Nq3KcAbc"],
      "conditions": { "request.dry_run": true }
    }
  ]
}
```

The condition is the safety net. A POST /v1/filings without
?dry_run=true does not match and falls through to 403 condition_not_met.
Tier 2 also hides all non-prepare_* MCP tools from the catalog, so the
agent runtime cannot describe a real-mode mutation to the model.
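
The `request.dry_run` gate can be sketched as a small predicate check. This is a hedged illustration, not the runtime's code: `conditions_met` and `gate` are hypothetical names, the request context is assumed to be a flat dict keyed by the same dotted names as the policy, and list-valued conditions (which appear later on this page) are read as "any of", which is an assumption.

```python
def conditions_met(conditions, context):
    """Every condition key must be satisfied by the request context."""
    for key, expected in conditions.items():
        actual = context.get(key)
        if isinstance(expected, list):
            # Assumption: a list is a closed set of acceptable values.
            if actual not in expected:
                return False
        elif actual != expected:
            return False
    return True


def gate(entry, context):
    """Return (status, error) for a request that already passed allow/deny."""
    if conditions_met(entry.get("conditions", {}), context):
        return 200, None
    return 403, "condition_not_met"


tier2_entry = {
    "allow": ["filings.create"],
    "conditions": {"request.dry_run": True},
}

print(gate(tier2_entry, {"request.dry_run": True}))
print(gate(tier2_entry, {}))  # dry_run omitted: the condition does not match
```

A missing key is treated the same as a wrong value, so a request that simply omits `?dry_run=true` is rejected rather than waved through.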
2 → 3: from prepare to execute
To let the agent submit — but with a human signing every consequential
mutation — drop the request.dry_run condition and let the tier guard do
its job. Tier 3 will pause every destructive write on Authorization
regardless of policy:
```json
{
  "tier": 3,
  "scopes": [
    {
      "allow": [
        "entities.*",
        "filings.*",
        "documents.*",
        "grants.*",
        "resolutions.*"
      ],
      "deny": ["entities.dissolve", "tokens.revoke", "tokens.rotate"],
      "resources": ["ent_Nq3KcAbc"],
      "limits": { "max_calls_per_hour": 240 }
    }
  ]
}
```

Three things to notice:

- The `*` wildcard in `entities.*` covers `read`, `list`, `create`, `update`, but not `dissolve`, because `deny` removes it.
- The wildcard intentionally does not extend to operations the verb catalog might add later (e.g. an unreleased `entities.transfer_jurisdiction`). That is by design: defence in depth.
- `limits.max_calls_per_hour: 240` is cheap defence. Tier 3 already gates destructive writes on a human signature, so a runaway loop hits the rate limit, not your filing fees.
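
One way to picture the first two points is to expand wildcards against a frozen snapshot of the verb catalog and then subtract `deny`. The sketch below is an assumption about mechanism, not a description of the runtime: this page does not say how future verbs are excluded, and `CATALOG_AT_MINT` and `effective_verbs` are hypothetical names with a made-up, partial catalog.

```python
from fnmatch import fnmatchcase

# Hypothetical, partial snapshot of the verb catalog when the token is minted.
CATALOG_AT_MINT = [
    "entities.read", "entities.list", "entities.create",
    "entities.update", "entities.dissolve",
    "filings.read", "filings.create",
]


def effective_verbs(allow, deny, catalog):
    """Expand allow patterns against a fixed catalog, then subtract deny."""
    granted = {v for v in catalog if any(fnmatchcase(v, p) for p in allow)}
    removed = {v for v in granted if any(fnmatchcase(v, p) for p in deny)}
    return granted - removed


verbs = effective_verbs(
    allow=["entities.*"],
    deny=["entities.dissolve", "tokens.revoke", "tokens.rotate"],
    catalog=CATALOG_AT_MINT,
)

print(sorted(verbs))
# ['entities.create', 'entities.list', 'entities.read', 'entities.update']

# A verb added to the catalog later is not in the frozen set.
print("entities.transfer_jurisdiction" in verbs)  # False
```

Under this reading, `deny` entries for verbs the wildcard never granted (such as `tokens.revoke` here) are harmless no-ops that document intent.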
3 → 4: from execute to autonomous
Tier 4 is where you commit, in writing, to letting an agent spend money on its own. The diff has three parts.
First, narrow the verb surface. A tier-3 chief-of-staff token might allow
entities.*. A tier-4 compliance agent allows the bare minimum:
```json
{
  "allow": [
    "entities.read",
    "filings.read", "filings.list", "filings.create",
    "tax_profiles.read", "tax_profiles.update",
    "mail.acknowledge"
  ]
}
```

Second, add closed-set conditions so the policy can only ever fire on
routine ops:
```json
{
  "conditions": {
    "filing.type": ["annual_report", "franchise_tax", "boi_update"],
    "mail.category": ["routine_correspondence"]
  }
}
```

Third, and this is the cap that lets the operator sleep, declare hard cost and frequency caps:
```json
{
  "limits": {
    "max_cost_usd": 5000,
    "max_filings_per_quarter": 12,
    "max_calls_per_hour": 60
  }
}
```

max_cost_usd is a sliding monthly window. The 5,001st dollar of a month
does not fail — it returns 202 pending_authorization and pauses on an
Authorization. Tier 4 has a tier-3 fallback, not a tier-1 fallback. The
agent never silently degrades.
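
The cap behaviour can be sketched as a check over a spend ledger. This is a minimal illustration under stated assumptions: `check_cost_cap` is a hypothetical helper, the "monthly" window is approximated as a rolling 30 days, and the ledger is a plain list of `(timestamp, amount_usd)` pairs rather than anything the API exposes.

```python
from datetime import datetime, timedelta

# Assumption: the sliding monthly window is approximated as 30 days.
WINDOW = timedelta(days=30)


def check_cost_cap(ledger, cost_usd, max_cost_usd, now):
    """Decide whether a spend runs autonomously or pauses for a human.

    Spend older than the window no longer counts against the cap.
    """
    window_start = now - WINDOW
    spent = sum(amount for ts, amount in ledger if ts > window_start)
    if spent + cost_usd > max_cost_usd:
        # Over the cap: fall back to tier-3 behaviour instead of failing.
        return 202, "pending_authorization"
    return 200, "executed"


now = datetime(2026, 5, 31)
ledger = [
    (datetime(2026, 3, 1), 800.0),    # outside the window, ignored
    (datetime(2026, 5, 10), 4900.0),  # inside the window
]

print(check_cost_cap(ledger, 100.0, 5000, now))  # exactly at the cap
print(check_cost_cap(ledger, 101.0, 5000, now))  # one dollar over
```

The first call lands exactly on the cap and proceeds; the second crosses it and pauses on an authorization rather than returning an error.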
Anti-pattern — the lazy *
Do not write {"allow": ["*"], "resources": ["*"]} "for now". The DSL is
designed to make the laziness expensive: there is no way to type it
that does not look obviously wrong in a code review. If you find yourself
reaching for it during a hackathon, mint a sk_test_ key and a test-mode
token with auto_approve: true instead; that gives you the same iteration
speed without the production-shaped exposure. See the Iterate without paying
section.
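
Because the policy is reviewed like code, the anti-pattern can also be caught mechanically. The following is a small, hypothetical lint pass (`lint_policy` is not part of any Matter tooling) that could run in CI before a token is minted; treating a missing `resources` key as unrestricted follows the `["*"]` default in the DSL table.

```python
def lint_policy(policy):
    """Return human-readable findings for over-broad scope entries."""
    findings = []
    for i, entry in enumerate(policy.get("scopes", [])):
        if "*" in entry.get("allow", []):
            findings.append(f"scopes[{i}]: allow contains a bare '*'")
        # resources defaults to ["*"], so an omitted key is also unrestricted.
        if "*" in entry.get("resources", ["*"]):
            findings.append(f"scopes[{i}]: resources is unrestricted")
    return findings


lazy = {"scopes": [{"allow": ["*"], "resources": ["*"]}]}
scoped = {
    "scopes": [
        {"allow": ["entities.read"], "resources": ["ent_Nq3KcAbc"]}
    ]
}

for finding in lint_policy(lazy):
    print(finding)
print(lint_policy(scoped))  # []
```

A non-empty result would fail the build, which keeps the "for now" policy from reaching a live token by accident.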
A decision tree
```
                    ┌─ no ─→ Tier 1 (observe-only)
Mutates anything? ──┤
                    └─ yes ─┐
                            ├─ Always ?dry_run=true ──→ Tier 2 (prepare-only)
                            ├─ Human signs each write ──→ Tier 3 (execute)
                            └─ Routine + capped ops ──┐
                                                      ├─ Hardcoded destructive only? ──→ stay Tier 3
                                                      └─ otherwise ──→ Tier 4 (autonomous-with-cap)
```

If the same agent has both a routine surface (renewals) and a non-routine
surface (priced rounds), mint two tokens. Resist the temptation to make
one token straddle two tiers via conditions — readability is part of
auditability, and a single token doing two things at two different
liability levels is a code smell auditors notice.
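
The same questions can be written down as a function, which is handy when token minting is scripted. This is only a sketch of the decision order described on this page: `pick_tier` and its parameter names are hypothetical, and the final fallback to tier 3 is an assumption based on the "pick the lower one" rule.

```python
def pick_tier(*, needs_write, human_files_artifact,
              human_signs_each_write, routine_capped_reversible):
    """Walk the questions in order and return the lowest tier that fits."""
    if not needs_write:
        return 1  # observe-only
    if human_files_artifact:
        return 2  # prepare-only: the agent drafts, a human ships
    if human_signs_each_write:
        return 3  # execute, with a human signature on each step
    if routine_capped_reversible:
        return 4  # autonomous within a declared cap
    # Assumption: if nothing qualifies for autonomy, stay at the lower tier.
    return 3


print(pick_tier(needs_write=False, human_files_artifact=False,
                human_signs_each_write=False, routine_capped_reversible=False))
print(pick_tier(needs_write=True, human_files_artifact=False,
                human_signs_each_write=False, routine_capped_reversible=True))
```

An agent with both a routine and a non-routine surface would call this once per surface and mint one token for each result.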
Iterate without paying
Live-mode iteration on agent policies is a non-starter — formation cascades take days, filings cost money, dissolutions are irreversible. Test mode exists for exactly this loop.
- Mint a `sk_test_` key. Test mode is an isolated dataset with no real money and no real state filings.
- Mint a test-mode `tok_` with `auto_approve: true`. Every `Authorization` flips from `pending` to `approved` in ~50 ms. The full tier-3 → human-signs → resume cycle plays out without a human in the loop, but the audit trail still records the standing approval. Use this on your CI builds.
- Send `X-Matter-Test-Speed: instant` on the first request of the workflow. The whole formation cascade resolves before your next assertion runs. A 7-day annual report cycles in under a second.

```shell
# 1. Mint the policy as a test-mode token.
curl -X POST https://api.mattermode.com/v1/tokens \
  -H "Authorization: Bearer $MATTER_TEST_KEY" \
  -H "Matter-Version: 2026-05-01" \
  -d '{
    "tier": 3,
    "principal": { "human_id": "usr_test", "agent_id": "agt_dev" },
    "scopes": [{
      "allow": ["entities.*", "filings.*", "documents.*"],
      "deny": ["entities.dissolve"],
      "resources": ["ent_*"]
    }],
    "auto_approve": true,
    "api_version": "2026-05-01"
  }'

# 2. Drive the workflow against the test token at warp speed.
curl -X POST https://api.mattermode.com/v1/intents/execute \
  -H "Authorization: Bearer $TEST_TOK" \
  -H "Matter-Version: 2026-05-01" \
  -H "X-Matter-Test-Speed: instant" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{ "intent": { "type": "form_company", "jurisdiction": "US-DE" } }'
```

The test-mode root key signs receipts with a separate hierarchy: test
artifacts cannot be replayed against live mode, and the auto_approve
flag is rejected on live keys with 400 invalid_request.auto_approve_live_mode.
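
In a CI harness it can help to refuse the `auto_approve` flag client-side before the request is ever sent. The sketch below only builds the request; it does not call the API. `build_test_token_request` is a hypothetical helper, and the header and body fields simply mirror the curl example above.

```python
def build_test_token_request(api_key, tier, principal, scopes):
    """Assemble headers and body for a test-mode token with auto_approve.

    Raises ValueError for any key that is not an sk_test_ key, mirroring
    the server-side rejection of auto_approve on live keys.
    """
    if not api_key.startswith("sk_test_"):
        raise ValueError("auto_approve requires an sk_test_ key")
    return {
        "headers": {
            "Authorization": f"Bearer {api_key}",
            "Matter-Version": "2026-05-01",
        },
        "body": {
            "tier": tier,
            "principal": principal,
            "scopes": scopes,
            "auto_approve": True,
            "api_version": "2026-05-01",
        },
    }


request = build_test_token_request(
    "sk_test_example",  # placeholder key for illustration
    tier=3,
    principal={"human_id": "usr_test", "agent_id": "agt_dev"},
    scopes=[{
        "allow": ["entities.*", "filings.*", "documents.*"],
        "deny": ["entities.dissolve"],
        "resources": ["ent_*"],
    }],
)
print(request["body"]["auto_approve"])
```

Failing locally keeps a misconfigured pipeline from ever presenting a live key alongside a test-only flag.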
Common pitfalls
Related
- /api/auth/agents: the canonical reference for the 4-tier model, the dual-attribution envelope, the Authorization resource, and the full condition catalog.
- /api/auth/tokens: the Token resource, generated from the spec.
- /api/conventions/test-mode: sk_test_ keys, X-Matter-Test-Speed, TestClock, auto_approve.
- /api/conventions/idempotency: how Idempotency-Key is held across a paused authorization.
- /mcp/authentication: Bearer + OAuth 2.1 + tier filtering at tools/list time.
- /mcp/workflows: composite tools that stitch together multi-step operations within a single tier.