Rate limits
Per-key limits, 429 response shape, X-RateLimit headers, and burst exemptions for trusted agents.
TL;DR. Limits are applied per key or token. Every response carries X-RateLimit-Limit,
X-RateLimit-Remaining, and X-RateLimit-Reset. On 429, honor the retry_after value in the
problem body (seconds); it is the authoritative backoff signal.
Matter applies rate limits per key or token. The defaults are generous for normal use and scale on request.
Default limits
| Key type | Burst (per second) | Sustained (per minute) |
|---|---|---|
| sk_live_ | 100 | 1000 |
| sk_test_ | 25 | 250 |
| pk_live_ / pk_test_ | 10 per IP | 100 per IP |
| tok_* (agent tokens) | Per structured policy | Per structured policy |
Agent tokens can be granted burst exemptions via the limits.frequency_caps field on the
token's structured policy. This is how tier-4 automation agents avoid rate-limit contention
during scheduled filing runs.
Rate-limit headers on every response
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 99
X-RateLimit-Reset: 1745251260

Budget your request rate against these. X-RateLimit-Reset is a Unix timestamp.
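One way to budget against these headers is to pause only when the remaining count reaches zero, and then wait until the reset timestamp. The sketch below assumes the headers arrive as a plain dict with the exact casing shown above, and that X-RateLimit-Reset is expressed in seconds; the function name seconds_until_reset is hypothetical and not part of any Matter SDK.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how many seconds to pause before sending the next request.

    `headers` is assumed to be a plain dict using the header casing shown
    in the documentation. Real HTTP clients usually expose a
    case-insensitive mapping, which also works here.
    """
    now = time.time() if now is None else now

    # Missing headers are treated as "no known limit": do not pause.
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    reset_at = int(headers.get("X-RateLimit-Reset", 0))

    if remaining > 0:
        return 0.0

    # Budget exhausted: wait until the reset time, never a negative amount.
    return max(0.0, reset_at - now)
```

A caller would typically pass the result to time.sleep() before issuing the next request. The optional now argument exists only to make the function easy to test.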
429 response shape
When you exceed a limit, Matter returns an RFC 7807 problem object plus the standard
Retry-After header (seconds):
HTTP/1.1 429 Too Many Requests
Retry-After: 30
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1745251290
Content-Type: application/problem+json

{
  "type": "https://mattermode.com/docs/errors/rate_limited",
  "title": "Too Many Requests",
  "status": 429,
  "code": "rate_limited",
  "detail": "This key exceeded the 100 req/s burst limit. Retry after 30s.",
  "retry_after": 30,
  "request_id": "req_Qw9xYz8A"
}

Retry strategy
Respect Retry-After. If it's absent on a transient 5xx, back off exponentially with jitter,
starting at 1s (1s, 2s, 4s, 8s, 16s), and stop after 5 attempts for a single logical request.
Most SDK clients do this automatically. If you're writing against the API directly, use a
library like backoff (Python) or p-retry (JavaScript).
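For readers writing against the API directly, the rules above can be sketched as a single delay calculation. This is a minimal illustration under several assumptions, not Matter's SDK behavior: the function name retry_delay is hypothetical, every 5xx status is treated as transient, Retry-After is assumed to be in seconds, and the jitter (up to 25% added on top of the base delay) is an arbitrary choice because the text does not specify a jitter scheme.

```python
import json
import random

MAX_ATTEMPTS = 5   # from the text: stop after 5 attempts per logical request
BASE_DELAY = 1.0   # from the text: the schedule starts at 1s and doubles


def retry_delay(status, headers, body, attempt, rng=random.random):
    """Return seconds to wait before retry `attempt` (1-based), or None to stop.

    Assumptions: all 5xx statuses are transient, `headers` is a dict-like
    object, and `body` is the raw response body as text.
    """
    if attempt < 1 or attempt > MAX_ATTEMPTS:
        return None
    if status != 429 and not (500 <= status <= 599):
        return None  # not a retryable status in this sketch

    # On 429, the retry_after field in the problem body is authoritative.
    if status == 429:
        try:
            problem = json.loads(body) if body else {}
        except ValueError:
            problem = {}
        if isinstance(problem, dict):
            value = problem.get("retry_after")
            if isinstance(value, (int, float)):
                return float(value)

    # Otherwise fall back to the Retry-After header when it is present.
    header_value = headers.get("Retry-After")
    if header_value is not None:
        try:
            return float(header_value)
        except ValueError:
            pass  # unparseable value: fall through to exponential backoff

    # No server hint: 1s, 2s, 4s, 8s, 16s, plus up to 25% jitter (assumed).
    base = BASE_DELAY * 2 ** (attempt - 1)
    return base * (1 + 0.25 * rng())
```

A calling loop would sleep for the returned number of seconds and give up when the function returns None. Injecting rng keeps the schedule deterministic in tests.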
Raising limits
Email support@mattermode.com with:
- Account ID.
- Typical requests per second at peak.
- Burst-rate needs (if any).
- Use case (compliance automation, venture studio batching, etc.).
Production accounts with established traffic patterns are routinely raised to 500+ RPS
burst. Venture-studio accounts running portfolio batching workloads use a separate burst
budget tied to the Portfolio resource.