DDoS response
SEV1 runbook for volumetric attack or anomalous traffic surge.
DDoS response (SEV1)
Triggered by: edge WAF (P0.C9) reports traffic ≥ 5× baseline for ≥ 2 minutes, OR a single source ASN/IP exceeds its threshold, OR customer reports of service degradation are traced to a traffic surge.
On-call: Platform (primary), Security (secondary). Pager: SEV1. Estimated MTTR: 5-30 minutes for L3/L4; 1-4 hours for L7.
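The first trigger condition (traffic ≥ 5× baseline sustained for ≥ 2 minutes) can be sketched as follows. This is an illustration only, not the edge WAF's actual logic: the `Sample` type, the function name `surge_sustained`, and the choice to reset the window on any dip below the threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    ts: float   # seconds since epoch (hypothetical field)
    rps: float  # requests per second observed at the edge (hypothetical field)


def surge_sustained(samples, baseline_rps, factor=5.0, min_duration_s=120.0):
    """Return True if rps stayed >= factor * baseline for >= min_duration_s.

    Samples must be ordered by timestamp. Any sample below the threshold
    resets the window (an assumption; the real WAF may smooth or average).
    """
    threshold = factor * baseline_rps
    window_start = None
    for s in samples:
        if s.rps >= threshold:
            if window_start is None:
                window_start = s.ts
            if s.ts - window_start >= min_duration_s:
                return True
        else:
            window_start = None
    return False
```

With a baseline of 100 rps, five samples of 600 rps taken 30 seconds apart satisfy the condition; a single dip below 500 rps in the middle restarts the two-minute clock.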
Stop-the-bleed
- Engage WAF defense: edge WAF rate limits + IP/ASN blocklisting kick in automatically. Verify status:
  matter ops waf-status
- Engage surge mode if needed (P0.E15): non-critical paths shed to preserve write + audit + dissolve.
  matter ops surge-mode-engage --reason "ddos response"
- Page the edge provider (Vercel / Cloudflare) for L3/L4 absorption.
Diagnose
Identify attack characteristics:
- L3/L4 (network): high PPS, single ASN. Provider handles.
- L7 (application): legitimate-looking HTTP traffic. Look at the cardinality of sources and the request distribution by op.
- Slowloris: many open connections, each sending data slowly. Edge connection caps mitigate.
- Auth-flood: hitting /v1/oauth/token or login flows.
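For the L7 case, the two signals named above (cardinality of sources and request distribution by op) can be computed from any sample of request records. The sketch below assumes each record is a `(source_ip, op)` pair; that shape, and the function name `summarize_requests`, are hypothetical and do not reflect the output format of any `matter ops` command.

```python
from collections import Counter


def summarize_requests(records):
    """Summarize (source_ip, op) pairs from a sample of requests.

    Returns the distinct source count, the share of traffic coming from
    the single busiest source, and request counts by op (most common first).
    """
    if not records:
        return {"distinct_sources": 0, "top_source_share": 0.0, "by_op": []}
    by_source = Counter(src for src, _ in records)
    by_op = Counter(op for _, op in records)
    top_count = by_source.most_common(1)[0][1]
    return {
        "distinct_sources": len(by_source),
        "top_source_share": top_count / len(records),
        "by_op": by_op.most_common(),
    }
```

A low distinct-source count with a high top-source share points toward a small number of offenders that can be blocked directly. A large number of sources concentrated on one op (for example a login flow) looks more like a distributed application-layer or auth flood.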
matter ops anomaly-snapshot --window 5m
Recover
Per attack type:
- L3/L4: provider absorbs; nothing for us to do beyond monitoring.
- L7: tighten edge rules (IP/ASN tier-based shedding). Add per-customer rate-limit overrides for known-good accounts so that legitimate traffic is not blocked.
- Auth-flood: enable CAPTCHA + step-up on the affected flow; defer the underlying auth processing.
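The interaction between ASN shedding and known-good overrides in the L7 step is mostly a question of evaluation order. A minimal sketch, assuming the override is checked before the blocklist; the function `edge_decision` and its parameters are hypothetical and are not the edge WAF's rule syntax:

```python
def edge_decision(account_id, asn, known_good_accounts, blocked_asns):
    """Return "allow" or "shed" for one request under tightened L7 rules.

    The known-good override is checked first so that a legitimate customer
    sharing an ASN with attack traffic is not blocked.
    """
    if account_id is not None and account_id in known_good_accounts:
        return "allow"
    if asn in blocked_asns:
        return "shed"
    return "allow"
```

For example, with ASN 64500 blocked, an unauthenticated request from that ASN is shed, while a request from a known-good account on the same ASN still passes.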
Communicate
- Status page updated if customer-facing performance is affected.
- Affected customers emailed if SLA credits are triggered.
Post-recovery action items
- Block-list updated.
- WAF rules tuned if defaults missed the pattern.
- Postmortem if customer impact > 1 hour.