MFN cascade
SEV2 runbook for an MFN cascade that partially fails or stalls.
MFN cascade (SEV2)
Triggered by: the terms of an MFN-bearing convertible were strictly
improved (lower cap OR higher discount OR lower interest, with no term
worse), which propagates the improvement to its siblings
(apps/api/lib/convertible-amend.ts). The cascade saga executes per
sibling; a partial failure is the alert condition.
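The trigger rule above (at least one term better, none worse) can be sketched as follows. This is a minimal illustration, not the code in convertible-amend.ts: the `ConvertibleTerms` shape, its field names, and `isStrictImprovement` are hypothetical, and the direction of "better" for each term is taken directly from the rule as stated in this runbook.

```typescript
// Hypothetical shape for illustration; the real type in
// convertible-amend.ts may differ.
interface ConvertibleTerms {
  valuationCap: number; // lower counts as an improvement (per this runbook)
  discountRate: number; // higher counts as an improvement (per this runbook)
  interestRate: number; // lower counts as an improvement (per this runbook)
}

// True when at least one term improved and no term got worse.
function isStrictImprovement(
  before: ConvertibleTerms,
  after: ConvertibleTerms,
): boolean {
  const noneWorse =
    after.valuationCap <= before.valuationCap &&
    after.discountRate >= before.discountRate &&
    after.interestRate <= before.interestRate;

  const someBetter =
    after.valuationCap < before.valuationCap ||
    after.discountRate > before.discountRate ||
    after.interestRate < before.interestRate;

  return noneWorse && someBetter;
}
```

Under these assumptions, an amendment that lowers the cap while also lowering the discount is not a strict improvement, so it would not start a cascade.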
On-call: Primary (cap-table bounded context). Estimated MTTR: 30-90 minutes.
Stop-the-bleed
- Freeze new MFN-bearing convertibles in the affected entity:
matter ops freeze-convertibles --entity-id <id> --reason "mfn cascade in flight"
- Snapshot the affected convertibles + amendments:
matter ops snapshot-cascade --cascade-id <id>
Diagnose
Per-sibling cascade state is in saga_instance rows. Check:
- Which siblings completed?
- Which failed and why?
- Are any in a compensating state?
matter ops mfn-cascade-status --cascade-id <id>
Failure modes:
Mode A: Counterparty consent missing
The cascade tried to amend a sibling whose counterparty consent wasn't pre-collected. Resolution: collect consent, resume.
Mode B: Fresh-409A drift
A sibling's cap reduction needs a fresh 409A valuation, but the available valuation expired mid-cascade. Resolution: re-run the valuation, resume.
Mode C: Concurrent amend race
Another amend operation on the same convertible raced with the cascade, which surfaces as an optimistic-concurrency conflict at the cascade step. Resolution: retry the failed leg.
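The Diagnose questions and the three failure modes above can be combined into a small triage helper. This is a sketch under stated assumptions: the `SiblingRow` shape, the state and failure-mode labels, and `triage` are hypothetical and do not reflect the actual saga_instance schema or the output of mfn-cascade-status.

```typescript
// Hypothetical labels; the real saga_instance columns may differ.
type SiblingState = "completed" | "failed" | "compensating" | "pending";
type FailureMode = "consent_missing" | "valuation_expired" | "concurrent_amend";

interface SiblingRow {
  convertibleId: string;
  state: SiblingState;
  failureMode?: FailureMode; // only meaningful when state is "failed"
}

// Resolutions copied from Modes A, B, and C in this runbook.
const NEXT_ACTION: Record<FailureMode, string> = {
  consent_missing: "collect counterparty consent, then resume",
  valuation_expired: "re-run the 409A valuation, then resume",
  concurrent_amend: "retry the failed leg",
};

interface TriageResult {
  byState: Record<SiblingState, string[]>;
  actions: { convertibleId: string; action: string }[];
}

// Groups siblings by state and lists the next action for each failed leg.
function triage(rows: SiblingRow[]): TriageResult {
  const byState: Record<SiblingState, string[]> = {
    completed: [],
    failed: [],
    compensating: [],
    pending: [],
  };
  const actions: { convertibleId: string; action: string }[] = [];

  for (const row of rows) {
    byState[row.state].push(row.convertibleId);
    if (row.state === "failed" && row.failureMode !== undefined) {
      actions.push({
        convertibleId: row.convertibleId,
        action: NEXT_ACTION[row.failureMode],
      });
    }
  }
  return { byState, actions };
}
```

A failed row without a recognized failure mode produces no action here, which is a prompt to investigate it by hand rather than resume blindly.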
Recover
matter ops mfn-cascade-resume --cascade-id <id>
The saga primitive's decideResume (P5.4) determines whether to
retry or compensate. Per packages/saga/close-package.tla,
non-compensable steps halt; compensable steps retry.
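The halt-or-retry rule stated above can be illustrated with a small sketch. It is not the real decideResume: the `FailedStep` shape and `sketchResumeDecision` are hypothetical, and the sketch models only the rule quoted from the TLA+ spec (compensable steps retry, non-compensable steps halt), not any compensation path.

```typescript
// Hypothetical types for illustration only.
type ResumeDecision = "retry" | "halt";

interface FailedStep {
  name: string;
  compensable: boolean;
}

// Compensable steps are retried; non-compensable steps halt the saga
// so an operator can intervene.
function sketchResumeDecision(step: FailedStep): ResumeDecision {
  return step.compensable ? "retry" : "halt";
}

// Resuming is only safe to automate if no failed step forces a halt.
function canAutoResume(failedSteps: FailedStep[]): boolean {
  return failedSteps.every((s) => sketchResumeDecision(s) === "retry");
}
```

Under this assumption, a single non-compensable failed step is enough to require manual intervention before the cascade can continue.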
Validate
matter ops verify-mfn-cascade --entity-id <id>
Expected: every MFN-bearing convertible's terms ≥ the triggering
amendment's terms. The isMfnTriggered check should now return
false for the cascade trigger.
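The expected post-condition can be expressed as a check that lists any convertible still lagging the triggering terms. This is a sketch, not the logic behind verify-mfn-cascade: `MfnTerms`, `MfnConvertible`, and both functions are hypothetical, and "≥" is read as "at least as favorable on every term", using the same directions as the trigger rule.

```typescript
// Hypothetical shapes for illustration only.
interface MfnTerms {
  valuationCap: number; // lower is more favorable (per this runbook)
  discountRate: number; // higher is more favorable (per this runbook)
  interestRate: number; // lower is more favorable (per this runbook)
}

interface MfnConvertible {
  id: string;
  terms: MfnTerms;
}

// True when `terms` is at least as favorable as `reference` on every term.
function atLeastAsFavorable(terms: MfnTerms, reference: MfnTerms): boolean {
  return (
    terms.valuationCap <= reference.valuationCap &&
    terms.discountRate >= reference.discountRate &&
    terms.interestRate <= reference.interestRate
  );
}

// Returns the ids of convertibles whose terms still lag the trigger.
function findLaggingConvertibles(
  convertibles: MfnConvertible[],
  triggerTerms: MfnTerms,
): string[] {
  return convertibles
    .filter((c) => !atLeastAsFavorable(c.terms, triggerTerms))
    .map((c) => c.id);
}
```

An empty result corresponds to the expected state; any returned id points at a sibling the cascade did not bring up to the triggering terms.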
Post-recovery action items
- Add a property test for the failure mode using the
apps/api/lib/saga-property.ts harness.
- Update convertible-amend.ts admissibility if a missing gate caused the failure.