Security
Data classification
How Matter classifies every data field. Four classes (Public, Confidential, Restricted, Highly Restricted) with per-class handling rules for encryption, redaction, retention, audit-on-read, and export. Codegen-driven from the OpenAPI spec.
Every field Matter stores or transmits has a classification. The classification determines handling rules: whether the field is encrypted at rest, whether it is redacted from observability sinks, whether reads are audited, what retention applies, whether it can appear in webhook payloads, whether it appears in customer data export.
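The handling dimensions above lend themselves to a single lookup table keyed by class. The sketch below is illustrative only: the type names, field names, and value labels are assumptions made for this example, not Matter's generated code. The values are transcribed from the per-class handling rules on this page; retention is left out because it is governed by the separate retention policy.

```typescript
// Illustrative sketch: names and value labels are assumptions, not Matter's actual types.
type Classification = "public" | "confidential" | "restricted" | "highly_restricted";

interface HandlingRules {
  // "storage_layer": no field-level encryption; "field_level": Encrypted<T> with a per-tenant DEK.
  encryptionAtRest: "storage_layer" | "field_level";
  redactedFromObservability: boolean;
  auditOnRead: "no" | "per_org_policy" | "mandatory";
  webhooks: "any_payload" | "same_tenant_only" | "fat_payload_opt_in" | "never";
  // True when the export tarball must be encrypted with a customer-supplied key.
  exportNeedsCustomerKey: boolean;
}

const HANDLING: Record<Classification, HandlingRules> = {
  public: {
    encryptionAtRest: "storage_layer",
    redactedFromObservability: false,
    auditOnRead: "no",
    webhooks: "any_payload",
    exportNeedsCustomerKey: false,
  },
  confidential: {
    encryptionAtRest: "storage_layer",
    redactedFromObservability: false,
    auditOnRead: "no",
    webhooks: "same_tenant_only",
    exportNeedsCustomerKey: false,
  },
  restricted: {
    encryptionAtRest: "field_level",
    redactedFromObservability: true,
    auditOnRead: "per_org_policy", // off by default; a per-org policy may turn it on
    webhooks: "fat_payload_opt_in",
    exportNeedsCustomerKey: false,
  },
  highly_restricted: {
    encryptionAtRest: "field_level",
    redactedFromObservability: true,
    auditOnRead: "mandatory",
    webhooks: "never",
    exportNeedsCustomerKey: true,
  },
};

// Decide whether a read of a field in this class must emit an audit entry.
function mustAuditRead(cls: Classification, orgRequiresAudit: boolean): boolean {
  const rule = HANDLING[cls].auditOnRead;
  return rule === "mandatory" || (rule === "per_org_policy" && orgRequiresAudit);
}
```

Keeping the rules in one typed record means a new class or a new handling dimension fails type-checking until every cell is filled in.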
The matrix below is the source of truth. Field-level classifications are declared in the OpenAPI spec via x-matter-classification on each property; codegen at apps/api/scripts/generate-classification-matrix.ts catches drift between spec and runtime. The drift gate is part of the CI suite (P0.D5).
The four classes
Public
Information that is intentionally public-facing. No handling restrictions.
Examples:
- Entity.legal_name (registered with the SOS; appears on public filings)
- Entity.entity_type (c_corp, llc, ...)
- Entity.jurisdiction
- Entity.formed_at
- Trademark.serial_number
- Trademark.registration_number
- Resource typed IDs (ent_..., doc_...)
- Object timestamps (created_at, updated_at)
- Plan tier names, error codes, operation IDs.
Handling:
- Encryption at rest: not required (encrypted at the storage layer for hygiene; not field-level).
- Redaction: none.
- Audit-on-read: no.
- Retention: forever (per resource retention policy).
- Export: included in customer data export.
- Webhooks: can appear in any payload.
Confidential
Customer-internal information that should not leak between tenants but is not personally identifying or financially sensitive.
Examples:
- Entity.dba_name (sometimes confidential)
- EquityPlan.authorized_shares (customer-internal; competitors should not see)
- Round.pre_money_valuation
- Convertible.discount and Convertible.valuation_cap
- Token.scopePolicy
- Cap-table aggregate numbers visible only to portfolio members.
Handling:
- Encryption at rest: not field-level (encrypted at storage layer).
- Redaction: redacted from any data export produced for another tenant. Never appears in cross-tenant responses.
- Audit-on-read: no.
- Retention: per resource retention policy.
- Export: included in the owning customer's data export; never in other customers'.
- Webhooks: appears in payloads only to webhook endpoints owned by the same tenant.
Restricted
Personally identifying information about stakeholders, founders, employees, contractors. Subject to GDPR, CCPA, and similar regimes.
Examples:
- Stakeholder.email
- Stakeholder.phone
- Stakeholder.address.street_line_1, street_line_2
- Stakeholder.preferred_name
- Stakeholder.title
- Officer.work_email
- Director.contact_phone
- BankAccount.account_holder_name
Handling:
- Encryption at rest: field-level encrypted with per-tenant DEK via Encrypted<T> wrapper (P0.C2). Blind indexes for equality lookup.
- Redaction: scrubbed from observability sinks (Sentry, Logtail, traffic capture). Path-level redaction via x-matter-pii codegen.
- Audit-on-read: no (default). Per-org policy may upgrade.
- Retention: per the retention policy. Subject to GDPR Article 17 erasure.
- Export: included in the owning customer's data export (their own data is theirs).
- Webhooks: redacted in fat payloads by default; customer opts in to receive PII fields via WebhookEndpoint.include. Thin payloads carry no PII.
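The webhook rule above amounts to a filter over the outgoing fat payload. A minimal sketch, assuming that WebhookEndpoint.include is a list of field names and that "redacted" means the field is omitted; neither detail is specified on this page. The function name assembleFatPayload and the flat classification map are hypothetical, and the same-tenant check for Confidential fields is assumed to happen before this step.

```typescript
// Illustrative sketch: the include shape and all names here are assumptions.
type Classification = "public" | "confidential" | "restricted" | "highly_restricted";

interface WebhookEndpointSketch {
  include: string[]; // assumed: Restricted field names the customer opted in to receive
}

function assembleFatPayload(
  resource: Readonly<Record<string, unknown>>,
  classifications: Readonly<Record<string, Classification | undefined>>,
  endpoint: WebhookEndpointSketch,
): Record<string, unknown> {
  const payload: Record<string, unknown> = {};
  for (const field of Object.keys(resource)) {
    const cls = classifications[field];
    // Unclassified fields are dropped so that a missing entry fails closed.
    if (cls === undefined) continue;
    // Highly Restricted fields never appear in any webhook payload.
    if (cls === "highly_restricted") continue;
    // Restricted fields appear only when the endpoint opted in.
    if (cls === "restricted" && endpoint.include.indexOf(field) === -1) continue;
    payload[field] = resource[field];
  }
  return payload;
}
```

Dropping unclassified fields is a design choice of this sketch: it keeps a spec omission from turning into a leak.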
Highly Restricted
The highest-sensitivity data: government identifiers, financial credentials, cryptographic material.
Examples:
- Stakeholder.tax_id (SSN, ITIN, foreign equivalent)
- Stakeholder.date_of_birth
- Entity.ein (federal Employer Identification Number)
- BankAccount.account_number
- BankAccount.routing_number
- TaxProfile.state_tax_ids[].value
- Token.hashedSecret (never returned; here for completeness)
- WebhookEndpoint.secret (returned once on creation; never again)
- SigningEnvelope.signature_value (cryptographic signature material)
Handling:
- Encryption at rest: field-level encrypted with per-tenant DEK via Encrypted<T>. Mandatory. Codegen drift gate fails CI if a Highly Restricted field is not encrypted.
- Redaction: scrubbed from every observability sink. Path-level redaction is mandatory. Stored idempotency-record responses are redacted before persistence.
- Audit-on-read: mandatory. Every read of a Highly Restricted field emits an AuditEntry with action <resource>.read (P0.B6 + P0.F18). The customer can see who read what, and when.
- Retention: per the retention policy. Subject to GDPR Article 17 erasure with DEK destruction.
- Export: included in the owning customer's data export but with extra customer-side encryption (export tarball is encrypted with a customer-supplied key).
- Webhooks: never appears in fat or thin payloads. The customer must call the API directly with appropriate auth + audit-on-read.
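The audit-on-read rule can be sketched as a pure function that turns one read into audit entries. The AuditEntry shape, the map key format, and the function name below are assumptions for illustration; only the <resource>.read action and the one-entry-per-Highly-Restricted-field behaviour come from this page.

```typescript
// Illustrative sketch: entry shape, key format, and names are assumptions.
type Classification = "public" | "confidential" | "restricted" | "highly_restricted";

interface AuditEntrySketch {
  action: string; // "<resource>.read"
  actorId: string;
  resourceId: string;
  fieldPath: string;
}

function auditEntriesForRead(
  resourceType: string,
  resourceId: string,
  actorId: string,
  returnedFields: readonly string[],
  // Assumed key format: "<resourceType>.<fieldPath>"
  classifications: Readonly<Record<string, Classification | undefined>>,
): AuditEntrySketch[] {
  return returnedFields
    .filter((field) => classifications[`${resourceType}.${field}`] === "highly_restricted")
    .map((field) => ({
      action: `${resourceType}.read`,
      actorId,
      resourceId,
      fieldPath: field,
    }));
}
```

A caller would persist the returned entries before sending the response, so that a read is never served without its audit record.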
Per-field declaration
In the spec:
    components:
      schemas:
        Stakeholder:
          type: object
          properties:
            email:
              type: string
              format: email
              x-matter-pii: true
              x-matter-classification: restricted
              x-matter-explainer:
                context: "Stakeholder's preferred contact email. Used for grant agreement delivery, board meeting invites, dissolution notices."
            tax_id:
              type: string
              x-matter-pii: true
              x-matter-classification: highly_restricted
              x-matter-encrypted: true
              x-matter-explainer:
                context: "Federal tax identifier (SSN for US persons, ITIN, or foreign equivalent). Required for 1099 / W-2 reporting, restricted-stock grants, 83(b) elections."

Codegen produces:
- apps/api/lib/data-classification.generated.ts: runtime classification map keyed by (resource_type, field_path).
- apps/api/lib/pii-paths.generated.ts: flat list of PII paths for redaction.
- apps/api/lib/encrypted-fields.generated.ts: list of encrypted fields cross-checked against the Prisma schema (CI gate).
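To make the drift gate concrete, the sketch below checks a list of field declarations the way the gate is described on this page: every field needs a classification, and every Highly Restricted field must be marked encrypted. The FieldDecl shape and the function names are hypothetical; the real generator reads the OpenAPI spec and the Prisma schema, which this example does not attempt.

```typescript
// Illustrative sketch: FieldDecl and both functions are assumptions, not the real generator.
type Classification = "public" | "confidential" | "restricted" | "highly_restricted";

interface FieldDecl {
  resource: string; // e.g. "Stakeholder"
  path: string; // e.g. "tax_id"
  classification?: Classification; // from x-matter-classification
  pii?: boolean; // from x-matter-pii
  encrypted?: boolean; // from x-matter-encrypted
}

// Returns one message per violation; an empty array means the gate passes.
function findDrift(fields: readonly FieldDecl[]): string[] {
  const problems: string[] = [];
  for (const field of fields) {
    const key = `${field.resource}.${field.path}`;
    if (field.classification === undefined) {
      problems.push(`${key}: missing x-matter-classification`);
    } else if (field.classification === "highly_restricted" && field.encrypted !== true) {
      problems.push(`${key}: highly_restricted but not marked x-matter-encrypted`);
    }
  }
  return problems;
}

// Flat list of PII paths, in the spirit of pii-paths.generated.ts.
function piiPaths(fields: readonly FieldDecl[]): string[] {
  return fields
    .filter((field) => field.pii === true)
    .map((field) => `${field.resource}.${field.path}`);
}
```

In CI, a non-empty result from a check like findDrift would fail the build and print the offending field paths.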
Operational consequences
The classification drives behaviour in five places:
- Encryption at write. Restricted (optional) and Highly Restricted (mandatory) fields are wrapped in Encrypted<T> on the Prisma model. The service layer calls await ctx.crypto.encrypt(value) on write and await ctx.crypto.decrypt(value) on read.
- Redaction in observability. apps/api/lib/redact.ts walks objects before they reach Sentry / Logtail / traffic capture / MCP results / idempotency-record storage.
- Audit-on-read. apps/api/lib/middleware/audit-on-read.ts checks the response payload's field classifications; emits an AuditEntry per Highly Restricted field accessed.
- Webhook payload assembly. packages/webhooks/src/dispatch.ts consults the classification map when assembling fat payloads.
- Customer data export. The export pipeline includes all owning-customer data but encrypts the tarball with the customer's key for Highly Restricted fields.
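As an illustration of the redaction step, here is a minimal path-based walker. It is a sketch under stated assumptions, not the contents of apps/api/lib/redact.ts: the function signature, the "[REDACTED]" marker, and the convention that array elements share a "[]" path segment (mirroring TaxProfile.state_tax_ids[].value) are all choices made for this example.

```typescript
// Illustrative sketch: signature, marker, and path convention are assumptions.
const REDACTED = "[REDACTED]";

// Returns a redacted copy of plain JSON-like data; the input is not mutated.
function redact(value: unknown, piiPaths: readonly string[], prefix: string = ""): unknown {
  if (Array.isArray(value)) {
    // Every element of an array shares the path "<prefix>[]".
    return value.map((item) => redact(item, piiPaths, `${prefix}[]`));
  }
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const key of Object.keys(source)) {
      const path = prefix === "" ? key : `${prefix}.${key}`;
      out[key] = piiPaths.indexOf(path) !== -1
        ? REDACTED
        : redact(source[key], piiPaths, path);
    }
    return out;
  }
  // Primitives that are not on a PII path pass through unchanged.
  return value;
}
```

Replacing the value while keeping the key preserves the payload's shape, which tends to make redacted log entries easier to read than entries with missing fields.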
Why this matters
Without a single classification matrix:
- PII drifts into logs accidentally. We fail SOC 2 + GDPR.
- Encryption is decided per-engineer; some fields get missed. Insider DB dump becomes an incident.
- Read audits are added inconsistently. Compliance breach goes undetected.
- Webhook payloads leak sensitive data to customer endpoints not configured to handle it.
The classification matrix is codegen-enforced; the drift gate fails CI on any field that lacks a classification. The matrix changes only via spec PR + Council review.
See also
- Threat model — T11–T15 + T22 cover data confidentiality threats.
- Retention policy — how long each class is retained.
- Customer SLA — the data-handling commitments.
- GDPR runbook (erasure handling lands P0.D4).