Checkpointer encryption runbook
Operational procedures for activating envelope encryption on the LangGraph checkpointer when a forward trigger fires.
System reference: Codex system-design/agent-architecture.mdx · ADR-043 D10.
Trigger conditions
Activation triggers are owned by ADR-043 D10 — see the ADR for the authoritative list. The checkpointer relies on disk-level encryption + role-scoped DB access + audit logs + retention cascade until a trigger fires.
Implementation shape
Activation builds an EncryptedSerializer that wraps the SerializerProtocol used by AsyncPostgresSaver. A DEK is generated per domain and wrapped via the Supabase key-management substrate (Vault / pgsodium). Unwrapped DEKs are cached with a TTL. The KeyManagementProvider protocol provides the provider-swap seam per ADR-037 D11.
Estimated effort: ~2 engineer-weeks plus master-key provisioning plus rotation runbook authoring.
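A minimal sketch of the wrapper shape described above, assuming the checkpointer serializer exposes the dumps_typed / loads_typed pair of LangGraph's SerializerProtocol (mirrored locally here so the sketch runs without LangGraph installed). The Cipher seam, the JsonSerializer stand-in, the "enc+" type-tag prefix, and ToyXorCipher are hypothetical names introduced for illustration. ToyXorCipher is not encryption; it only lets the sketch run on the standard library, and a real activation would back the seam with an AEAD keyed by the domain DEK.

```python
from __future__ import annotations

import hashlib
import json
from typing import Any, Protocol


class SerializerProtocol(Protocol):
    """Local mirror of the two-method serializer shape assumed by this sketch."""

    def dumps_typed(self, obj: Any) -> tuple[str, bytes]: ...
    def loads_typed(self, data: tuple[str, bytes]) -> Any: ...


class Cipher(Protocol):
    """Hypothetical seam; production would use an AEAD keyed by the domain DEK."""

    def encrypt(self, plaintext: bytes) -> bytes: ...
    def decrypt(self, ciphertext: bytes) -> bytes: ...


class JsonSerializer:
    """Stand-in for the checkpointer's default serializer."""

    def dumps_typed(self, obj: Any) -> tuple[str, bytes]:
        return "json", json.dumps(obj).encode("utf-8")

    def loads_typed(self, data: tuple[str, bytes]) -> Any:
        type_tag, payload = data
        if type_tag != "json":
            raise ValueError(f"unsupported type tag: {type_tag}")
        return json.loads(payload.decode("utf-8"))


class ToyXorCipher:
    """NOT encryption: a reversible placeholder so the sketch is runnable."""

    def __init__(self, dek: bytes) -> None:
        self._dek = dek

    def _keystream(self, length: int) -> bytes:
        stream = bytearray()
        counter = 0
        while len(stream) < length:
            stream += hashlib.sha256(self._dek + counter.to_bytes(8, "big")).digest()
            counter += 1
        return bytes(stream[:length])

    def encrypt(self, plaintext: bytes) -> bytes:
        return bytes(p ^ k for p, k in zip(plaintext, self._keystream(len(plaintext))))

    def decrypt(self, ciphertext: bytes) -> bytes:
        return self.encrypt(ciphertext)  # XOR with the same keystream is its own inverse


class EncryptedSerializer:
    """Wraps an inner serializer and encrypts its payload bytes."""

    PREFIX = "enc+"  # hypothetical marker for encrypted payloads

    def __init__(self, inner: SerializerProtocol, cipher: Cipher) -> None:
        self._inner = inner
        self._cipher = cipher

    def dumps_typed(self, obj: Any) -> tuple[str, bytes]:
        type_tag, payload = self._inner.dumps_typed(obj)
        return self.PREFIX + type_tag, self._cipher.encrypt(payload)

    def loads_typed(self, data: tuple[str, bytes]) -> Any:
        type_tag, payload = data
        if not type_tag.startswith(self.PREFIX):
            # Assumption: rows written before activation stay readable until backfilled.
            return self._inner.loads_typed(data)
        inner_tag = type_tag[len(self.PREFIX):]
        return self._inner.loads_typed((inner_tag, self._cipher.decrypt(payload)))


serializer = EncryptedSerializer(JsonSerializer(), ToyXorCipher(b"\x01" * 32))
tag, blob = serializer.dumps_typed({"thread": "t-1", "step": 3})
restored = serializer.loads_typed((tag, blob))
```

The pass-through for untagged rows is a design assumption, not something the ADR mandates: it keeps pre-activation rows readable while the backfill runs. In LangGraph a custom serializer is normally handed to the saver through its serde argument; verify that against the installed version before relying on it.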
Components
1. KeyManagementProvider protocol in spectral.core.crypto.protocols (or domain-appropriate location).
2. SupabaseVaultKeyProvider impl in spectral_workers infrastructure. The master-key root of trust is the Supabase substrate (Vault / pgsodium), not an external cloud KMS (ADR-037 D12). The provider-swap seam (component 1) keeps an alternate root reachable without a rewrite if a future requirement demands it.
3. EncryptedSerializer wrapping the checkpointer’s serializer (the LangGraph default; provider-swap seam preserved).
4. Per-domain DEK lifecycle:
   - Generate DEK on domain creation; wrap with the Vault master key; store wrapped DEK in platform.domain_keys (or analogous).
   - Cache unwrapped DEK in process memory with TTL (default 1 h).
   - Rotate the Vault master key per quarterly cadence; re-wrap DEKs without re-encrypting payloads.
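The provider seam and the TTL cache from the component list could look roughly like the sketch below. The method names wrap_dek / unwrap_dek, the DekCache class, the load_wrapped callback, and the CountingFakeProvider test double are all assumptions made for illustration; the fake "wraps" by reversing bytes and performs no cryptography. The clock is injectable so expiry can be exercised without sleeping.

```python
from __future__ import annotations

import time
from typing import Callable, Protocol


class KeyManagementProvider(Protocol):
    """Hypothetical method names for the provider-swap seam."""

    def wrap_dek(self, domain_id: str, dek: bytes) -> bytes: ...
    def unwrap_dek(self, domain_id: str, wrapped: bytes) -> bytes: ...


class DekCache:
    """Caches unwrapped DEKs in process memory until their TTL expires."""

    def __init__(
        self,
        provider: KeyManagementProvider,
        load_wrapped: Callable[[str], bytes],  # e.g. a read from platform.domain_keys
        ttl_seconds: float = 3600.0,           # 1 h default, as stated in the runbook
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._provider = provider
        self._load_wrapped = load_wrapped
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: dict[str, tuple[float, bytes]] = {}

    def get(self, domain_id: str) -> bytes:
        now = self._clock()
        entry = self._entries.get(domain_id)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh: no provider round trip
        dek = self._provider.unwrap_dek(domain_id, self._load_wrapped(domain_id))
        self._entries[domain_id] = (now + self._ttl, dek)
        return dek

    def invalidate(self, domain_id: str) -> None:
        """Drop a cached DEK, for example after a re-wrap."""
        self._entries.pop(domain_id, None)


class CountingFakeProvider:
    """Test double only: reverses bytes instead of doing real key wrapping."""

    def __init__(self) -> None:
        self.unwrap_calls = 0

    def wrap_dek(self, domain_id: str, dek: bytes) -> bytes:
        return dek[::-1]

    def unwrap_dek(self, domain_id: str, wrapped: bytes) -> bytes:
        self.unwrap_calls += 1
        return wrapped[::-1]


fake_now = [0.0]
provider = CountingFakeProvider()
wrapped_store = {"domain-1": provider.wrap_dek("domain-1", b"dek-bytes")}
cache = DekCache(provider, wrapped_store.__getitem__, clock=lambda: fake_now[0])
first = cache.get("domain-1")
second = cache.get("domain-1")  # served from the cache
```

Because the cache depends only on the protocol, swapping the Supabase-backed provider for another root of trust would not touch the caching logic.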
Migration
When activated:
- Land platform.domain_keys migration.
- Provision the Vault master key per environment (a pgsodium key id held in Supabase Vault).
- Deploy a backfill job that generates per-domain DEKs for existing domains.
- Deploy the workers update with EncryptedSerializer enabled via feature flag.
- Re-encrypt existing checkpointer rows (one-time backfill; runs in workers).
- Remove the feature flag once backfill completes.
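The re-encryption backfill in the steps above is easier to operate if it can be re-run safely. The sketch below shows one way to make a batch idempotent, assuming encrypted rows carry a distinguishing type tag. The Row shape, the "enc+" prefix, and the function names are hypothetical, and the byte-reversing encrypt callback in the usage lines is a placeholder rather than real encryption.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable

ENCRYPTED_PREFIX = "enc+"  # hypothetical marker for already-encrypted rows


@dataclass
class Row:
    """Hypothetical in-memory view of one checkpointer row."""

    thread_id: str
    type_tag: str
    payload: bytes


def reencrypt_batch(rows: Iterable[Row], encrypt: Callable[[bytes], bytes]) -> int:
    """Encrypt cleartext rows in place and return how many were converted."""
    converted = 0
    for row in rows:
        if row.type_tag.startswith(ENCRYPTED_PREFIX):
            continue  # already encrypted: skipping makes re-runs safe
        row.payload = encrypt(row.payload)
        row.type_tag = ENCRYPTED_PREFIX + row.type_tag
        converted += 1
    return converted


def count_cleartext(rows: Iterable[Row]) -> int:
    """Rows still awaiting backfill; zero suggests the flag can be removed."""
    return sum(1 for row in rows if not row.type_tag.startswith(ENCRYPTED_PREFIX))


rows = [
    Row("t-1", "json", b'{"step": 1}'),
    Row("t-2", "enc+json", b"\x00\x01"),
]
placeholder_encrypt = lambda data: data[::-1]  # placeholder, not encryption
first_pass = reencrypt_batch(rows, placeholder_encrypt)
second_pass = reencrypt_batch(rows, placeholder_encrypt)
remaining = count_cleartext(rows)
```

A second pass converts nothing, so an interrupted job can simply be restarted, and a zero cleartext count is a reasonable gate before removing the feature flag.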
Verification
After activation:

    -- Confirm checkpointer rows are encrypted (payloads should be base64 ciphertext, not the cleartext serializer output)
    SELECT pg_typeof(state), octet_length(state)
    FROM langgraph.checkpoints
    LIMIT 10;

A roundtrip test confirms the workers can decrypt and resume an arbitrary thread.
Rotation
Quarterly cadence (mirrors the ADR-062 D5 secrets rotation).
- Rotate the Vault master key version.
- Re-wrap all domain DEKs against the new key version (no payload re-encryption needed).
- Verify a sample of threads decrypts successfully.
Old master-key versions are retained in Vault for audit + emergency decrypt.
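The reason rotation needs no payload re-encryption is that only the wrapping around each DEK changes; the DEK itself, and therefore every ciphertext produced with it, stays the same. A minimal sketch of that re-wrap loop follows. VersionedKeyProvider, WrappedDek, rewrap_all, and the XOR-based FakeVersionedProvider are hypothetical names for illustration, and the fake performs no real key wrapping.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterable, Protocol


class VersionedKeyProvider(Protocol):
    """Hypothetical provider shape that wraps and unwraps per master-key version."""

    def wrap(self, dek: bytes, key_version: int) -> bytes: ...
    def unwrap(self, wrapped: bytes, key_version: int) -> bytes: ...


@dataclass
class WrappedDek:
    """Hypothetical row shape for a stored wrapped DEK."""

    domain_id: str
    key_version: int
    wrapped: bytes


def rewrap_all(
    records: Iterable[WrappedDek], provider: VersionedKeyProvider, new_version: int
) -> int:
    """Re-wrap each DEK under new_version and return how many records changed."""
    changed = 0
    for record in records:
        if record.key_version == new_version:
            continue  # already rotated: safe to re-run after a partial failure
        dek = provider.unwrap(record.wrapped, record.key_version)
        record.wrapped = provider.wrap(dek, new_version)
        record.key_version = new_version
        changed += 1
    return changed


class FakeVersionedProvider:
    """Test double only: XORs with a per-version key; not real key wrapping."""

    def __init__(self, master_keys: dict[int, bytes]) -> None:
        self._master_keys = master_keys  # old versions stay available

    def _xor(self, data: bytes, key_version: int) -> bytes:
        key = self._master_keys[key_version]
        return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

    def wrap(self, dek: bytes, key_version: int) -> bytes:
        return self._xor(dek, key_version)

    def unwrap(self, wrapped: bytes, key_version: int) -> bytes:
        return self._xor(wrapped, key_version)


provider = FakeVersionedProvider({1: b"\x01\x02", 2: b"\x0a\x0b"})
dek = b"example-dek"
records = [WrappedDek("domain-1", 1, provider.wrap(dek, 1))]
old_wrapped = records[0].wrapped
changed = rewrap_all(records, provider, new_version=2)
```

After the loop, unwrapping under the new version yields the original DEK, which is the property the sample-thread verification step checks end to end.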
Disaster scenarios
- DEK unwrap fails (Vault key unavailable): workers fail closed; /health returns 503 (auth check fails on Spectral Agent paths). Resolve Vault/pgsodium key access; verify with a sample roundtrip.
- Domain DEK lost: the domain’s checkpointer history becomes unrecoverable. Mitigation: the wrapped DEK lives in platform.domain_keys, covered by the standard Supabase backup + PITR cadence (ADR-040).
- Vault master key lost: no domain DEK can be unwrapped, so the full checkpointer history is unrecoverable. The master key shares the Supabase failure domain with the data (the tradeoff of a Supabase-native root of trust vs an external KMS), so the mitigation is the same DR posture as the rest of the data: Supabase managed backups + PITR, with the Vault key included in the project’s backup scope. Escalate via the DR runbook on a confirmed loss.
See also
- ADR-043: Spectral Agent conversation persistence (D10 forward trigger)
- ADR-037: D12 Supabase-substrate key management (no external KMS)
- docs/runbooks/secrets-management.md: quarterly rotation cadence
- docs/runbooks/disaster-recovery.md: DR scenarios