ADR-037: Secrets management — provisioning-script architecture and target-swap discipline

Status: Accepted (2026-04-21) — D1 and D2 superseded by ADR-046; D5 partially superseded by ADR-073; D9 superseded by ADR-073; runtime backend swapped from GCP Secret Manager + Workload Identity Federation to Render Environment Groups + GitHub-stored Render API key when ADR-046 selected Render as the alpha PaaS; D11’s provider-swap seam exercised by that supersession.

Context

Spectral has four secret classes to manage end-to-end:

  1. Platform operational secrets — provider API keys (Anthropic / OpenAI / Google), observability vendor keys (Logfire / Sentry / Grafana per ADR-036), Supabase service-role / anon / JWT keys, DB connection strings.
  2. Customer BYO credentials — ADR-035 D4 reservation; post-alpha feature.
  3. CI secrets — test / deploy pipeline credentials (covered in detail by ADR-062).
  4. Local dev secrets — engineer laptops.

Alpha posture: solo-builder → 2–3 engineers, no dedicated security engineer, pre-SOC2, business hours. The hosting target was unknown at TA-17 disposition time (TA-21 not yet run); the spike’s primary research lean was GCP (Workspace already in use). TA-21 (ADR-046) later selected Render as the alpha PaaS, exercising D11’s provider-swap seam.

Research synthesis: every managed secrets SaaS (Doppler, Infisical, HashiCorp Vault Cloud) requires its own long-lived service token for client authentication — a credential guarding the other credentials. GCP Workload Identity Federation eliminated this axis at the time of disposition (and Render’s own per-service env-group model substitutes cleanly). Supply-chain track record for credential-holding SaaS (CircleCI, LastPass, Okta) is real and recent; Spectral already removed LiteLLM post-supply-chain compromise (ADR-035 D1), and adding a new credential-holding SaaS would be inconsistent with that discipline.

The strongest architecture that emerged from disposition is a deployer-operated provisioning script as the deploy-time orchestrator, abstracting “where secrets get pushed to” behind a target-function interface. This decoupled the provisioning flow from the hosting choice — the TA-21 swap to Render touched target functions only, not .env.example, prompt flow, rotation semantics, or cache format.

Decision

D1 — Runtime read source

Superseded by ADR-046. Runtime backend is Render Environment Groups (was GCP Secret Manager at original disposition); see ADR-046 for the current decision and ADR-046’s addendum for the rationale of the swap. The provisioning-script interface (D4) was unchanged by the swap.

D2 — Runtime identity

Superseded by ADR-046. Runtime identity is the Render API key as a long-lived GitHub secret in scoped Environments, with quarterly rotation per ADR-062 D5 (was Workload Identity Federation at original disposition). See ADR-046 for the current decision; ADR-062 captures the long-lived-key mitigation (rotation cadence, scoped Environments, multi-layer leakage scanning).

D3 — Local dev laptops are independent of the provisioning script

.env.example (template, checked in) plus .env.local (gitignored) plus direnv plus Pydantic Settings. Dev-tier values come from personal vendor accounts (developer-minted free-tier API keys) or a team shared vault for shared dev-tier credentials. Dev laptops do not need GCP Secret Manager or Render API access.
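A minimal sketch of the local loading step, assuming .env.local holds plain KEY=value lines that are valid shell assignments. direnv normally does the equivalent automatically on directory entry. load_env_file is a hypothetical helper, shown only to make the mechanism concrete.

```bash
# Sketch only: export every variable defined in a dotenv-style file.
# Assumes plain KEY=value lines that are valid shell assignments.
# load_env_file is a hypothetical helper, not part of the repository.
load_env_file() {
  local file="$1"
  # A missing file is not an error: a fresh checkout has no .env.local yet.
  [ -f "$file" ] || return 0
  set -a        # auto-export every variable assigned while sourcing
  . "$file"
  set +a        # stop auto-exporting
}
```

Calling load_env_file ./.env.local (the leading ./ keeps the shell from searching PATH) leaves the values in the process environment, which is where Pydantic Settings reads them from. No cloud credentials are involved at any point.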

D4 — tools/provision/setup.sh is the canonical deploy-tier provisioning orchestrator

  • Language: bash (macOS / Linux / Windows-via-WSL). POSIX-compliant.
  • Modes: init (first-time bootstrap), update (add / refresh specific secrets), rotate (rotate one or many), verify (diff provisioned state versus cache; --dry-run is required on mutating modes).
  • Target environment: --env=staging|production required. Script filters the secret set by scope and pushes only the relevant subset.
  • Scope annotation: .env.example entries carry # @scope=shared|staging|production (comma-separated for multi-env) above their variable declarations. Unannotated default = staging,production (safer for a template whose purpose is to catch oversight). Single cache file .env.provision with scope-prefixed entries (shared:<NAME>, staging:<NAME>, production:<NAME>) — refined during artifact landing from the original disposition’s dual-cache design.
  • Target functions: one bash function per target (push_render_env, push_github_secret, push_supabase_secret, …). Originally push_gcp_secret; ADR-046 swap renamed to push_render_env plus new wiring.
  • Cache discipline: .env.provision is gitignored, backed up to a shared vault after each provisioning run.
  • Rotation: --rotate=<secret_name> prompts for new value (or mints via vendor CLI where supported), updates the cache file, pushes to all targets for that secret, optionally triggers rolling restart via a provider-specific hook.
  • Provider-swap discipline: adding a new target = one new bash function matching the target convention plus wiring. No change to .env.example, .env.provision, or the prompt / rotation / verify flow. Documented in tools/provision/README.md so future-us doesn’t paint over it.
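The scope-annotation rule above can be sketched as a small filter over .env.example. This is an illustration, not the script's actual parser. It assumes the annotation is written exactly as "# @scope=..." on its own line, that an annotation applies only to the next declaration, and that "shared" entries apply to every environment. secrets_for_env is a hypothetical name.

```bash
# Sketch only: print the variable names from a template that apply to one environment.
# Assumptions: annotations look exactly like "# @scope=a,b", apply to the next
# declaration only, and "shared" is in scope for every environment.
secrets_for_env() {
  local target_env="$1" template="$2"
  local prefix='# @scope=' scope="" line name
  while IFS= read -r line || [ -n "$line" ]; do
    case "$line" in
      "$prefix"*)
        # Remember the annotation until the next variable declaration.
        scope=${line#"$prefix"}
        ;;
      [A-Za-z_]*=*)
        name=${line%%=*}
        # Unannotated entries fall back to staging,production (per D4).
        [ -n "$scope" ] || scope="staging,production"
        case ",$scope," in
          *",$target_env,"* | *",shared,"*) printf '%s\n' "$name" ;;
        esac
        scope=""
        ;;
    esac
  done < "$template"
}
```

With this shape, secrets_for_env staging .env.example would list shared, staging-scoped, and unannotated entries while skipping production-only ones.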

D5 — Cofounder personal key-source discipline

Partially superseded by ADR-073. 1Password is named as the documented default operator-workstation secrets store under ADR-073 D5; the privacy intent is satisfied by recommending a default for any operator rather than treating individual operator tooling as private.

D6 — Customer BYO credentials via Supabase Vault for alpha; deferred implementation

When BYOK ships as a customer feature, encrypted blobs live in a platform-scoped core.workspace_secrets table with per-workspace access via SECURITY DEFINER checking auth.uid() against workspace membership; AEAD with workspace_id as AAD. No table, no code landed now — this ADR locks the mechanism; implementation waits until customer UX demands it.

D7 — Rotation runbook is the provisioning script plus a one-page ops doc

docs/runbooks/secrets-management.md documents standard rotation (setup.sh --mode=rotate --secret=<name> --env=<env>), emergency rotation procedure (full-surface rotation with incident reference), vendor-specific caveats, and rolling-restart triggers per target.
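As a rough sketch of what a standard rotation does internally: update the single cache entry, then push the new value to each target. The cache line format "scope:NAME=value", the helper names cache_set and rotate_secret, and passing the value as an argument (the real script prompts for it) are all assumptions made for illustration.

```bash
# Sketch only: rotate one secret by rewriting its cache entry and pushing it out.
# Assumed cache line format: "scope:NAME=value" (one entry per line).
CACHE_FILE="${CACHE_FILE:-.env.provision}"

# cache_set <file> <scope> <name> <value>: replace or append one entry.
cache_set() {
  local file="$1" key="$2:$3=" value="$4" tmp
  tmp="$(mktemp "${file}.XXXXXX")"   # mktemp creates the file owner-only
  # Copy every line that does not start with this key, then append the new value.
  if [ -f "$file" ]; then
    awk -v k="$key" 'index($0, k) != 1' "$file" > "$tmp"
  fi
  printf '%s%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}

# rotate_secret <scope> <name> <new_value> <target_fn>...
rotate_secret() {
  local scope="$1" name="$2" value="$3" target
  shift 3
  cache_set "$CACHE_FILE" "$scope" "$name" "$value"
  # Each remaining argument is a target function called as: fn <scope> <name> <value>
  for target in "$@"; do
    "$target" "$scope" "$name" "$value"
  done
}
```

Writing the cache before pushing means a failed push leaves the cache ahead of the provisioned state, which is the kind of drift the verify mode is described as detecting.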

The CI-specific rotation surface extends through docs/runbooks/ci-secrets.md (per ADR-062).

D8 — Composite audit trail

Script runs logged via git history (commits touching tools/provision/) plus shared-vault version history of .env.provision backups plus platform-native audit logs (originally GCP Cloud Audit Logs; now Render audit + GitHub org audit log) plus Supabase PGAudit. No unified UI at alpha. BigQuery export wiring is post-alpha SOC2-readiness work.

D9 — The provisioning script is the alpha IaC

Superseded by ADR-073. Declarative IaC (OpenTofu) adopted at alpha rather than deferred to post-alpha; setup.sh is now a thin orchestrator wrapping tofu plan/apply plus CLI gap-fills and human-in-loop steps.

D10 — TA-16 observability vendor key binding

Logfire / Sentry / Grafana Cloud tokens declared in .env.example, provisioned via setup.sh, pushed to the runtime backend (now Render Env Groups), read at FastAPI composition root via Pydantic Settings. Closes the ADR-036 → ADR-037 dependency.

D11 — Provider-swap design discipline

The ProvisionTarget function-per-target convention is the seam. Alpha originally implemented GCP SM + GitHub + Supabase targets; ADR-046 swapped GCP SM → Render Env Groups (added push_render_env; retained push_github_secret and push_supabase_secret). A future swap (Render → another PaaS, or back to a hyperscaler) = add new target functions, rewrite the runtime-identity bootstrap, update the runbook. .env.example, prompt flow, rotation semantics, cache format unchanged. The swap from GCP to Render is the worked example that validates the seam.
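The seam can be illustrated with a name-based dispatcher. The function names push_render_env, push_github_secret, and push_supabase_secret come from D4. The dispatcher push_to_target, the argument order, and the stub bodies are hypothetical, and real target functions would call the provider's API or CLI instead of printing.

```bash
# Sketch only: one function per target, resolved by naming convention.
# Stub bodies print the destination; they never echo the secret value.
push_render_env()      { printf 'render_env <- %s (%s)\n' "$2" "$1"; }
push_github_secret()   { printf 'github_secret <- %s (%s)\n' "$2" "$1"; }
push_supabase_secret() { printf 'supabase_secret <- %s (%s)\n' "$2" "$1"; }

# push_to_target <target> <env> <name> <value>
push_to_target() {
  local target="$1" fn="push_$1"
  shift
  # command -v also matches executables on PATH; good enough for a sketch.
  if ! command -v "$fn" > /dev/null 2>&1; then
    printf 'unknown target: %s\n' "$target" >&2
    return 1
  fi
  "$fn" "$@"
}
```

Under this convention, supporting another provider means defining one more push_<target> function. The dispatcher, the template, and the cache format stay untouched, which is the property the GCP to Render swap relied on.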

D12 — GCP KMS reserved as the master-key holder for post-alpha customer BYOK

When D6’s implementation trigger fires, the migration from Supabase Vault to application-layer envelope encryption uses GCP KMS as the master key holder. (Even if compute lives on Render, the KMS dependency is decoupled — KMS is the master-key root of trust, not a runtime service.) Dev environment never needs KMS (synthetic / test keys only); staging and production use KMS when the BYOK feature ships. Architectural decision locked now; infrastructure work deferred.

Alternatives considered

Doppler Team / Infisical / 1Password op run as runtime source of truth. Strong products with genuine drift-detection and audit UI wins. Rejected: each introduces a new credential-bearing SaaS subprocessor; each requires a long-lived service token guarding the other secrets; ~$50–90/month versus ~$3/month native at alpha; inconsistent with ADR-035 D1 supply-chain discipline.

HashiCorp Vault self-hosted. Operational attention tax incompatible with 1–3 engineer alpha.

1Password op run as primary runtime. Retained for human-shared creds, not runtime service. The Connect-server sidecar is a moving part we do not need.

Platform-native secrets only (Fly.io / Render / Railway env) — was an alternative; became the actual D1 path post-ADR-046. Folded into D11 originally; now exercised.

AWS Secrets Manager / Azure Key Vault. Ruled out by the hosting choice landing on Render.

Per-workspace external secrets manager for BYO customer creds. Fails on cost and IAM blast radius.

Application-layer envelope encryption with GCP KMS from day one for customer BYO. Correct post-alpha shape; premature to build now (D6).

.env.provision.* encrypted at rest with age/sops. Considered; rejected for alpha. The decryption key becomes its own chicken-and-egg problem; disk encryption plus shared-vault backup is adequate.

Consequences

  • The provisioning script interface is load-bearing for ADR-046. Changes to the ProvisionTarget function convention after ADR-046 lands require ADR-level review. The GCP→Render swap demonstrated the seam works.
  • Runtime identity story (D2 supersession by ADR-046). Long-lived Render API key in scoped GitHub Environments with quarterly rotation per ADR-062. The “no long-lived service account keys anywhere” property is relaxed but mitigated.
  • Unblocks ADR-036 — observability vendor keys have a resolved home.
  • Unblocks ADR-062 — CI secrets handling for integration tests inherits the GitHub-Environment-scoped pattern.
  • Partial unblock for ADR-035 D4 BYO credentials — mechanism locked (D6 + D12); implementation deferred to first customer UX.
  • Alpha accepts fragmented audit across Render / GitHub / Supabase consoles plus git history plus shared-vault version history in exchange for zero new credential-bearing vendors.
  • No Terraform / Pulumi for secret provisioning at alpha (D9, as originally decided). ADR-073 has since superseded D9 and adopts OpenTofu at alpha.
  • Cofounder discipline (D5) carries forward across all later ADRs that touch secrets (ADR-046, ADR-049, ADR-052, ADR-062). Specific upstream key-source tooling does not appear in system docs.

References

  • ADR-065 — spectral.core admission discipline
  • ADR-035 — supply-chain discipline that motivated rejecting credential-bearing SaaS
  • ADR-036 — vendor key binding (D10)
  • ADR-046 — Render alpha PaaS; supersedes D1 and D2 of this ADR
  • ADR-049 — TA-20 documentation discipline that reaffirms D5
  • ADR-062 — CI secrets handling (extends D7 rotation surface)
  • TA-17 disposition — SPEC-320 comment 6d77c4f0
  • TA-17 verification — SPEC-320 comment 47b50059
  • TA-19 D7 extension note — SPEC-320 comment 5f74104f (auth substrate dynamic-keys requirement; satisfied by env-var sourcing)
  • TA-20 documentation-discipline note — SPEC-320 comment 1f23aa1c (sweep removing personal-tooling references from system docs)
  • TA-26 env-group placement principle — SPEC-320 comment 3f6abd77
  • TA-25 rotation cadence reaffirmation — SPEC-320 comment 8c8c15fe
  • .env.example, tools/provision/setup.sh, tools/provision/README.md, docs/runbooks/secrets-management.md (commit 1d3326e)