Deployment runbook
How to deploy Spectral to production and how to roll back. The topology is ADR-109 (one Cloudflare API/workers container plus Pages frontends/docs) and the deploy model is ADR-110 (config resolved by provision.sh, applied by GitHub Actions).
Model
- `main` is integration: pushing `main` does not deploy.
- A fast-forward push of the `production` branch deploys product surfaces: it triggers `.github/workflows/deploy-production.yml` for the API/workers container and `.github/workflows/deploy-pages.yml` for `app.`, `ops.`, and `docs.`. The `production-ff-only` ruleset enforces fast-forward + no-delete on that branch.
- The deploy reads the `production` GitHub Environment (populated by `provision.sh`), so config/secrets are never passed by hand.
One-time / on-config-change: publish the environment
Whenever infra/environments.toml or a referenced 1Password value changes (a new key, a rotated secret), republish the GitHub Environment:

    op signin                                                 # the 1Password CLI must be authenticated
    tools/provision/provision.sh --env production --dry-run   # review the plan
    tools/provision/provision.sh --env production             # publish (y/N)

The GitHub substrate (the `production` Environment + the FF ruleset) is bootstrapped once with `tools/provision/github_resources.sh`.
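The dry-run-then-publish sequence can be wrapped so that a failing plan never reaches the publish step. This is a minimal sketch, not part of the repository: `publish_environment` and the `PROVISION` override are hypothetical names, and only the `--env` and `--dry-run` flags come from the commands above.

```bash
# PROVISION is a hypothetical override for the script path (useful for testing).
PROVISION="${PROVISION:-tools/provision/provision.sh}"

# Run the dry run first; publish only if the plan step exits cleanly.
publish_environment() {
  local env_name="$1"
  if ! "$PROVISION" --env "$env_name" --dry-run; then
    echo "dry run failed; not publishing" >&2
    return 1
  fi
  # The publish step still asks for its own confirmation.
  "$PROVISION" --env "$env_name"
}
```

Usage would be `publish_environment production`, run after `op signin`.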
Deploy

    git switch -c production main   # first time only; thereafter: git switch production
    git merge --ff-only main        # advance production to the integration tip
    git push origin production      # fast-forward push → fires deploy-production.yml
    gh run watch                    # follow it

(Or re-run without a new commit: `gh workflow run deploy-production.yml --ref production`.)
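Because the ruleset rejects anything that is not a fast-forward, it can help to confirm beforehand that `production` is still an ancestor of `main`. A minimal sketch; `can_fast_forward` is a hypothetical helper name, not something the repository ships:

```bash
# Succeeds (exit 0) when <target> is an ancestor of <source>,
# i.e. <target> can be fast-forwarded to <source>.
can_fast_forward() {
  local target="$1" source="$2"
  git merge-base --is-ancestor "$target" "$source"
}
```

For example, `can_fast_forward production main || echo "production has diverged from main"` flags a diverged branch before `git merge --ff-only` is attempted.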
deploy-production.yml runs, in order:
- Apply Supabase migrations: `supabase db push` against the session-pooler DSN.
- Deploy the app container: `wrangler deploy` builds `infra/docker/app.Dockerfile`, pushes the image, and deploys the `spectral` Worker + container; then sets the container’s runtime variables (`--var`) and secrets (`wrangler secret bulk`).
- Reconcile the edge: create any R2 buckets + attach `api.runspectral.com` to the Worker.
- Smoke: poll `https://api.runspectral.com/health` until 200 (a cold first custom-domain attach waits on cert issuance; the gate is patient).
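The smoke gate in the last step amounts to a bounded polling loop. The sketch below shows the general shape only; it is not the workflow's actual script. The function names and the default attempt count and delay are assumptions chosen for illustration.

```bash
# Print the HTTP status code for a URL ("000" when the request fails outright).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

# Poll <url> until it returns 200, giving up after <max_attempts> tries.
# The defaults (30 tries, 10 s apart) are illustrative, not the workflow's values.
wait_for_200() {
  local url="$1" max_attempts="${2:-30}" delay="${3:-10}"
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$(http_status "$url")"
    if [ "$status" = "200" ]; then
      echo "healthy after ${attempt} attempt(s)"
      return 0
    fi
    echo "attempt ${attempt}/${max_attempts}: got ${status}; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
  echo "gave up after ${max_attempts} attempts" >&2
  return 1
}
```

Running `wait_for_200 https://api.runspectral.com/health` by hand reproduces the gate locally; a generous attempt budget matters on a first custom-domain attach, when certificate issuance delays the first 200.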
deploy-pages.yml builds and deploys these Cloudflare Pages projects from the same production branch:
| Hostname | Project | Source | Notes |
|---|---|---|---|
| app.runspectral.com | spectral-dashboard | apps/dashboard | Vite SPA; email/password sign-in uses the public Supabase publishable key; Pages Function proxies /api/* to api.runspectral.com with the /api prefix stripped. |
| ops.runspectral.com | spectral-operations | apps/operations | Vite SPA; email/password sign-in uses the public Supabase publishable key; Pages Function middleware gates non-public routes with Supabase JWKS + organization_role=operations; /operator/* proxies to the API. |
| docs.runspectral.com | spectral-docs | apps/docs-user | Astro static docs. |
Verify

    curl -s https://api.runspectral.com/health | jq

Expect 200 with `outbox: wired` (and `world_agent_chat: wired`). `llm` and `delegation` may report `degraded` until the per-domain LLM credential and the act-as signing key are provisioned; both are expected pre-SPEC-549.
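The same check can be scripted instead of read by eye. This sketch assumes the health endpoint returns a flat JSON object with string values, such as `{"outbox": "wired", "world_agent_chat": "wired", "llm": "degraded"}`. That shape is an assumption; the runbook does not specify it. `health_is_wired` is a hypothetical helper name.

```bash
# Reads a health JSON document on stdin and succeeds only when both
# required subsystems report "wired". The flat key/value shape is assumed.
health_is_wired() {
  jq -e '.outbox == "wired" and .world_agent_chat == "wired"' >/dev/null
}
```

Usage: `curl -s https://api.runspectral.com/health | health_is_wired && echo ok`. The `llm` and `delegation` fields are deliberately ignored, since they may legitimately report `degraded`.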
Cutover and rollback
Cutover is generation-stamped, not blue-green (ADR-109 D5): the new container claims its generation’s outbox rows; the prior generation’s rows simply stop being claimed (no color flip, no reaper). Schema migrations are forward-only + expand/contract, so the prior generation’s code keeps working against the new schema during the overlap.
Because production is fast-forward-only, rollback is forward-fix: revert the offending commit on main, fast-forward production, and redeploy. The expand/contract rule guarantees the reverted (prior-generation) code runs against the already-migrated schema, so a code revert is safe without a schema rollback.
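The forward-fix sequence can be sketched as below. `forward_fix` is a hypothetical helper, and it assumes a local checkout where committing the revert directly on `main` is acceptable; if `main` is protected, the revert would go through the usual review flow instead.

```bash
# Revert <bad_commit> on main, then fast-forward production to the new tip.
# Stops at the first failing step, so production never moves on a failed revert.
forward_fix() {
  local bad_commit="$1"
  git switch main &&
    git revert --no-edit "$bad_commit" &&
    git switch production &&
    git merge --ff-only main
}
```

After it succeeds, `git push origin production` triggers the redeploy exactly as a normal deploy would. No history is rewritten, so the fast-forward-only ruleset is satisfied.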
Common failures
| Symptom | Cause / fix |
|---|---|
| Migration step: failed to parse … DSN | A DATABASE_URL problem: confirm the op value is a valid session-pooler URL (port 5432) and that provision.sh published it correctly (see solutions/integration-issues). |
| Migration step: connection timeout | The direct connection is IPv6-only; the runner is IPv4. Use the session pooler DSN. |
| Container step: wrangler not found | Edge deps must install with --ignore-workspace (the edge package isn’t a pnpm workspace member). |
| Smoke: persistent 403 | Cloudflare is blocking datacenter origins (Bot Fight Mode). It must stay off on api. (ADR-052 D3); scope bot protection to the human portals only. |
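For the DSN failures in the table, a quick local check of the port can rule out the wrong connection string before re-running the workflow. This is a rough sketch with hypothetical helper names; it uses plain string handling rather than a full URL parser, so it does not cover bracketed IPv6 hosts that lack a port.

```bash
# Print the explicit port of a postgres URL; fail when none is present.
dsn_port() {
  local hostport="${1#*://}"        # drop the scheme
  hostport="${hostport%%[/?]*}"     # drop path and query
  hostport="${hostport##*@}"        # drop credentials
  case "$hostport" in
    *:*) printf '%s\n' "${hostport##*:}" ;;
    *) return 1 ;;
  esac
}

# Succeeds when the DSN uses port 5432, the session-pooler port named above.
is_session_pooler_dsn() {
  [ "$(dsn_port "$1")" = "5432" ]
}
```

For example, `is_session_pooler_dsn "$DATABASE_URL" || echo "unexpected port"` catches a DSN with a different or missing port without printing the secret itself.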
Related
- ADR-109: hosting topology, generation cutover, expand/contract migrations.
- ADR-110: provisioning + deploy model.
- ADR-052: edge posture (`api.` proxied, Bot Fight Mode off).
- `secrets-management.md`: publishing + rotating secrets.
- `edge.md`: edge / DNS operations.