Deployment runbook

How to deploy Spectral to production and how to roll back. The topology is ADR-109 (one Cloudflare API/workers container plus Pages frontends/docs) and the deploy model is ADR-110 (config resolved by provision.sh, applied by GitHub Actions).

Model

  • main is integration — pushing main does not deploy.
  • A fast-forward push of the production branch deploys the product surfaces: it triggers .github/workflows/deploy-production.yml for the API/workers container and .github/workflows/deploy-pages.yml for the app., ops., and docs. sites. The production-ff-only ruleset allows only fast-forward pushes to that branch and forbids deleting it.
  • The deploy reads the production GitHub Environment (populated by provision.sh), so config/secrets are never passed by hand.

One-time / on-config-change: publish the environment

Whenever infra/environments.toml or a referenced 1Password value changes (a new key, a rotated secret), republish the GitHub Environment:

op signin # the 1Password CLI must be authenticated
tools/provision/provision.sh --env production --dry-run # review the plan
tools/provision/provision.sh --env production # publish (y/N)

The GitHub substrate (the production Environment + the FF ruleset) is bootstrapped once with tools/provision/github_resources.sh.

Deploy

git switch -c production main # first time only: create the branch from main
git switch production # thereafter: the branch already exists
git merge --ff-only main # advance production to the integration tip
git push origin production # fast-forward push → fires deploy-production.yml and deploy-pages.yml
gh run watch # follow it

(Or re-run without a new commit: gh workflow run deploy-production.yml --ref production.)
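Because the ruleset rejects anything other than a fast-forward, it can help to confirm locally that production is an ancestor of main before pushing. A minimal sketch, assuming local branches named production and main; the function name can_fast_forward is hypothetical and not part of the repo's tooling:

```shell
# Hypothetical helper: succeeds only if <target> can be fast-forwarded to
# <source>, i.e. <target> is an ancestor of (or equal to) <source>.
can_fast_forward() {
  local target="$1"
  local source="$2"
  git merge-base --is-ancestor "$target" "$source"
}

# Usage (not run here):
#   can_fast_forward production main || echo "production has diverged from main" >&2
```

If the check fails, production holds commits that main does not, and git merge --ff-only main would refuse as well.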

deploy-production.yml runs, in order:

  1. Apply Supabase migrations: supabase db push against the session-pooler DSN.
  2. Deploy the app container: wrangler deploy builds infra/docker/app.Dockerfile, pushes the image, and deploys the spectral Worker + container; then sets the container’s runtime variables (--var) and secrets (wrangler secret bulk).
  3. Reconcile the edge: create any R2 buckets and attach api.runspectral.com to the Worker.
  4. Smoke: poll https://api.runspectral.com/health until 200 (a cold first custom-domain attach waits on cert issuance; the gate is patient).
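The smoke step in item 4 can be reproduced by hand when a deploy looks stuck. The following is only a sketch of a patient polling gate, not the workflow's actual implementation: the function names, the attempt count, and the delay are assumptions.

```shell
# Print the HTTP status code for a URL ("000" when the request fails outright).
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1" || true
}

# Poll <url> until it answers 200, or give up after <attempts> tries.
# Defaults (60 tries, 10 s apart) are illustrative assumptions.
poll_until_200() {
  local url="$1"
  local attempts="${2:-60}"
  local delay="${3:-10}"
  local i=1
  local status
  while [ "$i" -le "$attempts" ]; do
    status=$(fetch_status "$url")
    if [ "$status" = 200 ]; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    echo "attempt $i/$attempts: got $status" >&2
    # Wait before the next try, but not after the final one.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "gave up after $attempts attempts" >&2
  return 1
}

# Usage (not run here):
#   poll_until_200 https://api.runspectral.com/health
```

A generous attempt budget matters mostly on the first custom-domain attach, where certificate issuance delays the first 200.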

deploy-pages.yml builds and deploys these Cloudflare Pages projects from the same production branch:

| Hostname | Project | Source | Notes |
| --- | --- | --- | --- |
| app.runspectral.com | spectral-dashboard | apps/dashboard | Vite SPA; email/password sign-in uses the public Supabase publishable key; Pages Function proxies /api/* to api.runspectral.com with the /api prefix stripped. |
| ops.runspectral.com | spectral-operations | apps/operations | Vite SPA; email/password sign-in uses the public Supabase publishable key; Pages Function middleware gates non-public routes with Supabase JWKS + organization_role=operations; /operator/* proxies to the API. |
| docs.runspectral.com | spectral-docs | apps/docs-user | Astro static docs. |
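The dashboard's /api/* proxy rule amounts to a path rewrite. The real logic lives in a Pages Function; the shell sketch below only illustrates the mapping the table describes, and the function name upstream_url is hypothetical.

```shell
# Map a dashboard request path to the API URL it is proxied to.
# Only paths under /api/ are proxied; the /api prefix is stripped.
upstream_url() {
  local path="$1"
  case "$path" in
    /api/*) printf 'https://api.runspectral.com%s\n' "${path#/api}" ;;
    *) return 1 ;; # not a proxied path
  esac
}

# Usage (not run here):
#   upstream_url /api/health
```

Under this rule a request for /api/health on app.runspectral.com is served by /health on api.runspectral.com.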

Verify

curl -s https://api.runspectral.com/health | jq

Expect 200 with outbox: wired (and world_agent_chat: wired). llm and delegation may report degraded until the per-domain LLM credential and the act-as signing key are provisioned — both expected pre-SPEC-549.
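The check above can be scripted so that it fails loudly instead of relying on a visual read. This sketch assumes the health body is a flat JSON object with string fields named outbox and world_agent_chat; that shape is inferred from the expected output above and is not confirmed. The function name is hypothetical.

```shell
# Reads a /health JSON body on stdin and succeeds only when both fields the
# runbook expects are "wired". The flat object shape is an assumption.
health_is_wired() {
  jq -e '.outbox == "wired" and .world_agent_chat == "wired"' >/dev/null
}

# Usage (not run here); -f makes curl fail on a non-2xx status:
#   curl -fsS https://api.runspectral.com/health | health_is_wired && echo ok
```

The llm and delegation fields are deliberately not checked, since they may report degraded for now.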

Cutover and rollback

Cutover is generation-stamped, not blue-green (ADR-109 D5): the new container claims its generation’s outbox rows; the prior generation’s rows simply stop being claimed (no color flip, no reaper). Schema migrations are forward-only + expand/contract, so the prior generation’s code keeps working against the new schema during the overlap.

Because production is fast-forward-only, rollback is forward-fix: revert the offending commit on main, fast-forward production, and redeploy. The expand/contract rule guarantees the reverted (prior-generation) code runs against the already-migrated schema, so a code revert is safe without a schema rollback.
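The forward-fix sequence can be sketched as a small helper. It assumes the revert may be committed directly on a local main; if main requires a pull request, land the revert that way and run only the fast-forward part. The function name forward_fix is hypothetical.

```shell
# Hypothetical helper: revert <commit> on main, then fast-forward production
# to the new main tip. Pushing is left to the caller so the result can be
# reviewed first.
forward_fix() {
  local bad_commit="$1"
  git switch main &&
    git revert --no-edit "$bad_commit" &&
    git switch production &&
    git merge --ff-only main
}

# Usage (not run here):
#   forward_fix <sha> && git push origin main && git push origin production && gh run watch
```

The revert is a new commit on top of main, so production still only moves forward and the ruleset is satisfied.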

Common failures

| Symptom | Cause / fix |
| --- | --- |
| Migration step: failed to parse … DSN | A DATABASE_URL problem: confirm the op value is a valid session-pooler URL (port 5432) and that provision.sh published it correctly (see solutions/integration-issues). |
| Migration step: connection timeout | The direct connection is IPv6-only; the runner is IPv4. Use the session pooler DSN. |
| Container step: wrangler not found | Edge deps must install with --ignore-workspace (the edge package isn’t a pnpm workspace member). |
| Smoke: persistent 403 | Cloudflare is blocking datacenter origins (Bot Fight Mode). It must stay off on api. (ADR-052 D3); scope bot protection to the human portals only. |

  • ADR-109 — hosting topology, generation cutover, expand/contract migrations.
  • ADR-110 — provisioning + deploy model.
  • ADR-052 — edge posture (api. proxied, Bot Fight Mode off).
  • secrets-management.md — publishing + rotating secrets.
  • edge.md — edge / DNS operations.