ADR-048: Deployment topology — six Render services + two Cloudflare Pages projects, deployment-generation routing
Status: Accepted (2026-04-22) — D2’s inheritance from ADR-044 D5 superseded by ADR-063; SQL grants between contexts do not ship by default; call flow between contexts goes through framework-layer composition; notification flow between contexts continues via events.
Context
This ADR codifies the deployment topology for Spectral 0.3.0 against the Render alpha PaaS substrate (ADR-046). Scope: service granularity, rollout discipline, deploy coordination, healthcheck + drain contracts, environment separation. Most topology was implied by ADR-046 D2/D5/D6/D7/D16; this ADR makes it explicit and closes the genuinely-new calls: migration hardening, cutover mechanism, worker version coexistence, /version contract, key-exchange auth.
Key architectural result: single-color rolling with deployment-generation stamping replaces both an over-engineered drain-monitor proposal and an under-engineered “handler discipline” proposal. Generation stamping gives a structural guarantee — gen-N events processed by gen-N code — that eliminates cognitive-load tax on handler engineers.
CD pipeline orchestration is split out as ADR-053 (TA-26). This ADR owns topology + deploy contracts; ADR-053 owns workflow machinery.
Decision
D1 — Six Render deployables + two Cloudflare Pages projects per environment
- Render: `api` (web, FastAPI + uvicorn), `dashboard` (web, TanStack Start, app.runspectral.com), `operations` (web, TanStack Start, ops.runspectral.com), `workers` (background worker, LISTEN/NOTIFY consumer), `retention-run` (cron, ADR-042 DELETE sweep), `backup-nightly` (cron, ADR-040 pg_dump → age → GCS).
- Cloudflare Pages: `docs-user` (public, docs.runspectral.com), `docs-codex` (Pages Function JWKS auth, codex.runspectral.com).
- Split axes: runtime profile, audience, HTTP surface. Not by context — contexts are code boundaries, not deploy boundaries. Per ADR-049 D1, the six Render services map to five container images (the `retention-run` cron reuses the `workers` image).
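The six-services-to-five-images mapping can be expressed as a small table; this is an illustrative sketch only (the authoritative mapping lives in ADR-049 D1 and the Render blueprints, and the image names here are assumptions):

```python
# Hypothetical service -> container-image mapping per D1 / ADR-049 D1.
# Image names are illustrative; only the retention-run -> workers reuse
# is stated in this ADR.
SERVICE_IMAGE = {
    "api": "api",
    "dashboard": "dashboard",
    "operations": "operations",
    "workers": "workers",
    "retention-run": "workers",        # cron reuses the workers image
    "backup-nightly": "backup-nightly",
}

assert len(SERVICE_IMAGE) == 6                # six Render services
assert len(set(SERVICE_IMAGE.values())) == 5  # five container images
```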
D2 — Single workers service at alpha; outbox architecture collapsed
- Single `core.outbox` table (collapses the per-context outbox tables from the original ADR-044 D3).
- Single `core.event_handled` table keyed on `(handler_name, idempotency_key)`, with `handler_name` scope-qualified (e.g., `worlds.scan_completed_indexer`).
- `source` column (renamed from `source_bc`; the envelope field is also renamed; the `SourceBC` Literal alias is dropped).
- `content_class` column on both tables; future-proofs per-class retention divergence.
- `channel` column with default `'outbox_default'`; the publisher sets it explicitly when the taxonomy expands; the `core.outbox_notify()` trigger reads `NEW.channel` and `pg_notify`s it — the function stays frozen forever.
- Inheritance from ADR-044 D5 superseded by ADR-063 — no SQL grants between contexts. Notification flow between contexts continues via events; call flow between contexts uses DI through the workers entrypoint composition seam (per ADR-060 / ADR-063). The "single workers" decision is reinforced by ADR-060 D-runtime — workers IS the framework-layer composition seam where dependencies between contexts wire up at startup.
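The `(handler_name, idempotency_key)` dedup contract can be sketched as follows. This is a hedged illustration, not the real implementation: an in-memory set stands in for the `core.event_handled` table, and the function names are invented for the example.

```python
# Sketch of the core.event_handled dedup contract: a handler runs at most
# once per (handler_name, idempotency_key). The set below stands in for the
# Postgres table; in SQL the equivalent is INSERT ... ON CONFLICT DO NOTHING.
from typing import Callable

_handled: set[tuple[str, str]] = set()  # stand-in for core.event_handled

def run_once(handler_name: str, idempotency_key: str,
             handler: Callable[[], None]) -> bool:
    """Return True if the handler ran, False if already handled."""
    key = (handler_name, idempotency_key)
    if key in _handled:
        return False                    # duplicate delivery: skip
    handler()
    _handled.add(key)
    return True

calls: list[int] = []
assert run_once("worlds.scan_completed_indexer", "evt-1", lambda: calls.append(1))
assert not run_once("worlds.scan_completed_indexer", "evt-1", lambda: calls.append(1))
assert calls == [1]                     # handler executed exactly once
```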
D3 — Docs on Cloudflare Pages, not static-mount
- `docs.runspectral.com` → Pages (`docs-user`), public. `codex.runspectral.com` → Pages (`docs-codex`) with a Pages Function enforcing JWKS-local auth + `OPERATIONS_SCOPES` check (same pattern as Operations Start per ADR-046 D9).
- Deploy via `wrangler pages deploy --branch <env>` invoked from bash (supply-chain protection per ADR-046 D14).
- Supersedes part of ADR-046 D5 — Start services no longer static-mount docs. Astro builds output `dist/` for Pages, not Start-service public dirs.
D4 — Supabase branching + Management API + hardened expand/contract
- Staging: a persistent preview branch of the production Supabase project (one project, two branches).
- Orchestration: Management API from GH Actions (not the Supabase GitHub integration), enabling tag-based trunk dev.
- Migration-discipline hardening: an AST-level compat lint rejects DROP COLUMN / incompatible ALTER TYPE / DROP TABLE / NOT-NULL-without-default / UNIQUE-on-populated-column unless the migration carries an explicit `-- compat: breaking` marker (per `tools/quality/check_migration_compat.py`); a schema-version gate in the cutover workflow (green must report the expected migration head); a pre-merge dry-run on a throwaway branch with `--with-data`; a documented maintenance-window pattern for truly breaking migrations.
- True DB-layer blue/green is not achievable on managed Supabase (no promote/swap primitive; no inbound logical replication). Hardened expand/contract is the realistic path.
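The intent of the compat lint can be illustrated with a toy check. The real `tools/quality/check_migration_compat.py` works at the AST level; this regex version is only a sketch of the rule (breaking DDL requires the explicit marker) and covers just a subset of the rejected patterns:

```python
# Toy sketch of the migration compat lint: breaking DDL is rejected unless
# the file carries an explicit "-- compat: breaking" marker. The real lint
# parses SQL properly; this regex covers only a few illustrative patterns.
import re

BREAKING = re.compile(
    r"\b(DROP\s+COLUMN|DROP\s+TABLE|ALTER\s+TYPE|SET\s+NOT\s+NULL)\b", re.I
)

def migration_ok(sql: str) -> bool:
    """Accept a migration unless it contains unmarked breaking DDL."""
    if "-- compat: breaking" in sql:
        return True                     # author acknowledged the break
    return not BREAKING.search(sql)

assert migration_ok("ALTER TABLE t ADD COLUMN c text;")          # additive: fine
assert not migration_ok("ALTER TABLE t DROP COLUMN c;")          # unmarked break
assert migration_ok("-- compat: breaking\nALTER TABLE t DROP COLUMN c;")
```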
D5 — Deployment-generation stamping + single-color rolling
- Monotonic `generation` stamped on every outbox row at publish time by the publisher from the `SPECTRAL_GENERATION` env var. Lives on the outbox row only; the envelope stays substrate-agnostic.
- Scalar `SPECTRAL_GENERATION` per service. Workers are always tied to exactly one generation: `WHERE generation = $MY_GENERATION`.
- Per-generation LISTEN channels (`outbox_gen_<N>`): a V2 worker is structurally incapable of receiving V1 NOTIFY.
- `core.deployments` table + `core.deployment_generation_seq` for the monotonic counter. GH Actions writes a row via `INSERT … RETURNING generation` — atomic, single round trip. Per-env scope.
- Reaper re-PENDs stuck IN_FLIGHT rows within its own generation (crash recovery); the cross-generation orphan-sweep is dropped.
- Legacy-drain GH workflow (`drain-legacy-generation.yml`): reads `core.deployments` for the code reference by generation, deploys a temporary worker at that code with `SPECTRAL_GENERATION=<target>` and `SPECTRAL_DRAIN_AND_EXIT=true`; the worker auto-exits after a cooling period; the workflow deletes the service.
- Handler-evolution policy (lightweight, not tooling): deploy handler changes freely by default; flag only for (a) external-contract changes, (b) product-committed time-precise activation semantics, (c) invariants that must be uniformly enforced (prefer DB-level enforcement).
- Forward trigger for version-gated claims: the first handler change that genuinely cannot process prior-generation events.
- `SPECTRAL_GENERATION` placement: lives on the service (per-service env var), set at deploy time as part of the per-service Render API call, NOT in the env group (ADR-053 D7 correction). Env-group changes do not bump the generation; the protection is structural rather than configuration-dependent.
- Cutover sequence is the explicit 12-step contract codified in ADR-053 D9.
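The structural guarantee above can be sketched in a few lines: a worker derives both its LISTEN channel and its claim predicate from `SPECTRAL_GENERATION` alone, so a gen-N worker has no code path that observes gen-M work. The function names and exact SQL below are assumptions for illustration, not the real worker code:

```python
# Sketch of generation pinning (D5). Channel name and WHERE clause are both
# derived from SPECTRAL_GENERATION; the SQL text is illustrative only.
import os

def listen_channel(generation: int) -> str:
    return f"outbox_gen_{generation}"   # per-generation NOTIFY channel

def claim_query(generation: int) -> tuple[str, tuple[int]]:
    # Generation-pinned claim; FOR UPDATE SKIP LOCKED keeps coexisting
    # workers of the same generation from double-claiming a row.
    sql = (
        "UPDATE core.outbox SET status = 'IN_FLIGHT' "
        "WHERE id IN (SELECT id FROM core.outbox "
        "WHERE status = 'PENDING' AND generation = %s "
        "FOR UPDATE SKIP LOCKED LIMIT 1) RETURNING id"
    )
    return sql, (generation,)

gen = int(os.environ.get("SPECTRAL_GENERATION", "1"))
assert listen_channel(7) == "outbox_gen_7"
assert "generation = %s" in claim_query(gen)[0]
```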
D6 — Path-filtered rollout via .github/deploy-manifest.yml
- The manifest declares a path → service mapping; GH Actions diffs the current commit against the previously deployed commit, maps changed paths to an affected-services set, and deploys only those services.
- Force-full-redeploy paths: the `render.yaml` variants, `.github/deploy-manifest.yml`, `infra/**`.
- `api` + `workers` are a coupled-deploy pair (generation alignment). If either's code changes, both redeploy.
- Render `autoDeploy: false` on all services; Render `buildFilter` unused; orchestration is GH-Actions-native.
- Coverage check: `tools/quality/check_deploy_manifest_coverage.py`.
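The affected-services computation can be sketched as follows. The manifest shape, glob patterns, and path names here are assumptions for illustration; the real contract lives in `.github/deploy-manifest.yml`:

```python
# Sketch of the D6 path-filter: changed paths -> deploy set, with
# force-full-redeploy paths and the api/workers coupled pair.
# Manifest contents below are hypothetical examples.
from fnmatch import fnmatch

MANIFEST = {
    "src/spectral/api/**": {"api"},
    "src/spectral/workers/**": {"workers"},
    "apps/dashboard/**": {"dashboard"},
}
FORCE_FULL = ["render.yaml", ".github/deploy-manifest.yml", "infra/**"]
ALL = {"api", "dashboard", "operations", "workers", "retention-run", "backup-nightly"}

def affected(changed: list[str]) -> set[str]:
    if any(fnmatch(p, pat) for p in changed for pat in FORCE_FULL):
        return set(ALL)                       # force full redeploy
    out: set[str] = set()
    for path in changed:
        for pat, services in MANIFEST.items():
            if fnmatch(path, pat):
                out |= services
    if out & {"api", "workers"}:              # coupled pair: generation alignment
        out |= {"api", "workers"}
    return out

assert affected(["src/spectral/api/main.py"]) == {"api", "workers"}
assert affected(["infra/render/production.yaml"]) == ALL
assert affected(["README.md"]) == set()
```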
D7 — /health + /version + /version/detail + core.workers
- `/health`: public, binary — 200 with body `ok` or 503 with body `degraded`. No JSON, no check names. Probes: database + auth (vendor-agnostic). LLM/email/storage excluded.
- `/version`: public, minimal — service, environment, generation, tag (nullable), color (prod only), reference (short 8-char), deployed_at.
- `/version/detail`: auth-gated via the dual-path key-exchange middleware. Includes full reference, schema (migration head), runtime/framework/os, deps_lock_hash, build_time, start_time, check statuses with latency.
- `core.workers` heartbeat/diagnostic table for the worker equivalent (no HTTP surface).
- Key-exchange middleware: extracts a key from `Authorization: Bearer` or `X-API-Key`, validates it against an env-var-sourced registry, and mints an internal JWT with the existing scope/issuer/key-format taxonomy; the auth middleware is a no-op multi-issuer validator. Secret rotation = deploy side-effect; no rotation playbook.
- The auth substrate (TA-18 area) extends to support dynamic keys sourced from env vars alongside DB-backed keys (extension noted in the ADR-037 carry-forward).
- Key format: `sk_deploy_<32chars>` prefix + random.
- Contract test: the `Authorization` header value never appears in log output.
D8 — Worker drain parameters
- `HANDLER_MAX = 60s` (`asyncio.wait_for` bound on each handler).
- `maxShutdownDelaySeconds = 90s` (HANDLER_MAX + 30s buffer; under Render's 300s ceiling).
- Reaper interval = 30s; claim TTL = 300s (5× HANDLER_MAX).
- `SPECTRAL_DRAIN_COOLING_SECONDS = 60s` default for legacy-drain workers.
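The `HANDLER_MAX` bound can be sketched as a thin wrapper; the function name is illustrative, not the real worker API:

```python
# Sketch of the D8 handler bound: each handler runs under asyncio.wait_for
# with HANDLER_MAX; a timed-out handler is abandoned and the claim TTL
# (300s) lets the reaper re-PEND its row later.
import asyncio

HANDLER_MAX = 60.0  # seconds (D8)

async def run_handler(handler, timeout: float = HANDLER_MAX) -> bool:
    """Return True if the handler finished within the bound."""
    try:
        await asyncio.wait_for(handler(), timeout=timeout)
        return True
    except asyncio.TimeoutError:
        return False

async def fast():
    await asyncio.sleep(0)

async def slow():
    await asyncio.sleep(1)

assert asyncio.run(run_handler(fast))
assert not asyncio.run(run_handler(slow, timeout=0.01))
```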
D9 — Single region, Virginia (us-east)
All six Render services + Supabase primary + cron jobs in Virginia. Cloudflare Pages / LB globally edge-distributed.
Forward trigger: non-US pilot, regulatory data-residency, p99 latency floor, single-region availability incident.
D10 — Two environments + plan sizing
- Two Render blueprints: `infra/render/production.yaml` and `infra/render/staging.yaml`.
- Staging: Starter tier, single-color per service.
- Production: Standard tier at alpha; blue/green pairs for web services (api, dashboard, operations); workers/cron/docs stay single-instance.
- Two GitHub Environments: `staging` (auto-deploy on push to main); `production` (tag-triggered with deployment protection rules).
- Two Supabase environments: main branch (prod) + persistent preview branch (staging), each with its own `core.deployments` counter.
- Two Render Environment Groups: `spectral-staging-runtime`, `spectral-production-runtime`; each carries rotating key material.
- Two Cloudflare Pages targets per docs project.
Alternatives considered
Blue/green worker service pairs. Rejected; SKIP LOCKED + dedup already makes overlapping workers correct, and a second color only amplifies wakeups for zero customer benefit.
External deploy-pipeline drain monitor. Rejected; second source of truth competing with substrate guarantees.
Render preDeployCommand for migrations. Rejected; GH Actions orchestrates all deploys.
Static-mount docs on Start services. Rejected (ADR-046 D5 narrowed by D3 here in favor of Pages on day 1).
Supabase GitHub integration. Rejected; push/merge-only, kills tag-based trunk dev.
True DB-layer blue/green on Supabase. Not achievable.
Multi-generation worker subscription. Rejected; violates the structural guarantee.
Pre-insert classification trigger. Rejected; publishers classify events.
One service per context. Rejected (category error — contexts are code boundaries, not deploy boundaries).
Stored procedure for core.deployments insert. Rejected in favor of sequence (Postgres-idiomatic).
LLM/email/storage checks in /health. Rejected; only actionable total-outage dependencies.
/version/detail deferred to post-alpha. Rejected; ships at alpha.
Tag-as-generation. Rejected; SQL ordering needs monotonic integers.
Per-service core.deployments rows. Rejected; generation is event-substrate-scoped.
Consequences
- Cross-version worker coexistence solved structurally, not by human discipline.
- Handler-evolution cognitive load collapsed to a three-criterion flag rule.
- Migration safety hardened at the tool layer (lint + gate + dry-run).
- Secret rotation = deploy side-effect; no rotation runbook class for the relevant key material.
- Public `/health` + `/version` minimize the attack-surface leak; diagnostic info stays behind auth.
- Uniform `autoDeploy: false` + GH Actions orchestration across Render + Cloudflare Pages + Supabase.
- Independent docs deploy cadence via Pages.
- Eight deployables per env (six Render + two Pages) — the Pages projects are zero-runtime-cost.
- ~200 LOC new substrate (generation counter + legacy-drain + key-exchange middleware).
- Supabase branching coupling — both envs affected if branching degrades.
- Two new tables (`core.deployments`, `core.workers`) — small operational surface.
- Customer-facing API request routing during rollover remains rolling — unavoidable without API-level versioning; to be handled at the API-version level when external call sites demand it.
- TA-15 / ADR-060 D-runtime reinforces the single-workers architecture: agent runtimes (Spectral / Ops / World) all live in workers; framework-layer composition between contexts happens at the workers entrypoint.
- Race C (Render pod crash-restart env-snapshot semantics during a rolling deploy) is materially mitigated for `SPECTRAL_GENERATION` via per-service placement (D5); it still matters for other shared values delivered via the env group (mitigation captured in `docs/runbooks/deployment-topology.md`).
References
- ADR-065 — `spectral.core` admission discipline
- ADR-031 — single-library structure
- ADR-039 — JWKS-local; scope taxonomy
- ADR-040 — backup-nightly cron
- ADR-041 — direct-to-Postgres listener (D9 dedicated connection)
- ADR-042 — retention-run cron
- ADR-044 — outbox + envelope contract; D5 partial supersession noted there
- ADR-046 — Render PaaS
- ADR-049 — five container images mapped to six Render services
- ADR-052 — Cloudflare CNAME flip routing for blue/green; Pages Function pattern
- ADR-053 — CD pipeline orchestration (cutover sequence; pre-merge dry-run; concurrency mutex)
- ADR-060 — D-runtime reinforces single-workers; framework-layer composition seam
- ADR-063 — D2 inter-context inheritance superseded
- TA-19 disposition — SPEC-322 comment `ddce0896`
- TA-19 verification — SPEC-322 comment `00e22416`
- TA-20 image-count refinement bookkeeping — SPEC-322 comment `e14df5df`
- TA-26 cutover sequence + env-var placement — SPEC-322 comment `7b81b6fc`
- TA-15 D2 inheritance supersession — SPEC-322 comment `11e52812`
- `.github/deploy-manifest.yml` — path-filter contract
- `infra/render/{production,staging}.yaml` — blueprint placeholders
- `supabase/migrations/20260422170000_core_deployments.sql`
- `supabase/migrations/20260422170100_core_workers.sql`
- `supabase/migrations/20260422170200_core_outbox.sql`
- `supabase/migrations/20260422170300_core_event_handled.sql`
- `docs/runbooks/deployment-topology.md` — operational runbook
- `docs/runbooks/legacy-drain.md` — legacy-drain workflow