ADR-041: Connection pooling — Supavisor transaction mode, psycopg 3, SET LOCAL inside transactions
Status: Accepted (2026-04-21)
Context
Spectral runs a single Supabase Pro project with three Postgres schemas (core, worlds, platform) in one DB. Per-context Postgres roles with scoped search_path set at role level (per ADR-032 D3). RLS as backstop, with policies referencing current_setting('app.account_id') and current_setting('app.workspace_id') session vars rather than auth.uid() (per ADR-039 D13). FastAPI API plus async workers; Python 3.14; pydantic 2 already the domain plus request/response surface. Solo-builder US-only NDA alpha; cost comfort band $20–100/month. The event substrate (TA-5) was not yet decided at TA-3 disposition, but LISTEN/NOTIFY was the handoff-flagged likely path — this ADR reserves optionality.
Research convergence (landscape survey + adversarial pair):
- Supavisor transaction mode is the correct primary endpoint (free with Pro; multiplexes to 200 clients on Micro versus 60 raw
max_connections). SET LOCALinside an explicit transaction — not at pool checkout — is the correctness-critical pattern. Checkout hooks fire on logical-connection-checkout, but Supavisor transaction mode multiplexes physical backends per transaction, so whateverSETruns at checkout can land on a different backend than the query.- psycopg 3 with
prepare_threshold=Noneis cleaner than asyncpg for transaction-pooled Postgres. asyncpg has documented prepared-statement collision bugs behind transaction poolers (asyncpg#1058; supabase#39227); psycopg 3 has a single honored switch.
Decision
D1 — Supavisor transaction mode (port 6543) is the primary endpoint
API and worker request-scope work runs through the transaction-mode pooler. Direct-to-Postgres (port 5432 on the project host) is reserved for the LISTEN/NOTIFY listener pattern if TA-5 selects it (see D9). Supavisor session mode (port 5432 via the pooler) is not used.
D2 — psycopg 3 async with psycopg_pool.AsyncConnectionPool; no SQLAlchemy
- Spectral uses Supabase CLI migrations (per ADR-032 D4), not Alembic — SA’s flagship migration integration is moot here.
- pydantic is already the domain plus request/response plus framework view-model surface; ORM-to-pydantic shims do not exist in this codebase.
- The three ORM value-adds (query-language abstraction, result-mapping shims, SQL-injection mitigation) all collapse for Spectral: locked to Postgres; result mapping is a
model_validate(dict(row))helper; psycopg 3 parameterized queries are injection-safe. - Prior-version experience: v0.2.0 SA + Supabase migration integration was significantly more painful than expected.
prepare_threshold=Noneon psycopg 3 is a single honored switch for transaction-mode compatibility.
D3 — Tenancy and session variables are set via SET LOCAL inside an explicit transaction; not at pool checkout
This is the correctness-critical decision. Pool-checkout hooks fire at logical-connection-checkout time, but Supavisor transaction mode assigns a fresh physical backend per transaction — anything SET at checkout may not be on the backend where the query runs. SET LOCAL inside BEGIN…COMMIT is Postgres-guaranteed to stay on the same backend and clear at commit/rollback, with no checkin-discipline dependency.
D4 — Session variable names are a contract across contexts; constants live in spectral.core.db.session_vars
SESSION_VAR_ACCOUNT_ID: Final[str] = "app.account_id"SESSION_VAR_WORKSPACE_ID: Final[str] = "app.workspace_id"SESSION_VAR_USER_ID: Final[str] = "app.user_id" # added by ADR-059 (TA-13)Referenced by every RLS policy in every schema and by every request-scoped transaction setup. Contract tests pin the exact string values (changing them silently would break every RLS policy simultaneously). Lands in spectral.core under ADR-065 admission discipline.
D5 — One connection pool per context; context role baked into the connection string
Three pools: spectral_core_app, spectral_worlds_app, spectral_platform_app. search_path set at role level at provision time via ALTER ROLE spectral_<context>_app SET search_path = <schema>, public. SET ROLE at checkout is not used — role-at-login is enforced by Postgres, visible in pg_stat_activity, and simplifies audit identity attribution. The 3× pool footprint cost is negligible against Supavisor’s 200-client cap on Pro Micro.
D6 — Per-process pool sizing (alpha defaults)
| Surface | Pool sizing (per context per process) |
|---|---|
| API (FastAPI, Render service) | min_size=2, max_size=5, timeout=10s |
| Workers (async queue consumers) | min_size=2, max_size=10, timeout=30s |
| Listener (if TA-5 = LISTEN/NOTIFY, see D9) | 1 dedicated connection, not in any pool |
Single-instance API + single-instance worker budget: 3 × 5 + 3 × 10 = 45 client-side sockets on the Supavisor transaction-mode endpoint. Well under the 200-client Pro Micro cap. Scales linearly with instance count until 80% of cap, at which point the trigger is to scale Supabase compute, not the pool.
D7 — Idle and reconnect discipline
max_idle=240s(sits under Supavisor’s 5-minute idle reaper)configurecallback setsautocommit=Falseso every query goes through an explicit transaction (forcing D3 correctness by construction — a bare query without a transaction context cannot set session vars and will fail closed under RLS, which is the right failure mode)checkcallback runsSELECT 1for pre-ping; cheap, catches reaper-killed connections- Pool creation is
lifespan-scoped in FastAPI (never module-level) so cold-start preemption does not leak pools reconnect_timeout=None— let the client retry with explicit backoff in app code
D8 — The transaction-scoped context manager is the sole sanctioned path to acquire a connection
Shape (per app, sibling implementations in spectral_api.db and spectral_workers.db):
async with db.request_scope(account_id, workspace_id) as conn: # BEGIN opened; SET LOCAL app.account_id = $1; SET LOCAL app.workspace_id = $2 # conn is psycopg AsyncConnection; pool + context selected by caller ... # COMMIT (or ROLLBACK on exception) — session vars cleared automaticallyA FastAPI dependency wraps it per request; the worker equivalent wraps it per job. No code outside these wrappers may acquire a raw pool connection. Violation = architecture-validator failure (queued post-alpha lint).
D9 — Listener architecture (optionality-preserving commitment)
If TA-5 selects LISTEN/NOTIFY as the event substrate, the listener process owns a dedicated direct-to-Postgres connection on port 5432 of the project host, not via Supavisor. Rationale: LISTEN pins a session for the client lifetime — Supavisor transaction mode releases the session at commit (breaks LISTEN entirely; supavisor#85 open since 2023), and Supavisor session mode burns a long-lived pool connection for a mostly-idle listener. Direct-to-Postgres owns the pattern cleanly. The listener role is narrow (spectral_listener, USAGE + minimal SELECT), distinct from per-context app roles.
This ADR does not prejudge TA-5’s choice between LISTEN/NOTIFY and outbox-polling. Outbox-polling would use the same per-context transaction-mode pools defined here with zero architectural changes. ADR-044 (TA-5) selected an outbox-with-LISTEN/NOTIFY-notification combination; the listener pattern is exercised.
D10 — Observability hooks (scaffolded, not wired)
opentelemetry-instrumentation-psycopgreserved (added to[project.optional-dependencies] telemetrygroup at close-pass, not at TA-3 commit); when wired, each span carriesdb.connection_rolefrom the pool’s context identity automatically.pool.get_stats()exposed via a future/ops/db-statsendpoint, ops-role gated. Scope for ADR-048, not here.
D11 — Context / schema / content-class / scope naming coherence (retroactive amendment to TA-1 / TA-16 / TA-18)
Surfaced by D5’s role-name mechanics (spectral_spectral_app was ugly). The context previously named scanning was too narrow a descriptor for a context that owns workspaces, workspace membership, evaluation frameworks, change sets, scans, and the full optimization pipeline. Reusing the brand name (spectral) for a schema or context creates collision with the library namespace.
Resolution: align context / Postgres schema / content-class / scope naming along a single semantic triangle.
- Context
scanning→platform(src/spectral/scanning/→src/spectral/platform/). - Postgres schema
spectral→platform(retroactive amendment to ADR-032 D2; substance unchanged, name aligned to the context). Zero migrations were yet landed in the oldspectralschema, so this was a no-op in migration history. - Per-context Postgres roles:
spectral_core_app/spectral_worlds_app/spectral_platform_app. - ContentClass rebalance (retroactive amendment to ADR-036 D6):
PLATFORMnow means customer content processed/generated in the platform context;OPERATIONSis the new value replacing the oldPLATFORMmeaning (Spectral-operated reasoning);SYNTHETICunchanged; the oldCUSTOMERvalue is subsumed into the newPLATFORM. - Scopes rename (retroactive amendment to ADR-039 D11):
SCOPE_*_PLATFORM→SCOPE_*_OPERATIONS;PLATFORM_SCOPES→OPERATIONS_SCOPES. String values"read:platform"→"read:operations". - Forward-only migration
supabase/migrations/20260421030000_core_naming_coherence.sqlupdatescore.llm_usageCHECK constraints forbcandcontent_classcolumns.
The semantic triangle is now internally consistent:
- Customer-facing side (all “platform”): context
spectral.platform/ schemaplatform/ rolespectral_platform_app/ContentClass.PLATFORM. - Spectral-internal side (all “operations”):
AccountRole.OPERATIONS/OPERATIONS_SCOPES/SCOPE_*_OPERATIONS/ContentClass.OPERATIONS/bc='app:operations'. - Test-content side:
ContentClass.SYNTHETIC.
Renaming core → shared was considered and rejected on cost/clarity grounds.
Alternatives considered
SQLAlchemy 2.x Core + psycopg 3 driver. Rejected per D2.
asyncpg with statement_cache_size=0 + prepared_statement_name_func=uuid. Faster than psycopg 3 on microbenchmarks, but the “statement_cache_size=0 is necessary but not sufficient” failure mode (asyncpg#1058) means defense-in-depth is a workaround stack, not a clean switch.
Single engine + SET LOCAL ROLE at transaction start. Smaller pool footprint, one credential to rotate. Rejected per D5: role-at-login gives Postgres-enforced identity, simpler audit attribution, and the connection-budget concern is weak under Supavisor transaction-mode multiplexing.
Pool-checkout event hook to set session vars. Rejected per D3: correctness bug under Supavisor transaction mode.
Session-mode Supavisor endpoint for listener. Rejected — pins a pooled connection, the worst of both worlds versus direct-to-Postgres.
No pool at all — per-request connect/disconnect. Trivially rejected. Cold-start latency makes this unworkable.
piccolo / edgedb / other niche ORMs. SaaS/substrate sprawl; non-mainstream data layer; bootstrap-team can’t absorb the ecosystem cost.
Context name bench / studio / evals. Rejected per D11. platform carries the brand→context symmetry and matches what the context actually is.
Rename core → shared. DDD-accurate but cost-prohibitive.
Consequences
- Zero additional alpha cost. Supavisor included in Pro; no new SaaS.
spectral.core.db.session_varsis a load-bearing constant surface across contexts. Any future change to the values must ship a migration that updates every RLS policy simultaneously.- D8’s transaction-scoped context manager is the architectural seam for tenancy enforcement. Every request path and every job path acquires its connection through it or fails. A post-alpha lint enforces single-path discipline.
- psycopg 3 direct + raw SQL means a minimal
row_to_model(row, Model)helper (~20 lines per app). Accepted cost. spectral_apiandspectral_workersscaffolding adds the psycopg 3 dep + adbsubmodule when the API surface and worker runtime land.- The three-context pool shape scales per Render service instance. Scaling trigger: instance count where (3 × 5 × API instances) + (3 × 10 × worker instances) > 160 → scale Supabase compute tier, not the pool.
- LISTEN/NOTIFY listener consumes one Postgres
max_connectionsslot per listener process. On Pro Micro, 1/60; negligible. - Contract-test requirement at landing time (the discipline in force when this ADR landed; superseded by ADR-065):
spectral.core.db.session_varschange shipped a contract-test change in the same commit. Test landed at commit7b6f37e. - D11 naming coherence was retroactive. Amendment comments posted on SPEC-304 / SPEC-319 / SPEC-321. ADRs 032–039 were written with the new names directly — no supersession edges required for those ADRs. ADR-031’s status line is amended to note the context rename.
References
- ADR-065 —
spectral.coreadmission discipline - ADR-031 — single-library structure (context rename addendum)
- ADR-032 — schema topology; per-context roles
- ADR-033 — RLS backstop pattern
- ADR-036 — ContentClass rebalance (retroactive)
- ADR-039 — D13 session-var convention; scopes rename (retroactive)
- ADR-044 — listener pattern exercised
- ADR-048 — observability hook wiring (D10 scope)
- ADR-059 —
SESSION_VAR_USER_IDaddition - TA-3 disposition — SPEC-306 comment
4231d271 - TA-3 verification — SPEC-306 comment
4c79d169 src/spectral/core/db/session_vars.py— session-var contract across contexts
Addendum: ADR-010 — psycopg3 + async DB migration
ADR-010 (Accepted 2026-04-09; retired by this ADR) recorded the migration off psycopg2 (sync) onto psycopg[binary]>=3.1.0 (async) for the v0.2 API. The migration was triggered by an OTEL retry-storm scenario where a single slow DB call would block the uvicorn event-loop thread and freeze every in-flight request, because every FastAPI handler was async def while every DB call inside went through a synchronous threaded pool.
Why a future reader should know about ADR-010:
- psycopg-3 is the active driver for Spectral. The decision is settled; the v0.2 → 0.3.0 rebuild keeps psycopg-3 and adds the Supavisor +
SET LOCALdiscipline this ADR specifies on top. - The “every async handler must use an async driver” principle preserved here is load-bearing for D8’s transaction-scoped context manager — a synchronous DB call in an async stack is the failure mode ADR-010 originally addressed.
- The
psycopg2-binarydependency that ADR-010 left side-by-side was removed during the 0.3.0 rebuild; currentpyproject.tomlcarries psycopg-3 only.
Git history at the commit retiring ADR-010 preserves the original text.