ADR-033: Tenancy enforcement — app-layer primary, RLS backstop

Context

The default Supabase posture frames RLS as the primary mechanism for tenant isolation, paired with a direct-Supabase-SDK-from-frontend access pattern. Adversarial research during TA-1 surfaced two findings that decisively reframed the question:

The pro-RLS argument’s strongest case requires an exposed anon key. When the anon key is publicly reachable, RLS is the only thing standing between an attacker and other tenants’ data; the RLS-primary framing earns its keep there.
The anti-RLS case bites hardest when there is a server-side API in the trust path. CVE-2025-48757 / the Lovable incident documented 170+ apps with 13k users where RLS was the only control and 10.3% of deployments misconfigured at least one policy. Bypass surfaces (SECURITY DEFINER functions, view inheritance, planner cliffs, service_role) compound the failure modes. RLS as a backstop is cheap; RLS as the only boundary is a load-bearing single point of failure.

Spectral has FastAPI in the trust path (the API tier per ADR-034). ADR-034’s decision to remove the direct-SDK frontend pattern eliminates the “anon key reachable” concern. With both conditions in place, the right tenancy posture is app-layer primary, RLS backstop.

Decision

D1 — App-layer tenancy filtering is the primary enforcement boundary

FastAPI validates the Supabase-issued JWT on every authenticated request, loads tenant context (org_id from the JWT, domain_id from the request scope), and builds queries via a typed TenantScopedQuery helper that the architecture validator does not allow application code to bypass. This is where 95% of isolation actually lives.

Raw psycopg query construction in application code touching tenant-scoped tables is a validator failure; helpers are the only path. The helper composes queries with tenant predicates injected by middleware so a programmer cannot accidentally write a query that ignores tenancy.
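A minimal sketch of what such a helper could look like, assuming a hypothetical TenantContext shape and psycopg-style %s placeholders. The class layout, method name, and table name are illustrative assumptions, not the actual Spectral implementation; only the TenantScopedQuery name comes from this ADR.

```python
from __future__ import annotations

from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class TenantContext:
    """Tenant identity resolved by middleware (hypothetical shape)."""

    org_id: UUID      # taken from the validated JWT
    domain_id: UUID   # taken from the request scope


@dataclass(frozen=True)
class TenantScopedQuery:
    """Builds SELECT statements that always carry the tenant predicates."""

    ctx: TenantContext
    table: str  # assumed to come from a trusted allow-list, never user input

    def select(
        self, columns: list[str], where: dict[str, object] | None = None
    ) -> tuple[str, list[object]]:
        # Tenant predicates come first and cannot be omitted by the caller.
        clauses = ["org_id = %s", "domain_id = %s"]
        params: list[object] = [self.ctx.org_id, self.ctx.domain_id]
        for column, value in (where or {}).items():
            if column in ("org_id", "domain_id"):
                raise ValueError("tenant columns are set by the helper, not the caller")
            # Column names are assumed to be trusted identifiers; values are bound.
            clauses.append(f"{column} = %s")
            params.append(value)
        sql = (
            f"SELECT {', '.join(columns)} FROM {self.table} "
            f"WHERE {' AND '.join(clauses)}"
        )
        return sql, params
```

The point of the design is that a caller has no way to express a query without the two tenant predicates, and an attempt to override them raises instead of silently widening the scope.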

D2 — RLS stays as backstop, not primary

Simple policies (domain_id = current_setting('app.domain_id')::uuid) read session variables that the FastAPI connection-checkout hook sets. They defend against app-layer regressions, not against a public anon key, because there isn’t one (per ADR-034). Policies stay dumb enough that the anti-RLS bypass surfaces (SECURITY DEFINER, view inheritance, planner cliffs) do not come into play.
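As a rough illustration of the backstop wiring, the sketch below pairs a policy using the predicate quoted above with a function that binds app.domain_id for the current transaction. The policy name, table name, and bind_domain function are hypothetical. The binding uses PostgreSQL's set_config(name, value, is_local) rather than a literal SET LOCAL so the value can be passed as a bind parameter; the cursor is assumed to be a DB-API style cursor such as psycopg's.

```python
from uuid import UUID

# Hypothetical policy DDL built around the predicate quoted in D2.
# "tenant_backstop" and "core.example_table" are placeholder names.
BACKSTOP_POLICY_SQL = (
    "CREATE POLICY tenant_backstop ON core.example_table "
    "USING (domain_id = current_setting('app.domain_id')::uuid)"
)


def bind_domain(cursor, domain_id: UUID) -> None:
    """Bind app.domain_id for the current transaction only."""
    # The third argument (is_local = true) scopes the setting to the
    # transaction, which is the same effect as SET LOCAL.
    cursor.execute(
        "SELECT set_config(%s, %s, true)",
        ("app.domain_id", str(domain_id)),
    )
```

Because the setting is transaction-local, it is assumed here that the hook runs inside the request's transaction; a pooled connection then does not carry one tenant's binding into the next request.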

service_role is disciplined to worker and ops-only contexts; customer-request paths never use it. “Ops-only contexts” here means worker/system jobs and access classes that are not customer-tenanted — worlds-context authoring (world-scoped under the worlds owner role; worlds tables carry no org_id/domain_id) and platform-owned cross-tenant aggregate tables that have no customer RLS by construction. It does not mean operators read or write customer-tenanted data through a globally RLS-exempt connection: that path is assumed-identity under RLS per D4. RLS applies to every tenant-scoped app table regardless of schema (core, platform). core/platform tenant-scoped tables carry org_id plus domain_id and back-stop on domain_id = current_setting('app.domain_id'). Per-user-owned platform records (audit_log, domain_members, api_keys, agent-memory) back-stop on app.user_id.

The session-var convention is captured in spectral.core.db.session_vars:

SESSION_VAR_ORG_ID = "app.org_id" (customer tenancy)
SESSION_VAR_DOMAIN_ID = "app.domain_id" (domain scoping)
SESSION_VAR_USER_ID = "app.user_id" (per-user identity, added by TA-13 D6)
SESSION_VAR_WORLD_ID = "app.world_id" (worlds-context scoping, ADR-098 D6)
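A sketch of how the contract surface and a pinning test might look. The four constant names and strings are taken from the list above; the ALL_SESSION_VARS tuple and the test function are illustrative assumptions.

```python
# Constant names and values as listed in this ADR.
SESSION_VAR_ORG_ID = "app.org_id"        # customer tenancy
SESSION_VAR_DOMAIN_ID = "app.domain_id"  # domain scoping
SESSION_VAR_USER_ID = "app.user_id"      # per-user identity
SESSION_VAR_WORLD_ID = "app.world_id"    # worlds-context scoping

# Hypothetical convenience tuple; not named in the ADR.
ALL_SESSION_VARS = (
    SESSION_VAR_ORG_ID,
    SESSION_VAR_DOMAIN_ID,
    SESSION_VAR_USER_ID,
    SESSION_VAR_WORLD_ID,
)


def test_session_var_contract() -> None:
    """Pin the exact strings so a rename fails a test before it reaches a policy."""
    assert ALL_SESSION_VARS == (
        "app.org_id",
        "app.domain_id",
        "app.user_id",
        "app.world_id",
    )
    # Two constants must never alias the same session variable.
    assert len(set(ALL_SESSION_VARS)) == len(ALL_SESSION_VARS)
```

Pinning the literal strings matters because RLS policies reference them by name inside current_setting(...); a silent rename on the Python side would leave the policies reading an unbound variable.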

Worlds-context artifacts are world-scoped, not domain-scoped. Per ADR-098, a world is a shared, market-centric worlds-context artifact that carries no platform tenancy columns (org_id/domain_id) — the domain↔world link is the soft platform.domains.world_id. So worlds.* tables back-stop on the world’s own identifier (world_id = current_setting('app.world_id') for world_models.id; world_id = … for its children), not on domain_id. The backstop is still dumb and the app layer is still primary (D1); the predicate’s column differs because the tenancy unit differs. Worlds artifacts are shared within a world, so any operator authorized for the world reviews them regardless of author; the RLS predicate is world_id, not the per-user created_by. created_by/operator_id are retained as provenance, not as RLS predicates. The (org, domain) → world_id resolution that binds app.world_id lives at the apps/api composition layer, the only context that reads platform.domains.
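The backstop rules in D2 can be summarized as a lookup from access class to predicate. This is a sketch only: the TableClass enum, the user_id column name, and the ::uuid cast on the user and world predicates are assumptions for illustration; the world_id entry covers child tables of a world, not world_models itself.

```python
from enum import Enum


class TableClass(Enum):
    """Hypothetical classification of tables by tenancy unit."""

    TENANT_SCOPED = "tenant_scoped"  # core/platform tables carrying org_id + domain_id
    USER_OWNED = "user_owned"        # per-user-owned platform records
    WORLD_SCOPED = "world_scoped"    # worlds.* child tables


# (column, session variable) per class. The column for USER_OWNED is assumed.
_BACKSTOP = {
    TableClass.TENANT_SCOPED: ("domain_id", "app.domain_id"),
    TableClass.USER_OWNED: ("user_id", "app.user_id"),
    TableClass.WORLD_SCOPED: ("world_id", "app.world_id"),
}


def backstop_predicate(table_class: TableClass) -> str:
    """Return the single-column predicate used as the RLS backstop."""
    column, session_var = _BACKSTOP[table_class]
    return f"{column} = current_setting('{session_var}')::uuid"
```

Each class gets exactly one equality predicate, which keeps the policies simple in the sense D2 requires: the column differs because the tenancy unit differs, not because the policy carries more logic.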

D3 — Identity, not capability, in session vars

Per TA-13 D6, session vars name identity layers (org, domain, user); they do not name capability classes (operator, admin). Capability lives in JWT scopes, not in session vars. This rule is architectural — it constrains how future RLS use cases are wired up.
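One way this rule could be checked mechanically is a small naming predicate that admits only identity-layer variables. The function name and the app.<layer>_id naming pattern are assumptions inferred from the constants in D2; world is included because ADR-098 adds app.world_id.

```python
# Identity layers named in this ADR: org, domain, user (D3) and world (ADR-098 D6).
IDENTITY_LAYERS = frozenset({"org", "domain", "user", "world"})


def is_identity_session_var(name: str) -> bool:
    """True if the name follows app.<layer>_id for a known identity layer."""
    prefix, _, rest = name.partition(".")
    if prefix != "app" or not rest.endswith("_id"):
        return False
    # Strip the trailing "_id" and check the remaining layer name.
    return rest[: -len("_id")] in IDENTITY_LAYERS
```

Under this sketch, app.org_id passes while a capability-shaped name such as app.is_operator or app.operator_id is rejected, which is the distinction D3 draws.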

D4 — Operator access to customer-tenanted data is assumed-identity, not RLS-bypass

An operator never reaches customer-tenanted tables through a globally RLS-exempt connection role. When an operator acts inside a customer org, the request assumes that single (org_id, domain_id) tenancy — app.org_id / app.domain_id are set to the target, exactly as a customer request sets its own — and RLS admits the request via the operations capability, confined to that one tenancy. RLS stays the backstop even for operators: an app-layer filtering bug cannot leak across tenants, because the session vars pin one tenancy and the policy enforces it. This closes the “privileged read path that circumvents the permission model” that the access-control model rejects elsewhere (the reason there is no org-wide observer role); the operations capability is what admits the operator to the assumed tenancy, not a standing exemption from the policy.

The assumed-identity is carried by the ADR-087 delegation token (the act claim names the operator as actor; the outer org_id/domain_id name the assumed target). What admits the operator at the database layer is an operations-capability clause on the customer-tenanted RLS policies — the operator is confined to the single assumed tenancy, never granted a cross-tenant exemption. Per D3 this is consistent: app.org_id/app.domain_id name the assumed tenancy (identity/scope); the operations capability lives in the organization_role claim, not in a session var. The functional core of this mechanism (the delegation mint endpoint, the RLS admission clause, dual-mirror revocation) is defined by ADR-087; operator surfaces that are world-scoped (world authoring) or read platform-owned cross-tenant aggregates are distinct access classes and are unaffected.
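The split between assumed tenancy, actor, and capability can be sketched as follows. The claim layout is an assumption: the act claim is modeled as an object with a sub field, in the style of OAuth token exchange actor claims, and the real shape is defined by ADR-087. The function name and return structure are hypothetical.

```python
def tenancy_binding_from_delegation(claims: dict) -> dict:
    """Split delegation-token claims into session vars, actor, and capability."""
    # The act claim names the operator as actor (assumed shape: {"sub": ...}).
    actor = (claims.get("act") or {}).get("sub")
    if not actor:
        raise ValueError("delegation token must name the operator in the act claim")
    return {
        # Outer org_id/domain_id name the single assumed tenancy.
        "session_vars": {
            "app.org_id": claims["org_id"],
            "app.domain_id": claims["domain_id"],
        },
        # Kept separately, for example for audit; never a tenancy override.
        "actor": actor,
        # Capability stays a claim; per D3 it is not written to a session var.
        "capability": claims.get("organization_role"),
    }
```

The session variables carry exactly one (org_id, domain_id) pair, so even a request admitted through the operations capability is pinned to the assumed tenancy.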

Alternatives considered

RLS-primary with direct-Supabase-SDK-from-frontend (the Supabase default model). Rejected: CVE-2025-48757 class; RLS becomes a single point of tenancy failure when the anon key is reachable; business-logic gravity scatters across policies plus frontend plus API.

RLS-only (no app-layer filter). Rejected: silent failure mode (a blocked UPDATE affects 0 rows without raising an error; a filtered SELECT returns an empty result); test coverage burden; bypass surfaces (SECURITY DEFINER, service_role, views).

App-layer only (no RLS). Rejected: loses the cheap backstop against app-layer regression; the belt-and-suspenders tax is minimal when RLS is kept simple and service_role-disciplined.

Per-tenant database isolation. Rejected at this scale: Supabase Auth binds 1:1 to a DB (per ADR-032 D1); tenant-per-DB would require a project-per-tenant which does not match the multi-tenant SaaS shape.

Consequences

Tenancy is app-layer-primary with an RLS backstop. The Supabase platform / Auth / pgvector commitments (per ADR-046) stand.
A typed TenantScopedQuery helper is the only path for tenant-scoped queries; the architecture validator rejects raw psycopg query construction in application code that touches tenant-scoped tables.
The session-var contract lives in spectral.core.db.session_vars, with contract tests pinning the exact strings (app.org_id, app.domain_id, app.user_id, app.world_id), per ADR-065 admission discipline.
request_scope carries the binding (per ADR-041): a request-scoped transaction sets SET LOCAL app.user_id plus, as the flow requires, app.org_id / app.domain_id (platform/core tenant-scoped tables) and app.world_id (worlds-context artifacts, ADR-098 D6). The operator surface binds app.world_id from the request’s target world.
service_role discipline (D2): service_role is confined to worker/system and ops-only contexts and never leaks into customer-request paths.
Codex system-design/foundations/access-control.mdx documents the access-control model alongside the OPERATIONS scope taxonomy. It describes operator access as assumed-identity (the operator is confined to one assumed customer tenancy under RLS), not as a service_role bypass.
Operator access to customer-tenanted data is assumed-identity (D4). No blanket RLS-exempt connection over customer data; the operations capability admits the operator to a single assumed (org_id, domain_id) tenancy under RLS. The functional core (delegation mint, RLS admission clause, dual-mirror revocation) is defined by ADR-087.
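The validator rule in the consequences above could be approximated with a small AST check. This sketch is deliberately coarser than the rule as stated: it flags any direct psycopg import in a module, whereas the ADR scopes the rule to code touching tenant-scoped tables. The function name and the FORBIDDEN_MODULES set are assumptions.

```python
import ast

# Modules that application code is assumed not to import directly.
FORBIDDEN_MODULES = {"psycopg"}


def find_raw_psycopg_imports(source: str) -> list[int]:
    """Return the line numbers of direct psycopg imports in a module's source."""
    hits: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        # Compare on the top-level package so "psycopg.sql" is caught too.
        if any(name.split(".")[0] in FORBIDDEN_MODULES for name in names):
            hits.append(node.lineno)
    return sorted(hits)
```

A real validator would presumably exempt the package that implements TenantScopedQuery itself, since the helper has to talk to the driver on behalf of everything else.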

References

ADR-032 — storage topology (per-context schemas; per-context roles)
ADR-034 — frontend data access via API proxy (eliminates “anon key reachable”)
ADR-039 — Supabase Auth confirmation + JWT validation surface
ADR-041 — pool checkout hook binds session vars
ADR-101 D3 — SESSION_VAR_USER_ID (identity, not capability)
TA-1 disposition — SPEC-304 comment 5c9c25f0
src/spectral/core/db/session_vars.py — session-var contract surface
Codex system-design/foundations/access-control.mdx — close-pass updates
CVE-2025-48757 / Lovable incident — primary anti-RLS evidence (cited in TA-1 evidence section)
