Source Material & Distillation
Operator surface for the distillation pipeline: source material ingestion (storage + provenance anchoring), LLM-guided candidate distillation against source documents, and the distillation-request workflow. Distilled candidates flow into the conformity gate for operator review and then into the enshrinement queue. Nothing becomes enshrined without human sign-off.
- Source material ingestion: operators upload, parse, and register Authoritative-tier source documents (IRS publications, statute excerpts, regulatory guidance) into the `SourceMaterial` inventory; provenance metadata is anchored on every row per ADR-065
- Provenance anchoring: every distilled candidate inherits a provenance trail back to the `SourceMaterial` rows that contributed to its extraction; the chain is end-to-end traceable (per the audit trail below)
- LLM-guided distillation: operator-parameterized extraction (focus area, problem space, expected tier) runs against selected source materials; the pipeline indexes source documents via the `EmbeddingProvider` protocol per ADR-038 for similarity-based chunk retrieval, then invokes the LLM on retrieved chunks to draft `RuleCandidate` rows
- Distillation-request workflow: operators (or the Operations Agent) submit a run, monitor per-run and per-source status, and review output candidates
- Conformity gating: distilled candidates pass through the conformity gate before reaching the enshrinement queue (per Evolution Loop)
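The retrieval step of LLM-guided distillation can be pictured with a minimal sketch. Everything here is an assumption made for illustration: the `embed` method signature, the `KeywordCountEmbedder` stand-in, the `Chunk` type, and `retrieve_chunks` are hypothetical names, and ADR-038 remains the authority on the real `EmbeddingProvider` protocol. The LLM call that drafts `RuleCandidate` rows is deliberately left out.

```python
import math
from dataclasses import dataclass
from typing import Protocol, Sequence


class EmbeddingProvider(Protocol):
    """Hypothetical shape of the protocol; ADR-038 defines the real one."""

    def embed(self, texts: Sequence[str]) -> list[list[float]]: ...


class KeywordCountEmbedder:
    """Toy stand-in for a real model: one dimension per vocabulary term."""

    def __init__(self, vocabulary: Sequence[str]) -> None:
        self.vocabulary = list(vocabulary)

    def embed(self, texts: Sequence[str]) -> list[list[float]]:
        return [
            [float(text.lower().split().count(term)) for term in self.vocabulary]
            for text in texts
        ]


@dataclass(frozen=True)
class Chunk:
    source_material_id: str  # provenance anchor back to the SourceMaterial row
    text: str


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve_chunks(
    provider: EmbeddingProvider, focus_area: str, chunks: Sequence[Chunk], top_k: int = 2
) -> list[Chunk]:
    """Rank chunks by cosine similarity to the operator's focus area."""
    query_vec = provider.embed([focus_area])[0]
    chunk_vecs = provider.embed([c.text for c in chunks])
    ranked = sorted(
        zip(chunks, chunk_vecs),
        key=lambda pair: cosine(query_vec, pair[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


chunks = [
    Chunk("sm-1", "standard deduction amounts for single filers"),
    Chunk("sm-1", "filing deadlines and extensions"),
    Chunk("sm-2", "itemized deduction limits and deduction phaseouts"),
]
provider = KeywordCountEmbedder(["deduction", "deadlines", "filers"])
top = retrieve_chunks(provider, "deduction", chunks, top_k=2)
```

Each retrieved chunk keeps its `source_material_id`, which is what would let a drafted candidate inherit provenance from the sources that contributed to it.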
Governance invariants
- Distillation is a proposal path, not a commit path. Distilled candidates are staged for operator review and pass through the conformity gate; they do not auto-enshrine.
- Source materials carry durable provenance. Candidates distilled from a source inherit that source’s provenance chain via the `distillation_run_outputs` junction (per domain-model: DistillationRunOutput).
- `SourceMaterial` is append-only. Metadata corrections produce a new row and retire the prior one; URL re-parsing produces a new ingestion run, not an in-place mutation. Retirement is a state transition (no row deletion). The same retire-as-state-transition rule that authoring applies to world models, rules, and rule relationships extends uniformly to source materials.
- Distillation runs have no cancel/abort. Operators wait for a terminal state (`completed` or `failed`). The state machine is designed for clean extension: cancellation can be added later as an additive transition out of `in_progress`, not a redesign of the workflow.
- Concurrent operator actions on the same source material or distillation run are serialized at the row level: one wins; the other receives a clean conflict response. This mirrors the row-level serialization on `ApprovalDecision` and the world-model authoring surfaces.
- Every distillation-mutate API is idempotent on `(operator_id, correlation_id)`. Source-material ingest, source-material retire, and distillation-run submit all carry a `correlation_id` recorded on the corresponding `world_authoring_audit` row (and on `distillation_run.correlation_id` for run submissions); a re-issued request with the same correlation ID returns the original outcome.
- Distillation failures are operator-visible. Silent failure is not allowed: per-source status surfaces partial output and explicit errors; run-level status reaches a terminal `failed` state with a coded `failure_reason_code` that is retained on STRIP_PAYLOAD, plus free-text detail that is stripped on disposal.
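The no-cancel rule and the coded failure requirement can be expressed as a small transition table. This is a sketch under stated assumptions, not the actual workflow code: the `RunStatus` enum, the `TRANSITIONS` table, the `InvalidTransition` exception, and the `LLM_TIMEOUT` reason code are hypothetical, and any pre-run state the real model may have is omitted. Only `in_progress`, `completed`, and `failed` come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class RunStatus(str, Enum):
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed transitions. Terminal states map to an empty set. A future
# cancellation would be one more member of the in_progress set and
# would not require touching the other entries.
TRANSITIONS: dict[RunStatus, frozenset[RunStatus]] = {
    RunStatus.IN_PROGRESS: frozenset({RunStatus.COMPLETED, RunStatus.FAILED}),
    RunStatus.COMPLETED: frozenset(),
    RunStatus.FAILED: frozenset(),
}


class InvalidTransition(Exception):
    """Raised when a transition is not in the table (hypothetical name)."""


@dataclass
class DistillationRun:
    run_id: str
    status: RunStatus = RunStatus.IN_PROGRESS
    failure_reason_code: Optional[str] = None

    def transition(
        self, target: RunStatus, failure_reason_code: Optional[str] = None
    ) -> None:
        if target not in TRANSITIONS[self.status]:
            raise InvalidTransition(f"{self.status.value} -> {target.value}")
        # Silent failure is not allowed: a failed run must say why.
        if target is RunStatus.FAILED and not failure_reason_code:
            raise ValueError("a failed run must carry a failure_reason_code")
        self.status = target
        self.failure_reason_code = (
            failure_reason_code if target is RunStatus.FAILED else None
        )


run = DistillationRun("run-1")
run.transition(RunStatus.FAILED, failure_reason_code="LLM_TIMEOUT")
```

Keeping the rules in a data table rather than in branching logic is one way to make a later cancellation transition purely additive.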
Audit trail
Every operator authoring action on this surface appends a row to the `WorldAuthoringAudit` record family (per domain-model: WorldAuthoringAudit).
The discriminators introduced for distillation are:
- `target_type = source_material`, `action = ingested`: operator ingests a new source material
- `target_type = source_material`, `action = retired`: operator retires a source from the inventory
- `target_type = distillation_run`, `action = requested`: operator submits a new distillation run
The audit row is appended alongside the entity-state mutation in the same transaction; it
is never appended outside the originating use-case transaction. The
(operator_id, target_type, target_id, action, occurred_at, correlation_id) row remains
queryable for the full active window per
data-retention — operator-action records.
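The same-transaction rule can be illustrated with the standard-library `sqlite3` module. This is a sketch only: the table layouts, the `active` status value, and the `ingest_source_material` helper are assumptions for illustration, and the real system's storage layer may look quite different. The audit columns follow the tuple named above.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical, simplified schema for illustration.
SCHEMA = """
CREATE TABLE source_material (
    id TEXT PRIMARY KEY,
    title TEXT NOT NULL,
    status TEXT NOT NULL
);
CREATE TABLE world_authoring_audit (
    operator_id TEXT NOT NULL,
    target_type TEXT NOT NULL,
    target_id TEXT NOT NULL,
    action TEXT NOT NULL,
    occurred_at TEXT NOT NULL,
    correlation_id TEXT NOT NULL
);
"""


def ingest_source_material(
    conn: sqlite3.Connection,
    operator_id: str,
    correlation_id: str,
    material_id: str,
    title: str,
) -> None:
    # "with conn" commits if the block succeeds and rolls back on any
    # exception, so the entity row and the audit row land together or not at all.
    with conn:
        conn.execute(
            "INSERT INTO source_material (id, title, status) VALUES (?, ?, 'active')",
            (material_id, title),
        )
        conn.execute(
            "INSERT INTO world_authoring_audit "
            "(operator_id, target_type, target_id, action, occurred_at, correlation_id) "
            "VALUES (?, 'source_material', ?, 'ingested', ?, ?)",
            (
                operator_id,
                material_id,
                datetime.now(timezone.utc).isoformat(),
                correlation_id,
            ),
        )


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
ingest_source_material(conn, "op-1", "corr-1", "sm-1", "Publication A")
```

If the audit insert fails, for example because a required column is missing, the source-material insert is rolled back with it, so no entity state exists without its audit row.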
WorldAuthoringAudit is not agent memory (per the
records-vs-memory framing). It is
operator-scoped (app.user_id) with no workspace RLS, distinct from ApprovalDecision
(same identity domain, different action surface — enshrinement-gate decision vs authoring)
and from operations_agent_approval (platform-side, Ops-Agent call-time approval). Three
operator-action audit families remain stable across the worlds / platform split.
The DistillationRun workflow record itself lives alongside the audit row — the audit
row records the operator action (“operator X requested distillation Y at time Z”); the
DistillationRun row carries the workflow lifecycle state (status, started_at,
completed_at, failure_reason_code) and is the system of record for run progress.
UI surfaces
The operator UI surfaces this workflow as four panels:
- Source inventory: list / inspect / ingest / retire source materials; shows provenance envelope (publication, section, revision)
- Distillation request: operator-parameterized form (world model, source selection, focus area, problem space, expected tier) submitting a new `DistillationRun`
- Run status: live view of run lifecycle and per-source extraction state; surfaces partial output as it lands; explicit error display on per-source failure
- Candidate review: produced `RuleCandidate` rows with full provenance trail (run → source materials → candidate); operator can advance candidates to the conformity gate or retire them in place
Every UI surface has a corresponding API endpoint per the dual-occupant API/UI parity rule; the Operations Agent consumes the same endpoints to initiate and monitor runs.
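The provenance trail shown in candidate review (run → source materials → candidate) could be assembled from junction rows roughly as follows. The column set on `DistillationRunOutput`, the `ProvenanceTrail` type, and the `provenance_trail` helper are hypothetical, and the sketch assumes each candidate is produced by exactly one run; the domain model is the authority on the real junction shape.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DistillationRunOutput:
    """Hypothetical junction row: one candidate/source pairing within a run."""

    run_id: str
    candidate_id: str
    source_material_id: str


@dataclass(frozen=True)
class ProvenanceTrail:
    candidate_id: str
    run_id: str
    source_material_ids: tuple[str, ...]


def provenance_trail(
    candidate_id: str, outputs: list[DistillationRunOutput]
) -> ProvenanceTrail:
    """Walk the junction rows from a candidate back to its run and sources."""
    rows = [o for o in outputs if o.candidate_id == candidate_id]
    if not rows:
        raise LookupError(f"no distillation output recorded for {candidate_id}")
    run_ids = {o.run_id for o in rows}
    if len(run_ids) != 1:
        # Assumption of this sketch: one producing run per candidate.
        raise ValueError(f"{candidate_id} is linked to several runs: {sorted(run_ids)}")
    sources = tuple(sorted({o.source_material_id for o in rows}))
    return ProvenanceTrail(candidate_id, run_ids.pop(), sources)


outputs = [
    DistillationRunOutput("run-1", "cand-1", "sm-2"),
    DistillationRunOutput("run-1", "cand-1", "sm-1"),
    DistillationRunOutput("run-1", "cand-2", "sm-1"),
]
trail = provenance_trail("cand-1", outputs)
```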
API surfaces
The HTTP routes consumed by both the UI and the Operations Agent (per the parity contract) include:
- `POST /worlds/:world_id/source-materials`: operator ingest of a new source material (idempotent on `(operator_id, correlation_id)`)
- `POST /worlds/:world_id/source-materials/:id/retire`: operator retire of an existing source material (idempotent)
- `POST /worlds/:world_id/distillations`: operator submit of a new distillation run (idempotent on `(requested_by, correlation_id)`)
- `GET /worlds/:world_id/distillations/:id`: run status (run-level + per-source)
- `GET /worlds/:world_id/distillations/:id/candidates`: produced `RuleCandidate` rows with provenance trail
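The idempotency contract on the submit route can be sketched in a few lines. This is an in-memory illustration, not the service code: `DistillationSubmitter`, `RunReceipt`, and the dictionary standing in for a uniqueness constraint on `(requested_by, correlation_id)` are all hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RunReceipt:
    run_id: str
    status: str


@dataclass
class DistillationSubmitter:
    # In-memory stand-in for a unique index on (requested_by, correlation_id).
    _by_key: dict[tuple[str, str], RunReceipt] = field(default_factory=dict)

    def submit(self, requested_by: str, correlation_id: str) -> RunReceipt:
        key = (requested_by, correlation_id)
        existing = self._by_key.get(key)
        if existing is not None:
            # Replay: return the original outcome and create nothing new.
            return existing
        receipt = RunReceipt(run_id=str(uuid.uuid4()), status="in_progress")
        self._by_key[key] = receipt
        return receipt


submitter = DistillationSubmitter()
first = submitter.submit("op-1", "corr-1")
replay = submitter.submit("op-1", "corr-1")   # same key, same receipt
other = submitter.submit("op-2", "corr-1")    # different operator, new run
```

Scoping the key to the operator as well as the correlation ID means two operators who happen to reuse the same correlation ID do not collide.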
Auth middleware sets app.user_id via SET LOCAL per
ADR-041 D4; RLS predicates compare against
app.user_id per ADR-039. The
apps/operations deployable consumes these routes via HTTP per
ADR-047 and never imports
spectral.worlds internals; the architecture validator at STRICT=True enforces this.
Related reading
- Operations App overview: dual-occupant UX, context boundary
- Authoring: sibling operator surface for world-model + rule + relationship authoring (parallel audit family)
- World Model System / Evolution Loop: the governed path candidates travel
- Domain Model (SourceMaterial / DistillationRun): entity definitions
- Data Retention: POLICY_REGISTRY entries for source materials and distillation runs
- Source materials (tax-prep): IRS publication inventory backing the initial world model