ADR-100: Rule-provenance model — four axes and authoritative-source taxonomy reconciliation
Context
There are two pictures of rule provenance in the corpus, and they do not match.
The designed model (Codex world-model.mdx “Two-dimensional provenance” + ADR-080 D4 + ADR-082 D2) is richer than a single field. A rule carries:
- a two-dimensional provenance — an authoritative-source dimension (a source-strength taxonomy, designed as Authoritative / Curated / Distilled / Observed) and a code dimension (the generated predicate’s lineage to the natural-language rule, the world-agent version, the eval-suite version, the generation-run identifier); plus
- a separate severity tier axis (T1/T2/T3, driving suppression and aggregation); and
- a separate lifecycle status axis (candidate | enshrined | retired).
The World Model Card (Codex system-card.mdx) discloses the authoritative-source composition per version — the reader counts how many rules sit at each source-strength tier to judge how strong the version’s “established and governed before the module ran” claim is.
The implemented model ships a single domain enum ProvenanceTier = inferred | observed | asserted (src/spectral/worlds/domain/authoring/types.py, SPEC-446/447), stored on worlds.rules, with a ChangeProvenanceTier use case and a tier_changed audit action. This alphabet matches none of the four designed axes: it is neither the source-strength taxonomy (it has no Authoritative/Curated/Distilled/Observed), nor the code dimension, nor the severity tier, nor the lifecycle. It encodes an authorship/confirmation concept (inferred = system-derived, observed = operator-confirmed, asserted = operator-stated) that overlaps the lifecycle and the created_by/citation facts already recorded elsewhere.
The implementation cites ADR-026 as its authority. ADR-026 defines no provenance alphabet — it decides version-as-authority and names three restatement categories (assertion change, code regeneration, behavioral correction). The citation is unsupported.
The drift has a concrete downstream cost. The WorldModelCard publication event carries rule_health_distribution: dict[str, Any] (worlds.contracts.events.world_model_card_published), projected into the platform System Card (system-card.mdx provenance summary), which the disclosure already expects to compose over the four-tier source taxonomy. With the shipped enum the card would render the wrong alphabet — inferred/observed/asserted counts in place of a source-strength distribution. ADR-090 D2 added a fifth concern: web-research-sourced rules need a provenance marker, and D2 explicitly left the choice open (“extended Distilled with sub-classification, or a new Researched tier”).
This ADR reconciles the model to the designed four axes (a fifth — the emitted outcome — is ratified in ADR-106; see D1), fixes the source-taxonomy value set (including the web-research tier ADR-090 D2 left open), discards the unsupported enum, and corrects the citation. It specifies the target; the implementation reconciliation is deferred to a later session (D6).
Decision
D1 — Rule provenance is two-dimensional; severity and lifecycle are separate co-disclosed axes
A rule carries provenance along two dimensions:
- (a) authoritative-source dimension — where the rule came from, a source-strength taxonomy (D2).
- (b) code dimension — where the predicate code came from, the generated code’s lineage to the natural-language rule (D5).
Two further axes are co-disclosed with provenance on the World Model Card but are not provenance, and must not be conflated with it:
- severity tier: T1/T2/T3, governing suppression and aggregation (world-model.mdx “Severity tiers”).
- lifecycle status: the three-status machine candidate → enshrined → retired.
Four axes are named here. A fifth independent axis — the rule’s emitted outcome (the four-state status a matched rule contributes: GREEN | GREEN-SKIP | YELLOW | RED) — is ratified in ADR-106, which also implements the winner_takes_all aggregation that consumes severity: severity orders which matched rule’s outcome wins; it does not generate the outcome. Each axis is independent: a T1 rule may be Distilled; an Authoritative rule may still be a candidate; the independence holds across all five axes. Collapsing any pair into one enum (as the shipped inferred/observed/asserted did) is the defect this ADR corrects.
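The independence claim can be illustrated with a minimal sketch in which each axis is its own field on a rule record. The names RuleAxes and advance are hypothetical, the source tier is kept as a plain token for brevity, and the sketch assumes the lifecycle machine is strictly linear (candidate → enshrined → retired), which this ADR states only as a three-status sequence.

```python
from dataclasses import dataclass
from enum import Enum


class SeverityTier(Enum):
    T1 = "t1"
    T2 = "t2"
    T3 = "t3"


class LifecycleStatus(Enum):
    CANDIDATE = "candidate"
    ENSHRINED = "enshrined"
    RETIRED = "retired"


# Assumed forward-only transitions of the three-status machine.
_NEXT = {
    LifecycleStatus.CANDIDATE: LifecycleStatus.ENSHRINED,
    LifecycleStatus.ENSHRINED: LifecycleStatus.RETIRED,
}


@dataclass(frozen=True)
class RuleAxes:
    """Hypothetical record: one field per axis, none derived from another."""
    source_tier: str            # authoritative-source axis (D2), plain token here
    severity: SeverityTier      # severity axis
    lifecycle: LifecycleStatus  # lifecycle axis

    def advance(self) -> "RuleAxes":
        """Move the lifecycle one step; the other axes are carried over unchanged."""
        nxt = _NEXT.get(self.lifecycle)
        if nxt is None:
            raise ValueError("retired is terminal in this sketch")
        return RuleAxes(self.source_tier, self.severity, nxt)
```

For example, RuleAxes("distilled", SeverityTier.T1, LifecycleStatus.CANDIDATE) is a valid combination, and calling advance() on it changes only the lifecycle field.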
D2 — Authoritative-source axis: a six-value source-strength taxonomy
The authoritative-source dimension has six values, strongest to weakest:
authoritative > curated > distilled > researched > observed > assistant_drafted
- authoritative: sourced directly from a recognized domain authority (regulatory text, standards-body publications, published specifications). The strongest “precedes the system” claim: grounded in sources established independently of any AI system’s behavior.
- curated: sourced from high-quality secondary material with traceable lineage (expert-written references, peer-reviewed analysis, well-attributed domain literature). Strong, but secondary to the primary sources of authoritative.
- distilled: derived from LLM synthesis over operator-supplied primary or secondary sources without a direct quotation chain. Useful for coverage; weaker provenance, subject to drift if the underlying sources change before re-distillation.
- researched: derived from sources the World Agent located through bounded web research (D3), not supplied by the operator. Ranked below distilled because the source selection itself is machine-mediated.
- observed: derived from patterns in live decision traffic via override-pattern signals from the Customer Dashboard. It describes behavior already present in customer-flagged decisions, arriving after operational practice rather than before it.
- assistant_drafted (SPEC-654): drafted by the World Agent on an operator’s direction from a chat prompt, with no source corpus behind it. It is the weakest source strength. Unlike every family above it, it cites no source material at all: not operator-supplied (distilled), not machine-located (researched), not observed in live traffic (observed). Its grounding is operator intent expressed in chat, captured in the rule’s provenance_source envelope (the directing operator + the chat session), not a source the reader can independently weigh. Zero cited sources is the expected state for this family. The former distilled default produced a contradiction (“distilled from source material” alongside zero cited sources) that the operator surface reported as “No recorded origin · 0 cited sources · distilled tier”; this family resolves it.

The design/positioning session that motivated assistant_drafted is recorded in ADR-103 D7.
The value set is stored as a domain enum on worlds.rules.provenance_tier. The enum is named AuthoritativeSourceTier; this is recommended for clarity over keeping the ProvenanceTier name with a corrected value set, which would still read as “the one provenance field” and invite re-collapse. Token casing is lowercase in the database and the enum (authoritative, curated, distilled, researched, observed, assistant_drafted); Codex prose may title-case. The tier is assigned at distillation/authoring time per ADR-090 D1: it is a fact about how the rule was sourced, set when it is sourced.
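A minimal sketch of what the enum could look like follows. The enum name and the six lowercase tokens come from this ADR; the member names, the rank property, and is_stronger_than are illustrative assumptions, not part of the decision.

```python
from enum import Enum


class AuthoritativeSourceTier(Enum):
    # Declaration order is strongest to weakest, per D2.
    AUTHORITATIVE = "authoritative"
    CURATED = "curated"
    DISTILLED = "distilled"
    RESEARCHED = "researched"
    OBSERVED = "observed"
    ASSISTANT_DRAFTED = "assistant_drafted"

    @property
    def rank(self) -> int:
        """Position in the strength order; 0 is strongest (hypothetical helper)."""
        return list(type(self)).index(self)

    def is_stronger_than(self, other: "AuthoritativeSourceTier") -> bool:
        """True when self sits strictly above other in the D2 ordering."""
        return self.rank < other.rank
```

The sketch deliberately uses a plain Enum, not a str-mixin enum: with a str mixin, the `<` operator would fall back to alphabetical string comparison, which does not match the source-strength order. Lookup from a stored token still works, for example AuthoritativeSourceTier("researched").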
D3 — Web-research provenance is a distinct researched tier, ranked below distilled
ADR-090 D2 left the web-research marker open between “extended Distilled with sub-classification” and “a new Researched tier.” This ADR resolves it as a new researched tier, not an extended-distilled sub-classification.
Rationale: a web-found source is machine-selected. The World Agent decides what to research and which results to adopt — more AI-mediation and weaker governance than distilled, where the operator supplied the corpus and the synthesis is over an operator-chosen body. The two are different source-governance regimes, not shades of one, so the World Model Card must count them separately for an honest composition disclosure. A sub-classification of distilled would hide researched rules inside the distilled count and overstate the version’s governance. researched ranks below distilled and above observed: weaker than operator-supplied distillation, stronger than after-the-fact traffic observation. This closes ADR-090 D2’s open choice.
D4 — The implemented inferred/observed/asserted enum is discarded; ChangeProvenanceTier and tier_changed are retired
The shipped ProvenanceTier = inferred | observed | asserted enum is discarded. The authorship/confirmation concept it encoded is not lost — it is subsumed by axes that already exist:
- who authored / confirmed it → the lifecycle status (D1) plus created_by.
- what it was sourced from → the provenance_source citation recorded per rule.
- the history of those facts → the authoring audit trail.
No separate enum is needed to carry it.
The ChangeProvenanceTier use case and the tier_changed audit action are retired. Source strength is a fact set once at sourcing, not an operator confidence dial to be turned later. A genuine re-sourcing of a rule — its underlying authority changed, or a stronger source was found — is a substantive change to the rule and flows through the ADR-026 restatement mechanism (an assertion-change or rule restatement, recorded in release notes), not through an in-place tier mutation that leaves no version boundary.
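The difference between an in-place tier mutation and a restatement can be sketched as follows. This is an illustration only: SourcedRule, restate_with_new_source, and the integer world_version counter are hypothetical stand-ins, and the actual ADR-026 restatement mechanism is not described in this document.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SourcedRule:
    """Hypothetical rule record; frozen so the tier cannot be changed in place."""
    rule_id: str
    source_tier: str    # set once, when the rule is sourced
    world_version: int  # assumed stand-in for the version boundary


def restate_with_new_source(rule: SourcedRule, new_tier: str) -> SourcedRule:
    """Return a new record at the next version; the original record is untouched."""
    return replace(rule, source_tier=new_tier, world_version=rule.world_version + 1)
```

Assigning to source_tier on an existing SourcedRule raises dataclasses.FrozenInstanceError, so the only way to change the tier in this sketch is to produce a new record across a version boundary.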
The unsupported ADR-026 citation is corrected. ADR-026 governs version-as-authority and restatement; it does not define a provenance alphabet. The authority for the authoritative-source taxonomy is this ADR + ADR-082 D2 + ADR-080 D4 + Codex world-model.mdx.
D5 — Code-provenance lives in the module manifest; severity is a rule column; the card re-types to a structured summary
- Code dimension (D1(b)) is not a worlds.rules authoring column. It lives in the ADR-080 D4 module manifest, as world_agent_version, eval_framework_version (eval-suite version), and the generation-run identifier, and is projected onto the World Model Card from there. The code dimension is regenerated when the natural-language form or configuration dependencies change; it has a module-build cadence, not a rule-authoring cadence, which is why it belongs to the manifest and not to a rule column.
- Severity tier (T1/T2/T3) is a new worlds.rules column (default t2), a separate axis from source provenance per D1.
- World Model Card re-typing. The rule_health_distribution: dict[str, Any] field on the WorldModelCard publication event (worlds.contracts.events.world_model_card_published) is re-typed to a structured provenance_summary keyed by the six source tiers (authoritative/curated/distilled/researched/observed/assistant_drafted), so the System Card disclosure renders a typed composition rather than an opaque dict. A content-contract test pins the six-tier alphabet so producer and consumer cannot drift off it (bilateral contract test per ADR-065 D6).
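One possible shape for the typed summary and its alphabet-pinning test is sketched below, using only the standard library. The six field names are the tiers from D2; ProvenanceSummary, summarize, SOURCE_TIERS, and the test function are hypothetical names, and the real event payload may well be modeled differently.

```python
from collections import Counter
from dataclasses import dataclass, fields

# The six-tier alphabet from D2, strongest to weakest.
SOURCE_TIERS = (
    "authoritative", "curated", "distilled",
    "researched", "observed", "assistant_drafted",
)


@dataclass(frozen=True)
class ProvenanceSummary:
    """Hypothetical typed replacement for the opaque dict: one count per tier."""
    authoritative: int = 0
    curated: int = 0
    distilled: int = 0
    researched: int = 0
    observed: int = 0
    assistant_drafted: int = 0

    def __post_init__(self) -> None:
        # Counts are non-negative.
        for f in fields(self):
            if getattr(self, f.name) < 0:
                raise ValueError(f"{f.name} must be >= 0")


def summarize(rule_tiers: list[str]) -> ProvenanceSummary:
    """Count rules per source tier, rejecting tokens outside the alphabet."""
    counts = Counter(rule_tiers)
    unknown = set(counts) - set(SOURCE_TIERS)
    if unknown:
        raise ValueError(f"unknown source tiers: {sorted(unknown)}")
    return ProvenanceSummary(**counts)


def test_alphabet_is_pinned() -> None:
    # Fails if a field is added, removed, renamed, or reordered.
    assert tuple(f.name for f in fields(ProvenanceSummary)) == SOURCE_TIERS
```

With this shape, a token from the discarded enum such as "inferred" is rejected at summary time instead of being silently rendered on the card.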
D6 — This is a pre-Stream-E reconciliation; lifecycle remains a separate axis
This reconciliation lands before Stream E. The held Stream-E plan (SPEC-493/514) assumed the distilled tier already existed in the implementation; it does not (the shipped enum is inferred/observed/asserted). Settling the source taxonomy here removes that false premise before the Stream-E rule work plans against it.
The lifecycle axis is intentionally separate from provenance and remains candidate | enshrined | retired. Enshrined means accepted into the world rule catalog; it does not mean published, deployed, frozen, or caller-visible. Publication/deployment and input-contract freeze are version-boundary concerns handled outside this ADR, notably by the action ontology reconciliation work in ADR-107.
Consequences
- The authoritative-source axis is six values (authoritative > curated > distilled > researched > observed > assistant_drafted; the sixth added by SPEC-654), closing ADR-090 D2. The card composition disclosure has a stable alphabet to count over.
- The shipped ProvenanceTier = inferred | observed | asserted enum, the ChangeProvenanceTier use case, and the tier_changed audit action are retired; the enum is replaced by AuthoritativeSourceTier on worlds.rules.provenance_tier. Re-sourcing flows through ADR-026 restatement.
- A new worlds.rules severity column (T1/T2/T3, default t2) makes severity a first-class axis distinct from provenance. Code provenance stays in the ADR-080 D4 manifest and is not duplicated as a rule column.
- The WorldModelCard publication event’s rule_health_distribution: dict[str, Any] re-types to a structured provenance_summary keyed by the source tiers (additive-event-versioning discipline per ADR-044 D11; consumer ACL + bilateral contract test per ADR-065 D2/D4/D6). SPEC-654 adds the assistant_drafted count as an additive field (ge=0, default 0) under the same discipline, with no wire-version bump. The platform System Card projection consumes the typed shape.
- The ADR-026 citation in the implementation is corrected to this ADR + ADR-082 D2 + ADR-080 D4; ADR-026 itself is unchanged in substance (cite-correction only).
- Implementation reconciliation is deferred. The migration (enum rename + value set + new severity column + event re-type) and the test-reference reconciliation (~85 references to the discarded enum and tier_changed) are a later session’s work; this ADR specifies the target.
- Lifecycle is not expanded (D6). It remains a separate three-status axis from provenance, severity, outcome, and code provenance.
References
- ADR-026: world-model version as unit of authority + restatement mechanism (citation corrected here; not substantively changed; re-sourcing flows through its restatement categories)
- ADR-044 D11: additive payload versioning for the re-typed WorldModelCard event
- ADR-065 D2/D4/D6: producer-typed event payload + consumer ACL + bilateral contract test for the provenance_summary alphabet
- ADR-080 D4: build-provenance attestation in the module manifest; home of the code-provenance dimension (D5)
- ADR-082 D2: version-scoped World Model Card content: tier + assertion provenance + code provenance
- ADR-090 D1/D2: distillation-as-authoring tier assignment; web-research gap-fill (D2’s open tier choice closed here)
- Codex world-model.mdx: two-dimensional provenance, severity tiers, and the three-status lifecycle
- Codex system-card.mdx: World Model Card provenance-composition disclosure consuming the provenance_summary
- Codex evolution-loop.mdx: gates and publication transaction that set provenance and lifecycle