World Model System

World Model

A World Model is a Spectral-managed authoring authority for a domain. It is scoped to (org_id, domain_id), versioned, and exists independently of the platform pillar that hosts its deployed modules at decision time. The world model is a governed artifact, not a static reference. It evolves through the Evolution Loop, advances through published versions, and carries its own provenance and authority metadata.

Each enshrined action compiles to one deployable action module. Callers do not read rule content at decision time — they invoke /decide and receive a binding { status, work_frame, decision_metadata }. The world model is the source of the deterministic code that runs.

The world model shape

Every world-model version declares three first-class components.

Context schema

The typed declaration of every attribute a domain’s decisions consume. Each attribute carries a source that determines where its value comes from at runtime:

supplied — caller provides in the request body. Schema validation: missing or mistyped required attribute → YELLOW + gather_evidence_and_retry.
system_generated — Spectral captures at request entry (request_time, request_id, authenticated_caller). Cannot fail; caller cannot forge.
computed — derivation runs at composition-root entry from supplied + system_generated. Derivation throws → YELLOW + diagnostic in trace.

Rules do not distinguish source — they destructure typed values from the unified DecisionContext. The source distinction is operator/authoring-facing.

context_schema:
  amount:           { type: number, source: supplied, constraints: {min: 0, max: 1e9} }
  payment_method:   { type: string, source: supplied, enum: [wire, ach, check, credit_card] }
  vendor_id:        { type: string, source: supplied }
  request_time:     { type: timestamp, source: system_generated }
  request_id:       { type: string,    source: system_generated }
  authenticated_caller:
    type: object
    source: system_generated
    structure:
      principal:  { type: string }
      acting_for: { type: string, optional: true }   # RFC 8693 act claim
      roles:      { type: list<string> }
  business_hours:
    type: boolean
    source: computed
    derivation: "request_time falls within business_hours_window, excluding holidays"
    depends_on:
      context: [request_time]
      configuration: [business_hours_window, holidays]

Configuration block

Stable values per world-model version that derivations and rule code-generation consume. Configuration is its own primitive at the world-model level — business-hour definitions, holiday calendars, dollar thresholds, regional regulatory parameters.

Stable per world-model version — changes are a version bump; dependent derivations and rules regenerate.
Used by derivations at composition-root entry (e.g., business_hours consults business_hours_window and holidays).
Used by rules at code-gen time — thresholds like amount > high_value_threshold are baked into the generated predicate. Rules do not read configuration at runtime.
Not exposed to predicates at runtime — predicates see only typed context.

Configuration lets operators tune thresholds without rewriting the natural-language assertions: the operator adjusts a threshold, the version bumps, the World Agent regenerates the dependent rules’ code, and the deployed module reflects the new value.
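To illustrate what "baked in at code-gen time" could mean, here is a small sketch under stated assumptions. The generator functions are hypothetical, the real World Agent's code generation is not described in this document, and `exec` is used only to keep the example self-contained.

```python
def generate_predicate_source(rule_id, attribute, op, config_key, config):
    """Emit predicate source with the configuration value inlined as a literal."""
    threshold = config[config_key]
    return (
        f"def predicate(ctx):\n"
        f"    # generated for {rule_id}; {config_key}={threshold!r} baked in at code-gen\n"
        f"    return {{'matched': ctx[{attribute!r}] {op} {threshold!r}}}\n"
    )


def compile_predicate(source):
    """Turn generated source into a callable (illustration only)."""
    namespace = {}
    exec(source, namespace)
    return namespace["predicate"]


# Two world-model versions that differ only in one configuration value.
source_v1 = generate_predicate_source(
    "T1.high_value_wire", "amount", ">", "high_value_threshold",
    {"high_value_threshold": 25_000},
)
source_v2 = generate_predicate_source(
    "T1.high_value_wire", "amount", ">", "high_value_threshold",
    {"high_value_threshold": 50_000},
)
predicate_v1 = compile_predicate(source_v1)
predicate_v2 = compile_predicate(source_v2)
```

The generated predicate receives only `ctx`. The configuration dictionary exists at generation time and is absent at evaluation time, which is why a threshold change requires a version bump and regeneration.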

Action registry

The set of registered actions in the domain. Each action is the unit of decision-time routing: a caller’s request names the action it is about to take (wire_transfer.release, vendor.onboard, ach.release); Spectral routes to the deployed action module for (org, domain, action) and evaluates that action’s rules against the context.

Actions hold their rules directly — no separate “rule-set” primitive layered between action and rules. Each action also (optionally) carries an aggregation_mode controlling how its rules combine; default is winner_takes_all. A rule associates to one or more actions through the many-to-many action–rule registry (per ADR-104); a rule assigned to no action does not deploy.

actions:
  "wire_transfer.release":
    aggregation_mode: winner_takes_all     # default; implicit if omitted
    rules: [ T1.high_value_wire, T2.new_bank_details, ... ]
  "ach.release":
    rules: [ T1.high_value_ach, T2.new_vendor_ach, ... ]
  "vendor.onboard":
    rules: [ T1.missing_w9, T2.duplicate_vendor, ... ]
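The registry's behavior can be sketched as a few lookups. This is an illustration only: the function names are hypothetical, the elided rule entries above are left out, and raising on an unregistered action is an assumption rather than documented behavior.

```python
DEFAULT_AGGREGATION_MODE = "winner_takes_all"

# Mirrors the registry above, minus the elided entries.
ACTIONS = {
    "wire_transfer.release": {"rules": ["T1.high_value_wire", "T2.new_bank_details"]},
    "ach.release": {"rules": ["T1.high_value_ach", "T2.new_vendor_ach"]},
    "vendor.onboard": {"rules": ["T1.missing_w9", "T2.duplicate_vendor"]},
}


def aggregation_mode(actions, action):
    """The mode is optional per action; omitted means the default."""
    return actions[action].get("aggregation_mode", DEFAULT_AGGREGATION_MODE)


def rules_for(actions, action):
    """Rules evaluated when a caller names this action."""
    if action not in actions:
        raise LookupError(f"unregistered action: {action}")  # assumed behavior
    return list(actions[action]["rules"])


def undeployed_rules(actions, catalog):
    """Rules in the catalog that no action references, so they do not deploy."""
    assigned = {rule_id for entry in actions.values() for rule_id in entry["rules"]}
    return sorted(set(catalog) - assigned)
```

Because rules are held directly on each action, membership is a plain list lookup, and a rule can appear under several actions without any intermediate rule-set object.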

The rule primitive

A Rule is the atomic unit of an action. Each rule is one file inside the action’s module, exporting:

Metadata (static, machine-readable) — id, tier (T1/T2/T3), outcome (the status emitted when the rule matches), description, category (optional taxonomy tag), suppresses, dual provenance.
applies_when (optional) — a context-only predicate evaluated before the rule’s main predicate. The rule participates in matching only if applies_when returns true. Used for conditional activation and multi-stage classification; DRYs gating logic that would otherwise duplicate across rule predicates.
Predicate — pure function, deterministic, no I/O. Takes a typed DecisionContext, returns { matched: bool, reason?: string, trace?: object }.
Declared inputs — the typed, documented inputs the predicate consumes (name, value type, description, required). Codegen is bound to declared == read (per ADR-107), so the declaration equals the predicate’s actual reads. At publish, the declarations across an action’s rules are converged onto the action’s canonical input ontology: the authority for its published input schema, with one named typed input per real quantity. The alternative, a name-keyed union, could list one quantity twice under two names. Once a version has published an input, that input is a frozen contract: convergence preserves it, and an automatic rename or re-type fails the publish loudly and goes to the operator for arbitration. An operator who wants to rename or re-type a published input does so deliberately at a new version by recording the edit (after being shown its contractual impact); the next publish honors the edit and stamps it with operator provenance (per ADR-107 D6/D9). A previously published version is never mutated in place.
Behavioral spec + tests — a spec of the inputs the outcome depends on and discriminating case-pairs (including negative cases) is extracted from the rule’s natural-language text independently of the predicate; discriminating tests are deterministically materialized from it, and the predicate is generated to pass them (per ADR-108).

The outcome is static metadata, not predicate output. Rules report whether they matched; the composition root reads metadata.outcome to determine the contributing status. This keeps generation simple and composition predictable.
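The split between static outcome and predicate result can be sketched as follows. The `Rule` class, `contributing_status`, the example outcome RED, and the inlined threshold are illustrative assumptions; the sketch also collapses the one-file-per-rule layout into a single object.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Optional


@dataclass(frozen=True)
class Rule:
    id: str
    tier: str       # "T1" | "T2" | "T3"
    outcome: str    # static metadata: the status emitted when the rule matches
    predicate: Callable[[Mapping], dict]
    applies_when: Optional[Callable[[Mapping], bool]] = None


def contributing_status(rule: Rule, ctx: Mapping) -> Optional[str]:
    """Return the rule's outcome if it participates and matches, else None."""
    # applies_when gates participation before the main predicate runs.
    if rule.applies_when is not None and not rule.applies_when(ctx):
        return None
    result = rule.predicate(ctx)  # pure and deterministic: reports only whether it matched
    return rule.outcome if result["matched"] else None


# Hypothetical rule: the outcome and threshold values are illustrative.
high_value_wire = Rule(
    id="T1.high_value_wire",
    tier="T1",
    outcome="RED",
    applies_when=lambda ctx: ctx["payment_method"] == "wire",
    predicate=lambda ctx: {"matched": ctx["amount"] > 25_000,
                           "reason": "amount above threshold"},
)
```

The predicate never returns a status. The composition root reads `rule.outcome`, so regenerating a predicate cannot change which status a rule contributes.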

Severity tiers

Each rule carries a tier that drives suppression and aggregation behavior. These severity tiers are distinct from the T1/T2/T3 memory tiers (see the glossary for the terminology disambiguation):

T1 — unconditional hard-floor override. Any T1 match wins outright regardless of aggregation mode, which preserves the T1-unsuppressible property across all current and reserved aggregation modes.
T2 — standard rules; combine according to the action’s aggregation mode.
T3 — soft signal; combine according to the action’s aggregation mode.

Aggregation mode (action-level)

Each action carries an optional aggregation_mode in its metadata. It controls how the action’s rules combine into a single decision outcome. The schema accommodates the field today, but only one mode is implemented; the other modes are reserved, with their shape defined and their implementation deferred.

Mode	Status	Behavior
`winner_takes_all`	Implemented (default)	All matched rules sort by severity; highest-severity rule’s outcome wins.
`vote_of_n`	Reserved (not implemented)	Group of rules with the same outcome must reach a threshold count to fire that outcome.
`weighted_sum_threshold`	Reserved (not implemented)	Each matched rule contributes a weight; sum compared against thresholds determines outcome.

T1 rules are an unconditional hard-floor override regardless of aggregation mode. Aggregation applies only to T2/T3 rules. When two or more matched rules share the winning tier but emit different outcomes, the most-restrictive outcome binds (RED > YELLOW > GREEN-SKIP > GREEN) — a tie never silently downgrades a block. See ADR-106.
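A minimal sketch of winner_takes_all with the tie-break described above. It assumes severity orders the tiers T1 > T2 > T3 and returns None when nothing matched, since the no-match outcome is not specified here; the function and constant names are hypothetical.

```python
# Lower rank means higher severity (assumed ordering T1 > T2 > T3).
TIER_RANK = {"T1": 0, "T2": 1, "T3": 2}

# Most restrictive first, as stated in the text: RED > YELLOW > GREEN-SKIP > GREEN.
RESTRICTIVENESS = ["RED", "YELLOW", "GREEN-SKIP", "GREEN"]


def winner_takes_all(matched):
    """matched: list of (tier, outcome) pairs for the rules that matched."""
    if not matched:
        return None  # no-match outcome is outside this sketch
    winning_rank = min(TIER_RANK[tier] for tier, _ in matched)
    candidates = [outcome for tier, outcome in matched
                  if TIER_RANK[tier] == winning_rank]
    # Same-tier disagreement: the most restrictive outcome binds.
    return min(candidates, key=RESTRICTIVENESS.index)
```

In this mode the T1 override falls out of the severity sort. Under the reserved modes it would need to be an explicit check that runs before aggregation, because those modes do not rank by tier.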

Three-status lifecycle

Every persisted rule occupies one of three status values.

Candidate — proposed and under review. The operator may request revision or send the rule back for more work; the row remains a candidate. Candidate rules can be assigned to actions as authoring membership, but that membership is inert until the rule is enshrined.

Enshrined — accepted into the world rule catalog by operator sign-off on the natural-language rule intent. Generated predicate code, behavioral tests, input declarations, and publish-time input-ontology realignment are machine-checked artifacts around that intent. Enshrined does not mean live, deployed, frozen, or exposed to callers: publish/deploy version boundaries decide what ships, and a published input contract is the freeze boundary.

Retired — discarded, rejected, superseded, or invalidated. Retired rules are preserved for audit and evolution history but no longer participate in evaluation.

The persisted state machine is Candidate → Enshrined → Retired, with terminal discard/reject moving a candidate to Retired. Request-revision/send-back keeps the rule in Candidate with reviewer notes. Action assignment is status-agnostic membership; action-specific publication and deployment gather assigned Enshrined rules only. The World Model Card snapshots the enshrined rule catalog and pins per-action membership to assigned Enshrined rule IDs.
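The state machine and the publish-time filter can be sketched as a transition table. The event names (`request_revision`, `enshrine`, `discard`, `retire`) are hypothetical labels for the operator actions described above, not identifiers from the platform.

```python
# (current status, event) -> next status
TRANSITIONS = {
    ("candidate", "request_revision"): "candidate",  # send-back keeps the row a candidate
    ("candidate", "enshrine"): "enshrined",          # operator sign-off on rule intent
    ("candidate", "discard"): "retired",             # terminal discard or reject
    ("enshrined", "retire"): "retired",              # superseded or invalidated
}


def transition(status, event):
    """Apply one lifecycle event, rejecting anything the table does not list."""
    try:
        return TRANSITIONS[(status, event)]
    except KeyError:
        raise ValueError(f"illegal transition: {event!r} from {status!r}") from None


def publishable_rules(assigned, status_by_rule):
    """Action membership is status-agnostic; publication keeps Enshrined rules only."""
    return [rule_id for rule_id in assigned
            if status_by_rule.get(rule_id) == "enshrined"]
```

Keeping assignment separate from status means a candidate can be placed under an action while still in review, and it starts shipping only after it is enshrined and a version is published.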

Two-dimensional provenance

Each rule carries two-dimensional provenance — separate axes for where the rule came from and where the predicate code came from. Both dimensions are part of every enshrined rule.

Authoritative source dimension

The six-tier source-strength taxonomy, strongest to weakest:

Authoritative — sourced directly from a recognized domain authority. Regulatory text, standards-body publications, and published specifications qualify. This tier carries the strongest “precedes the system” claim: the rule is grounded in sources established independently of any AI system’s behavior.

Curated — sourced from high-quality secondary material with traceable lineage. Expert-written references, peer-reviewed analysis, and well-attributed domain literature qualify. Strong provenance, but secondary to the primary sources of the Authoritative tier.

Distilled — derived from LLM synthesis over primary or secondary sources without a direct quotation chain. Useful for coverage, but weaker provenance: a rule that LLMs proposed by reading multiple authoritative sections together and condensing the cross-reference into a single rule statement is Distilled — the underlying sources are Authoritative, but the rule text itself is the LLM’s synthesis. Subject to drift if the underlying sources change before re-distillation.

Researched — derived from LLM synthesis over sources the World Agent located through bounded web research (see Evolution Loop — Web-research gap-filling), not sources the operator supplied. Weaker than Distilled because the source selection itself is machine-mediated — the World Agent decides what to research and which results to adopt, a wider AI-mediation surface and a weaker governance regime than synthesis over an operator-chosen corpus. Stronger than Observed because it precedes deployment: the rule is established before the module runs, not inferred from traffic after the fact.

Observed — derived from patterns in live decision traffic via override-pattern signals from the Customer Dashboard. Rules describe behaviors already present in customer-flagged decisions, which means they come after operational practice rather than before it. Example: aggregating a world model’s customer review-requests by pattern surfaces a recurrent edge case that the World Agent proposes as a guardrail rule; the rule is Observed because no published authority described it before customer-flagged decisions surfaced it.

Assistant-drafted (operator-directed) — drafted by the World Agent on an operator’s instruction in chat, with no source corpus behind it. This is the weakest source-strength form: unlike every tier above it, an Assistant-drafted rule cites no source material at all — neither operator-supplied (Distilled), nor machine-located (Researched), nor observed in live traffic (Observed). Its grounding is the operator’s intent expressed in chat, recorded as the rule’s origin (the directing operator and the chat session). Because there is no source corpus, zero cited sources is the expected state for an Assistant-drafted rule, not a missing-citation defect.

The claim that holds uniformly across the source-backed tiers is “established and governed before the deployed module ran” — not “derived independently of AI systems.” The Authoritative and Curated tiers satisfy the stronger independence form. The Distilled, Researched, and Observed tiers satisfy the governed form only. An Assistant-drafted rule is governed by the operator who directed and reviewed it rather than by an external source. The World Model Card discloses the provenance composition per version so a reader can evaluate how strong the claim is.
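As an illustration of how a per-version provenance composition might be computed, the sketch below encodes the six tiers as an ordered enum. The numeric values, `claim_form`, and `provenance_composition` are assumptions for illustration; the World Model Card's actual format is not specified here.

```python
from collections import Counter
from enum import IntEnum


class SourceTier(IntEnum):
    # Higher value means stronger source (ordering from the taxonomy above).
    ASSISTANT_DRAFTED = 1
    OBSERVED = 2
    RESEARCHED = 3
    DISTILLED = 4
    CURATED = 5
    AUTHORITATIVE = 6


def claim_form(tier: SourceTier) -> str:
    """Which form of the provenance claim a tier supports."""
    if tier in (SourceTier.AUTHORITATIVE, SourceTier.CURATED):
        return "independent"        # established independently of AI systems
    if tier is SourceTier.ASSISTANT_DRAFTED:
        return "operator-governed"  # no source corpus; governed by the directing operator
    return "governed"               # established and governed before the module ran


def provenance_composition(tiers):
    """Share of rules per tier, strongest tier first."""
    counts = Counter(tiers)
    total = len(tiers)
    return {tier.name: counts[tier] / total
            for tier in sorted(counts, reverse=True)}
```

A reader of the composition can then see, for example, that half of a version's rules support the independent form of the claim while the rest support only the governed form.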

Code dimension

Each rule’s predicate code carries its own provenance: the generated predicate’s lineage to the natural-language rule it was generated from. The code is regenerated when the natural-language form or its configuration dependencies change, and it is verified at the implementation-readiness gate. The code dimension is what lets the world model claim “the deployed module behaves the way the authority described, not the way the LLM thought it should.”

Why two dimensions

Each rule has two authoring artifacts that can drift independently: the natural-language rule text (operator intent, sourced from authority) and the generated predicate code (LLM-generated executable form). Tracking both keeps both honest:

Authoritative dimension survives restatement — a rule’s source citation does not change when the predicate code is regenerated.
Code dimension is regenerated when the natural-language form changes or when configuration values the rule depends on shift — the predicate is always derived from the current authority-aligned form, never from a stale snapshot.
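One way to detect that the code dimension is stale is to fingerprint the inputs the predicate was generated from. This is a sketch under assumptions: hashing is a technique chosen for illustration, not a documented mechanism, and the function names are hypothetical.

```python
import hashlib
import json


def code_fingerprint(rule_text, config, depends_on):
    """Hash the natural-language text plus only the configuration values the rule uses."""
    payload = json.dumps(
        {"text": rule_text,
         "config": {key: config[key] for key in sorted(depends_on)}},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def needs_regeneration(stored_fingerprint, rule_text, config, depends_on):
    """True when the text or a depended-on configuration value has changed."""
    return stored_fingerprint != code_fingerprint(rule_text, config, depends_on)
```

The source citation is deliberately not part of the fingerprint: regenerating the predicate leaves the authoritative dimension untouched, and a change to an unrelated configuration value does not trigger regeneration.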

Three-zone structure

A world model’s coverage of its domain is partitioned into Known, Unknown, and Unknowable zones. The Known zone holds enshrined rules accepted into the world rule catalog; published and deployed modules evaluate the assigned enshrined subset pinned into each version. Unknown holds rules that exist in the domain but have not yet been discovered and enshrined; Unknowable holds rules that exist but cannot be perceived with current methods.

Coverage is intentionally incomplete. Unknown and Unknowable zones are acknowledged first-class properties, not gaps to be papered over. The Evolution Loop is the mechanism for moving rules from Unknown to Known; the Unknowable zone is named so the system stays honest about its epistemic limits.

Versioning

A world model version is a coherent snapshot of the standard at a point in time, with documented changes from the prior version. Cadence is operator-triggered, not periodic: operators publish a version when accumulated enshrinements clear a coverage milestone or when a release rationale applies (substantive scope expansion, regulatory authority change, accumulated corrections to prior versions, or configuration changes that warrant a restatement). See Operations App — Version Publication for the publication workflow.

Each published version is the unit of authority per ADR-026. Decisions are deterministic over (world_model_version, supplied_context, system_generated_at_entry). Prior versions remain queryable indefinitely; the active version is the routing default for a domain, with optional per-request world_model_version pinning per ADR-077 D2.
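Version routing with optional pinning can be sketched in a few lines. The function name, the version labels, and the error raised for an unknown pin are assumptions made for illustration.

```python
def resolve_version(published, active, pinned=None):
    """Pick the version a request evaluates against.

    published: set of all published version labels (prior versions stay queryable)
    active:    the domain's routing default
    pinned:    optional per-request world_model_version
    """
    version = active if pinned is None else pinned
    if version not in published:
        raise LookupError(f"unknown world_model_version: {version!r}")  # assumed behavior
    return version
```

For example, with published versions `{"v1", "v2", "v3"}` and `v3` active, an unpinned request resolves to `v3` while a request pinned to `v1` still resolves to `v1`. Since decisions are deterministic over the resolved version plus the request's context, replaying a pinned request reproduces the earlier decision.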

Restatement (extended)

Restatement is the mechanism by which world-model versions name what changed and why between versions. The ADR-026 restatement mechanism extends to three categories:

Configuration restatement — a value in the configuration block changes (e.g., high_value_threshold rises from 25_000 to 50_000). Dependent rules regenerate their predicate code; the natural-language form may not change.
Rule restatement — a rule’s natural-language form changes (e.g., a clarification, a scope refinement, a correction discovered via override-pattern signals).
Action restatement — an action’s shape changes (e.g., a new rule added, a rule retired, the aggregation mode shifts).

Restatement carries the authoritative-provenance dimension forward across version transitions while letting the code dimension regenerate where needed. See ADR-026 for the unit-of-authority framing the categories extend.
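The three categories can be sketched as a comparison of two version snapshots. The snapshot shape (dictionaries keyed `configuration`, `rules`, and `actions`) and the function name are hypothetical, chosen only to make the comparison concrete.

```python
def restatement_categories(previous, current):
    """Name which restatement categories apply between two version snapshots.

    Each snapshot is assumed to look like:
      {"configuration": {key: value},
       "rules": {rule_id: natural_language_text},
       "actions": {action: {"aggregation_mode": str, "rules": [rule_id, ...]}}}
    """
    categories = set()
    if previous["configuration"] != current["configuration"]:
        categories.add("configuration")
    # A rule restatement is a text change to a rule present in both versions.
    shared = previous["rules"].keys() & current["rules"].keys()
    if any(previous["rules"][r] != current["rules"][r] for r in shared):
        categories.add("rule")
    # Added or retired rules and aggregation-mode shifts change an action's shape.
    if previous["actions"] != current["actions"]:
        categories.add("action")
    return categories
```

A single version bump can carry more than one category, for instance a threshold change published together with a newly added rule.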

What’s next

World Agent — code generation, applies_when generation, the two interaction modes
Evolution Loop — both gates, human sign-off, publication transaction
System Card — World Model Card methodology disclosure paired with the deployment-scoped System Card
Decision Execution — how deployed action modules run at decision time
