Operator Walkthrough

This is a narrated walkthrough — read it through; you don’t execute the steps yourself. A Spectral operations-team member (initially the founder, during dogfooding) bootstraps the first world model end-to-end.

It pairs with the first-customer walkthrough, which picks up once the world model is published and a customer attaches their first traces.

The operator — call her Maya — is starting the first world model for US individual federal tax prep. Nothing exists yet in spectral.worlds: no world model, no rules, no source materials, no published version. The goal is to reach a published v0.1.0 world model card with ~50 enshrined rules covering filing status and standard deduction (the two problem spaces in scope) so that the first-customer walkthrough can begin.

The full domain background lives in Domains — US Tax Prep. This page narrates the process, not the domain.

1 · Maya opens the Operations app and starts a world model

Maya logs in to the Operations app and opens the Ops Agent chat surface.

Maya: “I want to start a world model for US federal individual tax prep.”

The Ops Agent confirms the intent and asks two clarifying questions: tax year (2025) and problem-space starting point (filing status + standard deduction per the domain selection in ADR-029).

  • Ops Agent tools invoked: create_world_model, set_world_model_metadata
  • UI view: Operations app → World Models → New (rendered alongside the chat)
  • State change: WorldModel row created in spectral.worlds with status draft, tax-year = 2025, domain = us-federal-individual-tax
  • Codex detail: Operations — World Model Authoring
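
As a rough illustration of the state change above, the new row can be pictured as a small record that always starts in draft. This is a minimal sketch, not the Spectral schema: the `WorldModel` class, the `tax_year` field spelling, and the signature of `create_world_model` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WorldModel:
    """Hypothetical in-memory stand-in for a WorldModel row."""
    domain: str
    tax_year: int
    status: str = "draft"  # a new world model starts as a draft
    scope: List[str] = field(default_factory=list)  # set later, at step 2


def create_world_model(domain: str, tax_year: int) -> WorldModel:
    # Assumed behavior: creation only produces a draft and publishes nothing.
    return WorldModel(domain=domain, tax_year=tax_year)


wm = create_world_model("us-federal-individual-tax", 2025)
print(wm.status, wm.tax_year, wm.scope)
```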

2 · Agent proposes scope + problem-space breakdown

The Ops Agent draws on the domain reference content already in Codex and proposes a problem-space breakdown across three tiers.

  • In scope: filing status, standard deduction
  • Natural next steps: dependents, itemized deductions, credits
  • Permanently out of scope: corporate returns, state tax, international

Maya confirms the scope (filing status + standard deduction). The agent records the scoping decision on the world model so subsequent candidate drafting stays within it.

  • Ops Agent tools invoked: get_domain_reference (reads Domains — Problem Spaces), set_problem_space_scope
  • UI view: Operations app → World Models → (this one) → Scope
  • State change: WorldModel.scope set to ["filing_status", "standard_deduction"]
  • Codex detail: Domains — Problem Spaces
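
The three-tier breakdown can be sketched as a lookup that later steps consult before drafting candidates. Only `filing_status` and `standard_deduction` appear as identifiers in the walkthrough; the other identifiers and the `scope_tier` helper are hypothetical names chosen for illustration.

```python
# Tier membership as described in step 2. Identifiers other than
# "filing_status" and "standard_deduction" are assumed spellings.
IN_SCOPE = {"filing_status", "standard_deduction"}
NEXT_STEPS = {"dependents", "itemized_deductions", "credits"}
OUT_OF_SCOPE = {"corporate_returns", "state_tax", "international"}


def scope_tier(problem_space: str) -> str:
    """Return which tier a problem space falls into, or 'unknown'."""
    if problem_space in IN_SCOPE:
        return "in_scope"
    if problem_space in NEXT_STEPS:
        return "natural_next_step"
    if problem_space in OUT_OF_SCOPE:
        return "permanently_out_of_scope"
    return "unknown"


def may_draft_candidates(problem_space: str) -> bool:
    # Candidate drafting stays within the recorded scope.
    return scope_tier(problem_space) == "in_scope"
```

Under this sketch, a request to distill rules for `dependents` would be declined until the scope is widened in a later version.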

3 · Agent proposes source materials

The Ops Agent proposes the source inventory per Domains — Source Materials.

  • Pub 501 — primary for both problem spaces
  • Form 1040 Instructions — authoritative for filing-status definitions and the standard-deduction worksheet
  • Pub 17 — cross-reference context

It offers to ingest each publication with provenance metadata attached (publication name, section taxonomy, TY 2025) and asks Maya whether she wants to add or swap any sources.

  • Ops Agent tools invoked: propose_source_materials, list_candidate_sources
  • UI view: Operations app → World Models → (this one) → Sources
  • State change: SourceProposal rows in the Operations app operator-state; no world-model mutation yet (proposals are not yet ingested)
  • Codex detail: Domains — Source Materials

4 · Maya reviews, approves some, rejects some, adds her own

Maya approves all three proposals. She rejects the agent’s suggestion to add Pub 503, deferring it because it belongs to the dependents problem space, which is outside the current scope. She also adds the Schedule A Instructions as a cross-reference: itemized deductions are out of scope, but she wants the source available for the MFS-with-itemizing-spouse rule.

The agent acknowledges the rejection and the addition, then ingests the approved set.

  • Ops Agent tools invoked: attach_source_material (per approved source), ingest_source (per attached source)
  • UI view: Operations app → World Models → Sources → individual source action buttons (Approve / Reject / Add)
  • State change: SourceMaterial rows created in spectral.worlds with provenance envelopes keyed by {publication, section, revision}; ingestion runs as a background job
  • Codex detail: Operations — Source Material & Distillation
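
One way to picture the outcome of this step is a set of records keyed by the provenance triple. The sketch below is an assumption-laden illustration: `ProvenanceKey`, the placeholder section value, and the use of "TY 2025" as the revision are all hypothetical, and the real envelope format is not specified here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceKey:
    """Hypothetical key mirroring {publication, section, revision}."""
    publication: str
    section: str
    revision: str


# Maya's decisions from this step.
proposed = ["Pub 501", "Form 1040 Instructions", "Pub 17", "Pub 503"]
rejected = {"Pub 503"}
added = ["Schedule A Instructions"]

approved = [p for p in proposed if p not in rejected] + added

# Only approved sources get a SourceMaterial-like entry; ingestion is
# modeled as a status flag because it runs as a background job.
source_materials = {}
for publication in approved:
    key = ProvenanceKey(publication, section="(all sections)", revision="TY 2025")
    source_materials[key] = {"status": "ingesting"}

print(sorted(k.publication for k in source_materials))
```

Because the key is a frozen dataclass, two envelopes with the same publication, section, and revision compare equal and cannot be attached twice.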

5 · Agent distills candidates against the sources

With the source set Maya curated in step 4 ingested (Pub 501, Form 1040 Instructions, Pub 17, plus the Schedule A cross-reference), the Ops Agent runs distillation against the in-scope problem spaces. It proposes rule candidates, each carrying:

  • a natural-language statement of the rule
  • a reference to the specific section of the source publication it was distilled from
  • a proposed provenance tier (Authoritative for publication-direct, Curated for cross-referenced interpretation)
  • proposed depends-on / conflicts-with / qualifies relationships against prior candidates

The first run yields ~80 candidates — the distiller is intentionally over-generative; Maya prunes. Each candidate flows into the enshrinement queue in status = candidate.

  • Ops Agent tools invoked: request_distillation (per problem space), check_distillation_status, get_candidates
  • UI view: Operations app → Enshrinement Queue → (filtered by this world model)
  • State change: RuleCandidate rows created with status = candidate, each with provenance envelope and proposed relationships
  • Codex detail: Operations — Distillation
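
The four things each candidate carries map naturally onto a record type. The following is a sketch under stated assumptions: the class and field names are hypothetical, the statement text is a placeholder rather than a real rule, and `propose_tier` only restates the tier heuristic given in the list above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Tier(Enum):
    AUTHORITATIVE = "authoritative"  # publication-direct
    CURATED = "curated"              # cross-referenced interpretation


@dataclass
class RuleCandidate:
    """Hypothetical shape of a distilled candidate."""
    candidate_id: str
    statement: str          # natural-language statement of the rule
    source_ref: str         # section of the publication it came from
    tier: Tier              # proposed provenance tier
    depends_on: List[str] = field(default_factory=list)
    conflicts_with: List[str] = field(default_factory=list)
    qualifies: List[str] = field(default_factory=list)
    status: str = "candidate"  # every new candidate enters the queue here


def propose_tier(publication_direct: bool) -> Tier:
    return Tier.AUTHORITATIVE if publication_direct else Tier.CURATED


candidate = RuleCandidate(
    candidate_id="c-001",
    statement="(placeholder rule text distilled from the cited section)",
    source_ref="Pub 501 §MFS",
    tier=propose_tier(publication_direct=True),
)
```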

6 · Maya reviews candidates in the enshrinement queue

This is the slowest and most important step. Maya works through the queue one candidate at a time. For each candidate, she takes one of three actions:

  • Approves — the candidate transitions RuleCandidate → Rule (status enshrined). Relationships to prior rules are validated (acyclic depends-on, disjoint conflicts-with).
  • Rejects with reason — the candidate transitions to status = rejected. The rejection reason is recorded and surfaces to the agent so future distillation runs avoid the same pattern.
  • Requests revision — the candidate stays in status = candidate with a revision note; the agent can re-draft.

The Ops Agent can do none of these mutations itself. Every enshrinement passes through Maya’s UI click or an explicit API call she makes. The agent’s role is to summarize, cross-reference, and flag — not to promote.
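
The governance boundary described here can be read as a small state machine with an actor check. The sketch below is illustrative only: the status and action names follow the text, while `apply_action`, the actor labels, and the transition table are hypothetical and are not the Operations app's API.

```python
# (current status, action) -> next status, per the three actions above.
TRANSITIONS = {
    ("candidate", "approve"): "enshrined",
    ("candidate", "reject"): "rejected",
    ("candidate", "request_revision"): "candidate",  # stays in the queue
}


def apply_action(status: str, action: str, actor: str) -> str:
    """Return the next status, refusing any mutation not made by the operator."""
    if actor != "operator":
        # Assumed actor labels; the agent may summarize and flag, not promote.
        raise PermissionError(f"{actor} may not perform {action}")
    try:
        return TRANSITIONS[(status, action)]
    except KeyError:
        raise ValueError(f"{action} is not valid from status {status}") from None


print(apply_action("candidate", "approve", actor="operator"))
```

Keeping the actor check ahead of the transition lookup means an agent-originated call fails the same way regardless of which action it attempted.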

Typical first-pass numbers on a fresh corpus:

  • ~40 approved as-is
  • ~20 revised (agent re-drafts, Maya approves on second pass)
  • ~20 rejected (out of scope, duplicate, or source misinterpretation)

  • Ops Agent tools invoked: summarize_candidate, get_source_excerpt, find_related_rules, suggest_revision
  • Operator actions (not agent-mediated): approve_candidate, reject_candidate, request_revision (Maya invokes these via UI or API)
  • UI view: Operations app → Enshrinement Queue → candidate detail
  • State change: RuleCandidate → Rule (on approve), or status = rejected, or revision_requested_at set
  • Codex detail: World Model System — Evolution Loop
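
Step 6 says that approving a candidate validates its relationships, including that depends-on stays acyclic. A minimal sketch of such a check, assuming the depends-on edges are available as a plain mapping from rule id to the ids it depends on (the function name and data shape are hypothetical):

```python
from typing import Dict, List


def has_dependency_cycle(depends_on: Dict[str, List[str]]) -> bool:
    """Depth-first search that reports whether any depends-on cycle exists."""
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}

    def visit(rule_id: str) -> bool:
        state[rule_id] = IN_PROGRESS
        for dep in depends_on.get(rule_id, []):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                return True  # reached a rule still on the current path
            if dep_state == UNVISITED and visit(dep):
                return True
        state[rule_id] = DONE
        return False

    return any(
        state.get(rule_id, UNVISITED) == UNVISITED and visit(rule_id)
        for rule_id in depends_on
    )


ok_graph = {"r2": ["r1"], "r3": ["r1", "r2"], "r1": []}
bad_graph = {"r1": ["r2"], "r2": ["r3"], "r3": ["r1"]}
print(has_dependency_cycle(ok_graph), has_dependency_cycle(bad_graph))
```

In this reading, an approval that would make the check return True is refused before the candidate becomes a rule.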

7 · Maya iterates on coverage with the World Agent

Maya runs additional distillation passes as she notices gaps, such as an edge case the first pass missed or a sub-section of a publication that no candidate cites. Each pass is smaller than the first; convergence reflects the problem space narrowing.

She periodically asks the World Agent (accessible from inside the same Operations app, per Agent Architecture — World Agent) for a read-oriented check:

Maya: “Where do you think coverage is weakest right now?”

World Agent: “MFS-with-itemizing-spouse has one rule but the conditions that trigger it aren’t fully expressed. Consider a second rule covering the considered-unmarried path at Pub 501 §MFS, drawing on the Schedule A material you cross-referenced earlier.”

Coverage is considered sufficient when the world model crosses the 50-enshrined-rule floor and the World Agent reports no critical coverage gaps inside the in-scope problem spaces.

  • Ops Agent tools invoked: request_distillation, get_coverage_summary
  • World Agent tools invoked (via separate chat or separate routing): coverage_report, identify_gaps, trace_provenance
  • UI view: Operations app → Enshrinement Queue + World Models → Coverage
  • State change: Continued RuleCandidate → Rule transitions
  • Codex detail: Agent Architecture, World Model System — World Agent
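
The sufficiency condition stated above combines a rule-count floor with the World Agent's gap report. A minimal sketch, assuming "crosses the floor" means at least 50 enshrined rules and that gaps arrive as a list of critical items; both the function and its parameters are hypothetical.

```python
from typing import List

ENSHRINED_RULE_FLOOR = 50  # floor named in the walkthrough


def coverage_sufficient(enshrined_count: int, critical_gaps: List[str]) -> bool:
    """True when the floor is met and no critical in-scope gaps remain."""
    # Assumption: reaching exactly 50 counts as crossing the floor.
    return enshrined_count >= ENSHRINED_RULE_FLOOR and not critical_gaps


print(coverage_sufficient(52, []))
print(coverage_sufficient(52, ["MFS-with-itemizing-spouse conditions"]))
```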

8 · Publication: drafts, edits, review gate, publish

When coverage is sufficient, Maya tells the Ops Agent she’s ready to publish. The agent:

  • Bundles all enshrined rules into a publication draft
  • Drafts release notes describing the initial scope, the source inventory, and the deferred areas — including the dependents problem space (Pub 503 deferred at step 4) and itemized deductions beyond the MFS cross-reference — so downstream consumers understand what isn’t in the version
  • Marks the publication as draft for Maya’s review

Release notes are not marketing copy. They are the restatement mechanism (ADR-026). For v0.1.0 there are no prior versions to restate against, but the scaffolding for future-version corrections is established now. The authority claim on the v0.1.0 card acknowledges single-operator curation and the absence of an evolution track record; see System Card — Alpha-posture acknowledgment for how that calibration is made honest.

Maya edits the release notes — agent drafts are useful starting points, not final copy.

World Agent review gate. Before publication, the World Agent runs a structural and factual consistency check on the release notes by calling review_release_notes(version_id). It validates every rule change against the diff, every citation against ingested source materials, and every claim in the release notes against the corresponding world model card claim. The review returns one of three outcomes:

  • Pass. All structural and consistency invariants hold; publication proceeds.
  • Warnings. Subtler concerns (tone drift, prose inconsistencies, possibly-stale citations). Maya can override with an audit trail; the override is recorded against the publication.
  • Block. Missing required entries, broken citations, or factual contradictions. Maya cannot override; she addresses the issue and re-submits.
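
The three outcomes differ only in whether Maya can proceed and whether an override is available. A sketch of that decision, with hypothetical names (`ReviewOutcome`, `can_publish`) that stand in for whatever the publication path actually uses:

```python
from enum import Enum


class ReviewOutcome(Enum):
    PASS = "pass"
    WARNINGS = "warnings"
    BLOCK = "block"


def can_publish(outcome: ReviewOutcome, operator_override: bool = False) -> bool:
    """Decide whether publication may proceed after the review gate."""
    if outcome is ReviewOutcome.PASS:
        return True
    if outcome is ReviewOutcome.WARNINGS:
        # Proceeds only with an explicit, recorded operator override.
        return operator_override
    # BLOCK cannot be overridden; the notes must be fixed and re-submitted.
    return False


print(can_publish(ReviewOutcome.WARNINGS, operator_override=True))
print(can_publish(ReviewOutcome.BLOCK, operator_override=True))
```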

The review record itself becomes part of the authority surface — the publication ships with the agent-review pass attached, so future readers can verify which review cleared the version. (See System Card and Operations — Release Notes for the complete posture.)

Maya publishes. The publication call mints the EvaluationAuthorityRef and triggers world model card generation.

  • Ops Agent tools invoked: draft_publication, draft_release_notes, edit_release_notes, request_publication
  • World Agent tools invoked: review_release_notes
  • Operator actions (not agent-mediated): Final publish click — the last governed human gate
  • UI view: Operations app → World Models → (this one) → Publish
  • State change: WorldModelVersion row with version = 0.1.0, status = published, authority_ref minted; ReleaseNotes row persisted; review-record row persisted alongside
  • Codex detail: Operations — Version Publication, Operations — Release Notes

9 · World model card generation

Publication kicks off world model card generation. The card is a public artifact derived from the published version: the ~50 enshrined rules accumulated through steps 6–7, broken down by problem space, with provenance-tier distribution, coverage summary, and the authority metadata downstream consumers need to cite the version. The card carries a posture-acknowledgment footer covering single-operator curation and the limited evolution history.

The card lives as a PDF download from the Operations app card view. Customers who want to evaluate whether to trust a scan result that cited this world model retrieve and inspect it from there. Public-URL hosting at a stable address with cryptographic signing is part of the staged trust model — see System Card — Alpha and post-alpha signing for the full progression.

  • Ops Agent tools invoked: none — generation is a background job triggered by the publication call
  • UI view: Operations app → World Models → Cards → v0.1.0
  • State change: WorldModelCard metadata row created with version = 0.1.0; rendered PDF written to storage; download URL available from the card view
  • Codex detail: World Model System — System Card
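
The breakdowns the card reports (rules by problem space, provenance-tier distribution) are simple aggregations over the enshrined rules. The sketch below uses a handful of made-up rule records purely to show the shape of the computation; the field names and the `card_summary` function are assumptions, and the counts are not real card data.

```python
from collections import Counter
from typing import Dict, List

# Illustrative records only; a real card would summarize ~50 enshrined rules.
sample_rules: List[Dict[str, str]] = [
    {"problem_space": "filing_status", "tier": "authoritative"},
    {"problem_space": "filing_status", "tier": "curated"},
    {"problem_space": "standard_deduction", "tier": "authoritative"},
]


def card_summary(rules: List[Dict[str, str]], version: str) -> dict:
    """Aggregate the counts a world model card would display."""
    return {
        "version": version,
        "total_rules": len(rules),
        "by_problem_space": dict(Counter(r["problem_space"] for r in rules)),
        "by_tier": dict(Counter(r["tier"] for r in rules)),
    }


summary = card_summary(sample_rules, version="0.1.0")
print(summary)
```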

By the end of this walkthrough, the following surfaces have been operated end-to-end:

| Surface | Exercised by |
| --- | --- |
| World-model creation + scoping | Step 1–2 |
| Source-material ingestion with provenance | Step 4 |
| Distillation pipeline | Step 5 |
| Enshrinement queue + governed promotion | Step 6–7 |
| World Agent exploration access | Step 7 |
| Publication + authority minting | Step 8 |
| World Agent release-notes review gate | Step 8 |
| World model card generation | Step 9 |
| Ops Agent task surface | Steps 1–8 |
| Dual-occupant UI-agent parity | Every step (UI actions mirror agent actions) |

Relationship to the first-customer walkthrough

This walkthrough ends at v0.1.0 published. The first-customer walkthrough picks up from there: a customer attaches traces, the Stage 1 “Observe and Recommend” posture kicks in, and the trust progression climbs toward Stage 3 managed optimization over time.