Operations Agent

The Operations Agent is the task-oriented collaborator the Spectral operations team uses to drive world-model-lifecycle work inside the Operations app. Where the World Agent is read-oriented (understand the world), the Operations Agent is write-oriented (act on the world-authoring surface).

Role

Scoped to the apps/operations surface. The Ops Agent runs operator-initiated tasks that would otherwise require manual UI navigation and API stitching:

World-model authoring. Drafting rule candidates from Authoritative-tier source material, staging them into the Evolution Loop, reviewing promotion queues.
Distillation. Proposing condensations of redundant rules, flagging rules with drifted provenance chains, kicking off LLM-guided distillation runs against source documents.
Publication. Generating WorldModelCard drafts, reviewing release notes, coordinating version publication.
Operational observability. Summarizing scan-volume trends, convergence-signal health, customer adoption of world-model-grounded vs customer-directed stimuli.

The canonical role description lives in Agent Architecture — Operations Agent.

Tool surface

Scoped to the Operations-app API. The Ops Agent does not reach into worlds or platform internals — every tool call is an HTTP request to apps/api (the same path the human operator’s UI takes) with the same auth and RBAC.

The table below is the authoritative tool list. Each tool is classified read / suggest / mutate, and mutations are further classified by whether call-time approval is required or whether approval lives at the enshrinement/publication gate only.

Classification rule

read — no mutation, no approval. The agent may call freely.
suggest — produces candidate output (text, draft, proposal) that the operator reviews before acting. No call-time approval; the operator reviewing the output is the gate.
mutate — writes state.
- Gated by enshrinement/publication. Mutation is not itself authoritative — the governed enshrinement or publication gate is where human authority is exercised. No call-time approval required.
- Gated at call-time. Mutation has authority or audit sensitivity that the enshrinement gate does not cover (retracting a published rule, anchoring a new Authoritative-tier source, publishing a version). Operator must approve inline in the chat surface before the tool call executes.

The tool surface

Category	Tool	Class	Call-time approval?
World-model lifecycle	`list_world_models`	read	—
	`get_world_model_status`	read	—
	`create_world_model`	mutate	Yes (first time) / pre-approvable per workspace scope
	`retire_world_model`	mutate	Yes — always
Source materials	`search_sources`	read	—
	`list_attached_sources`	read	—
	`attach_source_material`	mutate	Yes — Authoritative-tier anchoring matters
	`ingest_source`	mutate	No — ingestion is a background job; output gated at enshrinement
Distillation	`request_distillation`	suggest	No — candidates are gated at enshrinement
	`check_distillation_status`	read	—
	`get_candidates`	read	—
Rule authoring (candidates only)	`propose_rule_candidate`	mutate	No — candidate is not authoritative
	`revise_candidate`	mutate	No — still in candidate state
	`retire_rule`	mutate	Yes — retiring a published rule is version-level
Candidate review (read-only for agent)	`list_pending_candidates`	read	—
	`get_candidate_detail`	read	—
	`get_conformity_gate_status`	read	—
	`approve_candidate` / `reject_candidate` / `request_revision`	—	Operator-only — not on the agent’s surface
Publication	`list_enshrined_rules`	read	—
	`draft_release_notes`	suggest	No — operator edits before commit
	`draft_world_model_card`	suggest	No — operator edits before commit
	`request_publication`	mutate	Yes — the version-authority event
Status / health	`get_latest_card`	read	—
	`get_rule_health_signal`	read	—
	`get_coverage_summary`	read	—
DLQ inspection (ADR-060 D7)	`list_dlq_events`	read	—
	`get_dlq_event_detail`	read	—
	`replay_dlq_event`	mutate	Yes — call-time approval; calls `core.outbox_replay()`
Cluster triage (ADR-060 D8)	`list_failure_clusters`	read	—
	`get_cluster_detail`	read	—
	`triage_cluster`	mutate	Yes — call-time approval; updates `platform.rule_candidates_pending` operator-managed columns
Bridge into worlds	`ask_world_agent`	read	— (composes via in-process DI through the workers entrypoint, per agent tool invocation)

What stays operator-only

approve_candidate, reject_candidate, and request_revision are not on the Ops Agent’s tool surface. These are the governed enshrinement-gate actions and they are exclusively human-operator actions (UI click or explicit operator API call). The Ops Agent can summarize a candidate, cross-reference, and flag — but it cannot promote.

Similarly, the final publish click is operator-only; request_publication prepares the publication bundle and requires the operator’s call-time approval, after which the publication is durable. There is no “agent auto-publishes” path.

The boundary is structural, not procedural: the Ops Agent is a draft and propose layer; the human is the authority gate. That separation is what makes the world-model authority claim hold externally — the agent’s drafts can be sophisticated, but every promotion to Enshrined and every published version carries an operator’s identity in the audit record. Removing that gate would mean Spectral itself can author the standard customers cite, which collapses the authority-isolation argument the Codex makes in How Spectral Works — Why two pillars.
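One way to make that boundary structural in code is to refuse to build a tool registry that contains an enshrinement-gate action. The sketch below is an assumption about how such a guard might look; `OPERATOR_ONLY` and `build_tool_registry` are hypothetical names, and only the three tool names come from this page.

```python
from typing import Callable, Dict, Mapping

# The governed enshrinement-gate actions named above.
OPERATOR_ONLY = frozenset({"approve_candidate", "reject_candidate", "request_revision"})


def build_tool_registry(tools: Mapping[str, Callable[..., object]]) -> Dict[str, Callable[..., object]]:
    """Return a copy of `tools`, failing fast if an operator-only action leaked in."""
    leaked = OPERATOR_ONLY.intersection(tools)
    if leaked:
        raise ValueError(f"operator-only tools must not be registered: {sorted(leaked)}")
    return dict(tools)
```

A startup-time check like this fails before the agent ever runs, which is a stronger guarantee than a runtime policy check on each call.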

Tool infrastructure inherited from the Spectral Agent

The Ops Agent reuses the Spectral Agent’s tool infrastructure rather than defining a parallel one. Every tool above is decorated with the same @observed_tool decorator the Spectral Agent uses (per ADR-060) — same OTel span shape, same (workspace_id, purpose) accounting, same correlation-ID propagation. Tool errors raise the same four-class ToolError taxonomy (ToolUserError, ToolPolicyError, ToolTransientError, ToolTerminalError); the LangGraph orchestrator routes errors through the same LLM-mediated handling path so the Ops Agent decides retry / modify / surface / abandon the same way the Spectral Agent does. There is no Ops-Agent-specific error class and no Ops-Agent-specific observability wrapper — the Operations app inherits the platform’s agent-tool conventions wholesale.
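A minimal sketch of the shared conventions follows. The four error class names are taken from the paragraph above; everything else is an assumption made for illustration. In particular, this `observed_tool` is a stand-in that appends plain dicts to a list instead of emitting OTel spans, and its keyword-only signature is hypothetical, not the platform decorator's real one.

```python
import functools
from typing import Any, Callable, Dict, List


class ToolError(Exception):
    """Base of the four-class taxonomy."""


class ToolUserError(ToolError): ...
class ToolPolicyError(ToolError): ...
class ToolTransientError(ToolError): ...
class ToolTerminalError(ToolError): ...


SPANS: List[Dict[str, Any]] = []  # stand-in for a span exporter (assumption)


def observed_tool(func: Callable[..., Any]) -> Callable[..., Any]:
    """Record one span-like dict per call, including the error class if any."""

    @functools.wraps(func)
    def wrapper(*, workspace_id: Any, purpose: str, correlation_id: str, **kwargs: Any) -> Any:
        span = {
            "tool": func.__name__,
            "workspace_id": workspace_id,
            "purpose": purpose,
            "correlation_id": correlation_id,
            "error_class": None,
        }
        try:
            return func(**kwargs)
        except ToolError as exc:
            span["error_class"] = type(exc).__name__
            raise  # the orchestrator, not the decorator, decides what happens next
        finally:
            SPANS.append(span)

    return wrapper


@observed_tool
def replay_dlq_event(event_id: str) -> str:
    # Toy body; the real tool is described in the table above.
    if not event_id:
        raise ToolUserError("event_id is required")
    return f"replayed {event_id}"
```

The decorator deliberately re-raises: the retry / modify / surface / abandon decision stays with the orchestrator's LLM-mediated path rather than being hard-coded per error class.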

How call-time approval is captured

Inline in the chat surface. When the agent is about to invoke a mutate-with-call-time-approval tool, it emits a structured approval request (tool name, parameters, natural-language summary of what will happen) via LangGraph’s interrupt() per agent tool invocation / Approval mechanism. The operator approves or denies with a single interaction. Rejections abort the tool call and surface back to the agent so it can revise.

Every call-time approval decision appends an operations_agent_approval row capturing (operator_id, conversation_id, tool_name, args_summary, decision, decided_at, correlation_id): records, not memory. The row is operator-scoped (no workspace RLS) and append-only. It mirrors the approval_decision shape from the operational control plane (see evolution-loop / Reviewer surface) and sits in the same identity domain (both app.user_id-keyed, operator-scoped), but covers a different action surface (Ops-Agent call-time approval vs enshrinement-gate decision). It is distinct from the Spectral Agent’s workspace-scoped agent_approval table: same record shape, different identity scope.
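The request, decision, and record flow described in this section can be sketched without any framework dependency. The field names of the record come from the text above; `ApprovalRequest`, `run_gated_tool`, the `"approved"` / `"denied"` values, and the choice to store the natural-language summary in `args_summary` are assumptions for illustration. In the real runtime the `ask_operator` callback would be the pause point provided by LangGraph's `interrupt()`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ApprovalRequest:
    tool_name: str
    parameters: Dict[str, Any]
    summary: str  # natural-language description of what will happen


@dataclass(frozen=True)
class OperationsAgentApproval:
    operator_id: UUID
    conversation_id: UUID
    tool_name: str
    args_summary: str
    decision: str  # "approved" or "denied" (value names are assumptions)
    decided_at: datetime
    correlation_id: str


def run_gated_tool(
    request: ApprovalRequest,
    ask_operator: Callable[[ApprovalRequest], bool],
    execute: Callable[..., Any],
    *,
    operator_id: UUID,
    conversation_id: UUID,
    correlation_id: str,
    log: List[OperationsAgentApproval],
) -> Any:
    """Ask for approval, append one record, then execute only if approved."""
    approved = ask_operator(request)
    log.append(
        OperationsAgentApproval(
            operator_id=operator_id,
            conversation_id=conversation_id,
            tool_name=request.tool_name,
            args_summary=request.summary,
            decision="approved" if approved else "denied",
            decided_at=datetime.now(timezone.utc),
            correlation_id=correlation_id,
        )
    )
    if not approved:
        return None  # surfaced back to the agent so it can revise
    return execute(**request.parameters)


log: List[OperationsAgentApproval] = []
executed: List[str] = []
operator, conversation = uuid4(), uuid4()
request = ApprovalRequest("retire_world_model", {"world_model_id": "wm-1"}, "Retire world model wm-1")


def retire(world_model_id: str) -> str:
    executed.append(world_model_id)
    return "retired"


denied = run_gated_tool(request, lambda r: False, retire, operator_id=operator,
                        conversation_id=conversation, correlation_id="c-1", log=log)
approved = run_gated_tool(request, lambda r: True, retire, operator_id=operator,
                          conversation_id=conversation, correlation_id="c-2", log=log)
```

The record is written before the tool executes, so a crash during execution still leaves the operator's decision in the log.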

This mirrors the Spectral-Agent approval-ladder model (per Agent Architecture — Human-in-the-loop approval) but is scoped to the operator seat.

Framework-layer composition — bridge tools live in `apps/*` per ADR-065 D5

The Ops Agent never imports spectral.worlds.*. Bridge tools live in apps/* framework deliverables, never in caller-context code (per ADR-065 D3 + D5, ADR-063, and ADR-060).

For the ask_world_agent tool specifically:

The callee-owned OHS Protocol WorldAgentRunner lives in spectral.worlds.contracts.protocols.world_agent (per ADR-065 D3 — see Protocols for the catalog).
The concrete implementation lives in spectral.worlds.application.
The bridge tool — the ask_world_agent callable that the Ops Agent invokes — lives in apps/workers (or a future apps/mcp-world-bridge/ for an MCP server, per ADR-065 D5). The bridge imports WorldAgentRunner (framework-to-context, allowed under validator rule 7) and composes into the Ops Agent’s tool list via DI at workers startup.
The Ops Agent (in spectral.platform.application) sees an opaque tool callable — never the Protocol or its implementation. Caller-context code never imports another context.

Platform-internal Protocols consumed by Ops Agent tools (OutboxReader, OutboxReplayer from spectral.platform.application.operations_agent.outbox) follow the standard within-context DI pattern; they don’t cross a context boundary, so they don’t need the framework-layer bridge. The workers entrypoint composes both surfaces — bridges into worlds and within-platform dependencies — into the agent’s tool registry. Each tool factory (ask_world_agent, list_dlq_events, get_dlq_event_detail, replay_dlq_event, list_failure_clusters, get_cluster_detail, triage_cluster) is unit-testable against mock Protocols.
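The composition described above, covering both the bridge into worlds and the within-platform dependencies, might look roughly like the sketch below. The Protocol names come from this section, but their method signatures (`ask`, `list_dead_letters`, `replay`), the factory names, and `compose_tool_registry` are hypothetical and stand in for whatever the real contracts define.

```python
from typing import Callable, Dict, List, Protocol


class WorldAgentRunner(Protocol):  # callee-owned; method shape is an assumption
    def ask(self, question: str) -> str: ...


class OutboxReader(Protocol):  # platform-internal; method shape is an assumption
    def list_dead_letters(self) -> List[str]: ...


class OutboxReplayer(Protocol):  # platform-internal; method shape is an assumption
    def replay(self, event_id: str) -> bool: ...


def make_ask_world_agent(runner: WorldAgentRunner) -> Callable[[str], str]:
    # The agent receives only this closure, never the Protocol or its implementation.
    def ask_world_agent(question: str) -> str:
        return runner.ask(question)

    return ask_world_agent


def make_list_dlq_events(reader: OutboxReader) -> Callable[[], List[str]]:
    def list_dlq_events() -> List[str]:
        return reader.list_dead_letters()

    return list_dlq_events


def make_replay_dlq_event(replayer: OutboxReplayer) -> Callable[[str], bool]:
    def replay_dlq_event(event_id: str) -> bool:
        return replayer.replay(event_id)

    return replay_dlq_event


def compose_tool_registry(
    runner: WorldAgentRunner, reader: OutboxReader, replayer: OutboxReplayer
) -> Dict[str, Callable]:
    """What a workers-entrypoint composition root could do at startup."""
    return {
        "ask_world_agent": make_ask_world_agent(runner),
        "list_dlq_events": make_list_dlq_events(reader),
        "replay_dlq_event": make_replay_dlq_event(replayer),
    }


class FakeRunner:
    def ask(self, question: str) -> str:
        return f"echo: {question}"


class FakeOutbox:
    def __init__(self) -> None:
        self.replayed: List[str] = []

    def list_dead_letters(self) -> List[str]:
        return ["evt-1"]

    def replay(self, event_id: str) -> bool:
        self.replayed.append(event_id)
        return True


fake_outbox = FakeOutbox()
registry = compose_tool_registry(FakeRunner(), fake_outbox, fake_outbox)
```

Because each factory takes a Protocol and returns a plain callable, the fakes above are enough to exercise every tool without importing anything from another context.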

Memory

Operations-app scoped. Tracks the operator’s in-flight tasks, pending reviews, and recent signal digests. Independent from both Spectral and World Agent memory.

The logical model is operator-keyed, three tiers, with no cross-operator sharing. Schema location is platform.operations_agent_memory per ADR-059 D1: contexts are code boundaries, not schema boundaries; all Operations Agent code lives under src/spectral/platform/, so memory lands in the platform schema alongside the Spectral Agent’s memory tables.

Tiers

The Ops Agent parameterizes the universal interaction / session / persistent lifecycle from ADR-058 D1 and agent memory primitives. The tier names are universal; the agent-domain scope each tier inherits (chat thread / operator session / operator) is the Ops-Agent-specific parameterization.

Tier	Scope parameterization	Example contents
Interaction	Current chat thread	In-progress task parameters, tool-call arguments, the operator’s current intent
Session	One operator’s currently active operator-session	Pending reviews the operator has been working through, recent signal digests they asked about, draft release notes they started
Persistent	One operator, durable across operator-sessions	Pinned preferences (which problem spaces they focus on), long-running task state (a draft world-model authoring effort that spans weeks), bookmarked candidates

Persistent-tier typology distribution: procedural-dominant with semantic permitted (signal digests asked about, durable inferences). No persistent-episodic produced.
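As a compact illustration of the tier parameterization and the typology constraint, the following sketch encodes the table and the persistent-tier rule. `Tier`, `Typology`, `TIER_SCOPE`, and `is_allowed` are hypothetical names; the only rule encoded is the one stated above (no persistent-episodic rows), since this page states no constraint for the other tiers.

```python
from enum import Enum


class Tier(Enum):
    INTERACTION = "interaction"
    SESSION = "session"
    PERSISTENT = "persistent"


class Typology(Enum):
    PROCEDURAL = "procedural"
    SEMANTIC = "semantic"
    EPISODIC = "episodic"


# Ops-Agent-specific scope each universal tier inherits.
TIER_SCOPE = {
    Tier.INTERACTION: "chat thread",
    Tier.SESSION: "operator session",
    Tier.PERSISTENT: "operator",
}


def is_allowed(tier: Tier, typology: Typology) -> bool:
    """Reject only the combination this page rules out: persistent + episodic."""
    return not (tier is Tier.PERSISTENT and typology is Typology.EPISODIC)
```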

Storage

Lives in the platform schema as platform.operations_agent_memory per ADR-059 D1. The Ops Agent runs in workers per Runtime placement; the memory is internal to platform-side application code. The RLS policy is keyed to operator_id (FK → core.users.id) via current_setting('app.user_id')::uuid per ADR-059 D6. The session var names identity, not capability (SESSION_VAR_USER_ID = "app.user_id", added to spectral.core per ADR-065 admission discipline). The API auth middleware (and the workers transaction-setup path the Ops Agent runtime uses) issues SET LOCAL app.user_id = <jwt_subject> in operator-context transactions per ADR-041 D3; apps/operations is a frontend SPA and never opens DB sessions.

App-layer enforcement is primary; RLS is the backstop applied consistently to internal-only data (no skipping on the basis of internal-only-ness).
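The identity-setting step and the RLS backstop can be sketched as follows. This is a hedged illustration, not the platform's code: the policy name `operator_isolation` and the helper `set_operator_identity_sql` are hypothetical, and the `%s` placeholders assume a psycopg-style driver. The sketch uses PostgreSQL's `set_config(name, value, is_local)` with `is_local = true`, which has the same transaction-local effect as `SET LOCAL` but accepts bind parameters.

```python
from typing import Tuple
from uuid import UUID

SESSION_VAR_USER_ID = "app.user_id"  # name taken from this page

# Illustrative policy DDL; the policy name is an assumption.
RLS_POLICY_SQL = """
ALTER TABLE platform.operations_agent_memory ENABLE ROW LEVEL SECURITY;
CREATE POLICY operator_isolation ON platform.operations_agent_memory
    USING (operator_id = current_setting('app.user_id')::uuid);
"""


def set_operator_identity_sql(jwt_subject: str) -> Tuple[str, Tuple[str, str]]:
    """Build the statement a transaction-setup path could run first.

    Validating the subject as a UUID up front means a malformed token fails
    here rather than at the ::uuid cast inside the policy.
    """
    operator_id = UUID(jwt_subject)  # raises ValueError on a non-UUID subject
    return "SELECT set_config(%s, %s, true)", (SESSION_VAR_USER_ID, str(operator_id))
```

A caller would execute the returned statement as the first step of each operator-context transaction, so the setting disappears at commit or rollback.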

The schema mirrors ADR-058 — a single table with joint tier + typology discriminator columns, plus per-(tier, typology) partial HNSW indexes on the embeddings retrievable table.

Retention

Retention is scope-inheritance, not direct TTL registration:

Interaction: lifetime of the chat thread. Removed when the thread ends, unless action-linked.
Session: inherits the operator session’s lifecycle. 30-day idle defines session-close; compounding-and-archive runs at close. Session-tier rows without action-linkage are removed at scope-end.
Persistent: registers a RetentionPolicy directly per data retention. Procedural memory is decay-exempt; semantic memory uses validity-window decay.

The audit posture is action-linkage-based — any-tier memory load-bearing for audit (action-linkage to rule promotion, change-set acceptance, published decision) is retained regardless of tier.
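The retention rules above reduce to two small predicates. The sketch below is an interpretation under stated assumptions: `MemoryRow`, `survives_scope_end`, and `is_decay_exempt` are hypothetical names, tiers and typologies are modeled as plain strings, and the validity-window decay for semantic memory is not modeled because this page does not specify it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRow:
    tier: str  # "interaction" | "session" | "persistent"
    typology: str  # "procedural" | "semantic" | "episodic"
    action_linked: bool  # load-bearing for audit


def survives_scope_end(row: MemoryRow) -> bool:
    """Would this row still exist after its thread or session ends?"""
    if row.action_linked:
        return True  # audit linkage overrides tier
    if row.tier in ("interaction", "session"):
        return False  # removed with the scope it inherits from
    return True  # persistent rows follow their own RetentionPolicy instead


def is_decay_exempt(row: MemoryRow) -> bool:
    """Only persistent procedural memory is stated to be decay-exempt."""
    return row.tier == "persistent" and row.typology == "procedural"
```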

Version-spanning

Ops Agent memory is operator-keyed, not world-model-version-keyed. An operator’s memory persists across world-model versions because the operator’s workflow state (tasks, preferences, long-running efforts) does not have a version dependency. This is intentionally different from the World Agent’s persistent tier, which anchors on world_id for discovery continuity within a domain across versions (with world_version carried as contextual provenance per ADR-058 D5) — the Ops Agent’s durability is about operator continuity, not domain continuity.

Two operators working on the same world model each have their own Ops Agent memory. Neither can read the other’s. The operations team is small, but memory privacy is preserved regardless — different operators may have different preferences, different in-flight drafts, and different working hypotheses; commingling would collapse that clarity.

If operators need to share context (a pending review they want a colleague to pick up), the mechanism is the Operations-app’s shared task queue, not the agent’s private memory.

Non-negotiable boundaries — workshop framing

The Ops Agent’s memory is a workshop, not a canonical-content cache, per ADR-059 D4. It holds thought-in-progress plus workflow meta-state — never canonical content. Rule interpretation lives with the rule corpus and its tools; the agent reasons with rules to drive workflow rather than over rules to produce semantic claims, so the primary defense against rule-content shadowing is agent-design discipline, not a structural trigger.

Does not mirror world-model rule content. Rules remain authoritative in the world model itself; the Ops Agent accesses rules by reference. Semantic memory holds workflow meta-knowledge (“operator was working on conformity-gate coverage for World X”), not rule content (“the conformity gate says Y”).

The trigram-similarity trigger against worlds.rules.body_text (the same contract as ADR-058 D8 — BEFORE INSERT OR UPDATE OF body_text, 100-character floor, 0.85 similarity reject, scoped by world_id_context) is a defense-in-depth backstop here per ADR-059 D4 — the primary defense is the workshop-discipline framing above; the trigger catches doctrine drift rather than carrying the load. Trigger fires only on typology = semantic; procedural and episodic memory skip it (operator-pattern and conversation-transcript shapes do not shadow rule body).
Does not mirror scan data. Scan traces and evaluation results remain authoritative in platform; the Ops Agent accesses summaries by API call.
Does not persist customer PII. Customer context in Ops Agent conversations is ephemeral; durable customer data stays in workspace-scoped tables under RLS.
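To make the trigram backstop concrete, here is a pure-Python approximation of the check. It is a sketch, not the trigger: the real contract runs in the database, and this version only imitates pg_trgm-style word trigrams. The 100-character floor and the 0.85 threshold come from the text above; applying the floor to the candidate text, treating the threshold as inclusive, and the function names are assumptions. Both sample strings are invented for the example.

```python
import re
from typing import Set

MIN_LENGTH = 100  # 100-character floor
REJECT_AT = 0.85  # similarity at or above this is rejected (inclusive is an assumption)


def trigrams(text: str) -> Set[str]:
    """Word-level trigrams with pg_trgm-style padding (two spaces before, one after)."""
    grams: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams across both texts."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def would_reject(candidate: str, rule_body: str) -> bool:
    """Would a semantic-memory write be rejected as shadowing this rule body?"""
    if len(candidate) < MIN_LENGTH:
        return False  # below the floor, the check is skipped
    return similarity(candidate, rule_body) >= REJECT_AT


# Invented sample texts, for illustration only.
RULE_BODY = (
    "The conformity gate requires every candidate rule to cite at least one "
    "Authoritative-tier source before it can be promoted to Enshrined status."
)
WORKFLOW_NOTE = (
    "Operator was working on conformity-gate coverage for World X and paused "
    "after reviewing three pending candidates in the promotion queue."
)
```

A verbatim copy of the rule body is rejected, while the workflow note, which refers to the rule without restating it, passes. That is the distinction between rule content and workflow meta-knowledge the list above draws.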

What the Operations Agent is NOT

Not the World Agent. Task-oriented vs read-oriented. See the boundary table in Agent Architecture — Operations Agent vs World Agent.
Not customer-facing at any tier. Every output is internal. Customer-facing artifacts (WorldModelCards, release notes) ship through the platform’s publication path, not through the agent’s conversational surface.
Not a free-form world-model mutator. Every mutation that touches the rule corpus passes through the Evolution Loop’s governed promotion gate with human sign-off.
Not a replacement for the human operator. The Ops Agent accelerates operator work; it does not remove the operator from the authority loop.

Interaction with the World Agent

Both agents are accessible to the operator inside the Operations app. The World Agent is accessible in the app but lives in spectral.worlds; the Ops Agent is owned by the app. They do not share memory, do not share tools, and do not delegate tasks to each other.

The operator is the integration point. The operator summons each agent for its own purpose:

Ops Agent: “Curate these three candidates into the enshrinement queue.”
World Agent: “Where do you think coverage is weakest in Schedule C?”

Interaction pattern — hybrid (separate surfaces + Ops-invokes-World tool)

Default: two separate chat surfaces. The Operations app renders the Ops Agent and the World Agent as two distinct chat surfaces. The operator explicitly switches between them. There is no unified supervisor-routed chat.

Extension: ask_world_agent(question) tool. The Ops Agent carries a single read-oriented tool that opens a synthesized question to the World Agent and surfaces the answer in the Ops chat. Used when the Ops Agent’s task planning genuinely needs exploration context (e.g., drafting a release-notes narrative benefits from the World Agent’s coverage summary).

No cross-state. The ask_world_agent tool is stateless from the World Agent’s perspective. The World Agent’s session-tier memory is scoped to its own chat surface; a tool-invoked question does not contribute to its session memory and does not read from it. What the World Agent knows about the world model is what the world model contains — the tool call gets an answer grounded in current world-model state, no more.

Why not unified supervisor routing

A supervisor-routed chat hides which agent is speaking. The distinction between do (Ops) and explore (World) is load-bearing for operator coherence — the operator must always know whether they are acting or reflecting. Routing obscures that. The separate-surfaces default makes the distinction structural.

Why not “World Agent is only a tool”

Making the World Agent accessible only as an Ops Agent tool would remove the operator’s ability to just explore. Sometimes the operator wants to sit with the World Agent and ask open-ended coverage questions without a task in mind. That use case is the whole point of the World Agent’s operator-facing access; collapsing it into Ops-only access would lose it.

When the operator chooses which

Operator intent	Surface
“Do this for me” (curate, promote, draft, publish)	Ops Agent chat
“What do you think about…” (coverage, gaps, provenance weakness)	World Agent chat
Ops Agent needs coverage context mid-task (e.g., drafting release notes)	Ops Agent invokes `ask_world_agent` internally; operator sees the exchange in the Ops chat

The operator’s pattern is expected to be: default into the Ops Agent, occasionally side-trip to the World Agent for exploration, let the Ops Agent pull in World Agent context when tasks need it.

Operations App overview — architectural placement, dual-occupant parity
Agent Architecture — the three-agent topology
World Agent — the read-oriented agent operators also access
