Operations Agent
The Operations Agent is the task-oriented collaborator the Spectral operations team uses to drive world-model-lifecycle work inside the Operations app. Where the World Agent is read-oriented (understand the world), the Operations Agent is write-oriented (act on the world-authoring surface).
Scoped to the apps/operations surface. The Ops Agent runs operator-initiated tasks that would otherwise require manual UI navigation and API stitching:
- World-model authoring. Drafting rule candidates from Authoritative-tier source material, staging them into the Evolution Loop, reviewing promotion queues.
- Distillation. Proposing condensations of redundant rules, flagging rules with drifted provenance chains, kicking off LLM-guided distillation runs against source documents.
- Publication. Generating WorldModelCard drafts, reviewing release notes, coordinating version publication.
- Operational observability. Summarizing scan-volume trends, convergence-signal health, customer adoption of world-model-grounded vs customer-directed stimuli.
The canonical role description lives in Agent Architecture — Operations Agent.
Tool surface
Scoped to the Operations-app API. The Ops Agent does not reach into worlds or platform
internals — every tool call is an HTTP request to apps/api (the same path the human operator’s
UI takes) with the same auth and RBAC.
The table below is the authoritative tool list. Each tool is classified read / suggest /
mutate, and mutations are further classified by whether call-time approval is required or
whether approval lives at the enshrinement/publication gate only.
Classification rule
- read — no mutation, no approval. The agent may call freely.
- suggest — produces candidate output (text, draft, proposal) that the operator reviews before acting. No call-time approval; the operator reviewing the output is the gate.
- mutate — writes state.
  - Gated by enshrinement/publication. Mutation is not itself authoritative — the governed enshrinement or publication gate is where human authority is exercised. No call-time approval required.
  - Gated at call-time. Mutation has authority or audit sensitivity that the enshrinement gate does not cover (retracting a published rule, anchoring a new Authoritative-tier source, publishing a version). Operator must approve inline in the chat surface before the tool call executes.
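The classification rule can be sketched as a small data model. This is a minimal illustration, not the platform's actual types: `ToolClass`, `Gate`, `ToolSpec`, and `needs_call_time_approval` are hypothetical names, and only the tool names in `SPECS` come from this page.

```python
from dataclasses import dataclass
from enum import Enum


class ToolClass(Enum):
    READ = "read"
    SUGGEST = "suggest"
    MUTATE = "mutate"


class Gate(Enum):
    NONE = "none"                  # read / suggest: nothing gates the call itself
    ENSHRINEMENT = "enshrinement"  # mutate, gated later at enshrinement/publication
    CALL_TIME = "call_time"        # mutate, operator approves inline before execution


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical descriptor for one tool on the agent's surface."""

    name: str
    tool_class: ToolClass
    gate: Gate = Gate.NONE

    def __post_init__(self) -> None:
        # Only mutations carry a gate; read and suggest tools never do.
        if self.tool_class is not ToolClass.MUTATE and self.gate is not Gate.NONE:
            raise ValueError(f"{self.name}: only mutate tools can be gated")
        if self.tool_class is ToolClass.MUTATE and self.gate is Gate.NONE:
            raise ValueError(f"{self.name}: mutate tools must name a gate")


def needs_call_time_approval(spec: ToolSpec) -> bool:
    """True when the operator must approve inline before the call executes."""
    return spec.gate is Gate.CALL_TIME


SPECS = [
    ToolSpec("list_world_models", ToolClass.READ),
    ToolSpec("draft_release_notes", ToolClass.SUGGEST),
    ToolSpec("propose_rule_candidate", ToolClass.MUTATE, Gate.ENSHRINEMENT),
    ToolSpec("request_publication", ToolClass.MUTATE, Gate.CALL_TIME),
]
```

Encoding the gate as data rather than as per-tool branching keeps the approval decision in one place, which is one plausible way to keep the table and the runtime behavior from drifting apart.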
The tool surface
| Category | Tool | Class | Call-time approval? |
|---|---|---|---|
| World-model lifecycle | list_world_models | read | — |
| | get_world_model_status | read | — |
| | create_world_model | mutate | Yes (first time) / pre-approvable per workspace scope |
| | retire_world_model | mutate | Yes — always |
| Source materials | search_sources | read | — |
| | list_attached_sources | read | — |
| | attach_source_material | mutate | Yes — Authoritative-tier anchoring matters |
| | ingest_source | mutate | No — ingestion is a background job; output gated at enshrinement |
| Distillation | request_distillation | suggest | No — candidates are gated at enshrinement |
| | check_distillation_status | read | — |
| | get_candidates | read | — |
| Rule authoring (candidates only) | propose_rule_candidate | mutate | No — candidate is not authoritative |
| | revise_candidate | mutate | No — still in candidate state |
| | retire_rule | mutate | Yes — retiring a published rule is version-level |
| Candidate review (read-only for agent) | list_pending_candidates | read | — |
| | get_candidate_detail | read | — |
| | get_conformity_gate_status | read | — |
| | approve_candidate / reject_candidate / request_revision | — | Operator-only — not on the agent’s surface |
| Publication | list_enshrined_rules | read | — |
| | draft_release_notes | suggest | No — operator edits before commit |
| | draft_world_model_card | suggest | No — operator edits before commit |
| | request_publication | mutate | Yes — the version-authority event |
| Status / health | get_latest_card | read | — |
| | get_rule_health_signal | read | — |
| | get_coverage_summary | read | — |
| DLQ inspection (ADR-060 D7) | list_dlq_events | read | — |
| | get_dlq_event_detail | read | — |
| | replay_dlq_event | mutate | Yes — call-time approval; calls core.outbox_replay() |
| Cluster triage (ADR-060 D8) | list_failure_clusters | read | — |
| | get_cluster_detail | read | — |
| | triage_cluster | mutate | Yes — call-time approval; updates platform.rule_candidates_pending operator-managed columns |
| Bridge into worlds | ask_world_agent | read | — (composes via in-process DI through workers entrypoint per agent tool invocation) |
What stays operator-only
approve_candidate, reject_candidate, and request_revision are not on the Ops Agent’s
tool surface. These are the governed enshrinement-gate actions and they are exclusively
human-operator actions (UI click or explicit operator API call). The Ops Agent can summarize a
candidate, cross-reference, and flag — but it cannot promote.
Similarly, the final publish click is operator-only; request_publication prepares the
publication bundle and requires the operator’s call-time approval, after which the publication
is durable. There is no “agent auto-publishes” path.
The boundary is structural, not procedural: the Ops Agent is a draft and propose layer; the human is the authority gate. That separation is what makes the world-model authority claim hold externally — the agent’s drafts can be sophisticated, but every promotion to Enshrined and every published version carries an operator’s identity in the audit record. Removing that gate would mean Spectral itself can author the standard customers cite, which collapses the authority-isolation argument the Codex makes in How Spectral Works — Why two pillars.
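One way to make such a boundary structural is to refuse operator-only actions at the moment the agent's tool list is assembled, so no prompt or plan can reach them. The sketch below assumes a hypothetical `AgentToolRegistry`; only the three operator-only action names and the two example tool names are taken from this page.

```python
from typing import Callable, Dict, List

# The enshrinement-gate actions this page reserves for human operators.
OPERATOR_ONLY = frozenset({"approve_candidate", "reject_candidate", "request_revision"})


class AgentToolRegistry:
    """Hypothetical registry that rejects operator-only actions at registration."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        # Fail when the tool list is built, not when the agent tries to call.
        if name in OPERATOR_ONLY:
            raise PermissionError(f"{name} is an operator-only enshrinement-gate action")
        self._tools[name] = fn

    def names(self) -> List[str]:
        return sorted(self._tools)


registry = AgentToolRegistry()
registry.register("get_candidate_detail", lambda candidate_id: {"id": candidate_id})
registry.register("propose_rule_candidate", lambda body: {"state": "candidate", "body": body})
```

With this shape, attempting `registry.register("approve_candidate", ...)` raises at startup, which is the kind of failure a composition-time test can catch before any conversation happens.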
Tool infrastructure inherited from the Spectral Agent
The Ops Agent reuses the Spectral Agent’s tool infrastructure rather than defining a parallel one.
Every tool above is decorated with the same @observed_tool decorator the Spectral Agent
uses (per ADR-060) — same OTel span shape, same
(workspace_id, purpose) accounting, same correlation-ID propagation. Tool
errors raise the same four-class ToolError taxonomy (ToolUserError, ToolPolicyError,
ToolTransientError, ToolTerminalError); the LangGraph orchestrator routes errors through the same
LLM-mediated handling path so the Ops Agent decides retry / modify / surface / abandon the same
way the Spectral Agent does. There is no Ops-Agent-specific error class and no Ops-Agent-specific
observability wrapper — the Operations app inherits the platform’s agent-tool conventions wholesale.
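To illustrate the shape of that shared wrapper, here is a minimal stand-in. The four error class names come from the text above; everything else is an assumption. `observed_tool_sketch` is not the real @observed_tool, the `SPANS` list replaces an OTel exporter, and the keyword arguments are invented for the example.

```python
import functools
import uuid
from typing import Optional


class ToolError(Exception):
    """Base of the four-class taxonomy named above."""


class ToolUserError(ToolError): ...
class ToolPolicyError(ToolError): ...
class ToolTransientError(ToolError): ...
class ToolTerminalError(ToolError): ...


SPANS: list = []  # stand-in for a span exporter (assumption, not OTel)


def observed_tool_sketch(purpose: str):
    """Hypothetical decorator: records one span-like dict per tool call."""

    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, workspace_id: str, correlation_id: Optional[str] = None, **kwargs):
            span = {
                "tool": fn.__name__,
                "workspace_id": workspace_id,
                "purpose": purpose,
                # Propagate the caller's correlation ID, or mint one if absent.
                "correlation_id": correlation_id or str(uuid.uuid4()),
                "error_class": None,
            }
            try:
                return fn(*args, **kwargs)
            except ToolError as exc:
                # Record the taxonomy class, then re-raise for the orchestrator.
                span["error_class"] = type(exc).__name__
                raise
            finally:
                SPANS.append(span)

        return wrapper

    return decorate


@observed_tool_sketch(purpose="ops_agent.dlq")
def replay_dlq_event_stub(event_id: str) -> str:
    if not event_id:
        raise ToolUserError("event_id is required")
    return f"replayed:{event_id}"
```

The wrapper only observes and re-raises; deciding between retry, modify, surface, and abandon stays with the orchestrator, as the paragraph above describes.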
How call-time approval is captured
Inline in the chat surface. When the agent is about to invoke a mutate-with-call-time-approval
tool, it emits a structured approval request (tool name, parameters, natural-language summary of
what will happen) via LangGraph’s interrupt() per
agent tool invocation / Approval mechanism.
The operator approves or denies with a single interaction. Rejections abort the tool call and
surface back to the agent so it can revise.
Every call-time approval decision appends an operations_agent_approval row capturing
(operator_id, conversation_id, tool_name, args_summary, decision, decided_at, correlation_id) —
records, not memory. The row is operator-scoped (no workspace RLS) and append-only. It mirrors the
approval_decision shape from the operational control plane (see
evolution-loop / Reviewer surface):
same identity domain (both app.user_id-keyed, operator-scoped), different action
surface (Ops-Agent call-time approval vs enshrinement-gate decision). It is distinct from the Spectral
Agent’s workspace-scoped agent_approval table, which has the same record shape but a different identity scope.
This mirrors the Spectral-Agent approval-ladder model (per Agent Architecture — Human-in-the-loop approval) but is scoped to the operator seat.
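The approve-then-record flow can be sketched without LangGraph. In this illustration the `ask_operator` callback stands in for the pause-and-resume that the text attributes to interrupt(); `ApprovalLog`, `run_gated_tool`, the field types, and the "approved"/"denied" values are all assumptions. Only the field names follow the tuple listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class OperationsAgentApproval:
    # Field names follow the tuple listed above; the types are assumptions.
    operator_id: str
    conversation_id: str
    tool_name: str
    args_summary: str
    decision: str  # "approved" or "denied" (value names assumed)
    decided_at: datetime
    correlation_id: str


class ApprovalLog:
    """Append-only stand-in for the table: rows are added and read, never changed."""

    def __init__(self) -> None:
        self._rows: list = []

    def append(self, row: OperationsAgentApproval) -> None:
        self._rows.append(row)

    def rows(self) -> tuple:
        return tuple(self._rows)


def run_gated_tool(
    tool: Callable[..., Any],
    args: dict,
    *,
    ask_operator: Callable[[dict], bool],
    log: ApprovalLog,
    operator_id: str,
    conversation_id: str,
    correlation_id: str,
) -> Optional[Any]:
    request = {
        "tool_name": tool.__name__,
        "parameters": args,
        "summary": f"Call {tool.__name__} with {args}",
    }
    approved = ask_operator(request)  # the operator's single approve/deny interaction
    log.append(OperationsAgentApproval(
        operator_id=operator_id,
        conversation_id=conversation_id,
        tool_name=tool.__name__,
        args_summary=request["summary"],
        decision="approved" if approved else "denied",
        decided_at=datetime.now(timezone.utc),
        correlation_id=correlation_id,
    ))
    # A denial aborts the call; the caller gets None and can revise its plan.
    return tool(**args) if approved else None


def retire_world_model(world_model_id: str) -> str:
    """Stub for a tool that always requires call-time approval."""
    return f"retired {world_model_id}"
```

Writing the row before the tool executes means a denied request still leaves a record, which matches the "records, not memory" framing.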
Framework-layer composition — bridge tools live in apps/* per ADR-065 D5
The Ops Agent never imports spectral.worlds.*. Bridge tools live in apps/* framework
deliverables, never in caller-context code (per
ADR-065 D3 + D5,
ADR-063, and
ADR-060).
For the ask_world_agent tool specifically:
- The callee-owned OHS Protocol WorldAgentRunner lives in spectral.worlds.contracts.protocols.world_agent (per ADR-065 D3 — see Protocols for the catalog).
- The concrete implementation lives in spectral.worlds.application.
- The bridge tool — the ask_world_agent callable that the Ops Agent invokes — lives in apps/workers (or a future apps/mcp-world-bridge/ for an MCP server, per ADR-065 D5). The bridge imports WorldAgentRunner (framework-to-context, allowed under validator rule 7) and composes into the Ops Agent’s tool list via DI at workers startup.
- The Ops Agent (in spectral.platform.application) sees an opaque tool callable — never the Protocol or its implementation. Caller-context code never imports another context.
Platform-internal Protocols consumed by Ops Agent tools (OutboxReader, OutboxReplayer from
spectral.platform.application.operations_agent.outbox) follow the standard within-context DI
pattern; they don’t cross a context boundary, so they don’t need the framework-layer bridge. The
workers entrypoint composes both surfaces — bridges into worlds and within-platform dependencies —
into the agent’s tool registry. Each tool factory (ask_world_agent, list_dlq_events,
get_dlq_event_detail, replay_dlq_event, list_failure_clusters, get_cluster_detail,
triage_cluster) is unit-testable against mock Protocols.
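The bridge pattern and its testability against a mock Protocol can be sketched as follows. This is an illustration under assumptions: the real WorldAgentRunner signature is not shown on this page, so the `ask` method, the factory name `make_ask_world_agent`, and `FakeRunner` are hypothetical.

```python
from typing import Callable, List, Protocol


class WorldAgentRunner(Protocol):
    """Stand-in for the callee-owned Protocol; the `ask` method is an assumption."""

    def ask(self, question: str) -> str: ...


def make_ask_world_agent(runner: WorldAgentRunner) -> Callable[[str], str]:
    """Bridge factory: closes over the runner and returns an opaque callable.

    The caller receives only the returned function, so it never needs to
    import the Protocol or its implementation.
    """

    def ask_world_agent(question: str) -> str:
        return runner.ask(question)

    return ask_world_agent


class FakeRunner:
    """Mock that satisfies the Protocol structurally, for unit tests."""

    def __init__(self) -> None:
        self.seen: List[str] = []

    def ask(self, question: str) -> str:
        self.seen.append(question)
        return f"answer to: {question}"


# Composition step, analogous to what a workers entrypoint would do at startup.
fake = FakeRunner()
ask_world_agent = make_ask_world_agent(fake)
```

Because `typing.Protocol` uses structural typing, `FakeRunner` needs no inheritance from the Protocol, which keeps test doubles free of any import from the callee's context.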
Memory
Operations-app scoped. Tracks the operator’s in-flight tasks, pending reviews, and recent signal digests. Independent from both Spectral and World Agent memory.
The logical model is operator-keyed, three tiers, with no cross-operator sharing. Schema
location is platform.operations_agent_memory per
ADR-059 D1: contexts are code boundaries, not schema
boundaries; all Operations Agent code lives under src/spectral/platform/, so memory lands in
the platform schema alongside the Spectral Agent’s memory tables.
The Ops Agent parameterizes the universal interaction / session / persistent lifecycle from ADR-058 D1 and agent memory primitives. The tier names are universal; the agent-domain scope each tier inherits (chat thread / operator session / operator) is the Ops-Agent-specific parameterization.
| Tier | Scope parameterization | Example contents |
|---|---|---|
| Interaction | Current chat thread | In-progress task parameters, tool-call arguments, the operator’s current intent |
| Session | One operator’s currently active operator-session | Pending reviews the operator has been working through, recent signal digests they asked about, draft release notes they started |
| Persistent | One operator, durable across operator-sessions | Pinned preferences (which problem spaces they focus on), long-running task state (a draft world-model authoring effort that spans weeks), bookmarked candidates |
Persistent-tier typology distribution: procedural-dominant, with semantic permitted (signal digests asked about, durable inferences). No persistent-episodic memory is produced.
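The logical model above can be sketched as a row type with joint tier and typology discriminators. The names `Tier`, `Typology`, `MemoryRow`, `scope_id`, and `visible_to` are hypothetical; the sketch only encodes two constraints stated on this page: memory is operator-keyed with no cross-operator reads, and persistent-episodic rows are not produced.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Tier(Enum):
    INTERACTION = "interaction"  # scoped to the current chat thread
    SESSION = "session"          # scoped to one active operator-session
    PERSISTENT = "persistent"    # scoped to the operator, across sessions


class Typology(Enum):
    PROCEDURAL = "procedural"
    SEMANTIC = "semantic"
    EPISODIC = "episodic"


@dataclass(frozen=True)
class MemoryRow:
    operator_id: str
    tier: Tier
    typology: Typology
    scope_id: str  # thread id, operator-session id, or operator id (assumed field)
    body: str

    def __post_init__(self) -> None:
        # The page states that no persistent-episodic memory is produced.
        if self.tier is Tier.PERSISTENT and self.typology is Typology.EPISODIC:
            raise ValueError("persistent-episodic memory is not produced")


def visible_to(rows: Iterable[MemoryRow], operator_id: str) -> List[MemoryRow]:
    """App-layer filter: an operator only ever sees their own rows."""
    return [row for row in rows if row.operator_id == operator_id]


ROWS = [
    MemoryRow("op-a", Tier.INTERACTION, Typology.EPISODIC, "thread-1", "current intent"),
    MemoryRow("op-a", Tier.PERSISTENT, Typology.PROCEDURAL, "op-a", "pinned preference"),
    MemoryRow("op-b", Tier.SESSION, Typology.SEMANTIC, "sess-9", "signal digest"),
]
```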
Storage
Lives in the platform schema as platform.operations_agent_memory per
ADR-059 D1. The Ops Agent runs in workers per
Runtime placement; the memory is internal
to platform-side application code. RLS policy keyed to operator_id (FK → core.users.id) via
current_setting('app.user_id')::uuid per ADR-059 D6 — the
session var names identity (SESSION_VAR_USER_ID = "app.user_id", added to spectral.core
per ADR-065 admission discipline), not capability. The API auth middleware (and the workers
transaction-setup path the Ops Agent runtime uses) sets SET LOCAL app.user_id = <jwt_subject>
in operator-context transactions per ADR-041 D3 —
apps/operations is a frontend SPA and never opens DB sessions.
App-layer enforcement is primary; RLS is the backstop, and it is applied to internal-only data as consistently as to any other data (being internal-only is not grounds for skipping it).
The schema mirrors ADR-058 — a single table with joint
tier + typology discriminator columns, plus per-(tier, typology) partial HNSW indexes on
the embeddings retrievable table.
Retention
Retention is scope-inheritance, not direct TTL registration:
- Interaction: lifetime of the chat thread. Removed when the thread ends, unless action-linked.
- Session: inherits the operator session’s lifecycle. 30-day idle defines session-close; compounding-and-archive runs at close. Session-tier rows without action-linkage are removed at scope-end.
- Persistent: registers a RetentionPolicy directly per data retention. Procedural memory is decay-exempt; semantic memory uses validity-window decay.
The audit posture is action-linkage-based — any-tier memory load-bearing for audit (action-linkage to rule promotion, change-set acceptance, published decision) is retained regardless of tier.
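The retention rules above reduce to a few small predicates. This is a sketch under stated assumptions: the function names and the `valid_until` field are hypothetical, treating exactly 30 idle days as closed is a guess at the boundary, and an open-ended validity window (`None`) is assumed to mean the row stays live.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

SESSION_IDLE_CLOSE = timedelta(days=30)


def session_is_closed(last_activity: datetime, now: datetime) -> bool:
    # Assumption: reaching exactly 30 idle days counts as session-close.
    return now - last_activity >= SESSION_IDLE_CLOSE


def survives_scope_end(tier: str, action_linked: bool) -> bool:
    """Whether a row outlives the end of its thread or session scope."""
    if action_linked:
        # Audit posture: action-linked memory is retained regardless of tier.
        return True
    # Interaction and session rows are removed at scope-end; persistent rows
    # are governed by their own retention policy instead.
    return tier == "persistent"


def persistent_row_is_live(typology: str, valid_until: Optional[datetime], now: datetime) -> bool:
    if typology == "procedural":
        return True  # procedural memory is decay-exempt
    # Semantic memory: live only inside its validity window.
    return valid_until is None or now < valid_until


NOW = datetime(2025, 3, 1, tzinfo=timezone.utc)
```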
Version-spanning
Ops Agent memory is operator-keyed, not world-model-version-keyed. An operator’s memory
persists across world-model versions because the operator’s workflow state (tasks, preferences,
long-running efforts) does not have a version dependency. This is intentionally different from
the World Agent’s persistent tier, which anchors on world_id for discovery continuity within
a domain across versions (with world_version carried as contextual provenance per
ADR-058 D5) — the Ops Agent’s durability is about
operator continuity, not domain continuity.
Privacy — no cross-operator sharing
Two operators working on the same world model each have their own Ops Agent memory. Neither can read the other’s. The operations team is small, but memory privacy is preserved regardless — different operators may have different preferences, different in-flight drafts, and different working hypotheses; commingling would collapse that clarity.
If operators need to share context (a pending review they want a colleague to pick up), the mechanism is the Operations-app’s shared task queue, not the agent’s private memory.
Non-negotiable boundaries — workshop framing
The Ops Agent’s memory is a workshop, not a canonical-content cache, per ADR-059 D4. It holds thought-in-progress plus workflow meta-state — never canonical content. Rule interpretation lives with the rule corpus and its tools; the agent reasons with rules to drive workflow rather than over rules to produce semantic claims, so the primary defense against rule-content shadowing is agent-design discipline, not a structural trigger.
- Does not mirror world-model rule content. Rules remain authoritative in the world model itself; the Ops Agent accesses rules by reference. Semantic memory holds workflow meta-knowledge (“operator was working on conformity-gate coverage for World X”), not rule content (“the conformity gate says Y”).

  The trigram-similarity trigger against worlds.rules.body_text (the same contract as ADR-058 D8 — BEFORE INSERT OR UPDATE OF body_text, 100-character floor, 0.85 similarity reject, scoped by world_id_context) is a defense-in-depth backstop here per ADR-059 D4 — the primary defense is the workshop-discipline framing above; the trigger catches doctrine drift rather than carrying the load. Trigger fires only on typology = semantic; procedural and episodic memory skip it (operator-pattern and conversation-transcript shapes do not shadow rule body).
- Does not mirror scan data. Scan traces and evaluation results remain authoritative in platform; the Ops Agent accesses summaries by API call.
- Does not persist customer PII. Customer context in Ops Agent conversations is ephemeral; durable customer data stays in workspace-scoped tables under RLS.
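The trigram backstop described in the first boundary can be approximated in application code to show what it checks. This is a sketch, not the database trigger: the `trigrams` function only approximates how PostgreSQL's pg_trgm tokenizes text, the function names are hypothetical, and two readings are assumptions: bodies under 100 characters skip the check, and a similarity of exactly 0.85 is rejected.

```python
import re
from typing import Iterable, Set

MIN_CHARS = 100   # the 100-character floor (assumed: shorter bodies skip the check)
REJECT_AT = 0.85  # the similarity threshold (assumed: inclusive)


def trigrams(text: str) -> Set[str]:
    """Approximate pg_trgm: lowercase words, padded with two leading and one trailing space."""
    grams: Set[str] = set()
    for word in re.findall(r"[a-z0-9]+", text.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    """Shared trigrams divided by all distinct trigrams across both texts."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def should_reject(typology: str, body_text: str, rule_bodies: Iterable[str]) -> bool:
    if typology != "semantic":
        return False  # procedural and episodic memory skip the check
    if len(body_text) < MIN_CHARS:
        return False
    return any(similarity(body_text, rule) >= REJECT_AT for rule in rule_bodies)


RULE = (
    "The conformity gate requires every candidate rule to cite at least one "
    "Authoritative-tier source before promotion to Enshrined."
)
NOTE = (
    "Operator was working on conformity-gate coverage for World X and paused "
    "after reviewing the third pending candidate yesterday."
)
```

A memory row that copies `RULE` verbatim would be rejected, while the workflow note `NOTE` mentions the same topic without reproducing rule text and passes.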
What the Operations Agent is NOT
- Not the World Agent. Task-oriented vs read-oriented. See the boundary table in Agent Architecture — Operations Agent vs World Agent.
- Not customer-facing at any tier. Every output is internal. Customer-facing artifacts (WorldModelCards, release notes) ship through the platform’s publication path, not through the agent’s conversational surface.
- Not a free-form world-model mutator. Every mutation that touches the rule corpus passes through the Evolution Loop’s governed promotion gate with human sign-off.
- Not a replacement for the human operator. The Ops Agent accelerates operator work; it does not remove the operator from the authority loop.
Interaction with the World Agent
Both agents are accessible to the operator inside the Operations app. The World Agent is
accessible in the app but lives in spectral.worlds; the Ops Agent is owned by the app.
They do not share memory, do not share tools, and do not delegate tasks to each other.
The operator is the integration point. The operator summons each agent for its own purpose:
- Ops Agent: “Curate these three candidates into the enshrinement queue.”
- World Agent: “Where do you think coverage is weakest in Schedule C?”
Interaction pattern — hybrid (separate surfaces + Ops-invokes-World tool)
Default: two separate chat surfaces. The Operations app renders the Ops Agent and the World Agent as two distinct chat surfaces. The operator explicitly switches between them. There is no unified supervisor-routed chat.
Extension: ask_world_agent(question) tool. The Ops Agent carries a single read-oriented
tool that opens a synthesized question to the World Agent and surfaces the answer in the Ops
chat. Used when the Ops Agent’s task planning genuinely needs exploration context (e.g., drafting
a release-notes narrative benefits from the World Agent’s coverage summary).
No cross-state. The ask_world_agent tool is stateless from the World Agent’s perspective.
The World Agent’s session-tier memory is scoped to its own chat surface; a tool-invoked
question does not contribute to its session memory and does not read from it. What the World
Agent knows about the world model is what the world model contains — the tool call gets an
answer grounded in current world-model state, no more.
Why not unified supervisor routing
A supervisor-routed chat hides which agent is speaking. The distinction between do (Ops) and explore (World) is load-bearing for operator coherence — the operator must always know whether they are acting or reflecting. Routing obscures that. The separate-surfaces default makes the distinction structural.
Why not “World Agent is only a tool”
Making the World Agent accessible only as an Ops Agent tool would remove the operator’s ability to just explore. Sometimes the operator wants to sit with the World Agent and ask open-ended coverage questions without a task in mind. That use case is the whole point of the World Agent’s operator-facing access; collapsing it into Ops-only access would lose it.
When the operator chooses which
| Operator intent | Surface |
|---|---|
| “Do this for me” (curate, promote, draft, publish) | Ops Agent chat |
| “What do you think about…” (coverage, gaps, provenance weakness) | World Agent chat |
| Ops Agent needs coverage context mid-task (e.g., drafting release notes) | Ops Agent invokes ask_world_agent internally; operator sees the exchange in the Ops chat |
The operator’s pattern is expected to be: default into the Ops Agent, occasionally side-trip to the World Agent for exploration, let the Ops Agent pull in World Agent context when tasks need it.
Related reading
- Operations App overview — architectural placement, dual-occupant parity
- Agent Architecture — the three-agent topology
- World Agent — the read-oriented agent operators also access