Agent Architecture
Spectral ships three distinct agents at 0.3.0. Each has a different audience, a different authority surface, and a different architectural home. Keeping their roles separate — and their state isolated — is what keeps the world-model authority credible, the optimization pipeline responsive, and the operator surface trustworthy.
The three agents at a glance
Section titled “The three agents at a glance”| Spectral Agent | World Agent | Operations Agent | |
|---|---|---|---|
| Audience | Customers | Operations team (internal) | Operations team (internal) |
| Shape of interaction | Conversational: explains scan verdicts, advises on frameworks, troubleshoots failures, guides onboarding | Read-oriented: ask what it knows about the world model, where coverage is thin, what its signal stream has surfaced | Task-oriented: curate this rule, promote this candidate, trigger this distillation, drive a publication |
| Has write authority? | Yes — on the customer-facing tool surface (scan analysis, framework guidance), customer-approved at every step | No — proposes rule candidates through the governed Evolution Loop; never mutates the world model directly | Yes — on the Operations app tool surface (authoring, distillation, publication, observability) |
| Scope of authority | One workspace at a time; customer-keyed | The world model it resides in | Operations-app state (curation queues, publication pipeline, observability surfaces) |
| Where its code lives | spectral.platform — supervisor + 4 specialists (LangGraph) | spectral.worlds — one resident agent per world model | apps/operations — operator seat (TanStack Start UI + LangGraph runtime) |
| Where its runtime lives | apps/workers (per ADR-060) | apps/workers | apps/workers |
| Memory keying | Workspace-scoped (per-user conversation isolation lives on the conversation table, not on the memory tiers) | World-model-scoped (one memory store per world model) | Operator-scoped + per-task scope |
The three agents do not share state, do not share memory, and do not share tools. Where they
touch the same operator seat (World Agent and Operations Agent are both reachable from the
Operations app), the operator is the integration point — there is no hidden routing layer
between agents. The World Agent never speaks to a customer; the Spectral Agent never reads world-
model authoring state; the Operations Agent never owns world-model authority. Cross-agent
information flow is shaped by the event substrate (Event System)
and by apps/workers as the framework-layer composition seam — not by direct agent-to-agent
calls.
For the doctrine on calls and notifications between contexts, see Contract Surfaces and ADR-065.
Spectral Agent
Section titled “Spectral Agent”The Spectral Agent is the customer’s conversational interface to the optimization platform. It analyzes scan results, troubleshoots failures, advises on evaluation frameworks, and guides new customers through onboarding.
For the decision rationale and alternatives, see ADR-007.
Supervisor + specialist model
Section titled “Supervisor + specialist model”Built on LangGraph with the Deep Agents pattern. A single supervisor receives all customer messages and delegates to specialist subagents based on intent:
Customer message → Supervisor (routes by intent) → Specialist subagent (focused tools + system prompt) → Tool calls (closed-over repository access) ← Specialist findings ← Supervisor synthesizes customer-facing responseThe supervisor carries no tools itself. It delegates to at most two specialists per turn, then synthesizes their findings into a single response.
Specialist subagents
Section titled “Specialist subagents”| Specialist | When used | Tools |
|---|---|---|
| scan-analyst | Scan outcomes, scores, verdicts, changesets, failure patterns | list_recent_scans, get_scan_detail, get_scan_scores, get_changeset_detail |
| onboarding-guide | New customers, setup questions, trace ingestion | get_onboarding_status, get_trace_ingestion_guide |
| framework-advisor | Evaluation framework questions, rubric tuning, objective function | get_framework_config, get_rubrics, get_world_model_context, get_system_card, get_evaluation_framework_provenance |
| troubleshooter | Scan failures, configuration issues, data quality, recurring errors | diagnose_scan_failure, check_workspace_health, get_recent_errors |
Each specialist has a focused system prompt. Specialists are conditionally registered: if no tools are provided at construction time, the specialist is omitted from the graph. This supports incremental rollout without code changes.
Task lifecycle
Section titled “Task lifecycle”Every interaction follows the same lifecycle, whether initiated by the customer or by an event:
pending → processing → (complete | failed)Conversation— channel-agnostic message container. Tracksinitiated_by(customer, agent, system) and optionaltrigger_event_idfor event-driven conversations.AgentTask— message from API to worker. The API writes tasks toagent_tasks; the worker subscribes via Supabase Realtime and processes them through the LangGraph graph.ConversationMessage— a single message withrole(user, assistant, system, tool),content,channel_origin, and optionaltool_calls.
Event-driven proactive conversations
Section titled “Event-driven proactive conversations”The agent does not only respond to customer questions. Domain events trigger agent-initiated conversations:
| Event | Trigger | Agent behavior |
|---|---|---|
verdict.issued | Verdict engine finalizes a scan verdict | Proactive conversation rendering the verdict and next-step recommendations. Fires on every scan regardless of autonomy mode — verdict and CompositeScore are always stored and surfaced; in observe_only the agent does not propose applying a ChangeSet because none is created. |
approval.required | Changeset-lifecycle handler raises a ChangeSet for review | Proactive conversation prompting the customer to review the proposed changeset, with explainability + agent performance card attached. |
supervisor.recommendation.issued | Supervisor surfaces a directional recommendation | Creates or updates a proactive conversation carrying the recommendation narrative with mode_classification (ACTIVE / PLATEAU / FRONTIER / NO_DATA). |
Event handlers — OnVerdictIssuedHandler, OnApprovalRequiredHandler, and
OnSupervisorRecommendationHandler — create or update a system-initiated Conversation, add a
context message summarizing the event, and queue an AgentTask for the graph to process.
Analysis flows back to the customer through the notification system. The agent is a consumer of
these events only; it does not emit them. (The event names refine the broad event categories
sketched in ADR-007 under the contract-surface
doctrine of ADR-065.)
Human-in-the-loop approval
Section titled “Human-in-the-loop approval”Tools are categorized by mutation impact:
| Category | Approval required | Examples |
|---|---|---|
read | No | List scans, get scores, check health |
suggest | No | Recommend framework changes, explain verdicts |
mutate | Yes | Apply changeset, update framework, modify config |
Mutation tool calls trigger a LangGraph interrupt. The system creates an AgentApproval
record with the proposed action and payload. The customer approves or rejects through the API
or Slack interactive buttons, resuming the agent with Command(resume=decision).
AgentApproval is the per-tool-call interrupt mechanism for mid-conversation proposed mutations
and is distinct from autonomy-mode changeset approval. Changeset approval (per
optimization engine — autonomy governance) is an
asynchronous gate on a packaged ChangeSet’s promotion; AgentApproval is a
synchronous LangGraph-interrupt() checkpoint inside a single conversation turn. The two
mechanisms share UX intent (consent before action) but have separate code paths and lifecycles —
conflating them is a doctrinal error.
Customers can pre-approve specific action types via WorkspaceAgentAuthorization, bypassing
the interrupt for that operation. The pre-authorization surface is a UX parallel to autonomy-mode
auto-acceptance — both let trusted action types skip an explicit consent step — but the underlying
mechanisms remain distinct.
Tool architecture
Section titled “Tool architecture”Tools use a closed-over dependency injection pattern. Factory functions accept repository protocols and return plain callables:
def make_scan_analyst_tools( *, scan_repo: ScanRepository, eval_result_repo: EvalResultRepository, changeset_repo: ChangeSetRepo, failure_cluster_repo: FailureClusterRepository,) -> list[Callable[..., str]]: return [ _make_list_scans(scan_repo), _make_get_scan_detail(scan_repo, failure_cluster_repo), _make_get_scan_scores(eval_result_repo), _make_get_changeset_detail(changeset_repo), ]The tool factory lives in the application layer; LangGraph graph construction lives in infrastructure. Each tool is unit-testable by passing mock repositories to the factory — no LangGraph, database, or LLM required.
Memory
Section titled “Memory”The Spectral Agent uses the universal three-tier memory schema (interaction / session / persistent) per ADR-058 D1, parameterized for the Spectral Agent’s scan-domain anchors (per-cycle interaction, per-run session, per-workspace persistent). ADR-058 supersedes the earlier ADR-018 Spectral-specific Cycle / Run / Workspace vocabulary; the three durability tiers are universal across all three Spectral agents and the per-agent parameterization lives in agent memory primitives.
The Spectral Agent does not reach into world-model memory, and the World Agent does not reach into Spectral memory. The only information flow between worlds and platform is the event-driven signal path per ADR-017.
Two distinct memory-write paths feed the Spectral Agent’s interaction-tier (T1) memory, both
through the spectral_agent_memory gateway. The gateway is the will-be repository surface owned
by the Spectral Agent’s memory implementation epic per
ADR-058 D15; the protocol-level discipline below is the
contract regardless of when the gateway lands.
- Scan-pipeline event handler writes. The
OnScanCompletedHandlerwrites T1 entries summarizing scan outcomes (verdict,CompositeScore, ChangeSet shape) — these describe what the scan produced. TheOnScanCompletedFeedbackHandlersimilarly routes feedback signals into T1 / T2 alongside its workspace-scopedFeedbackSignalrecords. - Agent runtime per-tool-call writes. The
observed_tooldecorator capturesToolCallMetadataat every tool invocation during conversation; the tool body’s reasoning and per-call workflow meta-state route through the gateway into T1 — these describe what the agent did mid-conversation.
Compounding (interaction → session → persistent) lives in the gateway and runs at cycle-end and run-end scan-event boundaries; the persistent tier is reasoning-shaped (workshop discipline per agent memory primitives), not a cache of canonical content.
Channels & notifications
Section titled “Channels & notifications”Conversations are channel-agnostic. ConversationChannelBinding maps a conversation to a
channel-specific reference (Slack thread timestamp, web session ID, etc.). Currently implemented
adapters:
| Adapter | Location | Delivery |
|---|---|---|
| In-app | infrastructure/agent/channels/in_app_adapter.py | Supabase Realtime |
| Slack | infrastructure/agent/channels/slack_adapter.py | Slack Web API + Events API; threads map to conversations |
infrastructure/agent/channels/email_adapter.py | Pluggable EmailSender, digest aggregation |
A single conversation can span multiple channels — start in Slack, continue in the web dashboard.
The NotificationService orchestrates multi-channel delivery: persists in-app baseline, resolves
per-workspace preferences, dispatches to each configured adapter, logs individual failures
without blocking other channels.
State management
Section titled “State management”LangGraph manages conversation state through its built-in checkpointing system:
- Checkpointer:
LangGraphCheckpointerwrapsAsyncPostgresSaver - Schema isolation: Checkpoint tables live in a dedicated
langgraphPostgreSQL schema (framework-owned;AsyncPostgresSaver.setup()provisions; the Supabase CLI migration pipeline does not touch it). Per ADR-043 D7. - Thread mapping: LangGraph
thread_idmaps directly toconversation_id - Repository gateway: All
AsyncPostgresSavercalls flow through the singlespectral.platform.agent.CheckpointerGatewayper ADR-043 D8. - Same-transaction participation (ADR-043 D9):
AsyncPostgresSaver.aputruns on the request-scope connection fromspectral_platform.db.request_scopeper connection pooling — avoiding torn-write risk between business ops and checkpoint writes. - Lifecycle: Checkpointer initialized once at worker startup; the compiled graph is reused across requests.
- Encryption posture: Checkpointer payloads are encrypted at rest via Supabase storage
encryption; key rotation, KMS posture, and recovery drill procedures live in
docs/runbooks/checkpointer-encryption.md.
The orchestrator is stateless per-request; all conversation state is managed by the checkpointer.
Runtime placement (workers)
Section titled “Runtime placement (workers)”All three Spectral agent runtimes (Spectral / Ops / World) live in apps/workers per
ADR-060 D-runtime. apps/api is thin — auth + AgentTask
dispatch via outbox + SSE streaming proxy.
Streaming pattern. Workers consumes AgentTask events, runs the LangGraph orchestrator,
streams output via Supabase Realtime channel keyed by conversation_id. apps/api proxies the
Realtime channel as SSE to the client. Two hops (workers → Realtime → API → SSE); latency is
negligible vs LLM token latency.
AgentTask dispatch via outbox (per ADR-044 D12).
platform.agent_tasks carries business state (status, HITL approval linkage, conversation_id,
result back-reference, retention). Dispatch flows through core.outbox with
event_type='agent.task.dispatched'. Workers listens on the channel, reads the payload, pulls
the agent_tasks row, executes. Approval interrupts use LangGraph interrupt() to suspend the
run; the checkpointer persists state; an operator response (HTTP into apps/api) resumes via
Command(resume=...).
Framework-layer composition seam. Workers IS the framework-layer composition seam where tool dependencies wire via DI per Contract Surfaces. Agent code never imports another context; the workers entrypoint imports both worlds and platform at startup and injects implementations into agent tool factories.
Error handling: LLM-mediated, LangGraph circuit breaker
Section titled “Error handling: LLM-mediated, LangGraph circuit breaker”ADR-060 D2 + D3 specifies:
- Tool errors flow back to the LLM as tool messages with the four-class
ToolErrortaxonomy (ToolUserError,ToolPolicyError,ToolTransientError,ToolTerminalError) plus a human-readable description. - The LLM decides next action: retry as-is, retry with modified args, surface to operator, abandon.
- LangGraph’s orchestrator-level recursion limit (default 25; configurable per agent) caps runaway loops as the circuit breaker.
- No agent-layer retry budget. Tool implementations may include single-retry-on-transient-IO as an implementation detail; that is not contract.
See agent tool invocation for the cross-cutting tool contract.
What the Spectral Agent is not
Section titled “What the Spectral Agent is not”- Not a world-model authority. It reads world-model context through worlds’s producer-owned
contract surfaces (
spectral.worlds.contracts.events.*for notification flow,spectral.worlds.contracts.protocols.*for synchronous calls per ADR-065 D2 + D3) but has no authorship surface over rules. - Not a silent actor. Every mutation either passes through the approval ladder or is explicitly pre-authorised per action type.
- Not an Operations Agent. Customer-facing only.
World Agent
Section titled “World Agent”Internal resident of each world model. Not customer-facing at any tier; accessible to operators through the Operations app as a read-oriented exploration surface. Full specification lives in the World Model System / World Agent page; the section here covers its role relative to the other two agents.
Each world model has exactly one resident World Agent. The World Agent:
- Explores domain coverage against signal-stream observations
- Proposes rule candidates through the governed Evolution Loop (never mutates directly)
- Surfaces coverage gaps and provenance weakness to operators on demand
- Maintains discovery continuity across world model versions through version-spanning memory
Tool surface
Section titled “Tool surface”The World Agent’s tool surface is scoped to its own world. It has unrestricted read access to the world model’s rule corpus, to the three-source EvalSet corpus (per ADR-022), and to its own three-tier memory. It does not have write access to the rule corpus — rule promotion runs through the Evolution Loop’s governed pipeline, not through the agent.
Memory
Section titled “Memory”Universal three-tier lifecycle (interaction / session / persistent) per
ADR-058 D1, with World Agent anchors:
agent_interaction_id (interaction), agent_session_id (session), and world_id (persistent —
durable across world model versions per ADR-058 D2). Each row also carries a
typology enum(episodic, semantic, procedural) discriminator per ADR-058 D3 (transient tiers are
episodic; the persistent tier holds semantic + procedural about the world’s domain). Schema and
behavioral details in agent memory primitives and
world-agent.
Memory contains reasoning, exploration history, and discovery observations; it never contains rule
content. Rules live in the world model itself; the World Agent has read access to them rather than
a memorised copy. The reference-only invariant (ADR-058 D14) is enforced by the body_text
trigram-similarity trigger on worlds.world_agent_memory (ADR-058 D8).
Customer-facing posture
Section titled “Customer-facing posture”Never. The customer-facing prohibition is absolute. Operators access the World Agent through
the Operations app; customers do not see its outputs directly. When world-model reasoning needs
to reach a customer, it goes through the WorldModelCard (formal artifact) or through the Spectral
Agent (which reads world-model context via worlds’s producer-owned contract surfaces in
spectral.worlds.contracts.* per ADR-065).
What the World Agent is not
Section titled “What the World Agent is not”- Not a customer-facing surface
- Not an accessor of Spectral memory at any tier
- Not a decision-maker in enshrinement — human sign-off governs promotion
- Not the Operations Agent — see the boundary table below
Operations Agent
Section titled “Operations Agent”Net-new in 0.3.0. Lives in apps/operations and is the operator’s task-oriented collaborator for
world-model authoring, distillation, publication, and observability. Where the World Agent is
read-oriented (understand the world), the Operations Agent is write-oriented (act on the
world-authoring surface).
- World-model authoring. Drafting rule candidates from Authoritative sources, staging them into the Evolution Loop, reviewing promotion queues.
- Distillation. Proposing condensations of redundant rules, flagging rules with drifted provenance chains.
- Publication. Generating WorldModelCard drafts, reviewing release notes (per ADR-026), coordinating version publication.
- Operational observability. Summarising scan-volume trends, convergence-signal health, customer adoption of world-model-grounded vs customer-directed stimuli.
Memory
Section titled “Memory”Operations-app scoped. Tracks the operator’s in-flight tasks, pending reviews, recent signal digests. Independent from both Spectral and World Agent memory; does not mirror world-model rule content (that remains authoritative in the world model itself, accessed by reference).
Customer-facing posture
Section titled “Customer-facing posture”Never. Operations-only. Every output of the Operations Agent is internal. When operator work produces a customer-facing artifact (a published WorldModelCard, a release note), the artifact goes out through the platform’s normal publication path — not through the agent’s conversational surface.
What the Operations Agent is not
Section titled “What the Operations Agent is not”- Not the World Agent. Task-oriented vs read-oriented; see the Boundary table below.
- Not customer-facing at any tier.
- Not a free-form world-model mutator. Every mutation that touches the rule corpus passes through the Evolution Loop’s governed promotion gate.
Boundary & interaction patterns
Section titled “Boundary & interaction patterns”Operations Agent vs World Agent
Section titled “Operations Agent vs World Agent”Both live at the operator’s seat inside the Operations app. Keeping their roles distinct is what makes the operator interaction coherent rather than schizophrenic.
| Ops Agent | World Agent | |
|---|---|---|
| Purpose | Perform operational tasks on the Operations-app tool surface | Explore and reflect on the world model |
| Interaction shape | Task: curate this rule, promote this candidate, trigger this process | Read: what do you know, where are you uncertain, how confident are you |
| Write authority? | Yes — on the Operations-app surface | No — proposes through the Evolution Loop, never mutates directly |
| Scope | Operations-app state (curation queues, promotion actions, observability) | The world model it resides in |
| Memory | Operations-app operator-state | Three-tier world-model memory |
The operator using both is the integration point. The Ops Agent does not delegate reasoning to the World Agent; the World Agent does not take tasks from the Ops Agent. Each is summoned by the operator for its own purpose, and the operator stitches the two conversations together when needed.
The exact routing / handoff / context-sharing pattern between operator, Ops Agent, and World
Agent is specified at
Operations App — Operations Agent / Interaction pattern.
The hybrid default (separate surfaces + a read-oriented ask_world_agent tool on the Ops Agent)
preserves operator clarity about do vs explore while letting Ops tasks pull in coverage
context when needed.
Why two agents and not one with mode-switching
Section titled “Why two agents and not one with mode-switching”A single richer agent with mode-switching would conflate the do and explore modes the operator actually exercises separately. The cost of two agents (two memory systems, two tool surfaces, operator cognitive load to summon the right one) is real but bounded; the cost of one agent is that the act-vs-reflect boundary becomes a runtime mode the agent must self-regulate, and the world-model authority claim becomes harder to defend (an agent that can mutate the world model will be asked to skip the gate, even if a mode flag normally prevents it). The two-agent split makes the boundary structural rather than procedural — the World Agent has no write authority on the rule corpus by construction, not by convention. That structural guarantee is load-bearing for the same reason the customer-side platform / world-model-system isolation is load-bearing: the standard’s authority survives because the system that builds it cannot reach into the standard.
Spectral Agent ↔ other agents
Section titled “Spectral Agent ↔ other agents”No shared state, no direct communication.
- Spectral Agent reads world-model context only through worlds’s producer-owned contract
surfaces (
spectral.worlds.contracts.events.*for notifications,spectral.worlds.contracts.protocols.*for synchronous calls per ADR-065 D2 + D3) — and the locally-projected snapshots that materialize from them. Never through world-model internals, never through the World Agent’s memory. - Flow from platform → worlds is event-only, per
ADR-017. Events carry
observations (
platform.failure_cluster.detected,scan.convergence.delta,memory.observation.promoted); they do not carry agent state. - The World Agent consumes signal events as one of its exploration inputs. The Spectral Agent is not in that loop.
What the topology guarantees
Section titled “What the topology guarantees”- Authority isolation. The world model’s authority rests on its rule corpus + evolution governance, not on any agent’s conversational output. If all three agents went down tomorrow, the WorldModelCard would still be authoritative for its published version.
- Customer-safety. Customers talk only to the Spectral Agent. Operator reasoning (which can be exploratory, uncertain, and provenance-dependent) never leaks directly to the customer surface.
- Operator coherence. An operator always knows which agent they are talking to and why — Ops Agent for do this, World Agent for what do you think about this. The two never bleed.
Code map (Spectral Agent)
Section titled “Code map (Spectral Agent)”| Component | Location | Layer |
|---|---|---|
| Domain entities | spectral.platform.domain.agent.models | Domain |
| Domain events | spectral.platform.domain.shared.events | Domain |
| Tool factories | spectral.platform.application.agent.tools | Application |
| Notification service | spectral.platform.application.agent.notifications | Application |
| Event handlers | spectral.platform.application.agent.on_scan_completed (handler class lives in application; runtime is apps/workers per ADR-060) | Application |
| Authorization service | spectral.platform.application.agent.authorization | Application |
| Email digest service | spectral.platform.application.agent.email_digest | Application |
| LangGraph orchestrator | spectral.platform.infrastructure.agent.langgraph_orchestrator | Infrastructure |
| Checkpointer adapter | spectral.platform.infrastructure.agent.checkpointer | Infrastructure |
| Channel adapters | spectral.platform.infrastructure.agent.channels | Infrastructure |
World Agent implementation lives in spectral.worlds; Operations Agent runs in apps/workers per ADR-060 (Runtime placement), with the Operations app surface in apps/operations.
Each context’s agent-level code is structured the same way (domain / application / infrastructure)
but the modules are local to their context.
Next steps
Section titled “Next steps”- World Model System / World Agent — the full World Agent specification (memory tiers, exploration patterns, operator chat framing)
- Operations App — the Operations-app surface that hosts the Operations Agent
- Optimization Engine — the scan pipeline the Spectral Agent analyzes
- Access Control — roles and scopes that govern agent permissions