
ADR-007: LangGraph Agent Architecture

Status: Accepted (2026-04-03) — three-agent topology expansion (Spectral / World / Operations Agents) per ADR-058, ADR-059, ADR-060; event-name refinements for proactive conversations under ADR-065 contract-surface doctrine. Carry-forward confirmation in ADR-043 for the conversation-persistence portion.

Context

The Spectral Agent is the conversational interface through which customers interact with the optimization platform — asking about scan results, troubleshooting failures, reviewing evaluation frameworks, and onboarding new workspaces.

The original implementation was a single-turn assistant using direct Anthropic API calls with a flat tool list. This had several limitations:

No conversation memory. Each request was stateless. The agent could not reference previous messages in a conversation, making multi-turn interactions impossible.
Monolithic tool surface. All tools were available to every request regardless of intent. The agent frequently called irrelevant tools, wasting tokens and producing confused responses.
No proactive behavior. The agent could only respond to explicit customer questions. It had no mechanism to initiate conversations when significant events occurred (scan completions, anomalies, configuration changes).
No approval workflow. Any tool the agent called executed immediately. There was no way to gate mutation operations behind human approval, which blocked the autonomy roadmap.
No multi-channel support. The assistant was hard-wired to the web dashboard. Adding Slack or email required rewriting the conversation model.

These limitations directly blocked the product roadmap: proactive insights after scans, Slack integration, and the autonomy ladder (where customers progressively grant the agent permission to take actions on their behalf).

Decision

Rebuild the Spectral Agent on LangGraph with a supervisor + specialist subagent architecture using the Deep Agents pattern.

Supervisor + Specialists

A single supervisor agent receives all customer messages and delegates to specialist subagents based on intent:

| Specialist | Responsibility | Tools |
| --- | --- | --- |
| scan-analyst | Interprets scan results, verdicts, scores, changesets, failure patterns | `list_recent_scans`, `get_scan_detail`, `get_scan_scores`, `get_changeset_detail` |
| onboarding-guide | Walks new customers through workspace setup | `get_onboarding_status`, `get_trace_ingestion_guide` |
| framework-advisor | Explains and recommends evaluation framework changes | `get_framework_config`, `get_rubrics` |
| troubleshooter | Diagnoses scan failures, configuration issues, data quality problems | `diagnose_scan_failure`, `check_workspace_health`, `get_recent_errors` |

The supervisor carries no tools itself — it routes to at most two specialists per turn, then synthesizes their findings into a single customer-facing response.
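
The routing constraint can be illustrated with a minimal sketch in plain Python rather than LangGraph. The specialist and tool names are taken from the table above; `SPECIALIST_TOOLS`, `select_specialists`, and the pre-ranked intent list are hypothetical names introduced here, and the real supervisor presumably makes this choice with an LLM rather than from a ranked list.

```python
from typing import Sequence

# Specialist -> tool names, copied from the table above.
SPECIALIST_TOOLS: dict[str, tuple[str, ...]] = {
    "scan-analyst": (
        "list_recent_scans", "get_scan_detail",
        "get_scan_scores", "get_changeset_detail",
    ),
    "onboarding-guide": ("get_onboarding_status", "get_trace_ingestion_guide"),
    "framework-advisor": ("get_framework_config", "get_rubrics"),
    "troubleshooter": (
        "diagnose_scan_failure", "check_workspace_health", "get_recent_errors",
    ),
}

MAX_SPECIALISTS_PER_TURN = 2  # "at most two specialists per turn"


def select_specialists(ranked: Sequence[str]) -> list[str]:
    """Keep the first two distinct, known specialists from a ranked list."""
    chosen: list[str] = []
    for name in ranked:
        if name in SPECIALIST_TOOLS and name not in chosen:
            chosen.append(name)
        if len(chosen) == MAX_SPECIALISTS_PER_TURN:
            break
    return chosen
```

Keeping the cap in one constant makes the per-turn fan-out (and therefore the worst-case number of LLM calls) easy to reason about.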

Event-Driven Proactive Conversations

Domain events trigger agent-initiated conversations without customer prompting:

ScanCompletedEvent → proactive scan analysis with verdict summary
AnomalyDetectedEvent → investigation of quality regressions, cost spikes
WorkspaceConfigChangedEvent → review of configuration impact

Event handlers create system-initiated Conversation entities with AgentTask records. The worker picks up tasks via Supabase Realtime and processes them through the same LangGraph graph.
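
As a rough sketch of that handler step, assuming simple dataclass entities: the event names come from the list above, while the field names, the `status` value, the task-kind strings, and `handle_domain_event` itself are hypothetical. Persistence and the Supabase Realtime pickup are left out.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4

# Event name -> proactive task kind. Event names are from the list above;
# the task-kind strings are hypothetical.
EVENT_TASK_KINDS = {
    "ScanCompletedEvent": "proactive_scan_analysis",
    "AnomalyDetectedEvent": "anomaly_investigation",
    "WorkspaceConfigChangedEvent": "config_impact_review",
}


@dataclass(frozen=True)
class Conversation:
    workspace_id: UUID
    initiated_by: str  # "system" for agent-initiated conversations
    id: UUID = field(default_factory=uuid4)


@dataclass(frozen=True)
class AgentTask:
    conversation_id: UUID
    kind: str
    status: str = "pending"  # hypothetical lifecycle field
    id: UUID = field(default_factory=uuid4)


def handle_domain_event(
    event_name: str, workspace_id: UUID
) -> tuple[Conversation, AgentTask]:
    """Create a system-initiated conversation and the task a worker picks up."""
    kind = EVENT_TASK_KINDS[event_name]  # raises KeyError for unhandled events
    conversation = Conversation(workspace_id=workspace_id, initiated_by="system")
    task = AgentTask(conversation_id=conversation.id, kind=kind)
    return conversation, task
```

Because the task only references a conversation ID, the worker can run it through the same graph as a customer-initiated message.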

Human-in-the-Loop Approval

Tools are categorized as read, suggest, or mutate. Read and suggest tools execute freely. Mutate tools trigger a LangGraph interrupt, creating an AgentApproval record. The customer approves or rejects via the API (or Slack interactive buttons), which resumes the agent with Command(resume=decision).

Per-workspace WorkspaceAgentAuthorization records allow customers to pre-approve specific action types, bypassing the interrupt for trusted operations.
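
The gating rule, including the pre-approval bypass, reduces to a small predicate. The sketch below is an assumption about how that decision could be expressed; `ToolCategory`, `requires_approval`, and the idea of passing pre-approved action types as a set are hypothetical. In the real graph, a `True` result is the point where the LangGraph interrupt would be raised and an AgentApproval record created.

```python
from enum import Enum


class ToolCategory(Enum):
    READ = "read"
    SUGGEST = "suggest"
    MUTATE = "mutate"


def requires_approval(
    category: ToolCategory,
    action_type: str,
    preapproved_action_types: frozenset[str],
) -> bool:
    """Return True when a tool call should pause for human approval."""
    if category is not ToolCategory.MUTATE:
        return False  # read and suggest tools execute freely
    # A workspace authorization for this action type bypasses the interrupt.
    return action_type not in preapproved_action_types
```

For example, `requires_approval(ToolCategory.MUTATE, "update_rubric", frozenset({"update_rubric"}))` returns `False`, where `"update_rubric"` is an illustrative action type rather than one defined in this ADR.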

Channel-Agnostic Conversation Model

Conversations and messages are channel-agnostic domain entities. Channel adapters bind to conversations via ConversationChannelBinding:

Web adapter — in-app chat via Supabase Realtime
Slack adapter — bidirectional threads via Events API + Web API
Email adapter — notification delivery with digest aggregation

A single conversation can span multiple channels (start in Slack, continue in web). The domain model never references a specific channel.
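
One way to picture the binding model is shown below, as a sketch under stated assumptions. `ConversationChannelBinding` is named in this ADR, but its fields, the `ChannelAdapter` protocol, `InMemoryAdapter`, and `fan_out` are hypothetical. The message is plain text with no channel attached; only bindings and adapters know about channels.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class ConversationChannelBinding:
    conversation_id: str
    channel: str       # e.g. "web", "slack", "email"
    external_ref: str  # hypothetical: Slack thread id, email address, ...


class ChannelAdapter(Protocol):
    channel: str

    def deliver(self, binding: ConversationChannelBinding, text: str) -> None: ...


@dataclass
class InMemoryAdapter:
    """Test double standing in for a web, Slack, or email adapter."""
    channel: str
    sent: list[tuple[str, str]] = field(default_factory=list)

    def deliver(self, binding: ConversationChannelBinding, text: str) -> None:
        self.sent.append((binding.external_ref, text))


def fan_out(
    text: str,
    bindings: list[ConversationChannelBinding],
    adapters: dict[str, ChannelAdapter],
) -> int:
    """Deliver one message to every bound channel; return the delivery count."""
    delivered = 0
    for binding in bindings:
        adapter = adapters.get(binding.channel)
        if adapter is not None:  # skip channels with no registered adapter
            adapter.deliver(binding, text)
            delivered += 1
    return delivered
```

Adding a channel in this shape means registering another adapter and creating bindings, with no change to the conversation or message types.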

State Management

LangGraph manages conversation state via its built-in checkpointer. The LangGraphCheckpointer adapter wraps AsyncPostgresSaver with a dedicated langgraph PostgreSQL schema, isolated from the application’s public schema. Thread IDs map to conversation IDs.
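
LangGraph checkpointers key saved state by the `thread_id` entry under `configurable` in the run config, so the mapping from conversation to thread can be a one-line helper. The function name `thread_config` is hypothetical, and using the conversation UUID's string form directly as the thread ID is an assumption.

```python
from uuid import UUID


def thread_config(conversation_id: UUID) -> dict[str, dict[str, str]]:
    """Build the run config that scopes checkpoints to one conversation."""
    return {"configurable": {"thread_id": str(conversation_id)}}
```

Passing the same config on every turn of a conversation is what lets the checkpointer restore prior messages.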

Tool Architecture

Tools use a closed-over dependency injection pattern: factory functions accept repository protocols and return plain callables. The LangGraph graph receives pre-built tool lists at construction time. This keeps the application layer framework-agnostic — tool factories live in spectral.application.agent.tools, while the graph construction lives in spectral.infrastructure.agents.langgraph_orchestrator.
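
A minimal sketch of the closed-over factory pattern follows. The tool name `list_recent_scans` appears in the specialist table, but `ScanRepository`, its `list_recent` method, the factory signature, and `FakeScanRepository` are hypothetical, since this ADR does not show the real protocols.

```python
from typing import Callable, Protocol


class ScanRepository(Protocol):
    """Hypothetical repository protocol; the real one is not shown here."""

    def list_recent(self, workspace_id: str, limit: int) -> list[dict]: ...


def make_list_recent_scans(
    repo: ScanRepository, workspace_id: str
) -> Callable[..., list[dict]]:
    """Factory: close over the repository and return a plain callable tool."""

    def list_recent_scans(limit: int = 5) -> list[dict]:
        """List the most recent scans for the current workspace."""
        return repo.list_recent(workspace_id, limit)

    return list_recent_scans


class FakeScanRepository:
    """In-memory stand-in, so the tool runs with no database or LLM."""

    def __init__(self, scans: list[dict]) -> None:
        self._scans = scans

    def list_recent(self, workspace_id: str, limit: int) -> list[dict]:
        owned = [s for s in self._scans if s["workspace_id"] == workspace_id]
        return owned[:limit]
```

The returned function has no framework imports and sees only the `limit` argument, so the model cannot choose a different workspace and the tool can be unit-tested by calling it directly.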

Alternatives Considered

LangChain AgentExecutor (single agent with all tools): In effect the existing single-agent, flat-tool-list approach, scaled up. It would preserve the monolithic tool surface problem and provide no natural way to specialize agent behavior by intent. Multi-turn state would require custom memory management outside the framework.

Custom state machine (no framework): Would give full control but requires implementing conversation threading, checkpointing, tool execution, and interrupt/resume from scratch. The maintenance burden is disproportionate given that LangGraph provides all of these as primitives.

CrewAI / AutoGen: Both support multi-agent orchestration but are opinionated about agent-to-agent communication patterns. LangGraph’s graph-based approach gives more control over routing and state management, and the Deep Agents pattern maps directly to the supervisor + specialist model.

Consequences

Positive

Specialist expertise. Each subagent has a focused system prompt and curated tool set, producing higher-quality responses than a generalist with 15+ tools.
Conversation continuity. LangGraph checkpointing provides multi-turn memory with no custom state management code.
Proactive engagement. The event-driven model lets the agent surface insights, such as scan summaries and anomaly alerts, without waiting for customer questions.
Approval workflow. The interrupt/resume primitive maps directly to human-in-the-loop approval, unblocking the autonomy ladder.
Multi-channel support. The channel-agnostic conversation model supports web, Slack, and email through adapter implementations, not architectural changes.
Testability. Tool factories accept protocol-typed repositories, so specialist tools are testable without LangGraph, database, or LLM infrastructure.

Negative

Framework coupling. LangGraph and Deep Agents are runtime dependencies. The orchestrator is confined to spectral.infrastructure.agents to limit blast radius, but a framework migration would require rewriting graph construction and checkpointing.
Debugging complexity. Multi-agent delegation adds indirection. When the agent produces a bad response, you must determine whether the supervisor routed incorrectly or the specialist reasoned incorrectly.
Checkpointer schema. The dedicated langgraph PostgreSQL schema adds migration surface area managed by the LangGraph library rather than our migration tooling.
Latency overhead. Supervisor → specialist delegation adds at least one extra LLM call per request compared to a single-agent approach.
