
Agent Tool Invocation

Spectral runs three LangGraph-driven agents — Spectral Agent, World Agent, Operations Agent. Per-agent tool registries live in:

This page describes the cross-cutting contract that all three share: envelope, error taxonomy, retry behavior, observability, approval mechanism, and framework-layer composition pattern. Decision lineage in ADR-060.


All three agent runtimes live in apps/workers. apps/api stays thin: authentication, AgentTask dispatch via the outbox, and an SSE streaming proxy. apps/workers consumes AgentTask events, loads checkpointer state (agent-architecture.mdx), runs the LangGraph orchestrator, executes tool calls, writes memory, and streams output over a Supabase Realtime channel keyed by conversation_id; apps/api proxies that Realtime channel to the client as SSE.
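The dispatch path can be sketched as follows. This is a minimal sketch, not the real handler: the AgentTask field names, the channel-name format, and the injected publish/orchestrator callables are all illustrative assumptions; only the flow (consume task, run orchestrator, stream chunks on a channel keyed by conversation_id) comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, AsyncIterator, Awaitable, Callable
from uuid import UUID


@dataclass
class AgentTask:
    # Hypothetical shape: the real event carries whatever the outbox dispatches.
    conversation_id: UUID
    agent_name: str
    prompt: str


async def handle_agent_task(
    task: AgentTask,
    run_orchestrator: Callable[[AgentTask], AsyncIterator[str]],
    publish: Callable[[str, str], Awaitable[Any]],
) -> None:
    """Consume one AgentTask: run the LangGraph orchestrator and stream
    output chunks to a Realtime channel keyed by conversation_id."""
    channel = f"conversation:{task.conversation_id}"  # key format is an assumption
    async for chunk in run_orchestrator(task):
        await publish(channel, chunk)  # apps/api proxies this channel as SSE
```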

Approval interrupts use LangGraph’s interrupt() to suspend the run; the checkpointer persists state; an operator response (HTTP into apps/api) resumes via Command(resume=...).


Tools are plain async callables produced by factories that close over their DI-provided dependencies. Cross-cutting metadata is captured at call time via a lightweight ToolCallMetadata pydantic value object emitted by the observed_tool decorator. There is no ToolCallEnvelope wrapper around every tool body.

spectral.core.tools.metadata.ToolCallMetadata:

  • tool_name: str
  • agent_name: str
  • latency_ms: int
  • ok: bool
  • error_class: str | None
  • trace_id: str
  • started_at: datetime
  • ended_at: datetime

The observed_tool decorator emits one ToolCallMetadata per call to structlog and OTel (per observability-stack) and integrates with LLM cost tracking when the tool body invokes an LLM.


The taxonomy defines four classes in spectral.core.tools.errors:

  • ToolUserError — invalid input from user/operator (bad args, missing context); user-visible
  • ToolPolicyError — policy/scope/approval denied; user-visible with framing
  • ToolTransientError — infrastructure transient (DB blip, brief LLM provider rate-limit)
  • ToolTerminalError — non-recoverable (invariant violation)

The taxonomy’s role is to shape what the LLM sees via the tool message. Tool errors flow back to the LLM as tool messages with the error class plus a human-readable description. The LLM decides next action: retry as-is, retry with modified args, surface to operator, abandon. LangGraph’s orchestrator-level recursion limit (default 25; configurable per agent) caps runaway loops as the circuit breaker.

There is no agent-layer retry budget. Tool implementations may include single-retry-on-transient-IO as an implementation detail (e.g., a DB connection blip); that is not contract.
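The taxonomy's shape can be sketched as follows. The shared ToolError base class and the to_tool_message helper (rendering the error class plus a human-readable description for the LLM) are illustrative assumptions; only the four class names come from the doc.

```python
class ToolError(Exception):
    """Assumed shared base for the four-class taxonomy."""


class ToolUserError(ToolError):
    """Invalid input from user/operator (bad args, missing context); user-visible."""


class ToolPolicyError(ToolError):
    """Policy/scope/approval denied; user-visible with framing."""


class ToolTransientError(ToolError):
    """Infrastructure transient (DB blip, brief LLM provider rate-limit)."""


class ToolTerminalError(ToolError):
    """Non-recoverable (invariant violation)."""


def to_tool_message(exc: ToolError) -> dict[str, str]:
    """Render a caught tool error as the tool message the LLM sees:
    error class plus description. The exact message shape is illustrative."""
    return {"error_class": type(exc).__name__, "description": str(exc)}
```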


Mutate-with-call-time-approval tools use LangGraph’s interrupt() to suspend the run with a standardized payload. spectral.core.tools.approval.ToolApprovalRequest:

  • tool_name: str
  • agent_name: str
  • args_summary: str — sanitized; PII-stripped; safe to display to the operator
  • effect_description: str — human-readable description of what will change
  • correlation_id: UUID

Operator response paths:

  • Approve: Command(resume=ApprovalGranted(...)) resumes the run; the tool body executes.
  • Deny: the tool aborts with ToolPolicyError(reason=APPROVAL_DENIED).
  • Request revision: the agent revises the proposed action and re-emits the approval request.

All paths are audit-logged through the observability substrate.
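The three response paths can be sketched as pure dispatch logic. In the real flow the request is raised via LangGraph's interrupt(request) and the operator response arrives through Command(resume=...); this sketch models only the dispatch step, and the ApprovalDenied/RevisionRequested shapes, the resolve_approval helper, and the redefined ToolPolicyError are illustrative assumptions.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass
class ToolApprovalRequest:
    tool_name: str
    agent_name: str
    args_summary: str        # sanitized, PII-stripped, operator-safe
    effect_description: str  # human-readable description of what will change
    correlation_id: UUID


@dataclass
class ApprovalGranted:
    correlation_id: UUID


@dataclass
class ApprovalDenied:
    correlation_id: UUID
    reason: str = "APPROVAL_DENIED"


@dataclass
class RevisionRequested:
    correlation_id: UUID
    feedback: str


class ToolPolicyError(Exception):
    """Redefined here for self-containment; lives in spectral.core.tools.errors."""


def resolve_approval(request: ToolApprovalRequest, response, execute):
    if isinstance(response, ApprovalGranted):
        return execute()  # Approve: the tool body executes
    if isinstance(response, ApprovalDenied):
        # Deny: the tool aborts with ToolPolicyError(APPROVAL_DENIED)
        raise ToolPolicyError(response.reason)
    if isinstance(response, RevisionRequested):
        # Request revision: hand feedback back so the agent can revise the
        # proposed action and re-emit the approval request
        return response.feedback
    raise TypeError("unknown operator response")
```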


When a tool needs data or behavior in another context, the call flow goes through DI at the framework-layer composition seam — never via SQL grants. See Contract Surfaces for the canonical pattern.

The reference example is WorldAgentRunner (in spectral.worlds.contracts.protocols.world_agent per ADR-065 D3 — callee-owned OHS Protocol):

  • ask(question: str, *, world_id: UUID) -> str — stateless mode (no session, no memory)
  • chat(message: str, *, session_id: UUID, world_id: UUID) -> str — stateful mode
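The Protocol's shape, per the two signatures above, might look like this sketch (the real definition lives in spectral.worlds.contracts.protocols.world_agent; the runtime_checkable decoration here is only for the illustration):

```python
from typing import Protocol, runtime_checkable
from uuid import UUID


@runtime_checkable
class WorldAgentRunner(Protocol):
    """Callee-owned open-host-service Protocol (ADR-065 D3)."""

    async def ask(self, question: str, *, world_id: UUID) -> str:
        """Stateless mode: no session, no memory."""
        ...

    async def chat(self, message: str, *, session_id: UUID, world_id: UUID) -> str:
        """Stateful mode."""
        ...
```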

Implementation lives in spectral.worlds.application. Per ADR-065 D5, bridge tools (e.g. an Ops-Agent ask_world_agent tool callable) live in apps/* framework deliverables, never in caller-context code; the bridge imports worlds.contracts.protocols.world_agent.WorldAgentRunner (framework-to-context, allowed under validator rule 7) and is composed into the caller agent’s tool list via DI at workers startup. The caller agent (Ops Agent in spectral.platform.application) sees an opaque tool callable — never the Protocol or its implementation. Auto-generated docs for the Protocol live under Protocols.
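The bridge-tool composition might be sketched as below: the factory closes over the injected runner and returns a plain async callable, so the caller agent sees only an opaque tool. The factory name make_ask_world_agent_tool and its parameter list are illustrative assumptions, not the real deliverable.

```python
from typing import Any, Awaitable, Callable
from uuid import UUID


def make_ask_world_agent_tool(
    runner: Any,  # a WorldAgentRunner implementation, injected at workers startup
    world_id: UUID,
) -> Callable[[str], Awaitable[str]]:
    """Framework-layer bridge: composed into the Ops Agent's tool list via DI."""
    async def ask_world_agent(question: str) -> str:
        """Ask the World Agent a question and return its answer (stateless)."""
        return await runner.ask(question, world_id=world_id)
    return ask_world_agent
```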

The pattern is forward-compatible for additional bridge tools when the flow shape is genuinely call (caller needs a result back). For eventual-consistency reads of facts already recorded elsewhere — e.g., the Operations app’s view of rule-candidate outcomes, the World Agent’s view of T3 memory bodies — notification flow + consumer-side local projection (per ADR-065 D4 + ADR-064 D3) is the canonical choice; no Reader Protocol is minted. See Contract Surfaces for the simplest-fit ladder.


Two operational tool families ship under the Operations Agent surface: DLQ replay and failure-cluster triage.

  • list_dlq_events(handler_name?, age_range?, limit) — read
  • get_dlq_event_detail(event_id) — read; returns event payload + sanitized failure history
  • replay_dlq_event(event_id, reason: str) — mutate with call-time approval; calls core.outbox_replay(); the reason field is captured in the audit log

Backed by OutboxReader / OutboxReplayer protocols injected at the workers entrypoint.

  • list_failure_clusters(severity?, status?) — read
  • get_cluster_detail(cluster_id) — read; returns snapshot + linked failures
  • triage_cluster(cluster_id, status, notes) — mutate with call-time approval; updates platform.rule_candidates_pending operator-managed columns

Symmetric with DLQ replay under the Ops Agent’s mutate-with-call-time-approval pattern.


Workshop discipline at the tool → memory boundary


Tool outputs containing canonical content (rule body, scan trace, customer PII) are not round-tripped into agent memory verbatim. Per the agent-memory-primitives workshop framing, agent memory holds thought-in-progress and workflow meta-state — not canonical content.

The repository gateway enforces typology-driven classification at the memory-write path; the trigram trigger backstops doctrine drift. There is no separate sanitization decorator on tools — discipline lives in the memory-write path, not the tool surface.