Agent Tool Invocation
Spectral runs three LangGraph-driven agents — Spectral Agent, World Agent, Operations Agent. Per-agent tool registries live in:
- agent-architecture.mdx: Spectral Agent
- world-agent.mdx: World Agent
- operations-agent.mdx: Operations Agent (canonical tool surface)
This page describes the cross-cutting contract that all three share: envelope, error taxonomy, retry behavior, observability, approval mechanism, and framework-layer composition pattern. The decision lineage is recorded in ADR-060.
Runtime placement
All three agent runtimes live in apps/workers. apps/api is thin: authentication, AgentTask dispatch via outbox, and SSE streaming proxy. apps/workers consumes AgentTask events, loads checkpointer state (agent-architecture.mdx), runs the LangGraph orchestrator, executes tool calls, writes memory, and streams output via a Supabase Realtime channel keyed by conversation_id; apps/api proxies the Realtime channel to the client as SSE.
Approval interrupts use LangGraph’s interrupt() to suspend the run; the checkpointer persists state; an operator response (HTTP into apps/api) resumes via Command(resume=...).
Tool envelope
Tools are plain async callables produced by closed-over-DI factories. Cross-cutting metadata is captured at call time via a lightweight ToolCallMetadata pydantic value object emitted by the observed_tool decorator. There is no ToolCallEnvelope wrapper around every tool body.
spectral.core.tools.metadata.ToolCallMetadata:
- tool_name: str
- agent_name: str
- latency_ms: int
- ok: bool
- error_class: str | None
- trace_id: str
- started_at: datetime
- ended_at: datetime
The observed_tool decorator emits one ToolCallMetadata per call to structlog and OTel (per observability-stack) and integrates with LLM cost tracking when the tool body invokes an LLM.
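As an illustration of the shape described above, here is a minimal sketch of a decorator that wraps an async tool and emits one metadata record per call. It is not the Spectral implementation: a frozen dataclass stands in for the pydantic value object, the module-level EMITTED list is a hypothetical sink standing in for structlog and OTel, the random trace_id is a placeholder for whatever the active span provides, and make_echo_tool is an invented factory used only to show the closed-over-DI style.

```python
import asyncio
import functools
import time
import uuid
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any, Awaitable, Callable, List, Optional


@dataclass(frozen=True)
class ToolCallMetadata:
    # Stand-in for the pydantic value object; fields follow the list above.
    tool_name: str
    agent_name: str
    latency_ms: int
    ok: bool
    error_class: Optional[str]
    trace_id: str
    started_at: datetime
    ended_at: datetime


# Hypothetical sink; the real decorator emits to structlog and OTel.
EMITTED: List[ToolCallMetadata] = []


def observed_tool(tool_name: str, agent_name: str):
    """Wrap an async tool so every call emits exactly one metadata record."""

    def decorate(fn: Callable[..., Awaitable[Any]]) -> Callable[..., Awaitable[Any]]:
        @functools.wraps(fn)
        async def wrapper(*args: Any, **kwargs: Any) -> Any:
            started_at = datetime.now(timezone.utc)
            t0 = time.perf_counter()
            error_class: Optional[str] = None
            try:
                return await fn(*args, **kwargs)
            except Exception as exc:
                # Record the class name, then let the error propagate unchanged.
                error_class = type(exc).__name__
                raise
            finally:
                EMITTED.append(
                    ToolCallMetadata(
                        tool_name=tool_name,
                        agent_name=agent_name,
                        latency_ms=int((time.perf_counter() - t0) * 1000),
                        ok=error_class is None,
                        error_class=error_class,
                        trace_id=uuid.uuid4().hex,  # assumption: placeholder id
                        started_at=started_at,
                        ended_at=datetime.now(timezone.utc),
                    )
                )

        return wrapper

    return decorate


def make_echo_tool(prefix: str) -> Callable[..., Awaitable[str]]:
    """Hypothetical factory: the dependency (prefix) is closed over, not passed per call."""

    @observed_tool(tool_name="echo", agent_name="operations_agent")
    async def echo(text: str) -> str:
        return prefix + text

    return echo
```

The tool body stays a plain async callable; the metadata is produced around it, which is the point of having no wrapper object around the tool's own return value.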
Error taxonomy
Four classes in spectral.core.tools.errors:
- ToolUserError: invalid input from user/operator (bad args, missing context); user-visible
- ToolPolicyError: policy/scope/approval denied; user-visible with framing
- ToolTransientError: infrastructure transient (DB blip, brief LLM provider rate-limit)
- ToolTerminalError: non-recoverable (invariant violation)
The taxonomy’s role is to shape what the LLM sees via the tool message. Tool errors flow back to the LLM as tool messages with the error class plus a human-readable description. The LLM decides next action: retry as-is, retry with modified args, surface to operator, abandon. LangGraph’s orchestrator-level recursion limit (default 25; configurable per agent) caps runaway loops as the circuit breaker.
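To make the "error class plus human-readable description" flow concrete, the sketch below turns a raised tool error into the payload an LLM would see. The four class names come from this page; the shared ToolError base, the run_tool_call helper, and the dictionary layout of the tool message are assumptions made for illustration only.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict


class ToolError(Exception):
    """Hypothetical common base; this page only names the four subclasses."""


class ToolUserError(ToolError):
    """Invalid input from user/operator."""


class ToolPolicyError(ToolError):
    """Policy, scope, or approval denied."""


class ToolTransientError(ToolError):
    """Infrastructure transient."""


class ToolTerminalError(ToolError):
    """Non-recoverable failure."""


async def run_tool_call(
    tool: Callable[..., Awaitable[Any]], **args: Any
) -> Dict[str, Any]:
    """Run one tool call and shape the outcome as a tool message payload.

    Taxonomy errors are reported back to the LLM rather than retried here;
    any other exception propagates to the caller.
    """
    try:
        result = await tool(**args)
    except ToolError as exc:
        return {
            "status": "error",
            "error_class": type(exc).__name__,
            "content": str(exc),
        }
    return {"status": "ok", "error_class": None, "content": str(result)}
```

Note that the helper contains no retry loop: the choice between retrying, changing arguments, escalating, or abandoning stays with the LLM, with the recursion limit as the outer bound.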
There is no agent-layer retry budget. Tool implementations may include single-retry-on-transient-IO as an implementation detail (e.g., a DB connection blip); that is not contract.
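A tool-internal single retry might look like the following sketch. It assumes the transient failure surfaces as Python's built-in ConnectionError and that a second failure is reported as ToolTransientError; with_single_retry and its delay_s parameter are hypothetical names, and the local exception class is a stand-in for the one in spectral.core.tools.errors.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class ToolTransientError(Exception):
    """Stand-in for spectral.core.tools.errors.ToolTransientError."""


async def with_single_retry(
    op: Callable[[], Awaitable[T]], *, delay_s: float = 0.05
) -> T:
    """Run op; on a connection blip, wait briefly and try exactly once more."""
    try:
        return await op()
    except ConnectionError:
        await asyncio.sleep(delay_s)
    try:
        return await op()
    except ConnectionError as exc:
        # Second failure: surface it through the taxonomy so the LLM decides.
        raise ToolTransientError("connection failed after one retry") from exc
```

Because the retry is hidden inside the tool body, callers cannot rely on it; from the contract's point of view the tool either returns or raises.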
Approval mechanism
Mutate-with-call-time-approval tools use LangGraph’s interrupt() to suspend the run with a standardized payload. spectral.core.tools.approval.ToolApprovalRequest:
- tool_name: str
- agent_name: str
- args_summary: str (sanitized; PII-stripped; safe to display to the operator)
- effect_description: str (human-readable description of what will change)
- correlation_id: UUID
Operator response paths:
- Approve: Command(resume=ApprovalGranted(...)) resumes the run; the tool body executes.
- Deny: the tool aborts with ToolPolicyError(reason=APPROVAL_DENIED).
- Request revision: the agent revises the proposed action and re-emits the approval request.
All paths are audit-logged through the observability substrate.
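The three operator response paths above can be sketched without LangGraph by using a plain Python generator as a stand-in for the suspend/resume pair: each yield plays the role of interrupt(), and send() plays the role of Command(resume=...). ToolApprovalRequest and ApprovalGranted are named on this page; ApprovalDenied, RevisionRequested, the resume helper, and the replay_with_approval tool body are hypothetical and exist only to make the control flow visible.

```python
from dataclasses import dataclass
from typing import Any, Generator, Tuple
from uuid import UUID, uuid4


@dataclass(frozen=True)
class ToolApprovalRequest:
    tool_name: str
    agent_name: str
    args_summary: str
    effect_description: str
    correlation_id: UUID


@dataclass(frozen=True)
class ApprovalGranted:
    correlation_id: UUID


@dataclass(frozen=True)
class ApprovalDenied:  # hypothetical response type
    correlation_id: UUID


@dataclass(frozen=True)
class RevisionRequested:  # hypothetical response type
    correlation_id: UUID
    note: str


class ToolPolicyError(Exception):
    def __init__(self, reason: str) -> None:
        super().__init__(reason)
        self.reason = reason


def replay_with_approval(
    event_id: str, reason: str
) -> Generator[ToolApprovalRequest, Any, str]:
    """Each yield is a suspension point, analogous to interrupt()."""
    while True:
        response = yield ToolApprovalRequest(
            tool_name="replay_dlq_event",
            agent_name="operations_agent",
            args_summary=f"event_id={event_id}",
            effect_description=f"Replay DLQ event {event_id} ({reason})",
            correlation_id=uuid4(),
        )
        if isinstance(response, ApprovalGranted):
            return f"replayed {event_id}"  # the tool body would execute here
        if isinstance(response, RevisionRequested):
            reason = response.note  # revise, then re-emit the request
            continue
        raise ToolPolicyError(reason="APPROVAL_DENIED")


def resume(run: Generator, response: Any) -> Tuple[str, Any]:
    """Deliver the operator response; return the next request or the result."""
    try:
        return ("request", run.send(response))
    except StopIteration as stop:
        return ("done", stop.value)
```

A driver would call next(run) to obtain the first request, show args_summary and effect_description to the operator, and then call resume(run, ...) with the operator's decision. In the real system the suspended state lives in the checkpointer rather than in a live generator, so the run can resume in a different process.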
Framework-layer composition
When a tool needs data or behavior in another context, the call flow goes through DI at the framework-layer composition seam, never via SQL grants. See Contract Surfaces for the canonical pattern.
The reference example is WorldAgentRunner (in spectral.worlds.contracts.protocols.world_agent per ADR-065 D3 — callee-owned OHS Protocol):
- ask(question: str, *, world_id: UUID) -> str: stateless mode (no session, no memory)
- chat(message: str, *, session_id: UUID, world_id: UUID) -> str: stateful mode
Implementation lives in spectral.worlds.application. Per ADR-065 D5, bridge tools (e.g. an Ops-Agent ask_world_agent tool callable) live in apps/* framework deliverables, never in caller-context code; the bridge imports worlds.contracts.protocols.world_agent.WorldAgentRunner (framework-to-context, allowed under validator rule 7) and is composed into the caller agent’s tool list via DI at workers startup. The caller agent (Ops Agent in spectral.platform.application) sees an opaque tool callable — never the Protocol or its implementation. Auto-generated docs for the Protocol live under Protocols.
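The composition described above can be sketched as a Protocol, a bridge factory, and a startup step that hands the caller agent an opaque callable. The two method signatures come from this page; treating them as async, the make_ask_world_agent_tool factory, the string-typed world_id argument on the tool, and the FakeWorldAgentRunner test double are all assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List, Protocol
from uuid import UUID, uuid4


class WorldAgentRunner(Protocol):
    """Callee-owned Protocol; assumed async here."""

    async def ask(self, question: str, *, world_id: UUID) -> str: ...

    async def chat(
        self, message: str, *, session_id: UUID, world_id: UUID
    ) -> str: ...


def make_ask_world_agent_tool(
    runner: WorldAgentRunner,
) -> Callable[[str, str], Awaitable[str]]:
    """Hypothetical bridge factory living in an apps/* framework deliverable.

    The runner is closed over, so the caller agent never sees the Protocol.
    """

    async def ask_world_agent(question: str, world_id: str) -> str:
        # Tool arguments arrive as plain strings from the LLM; parse at the seam.
        return await runner.ask(question, world_id=UUID(world_id))

    return ask_world_agent


class FakeWorldAgentRunner:
    """Test double standing in for the spectral.worlds.application implementation."""

    async def ask(self, question: str, *, world_id: UUID) -> str:
        return f"[{world_id}] answer to: {question}"

    async def chat(
        self, message: str, *, session_id: UUID, world_id: UUID
    ) -> str:
        return f"[{world_id}/{session_id}] reply to: {message}"


def compose_ops_agent_tools(
    runner: WorldAgentRunner,
) -> List[Callable[..., Awaitable[str]]]:
    """What workers startup might do: build the caller agent's tool list via DI."""
    return [make_ask_world_agent_tool(runner)]


if __name__ == "__main__":
    tools = compose_ops_agent_tools(FakeWorldAgentRunner())
    print(asyncio.run(tools[0]("What changed?", str(uuid4()))))
```

The only import crossing a context boundary sits in the bridge module; swapping the runner for a fake at composition time is what makes the caller agent testable without the worlds context.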
The pattern is forward-compatible for additional bridge tools when the flow shape is genuinely call (caller needs a result back). For eventual-consistency reads of facts already recorded elsewhere — e.g., the Operations app’s view of rule-candidate outcomes, the World Agent’s view of T3 memory bodies — notification flow + consumer-side local projection (per ADR-065 D4 + ADR-064 D3) is the canonical choice; no Reader Protocol is minted. See Contract Surfaces for the simplest-fit ladder.
Operator-surface tool patterns
Two operational tool families ship under the Operations Agent surface:
DLQ inspection (per ADR-060 D7)
- list_dlq_events(handler_name?, age_range?, limit): read
- get_dlq_event_detail(event_id): read; returns event payload + sanitized failure history
- replay_dlq_event(event_id, reason: str): mutate with call-time approval; calls core.outbox_replay(); the reason field is captured in the audit log
Backed by OutboxReader / OutboxReplayer protocols injected at the workers entrypoint.
Cluster triage (per ADR-060 D8)
- list_failure_clusters(severity?, status?): read
- get_cluster_detail(cluster_id): read; returns snapshot + linked failures
- triage_cluster(cluster_id, status, notes): mutate with call-time approval; updates platform.rule_candidates_pending operator-managed columns
Symmetric with DLQ replay under the Ops Agent’s mutate-with-call-time-approval pattern.
Workshop discipline at the tool → memory boundary
Tool outputs containing canonical content (rule body, scan trace, customer PII) are not round-tripped into agent memory verbatim. Per the agent-memory-primitives workshop framing, agent memory holds thought-in-progress and workflow meta-state, not canonical content.
The repository gateway enforces typology-driven classification at the memory-write path; the trigram trigger backstops doctrine drift. There is no separate sanitization decorator on tools — discipline lives in the memory-write path, not the tool surface.
See also
- Agent architecture: runtime placement details; streaming pattern
- Contract Surfaces: the broader pattern
- Operations Agent: DLQ + cluster triage tools in context
- Agent memory primitives: workshop framing
- Event substrate: the substrate underlying the DLQ
- LLM platform: cost tracking integration
- Observability stack: span propagation; structlog
- ADR-060: decision lineage