
Agent Tool Invocation

Spectral runs three LangGraph-driven agents — Spectral Agent, World Agent, Operations Agent. Per-agent tool registries live in:

This page describes the cross-cutting contract that all three share: envelope, error taxonomy, retry behavior, observability, approval mechanism, and framework-layer composition pattern. Decision lineage in ADR-060.


All three agent runtimes live in apps/workers. apps/api stays thin: authentication, AgentTask dispatch via the outbox, and an SSE streaming proxy. apps/workers consumes AgentTask events, loads checkpointer state (agent-architecture.mdx), runs the LangGraph orchestrator, executes tool calls, writes memory, and streams output over a Supabase Realtime channel keyed by conversation_id; apps/api proxies that Realtime channel to the client as SSE.
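The dispatch path can be sketched as follows. This is a minimal sketch, not the real handler: the AgentTask field names, the channel-name format, and the injected publish/orchestrator callables are all illustrative assumptions; only the flow (consume task, run orchestrator, stream chunks on a channel keyed by conversation_id) comes from the description above.

```python
from dataclasses import dataclass
from typing import Any, AsyncIterator, Awaitable, Callable
from uuid import UUID


@dataclass
class AgentTask:
    # Hypothetical shape: the real event carries whatever the outbox dispatches.
    conversation_id: UUID
    agent_name: str
    prompt: str


async def handle_agent_task(
    task: AgentTask,
    run_orchestrator: Callable[[AgentTask], AsyncIterator[str]],
    publish: Callable[[str, str], Awaitable[Any]],
) -> None:
    """Consume one AgentTask: run the LangGraph orchestrator and stream
    output chunks to a Realtime channel keyed by conversation_id."""
    channel = f"conversation:{task.conversation_id}"  # key format is an assumption
    async for chunk in run_orchestrator(task):
        await publish(channel, chunk)  # apps/api proxies this channel as SSE
```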

Approval interrupts use LangGraph’s interrupt() to suspend the run; the checkpointer persists state; an operator response (HTTP into apps/api) resumes via Command(resume=...).


Tools are plain async callables produced by factories that close over their DI-provided dependencies. Cross-cutting metadata is captured at call time via a lightweight ToolCallMetadata pydantic value object emitted by the observed_tool decorator. There is no ToolCallEnvelope wrapper around every tool body.

spectral.core.tools.metadata.ToolCallMetadata:

  • tool_name: str
  • agent_name: str
  • latency_ms: int
  • ok: bool
  • error_class: str | None
  • trace_id: str
  • started_at: datetime
  • ended_at: datetime

The observed_tool decorator emits one ToolCallMetadata per call to structlog and OTel (per observability-stack) and integrates with LLM cost tracking when the tool body invokes an LLM.


The taxonomy defines four classes in spectral.core.tools.errors:

  • ToolUserError — invalid input from user/operator (bad args, missing context); user-visible
  • ToolPolicyError — policy/scope/approval denied; user-visible with framing
  • ToolTransientError — infrastructure transient (DB blip, brief LLM provider rate-limit)
  • ToolTerminalError — non-recoverable (invariant violation)

The taxonomy’s role is to shape what the LLM sees via the tool message. Tool errors flow back to the LLM as tool messages with the error class plus a human-readable description. The LLM decides next action: retry as-is, retry with modified args, surface to operator, abandon. LangGraph’s orchestrator-level recursion limit (default 25; configurable per agent) caps runaway loops as the circuit breaker.

There is no agent-layer retry budget. Tool implementations may include single-retry-on-transient-IO as an implementation detail (e.g., a DB connection blip); that is not contract.
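The taxonomy's shape can be sketched as follows. The shared ToolError base class and the to_tool_message helper (rendering the error class plus a human-readable description for the LLM) are illustrative assumptions; only the four class names come from the doc.

```python
class ToolError(Exception):
    """Assumed shared base for the four-class taxonomy."""


class ToolUserError(ToolError):
    """Invalid input from user/operator (bad args, missing context); user-visible."""


class ToolPolicyError(ToolError):
    """Policy/scope/approval denied; user-visible with framing."""


class ToolTransientError(ToolError):
    """Infrastructure transient (DB blip, brief LLM provider rate-limit)."""


class ToolTerminalError(ToolError):
    """Non-recoverable (invariant violation)."""


def to_tool_message(exc: ToolError) -> dict[str, str]:
    """Render a caught tool error as the tool message the LLM sees:
    error class plus description. The exact message shape is illustrative."""
    return {"error_class": type(exc).__name__, "description": str(exc)}
```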


Mutate-with-call-time-approval tools use LangGraph’s interrupt() to suspend the run with a standardized payload. spectral.core.tools.approval.ToolApprovalRequest:

  • tool_name: str
  • agent_name: str
  • args_summary: str — sanitized; PII-stripped; safe to display to the operator
  • effect_description: str — human-readable description of what will change
  • correlation_id: UUID

Operator response paths:

  • Approve: Command(resume=ApprovalGranted(...)) resumes the run; the tool body executes.
  • Deny: the tool aborts with ToolPolicyError(reason=APPROVAL_DENIED).
  • Request revision: the agent revises the proposed action and re-emits the approval request.

All paths are audit-logged through the observability substrate.
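The three response paths can be sketched as pure dispatch logic. In the real flow the request is raised via LangGraph's interrupt(request) and the operator response arrives through Command(resume=...); this sketch models only the dispatch step, and the ApprovalDenied/RevisionRequested shapes, the resolve_approval helper, and the redefined ToolPolicyError are illustrative assumptions.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass
class ToolApprovalRequest:
    tool_name: str
    agent_name: str
    args_summary: str        # sanitized, PII-stripped, operator-safe
    effect_description: str  # human-readable description of what will change
    correlation_id: UUID


@dataclass
class ApprovalGranted:
    correlation_id: UUID


@dataclass
class ApprovalDenied:
    correlation_id: UUID
    reason: str = "APPROVAL_DENIED"


@dataclass
class RevisionRequested:
    correlation_id: UUID
    feedback: str


class ToolPolicyError(Exception):
    """Redefined here for self-containment; lives in spectral.core.tools.errors."""


def resolve_approval(request: ToolApprovalRequest, response, execute):
    if isinstance(response, ApprovalGranted):
        return execute()  # Approve: the tool body executes
    if isinstance(response, ApprovalDenied):
        # Deny: the tool aborts with ToolPolicyError(APPROVAL_DENIED)
        raise ToolPolicyError(response.reason)
    if isinstance(response, RevisionRequested):
        # Request revision: hand feedback back so the agent can revise the
        # proposed action and re-emit the approval request
        return response.feedback
    raise TypeError("unknown operator response")
```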


When a tool needs data or behavior in another context, the call flow goes through DI at the framework-layer composition seam — never via SQL grants. See Contract Surfaces for the canonical pattern.

The reference example is WorldAgentRunner (in spectral.worlds.contracts.protocols.world_agent per ADR-065 D3 — callee-owned OHS Protocol):

  • ask(question: str, *, world_id: UUID) -> str — stateless mode (no session, no memory)
  • chat(message: str, *, session_id: UUID, world_id: UUID) -> str — stateful mode
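The Protocol's shape, per the two signatures above, might look like this sketch (the real definition lives in spectral.worlds.contracts.protocols.world_agent; the runtime_checkable decoration here is only for the illustration):

```python
from typing import Protocol, runtime_checkable
from uuid import UUID


@runtime_checkable
class WorldAgentRunner(Protocol):
    """Callee-owned open-host-service Protocol (ADR-065 D3)."""

    async def ask(self, question: str, *, world_id: UUID) -> str:
        """Stateless mode: no session, no memory."""
        ...

    async def chat(self, message: str, *, session_id: UUID, world_id: UUID) -> str:
        """Stateful mode."""
        ...
```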

Implementation lives in spectral.worlds.application. Per ADR-065 D5, bridge tools (e.g. an Ops-Agent ask_world_agent tool callable) live in apps/* framework deliverables, never in caller-context code; the bridge imports worlds.contracts.protocols.world_agent.WorldAgentRunner (framework-to-context, allowed under validator rule 7) and is composed into the caller agent’s tool list via DI at workers startup. The caller agent (Ops Agent in spectral.platform.application) sees an opaque tool callable — never the Protocol or its implementation. Auto-generated docs for the Protocol live under Protocols.
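The bridge-tool composition might be sketched as below: the factory closes over the injected runner and returns a plain async callable, so the caller agent sees only an opaque tool. The factory name make_ask_world_agent_tool and its parameter list are illustrative assumptions, not the real deliverable.

```python
from typing import Any, Awaitable, Callable
from uuid import UUID


def make_ask_world_agent_tool(
    runner: Any,  # a WorldAgentRunner implementation, injected at workers startup
    world_id: UUID,
) -> Callable[[str], Awaitable[str]]:
    """Framework-layer bridge: composed into the Ops Agent's tool list via DI."""
    async def ask_world_agent(question: str) -> str:
        """Ask the World Agent a question and return its answer (stateless)."""
        return await runner.ask(question, world_id=world_id)
    return ask_world_agent
```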

The pattern is forward-compatible for additional bridge tools when the flow shape is genuinely call (caller needs a result back). For eventual-consistency reads of facts already recorded elsewhere — e.g., the Operations app’s view of rule-candidate outcomes, the World Agent’s view of T3 memory bodies — notification flow + consumer-side local projection (per ADR-065 D4 + ADR-064 D3) is the canonical choice; no Reader Protocol is minted. See Contract Surfaces for the simplest-fit ladder.


Two operational tool families ship under the Operations Agent surface: DLQ replay and failure-cluster triage.

  • list_dlq_events(handler_name?, age_range?, limit) — read
  • get_dlq_event_detail(event_id) — read; returns event payload + sanitized failure history
  • replay_dlq_event(event_id, reason: str) — mutate with call-time approval; calls core.outbox_replay(); the reason field is captured in the audit log

Backed by OutboxReader / OutboxReplayer protocols injected at the workers entrypoint.

  • list_failure_clusters(severity?, status?) — read
  • get_cluster_detail(cluster_id) — read; returns snapshot + linked failures
  • triage_cluster(cluster_id, status, notes) — mutate with call-time approval; updates platform.rule_candidates_pending operator-managed columns

Symmetric with DLQ replay under the Ops Agent’s mutate-with-call-time-approval pattern.


Workshop discipline at the tool → memory boundary


Tool outputs containing canonical content (rule body, scan trace, customer PII) are not round-tripped into agent memory verbatim. Per the agent-memory-primitives workshop framing, agent memory holds thought-in-progress and workflow meta-state — not canonical content.

The repository gateway enforces typology-driven classification at the memory-write path; the trigram trigger backstops doctrine drift. There is no separate sanitization decorator on tools — discipline lives in the memory-write path, not the tool surface.