Explainability

Spectral cannot be an incomprehensible black box. Trust is earned partly through the sidecar operating model (Spectral never touches production), but equally through explainability as a day-1 feature.

Every recommendation must be backed by reasoning. Every change must be traceable. Every decision must be auditable.

Spectral asks customers to change their production AI systems based on its recommendations. The people reviewing those recommendations — engineers, product owners, leadership — need to understand why before they’ll act.

  • Engineers need to understand the technical reasoning to validate that a change is sound
  • Product people need to understand the expected impact to prioritize adoption
  • Leadership needs to see a clear narrative of improvement to justify continued investment

Without explainability, Spectral is just another optimization tool asking for blind trust. With it, Spectral is a collaborative partner that shows its work.

The primary surface for explainability is the Change Set primitive. Every Change Set — whether it proposes changes (proposed), recommends no action (validated), or is rejected (rejected) — carries a full Explainability entity, assembled at changeset creation by assemble_explainability, which transforms the scan pipeline output into four components (sketched after this list):

  • summary — A human-readable narrative explaining what changed and why. This is the first thing anyone reads. It answers: “What did Spectral find, and what does it recommend?” For validated changesets, the summary is the canonical string "No changes warranted". For rejected changesets, the rejection reason is appended to the decision log.
  • decision_log — Detailed reasoning structured as one DecisionLogEntry per strategy per scan phase. Each entry captures: phase, strategy, candidate evaluated, why selected or rejected, and evidence references back into verdict_evidence.
  • verdict_evidence — The structured VerdictEvidence schema (see Verdict Explainability below), carrying both authorities’ scores and the blended view.
  • interaction_reasoning — Team-level reasoning that only exists across agents: “We changed Agent A’s context assembly because it was causing truncation that degraded Agent B’s downstream accuracy” lives here. It is a system-level insight, not attributable to either agent alone.
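
A minimal sketch of this four-component shape, assuming Python dataclasses; the field names mirror the components above and the per-entry captures in decision_log, but the exact types in spectral.core may differ:

```python
from dataclasses import dataclass, field


@dataclass
class DecisionLogEntry:
    """One entry per strategy per scan phase."""
    phase: str               # scan phase that produced this entry
    strategy: str            # strategy evaluated in that phase
    candidate: str           # candidate change that was considered
    rationale: str           # why it was selected or rejected
    evidence_refs: list[str] = field(default_factory=list)  # references into verdict_evidence


@dataclass
class Explainability:
    """Assembled by assemble_explainability at changeset creation."""
    summary: str                         # canonical "No changes warranted" for validated changesets
    decision_log: list[DecisionLogEntry]
    verdict_evidence: "VerdictEvidence"  # schema sketched under Verdict Explainability below
    interaction_reasoning: list[str]     # team-level insights not attributable to a single agent
```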

Within a Change Set, each agent’s versioned configuration carries enough context for an engineer to understand what changed for that specific agent. The structural decomposition of prompt templates (base text, context selection, chain-of-thought, few-shot examples) makes changes precise and reviewable.
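
As a sketch, that decomposition might look like the following (field names are inferred from the parenthetical above, not a confirmed API):

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """Structural decomposition of an agent's prompt template."""
    base_text: str                       # core instruction text
    context_selection: str               # how context is assembled for this agent
    chain_of_thought: str | None = None  # optional reasoning scaffold
    few_shot_examples: list[str] = field(default_factory=list)
```

With this shape, a change that touches only few_shot_examples reviews as a single-field diff rather than a wall-of-text prompt diff.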

But the reasoning for agent-level changes lives in the Change Set’s explainability, not on the agent entry itself. This is because the reason for changing Agent A is often “its effect on Agent B” — a team-level concern.

When Spectral evaluates a system and recommends no changes, the resulting Change Set (with validated status) still carries full explainability: the canonical "No changes warranted" summary plus the same four-component Explainability shape — what was evaluated, what was considered, what verdict_evidence shows, and why no changes were warranted. This is not an empty response — it’s evidence that the system was analyzed and found to be performing well (or that proposed changes didn’t meet the improvement threshold).

This matters for trust: the customer needs to know that silence means “we looked and it’s fine,” not “the system isn’t working.”
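
Continuing the dataclass sketch above, a validated changeset’s explainability might be assembled like this (constructor usage and the strategy name are illustrative, not confirmed API):

```python
# `evidence` is the full VerdictEvidence assembled by the verdict phase
# (schema sketched under Verdict Explainability below).
explainability = Explainability(
    summary="No changes warranted",       # canonical string for validated changesets
    decision_log=[
        DecisionLogEntry(
            phase="verdict",
            strategy="few_shot_refresh",  # illustrative strategy name
            candidate="agent_a.prompt.few_shot_examples",
            rationale="Blended delta below the improvement threshold",
            evidence_refs=["blended_delta_after"],
        ),
    ],
    verdict_evidence=evidence,            # full evidence, even when nothing changes
    interaction_reasoning=[],             # no cross-agent insight to report here
)
```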

The chain of Change Sets — each referencing its Baseline — creates a readable narrative of how the customer’s system evolved over time. Leadership can trace the story: here’s where we started, here’s what Spectral found at each step, here’s the cumulative impact.

This is the “prove it” half of Spectral’s value proposition.

Verdict Explainability

The optimization engine’s verdict phase produces structured VerdictEvidence carrying both evaluation authorities plus the blended view, captured into the changeset’s Explainability at creation time. The two-authority enrichment (per Optimization Engine — Two-authority evaluation) exposes the per-authority detail rather than a single composite number.
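
As a hedged illustration of how two authorities can combine into a blended view (the convex combination below is an assumption made for clarity; the engine derives blend_ratio from aggregate stimulus weights, and all numbers are made up):

```python
def blended_score(world_model: float, rubric: float, blend_ratio: float) -> float:
    """One plausible blend: a convex combination weighted toward the world model."""
    return blend_ratio * world_model + (1 - blend_ratio) * rubric


# Illustrative numbers only.
before = blended_score(world_model=0.71, rubric=0.64, blend_ratio=0.6)  # 0.682
after = blended_score(world_model=0.78, rubric=0.70, blend_ratio=0.6)   # 0.748
print(f"blended delta: {after - before:+.3f}")                          # +0.066
```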

VerdictEvidence carries the following (a schema sketch follows the list):

  • Per-authority before / after — world_model_score_before, world_model_score_after, rubric_score_before, rubric_score_after. The customer can see each authority’s view independently.
  • Blended view — blended_delta_before, blended_delta_after, blend_ratio (derived from aggregate stimulus weights, not configurable), and convergence_delta.
  • Per-dimension breakdown — per_dimension shows exactly which quality dimensions improved on each authority, not just an overall number.
  • Per-track scores — synthetic_scores and conformance_scores decompose the evidence by the synthetic and conformance tracks (per Optimization Engine — Two-track architecture).
  • Bootstrap confidence intervals — statistical rigor: “we are 95% confident the improvement is between X% and Y%.”
  • Go / caution / no-go decisions — clear thresholds, attached to the VerdictResult.
  • Holdout validation results — proof that improvements generalize to unseen data (see Performance Card below).
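
A schema sketch under the same dataclass assumptions as above (field names follow the bullets; exact types may differ):

```python
from dataclasses import dataclass


@dataclass
class VerdictEvidence:
    # Per-authority before / after
    world_model_score_before: float
    world_model_score_after: float
    rubric_score_before: float
    rubric_score_after: float
    # Blended view
    blended_delta_before: float
    blended_delta_after: float
    blend_ratio: float        # derived from aggregate stimulus weights, not configurable
    convergence_delta: float
    # Decompositions
    per_dimension: dict[str, dict[str, float]]  # dimension -> authority -> delta
    synthetic_scores: dict[str, float]          # synthetic track
    conformance_scores: dict[str, float]        # conformance track
    # Statistical rigor
    confidence_interval: tuple[float, float]    # bootstrap CI, e.g. 95%
    holdout_results: dict[str, bool]            # pass / fail per scenario (see Performance Card)
    decision: str                               # "go" | "caution" | "no-go", attached to the VerdictResult
```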

This is not “we think it’s better” — it’s “we can prove it’s better, with statistical confidence, across two independent evaluation authorities, on data the optimization process never saw.”
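
To ground the confidence-interval language, here is a minimal percentile-bootstrap sketch (generic statistics, not Spectral’s implementation):

```python
import random


def bootstrap_ci(deltas: list[float], n_resamples: int = 10_000,
                 confidence: float = 0.95) -> tuple[float, float]:
    """Percentile bootstrap CI for the mean per-scenario improvement."""
    means = []
    for _ in range(n_resamples):
        sample = random.choices(deltas, k=len(deltas))  # resample with replacement
        means.append(sum(sample) / len(sample))
    means.sort()
    lo = means[int((1 - confidence) / 2 * n_resamples)]
    hi = means[int((1 + confidence) / 2 * n_resamples) - 1]
    return lo, hi


# Illustrative per-scenario score deltas (after minus before) on the holdout set.
deltas = [0.04, 0.07, -0.01, 0.05, 0.09, 0.03, 0.06, 0.02]
low, high = bootstrap_ci(deltas)
print(f"95% CI for mean improvement: [{low:.3f}, {high:.3f}]")
```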

Performance Card

Every changeset carries an AgentPerformanceCard, constructed at create_changeset time. The card surfaces the following (a sketch follows below):

  • Holdout validation outcomes — pass / fail per scenario; statistical detail attached.
  • Three scenario coverages — golden path coverage, edge-case coverage, regression-set coverage.
  • Go rate — percentage of evaluations that meet the GO threshold across the holdout set.
  • Score trajectory — historical composite-score series across the changeset’s baseline lineage.

The card’s evaluation_authority_ref (opaque, from spectral.core per ADR-030) is stamped from the verdict result at create_changeset time.
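
A minimal sketch of the card’s shape, with assumed field names mirroring the bullets above:

```python
from dataclasses import dataclass


@dataclass
class AgentPerformanceCard:
    """Constructed at create_changeset time; one per changeset."""
    holdout_outcomes: dict[str, bool]  # scenario -> pass / fail, statistical detail attached
    golden_path_coverage: float        # the three scenario coverages, as fractions
    edge_case_coverage: float
    regression_set_coverage: float
    go_rate: float                     # share of holdout evaluations meeting the GO threshold
    score_trajectory: list[float]      # composite scores across the baseline lineage
    evaluation_authority_ref: str      # opaque ref from spectral.core (ADR-030),
                                       # stamped from the verdict result
```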

The diagnosis phase produces structured failure clusters that map directly to actionable understanding (see the sketch below):

  • What’s failing: Named failure patterns with frequency and confidence bounds
  • Why it’s failing: Root cause analysis at the workspace level
  • Which agents are affected: Mapping from failure clusters to specific agents
  • What to do about it: Recommendations tied to each cluster

Diagnosis operates at the workspace level — it explains how failures in one agent cascade through the system. This is the multi-agent insight that single-agent tools cannot provide.
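
A sketch of one such cluster, with illustrative field names mapping to the four bullets above:

```python
from dataclasses import dataclass, field


@dataclass
class FailureCluster:
    """Workspace-level cluster produced by the diagnosis phase."""
    pattern: str                     # what's failing: a named failure pattern
    frequency: int                   # how often the pattern was observed
    confidence: tuple[float, float]  # confidence bounds on the pattern
    root_cause: str                  # why it's failing, analyzed at workspace level
    affected_agents: list[str]       # which agents are affected
    recommendations: list[str] = field(default_factory=list)  # what to do about it
```

A single cluster can implicate several agents at once, e.g. a truncation in one agent’s context assembly surfacing as accuracy loss in a downstream agent, which is exactly the cascade the workspace-level view exists to expose.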

As the Memory System matures, explainability extends to the Spectral Agent’s reasoning process itself. When a recommendation is informed by a persistent-tier semantic observation per ADR-018 (Spectral Agent’s three-tier memory), the customer should be able to see that: “This recommendation is informed by a pattern Spectral has observed across similar agent architectures.” The specific content of persistent-tier memory is available; the cross-workspace generalization happens via the world-signal event path to spectral.worlds per ADR-018, not by storing cross-workspace observations directly.

Related pages:

  • Event System — the substrate that carries verdict events, world signal events, and the AgentPerformanceCard publication path that makes explainability artifacts citable across surfaces.
  • Optimization Engine — the verdict pipeline and two-authority CompositeScore whose explainability surfaces this page describes.
  • Memory System — the three-tier observation lifecycle whose persistent-tier reasoning the customer-facing rationale taps into.
  • System Card — the AgentPerformanceCard and WorldModelCard artifacts that carry methodology disclosure alongside the verdict outcome.