Glossary

A reference for terms used across the Codex, ADRs, and source code. Where a concept has a canonical Codex page, the entry links there; this glossary is a fast lookup, not a substitute for the design pages.

If a term is missing here, it is either inline-defined where used or doctrinally subsumed by an entry below — open an issue if you hit something that should be added.


Anti-Corruption Layer. The consumer-side translation step that converts a producer’s typed event payload into a local model the consumer reasons over. Applied at every event consumer between contexts to prevent producer types from leaking into consumer code. See Contract Surfaces.
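
The translation step can be sketched as below. This is a minimal illustration, not the project's code: `FailureClusterDetectedPayload`, `ClusterSignal`, `to_cluster_signal`, and every field name are hypothetical stand-ins for a producer payload and a consumer-local model.

```python
from dataclasses import dataclass
from datetime import datetime


# Hypothetical producer-owned payload (stands in for a type published by the producer).
@dataclass(frozen=True)
class FailureClusterDetectedPayload:
    cluster_id: str
    workspace_id: str
    failure_count: int
    effect_size: float
    detected_at: str  # ISO-8601 string on the wire


# Hypothetical consumer-local model: only the fields the consumer reasons over.
@dataclass(frozen=True)
class ClusterSignal:
    cluster_id: str
    occurrences: int
    effect_size: float
    seen_at: datetime


def to_cluster_signal(payload: FailureClusterDetectedPayload) -> ClusterSignal:
    """Anti-corruption step: translate the producer type into the local model."""
    return ClusterSignal(
        cluster_id=payload.cluster_id,
        occurrences=payload.failure_count,
        effect_size=payload.effect_size,
        seen_at=datetime.fromisoformat(payload.detected_at),
    )
```

Consumer code downstream of `to_cluster_signal` only ever sees `ClusterSignal`, so a rename on the producer side is absorbed in one function.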

The framework-layer composition seam where all three agents (Spectral, World, Operations) run their LangGraph orchestrators. Tool dependencies that cross between contexts wire via DI here; each agent’s own code stays inside a single context. See Architecture.

The platform-side card surface (one per agent) summarizing a customer agent’s optimization history. Linked to a WorldModelCard via EvaluationAuthorityRef so scores attribute to a specific authority version. See System Card.

The externally-grounded source of truth a world model claims to mirror (e.g., IRS publications, statute excerpts, regulatory guidance). Authority is structural, not organizational — what makes a world model trusted is methodological disclosure, not a stamp.

The workspace-level execution policy for accepted ChangeSets — one of observe_only, recommend, manual, or bounded_auto. Distinct from the customer-trust Integration tier; see Optimization Engine — Autonomy mode vs integration tier.


A paired test in tests/contracts/ that imports both the producer’s typed event payload and the consumer’s local model and pins the wire shape via a round-trip check plus a syrupy snapshot. The only directory exempt from inter-context import discipline. See Contract Surfaces.
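
A rough sketch of the round-trip half of such a test follows, using only the standard library. The payload and local-model classes and their fields are hypothetical. The syrupy snapshot half is omitted; in a pytest suite it would typically be one more assertion against syrupy's `snapshot` fixture.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VerdictIssuedPayload:  # hypothetical producer payload
    scan_id: str
    outcome: str


@dataclass(frozen=True)
class LocalVerdict:  # hypothetical consumer-local model
    scan_id: str
    outcome: str

    @classmethod
    def from_wire(cls, raw: dict) -> "LocalVerdict":
        return cls(scan_id=raw["scan_id"], outcome=raw["outcome"])


def test_verdict_issued_round_trip() -> None:
    payload = VerdictIssuedPayload(scan_id="scan-1", outcome="go")
    # Serialize as the producer would, then parse as the consumer would.
    wire = json.loads(json.dumps(asdict(payload)))
    # Pin the wire shape: any added or renamed key fails here.
    assert set(wire) == {"scan_id", "outcome"}
    local = LocalVerdict.from_wire(wire)
    assert (local.scan_id, local.outcome) == (payload.scan_id, payload.outcome)
```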

A framework-layer tool authored under apps/* that wraps a callee context’s OHS Protocol and wires it into a caller agent’s tool list via DI in apps/workers. The ask_world_agent tool exposed to the Operations Agent is the canonical example. See Architecture.


A versioned bundle of optimization changes (prompt edits, tool changes, agent-config deltas) the customer can review and apply to their own agent. Output of the Optimize phase. See Optimization Engine.

The (world_model_score, rubric_score, blended_delta, convergence_delta, ...) tuple emitted by every phase that produces a score. Two-authority blending is by stimulus-weight at scan time, not workspace configuration. See Optimization Engine — CompositeScore.

The acceptance check at the enshrinement boundary: a RuleCandidate is enshrined into the world model only if it conforms to the world model’s structural and authoritative invariants. The integrity control between unconfirmed candidates and authoritative rules.

The data-classification taxonomy on every LLM call: PLATFORM (customer content), OPERATIONS (Spectral-operated reasoning), or SYNTHETIC (test-agent-generated content). Drives payload-stripping discipline at the OTel Collector. See Observability Stack.

One of Spectral’s two operator surfaces: the customer-facing platform and the internal-only Operations app, each with its own audience and authority direction. See Control planes.

One of the three code boundaries enforced by the architecture validator: spectral.core (substrate), spectral.worlds (world-modeling domain), spectral.platform (customer-facing platform). worlds and platform do not import each other; both depend on core. In steady-state docs we refer to contexts by name (worlds, platform, core); “context” appears as an architectural noun only when introducing the three or in doctrine discussion. See Architecture and ADR-031.


The integer stamp on every outbox row and NOTIFY message that gates which worker generation processes which event. Allocated atomically per deploy via INSERT INTO core.deployments RETURNING generation and exposed as SPECTRAL_GENERATION to each service. See Deployment topology.
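
As an illustration of how a worker might use the stamp, the sketch below reads `SPECTRAL_GENERATION` and compares it with a row's generation. The exact-match rule is an assumption made for the example; the actual gating policy is described on the Deployment topology page. `current_generation` and `should_process` are hypothetical helper names.

```python
import os
from typing import Mapping, Optional


def current_generation(env: Optional[Mapping[str, str]] = None) -> int:
    """Read the generation this service was deployed with."""
    source = os.environ if env is None else env
    return int(source["SPECTRAL_GENERATION"])


def should_process(row_generation: int, worker_generation: int) -> bool:
    # Assumption for this sketch: a worker only handles rows stamped
    # with its own generation and leaves all others untouched.
    return row_generation == worker_generation
```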

A subject area a world model covers (e.g., US individual tax preparation). Each domain has its own world model and is decomposed into problem spaces. See World Model.

The Operations-app parity principle: the operator UI and the Operations Agent are first-class occupants of the same surface, carry the same authority, and share the same audit trail. See Operations App.


A concrete probe instance generated from an EvaluationFramework for a specific scan request. Each EvalSet is statistically unique per scan (ADR-028). The agent runs against EvalSet stimuli during the Observe phase. Distinct from EvaluationFramework, which is the customer-directed parameterization.

“Evaluation” carries three distinct meanings in the Codex; context disambiguates:

  • Evaluation phase — one of the seven phases inside a Scan, where ScanTraces are scored by both authorities and a CompositeScore is produced.
  • EvaluationFramework — the customer-directed parameterization (rubric + dimensions + guidance) shaping how a scan evaluates the agent.
  • Evaluation run — a single Scan end-to-end (less precise; prefer “scan” in new prose).

A reference value linking a system card or scan result, in platform, to a specific authoritative world-model version owned by worlds. Carries metadata only — no content crosses between contexts through it (per ADR-030).

The customer-directed parameterization of evaluation: rubric, scoring guidance, dimension weights. Customer-steerable. The framework is the request shape; the EvalSet is the response. See Eval Generation.

The substrate envelope from spectral.core.events.envelope that wraps every published event with metadata (id, occurred_at, generation, correlation). Distinct from the producer-typed Published Payload it carries. See Event Substrate.
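
The envelope-versus-payload split might look roughly like the sketch below. The four metadata fields come from the entry above; their types, the defaults, and the generic `payload` slot are assumptions, and this class is not the one in `spectral.core.events.envelope`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Generic, TypeVar
from uuid import uuid4

P = TypeVar("P")  # the producer-typed Published Payload


@dataclass(frozen=True)
class EventEnvelope(Generic[P]):
    """Illustrative envelope: substrate metadata wrapped around a typed payload."""

    payload: P
    generation: int
    correlation: str
    id: str = field(default_factory=lambda: str(uuid4()))
    occurred_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```

Keeping the payload generic means the substrate never needs to know a producer's payload type, which is what lets the envelope stay in the substrate while payload modules stay producer-owned.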


A detected pattern of related failures across scans, surfaced by the Diagnose phase. Every detected cluster emits a platform.failure_cluster.detected event; the World Agent applies a consumer-side promotion-threshold filter (frequency, effect size, actionability) and seeds rule candidates in the World Model only for clusters that clear that higher bar.
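
A consumer-side filter of this kind could be sketched as follows. The three criteria are taken from the entry above; the class names, field names, and threshold values are hypothetical and chosen only to make the example concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterSignal:  # hypothetical consumer-local view of a detected cluster
    cluster_id: str
    frequency: int
    effect_size: float
    actionable: bool


@dataclass(frozen=True)
class PromotionThreshold:  # illustrative values, not the real bar
    min_frequency: int = 5
    min_effect_size: float = 0.2


def clears_promotion_bar(
    signal: ClusterSignal,
    threshold: PromotionThreshold = PromotionThreshold(),
) -> bool:
    """True only when every criterion is met; any other cluster is ignored."""
    return (
        signal.frequency >= threshold.min_frequency
        and signal.effect_size >= threshold.min_effect_size
        and signal.actionable
    )
```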


The customer-trust progression axis (Stage 1 / Stage 2 / Stage 3) describing how deeply Spectral is integrated into a customer’s workflow. Distinct from Autonomy mode, which is the workspace-level execution policy. See How Spectral Works.


JWT verification done in-process against the Supabase JWKS endpoint — signature, expiry, audience, and issuer checks — paired with a mirror-based revocation check. The default auth posture across FastAPI services and frontend Pages Functions. See Access Control.


The auth pattern for /version/detail and similar staff or CI surfaces: extract a key from Authorization: Bearer or X-API-Key, validate against an env-var registry, mint an internal JWT with a scoped issuer. See Deployment topology.
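
The extraction and validation steps can be sketched with the standard library alone. Treat the details as assumptions: the helper names are hypothetical, the registry is shown as an already-parsed mapping of key name to secret, `Authorization: Bearer` is given precedence over `X-API-Key` arbitrarily, and minting the internal JWT is left out.

```python
import hmac
from typing import Mapping, Optional


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Pull a key from 'Authorization: Bearer <key>' or 'X-API-Key: <key>'."""
    normalized = {name.lower(): value for name, value in headers.items()}
    scheme, _, token = normalized.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token.strip():
        return token.strip()
    key = normalized.get("x-api-key", "").strip()
    return key or None


def lookup_key(candidate: str, registry: Mapping[str, str]) -> Optional[str]:
    """Return the registered name for a key, comparing in constant time."""
    for name, known in registry.items():
        if hmac.compare_digest(candidate.encode(), known.encode()):
            return name
    return None
```

`hmac.compare_digest` is used instead of `==` so that the comparison time does not reveal how much of a guessed key matched.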


Open-Host Service Protocol. The callee-owned typed Protocol in <context>.contracts.protocols.* that defines what a callee context publishes to other contexts as a callable interface. Implementations live in the callee’s application layer; bridge tools wire these into caller agents via DI in apps/workers. See ADR-065 D3.
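
The relationship between an OHS Protocol, its implementation, and a bridge tool might be sketched like this. `WorldAgentQuery`, its `ask` method, `InMemoryWorldAgent`, and `make_ask_world_agent_tool` are hypothetical; only the `ask_world_agent` tool name comes from this glossary.

```python
from typing import Callable, Protocol


class WorldAgentQuery(Protocol):
    """Hypothetical callee-owned OHS Protocol."""

    def ask(self, question: str) -> str: ...


class InMemoryWorldAgent:
    """Stand-in for an implementation in the callee's application layer."""

    def ask(self, question: str) -> str:
        return f"stub answer to: {question}"


def make_ask_world_agent_tool(service: WorldAgentQuery) -> Callable[[str], str]:
    """Bridge tool factory: the caller agent receives an opaque callable."""

    def ask_world_agent(question: str) -> str:
        return service.ask(question)

    return ask_world_agent
```

In this sketch the composition root would call `make_ask_world_agent_tool(InMemoryWorldAgent())` and add the result to the caller agent's tool list, so the caller's own code never imports the callee's implementation.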

The Spectral-staff-only LLM agent that runs in apps/operations and drives world-model authoring, distillation, and publication. Sees an opaque ask_world_agent callable for reads of World Agent reasoning from worlds. Operator-keyed memory. See Operations Agent.

The internal-only operator console at apps/operations, served from ops.runspectral.com, where Spectral staff author, distill, and publish the world models the platform evaluates against. Distinct from the Operations Agent (the LLM agent inside the app) and the Operations team (the audience). See Operations App.

The Spectral-staff audience role for the Operations app — the humans who author and publish world models. Distinct from the Operations Agent (the LLM agent) and the Operations app (the surface).


The TanStack Start auth posture for staff frontends: server-side session, scope checks against OPERATIONS_SCOPES, and JWKS-local validation of the Supabase JWT. Used by apps/operations and the docs-codex Pages Function. See Frontend Architecture.

The test-agent parameterization axis that varies span output across (instrumentation framework × LLM-vendor span shape) cells. CI runs the diagonal slice per push; the full matrix runs nightly. See Test agents.

A sub-area inside a domain (e.g., dependents, filing status, deductions inside US tax prep). Operators decompose a domain into problem spaces during world-model authoring. See Problem Spaces.

The source-strength taxonomy on every world-model claim. Four tiers, strongest to weakest: Authoritative (canonical sources like IRS publications), Curated (operator-validated derivations), Distilled (LLM-derived from authoritative sources with operator review), Observed (signals from production scans). The claim that holds uniformly across tiers is “established and governed before evaluating your system” — not “derived independently of AI.”
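
One way to make the strongest-to-weakest ordering usable in code is an ordered enum, sketched below. The tier names are from the entry above; the integer values and the `strongest_first` helper are assumptions for illustration.

```python
from enum import IntEnum
from typing import Iterable, List


class ProvenanceTier(IntEnum):
    # Higher value = stronger source (the numbers themselves are arbitrary).
    OBSERVED = 1
    DISTILLED = 2
    CURATED = 3
    AUTHORITATIVE = 4


def strongest_first(tiers: Iterable[ProvenanceTier]) -> List[ProvenanceTier]:
    """Order tiers from strongest to weakest source strength."""
    return sorted(tiers, reverse=True)
```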

The producer-owned typed event payload module in <context>.contracts.events.*. Sole source of truth for the wire shape of an event between contexts; auto-generates Codex documentation and is consumed via consumer-side ACL. See ADR-065 D2.

An enum identifying the purpose of an LLM call (scoring, detection, reasoning, world_agent, etc.). Drives quota accounting and content-class resolution at the composition root. See LLM Platform.


The doctrine that canonical activity records (InterventionLog, RegressionRecord, RubricDivergenceRecord, FailureCluster, FeedbackSignal) are produced by system functions; agent memory holds the agent’s reasoning about those records. Records hold what the system did; memory holds how the agent reasoned about it. See Memory System.

A Render-managed shared secrets group, one per environment (spectral-staging-runtime, spectral-production-runtime), read by every service in that environment at startup. Distinct from per-service env vars, which carry code-coupled values atomic with deploy. See Secrets management.

The scoring contract inside an EvaluationFramework: dimensions, weights, hard-constraint floors. Consumed by the rubric scorer during the Evaluate phase. Customer-steerable.

A single enshrined claim within a world model. Each rule has a status (Candidate / Provisional / Pending_approval / Enshrined / Retired) and a provenance tier. See World Model.

A proposed rule under review, not yet enshrined. Surfaced via the Evolution Loop from world signals (failure clusters, promoted memory observations). See Evolution Loop.


Distilled trace data — the input units to the scan pipeline. A Sample is one unit of agent behavior; a Sample Set is a curated group exercised together. See System Design — Overview.

A single end-to-end optimization-engine run for a workspace’s agent — observe, calibrate, diagnose, evaluate, optimize, safety-check, verdict. Produces a Verdict and (when warranted) a ChangeSet. Distinct from the conversational evaluation phase inside the pipeline. See Optimization Engine.

The trace record of one stimulus-and-response within a scan, carrying provenance fields for which stimulus source produced it (world-model-grounded, customer-directed probe, mutation). Distinct from OtelTrace: OtelTrace is the inbound OTEL span the customer sends to the ingest endpoint; ScanTrace is what the optimization pipeline produces internally during evaluation. See Optimization Engine — Observe phase.

Spectral’s deployment posture relative to the customer’s agent system: alongside the customer’s runtime, never in the request hot path. The customer’s agent calls Spectral; Spectral never proxies the customer’s traffic. See How Spectral Works.

The customer-facing LLM agent in spectral.platform. Explains scan verdicts, asks for HITL approval before mutations, surfaces optimization recommendations. Reactive to verdict.issued, approval.required, and supervisor.recommendation.issued events. See Agent Architecture.

The combined card surface published with each world-model version: a WorldModelCard (methodology, scope, authority disclosure) plus an AgentPerformanceCard rendered in platform. Authority is structural, not organizational. See System Card.

The platform-internal projection of a published world-model version’s authority metadata, materialized from the worlds publish event so the platform can render system-card attributions without reaching back across the context boundary. See System Card.


T1 / T2 / T3 memory (interaction / session / persistent)

The three-tier agent memory model. T1 = interaction-scoped (single LLM turn); T2 = session-scoped (single conversation or operator session); T3 = persistent (across sessions; may cross between contexts via memory-to-worlds events). The numbered tiers are the Spectral Agent’s parameterization of the universal interaction/session/persistent lifecycle, not a separate tier model. See Memory System.

The four-class agent-tool error model — ToolUserError, ToolPolicyError, ToolTransientError, ToolTerminalError — that surfaces tool failures to the LLM rather than retry middleware, letting the model decide retry / modify / surface / abandon. See Agent Tool Invocation.
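
A minimal sketch of surfacing failures to the model instead of retrying in middleware is shown below. The four class names are from the entry above; the shared `ToolError` base class, the `run_tool` wrapper, and the result-dict shape are assumptions for the example.

```python
from typing import Any, Callable, Dict


class ToolError(Exception):
    """Hypothetical common base for the four error classes."""


class ToolUserError(ToolError): ...
class ToolPolicyError(ToolError): ...
class ToolTransientError(ToolError): ...
class ToolTerminalError(ToolError): ...


def run_tool(tool: Callable[..., Any], *args: Any) -> Dict[str, Any]:
    """Invoke a tool once and hand any classified failure back to the LLM."""
    try:
        return {"ok": True, "result": tool(*args)}
    except ToolError as exc:
        # No retry here: the model sees the class and message and decides
        # whether to retry, modify the call, surface it, or abandon.
        return {
            "ok": False,
            "error_class": type(exc).__name__,
            "message": str(exc),
        }
```

Exceptions outside the `ToolError` hierarchy are deliberately not caught in this sketch, so unclassified bugs still fail loudly.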

The comparative scoring mechanism (per ADR-020) used inside the Optimize phase to choose among candidate ChangeSet mutations. Two-authority composite scoring keeps tournaments grounded.

An OTEL trace ingested from a customer’s running agent. Permanent record (OtelTrace in the domain model); never modified after ingestion.

The parallel scoring discipline that evaluates every scan against both the World Model (authoritative correctness) and the Rubric (customer-steerable preference) and blends the result via CompositeScore. See Optimization Engine.


The pass/fail outcome of a scan, emitted by the Verdict phase. Outcomes: go / caution / nogo / observe_only. Triggers verdict.issued and (when autonomy mode warrants) approval.required events. See Optimization Engine — Verdict.


A customer’s organizational unit. RLS-scoped; one workspace = one isolated data scope for customer content. The unit of agent observation, evaluation, and ChangeSet authoring.

The customer’s own LLM agent under observation. Spectral evaluates this agent, never replaces it. Spectral’s role is sidecar, never in the hot path.

The internal-only LLM agent in spectral.worlds, one per world model. Read-oriented — proposes rule candidates through the Evolution Loop; never mutates directly. Customers cannot reach it; operators reach it via the Operations app or via the ask_world_agent tool. See World Agent.

The authoritative, externally-grounded standard for a domain — the customer’s “map” of the territory their AI agent operates in. Composed of rules, problem spaces, source materials, and provenance attribution. See World Model.

An immutable snapshot of a world model. Versions are the unit of authority: agents are evaluated against a specific version, not a moving target. Version transitions are not coordinated between platform and worlds (per ADR-030).

A platform-emitted event carrying a generalized signal — failure cluster, promoted memory observation — from a customer scan back to worlds as input to the Evolution Loop. The customer-reality → operator-authority direction of the feedback loop. See Evolution Loop.

The methodology disclosure published with each world-model version: scope, sources, decision log, authority basis. Operator-authored, not auto-generated. See System Card.