Customer

Dashboard Reads

Generated

This page is generated from qa/customer/specs/dashboard-reads.md — the source of truth. Edit the spec, not this page.

Last run: not yet recorded (run the replay suite to populate status).

Overview

The customer dashboard presents the deployed world model and its decisions. Sign-in lands on the All-workspaces portfolio at /; entering a domain (via the chrome’s workspace dropdown) reveals the four section tabs — Overview, Decisions, Rules, Trust — each a read-only projection of the customer’s tenant data. The version-scoped Ruleset Card is reached at /world-model-card (no tab; linked from the Trust version-history timeline) and carries the action registry. Every surface carries the non-dismissible alpha-preview footer. This spec verifies the landing, the domain entry, and that each surface renders its read.
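The surfaces and paths named above can be summarised as a small route table. This is a minimal sketch, not the dashboard's actual routing code: the `SURFACES` constant, the `hasTab` flag, and the `visibleTabs` helper are hypothetical names introduced only for illustration. The paths themselves are taken from this spec.

```typescript
// Hypothetical route table assembled from the paths named in this spec.
type Surface = "Overview" | "Decisions" | "Rules" | "Trust" | "Ruleset Card";

interface SurfaceRoute {
  path: string;
  hasTab: boolean; // false = reachable by link or deep link only
}

const SURFACES: Record<Surface, SurfaceRoute> = {
  Overview: { path: "/", hasTab: true },
  Decisions: { path: "/decisions", hasTab: true },
  Rules: { path: "/world-model", hasTab: true },
  Trust: { path: "/system-card", hasTab: true },
  "Ruleset Card": { path: "/world-model-card", hasTab: false },
};

// Tabs appear only after a domain has been entered; the portfolio shows none.
function visibleTabs(insideDomain: boolean): Surface[] {
  if (!insideDomain) {
    return [];
  }
  return (Object.keys(SURFACES) as Surface[]).filter(
    (surface) => SURFACES[surface].hasTab
  );
}
```

Under these assumptions, `visibleTabs(true)` yields the four section tabs and leaves out the Ruleset Card, which matches the "no tab" wording in the overview.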

Preconditions

  • Signed in as the seeded QA customer (see CUST-AUTH-GATING scenario 2).
  • The customer’s domain is bound to a deployed world model version (seeded; produced by the operator journey + seed_customer_fixture).

Scenarios

1. Sign-in lands on the All-workspaces portfolio

  • After sign-in, stay on /
  • Expected: The portfolio landing renders with a card per member workspace; no section tabs are shown.

2. Entering a domain reveals the section tabs

  • From the portfolio, pick the first domain in the chrome’s workspace dropdown
  • Expected: The tab nav shows Overview, Decisions, Rules, and Trust — and no Actions tab.

3. Overview renders

  • Inside a domain, open the Overview tab (/)
  • Expected: The overview surface renders its content (not a blank page).

4. Decisions read renders

  • Open the Decisions tab (/decisions)
  • Expected: The decisions surface renders the recent-decisions read (a feed, or an explicit empty state).

5. Rules read renders

  • Open the Rules tab (/world-model)
  • Expected: The surface renders the deployed version’s content (version metadata + rule corpus), or an explicit empty state.

6. Trust read renders

  • Open the Trust tab (/system-card)
  • Expected: The trust surface renders the deployment-scoped operational record, or an explicit empty state.
  • Navigate directly to /world-model-card
  • Expected: The Ruleset Card renders the version-scoped methodology record — authority basis, where the rules come from, the action registry, and gates passed — or an explicit empty state. Reachable as a deep link even though it has no tab.
  • Visit the portfolio and each tab in turn
  • Expected: The non-dismissible alpha-preview footer is present on every surface.

Test Data

Label | Value | Notes
Deployed world | (bound via seed_customer_fixture) | Customer domain → deployed version

Overview surface

Self-contained scenario block appended for the redesigned domain Overview: the live eyebrow, the trust metrics, the attention banner (held + blocked), the “How decisions resolved” outcome bar + per-state grid, and the recent-decisions feed that deep-links into a decision. Verified inside a domain (the QA customer’s seeded domain is bound to a deployed version with recorded decisions).

O1. Overview leads with the live eyebrow and trust metrics

  • Inside a domain, open the Overview tab (/)
  • Expected: The “Spectral is live for this workspace” eyebrow renders, and the trust metrics block shows Decisions made, Active ruleset, and Rules in force.

O2. The “How decisions resolved” breakdown renders

  • On the Overview, find the outcome breakdown
  • Expected: The four-segment outcome bar and the per-state grid render, each state labelled with its four-state human label (Proceed / Hold / Blocked / No action needed) — not color alone — with a count and percent. The header carries the real producer window label (“Last 7 days” — the windowed statistics default). Each per-state grid cell carries a producer-derived percent sub-line ending in % — the percent is read off the System Card, never recomputed in the surface.
  • On the Overview, find the recent-decisions feed
  • Expected: The feed renders (or an explicit empty state); each row links to /decisions?decision_id=…, and clicking a row opens that decision’s deep-dive.
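The O2 expectations can be expressed as two small checks: one for a per-state grid cell and one for the feed row link. This is a hedged sketch; `OutcomeCell`, `isValidOutcomeCell`, and `decisionDeepLink` are hypothetical names, and the exact text of the percent sub-line is an assumption (the spec only says it ends in %). The percent is treated as an opaque string here because the spec says it is read off the System Card and never recomputed in the surface.

```typescript
// The four human labels named in this spec.
const OUTCOME_LABELS = ["Proceed", "Hold", "Blocked", "No action needed"];

// Hypothetical shape of one per-state grid cell as read from the page.
interface OutcomeCell {
  label: string;
  count: number;
  percentLine: string; // producer-derived text, e.g. "25%" (format assumed)
}

function isValidOutcomeCell(cell: OutcomeCell): boolean {
  const knownLabel = OUTCOME_LABELS.indexOf(cell.label) !== -1;
  const countOk = Number.isInteger(cell.count) && cell.count >= 0;
  // Only checks that the sub-line ends in a digit followed by "%".
  const percentOk = /\d\s?%$/.test(cell.percentLine.trim());
  return knownLabel && countOk && percentOk;
}

// Builds the feed row link in the /decisions?decision_id=… form.
function decisionDeepLink(decisionId: string): string {
  return `/decisions?decision_id=${encodeURIComponent(decisionId)}`;
}
```

A cell labelled with a colour name, or one whose sub-line lacks the trailing %, fails the check, which mirrors the "not color alone" and "ending in %" requirements.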

Scenario block — “Rules” surface

The re-skinned “Rules” tab (/world-model) presents the active ruleset as customer-facing transparency: the active ruleset version, a “where these rules come from” source breakdown, and the rules themselves as a single plain-language flat list (no problem-space grouping at v0). Each rule shows a human importance label (off the severity tier — not the raw t1/t2 code), and a source citation where one is projected. The surface is read-only and leads with human vocabulary (“Rules”, “Active ruleset”) — never the internal “world model”/“enshrine” terms. These scenarios extend CUST-DASHBOARD-READS scenario 5.

R1. Rules surface leads with the active-ruleset version

  • Inside the deployed domain, open the Rules tab (/world-model)
  • Expected: The “Active ruleset” card renders the version (e.g. v…) and a rules-in-force count; the page title reads “Rules”, not “world model”.

R2. Rules render as a flat plain-language list with importance + source

  • On the Rules tab, inspect the rules list
  • Expected: A flat list (rule-list) renders one row per rule, each with a plain statement, a human importance label (Critical / Standard / Minor — not a raw t1/t2/t3 code), and — where projected — a source citation. No problem-space group headings appear.
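The projection from severity tier to human importance label in R2 might look like the sketch below. The pairing t1 → Critical, t2 → Standard, t3 → Minor is an assumption inferred from the order in which the spec lists the codes and labels; the spec does not state the mapping explicitly. `IMPORTANCE_BY_TIER`, `RuleRow`, and `toRuleRow` are hypothetical names.

```typescript
type SeverityTier = "t1" | "t2" | "t3";
type ImportanceLabel = "Critical" | "Standard" | "Minor";

// Assumed pairing; the spec lists both sets but not how they correspond.
const IMPORTANCE_BY_TIER: Record<SeverityTier, ImportanceLabel> = {
  t1: "Critical",
  t2: "Standard",
  t3: "Minor",
};

// One row of the flat rule list: no group heading, no raw tier code.
interface RuleRow {
  statement: string;
  importance: ImportanceLabel;
  source?: string; // present only where a citation is projected
}

function toRuleRow(rule: {
  statement: string;
  tier: SeverityTier;
  source?: string | null;
}): RuleRow {
  const row: RuleRow = {
    statement: rule.statement,
    importance: IMPORTANCE_BY_TIER[rule.tier],
  };
  if (rule.source) {
    row.source = rule.source;
  }
  return row;
}
```

The row type carries no tier field, so a raw t1/t2/t3 code cannot reach the rendered list by accident.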

R3. Rules surface is read-only

  • On the Rules tab, inspect the surface below the shell header
  • Expected: No edit affordances (no inputs, textareas, or action buttons) appear in the surface content.

Trust (System Card): deployment-scoped operational record

Appended self-contained block. The Trust tab is the System Card: the deployment-scoped operational record per ADR-082. It establishes the credibility of the operation — the deterministic-decision guarantee, how a decision is made, the trust stats, the live status distribution, the version-history timeline (each entry links to that version’s Ruleset Card via /world-model-card?version=N), the release notes, and a System Card export. Provenance / methodology / gate composition live on the World Model Card, NOT here — ADR-082 rejected merging the two cards.

T1. The decision guarantee and pipeline render

  • Inside a domain, open the Trust tab (/system-card)
  • Expected: The deterministic-decision guarantee (“No language model is in the decision path.”) and the four-step “How a decision is made” pipeline both render.

T2. The live status distribution renders in the four-state language

  • On the Trust tab
  • Expected: The “How decisions resolved” block renders the operational decision record — a total plus the four-state outcomes (Proceed / Hold / Blocked / No action needed) by shape + label + color. Each per-outcome stat carries a producer-derived percent sub-line ending in %, and the block is window-labelled (“Last 7 days” — the windowed statistics default).

T3. Version history links to the Ruleset Card

  • On the Trust tab, locate the ruleset version history
  • Expected: Each version entry is a link to that version’s Ruleset Card (/world-model-card?version=N); following the active-version link reaches the Ruleset Card surface.

T4. Provenance / methodology is NOT on Trust (ADR-082 split)

  • On the Trust tab
  • Expected: No methodology disclosure, restatement (“What changed”) history, or “Where these rules come from” provenance block appears — that content lives on the Ruleset Card surface.

T5. The per-action breakdown renders, window-labelled

  • On the Trust tab, locate the “By action” section
  • Expected: A window-labelled “By action” section renders. Because the seed fires real /decide calls (a GREEN and a YELLOW on the review_income action), at least one action row renders with its total and the four per-state outcome counts (Proceed / No action needed / Hold / Blocked, each by shape + label, never color alone); if no action has been recorded in the window, the explicit empty state renders instead.
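The shape of a "By action" row in T5 (a total plus four per-state counts) can be sketched as a simple aggregation. This is an illustration of the row shape only, under the assumption that each recorded decision carries an action name and one of the four outcome labels; `DecisionRecord`, `ActionBreakdown`, and `byAction` are hypothetical names, and the real breakdown is produced upstream of the surface.

```typescript
type Outcome = "Proceed" | "Hold" | "Blocked" | "No action needed";

// Hypothetical minimal record of one decision in the window.
interface DecisionRecord {
  action: string;
  outcome: Outcome;
}

interface ActionBreakdown {
  action: string;
  total: number;
  counts: Record<Outcome, number>;
}

// Groups decisions by action and counts each of the four outcomes.
function byAction(decisions: DecisionRecord[]): ActionBreakdown[] {
  const rows = new Map<string, ActionBreakdown>();
  for (const decision of decisions) {
    let row = rows.get(decision.action);
    if (!row) {
      row = {
        action: decision.action,
        total: 0,
        counts: { Proceed: 0, Hold: 0, Blocked: 0, "No action needed": 0 },
      };
      rows.set(decision.action, row);
    }
    row.total += 1;
    row.counts[decision.outcome] += 1;
  }
  // An empty array corresponds to the explicit empty state.
  return Array.from(rows.values());
}
```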

T6. Decision latency renders honestly

  • On the Trust tab, locate the “Decision latency” section
  • Expected: A window-labelled “Decision latency” section renders. Because the seeded /decide calls each persist a measured latency, the p50/p95/p99 percentile line renders; when no decision in the window carries a recorded latency the section says so honestly (“Decision latency is not yet reported for this window.”) — never a fabricated 0.
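The "never a fabricated 0" rule in T6 comes down to returning no summary when no latency was recorded, rather than defaulting to zero. The sketch below uses the nearest-rank percentile method, which is an assumption: the spec does not say how the producer computes p50/p95/p99. `summarizeLatency`, `latencyLine`, and the wording of the percentile line are hypothetical; only the empty-state sentence is quoted from the spec.

```typescript
interface LatencySummary {
  p50: number;
  p95: number;
  p99: number;
}

// Nearest-rank percentile over an ascending, non-empty array.
function nearestRank(sortedAsc: number[], percentile: number): number {
  const rank = Math.ceil((percentile / 100) * sortedAsc.length);
  return sortedAsc[Math.max(rank, 1) - 1];
}

// Returns null, not zeros, when no decision carries a recorded latency.
function summarizeLatency(samplesMs: Array<number | null>): LatencySummary | null {
  const recorded = samplesMs
    .filter((value): value is number => value !== null)
    .sort((a, b) => a - b);
  if (recorded.length === 0) {
    return null;
  }
  return {
    p50: nearestRank(recorded, 50),
    p95: nearestRank(recorded, 95),
    p99: nearestRank(recorded, 99),
  };
}

function latencyLine(summary: LatencySummary | null): string {
  if (summary === null) {
    return "Decision latency is not yet reported for this window.";
  }
  // Line format is illustrative only.
  return `p50 ${summary.p50} ms, p95 ${summary.p95} ms, p99 ${summary.p99} ms`;
}
```

Modelling the missing case as `null` forces the rendering code to handle it explicitly, so a zero can only appear if a zero was actually measured.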

T7. The System Card export control is present and triggers a download

  • On the Trust tab, locate the “Export the System Card” section
  • Expected: A “Download System Card” control is present. Clicking it triggers the offline-artifact download (the browser download event fires) — the offline System Card with an embedded Ruleset Card snapshot — and no export error is surfaced.

Scenario block — Decisions: log, deep-dive, feedback (/622/625)

Self-contained block appended for the redesigned Decisions surface (the searchable feed, the trust-payload deep-dive, and the customer’s write actions). It drives the same seeded QA customer + deployed domain as the scenarios above.

D1. The decisions log offers search + outcome filter chips

  • Enter the first domain, then open the Decisions tab (/decisions)
  • Expected: A search field and the outcome filter chips (All · Proceed · Hold · Blocked · No action needed) render above the recent-decisions feed.

D2. The feed lists decisions with a four-state outcome

  • On /decisions, look at the recent-decisions table
  • Expected: Each row shows the action and the four-state outcome pill (shape + label + color), or an explicit empty state if the domain has no decisions yet. No raw decision UUID is shown as a column.
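The interaction between the D1 controls and the D2 feed can be sketched as a pure filter function. This assumes the search field matches against the action name, which the spec does not specify; `DecisionRow` and `filterFeed` are hypothetical names.

```typescript
type Outcome = "Proceed" | "Hold" | "Blocked" | "No action needed";
type Chip = "All" | Outcome;

// Hypothetical feed row: no raw decision UUID column, per D2.
interface DecisionRow {
  action: string;
  outcome: Outcome;
}

// Applies the selected chip and a case-insensitive search on the action name.
function filterFeed(rows: DecisionRow[], chip: Chip, search: string): DecisionRow[] {
  const needle = search.trim().toLowerCase();
  return rows.filter((row) => {
    const chipMatches = chip === "All" || row.outcome === chip;
    const searchMatches =
      needle === "" || row.action.toLowerCase().indexOf(needle) !== -1;
    return chipMatches && searchMatches;
  });
}
```

An empty result here corresponds to the explicit empty state, not to a blank table.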

D3. Opening a decision shows the trust-payload deep-dive

  • Click “View detail” on the first decision in the feed
  • Expected: The deep-dive renders the outcome banner, “What the agent was told to do”, “Inputs considered”, and “Which rules applied — and why” — not a raw JSON dump.

D4. The technical detail is tucked into a collapsible block

  • On the deep-dive, find the “Technical detail” disclosure
  • Expected: A collapsible “Technical detail” block is present (collapsed by default) and, when expanded, names the ruleset and its deterministic basis. The “no language model in the decision path” guarantee now lives only on the Trust / System Card, not per-decision (/ RD-9).

D5. The deep-dive exposes the two write actions

  • On the deep-dive, look at the feedback section (“Trust this decision?”)
  • Expected: A “Mark noteworthy” button and a “Request a review” form (with a reason field) are present.

D6. Requesting a review posts and confirms honestly

  • Enter a reason and submit “Request a review”
  • Expected: The request posts successfully (a submitted confirmation appears) and the sent signal is listed with an honest “Sent to Spectral’s policy team — they’ll follow up” line — never a fabricated resolved/closed status.

D7. The feed and deep-dive name the calling agent, masking a UUID principal

  • On /decisions, look at the feed’s Agent column; then open a decision’s deep-dive
  • Expected: Each feed row carries an Agent cell, and the deep-dive carries an “agent” line. The seeded decisions are recorded against a UUID-shaped principal (the decision API key’s user), so the displayed agent is masked to a short handle with the full value in the cell’s title — never a raw 36-character UUID rendered as visible text. A null/absent principal reads as an em dash.
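The masking behaviour described in D7 can be sketched as follows. The handle format (the first eight characters of the UUID) is an assumption, since the spec only says "a short handle"; `agentCell` and `AgentCell` are hypothetical names.

```typescript
// Matches the canonical 36-character hyphenated UUID form.
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
const EM_DASH = "\u2014";

interface AgentCell {
  text: string; // what is rendered as visible text
  title?: string; // full value, exposed through the cell's title
}

function agentCell(principal: string | null | undefined): AgentCell {
  // A null or absent principal reads as an em dash.
  if (principal === null || principal === undefined) {
    return { text: EM_DASH };
  }
  // A UUID-shaped principal is masked; the full value moves to the title.
  if (UUID_RE.test(principal)) {
    return { text: principal.slice(0, 8), title: principal };
  }
  // Any other principal is assumed safe to show as-is.
  return { text: principal };
}
```

A replay assertion could then check that no visible agent text matches `UUID_RE`, while the title of a masked cell still does.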

Scenario block — My flagged decisions

Self-contained block for the “My flagged decisions” module on the Decisions surface (decisions-flagged.tsx). It lists the customer’s own flagged decisions with their feedback lifecycle status (Submitted → Under review → Resolved) and deep-links each row into the decision. The seed fires decisions but submits no flag by default, so the section renders its explicit empty state until a “Request a review” is submitted; F2 drives that submission through the D6 flow, then asserts the row appears.

F1. The Decisions surface renders the “My flagged decisions” section

  • Enter the first domain, then open the Decisions tab (/decisions)
  • Expected: A “My flagged decisions” section renders. With no flag submitted it shows the explicit empty state (“You have not flagged any decisions for review yet.”); once a review has been submitted (D6 / F2 persist a flag to the shared fixture) the section lists the lifecycle rows instead — either is correct.

F2. Submitting a review flags the decision with a lifecycle status pill

  • Open the first decision’s deep-dive, submit “Request a review” with a reason, then return to the Decisions surface and re-read the flagged section
  • Expected: A flagged row appears for the decision, carrying a lifecycle status pill reading one of Submitted / Under review / Resolved (a freshly submitted flag reads “Submitted”), and the row deep-links into the decision (“View decision”). No raw decision UUID is shown as visible column text.
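The F1 and F2 expectations can be modelled with a small lifecycle type. This is a sketch of the state the module displays, not the contents of decisions-flagged.tsx: `FlaggedRow`, `submitFlag`, and `flaggedSectionLines` are hypothetical names, and the row text layout is illustrative.

```typescript
// The three lifecycle statuses named in this spec, in order.
const LIFECYCLE = ["Submitted", "Under review", "Resolved"] as const;
type LifecycleStatus = (typeof LIFECYCLE)[number];

interface FlaggedRow {
  decisionId: string; // used for the deep link only, never shown as text
  status: LifecycleStatus;
}

// A freshly submitted flag always starts at "Submitted".
function submitFlag(rows: FlaggedRow[], decisionId: string): FlaggedRow[] {
  return [...rows, { decisionId, status: LIFECYCLE[0] }];
}

// Visible lines of the section: the empty state, or one line per flag.
function flaggedSectionLines(rows: FlaggedRow[]): string[] {
  if (rows.length === 0) {
    return ["You have not flagged any decisions for review yet."];
  }
  return rows.map((row) => `${row.status} | View decision`);
}
```

Because the visible line is built from the status and a fixed link label, the decision identifier never appears as column text.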