Decisions

ADR-023: Holdout strategy

Status: Accepted (2026-04-20)

Source: migrated from planning/swms-decisions.md ADR-033 as part of SPEC-270.

Context

Holdout serves two distinct purposes that are easy to conflate.

First, it detects whether performance on seen instances tracks genuine rule conformance or surface pattern matching on the generation distribution — an instance-level concern.

Second, it detects whether an agent that conforms to a region of the domain can handle a specific rule it was not explicitly trained against — a rule-level concern that is only meaningful when the visible rule set covers the surrounding territory densely enough for the held-out rule to be a reasonable generalization target.

Applying a single holdout policy to both concerns conflates them and produces weak signals for both. The design interview resolved this by defining a two-layer strategy.

Decision

Holdout is structured as two layers: instance-level universal holdout and rule-level conditional holdout gated on peer coverage. Sample hashing with embedding-based semantic similarity prevents generation-time leakage into the optimization set. Holdout configuration is managed, not static.

Instance-level holdout

Applied universally across all rules.
A fraction of generated instances per rule is reserved from the optimization loop.
Detects whether performance on seen instances tracks genuine rule conformance or surface pattern matching on the generation distribution.
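
One way to reserve a stable fraction of instances per rule is to bucket each instance by a hash of its identifiers. The sketch below is illustrative only: the function names, the salt, and the default fraction of 0.1 are assumptions, not values fixed by this ADR.

```python
import hashlib


def is_instance_holdout(rule_id: str, instance_id: str,
                        fraction: float = 0.1,
                        salt: str = "holdout-v1") -> bool:
    """Return True if this instance falls in the reserved holdout bucket.

    The assignment is deterministic: the same (salt, rule, instance)
    triple always lands on the same side of the split.
    All names and defaults here are hypothetical.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be in [0, 1]")
    key = f"{salt}:{rule_id}:{instance_id}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Interpret the first 8 bytes as an integer in [0, 2**64) and
    # compare against the fraction scaled to the same range.
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < fraction * 2**64


def split_instances(rule_id: str, instance_ids, fraction: float = 0.1):
    """Partition a rule's instances into (optimization, holdout) lists."""
    optimization, holdout = [], []
    for instance_id in instance_ids:
        if is_instance_holdout(rule_id, instance_id, fraction):
            holdout.append(instance_id)
        else:
            optimization.append(instance_id)
    return optimization, holdout
```

Because the split depends only on identifiers, regenerating or reordering instances does not move an instance across the boundary. The reserved share is approximate for small instance counts.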

Rule-level holdout

Applied selectively — only to rules with sufficient peer coverage.
A rule is eligible for rule-level holdout only when the visible rule set covers its domain territory densely enough that an agent genuinely conforming to the domain should be able to handle the held-out rule without having seen it specifically.
Peer coverage is determined by coverage density assessment. At launch this is driven by domain topic tags and reviewer judgment. As the world model matures, embedding-based clustering takes over as the coverage density signal.
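
The launch-time eligibility check could look roughly like the sketch below. It assumes that a "peer" is any other visible rule sharing at least one topic tag, that a minimum peer count stands in for coverage density, and that reviewer sign-off is required in addition to the tag signal. None of these specifics (the `Rule` type, `min_peers`, how tags and reviewer judgment combine) are defined by this ADR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    # Hypothetical minimal rule record for illustration.
    rule_id: str
    topic_tags: frozenset


def peer_count(rule: Rule, visible_rules, min_shared_tags: int = 1) -> int:
    """Count visible rules that share enough topic tags with `rule`."""
    return sum(
        1
        for other in visible_rules
        if other.rule_id != rule.rule_id
        and len(rule.topic_tags & other.topic_tags) >= min_shared_tags
    )


def eligible_for_rule_holdout(rule: Rule, visible_rules,
                              reviewer_approved: bool,
                              min_peers: int = 3) -> bool:
    """Assumed gate: enough tagged peers AND explicit reviewer approval."""
    return reviewer_approved and peer_count(rule, visible_rules) >= min_peers
```

When embedding-based clustering takes over, only `peer_count` would need a different implementation; the gate itself keeps the same shape.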

Sample hashing

Generated instances are hashed and compared against the holdout registry before release to the optimization loop.
Matches and near-matches are suppressed and regenerated.
Comparison operates at the semantic level: embedding similarity is checked against a configurable threshold, rather than relying on exact token-level matching.
The same embedding infrastructure serves both peer coverage assessment and holdout hash comparison.
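
The suppression step can be sketched as a threshold check on cosine similarity between a candidate's embedding and every embedding in the holdout registry. This is a minimal illustration under stated assumptions: embeddings are taken as precomputed vectors, cosine similarity is an assumed choice of metric, the 0.9 threshold is a placeholder, and the linear scan would likely be replaced by an index at scale.

```python
import math


def cosine_similarity(a, b) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def leaks_into_holdout(candidate, holdout_registry,
                       threshold: float = 0.9) -> bool:
    """True if the candidate matches or nearly matches any held-out instance."""
    return any(
        cosine_similarity(candidate, held) >= threshold
        for held in holdout_registry
    )


def filter_candidates(candidates, holdout_registry, threshold: float = 0.9):
    """Split candidate embeddings into (released, suppressed).

    Suppressed candidates are the ones that would be regenerated.
    """
    released, suppressed = [], []
    for candidate in candidates:
        if leaks_into_holdout(candidate, holdout_registry, threshold):
            suppressed.append(candidate)
        else:
            released.append(candidate)
    return released, suppressed
```

Lowering the threshold suppresses more near-matches at the cost of rejecting legitimately distinct instances, which is why it is exposed as configuration.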

Managed configuration

Holdout configuration is reviewed and updated as world model coverage changes. It is not a one-time setting.
As peer coverage density increases, the eligible set for rule-level holdout expands. As new rules enter with thin peer coverage, they receive instance-level holdout only until density grows around them.
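
A managed configuration could be modeled as a versioned record that is re-derived whenever coverage changes. The sketch below is an assumption about shape only: the `HoldoutConfig` fields, the `min_peers` threshold, and the use of a per-rule peer count as the density signal are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class HoldoutConfig:
    # Hypothetical configuration record; field names are illustrative.
    version: int = 0
    instance_fraction: float = 0.1
    min_peers: int = 3
    rule_level_eligible: frozenset = frozenset()


def refresh(config: HoldoutConfig, peer_counts: dict) -> HoldoutConfig:
    """Recompute the rule-level eligible set from current peer counts.

    `peer_counts` maps rule id -> number of peers in the visible rule set.
    Every rule keeps instance-level holdout regardless; only rule-level
    eligibility is re-derived. The version bumps only on a real change.
    """
    eligible = frozenset(
        rule_id
        for rule_id, count in peer_counts.items()
        if count >= config.min_peers
    )
    if eligible == config.rule_level_eligible:
        return config
    return replace(config,
                   version=config.version + 1,
                   rule_level_eligible=eligible)
```

A rule that enters with thin coverage simply stays out of `rule_level_eligible` until a later refresh sees enough peers around it.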

Consequences

The holdout system has two distinct code paths and two distinct eligibility checks, not one unified policy. The instance-level path applies to every rule; the rule-level path gates on coverage density.
Embedding-based infrastructure is shared between coverage density assessment and holdout similarity matching. This consolidates the model footprint and ensures the two signals use the same notion of semantic proximity.
Rule-level holdout eligibility changes over time as world model coverage grows. The holdout registry and eligibility assessment are live artifacts, not fixed at launch.
At launch, rule-level holdout eligibility is driven by topic tags plus reviewer judgment. Embedding-based clustering displaces this as it matures; the transition does not require a holdout redesign because the managed configuration envelope accommodates it.
The earlier world-model-system-internal decision that holdout operates at the eval layer rather than the world model layer (captured in the World Model System Codex under system-design/world-model-system/eval-generation/) is preserved. This ADR refines how the eval-layer holdout is structured into two layers.
