ADR-027: Eval corpus as internal world asset
Status: Accepted (2026-04-20)
Source: migrated from planning/swms-decisions.md ADR-037 as part of SPEC-270.
Context
The eval corpus, the pool of generated instances per rule from which EvalSets are drawn, is a world model asset. It must not leak structural information across the context boundary. Internal identifiers, corpus position signals, instance shape metadata, or anything else that lets spectral.platform infer holdout boundaries or corpus organization would enable eval corpus distillation: a sophisticated customer accumulates patterns inferred from eval shape, identifiers, and data, then uses them to reconstruct the underlying corpus structure.
Decision
The eval corpus and holdout registry are strictly internal to packages/worlds. What crosses the context boundary is a fully sanitized EvalSet — generated instances with no internal identifiers, no corpus position signals, and no shape metadata. The attribution envelope carries world model version and rule references only. packages/spectral receives eval instances as opaque inputs, not as slices of a known corpus. packages/spectral has no read or write access to the holdout registry.
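The sanitization step described above could look roughly like the following. This is a minimal sketch, not the actual packages/worlds implementation: the names `CorpusInstance`, `AttributionEnvelope`, `sanitize`, and every field name are hypothetical, chosen only to illustrate which data stays internal and which crosses the boundary. Shuffling the payloads is one assumed way to avoid leaking corpus position through delivery order.

```python
from __future__ import annotations

import copy
import random
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class CorpusInstance:
    """Hypothetical internal record; never leaves packages/worlds."""
    internal_id: str             # internal identifier (must not cross)
    corpus_index: int            # corpus position signal (must not cross)
    shape_meta: dict[str, Any]   # instance shape metadata (must not cross)
    rule_ref: str                # rule reference (allowed in the envelope)
    payload: dict[str, Any]      # the generated instance itself


@dataclass(frozen=True)
class AttributionEnvelope:
    """Carries world model version and rule references only."""
    world_model_version: str
    rule_refs: tuple[str, ...]


@dataclass(frozen=True)
class EvalSet:
    """What crosses the context boundary: envelope plus opaque instances."""
    envelope: AttributionEnvelope
    instances: tuple[dict[str, Any], ...]


def sanitize(drawn, world_model_version, rng=None):
    """Build an EvalSet from drawn corpus instances (assumed helper)."""
    rng = rng or random.SystemRandom()
    # Copy payloads only; ids, indices, and shape metadata are dropped.
    payloads = [copy.deepcopy(inst.payload) for inst in drawn]
    # Shuffle so delivery order does not mirror corpus order.
    rng.shuffle(payloads)
    # Deduplicate and sort rule references so they carry no ordering signal.
    rule_refs = tuple(sorted({inst.rule_ref for inst in drawn}))
    envelope = AttributionEnvelope(world_model_version, rule_refs)
    return EvalSet(envelope, tuple(payloads))
```

The receiving side sees only `EvalSet`; because the envelope lists rule references as a sorted set, it does not say which instance came from which rule. Whether per-instance rule attribution is needed is a design question this sketch leaves open.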
Consequences
- EvalSets delivered to spectral.platform contain no internal corpus identifiers or structural metadata.
- The attribution envelope is the only reference between contexts: world model version and rule identifiers, never corpus internals.
- Holdout validation is a spectral.worlds-internal process. spectral.platform does not know which instances are holdouts.
- The eval corpus is a world model version asset: it persists for the life of the version and is archived (not deleted) on version retirement.
- Holdout instances persist for the life of the version and are excluded from active generation. They are accessed only during explicit holdout validation runs.
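
The holdout consequences above can be sketched as a small registry that filters holdouts out of the active generation pool and exposes them only inside an explicit validation run. This is an illustrative sketch under stated assumptions: `HoldoutRegistry`, `active_pool`, `validation_run`, and `holdouts` are hypothetical names, and the real registry in packages/worlds may be structured quite differently.

```python
from contextlib import contextmanager


class HoldoutRegistry:
    """Hypothetical holdout registry, internal to packages/worlds."""

    def __init__(self, holdout_ids):
        # Fixed for the life of the world model version.
        self._ids = frozenset(holdout_ids)
        self._validating = False

    def active_pool(self, corpus_ids):
        """Return corpus ids eligible for active generation (no holdouts)."""
        return [cid for cid in corpus_ids if cid not in self._ids]

    @contextmanager
    def validation_run(self):
        """Mark an explicit holdout validation run for its duration."""
        self._validating = True
        try:
            yield self
        finally:
            self._validating = False

    def holdouts(self):
        """Expose holdout ids, but only during a validation run."""
        if not self._validating:
            raise PermissionError("holdouts are readable only in a validation run")
        return self._ids
```

A caller would wrap holdout access as `with registry.validation_run(): ids = registry.holdouts()`. Any read outside that block raises, which makes accidental use during generation fail loudly.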