ADR-107: The action's canonical input ontology, reconciled at publish

Context

A rule’s predicate consumes caller-supplied inputs by destructuring the decision context (context.get("…")). At v0 nothing declared those inputs: the published per-action schema was reconstructed by walking the enshrined predicate’s AST and extracting the literal context.get keys — a names-only derivation (ADR-089 D2 / ADR-104 D3/D5, the SPEC-575 deriver context_schema_from_rules). Attribute types and descriptions were explicitly deferred — “a follow-up paired with an authoring surface” — because no per-attribute vocabulary was authored anywhere and a deterministic deploy forbids LLM type inference.

Per-rule input declarations (D1/D2 below) closed the typing and drift gaps: a rule now declares its inputs as typed, documented metadata, and codegen is verified to satisfy declared == read. But they left one gap open. An action is served by many rules, and the action’s schema was a name-keyed union of those declarations. Two rules can name the same real-world quantity differently (rule A reads gross_income, rule B reads total_income for the same thing), and the union, keyed by name, lists both as separate required inputs. Both rules ship; at runtime a correct caller that sends one gets a YELLOW-incomplete on the other (the /decide incomplete-input floor). The action’s input contract is incoherent across its rule set. This was the open architectural problem the prior version of this ADR named “not in scope (SPEC-712)”.

This ADR closes it. The action carries a persisted canonical input ontology — one named, typed input per real quantity — that is the authority for its published schema. A reconciliation pass at the publish boundary converges divergent rule vocabularies onto that ontology, holistically over the action’s assigned enshrined rule set, conserving anything a prior published version already exposed. Within-rule typing (D1/D2) is unchanged; what is new is cross-rule vocabulary coherence and where it is established.

Decision

D1 — A rule declares its inputs as typed, documented metadata

A rule carries an input_declaration — the set of inputs its predicate consumes, each a DeclaredInput with a name, a type (string | number | boolean | enum, with allowed_values for enum), a required flag, and a human description. It is authored/derived alongside the predicate and stored on worlds.rules.input_declaration (a JSONB column, migration 20260620010000), mirroring the storage discipline of the rule’s other authored axes (ADR-106 D1). It is () for candidates and for rules authored before the protocol.
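As a rough illustration of the shape D1 describes, a declaration could be modeled as below. Only the field names and the four types come from the text; the dataclass, its validation rules, the sample values, and the JSON layout are assumptions, not the stored schema.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical model: only the field names and type vocabulary come from D1.
ALLOWED_TYPES = {"string", "number", "boolean", "enum"}


@dataclass(frozen=True)
class DeclaredInput:
    name: str
    type: str
    required: bool
    description: str
    allowed_values: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if self.type not in ALLOWED_TYPES:
            raise ValueError(f"unknown input type: {self.type!r}")
        if self.type == "enum" and not self.allowed_values:
            raise ValueError(f"enum input {self.name!r} needs allowed_values")
        if self.type != "enum" and self.allowed_values:
            raise ValueError(f"{self.name!r}: allowed_values is only valid for enum")


# Illustrative declaration for one rule (names and values are made up).
declaration = (
    DeclaredInput("gross_income", "number", True, "Annual income before tax."),
    DeclaredInput("filing_status", "enum", True, "Tax filing status.",
                  allowed_values=("single", "married")),
)

# One possible JSON payload for a JSONB column (layout assumed).
payload = json.dumps([asdict(d) for d in declaration])
```

Rejecting an enum without allowed_values at construction time is a design choice of this sketch; the text only states that enum inputs carry allowed_values.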

D2 — Codegen is bound to the declaration: declared == read

Predicate codegen does not merely produce a declaration — it is gated on the declaration matching the predicate’s actual reads. The generate→check→repair loop (ADR-083 D2 AST analysis) rejects a candidate whose set of context.get keys is not exactly the set of declared input names: an undeclared read or a declared-but-unread key routes back into the bounded repair loop with feedback, and on exhaustion the loop raises (terminal, nothing enshrined). The declaration is therefore a contract the predicate is verified against, not a scrape of it — the inversion of the v0 derivation.
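The declared == read check can be sketched with the standard ast module. This is a minimal illustration, assuming predicates are Python source that read inputs through literal context.get("...") calls; the function names and feedback strings are hypothetical, and the real gate (ADR-083 D2) may inspect more than this.

```python
import ast


def read_keys(predicate_source: str) -> set[str]:
    """Collect the literal string keys passed to context.get(...)."""
    keys: set[str] = set()
    for node in ast.walk(ast.parse(predicate_source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "get"
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "context"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            keys.add(node.args[0].value)
    return keys


def check_declared_equals_read(predicate_source: str, declared: set[str]) -> list[str]:
    """Return repair feedback; an empty list means the candidate passes."""
    read = read_keys(predicate_source)
    feedback = [f"undeclared read: {k}" for k in sorted(read - declared)]
    feedback += [f"declared but unread: {k}" for k in sorted(declared - read)]
    return feedback


source = '''
def predicate(context):
    return context.get("gross_income") > 50000 and context.get("filing_status") == "single"
'''
feedback = check_declared_equals_read(source, {"gross_income", "total_income"})
```

Here the candidate reads filing_status without declaring it and declares total_income without reading it, so both directions of the mismatch produce feedback for the repair loop.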

D3 — The action carries a persisted canonical input ontology — the authority

An action owns a persisted ontology (worlds.actions.ontology, a JSONB column): one entry per canonical input, each carrying a name, type, required flag, enum allowed_values, description, and provenance — the contributing_rules (the rules that establish or use the entry) and, when the name/type was set by a deliberate operator decision rather than derived from the rules, an operator_decision record (D9: who decided, when, the intent, and the prior name/type). For a reconciler-derived entry the “how” is implicit in the contributing rules under the rename map (D4); for an operator-decided entry it is recorded explicitly. This ontology — not a deploy-time computation — is the authority for the action’s published input schema.

Deploy sources input_schema.json from the persisted ontology. roll_up_input_union over the deploying rules’ declarations is retained as a deterministic drift check: deploy fails (OntologyDriftError, a 4xx — a well-formed request against an unshippable world state, never a 500) if the rules’ union diverges from the persisted ontology, so authoring and deploy agree by construction. The deployed bundle still ships a self-verifying input-contract test member that re-checks the schema, so the invariant is carried in the content-addressed artifact, not only asserted at deploy. There is no union fallback: an action that has deploying rules but an empty ontology is itself drift — an empty-ontology schema can never equal the rules’ non-empty union, so deploy fails loud (OntologyDriftError). Publish-time reconciliation (D4/D5) sets every published action’s ontology, so a reconciled world never reaches deploy with an empty one; there is no additive pre-reconciliation fallback (no backwards compatibility).
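A minimal sketch of the drift check follows, assuming both the persisted ontology and the rules' union are dictionaries keyed by input name. The compared fields (type, required, allowed_values) and the error message format are assumptions; OntologyDriftError here is a local stand-in for the error named above.

```python
class OntologyDriftError(Exception):
    """Local stand-in for the deploy-time drift error named in the text."""


def _shape(entries: dict[str, dict]) -> dict[str, tuple]:
    # Reduce each entry to the fields this sketch treats as contractual.
    return {
        name: (e["type"], e["required"], tuple(sorted(e.get("allowed_values", ()))))
        for name, e in entries.items()
    }


def check_drift(ontology: dict[str, dict], rules_union: dict[str, dict]) -> None:
    """Raise if the persisted ontology and the rules' union disagree."""
    persisted, derived = _shape(ontology), _shape(rules_union)
    if persisted != derived:
        differing = sorted(
            name for name in persisted.keys() | derived.keys()
            if persisted.get(name) != derived.get(name)
        )
        raise OntologyDriftError(f"ontology drift on: {', '.join(differing)}")


union = {"gross_income": {"type": "number", "required": True}}
check_drift(dict(union), union)  # equal shapes: no error
```

Under this comparison an empty ontology against a non-empty union is drift like any other mismatch, which matches the no-union-fallback rule.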

D4 — Reconciliation converges the vocabulary: one canonical name and one type per quantity

Reconciliation establishes the ontology from the action’s rule declarations in two layers:

  • Deterministic type-merge (within a name). Where rules share an input name, the merge is the prior one-type-per-key rule: same type merges (required OR-merged; enum allowed_values must agree); string + enum → enum (an enum is a constrained string, so this is not a conflict; the merge takes the enum and carries its allowed_values; a real dogfood case is filing_status); genuinely incompatible types (number vs string/enum, conflicting enum sets) raise InputUnionConflictError, never a silent coercion.
  • Semantic name-convergence (across names). Two rules naming one real quantity differently are converged onto a single canonical input name. This is the one place a deterministic rule cannot decide — “is total_income the same quantity as gross_income?” is a semantic judgment — so an LLM proposes the canonical names + a per-rule rename map only. Everything else — the type, required, enum values, and provenance of each canonical entry — is derived deterministically from the rules’ declarations under that rename map (reusing the type-agreement guard above). The model never invents a type or a value set.
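The two layers above can be sketched together: the rename map is taken as given (in the design it is the only thing the LLM proposes), and everything else is derived deterministically. The dictionary layout, the function names, and the sample rules are assumptions for illustration; descriptions are omitted for brevity.

```python
class InputUnionConflictError(Exception):
    """Local stand-in for the conflict error named in the text."""


def merge_pair(a: dict, b: dict) -> dict:
    """Merge two declarations that already share one canonical name."""
    types = {a["type"], b["type"]}
    if types == {"string", "enum"}:
        # An enum is a constrained string: take the enum and its values.
        values = (a if a["type"] == "enum" else b)["allowed_values"]
        merged_type = "enum"
    elif len(types) == 1:
        merged_type, values = a["type"], a["allowed_values"]
        if set(values) != set(b["allowed_values"]):
            raise InputUnionConflictError(f"{a['name']}: conflicting enum sets")
    else:
        raise InputUnionConflictError(f"{a['name']}: {a['type']} vs {b['type']}")
    return {
        "name": a["name"],
        "type": merged_type,
        "required": a["required"] or b["required"],  # OR-merge
        "allowed_values": list(values),
        "contributing_rules": sorted(
            set(a["contributing_rules"]) | set(b["contributing_rules"])
        ),
    }


def reconcile(declarations: dict[str, list[dict]],
              rename_map: dict[str, dict[str, str]]) -> dict[str, dict]:
    """Derive canonical entries from rule declarations under a rename map."""
    ontology: dict[str, dict] = {}
    for rule, inputs in declarations.items():
        for decl in inputs:
            canonical = rename_map.get(rule, {}).get(decl["name"], decl["name"])
            entry = {
                "name": canonical,
                "type": decl["type"],
                "required": decl["required"],
                "allowed_values": list(decl.get("allowed_values", [])),
                "contributing_rules": [rule],
            }
            ontology[canonical] = (
                merge_pair(ontology[canonical], entry) if canonical in ontology else entry
            )
    return ontology


declarations = {
    "rule_a": [{"name": "gross_income", "type": "number", "required": True},
               {"name": "filing_status", "type": "string", "required": False}],
    "rule_b": [{"name": "total_income", "type": "number", "required": False},
               {"name": "filing_status", "type": "enum", "required": True,
                "allowed_values": ["single", "married"]}],
}
ontology = reconcile(declarations, {"rule_b": {"total_income": "gross_income"}})
```

With the rename applied, the two income names collapse into one required number input contributed by both rules, and filing_status resolves to the enum.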

This closes the cross-rule vocabulary defect the name-keyed union structurally could not: the union saw two names and published two inputs; reconciliation recognizes one quantity and publishes one.

D5 — Reconciliation runs at the publish boundary, isolated from the deterministic gate

Vocabulary coherence is a whole-action property: it needs the action’s complete assigned-enshrined rule set, which first exists at the version boundary — not at the approval of any single rule. So reconciliation runs at publish, the seam where the system already asserts whole-action quality (it is a sibling of ADR-108’s behavioral-completeness gate).

It runs as a reconciliation phase that precedes the deterministic mint and the completeness gate, holistically per declared action over that action’s assigned enshrined rules. The LLM naming call (D4) is cassette-pinned, id-free (rules carry ordinal labels, never UUIDs) and memoized — skipped when an action’s (rule_id, declared-names) set is unchanged since its last reconcile — so a steady-state republish makes no LLM call. The deterministic work — the realignment (D6), the consistency gate, the type-merge, the drift check, and the mint/freeze itself — stays LLM-free and reproducible. ADR-108 D5’s property therefore holds unchanged: the publish decision (the enforcement gate) is deterministic; reconciliation is a generation phase before the gate, not part of it.
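One way to picture the memoization is a fingerprint over the action's (rule_id, declared-names) set that is compared with the value stored at the last reconcile. The hashing scheme, the function names, and where the previous fingerprint is stored are all assumptions of this sketch; the text only states what the memo key covers.

```python
import hashlib
import json
from typing import Optional


def reconcile_fingerprint(declared: dict[str, list[str]]) -> str:
    """Hash the (rule_id, declared-names) pairs, independent of ordering."""
    pairs = sorted((rule_id, sorted(names)) for rule_id, names in declared.items())
    return hashlib.sha256(json.dumps(pairs).encode("utf-8")).hexdigest()


def needs_naming_call(declared: dict[str, list[str]],
                      last_fingerprint: Optional[str]) -> bool:
    """True when the declared vocabulary changed since the last reconcile."""
    return reconcile_fingerprint(declared) != last_fingerprint


first = {"rule_a": ["gross_income"], "rule_b": ["total_income", "filing_status"]}
stored = reconcile_fingerprint(first)
```

A steady-state republish presents the same pairs, so the fingerprint matches and the naming call is skipped; adding a rule or changing a declared name changes the fingerprint.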

The earlier conform-at-approve trigger is retired. It gated on approval-time action membership, but action assignment is status-agnostic authoring membership and may be incomplete before publish. The publish boundary is the first place the system has the whole assigned-enshrined rule set for an action, so approve no longer reconciles; it enshrines one rule with its own typed declaration (D1/D2 unchanged).

D6 — The freeze boundary is the published input, not the enshrined rule

Converging a name means rewriting the affected rules. Reconciliation realigns a rule’s four artifacts — predicate source, applies-when, input declaration, and behavioral spec (with its test suite re-materialized from the rewritten spec) — to the canonical names. A pure rename is deterministic and behavior-preserving (an AST + spec rewrite), gated by re-running the rule’s behavioral suite (the consistency gate, ConsistencyGateError on a break); a rename that collapses two distinct reads onto one canonical name regenerates the predicate (the codegen repair loop), then re-gates.
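The pure-rename path for the predicate source can be sketched as an AST rewrite, assuming Python predicates that read inputs through literal context.get("...") calls. The class and function names are hypothetical, the equality loop at the end is only a toy stand-in for re-running the rule's behavioral suite, and the other three artifacts (applies-when, declaration, spec) are not shown.

```python
import ast


class RenameContextKeys(ast.NodeTransformer):
    """Rewrite literal context.get("old") keys to their canonical names."""

    def __init__(self, renames: dict[str, str]) -> None:
        self.renames = renames

    def visit_Call(self, node: ast.Call) -> ast.AST:
        self.generic_visit(node)
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "get"
            and isinstance(func.value, ast.Name)
            and func.value.id == "context"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
            and node.args[0].value in self.renames
        ):
            node.args[0] = ast.Constant(value=self.renames[node.args[0].value])
        return node


def is_pure_rename(read_keys: set[str], renames: dict[str, str]) -> bool:
    """False when two distinct reads would collapse onto one canonical name."""
    targets = [renames.get(key, key) for key in read_keys]
    return len(targets) == len(set(targets))


def realign_predicate(source: str, renames: dict[str, str]) -> str:
    tree = RenameContextKeys(renames).visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


def load_predicate(source: str):
    namespace: dict = {}
    exec(source, namespace)
    return namespace["predicate"]


before = 'def predicate(context):\n    return context.get("total_income") > 50000\n'
after = realign_predicate(before, {"total_income": "gross_income"})

# Toy consistency gate: verdicts must match once callers use the new key.
for income in (10_000, 90_000):
    assert load_predicate(before)({"total_income": income}) == \
        load_predicate(after)({"gross_income": income})
```

When is_pure_rename returns False the rewrite is no longer behavior-preserving by construction, which is the case the text routes to predicate regeneration instead.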

This rewrites enshrined-but-unpublished rules freely — retiring the build’s earlier “the agent never rewrites an enshrined rule’s inputs” invariant. That invariant conflated enshrined with frozen and made the defect unclosable in the normal case: when both divergent rules are already enshrined, closing the defect requires renaming at least one of them. The real boundary is the published input. A name that any published version exposed is conserved: an automatic reconciliation that would drop, re-type, or rename it fails loud (PublishedOntologyViolationError) for operator arbitration (D7). An enshrined-but-unpublished input name has been exposed to no caller, so harmonizing it before its first publish changes no contract — it is free.
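The conservation rule can be sketched as a comparison between the conserved set (inputs a prior published version exposed) and the ontology a reconciliation proposes. The dictionary layout and message wording are assumptions; PublishedOntologyViolationError is a local stand-in for the error named above.

```python
class PublishedOntologyViolationError(Exception):
    """Local stand-in for the freeze violation named in the text."""


def check_published_freeze(conserved: dict[str, dict],
                           proposed: dict[str, dict]) -> None:
    """Raise if a previously published input was dropped, renamed, or re-typed."""
    violations = []
    for name, prior in conserved.items():
        current = proposed.get(name)
        if current is None:
            # A rename looks like a disappearance of the published name.
            violations.append(f"{name}: dropped or renamed")
        elif current["type"] != prior["type"]:
            violations.append(f"{name}: re-typed {prior['type']} -> {current['type']}")
    if violations:
        raise PublishedOntologyViolationError("; ".join(violations))


conserved = {"gross_income": {"type": "number"}}
# New, never-published inputs are free to appear alongside conserved ones.
check_published_freeze(conserved, {"gross_income": {"type": "number"},
                                   "filing_status": {"type": "enum"}})
```

Only names in the conserved set are checked, so harmonizing an enshrined-but-unpublished name never trips the guard.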

The freeze is against automatic change, not against the operator’s deliberate intent. Renaming or re-typing a published input is a legitimate contractual change at a new version revision — never automatic, never in-place — so the operator can deliberately make it through the recorded-edit path (D9), which the next publish honors instead of failing loud.

D7 — Fail-loud operator arbitration at publish; renames are surfaced, never silent

Reconciliation surfaces, as publish-blocking conditions to the operator (mapped to 4xx): low-confidence synonym determinations (the reconciler’s flags), published-input violations (D6), a realignment that breaks a rule’s behavioral suite (the consistency gate), and a multi-action collision (a rule shared across actions that demand different canonical names for one read key — rare; the operator aligns the actions’ vocabularies or splits the rule). The operator — not the LLM — is the arbiter of “are these the same quantity,” and publish is exactly when the operator is making a deliberate version decision.

Critically, the planned harmonization is shown before it is applied (“publishing will rename rule B total_income → gross_income”). A reconciliation that silently rewrote a reviewed rule at the freeze would violate the review-integrity property this codebase upholds: the artifact the operator reviewed is the artifact that deploys (ADR-108; the SPEC-652 generate-once-persist-reuse discipline). Surfacing the rename at the operator-driven publish preserves that property.

This arbitration has two faces. One is resolving ambiguity the reconciler raised — the operator nudges, the agent regenerates. The other is initiating a change the operator already intends — deliberately renaming or re-typing a published input. The fail-loud guard blocks the latter from happening automatically; D9 is the surface through which the operator authorizes it.

D8 — The canonical ontology is versioned into the world-model card

At publish, each action’s reconciled ontology is snapshotted into the WorldModelCard alongside the action set (ADR-104 D1’s snapshot_action_set), riding the tolerated-unknown-key payload pattern (ADR-065 D4 — no bilateral wire bump). This gives “published inputs are frozen” (D6) a precise, versioned referent: the conserved set fed to reconciliation is the prior card’s ontology for that action, not the mutable live row. A published version’s input contract becomes historically provable, consistent with the action set already being snapshotted there.

D9 — Deliberate operator edit of a published input, honored at the next version

An operator can deliberately rename and/or re-type a published input — the change the freeze (D6) blocks from happening automatically. The operator records the decision as a PendingInputEdit on the action (worlds.actions.pending_input_edits, at most one per input); the next publish honors it: the reconciliation rewrites the conserved (prior-card, D8) entry to the recorded target shape before the freeze check, so the guard passes for that input, the rename propagates across the affected rules via the same realignment (D6), and the new name/type becomes the frozen contract from that version forward. The consumed edit is cleared on publish success only — a publish that fails a later gate leaves it intact for retry.

Four properties hold:

  • Targeted. Only the named input is unfrozen; every other published input on the action stays frozen and still fails loud on any non-deliberate drift (D6).
  • Provenance. The honored edit stamps an OperatorOntologyDecision (D3) onto the resulting ontology entry — who decided, the intent, and the prior name/type — and that provenance is snapshotted into the card (D8), so the deliberate change is permanently attributable.
  • Every other gate still runs. D9 only relaxes the published-input freeze check; behavioral completeness (ADR-108), the consistency gate, the type-agreement merge, and low-confidence arbitration all still apply. A re-type in particular requires the contributing rules to already declare the new type — else the deterministic type-merge (D4) fails the publish loud — and the impact preview surfaces that precondition.
  • Version boundary, never in-place. The change takes effect only at the next published version; previously published versions are untouched. This mirrors the ADR-026 model — the world-model version is the unit of authority, so a deliberate change is a new version boundary, not a mutation of what already shipped.
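The honoring step in D9 can be sketched as a rewrite of the conserved set before the freeze check runs, with the edit cleared only after every gate has passed. The field names on PendingInputEdit, the operator_decision layout, and the run_gates callback are assumptions; the text fixes only what is recorded (who, when, intent, prior name/type) and when the edit is cleared.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PendingInputEdit:
    # Hypothetical field names; D9 fixes the content, not this layout.
    input_name: str
    new_name: Optional[str]
    new_type: Optional[str]
    decided_by: str
    decided_at: str
    intent: str


def honor_pending_edits(conserved: dict[str, dict],
                        edits: list[PendingInputEdit]) -> dict[str, dict]:
    """Rewrite conserved entries to the recorded target shape (new dict)."""
    result = {name: dict(entry) for name, entry in conserved.items()}
    for edit in edits:
        prior = result.pop(edit.input_name)
        target_name = edit.new_name or edit.input_name
        result[target_name] = {
            **prior,
            "name": target_name,
            "type": edit.new_type or prior["type"],
            "operator_decision": {
                "decided_by": edit.decided_by,
                "decided_at": edit.decided_at,
                "intent": edit.intent,
                "prior_name": edit.input_name,
                "prior_type": prior["type"],
            },
        }
    return result


def publish(conserved: dict[str, dict], edits: list[PendingInputEdit],
            run_gates: Callable[[dict[str, dict]], None]):
    """Return (frozen set, remaining edits); a gate failure leaves edits intact."""
    target = honor_pending_edits(conserved, edits)
    run_gates(target)   # raises on failure, before anything is cleared
    return target, []   # edits are cleared only on publish success


conserved = {"total_income": {"name": "total_income", "type": "number"}}
edit = PendingInputEdit("total_income", "gross_income", None,
                        "operator@example", "2026-06-20", "align vocabulary")
frozen, remaining = publish(conserved, [edit], run_gates=lambda target: None)
```

Because only the named entry is rewritten, every other conserved input reaches the freeze check unchanged, which is the targeted property listed above.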

Before recording, the operator is shown the contractual impact (a dry run): the affected rules, the /actions schema delta (old → new name/type), and that the change is breaking for existing callers. The operator decides against that impact; the recorded edit is the authorization the publish reads. Recording it is an authoring-time action, decoupled from the publish transaction, so it is durable and replay-safe (deterministic — the publish reads the recorded decision, it is not re-derived by the LLM).

Consequences

  • An action presents one coherent, typed, documented input contract across its whole rule set: two rules expressing one quantity converge on one canonical input, so /actions and /decide no longer split a quantity into two required keys. The cross-rule vocabulary defect (the prior “not in scope (SPEC-712)”) is closed.
  • Within-rule guarantees are unchanged: codegen drift is still structurally impossible at the declared-key level (D2); the per-key type-merge (D4) still holds.
  • ADR-104 D1/D3/D5 are amended in lockstep: an action carries a persisted canonical ontology; deploy sources input_schema.json from it and retains the rules’ union only as a drift check. The “an action carries no input contract of its own / the schema is the deploy-time union” framing is overturned.
  • ADR-108 is amended in lockstep: the publish flow now runs a reconciliation phase before the deterministic enforcement gate. The enforcement gate and the mint stay deterministic and reproducible; what changes is that the publish flow carries a cassette burden for the (memoized, id-free) naming call. ADR-108’s “no LLM in the enforcement path” holds; its incidental “no cassette burden at publish” framing is refined — the burden is in the reconciliation phase, not the gate, and is zero for a steady-state republish.
  • Operator authority is explicit and well-placed: the human resolves genuine ambiguity at the version boundary (D7), and never edits code — the rewrite is deterministic and gated.
  • A published input is frozen against automatic change but not against the operator’s deliberate intent: the operator can rename or re-type one at a new version through a recorded edit (D9), shown its contractual impact first and honored by the next publish with attributable provenance (D3). The escape hatch the fail-loud guard (D6/D7) reserved is now realized, without weakening the freeze for any input the operator did not deliberately name.
  • Deploy stays deterministic and LLM-free; OntologyDriftError becomes a backstop that passes by construction once an action is reconciled.
  • This contract remains the substrate ADR-108 builds on: behavioral completeness asserts “every intended input measurably matters” over inputs that are now also cross-rule-coherent.

Alternatives considered

Conform the candidate at approve (the retired build). Rejected. It cannot fire — a candidate has no assigned action at approve in the canonical flow (ADR-104). And even if assignment were forced earlier, a conform-only-one-candidate pass never rewrites an enshrined sibling, so two already-enshrined rules with divergent names can never converge — it relocates the defect rather than closing it.

Reconcile at rule→action assignment time. A serious alternative — it has the tightest failure locality (a conflict surfaces at the assign that introduced it) and keeps publish fully deterministic. Rejected as the primary trigger because it introduces an eager-invariant discipline (coherence maintained continuously, like a DB constraint) the system uses nowhere else, and it changes assignment’s deliberately cheap, idempotent, reversible character into a heavyweight, LLM-bearing, failable operation. The publish boundary is where the platform already asserts whole-action quality (ADR-108), and reconciliation is order-independent there by construction. Kept as the documented runner-up: the reconcile/realign engine is identical either way, so the trigger can move if dogfood shows the publish-time harmonization preview is too coarse.

Keep the name-keyed union; no semantic reconciliation. Rejected — this is the pre-SPEC-732 position; it leaves the defect open (the union keys by name and cannot recognize that two names are one quantity).

Snapshot nothing / read the live ontology at deploy only. Rejected for the freeze referent (D8): without a versioned ontology, “conserve published inputs” has no immutable set to compare against, and a republish could silently move a name a live caller depends on.

References

  • ADR-104 D1/D3/D5 — the action registry + per-action deploy; amended in lockstep (the action carries a persisted ontology; deploy sources input_schema.json from it, union is a drift check)
  • ADR-108 — the behavioral-completeness publish gate this reconciliation is a sibling of; amended in lockstep (a reconciliation phase precedes the deterministic enforcement gate)
  • ADR-089 D2 — the /actions per-action schema this ontology publishes
  • ADR-083 D2 — the codegen-time AST analysis the declared == read gate and the deterministic realignment compose with
  • ADR-080 D1 — the content-addressed per-action bundle input_schema.json rides
  • ADR-065 D4 — the tolerated-unknown-key payload pattern the card ontology snapshot (D8) rides
  • ADR-106 D1 — the rule authored-axis storage discipline input_declaration and the ontology follow
  • ADR-026 — the world-model version as the unit of authority; the deliberate published-input edit (D9) takes effect at a new version boundary, never as an in-place mutation