Normalize

Structural repair after apply, before commit.

A transaction is a sequence of steps. Each step applies cleanly in isolation, but the sequence can leave the document in a shape that should be repaired before anyone observes it: an empty text-flow block with no text leaf to anchor a caret, adjacent text leaves with identical marks, or schema-level fallout such as children or marks that are no longer allowed in their parent. Reaching into the step layer to prevent every case makes the step layer hostile.

Normalization runs after apply and before commit. Rules look at the post-apply document, decide whether they want to replace it with a repaired version, and emit a NormalizerRuleResult. The loop runs in deterministic passes until it reaches a fixed point. If it exceeds the configured pass limit, the engine throws NormalizationDidNotConvergeError — that is a bug in a rule, not a state the user can reach.

Worked example

Shipped helper — document-level schema repair:

const rule = createSchemaDocRepairRule(canonicalSchema, {
  meta: { source: "docs" },
});

createSchemaDocRepairRule(...) closes over the schema, runs repairDoc(...) only for document targets, and emits a replacement only when repair changes the document.

Every rule that fires emits a normalized trace event:

normalized
  ruleId: "schema:repair-doc"

normalized
  ruleId: "schema:repair-fragment"

Rules that return null do not emit.

Contract

From packages/core/src/normalize.ts:

interface NormalizerRule {
  readonly id: string;
  normalize(
    target: NormalizeTarget,
    context: NormalizerContext,
  ): NormalizerRuleResult | null;
}
  • Pure. Same input, same output.
  • Idempotent. normalize(normalize(doc)) equals normalize(doc).
  • Convergent. Deterministic passes reach a fixed point. Failure to converge within the configured limit throws.
  • Target-preserving. A doc rule returns a doc target; a fragment rule returns a fragment target.

What normalization repairs

  • Empty blocks that need a zero-length text leaf to host a caret.
  • Adjacent text leaves with identical marks merged into one run.
  • Structural fallout from steps whose isolated inverses cannot anticipate neighbor state.
  • Schema repair via repairDoc and repairFragment (the two built-in rule IDs are schema:repair-doc and schema:repair-fragment).

What normalization does not do

  • It does not invent content.
  • It does not touch text characters except to merge adjacent identical runs.
  • It does not change user intent. If an intent would have produced the exact same document, the normalizer would not run it through a different one.

API

createNormalizer(config?: NormalizerConfig): Normalizer;

createSchemaDocRepairRule(schema, options?): NormalizerRule;
createSchemaFragmentRepairRule(schema, options?): NormalizerRule;

replaceNormalizedDoc(doc, extras?): NormalizerRuleResult;
replaceNormalizedFragment(fragment, extras?): NormalizerRuleResult;

The engine builds a default Normalizer when one is not supplied and calls normalizer.normalizeDoc(appliedDoc, { source: "transaction", meta }) after each apply.