RIGOR
Implementation
Diff Engine (Normative – v0.1)
Status: Normative Scope: Defines the deterministic computation of structural differences between two Canonical Graph instances.
This document formalizes the internal algorithm, contracts, guarantees, and constraints of the Diff Engine.
1. Purpose
The Diff Engine computes a ChangeSet representing the structural differences between:
- A previous Canonical Graph (G₁)
- A current Canonical Graph (G₂)
It is responsible for:
- Detecting additions
- Detecting removals
- Detecting modifications
- Classifying change types
- Producing a deterministic ChangeSet
It MUST NOT:
- Perform validation
- Perform migrations
- Mutate either graph
- Apply changes
2. Input and Output Contracts
2.1 Input
diff(previous: CanonicalGraph, current: CanonicalGraph): ChangeSetBoth graphs MUST be canonicalized prior to diff.
The engine MUST reject non-canonical graphs.
2.2 Output
The engine MUST return:
ChangeSet {
status: "no-change" | "changed"
changes: OrderedCollection<Change>
summary: ChangeSummary
}status MUST be "no-change" if and only if no structural differences exist.
3. Core Diff Model
Diff operates on:
- Node identity
- Node attributes
- Edge relationships
Comparison MUST be:
- Deterministic
- Order-independent
- Based on canonical paths and node IDs
4. Identity Matching
Nodes MUST be matched by:
- Canonical path
- Stable identity key (if defined)
If a node exists in G₂ but not in G₁ → ADD If a node exists in G₁ but not in G₂ → REMOVE If exists in both → compare attributes and edges
5. Change Types
Each Change MUST be one of:
- ADD_NODE
- REMOVE_NODE
- MODIFY_ATTRIBUTE
- ADD_EDGE
- REMOVE_EDGE
- REORDER (only if semantically ordered)
Change type set MUST be stable and versioned.
6. Change Object Schema
Each change MUST include:
Change {
type: ChangeType
path: CanonicalPath
nodeType: NodeType
attribute?: string
oldValue?: any
newValue?: any
metadata?: object
}Fields MUST be present only when relevant.
7. Diff Algorithm (Normative Behavior)
The algorithm MUST:
Phase 1 – Node Set Comparison
- Build node maps keyed by canonical path
- Identify added nodes
- Identify removed nodes
Phase 2 – Attribute Comparison
For matched nodes:
- Compare attribute sets
- Detect added attributes
- Detect removed attributes
- Detect modified attributes
Attribute comparison MUST be deep and deterministic.
Phase 3 – Edge Comparison
For matched nodes:
- Compare outgoing edges
- Detect additions and removals
- Detect semantic reorder (if applicable)
Phase 4 – Normalization
- Sort changes deterministically
- Remove redundant changes
- Finalize ChangeSet
8. Deterministic Ordering Rules
Changes MUST be ordered by:
- Canonical path (lexicographic)
- ChangeType (stable internal precedence)
- Attribute name (if applicable)
The same input graphs MUST produce identical ChangeSets.
9. Deep Equality Rules
Deep comparison MUST:
- Compare scalars strictly
- Compare mappings recursively
- Compare unordered sets by sorted canonical form
- Compare ordered lists by position
Floating point comparison MUST be exact unless Specification defines tolerance.
10. Semantic Reorder Handling
For semantically ordered collections:
- A change in order MUST produce REORDER
- Implementation MUST detect minimal reorder delta
For unordered sets:
- Order differences MUST NOT produce changes
11. Structural Hash Optimization (Optional)
If Canonical Graph includes structural hash:
- Hash mismatch MAY trigger deep comparison
- Hash equality MAY short-circuit diff
Hash use MUST NOT affect correctness.
12. Summary Object
ChangeSummary MUST include:
ChangeSummary {
totalChanges: number
additions: number
removals: number
modifications: number
reorders: number
}Counts MUST be consistent with ChangeSet.
13. Performance Expectations
The Diff Engine SHOULD operate in:
O(N + E)
Where:
- N = nodes
- E = edges
Attribute comparison SHOULD avoid quadratic behavior.
Memory usage SHOULD scale linearly.
14. Large Graph Considerations
The engine SHOULD:
- Use hash maps for node lookup
- Avoid nested full scans
- Avoid redundant comparisons
Streaming diff is OPTIONAL and MUST remain deterministic if implemented.
15. Integration with Versioning Engine
The ChangeSet MUST be consumable by:
- Versioning Engine
- Migration Engine
- CLI output
The Diff Engine MUST NOT interpret version bump rules.
16. Error Handling
Diff SHOULD NOT produce semantic errors.
It MAY produce fatal errors if:
- Graph invariants are violated
- Canonical path collision detected
- Non-canonical graph provided
Such errors MUST be deterministic.
17. Stability Across Versions
Change types and output structure MUST be versioned.
Breaking changes in ChangeSet schema MUST increment RIGOR major version.
18. Non-Goals
The Diff Engine does NOT:
- Validate correctness
- Classify semantic severity
- Decide version bump
- Apply migrations
- Generate patches
It only describes structural differences.
19. Compliance Criteria
An implementation is compliant if:
- It produces deterministic ChangeSets
- It correctly identifies add/remove/modify
- It ignores non-semantic ordering
- It preserves stable ordering of changes
- It operates only on canonicalized graphs
20. Summary
The Diff Engine:
- Compares two Canonical Graphs
- Produces deterministic structural ChangeSet
- Is order-independent
- Is immutable and pure
- Is foundational for evolution and versioning