Matching Architecture

Building on the foundational matching concepts, this article explores the technical architecture and design principles behind matching in ONE MDM. Understanding how matching works under the hood helps you get more consistent results from your matching configuration.

Matching architecture

Matching logic is configured in your MDM project, within a matching plan that uses a Matching step.

As explained in more detail in How the Matching Step Works, the Matching step uses a hierarchical partition > key rule > matching rule structure to efficiently process records. In other words, incoming records are first split into disjoint groups (so-called partitions) and each partition has two sets of rules:

Key rules that identify clusters of records eligible for evaluation with relevant matching rules.
Matching rules that ensure a highly optimized record-to-record comparison.

Configuration scale and data flow

In practice, this hierarchy creates a progressive data refinement process:

Partitions (typically 1-7): Split your entire dataset into broad categories of records that should never be matched together, such as separating person records from company records. Each partition can contain many thousands of records of the same kind.
Key rules (typically 3-7 per partition): Within each partition, create many small clusters of potentially similar records based on shared characteristics like phone area codes or email domains. This dramatically reduces the number of record-to-record comparisons needed.
Matching rules (typically 2-4 per key rule): Within each key rule cluster, apply specific comparison logic to determine which records actually represent the same entity and should receive the same master ID.

This progressive narrowing (from potentially millions of records, to partition groups, to key rule clusters, to final matching groups) is what makes the matching process both scalable and accurate.

Note that records in different partitions are never matched together: rules are not evaluated across partitions. In addition, each level of the hierarchy (partition, key rule, matching rule) can be computed in parallel.
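The narrowing described above can be illustrated with a minimal Python sketch. This is not ONE MDM code or configuration: the record fields, the `partition_of` function, the two key rules, and the single matching rule are all hypothetical, and stopping at the first matching rule that fires is an assumption made to keep the example short.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"id": 1, "type": "person", "name": "Cameron Smith", "phone": "555-0101", "email": "cam@example.com"},
    {"id": 2, "type": "person", "name": "Cameron Smith", "phone": "555-0199", "email": "cam@example.com"},
    {"id": 3, "type": "person", "name": "Alex Doe", "phone": "555-0101", "email": "alex@example.org"},
    {"id": 4, "type": "company", "name": "Cameron Smith", "phone": "555-0101", "email": "cam@example.com"},
]

def partition_of(rec):
    """Hypothetical partition logic: persons and companies never meet."""
    return rec["type"]

# Hypothetical key rules: each one groups records that share a key value.
key_rules = {
    "by_email": lambda r: r["email"].lower(),
    "by_phone": lambda r: r["phone"],
}

# Hypothetical matching rules, listed in their configured order.
matching_rules = [
    ("same_name", lambda a, b: a["name"].lower() == b["name"].lower()),
]

def find_links(records):
    """Return links as (id_a, id_b, key_rule, matching_rule) tuples."""
    partitions = defaultdict(list)
    for rec in records:
        partitions[partition_of(rec)].append(rec)

    links = set()
    for part_records in partitions.values():      # no cross-partition evaluation
        for key_name, key_fn in key_rules.items():
            clusters = defaultdict(list)
            for rec in part_records:
                clusters[key_fn(rec)].append(rec)
            for cluster in clusters.values():     # compare only within a cluster
                for a, b in combinations(cluster, 2):
                    for rule_name, rule_fn in matching_rules:
                        if rule_fn(a, b):
                            links.add((a["id"], b["id"], key_name, rule_name))
                            break  # assumption: the first rule that fires wins
    return links

links = find_links(records)
```

In this sketch, records 1 and 2 are linked through the shared email key and the name rule. Record 4 carries the same values as record 1 but sits in the company partition, so the two are never compared.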

All the inputs gathered in the outlined phases create a graph of records with identified links. Based on the quality of these links, matched groups are created and each is assigned a unique identity. Link quality is determined by the quality and order of the matching rules; the confidence score is not used.

To learn more about the graph of records and how the link quality is assessed, see How matching results are computed.

Proposal rules are evaluated the same way as matching rules, with two key differences:

Links are used to create a matching proposal rather than to add a record to a matching group, as is the case with matching rules.
This phase of matching is serialized.

Design principles

Matching in ONE MDM follows these key design principles, which directly influence the matching approaches and identity management you’ll use in practice.

Rule-based deterministic matching

ONE MDM uses a rule-based, deterministic approach to matching to ensure predictable results.

This means that matching decisions are made using predefined business rules and logic, ensuring consistent outcomes even when approximate or relative matching rules are used (such as fuzzy string matching or similarity thresholds).
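To see why a fuzzy rule can still be deterministic, consider the following minimal Python sketch. It uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure; the `names_match` function, the normalization, and the 0.85 threshold are assumptions for illustration and do not reflect the comparison functions available in ONE MDM.

```python
from difflib import SequenceMatcher

THRESHOLD = 0.85  # hypothetical similarity threshold fixed in the rule definition

def normalize(value):
    """Lowercase and collapse whitespace so formatting noise does not affect the score."""
    return " ".join(value.lower().split())

def names_match(a, b, threshold=THRESHOLD):
    """Fuzzy comparison with a fixed threshold: same inputs always give the same answer."""
    score = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return score >= threshold

close_pair = names_match("Cameron Smith", "cameron  smyth")
distant_pair = names_match("Cameron Smith", "Alex Doe")
repeated = {names_match("Cameron Smith", "cameron  smyth") for _ in range(5)}
```

The comparison is approximate, but nothing in it is random or learned at run time: the score depends only on the two inputs, and the threshold is part of the rule, so repeated evaluation yields the same decision.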

Identity stability

ONE MDM ensures that matched records remain consistent over time. Once a record identity is assigned to a master record, it remains unchanged even as the underlying data continues to evolve. This prevents the negative impact of data aging, where records of different ages contain information of varying quality or completeness.

Since the original matching decision was correctly made based on the data available at the time, and since the records refer to the same real-world entity, the matched records are not split or reassigned to different identities merely because their data has changed over time.

Building on the Cameron Smith example used in Matching: if Cameron gets married, changes their name to Cameron Johnson, and gets a new phone number, ONE MDM maintains the same master ID and updates the golden record rather than creating a new identity.

When merges do occur, ONE MDM maintains consistency by reusing existing master IDs rather than creating new ones. When there is a request to merge records, one of the master IDs is retained and others are discarded.
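The merge behavior can be sketched in a few lines of Python. The data structure (a dictionary from master ID to a set of record IDs) and the `merge_identities` function are hypothetical. The source only states that one of the existing master IDs is retained; keeping the lowest one is an assumption made here so the example has a concrete rule.

```python
def merge_identities(groups, master_ids):
    """Merge the given identity groups in place and return the surviving master ID.

    groups: dict mapping master ID -> set of record IDs (hypothetical structure).
    """
    # Assumption: the lowest master ID survives. The point is only that an
    # existing ID is reused and no new ID is generated for the merged group.
    survivor = min(master_ids)
    for master_id in master_ids:
        if master_id != survivor:
            groups[survivor] |= groups.pop(master_id)  # discard the other ID
    return survivor

groups = {100: {1, 2}, 205: {3}}
survivor = merge_identities(groups, [205, 100])
```

After the call, all three records belong to master ID 100 and master ID 205 no longer exists, so systems that already reference 100 see no change of identifier.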

Generic identifiers like UUID or NanoID cannot replace the master ID.

Merge protection

To preserve identity stability, matching never automatically merges two existing identities, as this could cause unintended consolidation in downstream systems.

When an incoming record could potentially match multiple existing identity groups, the matching algorithm selects the best match and assigns the record to that group.

For other potential matches, the application creates matching proposals instead of automatically merging, preserving evidence of potential relationships while allowing data stewards to make informed decisions.
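The following Python sketch illustrates this behavior under stated assumptions. The `assign_incoming` function and its input format are hypothetical, "best match" is modeled as the link produced by the earliest matching rule in the configured order, and ties are broken by the lower master ID. None of these details are confirmed as the product's actual selection logic.

```python
def assign_incoming(record_id, candidate_links):
    """Pick one existing identity for an incoming record; propose the rest.

    candidate_links: list of (master_id, rule_order) pairs, where a lower
    rule_order is assumed to mean a stronger link.
    Returns (chosen_master_id, proposals). No existing groups are merged.
    """
    # Keep only the strongest link per existing identity.
    best_per_master = {}
    for master_id, rule_order in candidate_links:
        if master_id not in best_per_master or rule_order < best_per_master[master_id]:
            best_per_master[master_id] = rule_order

    if not best_per_master:
        return None, []  # no candidates: the caller would create a new identity

    # Assumption: rank by rule order, then by master ID to break ties.
    ranked = sorted(best_per_master.items(), key=lambda item: (item[1], item[0]))
    chosen = ranked[0][0]
    proposals = [(record_id, master_id) for master_id, _ in ranked[1:]]
    return chosen, proposals

chosen, proposals = assign_incoming(42, [(100, 2), (205, 1), (100, 3)])
```

Here record 42 joins identity 205 because that link comes from the earlier rule, while the weaker link to identity 100 is kept as a proposal for a data steward to review. Identities 100 and 205 themselves remain separate.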

Performance and maintenance

To optimize matching performance, we recommend following the best practices provided or enabling parallelism, if applicable.

Modifying partition logic or key rules can also affect performance and matching accuracy. To learn how to manage these changes, see Impact of Matching Configuration Changes.

Note that, to process data incrementally, matching uses a technical repository. After major configuration changes, the repository might require maintenance and cleanup.

For detailed performance optimization guidance, see Matching Performance Best Practices.

Next steps

Now that you understand the architectural foundations, continue with the following articles:

How the Matching Step Works - Dive deeper into how the matching algorithm processes your rules and makes decisions.
Matching Configuration - Start configuring matching for your specific requirements.
Matching Performance Best Practices - Find out how to optimize matching for performance and accuracy.
