AI Matching Configuration
The following properties configure the Matching Manager microservice. They are provided through the Configuration Service, in the Matching Manager deployment, or in the configuration file ai-matching-matching-manager/etc/application.properties.
Matching Manager
Property | Data Type | Default Value | Description |
---|---|---|---|
| | | | The gRPC port of the server where MDC is running. |
| | | | The IP address or the URL of the server where MDC is running. |
| | | | The network address to which the Matching Manager gRPC server should bind. |
| | | | The port where the gRPC interface of the Matching Manager microservice is running. |
| | | | The network address to which the Matching Manager HTTP server should bind. |
| | | | The dedupe clustering decision threshold, which controls the trade-off between precision and recall. The value must be between 0 and 1. Increasing the value gives higher precision and lower recall, that is, fewer MERGE proposals and more SPLIT proposals. Conversely, decreasing the value gives lower precision and higher recall. |
| | | | The number of groups or clusters that are processed in a single batch when proposals are generated during the AI Matching evaluation. A higher number makes processing more efficient but requires more memory (RAM). |
| | | | The number of proposals that are processed in a single batch when proposals are scored during the AI Matching evaluation. A higher number makes processing more efficient but requires more memory (RAM). |
| | | | The number of records that are uniformly sampled from all the records fetched from MDM. Only these records are used for initializing and training the AI Matching model. |
| | | | The number of records that AI Matching selects out of the records covered by the property |
| | | | The maximum number of columns in one extracted rule. A higher number means that the extracted rules can be more complex, that is, use more columns, but rule extraction might take significantly longer. |
| | | | The maximum number of confident positive pairs that are considered for rule extraction. A higher number makes rule extraction significantly slower, but the results could be more precise. |
| | | | The minimum desired model quality after the training phase finishes. The value must be between 0 and 1 and represents the correctness (quality) of the trained model, based on the pairs the user provided during training. The higher the value, the more stringent the requirements for continuing to the next steps after the training phase. Model quality can be improved by checking the pairs already provided or by providing additional pairs. |
| | | | The maximum number of folds (splits) used in model quality evaluation after training. Must be a non-negative integer. A higher value makes the evaluation more precise but also slower (roughly max_cross_validation_folds seconds). Values higher than the actual number of labeled training pairs have no effect. If set to 0 or 1, the cross-validation part of the evaluation is skipped (that is, the model is both trained and tested on all labeled training pairs). |
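
To make the precision and recall trade-off of the decision threshold more concrete, the following is a minimal, generic sketch and not the Matching Manager implementation. It assumes that each candidate pair has a match score between 0 and 1 and that a pair is treated as a match when its score is at least the threshold; the function name `precision_recall` and the sample data are hypothetical.

```python
def precision_recall(scored_pairs, threshold):
    """Return (precision, recall) for a list of (score, is_true_match) pairs.

    Assumption for illustration: a pair is predicted as a match
    when score >= threshold.
    """
    tp = sum(1 for score, is_match in scored_pairs if score >= threshold and is_match)
    fp = sum(1 for score, is_match in scored_pairs if score >= threshold and not is_match)
    fn = sum(1 for score, is_match in scored_pairs if score < threshold and is_match)
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical scored pairs: (match score, whether the pair is a real match).
pairs = [
    (0.95, True),
    (0.85, True),
    (0.70, False),
    (0.60, True),
    (0.40, False),
    (0.20, False),
]

for threshold in (0.5, 0.8):
    precision, recall = precision_recall(pairs, threshold)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On this toy data, raising the threshold from 0.5 to 0.8 removes the one false match (precision rises from 0.75 to 1.0) but also drops one real match (recall falls from 1.0 to about 0.67), which is the behavior the threshold description refers to.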
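
The rules for the maximum number of cross-validation folds can be sketched as a small helper. This is an illustration under stated assumptions rather than the actual Matching Manager logic: the function name `effective_folds` is hypothetical, and returning 0 to mean "cross-validation skipped" (including when fewer than two labeled pairs are available) is a choice made for this example.

```python
def effective_folds(max_folds: int, labeled_pairs: int) -> int:
    """Return the number of folds that would actually be used.

    A result of 0 means cross-validation is skipped and the model is
    trained and tested on all labeled pairs (assumed convention).
    """
    if max_folds < 0:
        raise ValueError("the maximum number of folds must be a non-negative integer")
    # Values above the number of labeled training pairs have no effect.
    folds = min(max_folds, labeled_pairs)
    # 0 or 1 folds means there is nothing to cross-validate.
    return folds if folds >= 2 else 0


print(effective_folds(10, 4))   # capped by the number of labeled pairs
print(effective_folds(5, 50))   # the configured maximum applies
print(effective_folds(1, 50))   # cross-validation skipped
```

Capping first and checking for fewer than two folds afterwards keeps both documented rules in one place: the cap by the number of labeled pairs, and the skip when the setting is 0 or 1.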