Configuring AI Matching
The following properties configure the Manager and Worker microservices. They are provided through the Configuration Service, in the Manager and Worker deployment, or in the configuration file ai-matching-matching-manager/etc/application.properties.
General configuration
Property | Data type | Description |
---|---|---|
|
String |
The location of the default Default value: |
|
String |
The location of the Default value: |
|
String |
The location of the etc folder of the microservice.
The Default value: |
|
String |
The location of the Default value: |
|
String |
The location of the Default value: |
|
String |
The location of the Default value: |
|
String |
The location of the root folder of the microservice. Some configuration paths are defined relative to this path. The default value of this property can be overwritten only through environment variables; any other change is ignored. Default value: |
|
String |
The location of the Default value: |
Health
Property | Data type | Description |
---|---|---|
|
Number |
The timeout period during which the microservice and its subcomponents need to report as running. Otherwise, the whole microservice becomes unhealthy and its status changes to Default value: |
Logging
Property | Data type | Description |
---|---|---|
|
Boolean |
Enables JSON console appender. Only one console appender can be enabled at a time. |
|
Boolean |
Enables JSON file appender. Only one file appender can be enabled at a time. |
|
Boolean |
Enables plain text console appender. Only one console appender can be enabled at a time. |
|
Boolean |
Enables plain text file appender. Only one file appender can be enabled at a time. |
|
String |
A compression or archive format to which log files should be converted when they are closed. Default value: |
|
String |
The name of the file used by the file appender. Default value: |
|
String |
Indicates how often the current log file should be closed and a new one started. Default value: |
|
String |
The minimum severity level starting from which logged messages are sent to the sink. Default value: |
Retrying
Property | Data type | Description |
---|---|---|
|
String |
Controls retrying of gRPC and GraphQL communication attempts. The property determines when retrying stops. By default, retrying stops after 6 attempts in total, out of which 5 are retries. Default value: |
|
String |
Controls retrying of gRPC and GraphQL communication attempts. The property determines which approach is used to stop retrying. For more information, see the Tenacity API Reference, Stop Functions section. Default value: |
|
String |
Controls retrying of gRPC and GraphQL communication attempts. The property is used to calculate the duration of waiting periods between retries. For more information about how waiting periods between unsuccessful attempts are managed, see the Tenacity API Reference, Wait Functions section. Default value: |
|
String |
Controls retrying of gRPC and GraphQL communication attempts. The property determines which approach is used when waiting. For more information about how waiting periods between unsuccessful attempts are managed, see the Tenacity API Reference, Wait Functions section. Default value: |
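The default stop behavior described above (6 attempts in total, 5 of them retries) can be illustrated with a minimal sketch. This is not the microservice's implementation and it does not use Tenacity; it is a standard-library stand-in, and the names `call_with_retries`, `max_attempts`, and `wait_seconds` are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 6,       # total attempts: 1 initial call + 5 retries
    wait_seconds: float = 0.0,   # hypothetical fixed wait between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `max_attempts` calls have failed."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return operation()
        except Exception:
            if attempt >= max_attempts:
                # Stop condition reached: propagate the last error to the caller.
                raise
            # Wait before the next retry.
            sleep(wait_seconds)
```

With these defaults, an operation that always fails is called exactly six times before the error reaches the caller.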
On-start behavior
Property | Data type | Description |
---|---|---|
|
Number |
Sets how many seconds the microservice waits after requesting health information about its dependencies, for example, when the Recommender waits for the Neighbors or the Autocomplete waits for MMM.
For more information, see the Requests Developer Interface Documentation, section about the Default value: |
|
String |
Defines the behavior of the microservice while it waits on a dependency before starting. Keyword arguments (kwargs) are the arguments used to construct an instance of the specified wait type. In this case, the keyword argument sets the duration of waiting intervals. Default value: |
|
String |
Defines the behavior of the microservice while it waits on a dependency before starting. Currently, the microservice either waits to receive information about the health of the dependency or the database readiness (typically, this means waiting for the database to start and for MMM to create the tables needed). The property defines how waiting periods are managed between unsuccessful attempts to verify the readiness of the dependency. For a list of other available wait types, see the Tenacity API Reference, Wait Functions section. Default value: |
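The relationship between a wait type and its keyword arguments can be sketched as follows: the wait type selects a class, and the kwargs are passed to its constructor. This is a standard-library illustration only. The type names `fixed` and `exponential`, the classes, and the helper functions are hypothetical stand-ins and not the actual values accepted by the microservice or the Tenacity API.

```python
import time
from typing import Callable, Dict


class FixedWait:
    """Waits the same number of seconds before every new attempt."""

    def __init__(self, wait: float) -> None:
        self.wait = wait

    def __call__(self, attempt: int) -> float:
        return self.wait


class ExponentialWait:
    """Doubles the wait with every attempt, capped at `maximum` seconds."""

    def __init__(self, multiplier: float = 1.0, maximum: float = 60.0) -> None:
        self.multiplier = multiplier
        self.maximum = maximum

    def __call__(self, attempt: int) -> float:
        return min(self.multiplier * 2 ** (attempt - 1), self.maximum)


# Hypothetical registry: maps a configured wait type name to a class.
WAIT_TYPES: Dict[str, Callable[..., Callable[[int], float]]] = {
    "fixed": FixedWait,
    "exponential": ExponentialWait,
}


def build_wait(wait_type: str, kwargs: dict) -> Callable[[int], float]:
    """Construct the wait strategy from its type name and keyword arguments."""
    return WAIT_TYPES[wait_type](**kwargs)


def wait_for_dependency(
    is_ready: Callable[[], bool],
    wait: Callable[[int], float],
    max_attempts: int,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `is_ready` up to `max_attempts` times, sleeping between failed checks."""
    for attempt in range(1, max_attempts + 1):
        if is_ready():
            return True
        if attempt < max_attempts:
            sleep(wait(attempt))
    return False
```

For example, `build_wait("fixed", {"wait": 2.5})` yields a strategy that waits 2.5 seconds between every two readiness checks.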
DB
Property | Data type | Description |
---|---|---|
|
String |
The host for the microservice database. |
|
String |
The password for the microservice database. |
|
String |
The username for the microservice database. |
|
String |
Sets the SQLAlchemy engine options, such as the maximum length of identifiers used in the database. For more information, see the SQLAlchemy Engine Configuration documentation, Engine Creation API section, Parameters. Default value: |
gRPC client
Property | Data type | Description |
---|---|---|
|
String |
Limits the size of messages that the gRPC client can process. Default value: 1GB. Accepted units: |
Authentication
Property | Data type | Description |
---|---|---|
|
String |
The private key of the microservice used to generate tokens for internal JWT authentication. |
|
Number |
Defines the amount of time after which the token generated by the internal JWT generator expires. Expressed in seconds. Default value: |
TLS/mTLS
Property | Data type | Description |
---|---|---|
|
String |
All client TLS options can be specified per connection.
To set any TLS option for a specific client connection, configure the same set of properties as for the global client TLS configuration (properties with the
If an option is not specified for the given client connection, global client TLS options are applied. Default value: |
|
String |
All client TLS options can be specified directly for the gRPC client.
To set any TLS option for the gRPC client, configure the same set of properties as for the global client TLS configuration (properties with the Default value: |
|
String |
All client TLS options can be specified directly for the HTTP client.
To set any TLS option for the HTTP client, configure the same set of properties as for the global client TLS configuration (properties with the Default value: |
|
Boolean |
Defines whether the gRPC and HTTP clients should use TLS when communicating with the servers. Default value: |
|
String |
The private key name specified in the provided keystore that is used for TLS.
Does not work with Default value: |
|
String |
The password for the private key of the gRPC and HTTP clients.
Used if the private key is encrypted. Does not work with Default value: |
|
String |
Points to the keystore containing private and public key certificates that are used by the gRPC and HTTP clients.
For example, Default value: |
|
String |
The password for the keystore. Used if the keystore is encrypted. Default value: |
|
String |
The type of the keystore. Possible types are Default value: |
|
Boolean |
Defines whether the gRPC and HTTP clients should use mTLS when communicating with the servers. Default value: |
|
Boolean |
Defines whether the gRPC and HTTP clients should verify the certificate of the server with which they communicate. Default value: |
|
String |
Points to the truststore with all the trusted certification authorities (CAs) used in gRPC and HTTP TLS communication.
Used only when Default value: |
`ataccama.client.tls.trust-store-password` |
String |
The password for the truststore. Used if the truststore is encrypted. Default value: |
|
String |
The type of the truststore.
Possible types are Default value: |
gRPC server
Property | Data type | Description |
---|---|---|
|
String |
Limits the size of messages that the gRPC server can process. Default value: |
Authentication
Property | Data type | Description |
---|---|---|
|
Boolean |
Enables basic authentication on the gRPC server. If enabled, Keycloak becomes a mandatory dependency and needs to be running before the microservice starts. Default value: |
|
Boolean |
Enables bearer authentication on the gRPC server. If enabled, Keycloak becomes a mandatory dependency and needs to be running before the microservice starts. Default value: |
|
Boolean |
Enables internal JWT token authentication on the gRPC server. Default value: |
|
Boolean |
If set to Default value: |
|
String |
Used for securing HTTP endpoints based on user or module roles.
The role comparison is case-insensitive.
For example, to allow only users with
Default value: |
|
Boolean |
Enables basic authentication on the HTTP server. If enabled, Keycloak becomes a mandatory dependency and needs to be running before the microservice starts. Default value: |
|
String |
Ant-style patterns that filter which HTTP endpoints have basic authentication enabled.
To separate multiple patterns, use a semicolon ( Default value: |
|
Boolean |
Enables bearer authentication on the HTTP server. If enabled, Keycloak becomes a mandatory dependency and needs to be running before the microservice starts. Default value: |
|
String |
Ant-style patterns that filter which HTTP endpoints have bearer authentication enabled.
To separate multiple patterns, use a semicolon ( Default value: |
|
Boolean |
Enables internal JWT token authentication on the HTTP server. Default value: |
|
String |
Ant-style patterns that filter which HTTP endpoints have internal JWT authentication enabled.
To separate multiple patterns, use a semicolon ( Default value: |
|
String |
Ant-style patterns that filter which public HTTP endpoints should be protected.
If configured, these endpoints are no longer publicly available and authentication is required.
To separate multiple patterns, use a semicolon ( Default value: |
|
String |
The role used for validating that a service sending a request to the microservice can impersonate another user. Default value: |
|
String |
The name of the Keycloak realm. Used when requesting an access token during authorization. |
|
String |
The URL of the server where Keycloak is running. |
|
String |
The expected recipients of the Keycloak token.
Used to validate the access (bearer) token obtained from Keycloak.
If the value is Default value: |
|
String |
The client token identifier of the microservice. Used when requesting an access token during authorization. |
|
String |
The expected algorithm that was used to sign the access (bearer) token obtained from Keycloak. Default value: |
|
String |
The issuer of the Keycloak token.
Used to validate the access (bearer) token obtained from Keycloak.
If the value is Default value: |
|
Number |
Defines the minimum amount of time between two consecutive requests for Keycloak certificates. During this period, Keycloak is not asked for new certificates, which helps prevent DDoS attacks that use an unknown key. Expressed in seconds. Default value: |
|
Number |
Defines how long the public certificates from Keycloak are cached on the microservice side. If this time is exceeded, new certificates are fetched from Keycloak before the microservice makes an attempt to authenticate. If this time is not exceeded, but the public certificate for the key parsed from the authentication attempt was not found in the cache, new certificates are fetched from Keycloak and authentication is attempted again. Expressed in seconds. Default value: |
|
String |
The secret key of the microservice client. Used when requesting an access token during authorization. |
|
String |
The deployment settings with public JWT keys for other modules communicating with the microservice. The following fields are available:
Example settings for MMM:
Default value: |
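The interplay between the two Keycloak certificate settings in the table above (the minimum interval between certificate requests and the certificate cache lifetime) can be sketched as follows. This is an illustrative standard-library model, not the microservice's code. The class `CertificateCache`, its parameters, and the `fetch` callback are hypothetical, and the sketch assumes the cache lifetime is at least as long as the minimum request interval.

```python
import time
from typing import Callable, Dict, Optional


class CertificateCache:
    """Caches public certificates by key ID and throttles refetching."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, str]],   # hypothetical call that asks Keycloak for certificates
        ttl_seconds: float,                    # how long cached certificates stay valid
        min_fetch_interval_seconds: float,     # minimum time between two fetches
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._min_interval = min_fetch_interval_seconds
        self._clock = clock
        self._certs: Dict[str, str] = {}
        self._last_fetch: Optional[float] = None

    def _refresh(self) -> None:
        self._certs = dict(self._fetch())
        self._last_fetch = self._clock()

    def get(self, key_id: str) -> Optional[str]:
        """Return the certificate for `key_id`, refetching when allowed and needed."""
        if self._last_fetch is None:
            self._refresh()  # nothing cached yet
        else:
            age = self._clock() - self._last_fetch
            if age > self._ttl:
                self._refresh()  # cache lifetime exceeded
            elif key_id not in self._certs and age >= self._min_interval:
                self._refresh()  # unknown key and the throttle window has passed
        return self._certs.get(key_id)
```

In this model, a burst of authentication attempts with an unknown key triggers at most one certificate request per minimum interval, which is the protective effect the property description refers to.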
TLS/mTLS
Property | Data type | Description |
---|---|---|
|
String |
All server TLS options can be specified directly for the gRPC server.
To set any TLS option for the gRPC server, configure the same set of properties as for the global server TLS configuration (properties with the Default value: |
|
String |
All server TLS options can be specified directly for the HTTP server.
To set any TLS option for the HTTP server, configure the same set of properties as for the global server TLS configuration (properties with the Default value: |
|
Boolean |
Defines whether the gRPC and HTTP servers should generate their own self-signed certificate.
The private key is saved to a location specified by Default value: |
|
String |
The path to the generated certificate of the gRPC and HTTP servers.
For example, Default value: |
|
Boolean |
Defines whether the gRPC and HTTP servers should use TLS authentication. Default value: |
|
String |
The private key name specified in the provided keystore that is used for TLS.
Does not work with Default value: |
|
String |
The password for the private key of the gRPC and HTTP servers.
Used if the private key is encrypted. Does not work with Default value: |
|
String |
Points to the keystore containing private and public key certificates that are used by the gRPC and HTTP servers.
For example, Default value: |
|
String |
The password for the keystore. Used if the keystore is encrypted. Default value: |
|
String |
The type of the keystore.
Possible types are Default value: |
|
String |
Defines whether the gRPC and HTTP servers require clients to be authenticated.
Possible values are Default value: |
|
String |
The path to the generated private key of the gRPC and HTTP servers.
For example, Default value: |
|
String |
Points to the truststore with all the trusted certification authorities (CAs) used in the gRPC and HTTP TLS communication.
For example, Default value: |
|
String |
The password for the truststore. Used if the truststore is encrypted. Default value: |
|
String |
The type of the truststore.
Possible types are Default value: |
Security headers
Property | Data type | Description |
---|---|---|
|
String |
The value of the HTTP Strict-Transport-Security (HSTS) response header. Used only when HTTPS is enabled. Informs browsers that the resource should only be accessed using the HTTPS protocol. Default value: |
Parallelism
Property | Data type | Description |
---|---|---|
|
Number |
An alternative way of overriding the number of parallel threads spawned by low-level calculations that are used by machine learning algorithms. If the value is set to Relies on the static OpenBLAS API and might be ignored depending on the compilation options for the OpenBLAS library.
When this property is set, OpenBLAS gives it higher priority compared to Default value: |
|
Number |
The number of parallel threads or processes spawned by high-level machine learning algorithms with explicit job management.
If the value is set to Use this option together with Default value: |
|
Number |
The number of parallel threads spawned by low-level calculations that are used by high-level machine learning algorithms.
If the value is set to The property relies on the static OpenBLAS API and OpenMP API, which have a lower overhead than the dynamic API used by the property Use this option together with Default value: |
|
Number |
An alternative way of setting the number of parallel threads spawned by low-level calculations that are used by machine learning algorithms.
If the value is set to Relies on the dynamic OpenBLAS API, which has a higher overhead than the static API used by Default value: |
Internal and properties encryption
Property | Data type | Description |
---|---|---|
|
String |
Points to the keystore containing the symmetric key that is used to decrypt properties with Default value: |
|
String |
The password for the keystore.
Used if the keystore is encrypted.
To use an empty password, set the value to an empty string ( Default value: |
|
String |
The single-line file containing the password for the keystore.
When reading the file, UTF-8 encoding is assumed. Used if the keystore is encrypted.
If specified, the property overrides the value of Default value: |
|
String |
The type of the keystore.
Possible types are Default value: |
|
String |
Points to the keystore containing the symmetric key that is used to decrypt properties with Default value: |
|
String |
The password for the keystore.
Used if the keystore is encrypted.
To use an empty password, set the value to an empty string ( Default value: |
|
String |
The single-line file containing the password for the keystore.
When reading the file, UTF-8 encoding is assumed. Used if the keystore is encrypted.
If specified, the property overrides the value of Default value: |
|
String |
The type of the keystore. Possible types are Default value: |
Manager
Property | Data type | Description |
---|---|---|
|
Number |
The gRPC port of the server where the MDM Engine is running. Default value: |
|
String |
The IP address or the URL of the server where the MDM Engine is running. Default value: |
|
Number |
The minimum desired model quality after the training phase finishes.
The value needs to be between Default value: |
|
String |
The network address to which the Manager gRPC server should bind. Default value: |
|
Number |
The port where the gRPC interface of the Manager microservice is running. Default value: |
|
String |
The network address to which the Manager HTTP server should bind. Default value: |
|
Number |
The HTTP port where the Manager microservice is running. Default value: |
|
Number |
Defines how often the Manager microservice runs its background processing thread. Expressed in seconds. Default value: |
Worker
Property | Data type | Description |
---|---|---|
|
Boolean |
Forwards stdout and stderr streams of the computation process to the respective stdout and stderr streams of the Worker service for debugging purposes. Default value: |
|
Number |
Defines the amount of time the Worker service waits for its job subprocess to shut down gracefully. If the subprocess does not stop within this time, it is killed. Expressed in seconds. Default value: |
|
String |
The list of job types the Worker can consume. Allowed values are 'initialization', 'proposals_generation', and 'rules_extraction'. Default value: |
|
String |
The network address to which the Worker HTTP server should bind. Default value: |
|
Number |
The HTTP port where the Worker microservice is running. Default value: |
|
String |
The path to the Default value: |
|
Number |
Defines how often the Worker microservice runs its background processing thread. Expressed in seconds. Default value: |
|
Number |
The dedupe clustering decision threshold that functions as a compromise between precision and recall.
The value needs to be between Default value: |
|
Number |
The number of records that are uniformly sampled from all the records fetched from MDM. Those records are the only ones used for initializing and training the AI Matching model. Default value: |
|
Number |
The number of records that AI Matching selects out of the records covered by the property Default value: |
|
Number |
The maximum number of columns in one extracted rule. A higher number means that the extracted rules can be more complex, that is, use more columns, but the rule extraction might take significantly longer. Default value: |
|
Number |
The maximum number of confident negative pairs to be considered for rule extraction. A higher number means that the extraction of rules is significantly slower but the results could be more precise. Default value: |
|
Number |
The maximum number of confident positive pairs to be considered for rule extraction. A higher number means that the extraction of rules is significantly slower but the results could be more precise. Default value: |
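The uniform sampling of records described in the Worker table can be illustrated with a short sketch. It only shows what uniform sampling without replacement means; the function `sample_training_records` and its parameters are hypothetical and are not part of the Worker API.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_training_records(
    records: Sequence[T],
    sample_size: int,
    seed: Optional[int] = None,
) -> List[T]:
    """Pick `sample_size` records uniformly at random, without replacement.

    If fewer records are available than requested, all of them are returned.
    """
    if sample_size >= len(records):
        return list(records)
    # A dedicated Random instance keeps the sampling reproducible for a given seed.
    return random.Random(seed).sample(records, sample_size)
```

Every record has the same chance of being selected, so the sample is expected to reflect the overall distribution of the fetched records.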