DPE Configuration
In on-premise deployments, the following properties configure Data Processing Engine (DPE) and are provided in the dpe/etc/application.properties file.
The following properties can also be specified for DPE:
Basic settings
| Property | Data type | Description |
|---|---|---|
| | Boolean | Enables debug logging mode. If set to  Default value: |
| | String | Used to send information to Data Processing Manager (DPM) about the location of DPE. If DPM cannot reach DPE by the machine hostname, this property overrides that hostname. If not set, the hostname is determined by trying to resolve the DNS records for the  Default value: |
| | Number | Used to send information to DPM about the location of DPE. If DPM cannot reach DPE by the gRPC port, this property overrides that port. Default value: |
| | Number | The HTTP server port for DPE. Default value: |
| | String | A comma-separated list of environments in which this instance of DPE can be used. In the current release, this only affects user restrictions when accessing the file system data sources. |
| | String | A meaningful name for DPE that is used in DPM Admin Console. This is especially useful in firewall-friendly mode as it makes it easier to identify DPE instances. We recommend using alphanumeric characters without spaces. If not set, the default value matches the hostname or the URL and the port of the DPE server. In firewall-friendly mode, the default value is |
| | Boolean | When set to true, alerts are sent to Ataccama Support when an on-premise DPE disconnects from ONE. Default value: |
DPE Labels
To configure the DPE label, use the property ataccama.one.dpe.label.
By default, the property is set to dpe, which means that all DPE instances where the label is not specifically set are grouped together.
You might want to group DPEs according to certain criteria, such as the connector settings for on-premise deployment.
To do this, set the ataccama.one.dpe.label property for each DPE to a value that is unique to that DPE instance.
For example, if you have two DPEs with different connector settings for on-premise deployment, you could set the ataccama.one.dpe.label property to dpe-onprem-1 for the first DPE and dpe-onprem-2 for the second DPE.
This would create two separate groups of DPEs, one for each connector setting.
| DPEs with different configurations must not have the same label. If you need to make changes to a specific instance, be sure to give it a unique label. Only use a shared label when the configuration is expected to be identical across DPE replicas (for example, for cloud DPEs). |
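As a minimal sketch of the two-DPE example above, each instance could carry its own label in its own dpe/etc/application.properties file. The label values are taken from the example; the comments and the split into two files are illustrative assumptions.

```properties
# dpe/etc/application.properties on the first DPE instance
ataccama.one.dpe.label=dpe-onprem-1

# dpe/etc/application.properties on the second DPE instance
# (a separate file on a separate instance, shown together here only for comparison)
ataccama.one.dpe.label=dpe-onprem-2
```

Leaving the property unset on both instances would keep them in the default dpe group instead.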
Keycloak authentication
| Properties | Data type | Description | 
|---|---|---|
| 
 | String | The URL of the server where Keycloak is running. Default value:  | 
| 
 | String | The name of the Keycloak realm. Default value:  | 
| 
 | String | The client identifier used to verify the admin user’s authorization token. Default value:  | 
| 
 | String | The secret key of the client identifier for the admin account. Secret keys can be generated using Keycloak. Default value:  | 
| 
 | String | The client identifier. Used to verify a user’s authorization token and to log in a user. Default value:  | 
| 
 | String | The secret key of the client. Secret keys can be generated using Keycloak. Default value:  | 
| 
 | String | Specifies the issuer of the JWT token. Typically, Keycloak uses the URL of the realm as the token issuer. Default value:  | 
| 
 | String | The type of client token authentication.
Possible values:  Default value:  | 
| 
 | String | Points to the keystore file used for  | 
| 
 | String | The type of the keystore used for  Default value:  | 
| 
 | String | The password of the keystore used for  | 
| 
 | String | The private key name specified in the keystore used for  The default value is the client identifier. | 
| 
 | String | The password for the private key. Used if the private key is encrypted. The default value is the keystore password. | 
| 
 | String | Specifies for how long the JWT token used for authentication in Keycloak remains valid.
Used for  Default value:  | 
gRPC Server
General settings
| Property | Data type | Description |
|---|---|---|
| | Number | The port where the gRPC server is running. Default value: |
| | String | Limits the size of messages that the gRPC server can process. The message size needs to fit in the working memory. Default value: |
| | Number | The gRPC server request executor core pool size. Make sure this value is sufficient for the usual traffic in order to avoid creating additional connections because the current version of the gRPC library has the keep-alive for threads set to 0. Default value: |
| | Number | The gRPC server request executor max pool size. The executor queue length is unlimited by default, so the maximum pool size is effectively ignored. However, the value should be equal to or higher than the core pool size. Default value: |
Authentication
| Property | Data type | Description | 
|---|---|---|
| 
 | Boolean | Enables basic authentication on the gRPC Server. Default value:  | 
| 
 | Boolean | Enables bearer authentication on the gRPC Server. Default value:  | 
| 
 | Boolean | Enables internal JWT token authentication on the gRPC Server. Default value:  | 
| 
 | Boolean | Enables mTLS authentication on the gRPC Server. Default value:  | 
TLS/mTLS
| Property | Data type | Description | 
|---|---|---|
| 
 | Boolean | Enables TLS authentication on the gRPC server.
When set to  This property can be set for each DPE. Default value:  | 
| 
 | String | Defines whether mutual TLS authentication is enabled.
Possible values:  When set to  Disabled by default. | 
| 
 | Boolean | Specifies whether the server allows all TLS connection attempts. Used if mTLS is enabled. | 
| 
 | String | The full path to the TLS certificate, for example,  | 
| 
 | String | The full path to the private key of the certificate, for example,  | 
| 
 | String | The full path to the public certificate of the root certificate authority, for example,  | 
| 
 | String | The domain name of the generated server certificate. | 
| 
 | String | The full path to the keystore containing private and public key certificates that are used by the gRPC server.
This property has a higher priority compared to  | 
| 
 | String | The type of keystore.
Possible values:  | 
| 
 | String | The password for the keystore. Used if the keystore is encrypted. | 
| 
 | String | The private key name specified in the provided keystore. | 
| 
 | String | The password for the private key of the gRPC server. Used if the private key is encrypted. If the private key is not set, the password must be the same for all the items in the keystore. | 
gRPC Client
General settings
| Property | Data type | Description | 
|---|---|---|
| 
 | String | Limits the size of messages that the gRPC client can process. Default value:  | 
TLS/mTLS
| Property | Data type | Description | 
|---|---|---|
| 
 | Boolean | Enables TLS authentication when communicating with the gRPC server. It also ensures that the communication with the server over gRPC is secure (encrypted instead of in plaintext) and guarantees the integrity of messages. Default value:  | 
| 
 | Boolean | Enables mutual TLS authentication between the server and the client. Default value:  | 
| 
 | Boolean | Specifies whether the gRPC client should verify the certificate of the server with which it communicates. Used if mTLS is enabled. | 
| 
 | String | The full path to the TLS certificate, for example,  | 
| 
 | String | The full path to the private key of the certificate, for example,  | 
| 
 | String | The full path to the public certificate of the root certificate authority, for example,  | 
| 
 | String | The full path to a truststore file that contains public keys and certificates against which the client verifies the certificates from the server, for example,  | 
| 
 | String | The password for the truststore. Used if the truststore is encrypted. | 
| 
 | String | The type of truststore.
Possible values:  | 
DPM connection
| Property | Data type | Description | 
|---|---|---|
| 
 | Number | Defines how often DPE checks the connection to DPM. If there is no reply from DPM, DPE tries to register again with DPM. Expressed in milliseconds. Default value:  | 
| 
 | String | The name of the gRPC channel that DPE uses to communicate with DPM. Default value:  | 
| 
 | String | The IP address or the URL of the server where DPM is running. Default value:  | 
| 
 | Number | The port where the gRPC server is running. Default value:  | 
| 
 | Boolean | Enables TLS authentication when communicating with the DPM gRPC server. It also ensures that the communication with the server over gRPC is secure (encrypted instead of in plaintext) and guarantees the integrity of messages. Default value:  | 
| 
 | String | Defines whether mutual TLS authentication is enabled.
Possible values:  When set to  Disabled by default. | 
| 
 | Boolean | Specifies whether the gRPC client should verify the certificate of the server with which it communicates. Used if mTLS is enabled. | 
| 
 | String | The full path to the TLS certificate, for example,  | 
| 
 | String | The full path to the private key of the certificate, for example,  | 
| 
 | String | The full path to the public certificate of the root certificate authority, for example,  | 
| 
 | String | The full path to a truststore file that contains public keys and certificates against which the client verifies the certificates from the server, for example,  | 
| 
 | String | The password for the truststore. Used if the truststore is encrypted. | 
| 
 | String | The type of truststore.
Possible values:  | 
Communication mode between DPE and DPM
Starting from version 13.3.1, it is possible to enable communication over a bidirectional gRPC stream between DPE and DPM. When this firewall-friendly mode is configured, the follow-up communication from DPM to DPE, such as browsing queries or submitting jobs, does not require opening DPE’s inbound ports to the outside world.
If firewall-friendly mode cannot be configured, TLS can be enforced for all DPE instances by setting the DPM property ataccama.one.dpm.registry.enforce-tls, or enabled on selected DPE instances individually.
We recommend checking the desired security level if any of your DPE instances communicate with DPM over the internet.
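For illustration, enforcing TLS for all DPE instances from the DPM side might look like the line below. This is a sketch under two assumptions that the text above does not confirm: that the property is a boolean flag, and that it is set in DPM's own application.properties file.

```properties
# DPM configuration (assumed location: DPM's application.properties)
# Assumption: the property accepts true/false.
ataccama.one.dpm.registry.enforce-tls=true
```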
| Property | Data type | Description |
|---|---|---|
| | String | Defines how DPE connects to DPM. The following options are available:  Default value: |
| | String | Defines how long DPE waits for the in-process server shutdown before shutting it down forcefully. Used only if  Default value: |
| | String | Sets the maximum period of time that DPE waits for a new request from DPM before it attempts to register again. Used only if  Default value: |
MDM connection
| Property | Data type | Description | 
|---|---|---|
| 
 | Boolean | If set to  Default value:  | 
| 
 | Boolean | Enables TLS authentication when communicating with MDM. | 
| 
 | String | The full path to the truststore, for example,  | 
| 
 | String | The password for the truststore. | 
Plugins and JDBC drivers
| Property | Data type | Description | 
|---|---|---|
| 
 | String | The location of the plugins folder. Default value:  | 
| 
 | String | Points to the folder containing the JDBC drivers used by DPE. Default value:  | 
| 
 | String | The connection entity name used in the metadata model (MMD) for local file systems. | 
| 
 | String | The connection entity name used in the metadata model (MMD) for ONE MDM data source. | 
| 
 | String | The connection entity name used in the metadata model (MMD) for ONE RDM data source. | 
| 
 | String | The connection entity name used in the metadata model (MMD) for S3 data source. | 
Snowflake query pushdown processing
| Property | Data type | Description | 
|---|---|---|
| 
 | String | Sets the period of time after which Snowflake queries that are still running are canceled. Default value:  | 
| 
 | String | Specifies how queries are created.
Possible values:  The  Default value:  | 
| 
 | Number | Defines how many values are computed in a single query. Currently applies only to the basic count query. To turn the feature off, pick a large number, such as  Default value:  | 
| 
 | String | Defines which mask function is used.
Possible values:  The  Default value:  | 
| 
 | Boolean | Enables automatic upload of user-defined functions to Snowflake. Default value:  | 
| 
 | String | Specifies the age at which a lookup with ready status is considered for removal when a newer version exists. Default value: 2d. For a full list of accepted units, see Duration units. | 
| 
 | Number | Specifies how many keys are uploaded in a batch during lookup upload. Default value:  | 
| 
 | Number | Specifies the maximum number of keys when uploading lookup tables. Larger lookup tables are inserted into a temporary table. Default value:  | 
| 
 | String | Specifies how often to check and clean old lookups. Default value:  | 
| 
 | String | Specifies how often to poll for lookups being initialized in another session. Default value:  | 
| 
 | String | Specifies the amount of time before a lookup with initializing status is considered for removal. Default value:  | 
| 
 | String | Specifies the age at which a lookup with ready status is considered for removal when no newer version exists. Default value:  | 
| 
 | String | If set to  Default value:  | 
| 
 | Integer | Specifies the maximum total number of filter combinations in monitoring project attribute filters. If this limit is exceeded, the monitoring job fails. Increasing this limit might result in higher resource consumption on the database side. Default value:  | 
DataConnect plugin
The DataConnect plugin is used to query metadata in a particular data source. The retrieved information can be cached.
| Property | Data type | Description | 
|---|---|---|
| 
 | Number | The maximum number of threads that can be dedicated to closing the clients that have been evicted. Default value:  | 
| 
 | Boolean | If set to  Default value:  | 
| 
 | Number | The maximum size of the DataConnect cache for all data sources. The value refers to the number of cached client instances, where each instance corresponds to a combined definition of a data source and one of its credentials. To disable caching, set the value to  Default value:  | 
| 
 | Number | The maximum number of browse query results that are cached for each DataSource client. Default value:  | 
| 
 | String | Specifies for how long the entries are stored in the cache. Default value:  | 
| 
 | String | Defines for how long the DataSource client cache stores items for browsing. Starting from 14.1.0, the cache is cleared after this period expires. Default value:  | 
File system
| Property | Data type | Description |
|---|---|---|
| | String | The root path of the default mounted file system. It is possible to set up multiple file systems in the same DPE. These folders can be used for profiling and browsing data in MMM. To add another file system, replace the name of the file system ( |
| | String | Restricts which roles are allowed to work with a particular file system. To set limitations for another file system, make sure to provide the correct file system name and the environment name, for example, |
| | String | The sample file size for items loaded from local file systems. Used to detect the content type, such as  Default value: |
ONE Object Storage
| Property | Data type | Description | 
|---|---|---|
| 
 | Boolean | If set to  Default value:  | 
| 
 | Number | Defines how often the object storage is checked for objects that need to be removed following a storage failure. Expressed in milliseconds. Default value:  | 
Persistent storage
Persistent storage is intended to work as DPE’s internal storage. This means that it should not store any user data or, in general, use a database with user data.
There are two types of persistent storage: the default embedded database and a custom data source.
Using a custom data source instead of the embedded one is optional and requires further configuration.
The driver required for the custom data source must be manually added to the lib folder with other database drivers.
| Property | Data type | Description | 
|---|---|---|
| 
 | String | The type of data source for persistent storage.
If you are using the default embedded database, set the value to  Default value:  | 
| 
 | String | Points to the folder storing the persisted data. Persisted data includes all application data, files for processing, and processing results. Default value:  | 
Custom data source
| Property | Data type | Description | 
|---|---|---|
| 
 | String | The JDBC driver class name for the custom data source for persistent storage, for example,  | 
| 
 | String | A JDBC connection string pointing to the custom data source for persistent storage, for example,  | 
| 
 | String | The username for the custom data source for persistent storage. Default value:  | 
| 
 | String | The password for the custom data source persistent storage. Default value:  | 
Executor
| When setting any launch parameters, make sure the property values (such as file paths) do not include any space characters. This also applies to values in  Compare the following examples: 
 | 
| Property | Data type | Description |
|---|---|---|
| | String | The classpath delimiter used for libraries for local jobs. Default value: |
| | String | The libraries needed for local jobs. Jobs are stored as  Starting from 13.4.0, all JDBC libraries need to be specified here in order to use the  You can specify:  Default value: |
| | String | The directory that holds temporary work files for local jobs. A folder is created for each job within this directory, with the name containing the  If not specified, the system environment variable  After each job finishes, the corresponding work files are automatically deleted from this directory. |
| | String | References the script for customizing how local jobs are launched. For example, you can change which shell script is used to start the job. The script is located in the  The default value for Linux is |
| | String | (Optional) Excludes certain Java libraries and drivers that are in the same classpath but are not needed for processing. Each library needs to be prefixed by an exclamation mark ( Example: |
| | String | Points to the location of Ataccama licenses needed for local jobs. The property should be configured only if licenses are stored outside of the standard locations, that is the home directory of the user and the folder  Default value: |
| | String | Sets the  The runtime is compatible with Java 8 or later. |
| | String | Configures any environment variable for running local jobs. To set a custom environment variable, provide the name of the variable instead of the placeholder |
| plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.job-specific-system-properties.allowed-keys plugin.executor-launch-model.ataccama.one.launch-type-properties.SPARK.job-specific-system-properties.allowed-keys | String | A comma-separated list of keys identifying the system properties that can be set when submitting a job, which makes their values specific for a particular job. These job-specific system properties are then passed to each spawned runtime JVM associated with the identified launch type ( |
| | Boolean | If set to  Default value: |
| | String | Specifies the time period after which job results are deleted from DPE, starting from when the job finishes. Typically, the property serves as a backup in case job files could not be deleted after the job was completed. This option applies regardless of how the property  Default value: |
| | Number | Defines how often DPM checks for recently completed jobs in DPE. Expressed in milliseconds. Default value: |
| | Number | The number of old jobs to be cleaned at once. Default value: |
| | Number | The maximum duration of each job-cleaning run. This property must be set with respect to  Default value: |
| | Number | The maximum number of processes or threads that can be run in parallel. Default value: |
| | Boolean | Enables notifications over gRPC from the ONE runtime server for runtime jobs. Default value: |
| | Boolean | Enables notifications over gRPC from the ONE runtime server for data quality monitoring jobs. Default value: |
| | String | A regular expression matching the names of environment variables set for DPE that can be accessed by child processes spawned for runtime jobs. Default value: |
| | Number | The required number of evaluation threads. To set this, use  The variable defines the number of models which are processed in parallel in the DQ engine. Default value: |
| | Boolean | Set this property to  To set this, use  For more information about the query limits, see Data Sources Configuration, section Troubleshooting. Default value: |
| | Number | Batch limit for component rules.  To set this using  Default value: |
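To illustrate the two allowed-keys properties from the table above, the following sketch lists job-specific system property keys for each launch type. The key names (my.custom.option, my.other.option) are hypothetical placeholders, not keys defined by the product; only the property names themselves come from the table.

```properties
# Keys that may be set per job for locally launched jobs (hypothetical key names)
plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.job-specific-system-properties.allowed-keys=my.custom.option,my.other.option

# Keys that may be set per job for Spark jobs (hypothetical key name)
plugin.executor-launch-model.ataccama.one.launch-type-properties.SPARK.job-specific-system-properties.allowed-keys=my.custom.option
```

The values are written as comma-separated lists without spaces, in line with the note at the start of this section about avoiding space characters in property values.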
Shutdown
| Property | Data type | Description | 
|---|---|---|
| 
 | String | The type of application shutdown. If set to  Default value:  | 
| 
 | String | Defines how long a shutdown phase can last. After this time expires, the application shuts down regardless of any active requests. Default value:  | 
| 
 | Boolean | If set to  The waiting period is defined through the property  | 
| 
 | String | How long the application waits for running jobs to complete before shutting down gracefully. Once the timeout is reached, any remaining jobs are canceled. Should be shorter than the value set in  Default value:  | 