DPE Configuration

In on-premise deployments, the following properties configure Data Processing Engine (DPE) and are provided in the dpe/etc/application.properties file.

In addition, the following properties can be specified for DPE as well:

Basic settings

Property Data type Description

ataccama.one.dpe.service.debug

Boolean

Enables debug logging mode. If set to true, job logs are shown in the server logs and temporary files are not cleaned immediately from the job workspace. The scheduler removes temporary job files later depending on the settings.

Default value: false.

ataccama.one.dpe.service.hostname

String

Used to send information to Data Processing Manager (DPM) about the location of DPE. If DPM cannot reach DPE by the machine hostname, this property overrides that hostname.

If not set, the hostname is determined by trying to resolve the DNS records for the localhost address.

Default value: localhost.

ataccama.one.dpe.service.port

Number

Used to send information to DPM about the location of DPE. If DPM cannot reach DPE by the gRPC port, this property overrides that port.

Default value: 8532.

server.port

Number

The HTTP server port for DPE.

Default value: 8034.

ataccama.one.dpe.environment

String

A comma-separated list of environments in which this instance of DPE can be used. In the current release, this only affects user restrictions when accessing the file system data sources.

ataccama.one.dpe.name

String

A meaningful name for DPE that is used in DPM Admin Console. This is especially useful in firewall-friendly mode as it makes it easier to identify DPE instances. We recommend using alphanumeric characters without spaces.

If not set, the default value matches the hostname or the URL and the port of the DPE server. In firewall-friendly mode, the default value is 0.0.0.0 unless modified.

com.ataccama.dpe.sla-monitored

Boolean

When set to true, alerts are sent to Ataccama Support when an on-premise DPE disconnects from ONE.

Default value: false.

The response time is determined by the Service Level Agreement (SLA).
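As an illustration of the basic settings, the following fragment of dpe/etc/application.properties overrides the address that DPE reports to DPM. It is only a sketch: the hostname dpe01.example.com and the instance name dpeOnprem01 are hypothetical values, while the ports are the documented defaults.

```properties
# Hypothetical hostname under which DPM can reach this DPE instance
ataccama.one.dpe.service.hostname=dpe01.example.com
# gRPC port reported to DPM (documented default)
ataccama.one.dpe.service.port=8532
# HTTP server port for DPE (documented default)
server.port=8034
# Hypothetical display name for DPM Admin Console (alphanumeric, no spaces)
ataccama.one.dpe.name=dpeOnprem01
```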

DPE Labels

To configure the DPE label, use the property ataccama.one.dpe.label. By default, the property is set to dpe, which means that all DPE instances where the label is not specifically set are grouped together.

You might want to group DPEs according to certain criteria, such as the connector settings for on-premise deployment. To do this, set the ataccama.one.dpe.label property for each DPE to a value that is unique to that DPE instance.

For example, if you have two DPEs with different connector settings for on-premise deployment, you could set the ataccama.one.dpe.label property to dpe-onprem-1 for the first DPE and dpe-onprem-2 for the second DPE. This would create two separate groups of DPEs, one for each connector setting.

DPEs with different configurations must not have the same label. If you need to make changes to a specific instance, be sure to give it a unique label.

Only use shared labels when you expect the configuration to be identical across all DPE replicas that share the label (for example, for cloud DPEs).
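The two-instance example above could look as follows. Each line belongs to the application.properties file of a different DPE instance; the label values are the ones used in the example.

```properties
# application.properties of the first DPE
ataccama.one.dpe.label=dpe-onprem-1
```

```properties
# application.properties of the second DPE
ataccama.one.dpe.label=dpe-onprem-2
```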

Keycloak authentication

Property Data type Description

ataccama.authentication.keycloak.server-url

String

The URL of the server where Keycloak is running.

Default value: http://localhost:8080/auth.

ataccama.authentication.keycloak.realm

String

The name of the Keycloak realm.

Default value: ataccamaone.

ataccama.authentication.keycloak.admin.client-id

String

The client identifier used to verify the admin user’s authorization token.

Default value: dpe-admin-client.

ataccama.authentication.keycloak.admin.secret

String

The secret key of the client identifier for the admin account. Secret keys can be generated using Keycloak.

Default value: dpe-admin-client-s3cret.

ataccama.authentication.keycloak.token.client-id

String

The client identifier. Used to verify a user’s authorization token and to log in a user.

Default value: dpe-token-client.

ataccama.authentication.keycloak.token.secret

String

The secret key of the client. Secret keys can be generated using Keycloak.

Default value: dpe-token-client-s3cret.

ataccama.authentication.keycloak.token.issuer

String

Specifies the issuer of the JWT token. Typically, Keycloak uses the URL of the realm as the token issuer.

Default value: ${ataccama.authentication.keycloak.server-url}/realms/${ataccama.authentication.keycloak.realm}.

ataccama.authentication.keycloak.admin.type

String

The type of client token authentication. Possible values: BASIC, SIGNED_JWT, SECRET_JWT.

Default value: BASIC.

ataccama.authentication.keycloak.admin.key-store.file

String

Points to the keystore file used for SIGNED_JWT authentication.

ataccama.authentication.keycloak.admin.key-store.format

String

The type of the keystore used for SIGNED_JWT authentication. Possible values: JKS, PKCS12.

Default value: JKS.

ataccama.authentication.keycloak.admin.key-store.password

String

The password of the keystore used for SIGNED_JWT authentication. Used if the keystore is encrypted.

ataccama.authentication.keycloak.admin.key-store.key-alias

String

The private key name specified in the keystore used for SIGNED_JWT authentication.

The default value is the client identifier.

ataccama.authentication.keycloak.admin.key-store.key-password

String

The password for the private key. Used if the private key is encrypted.

The default value is the keystore password.

ataccama.authentication.keycloak.admin.token-expiration

String

Specifies for how long the JWT token used for authentication in Keycloak remains valid. Used for SIGNED_JWT and SECRET_JWT authentication strategies.

Default value: 15s. For a full list of accepted units, see Duration units.
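A minimal sketch of switching the admin client to SIGNED_JWT authentication is shown below. The keystore path, password, and key alias are hypothetical placeholders, and the exact path syntax expected by key-store.file is an assumption; the remaining values are the documented defaults.

```properties
ataccama.authentication.keycloak.server-url=http://localhost:8080/auth
ataccama.authentication.keycloak.realm=ataccamaone
ataccama.authentication.keycloak.admin.client-id=dpe-admin-client
# Use a signed JWT instead of the default BASIC authentication
ataccama.authentication.keycloak.admin.type=SIGNED_JWT
# Hypothetical keystore location, password, and alias
ataccama.authentication.keycloak.admin.key-store.file=/path/to/admin-keystore.p12
ataccama.authentication.keycloak.admin.key-store.format=PKCS12
ataccama.authentication.keycloak.admin.key-store.password=changeit
ataccama.authentication.keycloak.admin.key-store.key-alias=dpe-admin-client
# Validity of the JWT used for authentication (documented default)
ataccama.authentication.keycloak.admin.token-expiration=15s
```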

gRPC Server

General settings

Property Data type Description

ataccama.server.grpc.port

Number

The port where the gRPC server is running.

Default value: 8532.

ataccama.server.grpc.max-message-size

String

Limits the size of messages that the gRPC server can process. The message size needs to fit in the working memory.

Default value: 10MB. For a full list of accepted units, see Size units.

ataccama.server.grpc.executor.poolSizeCore

Number

The gRPC server request executor core pool size. Make sure this value is sufficient for the usual traffic in order to avoid creating additional connections because the current version of the gRPC library has the keep-alive for threads set to 0.

Default value: 20.

Make sure this value corresponds to ataccama.client.grpc.executor.poolSizeCore in DPM Configuration.

ataccama.server.grpc.executor.poolSizeMax

Number

The gRPC server request executor maximum pool size. The queue length of the executor is unlimited by default, and therefore the maximum pool size is effectively ignored. However, the value should be equal to or higher than the core pool size.

Default value: 20.

Make sure this value corresponds to ataccama.client.grpc.executor.poolSizeCore in DPM Configuration.

Authentication

Property Data type Description

ataccama.authentication.grpc.basic.enable

Boolean

Enables basic authentication on the gRPC Server.

Default value: true.

ataccama.authentication.grpc.bearer.enable

Boolean

Enables bearer authentication on the gRPC Server.

Default value: true.

ataccama.authentication.grpc.internal.jwt.enable

Boolean

Enables internal JWT token authentication on the gRPC Server.

Default value: true.

ataccama.authentication.grpc.mtls.enable

Boolean

Enables mTLS authentication on the gRPC Server.

Default value: false.

TLS/mTLS

Property Data type Description

ataccama.server.grpc.tls.enabled

Boolean

Enables TLS authentication on the gRPC server. When set to true, DPE communicates with DPM in a TLS-secured way.

This property can be set for each DPE.

Default value: false.

ataccama.server.grpc.tls.mTls

String

Defines whether mutual TLS authentication is enabled. Possible values: NONE, OPTIONAL, REQUIRED.

When set to OPTIONAL, if the server receives an mTLS request, it attempts to authenticate the request using mTLS.

Disabled by default.

ataccama.server.grpc.tls.trust-all

Boolean

Specifies whether the server allows all TLS connection attempts. Used if mTLS is enabled.

ataccama.server.grpc.tls.cert-chain

String

The full path to the TLS certificate, for example, file:/path/to/server.crt.

ataccama.server.grpc.tls.private-key

String

The full path to the private key of the certificate, for example, file:/path/to/server.key.

ataccama.server.grpc.tls.trust-cert-collection

String

The full path to the public certificate of the root certificate authority, for example, file:/path/to/rootCA.crt.

ataccama.server.grpc.tls.hostname

String

The domain name of the generated server certificate.

ataccama.server.grpc.tls.key-store

String

The full path to the keystore containing private and public key certificates that are used by the gRPC server. This property has a higher priority compared to tls.cert-chain and tls.private-key.

ataccama.server.grpc.tls.key-store-type

String

The type of keystore. Possible values: PKCS12, JCEKS.

ataccama.server.grpc.tls.key-store-password

String

The password for the keystore. Used if the keystore is encrypted.

ataccama.server.grpc.tls.key-alias

String

The private key name specified in the provided keystore.

ataccama.server.grpc.tls.key-password

String

The password for the private key of the gRPC server. Used if the private key is encrypted.

If the private key is not set, the password must be the same for all the items in the keystore.
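For example, TLS on the DPE gRPC server might be enabled with a certificate and private key as sketched below. The file paths are the placeholder paths from the property descriptions and need to be replaced with real locations.

```properties
# Secure the communication between DPE and DPM with TLS
ataccama.server.grpc.tls.enabled=true
# Placeholder paths to the server certificate and its private key
ataccama.server.grpc.tls.cert-chain=file:/path/to/server.crt
ataccama.server.grpc.tls.private-key=file:/path/to/server.key
```

If a keystore is configured through ataccama.server.grpc.tls.key-store, it takes priority over the cert-chain and private-key properties.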

gRPC Client

General settings

Property Data type Description

ataccama.client.grpc.properties.max-message-size

String

Limits the size of messages that the gRPC client can process.

Default value: 4MB. For a full list of accepted units, see Size units.

TLS/mTLS

Property Data type Description

ataccama.client.grpc.tls.enabled

Boolean

Enables TLS authentication when communicating with the gRPC server. It also ensures that the communication with the server over gRPC is secure (encrypted instead of in plaintext) and guarantees the integrity of messages.

Default value: false.

ataccama.client.grpc.tls.mtls

Boolean

Enables mutual TLS authentication between the server and the client.

Default value: false.

ataccama.client.grpc.tls.trust-all

Boolean

Specifies whether the gRPC client should verify the certificate of the server with which it communicates. Used if mTLS is enabled.

ataccama.client.grpc.tls.cert-chain

String

The full path to the TLS certificate, for example, file:/path/to/client.crt.

ataccama.client.grpc.tls.private-key

String

The full path to the private key of the certificate, for example, file:/path/to/client.key.

ataccama.client.grpc.tls.trust-cert-collection

String

The full path to the public certificate of the root certificate authority, for example, file:/path/to/rootCA.crt.

ataccama.client.grpc.tls.trust-store

String

The full path to a truststore file that contains public keys and certificates against which the client verifies the certificates from the server, for example, file:path/to/trust/cert/trust-store.pfx. Used only when tls.trust-all is disabled.

ataccama.client.grpc.tls.trust-store-password

String

The password for the truststore. Used if the truststore is encrypted.

ataccama.client.grpc.tls.trust-store-type

String

The type of truststore. Possible values: PKCS12, JCEKS.

DPM connection

Property Data type Description

ataccama.one.dpe.service.dpm.check-connection-interval

Number

Defines how often DPE checks the connection to DPM. If there is no reply from DPM, DPE tries to register again with DPM. Expressed in milliseconds.

Default value: 5000.

ataccama.client.connection.dpm.name

String

The name of the gRPC channel that DPE uses to communicate with DPM.

Default value: dpm.

ataccama.client.connection.dpm.host

String

The IP address or the URL of the server where DPM is running.

Default value: localhost.

ataccama.client.connection.dpm.grpc.port

Number

The port where the gRPC server is running.

Default value: 8531.

ataccama.client.connection.dpm.grpc.tls.enabled

Boolean

Enables TLS authentication when communicating with the DPM gRPC server. It also ensures that the communication with the server over gRPC is secure (encrypted instead of in plaintext) and guarantees the integrity of messages.

Default value: false.

ataccama.client.connection.dpm.grpc.tls.mTls

String

Defines whether mutual TLS authentication is enabled. Possible values: NONE, OPTIONAL, REQUIRED.

When set to OPTIONAL, if the server receives an mTLS request, it attempts to authenticate the request using mTLS.

Disabled by default.

ataccama.client.connection.dpm.grpc.tls.trust-all

Boolean

Specifies whether the gRPC client should verify the certificate of the server with which it communicates. Used if mTLS is enabled.

ataccama.client.connection.dpm.grpc.tls.cert-chain

String

The full path to the TLS certificate, for example, file:/path/to/client.crt.

ataccama.client.connection.dpm.grpc.tls.private-key

String

The full path to the private key of the certificate, for example, file:/path/to/client.key.

ataccama.client.connection.dpm.grpc.tls.trust-cert-collection

String

The full path to the public certificate of the root certificate authority, for example, file:/path/to/rootCA.crt.

ataccama.client.connection.dpm.grpc.tls.trust-store-file

String

The full path to a truststore file that contains public keys and certificates against which the client verifies the certificates from the server, for example, file:path/to/trust/cert/trust-store.pfx. Used only when tls.trust-all is disabled.

ataccama.client.connection.dpm.grpc.tls.trust-store-password

String

The password for the truststore. Used if the truststore is encrypted.

ataccama.client.connection.dpm.grpc.tls.trust-store-type

String

The type of truststore. Possible values: PKCS12, JCEKS.
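As a sketch, a DPE instance connecting to a remote DPM over TLS could be configured as follows. The host dpm.example.com is a hypothetical value and the certificate path is the placeholder from the property description; the port and interval are the documented defaults.

```properties
# Hypothetical DPM host; 8531 is the documented default gRPC port
ataccama.client.connection.dpm.host=dpm.example.com
ataccama.client.connection.dpm.grpc.port=8531
# Encrypt the gRPC channel to DPM
ataccama.client.connection.dpm.grpc.tls.enabled=true
# Placeholder path to the root CA certificate used to verify DPM
ataccama.client.connection.dpm.grpc.tls.trust-cert-collection=file:/path/to/rootCA.crt
# Check the connection to DPM every 5000 ms (documented default)
ataccama.one.dpe.service.dpm.check-connection-interval=5000
```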

Communication mode between DPE and DPM

Starting from version 13.3.1, it is possible to enable communication over a bidirectional gRPC stream between DPE and DPM. When this firewall-friendly mode is configured, the follow-up communication from DPM to DPE, such as browsing queries or submitting jobs, does not require opening DPE’s inbound ports to the outside world.

If firewall-friendly mode cannot be configured, TLS security can be enforced for all DPE instances by setting the DPM property ataccama.one.dpm.registry.enforce-tls, or it can be enabled on selected DPE instances. We recommend checking the desired security level if any of your DPE instances communicate with DPM over the internet.

Property Data type Description

ataccama.one.dpe.service.dpm.connection.mode

String

Defines how DPE connects to DPM.

The following options are available:

  • NORMAL_REGISTRATION: The default option. In this case, DPM continues attempting to create connections to DPE.

  • FIREWALL_FRIENDLY_REGISTRATION: Enables a firewall-friendly mode of communication between DPM and DPE, which is implemented over a bidirectional gRPC stream that is created each time DPE registers with DPM, including reregistrations.

  • NO_DPM_REGISTRATION: Used for testing purposes. DPE does not attempt to register with DPM.

In cases when connection modes FIREWALL_FRIENDLY_REGISTRATION or NO_DPM_REGISTRATION are used, the following properties are ignored as they are transmitted to DPM only in order to establish a connection to DPE:

  • ataccama.one.dpe.service.hostname

  • ataccama.one.dpe.service.port

Default value: NORMAL_REGISTRATION.

ataccama.one.dpe.service.dpm.connection.firewall-friendly.max-server-termination-waiting

String

Defines how long DPE waits for the in-process server shutdown before shutting it down forcefully. Used only if dpm.connection.mode is set to FIREWALL_FRIENDLY_REGISTRATION.

Default value: 20s. For a full list of accepted units, see Duration units.

ataccama.one.dpe.service.dpm.connection.firewall-friendly.max-connection-inactivity

String

Sets the maximum period of time that DPE waits for a new request from DPM before it attempts to register again. Used only if dpm.connection.mode is set to FIREWALL_FRIENDLY_REGISTRATION.

Default value: 10m. For a full list of accepted units, see Duration units.
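A firewall-friendly setup could be sketched as follows. The timeouts are the documented defaults; the instance name dpeBranchOffice is a hypothetical value, set here because DPE instances are otherwise harder to tell apart in this mode.

```properties
# Communicate with DPM over a bidirectional gRPC stream opened by DPE
ataccama.one.dpe.service.dpm.connection.mode=FIREWALL_FRIENDLY_REGISTRATION
# Wait up to 20s for the in-process server before a forced shutdown
ataccama.one.dpe.service.dpm.connection.firewall-friendly.max-server-termination-waiting=20s
# Register again if no request arrives from DPM within 10 minutes
ataccama.one.dpe.service.dpm.connection.firewall-friendly.max-connection-inactivity=10m
# Hypothetical name that identifies this instance in DPM Admin Console
ataccama.one.dpe.name=dpeBranchOffice
```

In this mode, ataccama.one.dpe.service.hostname and ataccama.one.dpe.service.port are ignored, so they can be left unset.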

MDM connection

Property Data type Description

ataccama.client.connection.dmgraphql.http.enabled

Boolean

If set to true, an HTTP client is created for communicating with MDM.

Default value: true.

ataccama.client.connection.dmgraphql.http.tls.enabled

Boolean

Enables TLS authentication when communicating with MDM.

ataccama.client.connection.dmgraphql.http.tls.trust-store

String

The full path to the truststore, for example, file:/path/to/truststore.

ataccama.client.connection.dmgraphql.http.tls.trust-store-password

String

The password for the truststore.
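For instance, a TLS-secured HTTP connection to MDM might be configured as sketched below. The truststore path is the placeholder from the property description, and the password is a hypothetical value.

```properties
# Create an HTTP client for MDM and secure it with TLS
ataccama.client.connection.dmgraphql.http.enabled=true
ataccama.client.connection.dmgraphql.http.tls.enabled=true
# Placeholder truststore location and hypothetical password
ataccama.client.connection.dmgraphql.http.tls.trust-store=file:/path/to/truststore
ataccama.client.connection.dmgraphql.http.tls.trust-store-password=changeit
```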

Plugins and JDBC drivers

Property Data type Description

plugins.path

String

The location of the plugins folder.

Default value: ./plugin.

ataccama.one.dpe.drivers.path

String

Points to the folder containing the JDBC drivers used by DPE.

Default value: ${ataccama.path.root}/lib/jdbc.

plugin.local-fs-datasource.ataccama.one.mmd-connection-entity-name

String

The connection entity name used in the metadata model (MMD) for local file systems.

plugin.mdm.ataccama.one.mmd-connection-entity-name

String

The connection entity name used in the metadata model (MMD) for ONE MDM data source.

plugin.rdm-datasource.ataccama.one.mmd-connection-entity-name

String

The connection entity name used in the metadata model (MMD) for ONE RDM data source.

plugin.s3-datasource.ataccama.one.mmd-connection-entity-name

String

The connection entity name used in the metadata model (MMD) for S3 data source.

Snowflake query pushdown processing

Property Data type Description

plugin.snowflake.com.ataccama.snowflake.profiling.timeout

String

Sets the period of time after which Snowflake queries that are still running are canceled.

Default value: 1d. For a full list of accepted units, see Duration units.

plugin.snowflake.com.ataccama.snowflake.profiling.query-builder-method

String

Specifies how queries are created. Possible values: SNOWPARK, STRING_BUILDER.

The STRING_BUILDER method is faster but less stable in some cases.

Default value: STRING_BUILDER.

plugin.snowflake.com.ataccama.snowflake.profiling.parallel-step

Number

Defines how many values are computed in a single query. Currently applies only to the basic count query.

To turn the feature off, pick a large number, such as 1000000.

Default value: 16.

plugin.snowflake.com.ataccama.snowflake.profiling.mask-query-method

String

Defines which mask function is used. Possible values: JAVA_UDF, REGEX.

The REGEX method is faster but only supports strings with ASCII characters.

Default value: REGEX.

plugin.snowflake.ataccama.one.udf-sync-enabled

Boolean

Enables automatic upload of user-defined functions to Snowflake.

Default value: true.

plugin.snowflake.ataccama.one.superseded-lookup-max-age

String

Specifies the age at which a lookup with ready status that has been superseded by a newer version is considered for removal.

Default value: 2d. For a full list of accepted units, see Duration units.

plugin.snowflake.ataccama.one.lookup-upload-batch-size

Number

Specifies how many keys are uploaded in a batch during lookup upload.

Default value: 1000.

plugin.snowflake.ataccama.one.lookup-larger-size-boundary

Number

Specifies the maximum number of keys when uploading lookup tables. Larger lookup tables are inserted into a temporary table.

Default value: 100.

plugin.snowflake.ataccama.one.lookup-cleanup-period

String

Specifies how often to check and clean old lookups.

Default value: 1h. For a full list of accepted units, see Duration units.

plugin.snowflake.ataccama.one.initializing-lookup-poll-interval

String

Specifies how often to poll for lookups that are being initialized in another session.

Default value: 1s. For a full list of accepted units, see Duration units.

plugin.snowflake.ataccama.one.initializing-lookup-max-age

String

Specifies the amount of time before a lookup with initializing status is considered for removal.

Default value: 4m. For a full list of accepted units, see Duration units.

plugin.snowflake.ataccama.one.general-lookup-max-age

String

Specifies the age at which a lookup with ready status that has no newer version is considered for removal.

Default value: 28d. For a full list of accepted units, see Duration units.

logging.level.com.ataccama.dpe.plugin.snowflake.profiling

String

If set to DEBUG, enables detailed logging of task performance. Disabled by default.

Default value: DEBUG.
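As a sketch, a configuration that trades speed for stability and non-ASCII support in Snowflake pushdown processing could look like this. It only selects the non-default values listed in the descriptions above; whether this trade-off is worthwhile depends on your data.

```properties
# Build queries with Snowpark instead of the faster STRING_BUILDER method
plugin.snowflake.com.ataccama.snowflake.profiling.query-builder-method=SNOWPARK
# Use the Java UDF mask function, since REGEX only supports ASCII strings
plugin.snowflake.com.ataccama.snowflake.profiling.mask-query-method=JAVA_UDF
# Cancel Snowflake queries that are still running after one day (default)
plugin.snowflake.com.ataccama.snowflake.profiling.timeout=1d
```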

DataConnect plugin

The DataConnect plugin is used to query metadata in a particular data source. The retrieved information can be cached.

Property Data type Description

plugin.data-connect.ataccama.one.max-closing-threads

Number

The maximum number of threads that can be dedicated to closing the clients that have been evicted.

Default value: 10.

plugin.data-connect.ataccama.one.caching.record-stats

Boolean

If set to true, the cache statistics are recorded, which can be useful for performance monitoring. To turn it off, set the property to false.

Default value: true.

plugin.data-connect.ataccama.one.caching.maximum-size

Number

The maximum size of the DataConnect cache for all data sources. The value refers to the number of cached client instances, where each instance corresponds to a combined definition of a data source and one of its credentials.

To disable caching, set the value to 0.

Default value: 50.

plugin.data-connect.ataccama.one.caching.maximum-browse-items-results

Number

The maximum number of browse query results that are cached for each DataSource client.

Default value: 100.

plugin.data-connect.ataccama.one.caching.duration

String

Specifies for how long the entries are stored in the cache.

Default value: 450s. For a full list of accepted units, see Duration units.

plugin.data-connect.ataccama.one.caching.browse-items-lifetime

String

Defines for how long the DataSource client cache stores items for browsing. Starting from 14.1.0, the cache is cleared after this period expires.

Default value: 300s. For a full list of accepted units, see Duration units.
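For example, DataConnect caching could be turned off entirely, as in the following minimal sketch. Disabling the statistics as well is an optional choice made here for illustration.

```properties
# A maximum size of 0 disables the DataConnect client cache
plugin.data-connect.ataccama.one.caching.maximum-size=0
# Cache statistics are of little use without a cache
plugin.data-connect.ataccama.one.caching.record-stats=false
```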

File system

Property Data type Description

plugin.local-fs-datasource.ataccama.one.mounted.paths.default

String

The root path of the default mounted file system.

It is possible to set up multiple file systems in the same DPE. These folders can be used for profiling and browsing data in MMM.

To add another file system, replace the name of the file system (default) in the property name with the name of the new file system and specify its location, for example:

plugin.local-fs-datasource.ataccama.one.mounted.paths.myFileSystem=../../filesystem

The name of a file system must not contain any space characters.

ataccama.one.platform.local-file-systems.<file_system_name>.<environment_name>.allowed-roles

String

Restricts which roles are allowed to work with a particular file system. To set limitations for another file system, make sure to provide the correct file system name and the environment name, for example, ataccama.one.platform.local-file-systems.default.dev.allowed-roles=LFS_PROD.

plugin.local-fs-datasource.ataccama.one.guess.content-size

String

The sample file size for items loaded from local file systems. Used to detect the content type, such as application/octet-stream.

Default value: 100KB. For a full list of accepted units, see Size units.
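Putting these properties together, an additional file system restricted to a single role could be sketched as follows. The file system name myFileSystem and its path come from the example above; the environment dev and the role LFS_DEV are hypothetical values.

```properties
# Environment in which this DPE instance can be used (hypothetical value)
ataccama.one.dpe.environment=dev
# Mount an additional file system named myFileSystem
plugin.local-fs-datasource.ataccama.one.mounted.paths.myFileSystem=../../filesystem
# Allow only the hypothetical role LFS_DEV to work with it in the dev environment
ataccama.one.platform.local-file-systems.myFileSystem.dev.allowed-roles=LFS_DEV
# Sample size used for content type detection (documented default)
plugin.local-fs-datasource.ataccama.one.guess.content-size=100KB
```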

ONE Object Storage

Property Data type Description

ataccama.one.storage-for-removal-check-enabled

Boolean

If set to true, in case an error occurs while writing objects to ONE Object Storage, all previously stored objects are removed during cleanup.

Default value: true.

ataccama.one.storage-for-removal-check-interval

Number

Defines how often the object storage is checked for objects that need to be removed following a storage failure. Expressed in milliseconds.

Default value: 600000.

Persistent storage

Persistent storage is intended to work as DPE’s internal storage. This means that it should not store any user data or, in general, use a database with user data.

There are two types of persistent storage: the default embedded database and a custom data source. Using a custom data source instead of the embedded one is optional and requires further configuration. The driver required for the custom data source must be manually added to the lib folder with other database drivers.

Property Data type Description

ataccama.one.dpe.service.persistence.datasource-type

String

The type of data source for persistent storage. If you are using the default embedded database, set the value to EMBEDDED (default). If you want to use another data source, set the value to CUSTOM.

Default value: EMBEDDED.

ataccama.one.dpe.service.persistence.location

String

Points to the folder storing the persisted data. Persisted data includes all application data, files for processing, and processing results.

Default value: ${ataccama.path.storage}.

Custom data source

Property Data type Description

ataccama.one.dpe.service.persistence.driver-class-name

String

The JDBC driver class name for the custom data source for persistent storage, for example, org.h2.Driver.

ataccama.one.dpe.service.persistence.url

String

A JDBC connection string pointing to the custom data source for persistent storage, for example, jdbc:h2:mem:test.

ataccama.one.dpe.service.persistence.username

String

The username for the custom data source for persistent storage.

Default value: dpe.

ataccama.one.dpe.service.persistence.password

String

The password for the custom data source for persistent storage.

Default value: dpe.
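A custom data source for persistent storage could be configured as in the following sketch, which reuses the example driver and connection string from the property descriptions together with the default credentials. The matching driver is assumed to be already present in the lib folder.

```properties
# Use a custom data source instead of the embedded database
ataccama.one.dpe.service.persistence.datasource-type=CUSTOM
# Example driver class and JDBC URL from the descriptions above
ataccama.one.dpe.service.persistence.driver-class-name=org.h2.Driver
ataccama.one.dpe.service.persistence.url=jdbc:h2:mem:test
# Documented default credentials
ataccama.one.dpe.service.persistence.username=dpe
ataccama.one.dpe.service.persistence.password=dpe
```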

Executor

When setting any launch parameters, make sure the property values (such as file paths) do not include any space characters.

This also applies to values in JAVA_OPTS. However, you can use a space to separate different options in JAVA_OPTS.

Compare the following examples:

  • Correct formatting (space used to separate different values, no spaces within a single value): plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Da=b -Dc=d

  • Incorrect formatting (space used within a single value): plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Da=some value

Property Data type Description

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.cpdelim

String

The classpath delimiter used for libraries for local jobs.

Default value: ; (semicolon).

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.cp.runtime

String

The libraries needed for local jobs. Jobs are stored as tmp/jobs/${jobId} in the root directory of the module.

Starting from 13.4.0, all JDBC libraries need to be specified here in order to use the driverClass in Global runtime configuration.

You can specify:

  • A whole folder, for example:

    plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.cp.runtime=../../../build/runtime/*;../../../lib/jdbc
  • A file, for example:

    plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.cp.runtime=../../../build/runtime/*;../../../lib/jdbc/snowflake-*.jar

Default value: ../../../lib/runtime/*;../../../lib/jdbc_ext/*.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.TEMP_ROOT

String

The directory that holds temporary work files for local jobs. A folder is created for each job within this directory, with the name containing the jobId.

If not specified, the system environment variable tmp is used. If the system environment variable is not set up, the temporary job folder is created in the root directory of the machine.

After each job finishes, the corresponding work files are automatically deleted from this directory.

You can change the default location of this directory by editing the custom startup script that the property plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.exec points to, for example, /opt/ataccama/bin/local/exec_local.sh.

The plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.exec property needs to be configured for this property to work.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.exec

String

References the script for customizing how local jobs are launched. For example, you can change which shell script is used to start the job. The script is located in the bin/local folder in the root directory of the module.

The default value for Linux is local/exec_local.sh.

You can use JAVA_OPTS to pass additional properties to the runtime.

Some useful examples include:

  • plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-XX:MaxRAMPercentage=60: Sets the maximum size of the Java heap as a percentage of the total memory available to the JVM.

  • plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Ddqd.io.storage.compress=compress: If set to compress, the processed records kept in the file storage should be compressed.

    The default value is false. Used mainly for hybrid deployment.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.cp.!exclude

String

(Optional) Excludes certain Java libraries and drivers that are in the same classpath but are not needed for processing.

Each library needs to be prefixed by an exclamation mark (!). Libraries are separated by the delimiter defined in the cpdelim property.

Example: !hadoop*.jar.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.dqc.licenses

String

Points to the location of Ataccama licenses needed for local jobs. The property should be configured only if licenses are stored outside of the standard locations, that is, the home directory of the user and the folder dpe/lib/runtime/license_keys.

Default value: ../../../lib/runtime/license_keys.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_HOME

String

Sets the JAVA_HOME variable for running local jobs, for example, /usr/java/jdk1.8.0_65. The property can only be used together with the property plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.exec.

The runtime is compatible with Java 8 or later.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.ANY_ENVIRONMENT_VARIABLE

String

Configures any environment variable for running local jobs.

To set a custom environment variable, provide the name of the variable instead of the placeholder ANY_ENVIRONMENT_VARIABLE and its value.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.job-specific-system-properties.allowed-keys

plugin.executor-launch-model.ataccama.one.launch-type-properties.SPARK.job-specific-system-properties.allowed-keys

String

A comma-separated list of keys identifying the system properties that can be set when submitting a job, which makes their values specific for a particular job. These job-specific system properties are then passed to each spawned runtime JVM associated with the identified launch type (LOCAL, SPARK).

plugin.executor.ataccama.one.clear-job

Boolean

If set to true, job data is removed immediately after the job finishes.

Default value: true.

plugin.executor.ataccama.one.job-expiration-interval

String

Specifies the time period after which job results are deleted from DPE, starting from when the job finishes. Typically, the property serves as a backup in case job files could not be deleted after the job was completed.

This option applies regardless of how the property plugin.executor.ataccama.one.clear-job is configured.

Default value: 2h. For a full list of accepted units, see Duration units.

plugin.executor.ataccama.one.job-check-interval

Number

Defines how often DPM checks for recently completed jobs in DPE. Expressed in milliseconds.

Default value: 1800000 (30 minutes).

plugin.executor.ataccama.one.job-clean-chunk-size

Number

The number of old jobs to be cleaned at once.

Default value: 1000.

plugin.executor.ataccama.one.job-clean-max-duration

String

The maximum duration of each job-cleaning run. Set this value with respect to plugin.executor.ataccama.one.job-check-interval so that expired jobs can be removed effectively.

Default value: 1m. For a full list of accepted units, see Duration units.
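
As a sketch, the three job-cleaning properties are shown together below with the default values documented above. Note that the check interval is given in milliseconds, while the maximum duration uses a duration unit.

```properties
# Defaults from the descriptions above, listed together for reference.
# Check interval in milliseconds (30 minutes).
plugin.executor.ataccama.one.job-check-interval=1800000
# Number of old jobs cleaned at once.
plugin.executor.ataccama.one.job-clean-chunk-size=1000
# Maximum duration of a single cleaning run.
plugin.executor.ataccama.one.job-clean-max-duration=1m
```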

plugin.executor.ataccama.one.max-parallel-jobs

Number

The maximum number of processes or threads that can be run in parallel.

Default value: 5.

plugin.dqc-job.ataccama.one.notify.grpc

Boolean

Enables notifications over gRPC from the ONE runtime server for runtime jobs.

Default value: true.

plugin.dqm-job.ataccama.one.notify.grpc

Boolean

Enables notifications over gRPC from the ONE runtime server for data quality monitoring jobs.

Default value: true.

ataccama.one.dpe.env.variables.allowed

String

A regular expression matching the names of environment variables set for DPE that can be accessed by child processes spawned for runtime jobs.

Default value: JAVA_HOME|JAVA_OPTS|PATH|Path|LANG|LANGUAGE|LC_.*.
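
To expose additional variables to child processes, the default expression can be extended with further alternatives. In the sketch below, the MY_CUSTOM_.* pattern is a hypothetical addition; the rest is the default value.

```properties
# Default expression plus a hypothetical MY_CUSTOM_.* alternative.
ataccama.one.dpe.env.variables.allowed=JAVA_HOME|JAVA_OPTS|PATH|Path|LANG|LANGUAGE|LC_.*|MY_CUSTOM_.*
```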

dqd.processing.models.parallel

Number

The required number of evaluation threads.

To set this, use JAVA_OPTS, for example:

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Ddqd.processing.models.parallel=5

This system property defines the number of models that are processed in parallel in the DQ engine.

Default value: 1.

jdbc.snowflake.useInformationSchemaQueries

Boolean

Set this property to true if your Snowflake database has a large number of tables (more than 10,000). With the default configuration, such databases can run into query limits.

To set this, use JAVA_OPTS, for example:

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Djdbc.snowflake.useInformationSchemaQueries=true

For more information about the query limits, see Data Sources Configuration, section Troubleshooting.

Default value: false.

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.DQD_RULES_BATCH_SIZE

Number

Batch limit for component rules.

If the batch size value is smaller than or equal to the total number of records in the data set, the results might be inaccurate.

If you want to raise this value above 10000, evaluate it against your DPE server sizing first. Otherwise, it can lead to jobs failing due to DPE running out of memory.

To set this using JAVA_OPTS, use the dqd.rules.batchSize syntax, for example:

plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Ddqd.rules.batchSize=1000

Default value: 1000.
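
The JAVA_OPTS examples in this section all use the same property key. The sketch below combines several options in one value; it assumes that the value is handed to the job JVM as a regular JAVA_OPTS string in which options are separated by spaces. This is an assumption and is worth verifying in your environment.

```properties
# Assumption: several space-separated JVM options are accepted in a single JAVA_OPTS value.
# Repeating the same key on separate lines is unlikely to combine the values,
# because a repeated key in a properties file typically keeps only one of them.
plugin.executor-launch-model.ataccama.one.launch-type-properties.LOCAL.env.JAVA_OPTS=-Ddqd.processing.models.parallel=5 -Ddqd.rules.batchSize=1000
```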

Shutdown

Property Data type Description

server.shutdown

String

The type of application shutdown.

If set to graceful, after receiving the shutdown signal, the application waits for any active requests to finish before proceeding. If set to immediate, the application shuts down as soon as the shutdown signal is received.

Default value: graceful.

spring.lifecycle.timeout-per-shutdown-phase

String

Defines how long a shutdown phase can last. After this time expires, the application shuts down regardless of any active requests.

Default value: 30s. For a full list of accepted units, see Duration units.

plugin.executor.ataccama.one.server.shutdown.kill-jobs

Boolean

If set to true, the application waits for any running jobs to complete before continuing with graceful shutdown.

The waiting period is defined through the property plugin.executor.ataccama.one.server.shutdown.wait-jobs.

plugin.executor.ataccama.one.server.shutdown.wait-jobs

String

How long the application waits for running jobs to complete before shutting down gracefully. Once the timeout is reached, any remaining jobs are canceled.

This value should be shorter than the value set in spring.lifecycle.timeout-per-shutdown-phase. The property is used only if plugin.executor.ataccama.one.server.shutdown.kill-jobs is set to true.

Default value: 20s. For a full list of accepted units, see Duration units.
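
The shutdown properties can be sketched together as follows. The durations are the documented defaults; setting kill-jobs to true is an illustrative choice so that wait-jobs takes effect.

```properties
# Wait for active requests to finish before shutting down.
server.shutdown=graceful
# Upper limit for a shutdown phase.
spring.lifecycle.timeout-per-shutdown-phase=30s
# Illustrative choice: wait for running jobs during graceful shutdown.
plugin.executor.ataccama.one.server.shutdown.kill-jobs=true
# Kept shorter than timeout-per-shutdown-phase; remaining jobs are canceled after this time.
plugin.executor.ataccama.one.server.shutdown.wait-jobs=20s
```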

Accepted units

Duration

Accepted units for time duration are as follows:

  • ns (nanoseconds)

  • us (microseconds)

  • ms (milliseconds)

  • s (seconds)

  • m (minutes)

  • h (hours)

  • d (days)

Size

Accepted units for file or message size are as follows:

  • B (bytes)

  • KB (kilobytes)

  • MB (megabytes)

  • GB (gigabytes)

  • TB (terabytes)
