MDM Web App Runtime Configuration
The runtime configuration file defines resources available to the DQC engine in batch mode and when the online server is started.
Batch mode is implemented via the runcif utility ([installation folder]/runtime/bin), where the path to the runtime configuration file is supplied by the -runtimeConfig <filename> parameter:
runcif.sh -server -serverPort 4040 -runtimeConfig example.runtimeConfig example.plan
The runtime configuration file is referenced in the Server Configuration file, so resources defined in the runtime configuration are available to the online server too.
The default runtime configuration file is default.runtimeConfig, located in [installation folder]/runtime/server/etc/.
Use this file to create your own runtime configuration.
Runtime Resources and Parameters
The following DQC runtime resources and parameters can be configured:
-
Contributed configurations (remote server connections)
-
Data sources
-
Folder shortcuts
-
Runtime components
-
Initial parallelism level
-
Logging
-
Resources folder for workflows
-
Resources configuration for workflows
The configuration file can be created in a text editor or by exporting the current settings of folder shortcuts, data sources, and configured servers in ONE. See Export and Import Runtime Configuration.
Other runtime variables need to be configured manually according to the specifications in this article.
The configuration file is an XML file with the following format:
<?xml version='1.0' encoding='utf-8'?>
<runtimeconfig>
<!-- CONTRIBUTED CONFIGS -->
<contributedConfigs>
<config class="com.ataccama.dqc.processor.support.UrlResourceContributor">
<urls>
<url name="SomeConfiguredServer" user="myusername" password="crypted:DESede:p63913D4fMa175vrXECs1nOHdV1SG5sUto5HhuV6Izg=" url="localhost:22"/>
</urls>
</config>
<config class="com.ataccama.dqc.jms.config.JmsContributor">
<jmsConnections>
<jmsConnection connectionFactory="QueueConnectionFactory" name="someJMSbroker">
<contextParams>
<contextParam name="java.naming.factory.initial" value="org.apache.activemq.jndi.ActiveMQInitialContextFactory"/>
<contextParam name="java.naming.provider.url" value="tcp://acme.com:61616"/>
</contextParams>
</jmsConnection>
</jmsConnections>
</config>
</contributedConfigs>
<!-- DATA SOURCES -->
<dataSources>
<dataSource name="name"
driverClass="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/myDatabase"
user="root"
password="root">
<properties>
<property name="name" value="value" />
</properties>
</dataSource>
</dataSources>
<!--FOLDER SHORTCUTS -->
<pathVariables>
<pathVariable name="MyPath" value="D:/DQC/" />
</pathVariables>
<!-- RUNTIME COMPONENTS -->
<runtimeComponents>
<runtimeComponent class="com.ataccama.dqc.processor.monitoring.file.FileLoggerComp"
fileName="filename"
stdout="true"
loggingIntervalInMins="1" />
</runtimeComponents>
<!-- PARALLELISM LEVEL -->
<parallelismLevel>2</parallelismLevel>
<!-- LOGGING CONFIGURATION FILE -->
<loggingConfig>./etc/logging.xml</loggingConfig>
<!-- WORKFLOWS RESOURCES FOLDER -->
<resourcesFolder>./resources</resourcesFolder>
<!-- WORKFLOWS RESOURCE CONFIGURATION -->
<resources>
<resource id="demo" units="4" name="Demo resource" />
</resources>
</runtimeconfig>
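Because the runtime configuration is plain XML, one quick way to check what a given file defines is to parse it and list its main sections. The following is a minimal sketch using only the Python standard library. The function name summarize_runtime_config and the trimmed sample are illustrative assumptions, not part of the product.

```python
import xml.etree.ElementTree as ET

# Trimmed sample following the format shown above (values are placeholders).
SAMPLE = """<?xml version='1.0' encoding='utf-8'?>
<runtimeconfig>
  <dataSources>
    <dataSource name="name" url="jdbc:mysql://localhost/myDatabase" user="root" password="root"/>
  </dataSources>
  <pathVariables>
    <pathVariable name="MyPath" value="D:/DQC/"/>
  </pathVariables>
  <parallelismLevel>2</parallelismLevel>
  <loggingConfig>./etc/logging.xml</loggingConfig>
</runtimeconfig>
"""


def summarize_runtime_config(xml_text):
    """Return a small summary of selected sections of a runtime configuration."""
    root = ET.fromstring(xml_text)
    if root.tag != "runtimeconfig":
        raise ValueError("unexpected root element: " + root.tag)
    level = root.findtext("parallelismLevel")
    return {
        # Names of all configured data sources.
        "data_sources": [ds.get("name") for ds in root.findall("dataSources/dataSource")],
        # Folder shortcuts as a name -> folder mapping.
        "folder_shortcuts": {
            pv.get("name"): pv.get("value")
            for pv in root.findall("pathVariables/pathVariable")
        },
        "parallelism_level": int(level) if level is not None else None,
        "logging_config": root.findtext("loggingConfig"),
    }


print(summarize_runtime_config(SAMPLE))
```

A summary like this can help confirm that an exported configuration contains the expected data sources and folder shortcuts before it is passed to runcif.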
Contributed Configurations
Contributed configurations contain connection definitions that can be referenced from various steps and configuration files. There are several kinds of such configurations, for example, JMS servers or URLs.
Contributed configurations may be created in ONE in the Servers node in the File Explorer. See Connect to a Server.
URL (Generic Server)
-
name: Name of the URL resource.
-
url: URL address.
-
authConfig: Select the authentication type according to the server security settings. The following options are available:
-
No authentication:
-
class: Select com.ataccama.dqc.processor.bin.config.auth.NoneAuthConfig.
-
Basic authentication:
-
class: Select com.ataccama.dqc.processor.bin.config.auth.BasicAuthConfig.
-
user: User name.
-
password: User password, in either plain or encrypted form. A password can be encrypted using the onlinectl utility. See Encrypt Passwords.
-
OpenID Connect authentication:
-
class: Select com.ataccama.dqc.processor.bin.config.auth.OpenIdConnectAuthConfig.
-
clientId: Client ID.
-
clientSecret: Client secret.
-
tokenEndPointUrl: URL from which the HTTP client obtains the access token. Contact your admin.
<contributedConfigs>
<config class="com.ataccama.dqc.processor.support.UrlResourceContributor">
<urls>
<url name="SomeConfiguredServer" url="testserver.ataccama.com:8888">
<authConfig password="crypted:AES:tjfcHC9iTJpjZmV2y/uFKaX+WZuAZMRRSAzvVFmYVwRRWG6drUKfeBEudUYoV339" class="com.ataccama.dqc.processor.bin.config.auth.BasicAuthConfig" user="test_user"/>
</url>
</urls>
</config>
</contributedConfigs>
Azure Data Lake Storage Gen 1
-
name: Name of the URL resource.
-
accountFQDN: Fully qualified domain name of the account. It can be found in the user settings in Azure. The account FQDN has the <account_name>.azuredatalakestore.net format.
-
clientId: Client ID.
-
clientKey: Client key.
-
authenticationTokenEndpoint: URL from which the HTTP client obtains access token. Contact your admin for details.
-
authenticateUser: Enables username/password authentication. Set to false (the feature is not supported by ADLS yet).
<contributedConfigs>
<config class="com.ataccama.dqc.azure.config.AzureContributor">
<azureConnections>
<azureConnection clientId="00000000-0000-0000-0000-000000000000" authenticateUser="false" clientKey="crypted:AES:vKYpslbnAZVGc8dKV5XB8eAJ0iDlESofid/IZtlYIJKFMVsWtXuazDeOfyK4GPVjgb3L1Frd0yniWHyGfcFYa5PpmEy+oMju6ADsDNuzkQE=" name="ADLSServer" accountFQDN="myaccount.azuredatalakestore.net" authTokenEndpoint="https://login.microsoftonline.com/11111111-1111-1111-1111-111111111111/oauth2/token"/>
</azureConnections>
</config>
</contributedConfigs>
Google Cloud Storage
-
bucket: Specifies the bucket URL associated with your project within Google Cloud Platform.
-
keyFile: Specifies the location of the key file (.json or .p12) on your local hard drive.
-
name: Specifies the server connection name.
-
projectId: Specifies the project ID associated with your project within Google Cloud Platform.
<contributedConfigs>
<config class="com.ataccama.dqc.google.config.GoogleContributor">
<googleConnections>
<googleConnection bucket="ataccama_example" keyFile="C:/Users/test-qa-235308-e21d12de6ee9.json" name="GoogleCloudStorage" projectId="test-qa-235308"/>
</googleConnections>
</config>
</contributedConfigs>
JMS
-
name: Name of the URL resource.
-
connectionFactory: Connection factory class name.
-
user: User name.
-
password: User password, in either plain or encrypted form. A password can be encrypted using the onlinectl utility. See Encrypt Passwords.
-
contextParams (properties): Optional array of Java properties passed to the connection.
<contributedConfigs>
<config class="com.ataccama.dqc.jms.config.JmsContributor">
<jmsConnections>
<jmsConnection connectionFactory="QueueConnectionFactory" name="someJMSbroker">
<contextParams>
<contextParam name="java.naming.factory.initial" value="org.apache.activemq.jndi.ActiveMQInitialContextFactory"/>
<contextParam name="java.naming.provider.url" value="tcp://acme.com:61616"/>
</contextParams>
</jmsConnection>
</jmsConnections>
</config>
</contributedConfigs>
Kafka
-
name: Name of the URL resource.
-
connectionString: Comma-separated list of Kafka broker servers in the <host>:<port> format. For example, kafkabroker1.domain.com:9092,kafkabroker2.domain.com:9092.
-
properties (optional): List of Kafka properties shared by all Kafka steps using the Kafka server connection. For a list of all possible properties, see the official Kafka documentation. A property with the same name defined in a Kafka step overrides the property defined in the Kafka server connection.
<contributedConfigs>
<config class="com.ataccama.dqc.streaming.config.KafkaContributor">
<kafkaConnections>
<kafkaConnection name="KafkaServer" connectString="kafkabroker1.domain.com:9092,kafkabroker2.domain.com:9092">
<properties>
<property name="security.protocol" value="SSL"/>
<property name="ssl.truststore.location" value="/some-directory/kafka.client.truststore.jks"/>
<property name="ssl.truststore.password" value="test1234"/>
<property name="ssl.keystore.location" value="/some-directory/kafka.client.keystore.jks"/>
<property name="ssl.keystore.password" value="test1234"/>
<property name="ssl.key.password" value="test1234"/>
</properties>
</kafkaConnection>
</kafkaConnections>
</config>
</contributedConfigs>
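The override rule for Kafka properties (a property defined in a Kafka step wins over the same property defined in the server connection) can be modeled as a simple dictionary merge. This is only an illustration of the precedence described above. The function name effective_kafka_properties and the step-level values are hypothetical, and the sketch does not show how the product merges properties internally.

```python
def effective_kafka_properties(connection_props, step_props):
    """Merge properties so that step-level values win on a name clash."""
    merged = dict(connection_props)  # start with the shared connection properties
    merged.update(step_props)        # step-level properties override them
    return merged


# Properties from the Kafka server connection (shared by all Kafka steps).
connection_props = {
    "security.protocol": "SSL",
    "ssl.truststore.location": "/some-directory/kafka.client.truststore.jks",
}
# Hypothetical properties set directly in one Kafka step.
step_props = {
    "security.protocol": "PLAINTEXT",
    "max.poll.records": "100",
}

print(effective_kafka_properties(connection_props, step_props))
```

Properties that appear only in the connection remain in effect for the step, so shared settings such as truststore locations need to be defined only once.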
Ataccama ONE
-
password: Password for the specified user.
-
name: Specifies the server connection name.
-
storage: Name of a new or existing DQ project.
-
user: User name.
-
url: Specifies the server URL.
<contributedConfigs>
<config class="com.ataccama.one.client.config.AtaOneContributor">
<oneConnections>
<oneConnection password="crypted:AES:xj4/lUQmoHqDAZwxUujbDtnHY1IBJJDyITA3jECw0k4=" name="ONE" storage="Project" user="username@organization.com" url="https://one.ataccama.com"/>
</oneConnections>
</config>
</contributedConfigs>
Amazon S3
-
clientEncryptKey: A key to encrypt the data on the client’s side. By default, Java limits the maximum key length for encryption to 128 bits. To remove the key length restriction, download the JCE Unlimited Strength policy files to the <JAVA_HOME>/lib/security folder.
-
secretKey: Secret access key associated with the S3 account.
-
accessKey: Access key associated with the S3 account.
-
name: Specifies the server connection name.
-
sseKey: Select the encryption key from the keys generated by the server. If you leave this field empty, a default service key (generated by the server on a customer by service by region level) is used. The field is available only with SSE-KMS server encryption.
-
sseType: Specifies how the server encrypts the data. The following options are available:
-
None: No server-side data encryption. Default value.
-
SSE-S3: Encryption key is generated and selected by the S3 server.
-
SSE-KMS: Encryption key is selected by a user from the keys generated on the server.
-
url: Specifies the server URL in the s3a://<bucket>[/<directory>] format.
<contributedConfigs>
<config class="com.ataccama.dqc.s3.config.S3Contributor">
<s3Connections>
<s3Connection secretKey="crypted:AES:PIJhJbDIXJbr7Gahr67XPNevfmi7X7/QnEMlkW51Ob9pSiNyAkFTplVtwofD52ZLn64h235DICo+hLKNvFkABQ==" accessKey="AKIAJAWAMV3F3O37TPTA" name="s3" sseKey="SERVER_KEY_ID" sseType="NONE" url="https://ataccama.s3.amazonaws.com"/>
</s3Connections>
</config>
</contributedConfigs>
SMTP
-
host: Specifies the SMTP server host.
-
port: Specifies the connection port used by the server.
-
user: User name.
-
password: Password for the specified user, in either plain or encrypted form. A password can be encrypted using the onlinectl utility. See Encrypt Passwords.
<contributedConfigs>
<config class="com.ataccama.dqc.processor.support.SmtpResourceContributor">
<smtpConnections>
<smtpConnection password="crypted:AES:5rNM3amiDCHjOSo3PRdF4scrNEHMhzeKmMr8TlRjLbFvaoDyY18kR8SpS1TXUm/o" port="25" host="smtpserver.company.com" name="SMTPServer" user="test_user"/>
</smtpConnections>
</config>
</contributedConfigs>
Keycloak Deployment Connection
Define the configuration for your Keycloak clients.
The settings in the KeycloakDeploymentContributor should correspond to the Keycloak settings for the client.
The Keycloak client configuration is mapped to the URL pattern in HTTP Dispatcher. See HTTP Dispatcher.
The option to define a Keycloak Deployment Connection is available only for applications running on the Ataccama server (for example, Admin Center). Currently, it does not work with web applications.
-
keycloakConfigs: Define one or multiple Keycloak configurations. It is recommended to define a separate configuration for each Keycloak realm.
-
keycloakConfig
-
name: Unique configuration name.
-
clients: Define one or multiple Keycloak clients.
-
client: Keycloak client configuration.
-
id: Unique client ID. Should correspond to the client name in Keycloak.
-
url: Base URL of the Keycloak server. Should be defined for each client (either directly in the client configuration or inherited from the keycloakConfig parent).
-
realm: Name of the Keycloak realm. Should be defined for each client (either directly in the client configuration or inherited from the keycloakConfig parent).
-
secret (optional): Should correspond to the secret in the Keycloak admin console. Can be in either plain or encrypted form. A secret can be encrypted using the onlinectl utility. See Encrypt Passwords.
-
attributes: List of all other Keycloak client configuration attributes.
The url, realm, and attributes properties can be defined either for all clients (as an attribute of the keycloakConfig property) or for individual clients (as an attribute of the client property). If a property is defined in both places, the client value overrides the value from the parent keycloakConfig.
<contributedConfigs>
<config class="com.ataccama.server.keycloak.KeycloakDeploymentContributor">
<keycloakConfigs>
<keycloakConfig name="localKeycloak">
<!-- Define common parameters for all clients. They can be overridden by client-specific settings.-->
<url>http://localhost:8083/auth</url>
<realm>ataccamaone</realm>
<attributes>
<attribute name="ssl-required" value="external"/>
</attributes>
<clients>
<client id="one-admin-center">
<secret>crypted:AES:DZ+36XQlju1sAAAIS6YUxtbN603Ag+Qxz3mLrNeNnSo=</secret>
<attributes>
<!-- Define client-specific settings.-->
<attribute name="use-resource-role-mappings" value="false"/>
<attribute name="public-client" value="false"/>
<attribute name="bearer-only" value="false"/>
<attribute name="autodetect-bearer-only" value="false"/>
<attribute name="always-refresh-token" value="false"/>
<attribute name="principal-attribute" value="preferred_username"/>
</attributes>
</client>
</clients>
</keycloakConfig>
</keycloakConfigs>
</config>
</contributedConfigs>
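The inheritance rule described above (client-level url, realm, and attributes override the values defined on the parent keycloakConfig) can be sketched as follows. This is an illustrative model using the Python standard library, assuming the element layout from the example above. The function names are hypothetical, and the parent-level public-client attribute in the sample is added only to demonstrate an override.

```python
import xml.etree.ElementTree as ET

# Reduced sample; the parent-level "public-client" attribute is an assumption
# added here to show a client-level override.
SAMPLE = """<keycloakConfig name="localKeycloak">
  <url>http://localhost:8083/auth</url>
  <realm>ataccamaone</realm>
  <attributes>
    <attribute name="ssl-required" value="external"/>
    <attribute name="public-client" value="true"/>
  </attributes>
  <clients>
    <client id="one-admin-center">
      <attributes>
        <attribute name="public-client" value="false"/>
      </attributes>
    </client>
  </clients>
</keycloakConfig>
"""


def read_attributes(element):
    """Collect the direct attributes/attribute children as a name -> value dict."""
    return {a.get("name"): a.get("value") for a in element.findall("attributes/attribute")}


def effective_client_settings(keycloak_config, client_id):
    """Combine parent and client settings; client values take precedence."""
    for client in keycloak_config.findall("clients/client"):
        if client.get("id") == client_id:
            break
    else:
        raise KeyError(client_id)
    attributes = read_attributes(keycloak_config)  # common settings first
    attributes.update(read_attributes(client))     # client-specific overrides
    return {
        "url": client.findtext("url") or keycloak_config.findtext("url"),
        "realm": client.findtext("realm") or keycloak_config.findtext("realm"),
        "attributes": attributes,
    }


config = ET.fromstring(SAMPLE)
print(effective_client_settings(config, "one-admin-center"))
```

In this sample the client inherits url, realm, and ssl-required from the parent and overrides only public-client.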
Data Source
The data source represents the information needed for a data source connection (for example for connection to a database).
Data sources may be created in the IDE in the Databases node in the File Explorer. See Databases.
-
dataSource
-
name: Name of the data source.
-
driverClass: Driver to use for connection to the data source.
-
url: URL address of the data source.
-
user: User name.
-
password: User password, in either plain or encrypted form. A password can be encrypted using the onlinectl utility. See Encrypt Passwords.
-
validationQuery: An SQL SELECT command used to validate a DB connection before using it.
-
minSize: The minimum number of established connections kept in the connection pool at all times. Default value: 1.
Example: If you start the online server with minSize = 2, two database connections are established automatically after the server starts.
-
maxIdleSize: The maximum number of inactive connections kept in the connection pool. All inactive connections exceeding maxIdleSize are disposed of automatically. Default value: 10.
-
maxAge: The maximum time (in milliseconds) an inactive connection can be (re)used in the connection pool. Default value: 0 (unlimited).
Example: With maxAge = 10000, a connection is reused only within a 10-second interval. If another connection request arrives after this interval (for example, a DQC plan with JDBC Reader starts), that connection is disposed of and a new connection is established and used instead.
-
properties: Properties related to the selected database engine. These properties are specific to a particular engine; refer to the engine documentation (for example, the OracleDriver or Connector/J configuration properties reference).
-
name: Name of the property (for example, user or defaultRowPrefetch in Oracle DB).
-
value: Value of the property.
<dataSources>
<dataSource
name="name"
driverClass="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost/myDatabase"
user="root"
password="root"
validationQuery="select 1"
minSize="2"
maxIdleSize="5"
maxAge="60000">
<properties>
<property name="connectTimeout" value="0" />
</properties>
</dataSource>
</dataSources>
Database Drivers
Since 12.5.0, the runtime configuration file is generated in a new way.
The root runtimeconfig element contains a databaseDrivers element with the names and definitions of database drivers.
When this approach is used, there is no need to store drivers (*.jar files) in <DQC_HOME>/lib, even if you run a workflow with Run DQC as Process or Run DQC on Cluster.
The classpath of the driver can contain:
-
Concrete *.jar files
-
Folders with *.jar files
-
Folders
A separate classloader is created for such a driver.
All required *.jar files have to be included in the classpath entries.
Driver *.jar files should not be specified in cp/lcp properties.
For backward compatibility, the runtime configuration file can also contain data sources defined by the old approach (the driverClass attribute).
In that case, *.jar files have to be placed in the <DQC_HOME>/lib folder.
Driver *.jar files have to be specified in cp/lcp properties.
When the runtime configuration file is imported, driver definitions are imported as well. If a driver definition with the same name already exists in ONE, it is used; otherwise, a new driver definition is created from the runtime configuration file. If a file is selected (for example, in File Explorer) when the Import Runtime Configuration wizard is opened, the file is prefilled in the wizard. On the Database preference page, you can add or edit database drivers (see Databases, section Installing Database Connectivity Drivers).
<?xml version='1.0' encoding='UTF-8'?>
<runtimeconfig>
<dataSources>
<!-- Oracle defined by old approach - driver class - DQC classloader will be used -->
<dataSource password="crypted:AES:FX1xWJBTX63gNzB3UFdkCPKvapujpE4TcM2TSdcSftg="
name="Oracle"
driverClass="oracle.jdbc.driver.OracleDriver"
user="test"
url="jdbc:oracle:thin:@dbase.ataccama.com:1521/ora12c"/>
<!-- Oracle defined by new approach - driver name - separate classloader will be used -->
<dataSource password="crypted:AES:FX1xWJBTX63gNzB3UFdkCPKvapujpE4TcM2TSdcSftg="
name="Oracle"
driverName="Oracle"
user="test"
url="jdbc:oracle:thin:@dbase.ataccama.com:1521/ora12c"/>
<!-- Hive Knox defined by new approach - driver name - separate classloader will be used -->
<dataSource password="crypted:AES:RGc9i5omV0SZeSaired+OlVLu5XOl8n9AHxZ9Hj"
name="Apache Hive Knox"
driverName="Apache Hive Knox"
user="sam"
url="jdbc:hive2://hadr.ataccama.com:8443/;ssl=true;transportMode=http;httpPath=gateway/default/hive"/>
</dataSources>
<databaseDrivers>
<!-- Knox driver defined by JARs -->
<databaseDriver driverClass="org.apache.hive.jdbc.HiveDriver" name="Apache Hive Knox">
<classpath>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\slf4j-log4j12-1.7.7.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\commons-codec-1.6.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\httpcore-4.4.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\libthrift-0.9.3.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\log4j-1.2.14.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\guava-14.0.1.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\httpclient-4.4.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\hive-service-1.2.1000.2.6.3.0-235.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\commons-lang-2.6.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\slf4j-api-1.7.7.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\hive-jdbc-1.2.1000.2.6.3.0-235.jar"/>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\commons-logging-1.1.3.jar"/>
</classpath>
</databaseDriver>
<!-- Knox driver defined by JAR folder -->
<databaseDriver driverClass="org.apache.hive.jdbc.HiveDriver" name="Apache Hive Knox">
<classpath>
<classpathEntry path="C:\Hive-jdbc\hdp-hive-1.2-thin-knox\*"/>
</classpath>
</databaseDriver>
<!-- Oracle driver defined by JARs -->
<databaseDriver driverClass="oracle.jdbc.driver.OracleDriver" name="Oracle">
<classpath>
<classpathEntry path="C:\Workspaces\dqcruntime\lib\jdbc\oracle\ojdbc7-12.1.0.2.0.jar"/>
</classpath>
</databaseDriver>
</databaseDrivers>
</runtimeconfig>
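Since a data source defined by the new approach refers to its driver by name, a configuration is only consistent when every driverName has a matching databaseDriver definition. The sketch below shows one way to check this with the Python standard library. The function name check_driver_references, the data source names, and the lib/ojdbc.jar path are hypothetical, and the check is not something the product is documented to perform.

```python
import xml.etree.ElementTree as ET

# Reduced sample: one legacy data source (driverClass), one data source with a
# defined driver, and one whose driver definition is missing on purpose.
SAMPLE = """<runtimeconfig>
  <dataSources>
    <dataSource name="OracleOld" driverClass="oracle.jdbc.driver.OracleDriver" user="test"/>
    <dataSource name="OracleNew" driverName="Oracle" user="test"/>
    <dataSource name="Hive" driverName="Apache Hive Knox" user="sam"/>
  </dataSources>
  <databaseDrivers>
    <databaseDriver driverClass="oracle.jdbc.driver.OracleDriver" name="Oracle">
      <classpath>
        <classpathEntry path="lib/ojdbc.jar"/>
      </classpath>
    </databaseDriver>
  </databaseDrivers>
</runtimeconfig>
"""


def check_driver_references(xml_text):
    """Return (legacy data sources, data sources with an undefined driverName)."""
    root = ET.fromstring(xml_text)
    defined = {d.get("name") for d in root.findall("databaseDrivers/databaseDriver")}
    legacy, missing = [], []
    for ds in root.findall("dataSources/dataSource"):
        driver_name = ds.get("driverName")
        if driver_name is None:
            # Old approach: the driver is expected in <DQC_HOME>/lib.
            legacy.append(ds.get("name"))
        elif driver_name not in defined:
            missing.append((ds.get("name"), driver_name))
    return legacy, missing


legacy, missing = check_driver_references(SAMPLE)
print("Old approach:", legacy)
print("Undefined drivers:", missing)
```

The legacy list also serves as a reminder of which data sources still need their *.jar files in <DQC_HOME>/lib.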
Folder Shortcuts
A path to the file can be specified as an absolute path, a relative path, or using folder shortcuts. A folder shortcut is a named path to the file or folder.
Folder shortcuts may be created in the Folder Shortcuts node in the File Explorer. See Folder Shortcuts for detailed instructions.
-
name: Name of folder shortcut.
-
value: Real folder represented by this shortcut.
<pathVariables>
<pathVariable name="MyPath" value="D:/DQC/" />
</pathVariables>
Example of usage:
Name of the folder shortcut | MyPath |
---|---|
Value of the folder shortcut | D:/DQC/ |
Example of path using folder shortcut | pathvar://MyPath/MyProject/config.xml |
Real path to the file | D:/DQC/MyProject/config.xml |
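The mapping in the table above can be expressed as a small resolver. This is a minimal sketch of the substitution only, assuming the pathvar://<name>/<rest> form shown in the example. The function name resolve_shortcut is hypothetical and does not correspond to a product API.

```python
PREFIX = "pathvar://"


def resolve_shortcut(path, shortcuts):
    """Replace a pathvar://<name>/... prefix with the folder configured for <name>."""
    if not path.startswith(PREFIX):
        return path  # absolute or relative path, nothing to resolve
    name, _, rest = path[len(PREFIX):].partition("/")
    if name not in shortcuts:
        raise KeyError("unknown folder shortcut: " + name)
    # Avoid a doubled separator when the shortcut value ends with a slash.
    return shortcuts[name].rstrip("/") + "/" + rest


shortcuts = {"MyPath": "D:/DQC/"}
print(resolve_shortcut("pathvar://MyPath/MyProject/config.xml", shortcuts))
```

Paths that do not start with pathvar:// are returned unchanged, which matches the statement that absolute and relative paths can be used alongside folder shortcuts.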
Runtime Components
Runtime components are components that enhance the functionality of the DQC server. Their parameters are configured in a runtime configuration file. Currently, only one component is supported.
Supported types of runtime components:
-
File Logger Component: This component is used for monitoring the values of counters in the DQC server and logging these values to a file.
-
class: com.ataccama.dqc.processor.monitoring.file.FileLoggerComp (the class attribute always has this value for FileLoggerComp).
-
fileName: Name of the file where the values of counters are logged.
-
stdout: Boolean flag. If set to true, the values of counters are printed to the console (and to the file). If set to false, the values are logged only to the file.
-
loggingIntervalInMins: Interval (in minutes) at which counter values are logged.
<runtimeComponents>
<runtimeComponent class="com.ataccama.dqc.processor.monitoring.file.FileLoggerComp"
fileName="filename" stdout="true" loggingIntervalInMins="1">
</runtimeComponent>
</runtimeComponents>
Parallelism Level
By default, each step is spawned in a single thread. The parallelismLevel property sets the initial number of threads.
Note that only filters can be run in parallel; complex steps ignore this setting and always run in an unmodifiable single step configuration.
Currently, the maximum number of threads per step is unlimited, though it is good practice not to exceed the number of CPUs in the system.
<parallelismLevel>2</parallelismLevel>
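The recommendation not to exceed the number of CPUs can be applied when choosing a value for parallelismLevel. The helper below is a hypothetical sketch that caps a requested level at the CPU count of the machine. It is not part of the product, which does not enforce such a limit according to the text above.

```python
import os


def recommended_parallelism(requested, cpu_count=None):
    """Cap the requested parallelism level at the number of available CPUs."""
    if requested < 1:
        raise ValueError("parallelism level must be at least 1")
    # os.cpu_count() may return None, so fall back to a single CPU.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    return min(requested, cpus)


print(recommended_parallelism(8))
```

The optional cpu_count argument makes it possible to compute a value for a target server other than the machine the script runs on.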
Logging
Path to the logging configuration file. See Logging Configuration. The path can be absolute or relative. A relative path is resolved against the location of this file.
<loggingConfig>./etc/logging.xml</loggingConfig>
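To illustrate how a relative path is resolved, the sketch below joins it with the folder containing the runtime configuration file. It assumes that "the location of this file" means that folder and uses POSIX-style paths only. The function name resolve_against_config and the /opt/ataccama installation path are hypothetical.

```python
import posixpath


def resolve_against_config(config_file, configured_path):
    """Resolve a configured path relative to the folder of the runtime configuration file."""
    if posixpath.isabs(configured_path):
        return configured_path  # absolute paths are used as they are
    base_dir = posixpath.dirname(config_file)
    return posixpath.normpath(posixpath.join(base_dir, configured_path))


# Hypothetical location of the runtime configuration file.
config_file = "/opt/ataccama/runtime/server/etc/default.runtimeConfig"
print(resolve_against_config(config_file, "./etc/logging.xml"))
```

Under this assumption, moving the runtime configuration file to another folder also changes where the relative paths in it point.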
Workflow Resources Folder
Location of workflow resources, for example, run result logs. A relative path is resolved against the location of this file.
<resourcesFolder>./resources</resourcesFolder>
Resources
Configuration of resources allocated for workflows. See Workflow Resource Management for configuration instructions.
<resources>
<resource id="db-oracle" units="4" name="DB Oracle (connections)"/>
<resource id="memory" units="4096" name="Memory (MB)" />
</resources>