DQ&C 17.1.0 Upgrade Notes

New Snowflake pushdown processing

Snowflake pushdown processing now supports new capabilities on Snowflake connections. See the 17.1.0 release notes for details.

After upgrade, existing Snowflake connections continue to use the previous pushdown implementation.

To switch to the new pushdown on a connection:

  1. Review the differences between the two pushdown settings.

  2. Verify that the DQ rules applied to catalog items and used in monitoring projects on this connection are compatible with the new pushdown.

    To check whether an individual rule is supported, see Verify rule support.

    Rules that aren’t supported by the new pushdown can’t run after switching. Rewrite such rules using only supported functions, or keep Legacy pushdown enabled on connections that use them.

  3. In the Pushdown processing section of the connection, disable Legacy pushdown.

In phased upgrades where the Data Processing Module (DPM) is updated to 17.1.0 before the Data Processing Engines (DPEs), jobs submitted on connections with Legacy pushdown turned off fail explicitly on DPEs older than 17.1.0 rather than silently downgrading. Upgrade all DPEs to 17.1.0 or later before disabling Legacy pushdown on any connection.
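The ordering constraint above can be checked before you change any connection. The following is a minimal sketch, assuming you have collected the DPE version strings yourself: the `dpe_versions` inventory and the engine names are hypothetical, and the script does not call any Ataccama API. It also assumes plain dotted numeric versions such as 17.1.0.

```python
REQUIRED = (17, 1, 0)

def parse_version(text):
    """Turn '17.1.0' into (17, 1, 0) so versions compare numerically."""
    parts = [int(part) for part in text.split(".")]
    parts += [0] * (3 - len(parts))  # treat '17.1' as '17.1.0'
    return tuple(parts[:3])

def outdated_engines(dpe_versions):
    """Return the names of DPEs older than 17.1.0, sorted by name."""
    return sorted(
        name for name, version in dpe_versions.items()
        if parse_version(version) < REQUIRED
    )

# Hypothetical inventory of engines and their versions.
dpe_versions = {"dpe-a": "17.1.0", "dpe-b": "16.3.2", "dpe-c": "17.2.1"}
blocked = outdated_engines(dpe_versions)
if blocked:
    print("Upgrade before disabling Legacy pushdown:", ", ".join(blocked))
```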

Not applicable DQ results

DQ evaluation rules can now produce a Not applicable result for records where a rule cannot be meaningfully applied. Such records are excluded from the overall quality calculation instead of counting as passed or failed. See Not applicable results for details.

For most uses, no action is required after upgrade. Existing rules behave as before, and the overall quality calculation is unchanged. Not applicable takes effect only after you configure a Not applicable result on a dimension and select it in a rule condition.

If you start using Not applicable results and process DQ output programmatically, review the following:

  • DQ firewall API responses can return Not applicable as a result, in addition to passed and failed. If your integration branches on the firewall result, handle this third state.

  • Post-processing plan exports that already exist are not affected. Newly created plans include the not_applicable_rules and not_applicable_rules_explanation columns.
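As an illustration of handling the third state, the sketch below tallies firewall results and computes quality over applicable records only, mirroring the rule that Not applicable records are excluded from the overall calculation. The result labels (`PASSED`, `FAILED`, `NOT_APPLICABLE`) and the flat list of results are assumptions made for this example. Check the values and response shape that your firewall API actually returns.

```python
from collections import Counter

# Assumed labels, not taken from the API reference.
PASSED, FAILED, NOT_APPLICABLE = "PASSED", "FAILED", "NOT_APPLICABLE"

def summarize(results):
    """Count results per state; compute quality over passed and failed only."""
    counts = Counter()
    for result in results:
        if result not in (PASSED, FAILED, NOT_APPLICABLE):
            # Fail loudly instead of treating unknown states as passed or failed.
            raise ValueError(f"unexpected firewall result: {result!r}")
        counts[result] += 1
    applicable = counts[PASSED] + counts[FAILED]
    quality = counts[PASSED] / applicable if applicable else None
    return counts, quality

counts, quality = summarize([PASSED, FAILED, NOT_APPLICABLE, PASSED])
print(dict(counts), quality)
```

Returning `None` when no record is applicable avoids a division by zero and keeps "no applicable records" distinct from "0% quality".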

Connection-timeout-property for JDBC drivers

All JDBC data source drivers now require the connection-timeout-property setting.

The property specifies the driver-level property name and unit used to enforce a socket or read timeout on JDBC connections. If it is missing, the DPE can become completely blocked when a data source is unresponsive.

The property follows this pattern:

plugin.jdbcdatasource.ataccama.one.driver.<driverId>.connection-timeout-property = <property>, <unit>

Where:

  • <property> is the JDBC driver property name.

  • <unit> is s (seconds) or ms (milliseconds).

Set the value to NONE for drivers that do not support a timeout property.

When upgrading to 17.1.0, verify that each configured driver (default or custom) includes connection-timeout-property.
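One way to run this verification is a small script over the properties text. The sketch below is an assumption-laden example rather than a supported tool: the driver IDs (`postgresql`, `custom1`) and the `.name` sub-key are hypothetical, driver IDs are assumed not to contain dots, and only simple `key = value` lines are parsed (no line continuations).

```python
import re

KEY = re.compile(r"^plugin\.jdbcdatasource\.ataccama\.one\.driver\.([^.]+)\.(.+)$")
# Either NONE, or "<property>, <unit>" where unit is s or ms.
VALUE = re.compile(r"^(NONE|[^,\s]+\s*,\s*(s|ms))$")

def check_timeout_properties(text):
    """Return {driver_id: problem} for drivers with a missing or malformed setting."""
    drivers, timeouts = set(), {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        match = KEY.match(key)
        if not match:
            continue
        driver_id, setting = match.groups()
        drivers.add(driver_id)
        if setting == "connection-timeout-property":
            timeouts[driver_id] = value
    problems = {}
    for driver_id in sorted(drivers):
        if driver_id not in timeouts:
            problems[driver_id] = "missing"
        elif not VALUE.match(timeouts[driver_id]):
            problems[driver_id] = f"malformed: {timeouts[driver_id]!r}"
    return problems

# Hypothetical configuration fragment.
sample = """
plugin.jdbcdatasource.ataccama.one.driver.postgresql.name = PostgreSQL
plugin.jdbcdatasource.ataccama.one.driver.postgresql.connection-timeout-property = socketTimeout, s
plugin.jdbcdatasource.ataccama.one.driver.custom1.name = Custom
"""
print(check_timeout_properties(sample))
```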

Reference values for default drivers

For full configuration details, see Data Sources Configuration.

Driver                      Value
Amazon Aurora MySQL         socketTimeout, ms
Amazon Aurora PostgreSQL    socketTimeout, s
Amazon Redshift             socketTimeout, s
Apache Cassandra            ReadTimeoutMillis, ms
Arrow Flight SQL (Dremio)   socketTimeout, ms
AWS Athena                  SocketTimeout, s
Azure Data Explorer (ADX)   socketTimeout, ms
Azure Synapse Analytics     socketTimeout, ms
BigQuery                    Timeout, s
IBM Db2                     blockingReadConnectionTimeout, s
IBM Netezza                 loginTimeout, s
Informix                    INFORMIXCONTIME, s
MariaDB                     socketTimeout, ms
MS SQL                      socketTimeout, ms
MySQL                       socketTimeout, ms
Oracle                      oracle.jdbc.ReadTimeout, ms
PostgreSQL                  socketTimeout, s
SAP HANA                    webSocketPingTimeout, s
Snowflake                   loginTimeout, s
SQLite                      NONE
Sybase                      com.sybase.CORBA.socketTimeout, s
Teradata                    NONE

Databricks JDBC batch inserts

Starting with versions Simba 2.6.38 and OSS 3.0.5 of the Databricks JDBC drivers, the JDBC Writer inserts records one at a time, which slows down writes. Ataccama ONE now automatically restores batched inserts for Simba 2.7.0+ and OSS 3.0.5+.

If you are using Simba 2.6.38–2.6.x, manually add ;EnableNativeParameterizedQuery=0 to your connection string.
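If you maintain connection strings in scripts, the manual change can be applied idempotently. This is a minimal sketch: the host name and `httpPath` value are placeholders, and the helper name `with_batching_flag` is invented for this example. It leaves the string unchanged when the flag is already present, so an explicit existing value is not overridden.

```python
FLAG = "EnableNativeParameterizedQuery"

def with_batching_flag(jdbc_url):
    """Append ;EnableNativeParameterizedQuery=0 unless the flag is already set."""
    # Everything after the first semicolon is a list of name=value properties.
    names = [
        part.split("=", 1)[0].strip().lower()
        for part in jdbc_url.split(";")[1:]
    ]
    if FLAG.lower() in names:
        return jdbc_url
    return jdbc_url.rstrip(";") + f";{FLAG}=0"

# Placeholder connection string.
url = "jdbc:databricks://example.cloud.databricks.com:443;httpPath=/sql/1.0/warehouses/abc"
print(with_batching_flag(url))
```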

Basic Authentication no longer supported for Salesforce CData JDBC driver

Starting with v25.0.9434.0, Basic Authentication is obsolete for the Salesforce CData JDBC driver.

We recommend switching to a supported authentication scheme. For more information, refer to the CData JDBC Driver for Salesforce documentation.

To continue using Basic Authentication, update your JDBC connection string. For details, see Salesforce CData JDBC.

Advanced encryption between DPM and DPE

Apply advanced encryption to communication channels between Data Processing Module (DPM) and Data Processing Engines (DPEs), such as gRPC messages and configuration data.

We recommend switching to advanced encryption at your earliest convenience.

Before you start, ensure there are no encryption-related warnings on the Engines tab in the DPM Admin Console. If there are, update the DPE configuration to resolve the warnings, or remove the affected engine.

If advanced encryption is not configured, the environment continues to work using standard encryption, which is now considered obsolete.

If you are using an Ataccama Cloud environment, contact Ataccama Support for configuration assistance.

For configuration details, see Advanced encryption between DPM and DPE.

DPM database requires free space for upgrade

During the upgrade, the dpm_job_info table in the DPM database is split into two tables. While the migration is in progress, the database temporarily holds an additional copy of this table.

Before upgrading, make sure the DPM database has enough free disk space to accommodate another copy of the dpm_job_info table.

To estimate how much space is needed, check the current size of the table:

SELECT
    pg_size_pretty(pg_total_relation_size('dpm_job_info'))                          AS total,
    pg_size_pretty(pg_table_size('dpm_job_info'))                                   AS table_plus_toast,
    pg_size_pretty(pg_indexes_size('dpm_job_info'))                                 AS indexes,
    pg_size_pretty(pg_relation_size('dpm_job_info'))                                AS main_heap_only,
    pg_size_pretty(pg_total_relation_size(
        (SELECT reltoastrelid FROM pg_class WHERE relname = 'dpm_job_info')))       AS toast_total;

The total value indicates the absolute minimum amount of free space required. We recommend allowing for additional headroom beyond this value.
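As a worked example of the headroom calculation, the sketch below compares available space against one full copy of the table plus a margin. The 20% margin and the sizes are illustrative assumptions, not official guidance. To get a raw byte count to feed in, run `pg_total_relation_size('dpm_job_info')` without `pg_size_pretty`.

```python
import math

GIB = 1024 ** 3

def required_free_bytes(table_total_bytes, headroom=0.2):
    """Space for one extra copy of the table plus a safety margin (assumed 20%)."""
    return math.ceil(table_total_bytes * (1 + headroom))

# Illustrative numbers only.
table_total = 12 * GIB   # result of pg_total_relation_size('dpm_job_info')
free = 14 * GIB          # free space on the database volume
needed = required_free_bytes(table_total)
status = "OK" if free >= needed else "free up space before upgrading"
print(f"need {needed / GIB:.1f} GiB, have {free / GIB:.1f} GiB: {status}")
```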

Changes to pushdown profiling retry behavior

Pushdown profiling uses a new fault tolerance mechanism across all data sources that support pushdown processing (BigQuery, Databricks, IOMETE, Snowflake, Azure Synapse Analytics):

  • Transient query failures are retried with exponential backoff, configurable per data source using the new properties com.ataccama.<data_source_type>.profiling.max-retry-count, com.ataccama.<data_source_type>.profiling.initial-retry-delay, and com.ataccama.<data_source_type>.profiling.max-retry-delay. Errors that cannot be retried, such as insufficient permissions or invalid credentials, fail immediately.

  • If any query exhausts all retry attempts or encounters a non-retryable error, the whole profiling job is canceled instead of waiting for the remaining queries to finish.

  • If the connection pool is exhausted, DPE waits for a connection to become available instead of failing the query.
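To get a feel for how the three retry properties interact, the sketch below prints a delay schedule. It assumes the delay doubles after each attempt and is capped at the maximum. The actual growth factor and any jitter used by the product are not stated here, and the input values are illustrative rather than product defaults.

```python
def backoff_delays(max_retry_count, initial_delay_s, max_delay_s, factor=2):
    """Delay before each retry: grows by `factor`, capped at max_delay_s."""
    delays = []
    delay = initial_delay_s
    for _ in range(max_retry_count):
        delays.append(min(delay, max_delay_s))
        delay *= factor
    return delays

# Illustrative values for max-retry-count, initial-retry-delay, max-retry-delay.
schedule = backoff_delays(max_retry_count=5, initial_delay_s=10, max_delay_s=60)
print(schedule, "total wait:", sum(schedule), "s")
```

Summing the schedule gives a rough upper bound on how long a single query can keep retrying before the whole profiling job is canceled.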

For Snowflake, this new retry mechanism applies only to the new Snowflake pushdown processing. Legacy pushdown continues to use the existing retry behavior and default values (max-retry-count: 4, max-retry-delay: 10m). Note that the same properties apply to both profiling paths: if you have configured them explicitly, the values are used by both. Otherwise, each implementation uses its respective default values.

For more information, see Pushdown profiling retry configuration.

DQ Issue Tracker Web Application moves to Spring Boot

Starting with 17.1.0, the DQ Issue Tracker (DQIT) Web Application is a Spring Boot application, consistent with the other ONE modules.

It no longer runs as a WAR file (epp-webapp-<version>.war) deployed to Apache Tomcat. Instead, it is deployed under /opt/ataccama/one, runs as a systemd service, and is configured through etc/application.properties. Apache Tomcat is no longer required for the DQIT Web Application.

This is a breaking change for self-managed (on-premise) deployments. The previous XML configuration (WEB-INF/config.xml, WEB-INF/web.xml, and similar files), WEB-INF/classes/setup.properties, and META-INF/context.xml are replaced by etc/application.properties using the app.*, spring.datasource.*, keycloak.*, and spring.security.oauth2.* property namespaces.

Secrets (the database password, Keycloak client secrets, and Sentry DSN) and the license folder path are not stored in this file. They are injected through systemd environment variables: SPRING_DATASOURCE_PASSWORD, KEYCLOAK_CREDENTIALS_SECRET, KEYCLOAK_WEBAPP_CLIENT_SECRET, KEYCLOAK_STEPS_CLIENT_SECRET, ATACCAMA_ONE_SENTRY_DSN, and LICENSE_FOLDER.
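A quick pre-start check can catch a missing variable before the service fails at startup. The sketch below is a generic helper, not part of the product, and it assumes every listed variable is mandatory in your deployment. Drop any that are optional for you (for example, the Sentry DSN if you do not use Sentry).

```python
import os

REQUIRED_VARS = [
    "SPRING_DATASOURCE_PASSWORD",
    "KEYCLOAK_CREDENTIALS_SECRET",
    "KEYCLOAK_WEBAPP_CLIENT_SECRET",
    "KEYCLOAK_STEPS_CLIENT_SECRET",
    "ATACCAMA_ONE_SENTRY_DSN",
    "LICENSE_FOLDER",
]

def missing_vars(env=None):
    """Return the names of variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_vars()
if missing:
    print("Not set:", ", ".join(missing))
```

Run it in the same environment as the service (for example, with the variables from the systemd unit loaded), since an interactive shell usually does not see them.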

The DQIT Server (engine) deployment is unchanged.

For the full upgrade procedure, see the DQ Issue Tracker Upgrade section in Self-Managed Deployment and Upgrade.

Configuration migration

The following table maps the previous DQIT Web Application configuration to the new etc/application.properties settings.

Previous configuration: WEB-INF/classes/setup.properties > license.folder
New etc/application.properties setting: app.license.folder
Notes: Defined as app.license.folder=${LICENSE_FOLDER:<install-license-dir>}: it defaults to the install license directory but is overridden at runtime by the LICENSE_FOLDER environment variable. The license.plf file is deployed to <install_dir>/license/.

Previous configuration: WEB-INF/classes/setup.properties > data.folder
New etc/application.properties setting: app.base.dir, app.project.path, app.attachment.path, app.attachments.allowed.upload.folders, app.lucene.index.path
Notes: There is no longer a single data.folder. Paths are set explicitly through individual app.* keys. app.base.dir is the install base directory and app.project.path points to <install_dir>/webapp-dqit/WEB-INF.

Previous configuration: META-INF/context.xml (datasource)
New etc/application.properties setting: spring.datasource.url, spring.datasource.username
Notes: Database connection for the DQ Issue Repository (storage database). The password is not stored in the file: it is injected through the SPRING_DATASOURCE_PASSWORD environment variable. The connection pool (spring.datasource.hikari.*) is preconfigured.

Previous configuration: WEB-INF/config.xml, WEB-INF/app-config.xml
New etc/application.properties setting: app.main.config.file=classpath:config/config.xml, app.config.file=classpath:config/app-config.xml
Notes: These XML files are not removed. They are bundled on the classpath and referenced through app.* properties.

Previous configuration: WEB-INF/web.xml
New etc/application.properties setting: (removed)
Notes: Servlet wiring is now handled by Spring Boot.

Previous configuration: WEB-INF/workflows.xml, WEB-INF/metadata.xml
New etc/application.properties setting: app.project.path
Notes: app.project.path resolves to <install_dir>/webapp-dqit/WEB-INF, where metadata.xml, workflows.xml, and messages.properties are deployed. These files are still generated and deployed (not replaced), and a WEB-INF folder still exists.

Previous configuration: Security file (default dqit-security-openid.xml), keycloak-webapp.json, keycloak-steps.json
New etc/application.properties setting: keycloak.* (for example, keycloak.auth-server-url, keycloak.realm, keycloak.resource, keycloak.webapp.client-id, keycloak.steps.client-id, keycloak.redirect-uri) and spring.security.oauth2.client.registration.keycloak.* / spring.security.oauth2.client.provider.keycloak.*
Notes: SSO / Keycloak configuration. Client secrets are not stored in the file: they are injected through the KEYCLOAK_CREDENTIALS_SECRET, KEYCLOAK_WEBAPP_CLIENT_SECRET, and KEYCLOAK_STEPS_CLIENT_SECRET environment variables (referenced in the file as, for example, ${keycloak.webapp.client-secret}).
