Data Quality Gates

Data Quality Gates enables organizations to deploy and execute data quality rules directly within their data processing environments. It extends Ataccama ONE’s data quality capabilities into your pipelines, allowing validation of data in motion without moving it outside your existing infrastructure.

Why use Data Quality Gates?

Validate data in motion

Traditional DQ rules validate data at rest, either where it is stored in databases or data lakes, or in ONE. DQ Gates validates data in motion, as it flows through your pipelines in real time. This lets you embed DQ rules directly in data pipelines and validate records in flight.

The distinction is important, as addressing data quality issues after invalid records have already entered your ecosystem is often too late. By then, invalid records might have already:

  • Corrupted dashboards and reports.

  • Triggered false compliance alerts.

  • Informed business decisions.

  • Propagated to multiple downstream systems.

DQ Gates prevents these issues by identifying invalid records during pipeline processing, before they reach production. This reduces remediation costs and improves the quality of data-driven decisions.

Centralized rule management

With DQ Gates, you maintain an enterprise-level, centralized rule repository:

  • Write once, reuse everywhere: Define rules in Ataccama ONE, deploy across all pipelines.

  • Business context preserved: Business experts define rules with domain knowledge that engineers might lack.

  • Consistent enforcement: Same validation logic applied uniformly across your data ecosystem.

How does my role benefit?

Data Engineers

  • Eliminate redundant validation code across pipelines.

  • Focus on pipeline development rather than data quality firefighting.

  • Reduce debugging time from inconsistent validation implementations.

Business Analysts and Data Stewards

  • Define rules once in Ataccama ONE without coding.

  • Modify business logic without engineering involvement.

  • Maintain consistency across all data products.

Data Scientists

  • Receive pre-validated datasets ready for analysis.

  • Reduce time spent on data cleansing and preparation.

  • Improve model accuracy with consistent data quality.

  • Focus on insights rather than remediation.

DQ Gates vs. Data Observability

Data Observability (DO) and Data Quality Gates serve different purposes in the data quality landscape.

Typical DO implementations focus on pipeline health and metadata monitoring, operating reactively by detecting issues after they occur through post-transformation monitoring. These tools generally rely on basic technical checks that are maintained by engineering teams.

DQ Gates takes a different approach by performing in-flight validation of data content during processing, identifying issues before they propagate downstream. The validation is based on business-aware rules that are defined and governed by business users rather than engineers.

This means that while DO tells you when something went wrong with your pipeline, DQ Gates prevents business rule violations from reaching your data products in the first place.

The two approaches are complementary rather than mutually exclusive. DO excels at monitoring pipeline infrastructure and detecting unexpected technical issues, while DQ Gates enforces specific business logic during data processing.

How it works

Component relationships

Here’s how DQ rules, DQ firewalls, and DQ Gates fit together.

  1. DQ rules: Individual quality checks defined in Ataccama ONE (for example, "currency_code must match ISO 4217 standards").

  2. DQ firewalls: Collections of DQ rules bundled together which can be exposed via API. These can be accessed in two ways:

    • Via API: External systems call REST or GraphQL APIs, with execution happening on Ataccama servers. This feature is available to all users.

    • Via DQ Gates: Export the DQ firewall definitions to run directly within external systems and pipelines. See Start using DQ Gates.

  3. DQ Gates: The deployment mechanism that downloads DQ firewall definitions and converts them into native platform functions (like Snowflake UDFs) that execute locally within your data pipelines.

Process flow

The DQ Gates workflow follows these steps:

  1. Rule definition: Business users and data stewards define DQ rules and organize them into DQ firewalls using the Ataccama ONE visual interface.

  2. Export and conversion: DQ Gates exports these firewall definitions and converts them into platform-specific functions.

  3. Deployment: In Snowflake (currently supported), each firewall becomes a User-Defined Function (UDF) callable directly in SQL.

  4. Execution: Data quality validation occurs in-place during pipeline processing.
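As an illustration of the execution step, a deployed firewall can be invoked like any other Snowflake UDF. The function, table, and column names below (DQ_FIREWALL_CUSTOMER, customer_raw, and so on) are hypothetical, not names that DQ Gates actually generates; the sketch only shows the general shape of an in-place validation query:

```sql
-- Hypothetical sketch: invoking a firewall deployed as a Snowflake UDF.
-- All object names and the UDF's return shape are illustrative assumptions;
-- actual names depend on your DQ Gates deployment.
SELECT
    c.customer_id,
    c.currency_code,
    -- The UDF evaluates the firewall's rules against the supplied values
    -- and returns the validation result in-place, during query execution.
    DQ_FIREWALL_CUSTOMER(c.currency_code, c.country_code) AS dq_result
FROM customer_raw AS c;
```

Because the UDF runs natively in Snowflake, validation happens where the data already lives, with no data movement outside the platform.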

Common implementation patterns

DQ Gates enables specific integration patterns for different data quality scenarios:

  • Data filtering pipelines: Remove invalid records before downstream processing.

  • Quality monitoring workflows: Flag problematic data while preserving complete datasets for investigation.

  • Quality gates or circuit breakers: Automatically stop data processing when quality thresholds aren’t met.

  • Quality metrics and reporting: Use DQ evaluation results to monitor data quality trends and generate reports.

  • Testing and development: Integrate quality UDFs into transformation testing frameworks (e.g., dbt) to catch quality issues during development and deployment.
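Two of the patterns above, data filtering and a quality-gate circuit breaker, could look roughly as follows in Snowflake SQL. All object names (DQ_FIREWALL_ORDERS, orders_raw) and the assumption that the UDF returns a 'VALID' marker are illustrative, not actual DQ Gates output:

```sql
-- Hypothetical sketches of two integration patterns; names and the UDF's
-- return shape are assumptions for illustration only.

-- Data filtering: expose only records the firewall marks as valid.
CREATE OR REPLACE VIEW orders_validated AS
SELECT *
FROM orders_raw
WHERE DQ_FIREWALL_ORDERS(order_id, currency_code) = 'VALID';

-- Quality gate / circuit breaker: raise an error, halting the pipeline
-- step, when the share of invalid records exceeds a threshold (here 1%).
SELECT CASE
    WHEN COUNT_IF(DQ_FIREWALL_ORDERS(order_id, currency_code) <> 'VALID')
         > 0.01 * COUNT(*)
    THEN 1 / 0  -- deliberate division-by-zero to fail the query
    ELSE 0
END AS quality_gate_check
FROM orders_raw;
```

The same validated view can also feed the monitoring and reporting patterns: aggregating the UDF results over time yields the quality metrics mentioned above.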

Implementation examples

For Snowflake integration examples, see Integrate firewalls into production pipelines.

Limitations

Unsupported rule types

Some rule types are not supported by DQ Gates.

DQ firewalls containing unsupported rule types are automatically skipped and not deployed to the data platform. A log entry is created with the reason for skipping.

Unsupported functions

The following functions are not supported:

  • Set functions: Some of the more advanced set operations are not implemented.

    List of unsupported set functions
    • set.approxSymmetricDifference

    • set.difference

    • set.differenceExp

    • set.differenceResult

    • set.differenceResultExp

    • set.intersection

    • set.intersectionExp

    • set.intersectionResult

    • set.intersectionResultExp

    • set.lcsDifference

    • set.lcsDifferenceExp

    • set.lcsDifferenceResult

    • set.lcsDifferenceResultExp

    • set.lcsIntersection

    • set.lcsIntersectionExp

    • set.lcsIntersectionResult

    • set.lcsIntersectionResultExp

    • set.lcsSymmetricDifference

    • set.lcsSymmetricDifferenceExp

    • set.lcsSymmetricDifferenceResult

    • set.lcsSymmetricDifferenceResultExp

    • set.symmetricDifference

    • set.symmetricDifferenceExp

    • set.symmetricDifferenceResult

    • set.symmetricDifferenceResultExp

    • set.union

    • set.unionExp

    • set.unionResult

    • set.unionResultExp

  • String functions: Several string processing functions are not available.

    List of unsupported string functions
    • diceCoefficient

    • doubleMetaphone

    • jaccardCoefficient

    • jaroWinkler

    • metaphone

    • ngram

    • preserveCase

    • soundex

    • wordCombinations

    • trashDiacritics (deprecated in favor of removeAccents)

  • Runtime functions: Some system and parameter functions are not implemented.

    List of unsupported runtime functions
    • getParameterValue

    • getRuntimeVersion

    • setParameterValue

DQ firewalls containing unsupported functions are automatically skipped and not deployed to the data platform. A log entry is created with the reason for skipping.

Result differences

Due to differences in the underlying implementations, DQ firewalls in ONE and DQ Gates might produce different results in certain edge cases.

Discrepancies can arise in areas such as:

  • Date formatting and locale-specific operations.

  • Calculations with large numerical values.

  • Data type handling and conversions.

  • Null value processing.

The discrepancies typically involve how borderline or ambiguous values are interpreted, and they do not undermine the overall accuracy or reliability of data quality assessments.

Supported platforms

  • Currently supported: Snowflake.

  • Planned support: Databricks, Kafka, and more.

Key takeaways

Data Quality Gates enables organizations to enforce consistent, business-defined data quality rules directly within their data pipelines. By validating data in motion rather than at rest, DQ Gates prevents data quality issues from cascading through your systems, reducing remediation costs and improving trust in data-driven decisions.

The key advantages include:

  • Prevention over detection: Stop bad data before it spreads.

  • Centralized management: Define rules once, deploy everywhere.

  • Business ownership: Domain experts define rules without coding.

  • Platform native: Execute at pipeline speed without data movement.

  • Scalable: Process large data volumes with minimal performance overhead, since execution is native to the platform.

Start using DQ Gates

Contact your Customer Success Manager to discuss how DQ Gates can fit your specific needs and to request access.

Once you’ve secured entitlement to DQ Gates and you’re familiar with the basics, follow the installation instructions in Install Data Quality Gates for Snowflake and begin evaluating data quality in your pipelines.
