Data Quality Gates
Data Quality Gates enables organizations to deploy and execute data quality rules directly within their data processing environments. It extends Ataccama ONE’s data quality capabilities into your pipelines, allowing validation of data in motion without moving it outside your existing infrastructure.
Why use Data Quality Gates
Validate data in motion
Traditional DQ rules validate data at rest, either where it is stored in databases or data lakes, or in ONE. DQ Gates validates data in motion, as it flows through your pipelines in real time. This lets you embed DQ rules directly in data pipelines and validate records in flight.
The distinction is important, as addressing data quality issues after invalid records have already entered your ecosystem is often too late. By then, invalid records might have already:
- Corrupted dashboards and reports.
- Triggered false compliance alerts.
- Informed business decisions.
- Propagated to multiple downstream systems.
DQ Gates prevents these issues by identifying invalid records during pipeline processing, before they reach production. This reduces remediation costs and improves the quality of data-driven decisions.
Centralized rule management
With DQ Gates, you maintain an enterprise-level, centralized rule repository:
- Write once, reuse everywhere: Define rules in Ataccama ONE, deploy them across your pipelines.
- Business context preserved: Business experts define rules with domain knowledge that engineers might lack.
- Consistent enforcement: The same validation logic is applied uniformly across your data ecosystem.
How your role benefits
Data Engineers
- Eliminate redundant validation code across pipelines.
- Focus on pipeline development rather than data quality firefighting.
- Reduce debugging time from inconsistent validation implementations.
Business Analysts and Data Stewards
- Define rules once in Ataccama ONE without coding.
- Modify business logic without engineering involvement.
- Maintain consistency across all data products.
Data Scientists
- Receive pre-validated datasets ready for analysis.
- Reduce time spent on data cleansing and preparation.
- Improve model accuracy with consistent data quality.
- Focus on insights rather than remediation.
DQ Gates vs. Data Observability
Data Observability (DO) and Data Quality Gates serve different purposes in the data quality landscape.
Typical DO implementations focus on pipeline health and metadata monitoring, operating reactively by detecting issues after they occur through post-transformation monitoring. These tools generally rely on basic technical checks that are maintained by engineering teams.
DQ Gates takes a different approach by performing in-flight validation of data content during processing, identifying issues before they propagate downstream. The validation is based on business-aware rules that are defined and governed by business users rather than engineers.
This means that while DO tells you when something went wrong with your pipeline, DQ Gates prevents business rule violations from reaching your data products in the first place.
The two approaches are complementary rather than mutually exclusive. DO excels at monitoring pipeline infrastructure and detecting unexpected technical issues, while DQ Gates enforces specific business logic during data processing.
How it works
DQ Gates builds on the same DQ rules and DQ firewalls that you manage in Ataccama ONE.
Component relationships
Here is how DQ rules, DQ firewalls, and DQ Gates fit together.
- DQ rules: Individual quality checks defined in Ataccama ONE (for example, "currency_code must match ISO 4217 standards").
- DQ firewalls: Collections of DQ rules bundled together that can be exposed via API. These can be accessed in two ways:
  - Via API: External systems call REST or GraphQL APIs, with execution happening on Ataccama servers. This feature is available to all users.
  - Via DQ Gates: Export the DQ firewall definitions to run directly within external systems and pipelines. See Start using DQ Gates.
- DQ Gates: The deployment mechanism that downloads DQ firewall definitions and converts them into native platform functions (such as Snowflake UDFs) that execute locally within your data pipelines.
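As a mental model, a deployed rule behaves like a small native predicate that runs wherever your data lives. The sketch below mimics the currency_code example in plain Python; the abbreviated ISO 4217 set and the function name are illustrative assumptions, not generated Ataccama code:

```python
import re

# Illustrative stand-in for a rule such as "currency_code must match
# ISO 4217 standards" after it has been converted into a native function
# (in Snowflake this would be a UDF; here it is plain Python).
ISO_4217_SAMPLE = {"USD", "EUR", "GBP", "JPY", "CHF"}  # abbreviated set for illustration

def is_valid_currency_code(value):
    """Return True if the value is a three-letter code from the sample set."""
    if value is None:
        return False
    return bool(re.fullmatch(r"[A-Z]{3}", value)) and value in ISO_4217_SAMPLE

print(is_valid_currency_code("EUR"))  # True
print(is_valid_currency_code("eu"))   # False
```

Because the predicate is self-contained, the same logic can be invoked per record inside a pipeline without calling back to Ataccama servers.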
Process flow
The DQ Gates workflow follows these steps:
- Rule definition: Business users and data stewards define DQ rules and organize them into DQ firewalls using the Ataccama ONE visual interface.
- Export and conversion: DQ Gates exports firewall definitions and converts them into platform-specific artifacts, such as Snowflake UDFs or Snowflake DMFs.
- Deployment: These artifacts are deployed to the target runtime.
- Execution: Data quality validation occurs in place during pipeline processing.
Common implementation patterns
DQ Gates supports integration patterns such as:
- Data filtering pipelines: Remove invalid records before downstream processing.
- Quality monitoring workflows: Flag problematic data while preserving complete datasets for investigation.
- Quality gates or circuit breakers: Stop processing automatically when quality thresholds are not met.
- Quality metrics and reporting: Use validation results to monitor quality trends and generate reports.
- Testing and development: Reuse production-grade DQ logic during development and pipeline testing.
These patterns apply across all Supported platforms.
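The filtering and circuit-breaker patterns above can be sketched in plain Python, assuming rule checks are ordinary predicates. The rule names, record fields, and the 95% threshold are illustrative assumptions, and the predicates are simplified stand-ins for the generated platform-native functions:

```python
# Two illustrative rules; in a real deployment these would be the
# firewall's generated native functions.
def not_null(record):
    return record.get("customer_id") is not None

def positive_amount(record):
    return isinstance(record.get("amount"), (int, float)) and record["amount"] > 0

RULES = [not_null, positive_amount]

def filter_pipeline(records):
    """Data filtering pattern: keep only records that pass every rule."""
    return [r for r in records if all(rule(r) for rule in RULES)]

def quality_gate(records, threshold=0.95):
    """Circuit-breaker pattern: stop processing if the pass rate drops below the threshold."""
    passed = filter_pipeline(records)
    pass_rate = len(passed) / len(records) if records else 1.0
    if pass_rate < threshold:
        raise RuntimeError(f"Quality gate failed: pass rate {pass_rate:.0%} < {threshold:.0%}")
    return passed

batch = [
    {"customer_id": 1, "amount": 10.0},
    {"customer_id": None, "amount": 5.0},  # fails not_null
    {"customer_id": 2, "amount": 7.5},
]
print(len(filter_pipeline(batch)))  # 2
```

With this batch, `quality_gate(batch)` would raise, because only two of three records pass and the pass rate falls below the 95% threshold.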
Limitations
The following limitations apply when using the firewall-based flow in local Python and Snowflake UDFs.
Unsupported rule types
The following rule types are not supported:
- Aggregation rules (for example, uniqueness checks or duplicate detection across datasets)
DQ firewalls containing unsupported rule types are automatically skipped and not deployed to the data platform. A log entry is created with the reason for skipping.
Unsupported functions
The following functions are not supported:
- Set functions: Some of the more advanced set operations are not implemented.

  List of unsupported set functions:

  - set.approxSymmetricDifference
  - set.difference
  - set.differenceExp
  - set.differenceResult
  - set.differenceResultExp
  - set.intersection
  - set.intersectionExp
  - set.intersectionResult
  - set.intersectionResultExp
  - set.lcsDifference
  - set.lcsDifferenceExp
  - set.lcsDifferenceResult
  - set.lcsDifferenceResultExp
  - set.lcsIntersection
  - set.lcsIntersectionExp
  - set.lcsIntersectionResult
  - set.lcsIntersectionResultExp
  - set.lcsSymmetricDifference
  - set.lcsSymmetricDifferenceExp
  - set.lcsSymmetricDifferenceResult
  - set.lcsSymmetricDifferenceResultExp
  - set.symmetricDifference
  - set.symmetricDifferenceExp
  - set.symmetricDifferenceResult
  - set.symmetricDifferenceResultExp
  - set.union
  - set.unionExp
  - set.unionResult
  - set.unionResultExp
- String functions: Several string processing functions are not available.

  List of unsupported string functions:

  - diceCoefficient
  - doubleMetaphone
  - jaccardCoefficient
  - jaroWinkler
  - metaphone
  - ngram
  - preserveCase
  - soundex
  - wordCombinations
  - trashDiacritics (deprecated in favor of removeAccents)
- Runtime functions: Some system and parameter functions are not implemented.

  List of unsupported runtime functions:

  - getParameterValue
  - getRuntimeVersion
  - setParameterValue
- Geospatial functions: Geospatial operations (geo.*) are not supported. The only supported geospatial function is geoDistance.
DQ firewalls containing unsupported functions are automatically skipped and not deployed to the data platform. A log entry is created with the reason for skipping.
Result differences
Due to differences in the underlying implementations, DQ firewalls in ONE and DQ Gates might produce different results in certain edge cases.
Discrepancies can arise in areas such as:
- Date formatting and locale-specific operations.
- Calculations with large numerical values.
- Data type handling and conversions.
- Null value processing.
The discrepancies typically involve how borderline or ambiguous values are interpreted, and they do not undermine the overall accuracy or reliability of data quality assessments.
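One concrete source of null-handling discrepancies: Python treats `None == None` as true, while SQL engines use three-valued logic, where `NULL = NULL` yields NULL (unknown). The sketch below is illustrative only, not Ataccama code, and simply contrasts the two semantics:

```python
# Python semantics: None compares equal to itself.
def naive_equals(a, b):
    return a == b                      # None == None -> True in Python

# SQL-like three-valued logic: any comparison with NULL is unknown.
def sql_like_equals(a, b):
    if a is None or b is None:
        return None                    # mimics SQL's NULL = NULL -> NULL
    return a == b

print(naive_equals(None, None))     # True
print(sql_like_equals(None, None))  # None
```

A rule evaluated locally in Python and the same rule compiled to SQL can therefore disagree on records containing nulls unless null checks are made explicit in the rule logic.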
Supported platforms
DQ Gates currently supports the following platforms:
- Python: Run firewalls locally in a Python environment. See Use Data Quality Gates Locally in Python.
- Snowflake UDFs: Deploy firewalls as Snowflake User-Defined Functions. See Install Data Quality Gates for Snowflake UDFs.
- Snowflake DMFs: Generate Data Metric Functions from DQ rules. This follows a separate rule-to-SQL flow. See Use Snowflake DMFs with Data Quality Gates.
Key takeaways
Data Quality Gates enables organizations to enforce consistent, business-defined data quality rules directly within their data pipelines. By validating data in motion rather than at rest, DQ Gates prevents data quality issues from cascading through your systems, reducing remediation costs and improving trust in data-driven decisions.
The key advantages include:
- Prevention over detection: Stop bad data before it spreads.
- Centralized management: Define rules once, deploy everywhere.
- Business ownership: Domain experts define rules without coding.
- Platform native: Execute at pipeline speed without data movement.
- Scalable: Process any volume; performance scales with warehouse size and rule complexity.
Start using DQ Gates
Contact your Customer Success Manager or Ataccama Support to discuss how DQ Gates can fit your specific needs and to confirm access for your environment.
Before you begin, review Data Quality Gates Prerequisites for the common setup requirements.
Then choose the option that matches your environment:
- Local Python: Start with Use Data Quality Gates Locally in Python. This page walks you through installing the SDK, authenticating, retrieving the firewall definition, and running validation locally or offline.
- Snowflake UDFs: Start with Install Data Quality Gates for Snowflake UDFs. Then continue with Evaluate Data Quality in Snowflake with UDFs and Synchronize Snowflake UDFs with ONE.
- Snowflake DMFs: Start with Use Snowflake DMFs with Data Quality Gates for the rule-to-DMF SQL generation flow.
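To make the local Python flow concrete, here is a purely illustrative sketch of the retrieve-and-evaluate pattern. Every name in it is a hypothetical stand-in stubbed locally so the snippet runs; it is not the actual SDK API, so follow the linked pages for the real identifiers and authentication steps:

```python
# Hypothetical stand-in for an exported firewall: a bundle of named rule
# predicates that can be evaluated record by record. The real SDK provides
# its own classes and authentication; none of these names are from it.
class FirewallStub:
    def __init__(self, rules):
        self.rules = rules

    def evaluate(self, record):
        """Return per-rule pass/fail results for one record."""
        return {name: rule(record) for name, rule in self.rules.items()}

firewall = FirewallStub({
    "currency_is_three_letters": lambda r: isinstance(r.get("currency"), str) and len(r["currency"]) == 3,
    "amount_is_positive": lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] > 0,
})

result = firewall.evaluate({"currency": "EUR", "amount": 12.5})
print(result)  # {'currency_is_three_letters': True, 'amount_is_positive': True}
```

The per-rule result dictionary mirrors the general shape of firewall output: each rule reports independently, so a pipeline can filter, flag, or aggregate the results as needed.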