Lead your team forward
OCT 24 / 9AM ET Register nowAggregation Rules
Aggregation rules in ONE web application can be utilized in all DQ Rule types (Validity, Uniqueness, Completeness and Accuracy).
There are three ways to create aggregation rules:
-
Use aggregate functions in ONE expressions. See Create rule using aggregate functions.
-
Use Group By and ONE expressions. See Create rule using Group By and aggregate functions.
-
Use Group By the condition builder. See No code aggregation rules.
If you are using the Condition Builder, options for aggregated groups are limited to is unique
andis not unique
.
Float attributes are not suitable for grouping since the DQ evaluation operation removes trailing zeroes, which might lead to imprecise grouping and affect the evaluation results. |
Before you start
Make sure you are familiar with the basics of creating a rule in ONE.
We recommend checking the following topics:
-
Data Quality - Learn how data quality works in ONE.
-
Detection and DQ Evaluation Rules - Read more about the types of rules and their role.
-
Create DQ Evaluation Rule - A tutorial for creating a DQ rule.
-
Create Detection Rule - A tutorial for creating a detection rule.
Create rule using aggregate functions
-
In Data Quality > Rules, select Create.
-
Provide the Name of your rule.
Optionally, add a Description, Rule definition source, Owner and Steward (recommended).
-
On the Implementation tab, define the Input Attributes and their data types.
Input attributes must be of a data type compatible with the aggregate function you wish to use. -
In Rule Logic, select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.
-
After selecting rule type, the following dropdown menu appears. Select Aggregation rule.
-
Select Change Type when prompted.
If further rule logic has already been defined, be aware that this is not saved when you change rule type. -
A new section named Group by appears. To use aggregate functions it is not mandatory to first use Group by: don’t select Add expression and instead proceed to the first rule condition.
The Group by section allows you to split data into multiple groups of rows based on the expressions provided. If no expressions are defined, the whole table is treated as a single group for the purpose of aggregation.
-
Using the dropdown, change from Condition Builder to Advanced Expression.
-
You can now use aggregate functions in the expression field.
Create rule using Group By and aggregate functions
-
In Data Quality > Rules, select Create.
-
Provide the Name of your rule.
Optionally, add a Description, Rule definition source, Owner and Steward (recommended).
-
On the Implementation tab, define the Input Attributes and their data types.
Input attributes must be of a data type compatible with the aggregate function you wish to use. -
In Rule Logic, select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.
-
After selecting rule type, the following dropdown menu appears. Select Aggregation rule.
-
Select Change Type when prompted.
If further rule logic has already been defined, be aware that this is not saved when you change rule type. -
A new section named Group By appears. Group by a single selected attribute or a number of them, using Add input (and then use the plus sign (
+
) for each additional attribute).Attributes are available under the names provided for them in the Attributes section.
This works the same way as GROUP BY
clause in SQL: theGROUP BY
statement groups rows that have the same values into summary rows.You can create a group consisting only of a single input attribute without any transformation, in order to make aggregate functions available on the attribute.
-
Using the dropdown, change from Condition Builder to Advanced Expression.
-
You can now use aggregate functions in the expression field: attributes and the created group are all available to use in the expression.
No code aggregation rules
-
In Data Quality > Rules, select Create.
-
Provide the Name of your rule.
Optionally, add a Description, Rule definition source, Owner and Steward (recommended).
-
In the Implementation tab, define the Input Attributes and their data types.
-
In Rule Logic select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.
-
After selecting rule type, the following dropdown menu appears. Select Aggregation rule.
-
Select Change Type when prompted.
If further rule logic has already been defined, be aware that this is not saved when you change rule type. -
A new section named Group By appears. Group by a single selected attribute or a number of them, using Add input (and then use the plus sign (
+
) for each additional attribute).Attributes are available under the names provided for them in the Attributes section.
-
Using the dropdown, select Condition Builder if it is not already selected.
-
Select the newly-created group from the list of available inputs. Groups are listed under the heading Group By results.
-
You can now define the outcome of the DQ rule based on the uniqueness of the group.
Use cases
In this section you can find step-by-step guides on how to configure aggregation rules for the following three use cases:
-
Use case 1: Does a certain combination of values in the table appear more than once (that is, is a certain combination unique or not)? This uses the Condition Builder.
-
Use case 2: Is the value of this transaction within 20% of the average transaction value (+/-10% from AVG) for this customer? This uses the
avg
aggregate function. -
Use case 3: Check whether there is the correct number of transactional records per day (for example, 24 per day) by aggregating records by date and then checking the size of each group. This uses the
count
aggregate function.
Use case 1: Unique group
Does a certain combination of values in the table appear more than once?
-
In Data Quality > Rules, select Create.
-
Name your rule and add other information as required. For the purpose of this example, we use 'Combination rule'.
-
On the Implementation tab, add the input attributes.
In this example, we want to find if a combination of three attributes (
ATTRIBUTE 1
,ATTRIBUTE 2
, andATTRIBUTE 3
) appears more than once. -
In Rule Logic, select Validity rule.
-
Using the dropdown select Aggregation rule.
-
Select Change type when prompted.
-
In Group By, group the data by the three input attributes.
-
Provide the condition. For this example, we need to define that when the group is not unique, then the result is invalid.
-
Optionally, test the rule by selecting Test Rule.
Use case 2: avg
aggregate function
Is the value of this transaction within 20% of the average transaction value (+/-10% from AVG) for this customer?
-
In Data Quality > Rules, select Create.
-
Name your rule and add other information as required. For the purpose of this example, we use 'Average rule'.
-
On the Implementation tab, add the input attributes.
In this example, we want to find if the value of a transaction is within 20% of the average transaction value for that customer so our inputs are
CUSTOMER
andTRANSACTIONS
. -
In Rule Logic, select Validity rule.
-
Using the dropdown select Aggregation rule.
-
Select Change type when prompted.
-
In Group By, group the data by the
CUSTOMER
attribute. -
Switch from Condition Builder to Advanced Expression.
-
Set the rule logic so that anything outside the 20% range of the average is considered
INVALID
. This has two elements:-
If the value is bigger than 10% of the average:
TRANSACTIONS > (avg(TRANSACTIONS) * 1.1)
-
If the value is lower than 10% of the average:
TRANSACTIONS < (avg(TRANSACTIONS) * 0.9)
-
-
Optionally, test the rule by selecting Test Rule.
Use case 3: count
aggregate function
-
In Data Quality > Rules, select Create.
-
Name your rule and add other information as required. For the purpose of this example, we use 'Day rule'.
-
On the Implementation tab, add the input attributes.
In this example, we want to find if there is the correct number of records in a day so our input attribute is
DATE
data type. -
In Rule Logic, select Validity rule.
-
Using the dropdown select Aggregation rule.
-
Select Change type when prompted.
-
In Group By, group the data by the
DATE
attribute. -
Using the dropdown, switch from Condition Builder to Advanced Expression.
-
Select Change type when prompted.
-
Provide the expression. For this example, we need:
WHEN (count() != 24) THEN Result is INVALID. Reason: Incorrect # of RECORDS
-
Optionally, test the rule by selecting Test Rule.
Next steps
Apply your newly-created DQ rule to a term. See Add Rules to Terms.
Was this page useful?