User Community Service Desk Downloads
If you can't find the product or version you're looking for, visit support.ataccama.com/downloads

Aggregation Rules

Aggregation rules in ONE web application can be utilized in all DQ Rule types (Validity, Uniqueness, Completeness and Accuracy).

Aggregation rule

There are three ways to create aggregation rules:

  1. Use aggregate functions in ONE expressions. See Create rule using aggregate functions.

  2. Use Group By and ONE expressions. See Create rule using Group By and aggregate functions.

  3. Use Group By the condition builder. See No code aggregation rules.

    If you are using the Condition Builder, options for aggregated groups are limited to is unique and is not unique.
Group by options

Float attributes are not suitable for grouping since the DQ evaluation operation removes trailing zeroes, which might lead to imprecise grouping and affect the evaluation results.

Before you start

Make sure you are familiar with the basics of creating a rule in ONE.

We recommend checking the following topics:

Create rule using aggregate functions

  1. In Data Quality > Rules, select Create.

  2. Provide the Name of your rule.

    Optionally, add a Description, Rule definition source, Owner and Steward (recommended).

  3. On the Implementation tab, define the Input Attributes and their data types.

    Input attributes must be of a data type compatible with the aggregate function you wish to use.
  4. In Rule Logic, select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.

  5. After selecting rule type, the following dropdown menu appears. Select Aggregation rule.

    600
  6. Select Change Type when prompted.

    500
    If further rule logic has already been defined, be aware that this is not saved when you change rule type.
  7. A new section named Group by appears. To use aggregate functions it is not mandatory to first use Group by: don’t select Add expression and instead proceed to the first rule condition.

    The Group by section allows you to split data into multiple groups of rows based on the expressions provided. If no expressions are defined, the whole table is treated as a single group for the purpose of aggregation.

  8. Using the dropdown, change from Condition Builder to Advanced Expression.

  9. You can now use aggregate functions in the expression field.

Create rule using Group By and aggregate functions

  1. In Data Quality > Rules, select Create.

  2. Provide the Name of your rule.

    Optionally, add a Description, Rule definition source, Owner and Steward (recommended).

  3. On the Implementation tab, define the Input Attributes and their data types.

    Input attributes must be of a data type compatible with the aggregate function you wish to use.
  4. In Rule Logic, select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.

  5. After selecting rule type, the following dropdown menu appears. Select Aggregation rule.

    600
  6. Select Change Type when prompted.

    500
    If further rule logic has already been defined, be aware that this is not saved when you change rule type.
  7. A new section named Group By appears. Group by a single selected attribute or a number of them, using Add input (and then use the plus sign (+) for each additional attribute).

    Attributes are available under the names provided for them in the Attributes section.

    500
    This works the same way as GROUP BY clause in SQL: the GROUP BY statement groups rows that have the same values into summary rows.

    You can create a group consisting only of a single input attribute without any transformation, in order to make aggregate functions available on the attribute.

  8. Using the dropdown, change from Condition Builder to Advanced Expression.

  9. You can now use aggregate functions in the expression field: attributes and the created group are all available to use in the expression.

No code aggregation rules

  1. In Data Quality > Rules, select Create.

  2. Provide the Name of your rule.

    Optionally, add a Description, Rule definition source, Owner and Steward (recommended).

  3. In the Implementation tab, define the Input Attributes and their data types.

  4. In Rule Logic select the rule dimension. Available options are Validity, Completeness, Uniqueness, and Accuracy.

  5. After selecting rule type, the following dropdown menu appears. Select Aggregation rule.

    Rule logic
  6. Select Change Type when prompted.

    Change type
    If further rule logic has already been defined, be aware that this is not saved when you change rule type.
  7. A new section named Group By appears. Group by a single selected attribute or a number of them, using Add input (and then use the plus sign (+) for each additional attribute).

    Attributes are available under the names provided for them in the Attributes section.

  8. Using the dropdown, select Condition Builder if it is not already selected.

  9. Select the newly-created group from the list of available inputs. Groups are listed under the heading Group By results.

    600
  10. You can now define the outcome of the DQ rule based on the uniqueness of the group.

    600

Advanced expression aggregation rules

Use cases

In this section you can find step-by-step guides on how to configure aggregation rules for the following three use cases:

  • Use case 1: Does a certain combination of values in the table appear more than once (that is, is a certain combination unique or not)? This uses the Condition Builder.

  • Use case 2: Is the value of this transaction within 20% of the average transaction value (+/-10% from AVG) for this customer? This uses the avg aggregate function.

  • Use case 3: Check whether there is the correct number of transactional records per day (for example, 24 per day) by aggregating records by date and then checking the size of each group. This uses the count aggregate function.

Use case 1: Unique group

Does a certain combination of values in the table appear more than once?

  1. In Data Quality > Rules, select Create.

  2. Name your rule and add other information as required. For the purpose of this example, we use 'Combination rule'.

  3. On the Implementation tab, add the input attributes.

    In this example, we want to find if a combination of three attributes (ATTRIBUTE 1, ATTRIBUTE 2, and ATTRIBUTE 3) appears more than once.

    Inputs
  4. In Rule Logic, select Validity rule.

  5. Using the dropdown select Aggregation rule.

  6. Select Change type when prompted.

  7. In Group By, group the data by the three input attributes.

    Group by inputs
  8. Provide the condition. For this example, we need to define that when the group is not unique, then the result is invalid.

    Rule unique condition
  9. Optionally, test the rule by selecting Test Rule.

Use case 2: avg aggregate function

Is the value of this transaction within 20% of the average transaction value (+/-10% from AVG) for this customer?

  1. In Data Quality > Rules, select Create.

  2. Name your rule and add other information as required. For the purpose of this example, we use 'Average rule'.

  3. On the Implementation tab, add the input attributes.

    In this example, we want to find if the value of a transaction is within 20% of the average transaction value for that customer so our inputs are CUSTOMER and TRANSACTIONS.

  4. In Rule Logic, select Validity rule.

  5. Using the dropdown select Aggregation rule.

  6. Select Change type when prompted.

  7. In Group By, group the data by the CUSTOMER attribute.

    Group by customer
  8. Switch from Condition Builder to Advanced Expression.

  9. Set the rule logic so that anything outside the 20% range of the average is considered INVALID. This has two elements:

    • If the value is bigger than 10% of the average:

      TRANSACTIONS > (avg(TRANSACTIONS) * 1.1)
      First condition
    • If the value is lower than 10% of the average:

      TRANSACTIONS < (avg(TRANSACTIONS) * 0.9)
      Second condition
  10. Optionally, test the rule by selecting Test Rule.

Use case 3: count aggregate function

  1. In Data Quality > Rules, select Create.

  2. Name your rule and add other information as required. For the purpose of this example, we use 'Day rule'.

  3. On the Implementation tab, add the input attributes.

    In this example, we want to find if there is the correct number of records in a day so our input attribute is DATE data type.

  4. In Rule Logic, select Validity rule.

  5. Using the dropdown select Aggregation rule.

  6. Select Change type when prompted.

  7. In Group By, group the data by the DATE attribute.

  8. Using the dropdown, switch from Condition Builder to Advanced Expression.

  9. Select Change type when prompted.

  10. Provide the expression. For this example, we need:

    WHEN
    (count() != 24)
    THEN
    Result is INVALID. Reason: Incorrect # of RECORDS
  11. Optionally, test the rule by selecting Test Rule.

Next steps

Apply your newly-created DQ rule to a term. See Add Rules to Terms.

Was this page useful?