
MDM Web App Validations

mda-validations.xml Overview

  1. The mda-validations.xml configuration file has the following core-level structure:

    Core Structure
        <validations>
            <defaultSeverity infoWarningTreshold="1000000" warningErrorTreshold="10000000" />
            <instanceLayer>
                <entities>
                    ...
                </entities>
            </instanceLayer>
            <masterLayers>
                <masterLayer name="masters">
                    <entities>
                        ...
                    </entities>
                </masterLayer>
            </masterLayers>
                        ...
            <messagesMappings class="...">
                        ...
            </messagesMappings>
            <keysTokenizer />
        </validations>
    Default severity thresholds are used where neither an explanation code severity nor score column severity thresholds are defined.
  2. Entity configurations in mda-validations.xml have the following structure:

    Instance and Master Entities
    <entities>
        <entity name="party">
            <validationService>
                <entityDQI>
                    ...
                </entityDQI>
                <columnDefinitions>
                    ...
                </columnDefinitions>
            </validationService>
        </entity>
        <entity name="address">
            ...
        </entity>
    </entities>

    entities: The master entities that have validation configured. The instance layer is configured in the same way as the master layer. Each entity can have the following services configured (the name attribute corresponds to the entity name in the MDM Web App model):

    • identifyService: Enables the Identify service.

    • validationService: Enables the Validate service.

  3. Entity DQI and Column Definition configurations in mda-validations.xml have the following structure:

    Validation Service
    <validationService>
        <entityDQI>
            <column name="sco_full_name" type="sco" />
            <column name="exp_full_name" type="exp" />
            <column name="exp_gender" type="exp" />
        </entityDQI>
        <columnDefinitions>
            <column name="std_sin" type="std" />
            <column name="sco_sin" type="sco" />
            <column name="exp_sin" type="exp" />
            <column name="sco_full_name" type="sco" />
            <column name="exp_full_name" type="exp" />
            <column name="exp_gender" type="exp" />
            <column name="src_sin" type="src" />
            <column name="cio_sin" type="cio" />
        </columnDefinitions>
    </validationService>

    entity DQI: Specifies how the entity-related data quality indicator is computed and which messages are present.

    column definition: Each defined column has to be mapped to its source column. Allowed column types are SRC, STD, CIO, SCO, EXP, and OTHER. Columns that are not defined here get the cleanse type OTHER.

    Entity DQI and Column Definition have identical structure.

  4. Filters and Keys configuration in mda-validations.xml has the following structure:

    Filters and Keys
    <column name="exp_instance" type="exp">
        <filter>
            <keys>
                <key name="SIN_EMPTY" />
                <!-- this key lists its own columns, so it must not be self-closed -->
                <key name="SIN_ADDRESS_MISMATCH" severity="ERROR">
                    <columns>
                        <column name="src_sin" />
                        <column name="src_address" />
                    </columns>
                </key>
                <key name="SIN_INVALID_DIGIT_COUNT" severity="WARNING" />
                <key name="SIN_NOT_PARSED" severity="INFO" />
            </keys>
        </filter>
        <!-- columns below are applied as default to all key filters that have no specified columns -->
        <columns>
            <column name="src_sin" />
        </columns>
    </column>

    keys: Defines which explanation codes are displayed. Allowed severity values are INFO, WARNING, and ERROR. A severity set on an explanation key takes priority over the severity computed from the sco column using the severity thresholds.

    source columns: The default columns to which the keys apply are listed in the columns element located next to the filter element. A columns element nested inside a key element is an exception and takes priority over the default columns.

    filters: Defining no filter, or leaving keys empty, means that all explanation codes are displayed.

  5. Messages Mappings configuration in mda-validations.xml has the following structure:

    Separator and languages example
        <messagesMappings class="com.ataccama.mda.core.validations.MdaValidationsFileMessagesMappings">
            <separator>.</separator> <!-- default separator is "." when not stated -->
            <messagesFiles>
                <messageFile fileName="mda-validation_messages.en.properties" language="en" />
                <messageFile fileName="mda-validation_messages.cz.properties" language="cz" />
                <messageFile fileName="mda-validation_messages.ru.properties" language="ru" />
            </messagesFiles>
        </messagesMappings>
    When using multiple languages, create a separate mda-validation_messages properties file for each language and reference each file in mda-validations.xml, as in the example above.
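
The column precedence described under Filters and Keys (columns nested in a key override the default columns next to the filter) can be illustrated with a short Python sketch. This is only an illustration of the rule as stated in this page, not the MDM Web App's own resolution logic; the function name effective_columns is hypothetical, and the XML is a trimmed copy of the Filters and Keys example in which the SIN_ADDRESS_MISMATCH key is an open element so that it can hold its own columns.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Filters and Keys example (illustrative only).
CONFIG = """
<column name="exp_instance" type="exp">
    <filter>
        <keys>
            <key name="SIN_EMPTY" />
            <key name="SIN_ADDRESS_MISMATCH" severity="ERROR">
                <columns>
                    <column name="src_sin" />
                    <column name="src_address" />
                </columns>
            </key>
            <key name="SIN_INVALID_DIGIT_COUNT" severity="WARNING" />
        </keys>
    </filter>
    <columns>
        <column name="src_sin" />
    </columns>
</column>
"""


def effective_columns(column_xml):
    """Hypothetical helper: map each key name to the columns it applies to."""
    root = ET.fromstring(column_xml)
    # Default columns are the <columns> element that sits next to <filter>.
    defaults = [c.get("name") for c in root.findall("./columns/column")]
    result = {}
    for key in root.findall("./filter/keys/key"):
        # Columns nested inside a key take priority over the defaults.
        own = [c.get("name") for c in key.findall("./columns/column")]
        result[key.get("name")] = own or defaults
    return result


if __name__ == "__main__":
    for name, columns in effective_columns(CONFIG).items():
        print(name, columns)
```

In this sketch, SIN_ADDRESS_MISMATCH resolves to its own two columns, while the keys without nested columns fall back to the default src_sin.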

Validation Service

Each validationService uses the cleansing plan of the instance or master entity whose name is specified in the name attribute of the entity element. The validation operation incorporates cleansing and merging for both instance and master entities, based on the model.

  • messageSource: Name of the column from the cleansing plan that contains explanation codes.

  • key: One or more explanation codes. If one of the codes matches a code in the messageSource column, the message is displayed as described in the columns section below.

  • columns: This section ties a validation message to one of the following:

    • entity ⇒ no column is defined, thus the message is displayed with no relation to a column.

    • a column ⇒ specific column from the corresponding entity is defined and therefore the message is displayed only on the column.

    • more columns ⇒ specific columns from the corresponding entity are defined, thus the same message is displayed for all columns involved.

mda-validation_messages.properties

The properties file is referenced from messagesMappings and defines the validation messages displayed in the MDM Web App in place of explanation codes. Each line of the file has the following structure: <explanation_code>=<textual message>

  1. mda-validation_messages.properties has to be created manually and referenced in mda-validations.xml.

    mda-validation_messages.properties example
    # a trailing backslash continues a message on the next line
    # address label
    USA_ADDRESS=This sample doesn't provide USA address cleansing rules. \
        There is only Canadian address validation for Toronto and Leduc cities available.
    CA_OUT_OF_RANGE=Only sample Canadian address etalon is used! \
        There are only Toronto and Leduc cities available.
    CA_DISABLED=Address cleansing is emulated! \
        Only the pre-prepared addresses are cleansed/validated!
    VALID=Address is correct or differs only in standardization (state name, street direction etc.)
    CORRECTED_MINOR=Address quality is good, minor corrections performed (typos in names, small differences etc.)
    CORRECTED_MAJOR=Address quality is poor, address components corrected (different city or several minor corrections in different address elements)
    UNKNOWN=Address is invalid and was not parsed, or is ambiguous (unidentified to delivery point)
    INSUFFICIENT_INPUT=Input address elements are empty or not sufficient to be identified.
    # address_status
    V=CA POST SERP VALIDITY CODE: No or minimal correction was done to the input value. \
        Software package is able to detect all address components. \
        The result address is valid.
    N=CA POST SERP VALIDITY CODE: The result address is invalid. \
        Software package is unable to detect all address components or make valid corrections.
    C=CA POST SERP VALIDITY CODE: An invalid address is "Correctable" "C" when there are one or more components missing or inconsistent from an otherwise valid address; and only one address can be derived from the information provided.
    # exp_gender
    GENDER_MISSING_INVALID=Gender is missing or was entered in an invalid format
    # exp_sin
    SIN_EMPTY=SIN is empty
    SIN_INVALID_CHECK=SIN has invalid checksum
    SIN_INVALID_DIGIT_COUNT=SIN has invalid number of digits
    SIN_NOT_PARSED=SIN wasn't parsed successfully

    MASTER.*.party.src_sin.IS_EMPTY=Is empty - but this feature is not working yet (as of 30-11)
  2. A common message mapping is written like this: SOME_KEY=Corresponding sentence.

    There are, however, more advanced valid options:

    1. SOME_KEY=Corresponding sentence.

    2. MASTER.masters.party.src_column.COMMON_KEY3=key3.1

    3. INSTANCE.party.src_column.COMMON_KEY3=key3.2

    4. *.masters.party.src_column.UNIQUE_KEY4_1=key4.1

    5. MASTER.masters.party.*.UNIQUE_KEY4_7=key4.7

    6. *.masters.party.src_column.COMMON_KEY6=key4.1

    7. *.*.*.*.COMMON_KEY6=key4.4

    8. MASTER.masters.party.*.COMMON_KEY6=key4.7

      Several rules apply here. Let's call the left part (before =) the key and the right part (the validation sentence) the value.

      • The most concrete key is always chosen.

      • A key without a path is less concrete than any other matching key with a path.

        [*.*.*.*.COMMON_KEY6=key4.4] > [SOME_KEY=Corresponding sentence]

      • A key with fewer asterisks in its path is considered more concrete.

        [MASTER.masters.party.src_column.COMMON_KEY3=key3.1] > [*.masters.party.src_column.UNIQUE_KEY4_1=key4.1]

      • When two keys have the same number of asterisks, the key whose asterisk appears later in the path is more concrete.

        [MASTER.masters.party.*.COMMON_KEY6=key4.7] > [*.masters.party.src_column.COMMON_KEY6=key4.1]

        Asterisks do not work as wildcards.
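
The precedence rules above can be sketched in a few lines of Python. This is a reading of the rules as listed, not the MDM Web App's actual lookup code. Several things here are assumptions: the function name resolve is hypothetical, an asterisk is taken to stand for one whole path segment (not a pattern inside a segment), only four-segment paths (layer type, layer name, entity, column) are handled, and ties between keys with several asterisks are broken by comparing asterisk positions from left to right.

```python
SEPARATOR = "."  # default separator of messagesMappings

# Mappings taken from the list of options above. The shorter INSTANCE path is
# left out because this sketch only handles four-segment paths.
MESSAGES = {
    "SOME_KEY": "Corresponding sentence.",
    "MASTER.masters.party.src_column.COMMON_KEY3": "key3.1",
    "*.masters.party.src_column.UNIQUE_KEY4_1": "key4.1",
    "MASTER.masters.party.*.UNIQUE_KEY4_7": "key4.7",
    "*.masters.party.src_column.COMMON_KEY6": "key4.1",
    "*.*.*.*.COMMON_KEY6": "key4.4",
    "MASTER.masters.party.*.COMMON_KEY6": "key4.7",
}


def resolve(messages, context, code):
    """Hypothetical helper: return the value of the most concrete matching key.

    context is a tuple (layer type, layer name, entity, column);
    code is the explanation code to look up.
    """
    best_rank, best_value = None, None
    for key, value in messages.items():
        parts = key.split(SEPARATOR)
        if parts[-1] != code:
            continue
        path = parts[:-1]
        if not path:
            # A key without a path is less concrete than any key with a path.
            rank = (0, 0, ())
        elif len(path) == len(context) and all(
            p in ("*", c) for p, c in zip(path, context)
        ):
            stars = tuple(i for i, p in enumerate(path) if p == "*")
            # Fewer asterisks rank higher; on a tie, later asterisks rank higher.
            rank = (1, -len(stars), stars)
        else:
            continue
        if best_rank is None or rank > best_rank:
            best_rank, best_value = rank, value
    return best_value


if __name__ == "__main__":
    ctx = ("MASTER", "masters", "party", "src_column")
    print(resolve(MESSAGES, ctx, "COMMON_KEY6"))
```

For the context MASTER, masters, party, src_column, three COMMON_KEY6 entries match; the sketch picks MASTER.masters.party.*.COMMON_KEY6 because it has a single asterisk and that asterisk is in the last position.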
