
Frequently Used Steps

Ataccama products come with various steps and functions for constructing plan files. The algorithms and logic used for creating a plan file vary from project to project; an introduction to steps is provided in the following sections.

To learn more about functions, see Commonly Used Functions.

Steps: an overview

Steps perform many kinds of operations, such as reading, transforming, filtering, and categorizing data. The following is an overview of some of the most frequently used steps and what they do.

A complete description of steps and their usage can be found in Product Help (Help > Help Contents in the main menu) under Steps.

Flow control steps

Step name | Step description
Condition | Directs the data flow based on a condition. Records for which the condition is true go to the right output; records for which it is false go to the left output.
Filter | Directs the data flow based on a condition. Only records for which the condition is true are sent to the output.
Extract Filter | Directs the data flow based on a condition. Records for which the condition is true go to the right output; all records go to the left output.
Multiplicator | Duplicates the data flow without modifying it.
Trash | Discards the data flow.
Join | Works like an SQL table join.
Union | Works like an SQL table union.
Union Same | Works like Union, but applies only if the flows are exactly the same.
Alter Format | Adds or removes columns.
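
The routing behavior of Condition, Filter, and Extract Filter can be pictured with a small Python analogy. This is only a sketch of the semantics described above, not Ataccama code: the function names, the record layout, and the `is_adult` rule are hypothetical, and real steps are configured in the plan editor rather than written as functions.

```python
# Plain-Python analogy of three flow control steps; not an Ataccama API.
from typing import Callable, Iterable

Record = dict
Predicate = Callable[[Record], bool]


def condition(records: Iterable[Record], pred: Predicate):
    """Condition: split the flow into (true_output, false_output)."""
    true_out, false_out = [], []
    for record in records:
        (true_out if pred(record) else false_out).append(record)
    return true_out, false_out


def filter_step(records: Iterable[Record], pred: Predicate):
    """Filter: pass on only the records for which the condition is true."""
    return [record for record in records if pred(record)]


def extract_filter(records: Iterable[Record], pred: Predicate):
    """Extract Filter: return (matching_records, all_records)."""
    all_records = list(records)
    matching = [record for record in all_records if pred(record)]
    return matching, all_records


def is_adult(record: Record) -> bool:
    # Hypothetical business rule used only for this illustration.
    return record["age"] >= 18


rows = [{"id": 1, "age": 17}, {"id": 2, "age": 42}, {"id": 3, "age": 65}]
adults, minors = condition(rows, is_adult)
print([r["id"] for r in adults])  # [2, 3]
print([r["id"] for r in minors])  # [1]
```

The difference between the three is only in what happens to non-matching records: Condition keeps them on a second output, Filter drops them, and Extract Filter keeps the complete flow alongside the matching subset.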

Data parsing steps

Step name | Step description
Regex Matching | Parses the input string based on a regular expression. See also Regular Expressions.
Pattern Parser | Parses the input text based on the provided patterns. You have to define all components and, optionally, validations against dictionaries.
Guess Name Surname | A predefined version of Generic Parser used for parsing names.
Strip Titles | Extracts strings found in the dictionary from the input. For example, it splits "James White PhD" into "James White" and "PhD".
Apply Replacements | Replaces values found in the input with their standardized values. Substrings are replaced as well; for example, "5th Ave" becomes "5th Avenue".
Lookup | Looks up and validates values against a dictionary.
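
The two examples given for Strip Titles and Apply Replacements can be reproduced with a short Python sketch. It is an illustration of the idea only: the `TITLES` and `REPLACEMENTS` dictionaries are hypothetical stand-ins for lookup dictionaries, and matching on whole whitespace-separated words is an assumption, since the actual steps have their own tokenization and matching options.

```python
# Plain-Python analogy of Strip Titles and Apply Replacements; not an Ataccama API.
import re

TITLES = {"PhD", "MD", "Dr"}                      # hypothetical title dictionary
REPLACEMENTS = {"Ave": "Avenue", "St": "Street"}  # hypothetical replacement dictionary


def strip_titles(text: str, titles=TITLES):
    """Return (text_without_titles, extracted_titles)."""
    kept, found = [], []
    for token in text.split():
        (found if token in titles else kept).append(token)
    return " ".join(kept), found


def apply_replacements(text: str, replacements=REPLACEMENTS) -> str:
    """Replace each dictionary key that appears as a whole word in the text."""
    alternatives = "|".join(re.escape(key) for key in replacements)
    pattern = re.compile(r"\b(" + alternatives + r")\b")
    return pattern.sub(lambda match: replacements[match.group(1)], text)


print(strip_titles("James White PhD"))  # ('James White', ['PhD'])
print(apply_replacements("5th Ave"))    # 5th Avenue
```

The word boundaries in the pattern keep an already standardized value such as "5th Avenue" from being rewritten a second time.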

Analysis steps

Step name | Step description
Profiling | Performs a comprehensive analysis and writes the results to a .profile file.
Character Group Analyzer | Calculates masks, for example by mapping each digit to # and each letter to A.
Word Analyzer | Replaces words found in reference dictionaries with symbols.
Relation Analysis | Calculates the number of missing foreign keys between two source flows.
Data Quality Indicator | Calculates statistics for a given set of business rules and adds a set of Boolean flags to each record.
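
To make the analysis steps more concrete, the following Python sketch mimics what Character Group Analyzer, Relation Analysis, and Data Quality Indicator compute. It rests on several assumptions: characters other than digits and letters are left unchanged in the mask, a "missing foreign key" is counted once per child record without a matching parent, and the sample records and the `has_email` rule are invented for illustration. None of this is Ataccama code.

```python
# Plain-Python analogy of three analysis steps; not an Ataccama API.

def mask(value: str) -> str:
    """Map each digit to '#', each letter to 'A'; keep other characters (assumption)."""
    return "".join(
        "#" if ch.isdigit() else "A" if ch.isalpha() else ch for ch in value
    )


def missing_foreign_keys(children, parents, fk: str, pk: str) -> int:
    """Count child records whose foreign key has no matching parent key."""
    parent_keys = {parent[pk] for parent in parents}
    return sum(1 for child in children if child[fk] not in parent_keys)


def data_quality(records, rules):
    """Add one Boolean flag per rule to each record and count passing records."""
    flagged = [
        {**record, **{name: bool(rule(record)) for name, rule in rules.items()}}
        for record in records
    ]
    stats = {name: sum(record[name] for record in flagged) for name in rules}
    return flagged, stats


customers = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": ""}]
orders = [{"order": 10, "customer_id": 1}, {"order": 11, "customer_id": 3}]
rules = {"has_email": lambda record: record["email"] != ""}  # hypothetical rule

print(mask("AB-1234"))                                            # AA-####
print(missing_foreign_keys(orders, customers, "customer_id", "id"))  # 1
flagged, stats = data_quality(customers, rules)
print(stats)                                                      # {'has_email': 1}
```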

Match and merge steps

Step name | Step description
Unification | Assigns group IDs (client, candidate, unification roles). Can process data incrementally using the repository.
Representative Creator | Creates a new record from a defined group (the records must already have group IDs). Can add calculated values to the original data flow.
Simple Group Classifier | Rates the quality of groups: A (automatic processing), U (unique), M (manual processing), C (additional data cleansing).
Unification Extended | Can run the match process in mixed mode, with online and batch processing in parallel.
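
The relationship between Unification and Representative Creator, where the first assigns group IDs and the second builds one record per group, can be sketched in Python as follows. This is a deliberately simplified analogy, not the product's matching logic: grouping by an exact normalized name and choosing the first non-empty value per field are assumptions made for the example, and the function names and sample data are hypothetical.

```python
# Plain-Python analogy of assigning group IDs and creating representatives;
# not an Ataccama API and not the product's matching algorithm.
from collections import defaultdict


def assign_group_ids(records, key):
    """Give records that share the same matching key the same group_id."""
    ids = {}
    grouped = []
    for record in records:
        group_id = ids.setdefault(key(record), len(ids) + 1)
        grouped.append({**record, "group_id": group_id})
    return grouped


def create_representatives(grouped):
    """Build one record per group from the first non-empty value of each field."""
    groups = defaultdict(list)
    for record in grouped:
        groups[record["group_id"]].append(record)
    representatives = []
    for group_id, members in groups.items():
        representative = {"group_id": group_id}
        for field in members[0]:
            if field != "group_id":
                representative[field] = next(
                    (m[field] for m in members if m[field]), None
                )
        representatives.append(representative)
    return representatives


people = [
    {"name": "James White", "email": ""},
    {"name": "james white", "email": "jw@example.com"},
    {"name": "Ann Lee", "email": "ann@example.com"},
]
grouped = assign_group_ids(people, key=lambda r: r["name"].strip().lower())
print([r["group_id"] for r in grouped])  # [1, 1, 2]
print(create_representatives(grouped)[0])
# {'group_id': 1, 'name': 'James White', 'email': 'jw@example.com'}
```

Keeping the two stages separate mirrors the table above: the representative is built only from records that already carry a group ID.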
