Frequently Used Steps
Ataccama products come with various steps and functions for constructing plan files. The algorithms and logic used for creating a plan file vary from project to project; an introduction to steps is provided in the following sections.
To learn more about functions, see Commonly Used Functions.
Steps: an overview
Steps perform many kinds of operations, such as reading, transforming, filtering, and categorizing data. The following is an overview of some of the most frequently used steps and what they do.
A complete description of steps and their usage can be found in Product Help (Help > Help Contents in the main menu) under Steps.
Flow control steps
Step name | Step description |
---|---|
Condition | Directs data flow. True: right, false: left. |
Filter | Directs data flow. True: out. |
Extract Filter | Directs data flow. True: right, all: left. |
Multiplicator | Multiplies data flow without modification. |
Trash | Discards data flow. |
Join | Works like SQL table join. |
Union | Works like SQL table union. |
Union Same | Like Union but applies only if the flows are exactly the same. |
Alter format | Adds or removes columns. |
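
As a rough analogy only, the routing behavior of Condition, Filter, and Union can be sketched in plain Python over lists of dictionaries. This is not Ataccama plan syntax or an Ataccama API: the function names `condition`, `keep_if`, and `union`, and the sample records, are hypothetical. Union is sketched here as simple concatenation without deduplication, which is an assumption.

```python
from typing import Callable, Iterable

Record = dict

def condition(records: Iterable[Record], predicate: Callable[[Record], bool]):
    """Split one flow into (true_flow, false_flow), in the spirit of Condition."""
    true_flow, false_flow = [], []
    for rec in records:
        (true_flow if predicate(rec) else false_flow).append(rec)
    return true_flow, false_flow

def keep_if(records: Iterable[Record], predicate: Callable[[Record], bool]):
    """Pass on only records that satisfy the predicate, in the spirit of Filter."""
    return [rec for rec in records if predicate(rec)]

def union(*flows):
    """Append flows one after another (assumption: no deduplication)."""
    merged = []
    for flow in flows:
        merged.extend(flow)
    return merged

# Hypothetical sample data.
customers = [
    {"id": 1, "country": "US"},
    {"id": 2, "country": "CZ"},
    {"id": 3, "country": "US"},
]

def is_us(rec):
    return rec["country"] == "US"

us, non_us = condition(customers, is_us)   # two output flows
only_us = keep_if(customers, is_us)        # one output flow
recombined = union(us, non_us)             # both flows merged again
```

In this sketch, `condition` keeps every record and only decides which output it goes to, while `keep_if` drops the records that fail the test.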
Data parsing steps
Step name | Step description |
---|---|
Regex Matching | Parses the input string based on a regular expression. See also Regular Expressions. |
Pattern Parser | Parses the input text based on the patterns provided. You have to define all components and optional validations against dictionaries. |
Guess Name Surname | A "predefined" version of Generic Parser used for parsing names. |
Strip Titles | Extracts strings found in the dictionary from the input. For example, it turns "James White PhD" into "James White", "PhD". |
Apply Replacements | Replaces values found in the input with their standardized value. Replaces even substrings, for example, "5th Ave" is transformed to "5th Avenue". |
Lookup | Lookup and validation against a dictionary. |
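
To make the examples in the table concrete, the following plain Python sketch imitates what Strip Titles, Apply Replacements, and Regex Matching do to a string. It is not how the steps are implemented or configured: the dictionaries `TITLES` and `REPLACEMENTS`, the function names, and the phone number pattern are all hypothetical, and the replacement logic matches whole words only, which is a simplifying assumption.

```python
import re

# Hypothetical dictionaries; real plans would use their own lookup files.
TITLES = {"PhD", "MD", "Dr"}
REPLACEMENTS = {"Ave": "Avenue", "St": "Street"}

def strip_titles(text):
    """Return (text without titles, list of titles found in the dictionary)."""
    kept, found = [], []
    for token in text.split():
        (found if token.strip(".,") in TITLES else kept).append(token)
    return " ".join(kept), found

def apply_replacements(text):
    """Replace whole words found in REPLACEMENTS by their standardized value."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, REPLACEMENTS)) + r")\b")
    return pattern.sub(lambda m: REPLACEMENTS[m.group(1)], text)

def parse_phone(text):
    """Regex-based parsing into named components; None if the input does not match."""
    m = re.fullmatch(r"\((?P<area>\d{3})\)\s*(?P<number>\d{3}-\d{4})", text)
    return m.groupdict() if m else None

name, titles = strip_titles("James White PhD")
street = apply_replacements("5th Ave")
phone = parse_phone("(555) 123-4567")
```

Here `name` and `titles` correspond to the "James White", "PhD" split from the table, and `street` to the "5th Avenue" result.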
Analysis steps
Step name | Step description |
---|---|
Profiling | Comprehensive analysis written to a file ( |
Character Group Analyzer | Calculates masks (example: digit to #, letter to A). |
Word Analyzer | Substitutes words found in reference dictionaries by symbols. |
Relation Analysis | Calculates the number of missing foreign keys for two source flows. |
Data Quality Indicator | Calculates statistics for a given set of business rules. Adds a set of Boolean flags to each record. |
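
The mask and foreign key calculations described in the table can be illustrated with a short Python sketch. It only mirrors the idea behind Character Group Analyzer and Relation Analysis and is not their actual implementation: the function names and sample values are hypothetical, and counting every child row with an unmatched key (rather than distinct keys) is an assumption.

```python
def mask(value):
    """Map each digit to '#' and each letter to 'A'; keep other characters."""
    out = []
    for ch in value:
        if ch.isdigit():
            out.append("#")
        elif ch.isalpha():
            out.append("A")
        else:
            out.append(ch)
    return "".join(out)

def missing_foreign_keys(child_keys, parent_keys):
    """Count child rows whose key has no match in the parent flow."""
    parents = set(parent_keys)
    return sum(1 for key in child_keys if key not in parents)

plate_mask = mask("AB-1234")
orphans = missing_foreign_keys(child_keys=[1, 2, 5, 5], parent_keys=[1, 2, 3])
```

Grouping values by their mask is a quick way to see which formats occur in a column, for example how many values look like `AA-####`.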
Match and merge steps
Step name | Step description |
---|---|
Unification | Assigns group IDs (client, candidate, unification roles). Can do the incremental process using the repository. |
Representative Creator | Creates a new record from the defined group (records already have group IDs). Can add calculated values into the original data flow. |
Simple Group Classifier | Calculates the quality of groups (A - for automatic processing, U - unique, M - for manual processing, C - for additional data cleansing). |
Unification Extended | Can run the match process in the mixed mode - online and batch in parallel. |
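
The general idea of assigning group IDs and then building one record per group can be sketched in Python as follows. This is a deliberately simplified illustration, not the matching logic of Unification or Representative Creator: it groups on an exact match of a normalized key and picks the first non-empty value per field, both of which are assumptions, and all names and sample data are hypothetical.

```python
from collections import defaultdict

def assign_group_ids(records, key_fn):
    """Give records with the same matching key the same group_id (edits records in place)."""
    group_ids = {}
    for rec in records:
        key = key_fn(rec)
        rec["group_id"] = group_ids.setdefault(key, len(group_ids) + 1)
    return records

def create_representatives(records):
    """Build one record per group, taking the first non-empty value for each field."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["group_id"]].append(rec)
    reps = []
    for gid, members in groups.items():
        rep = {"group_id": gid}
        for field in members[0]:
            if field != "group_id":
                rep[field] = next((m.get(field) for m in members if m.get(field)), "")
        reps.append(rep)
    return reps

# Hypothetical sample data: the first two rows describe the same person.
people = [
    {"name": "James White", "email": ""},
    {"name": "james  white", "email": "jw@example.com"},
    {"name": "Ann Lee", "email": "ann@example.com"},
]

def normalized_name(rec):
    # Lowercase and collapse whitespace so trivially different spellings match.
    return " ".join(rec["name"].lower().split())

grouped = assign_group_ids(people, normalized_name)
reps = create_representatives(grouped)
```

In this sketch the representative for the first group combines the name from one record with the email from another.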