
Ataccama Lineage Format

This page describes the structure of the Ataccama lineage file produced by the lineage scanner. Because the structure was designed to be human-friendly, it can also be used to supply custom lineage metadata to Ataccama ONE.

Ataccama lineage file structure

The lineage file is a ZIP archive containing several plain-text data files that hold the lineage metadata. The following UTF-8 encoded files are located directly in the root of the extraction file:

  • assets.csv

  • flows.csv

  • transformations.csv

  • connections.csv

  • export.json

A detailed description of each file structure follows.
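As a quick sanity check before handing a custom archive to a consumer, the file list above can be verified with a few lines of Python. This is a minimal sketch, not part of any Ataccama tooling: the function name `missing_lineage_files` is hypothetical, and the only rule it encodes is the one stated above, that the five files sit directly in the archive root.

```python
import io
import zipfile

# File names listed on this page; all are expected directly in the archive root.
EXPECTED_FILES = (
    "assets.csv",
    "flows.csv",
    "transformations.csv",
    "connections.csv",
    "export.json",
)


def missing_lineage_files(archive) -> list[str]:
    """Return the expected file names that are absent from the archive root.

    `archive` may be a path or a binary file-like object (hypothetical helper).
    """
    with zipfile.ZipFile(archive) as zf:
        present = set(zf.namelist())
    return [name for name in EXPECTED_FILES if name not in present]


# Usage: build a small in-memory archive that lacks export.json.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    for name in EXPECTED_FILES[:-1]:
        zf.writestr(name, "")
buffer.seek(0)
print(missing_lineage_files(buffer))
```

A file stored under a subfolder (for example `data/assets.csv`) is reported as missing, because only root-level entries are matched.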

Lineage data files description

The following diagram shows the entity relationship model of the data files:

[Diagram: extraction file structure]

assets.csv

The list of all assets identified in all scanned source systems.

Name Type Description

uid

string not null unique

Artificial or natural unique identifier of the asset in the source system. Does not necessarily contain any information describing the asset.

type

enum not null

Type of the asset from the data lineage perspective. Possible values: PATH, ITEM, ATTRIBUTE. The original type of the object in the source system can be stored in the attributes.

name

string not null

Name of the asset.

parent_uid

string null

Unique identifier of a parent asset. Null for the highest node in the hierarchy.

connection_uid

string not null

Unique identifier of the connection from which the asset was scanned, or of an artificial connection created by deduction when the source connection was not recognized. When the scanned source system can connect to other systems, the connections to those systems are extracted and used as well.

attributes

json null

Additional metadata related to the asset in JSON format following the structure:

[
  {
    "name": "....",
    "value": [object]
  }
]

  • name - name of the attribute

  • value - value of the attribute (can itself be a JSON value)

Supported attributes:

  • EXTERNAL_OBJECT - describes an object in a source system. Its value is a JSON object with attributes such as:

      • TYPE

      • DATA_TYPE

action

enum not null

Defines the action which should be applied to the asset when the data are processed by the consuming application. Possible values: ADD, DEL
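The fields above can be assembled into an assets.csv file with the standard csv and json modules. This is a minimal sketch under stated assumptions: the page does not specify the delimiter, whether a header row is required, or how nulls are encoded, so the comma delimiter, the header row, and empty strings for nulls are assumptions. The helper names, the sample hierarchy, and the TYPE/DATA_TYPE values are hypothetical.

```python
import csv
import io
import json

# Column order follows the field list above. The header row and the comma
# delimiter are assumptions, not something this page states.
ASSET_COLUMNS = ["uid", "type", "name", "parent_uid", "connection_uid", "attributes", "action"]


def asset_row(uid, asset_type, name, connection_uid, parent_uid=None, attributes=None, action="ADD"):
    """Build one assets.csv row as a dict; nulls become empty strings (assumption)."""
    if asset_type not in {"PATH", "ITEM", "ATTRIBUTE"}:
        raise ValueError(f"unsupported asset type: {asset_type}")
    if action not in {"ADD", "DEL"}:
        raise ValueError(f"unsupported action: {action}")
    return {
        "uid": uid,
        "type": asset_type,
        "name": name,
        "parent_uid": parent_uid or "",
        "connection_uid": connection_uid,
        # attributes is a JSON array of {"name": ..., "value": ...} objects
        "attributes": json.dumps(attributes) if attributes else "",
        "action": action,
    }


def write_assets(rows) -> str:
    """Serialize the rows to CSV text; the csv module quotes the JSON cell."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=ASSET_COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical hierarchy: schema (PATH) > table (ITEM) > column (ATTRIBUTE).
column_meta = [{"name": "EXTERNAL_OBJECT", "value": {"TYPE": "COLUMN", "DATA_TYPE": "INT"}}]
rows = [
    asset_row("a1", "PATH", "sales", "conn1"),
    asset_row("a2", "ITEM", "orders", "conn1", parent_uid="a1"),
    asset_row("a3", "ATTRIBUTE", "order_id", "conn1", parent_uid="a2", attributes=column_meta),
]
assets_csv = write_assets(rows)
print(assets_csv)
```

Letting the csv module handle quoting matters here, because the attributes cell contains commas and double quotes that would otherwise break the row.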

flows.csv

The list of all flows identified in all scanned source systems.

Name Type Description

type

enum not null

Type of the edge. Possible values: DATA_FLOW.

source_asset_uid

string not null

Unique identifier of the source asset.

target_asset_uid

string not null

Unique identifier of the target asset.

attributes

json null

Additional metadata related to the flow in JSON format following the structure:

[
  {
    "name": "....",
    "value": [object]
  }
]

  • name - name of the attribute

  • value - value of the attribute (can itself be a JSON value)

transformations

json null

JSON array of transformation unique identifiers.

action

enum not null

Defines the action which should be applied to the flow when the data are processed by the consuming application. Possible values: ADD, DEL
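Because source_asset_uid and target_asset_uid refer to rows in assets.csv, a hand-built archive can be checked for dangling references before it is used. The sketch below assumes both files carry a header row with the field names shown on this page; the function name and the sample rows are hypothetical.

```python
import csv
import io


def dangling_flow_endpoints(assets_csv: str, flows_csv: str) -> list[tuple[int, str, str]]:
    """Return (data row number, column, uid) for flow endpoints missing from assets.csv."""
    known = {row["uid"] for row in csv.DictReader(io.StringIO(assets_csv))}
    problems = []
    for number, flow in enumerate(csv.DictReader(io.StringIO(flows_csv)), start=1):
        for column in ("source_asset_uid", "target_asset_uid"):
            if flow[column] not in known:
                problems.append((number, column, flow[column]))
    return problems


# Hypothetical sample data; the header rows are an assumption.
assets = "\n".join([
    "uid,type,name,parent_uid,connection_uid,attributes,action",
    "a1,ATTRIBUTE,order_id,,conn1,,ADD",
    "a2,ATTRIBUTE,order_key,,conn1,,ADD",
])
flows = "\n".join([
    "type,source_asset_uid,target_asset_uid,attributes,transformations,action",
    "DATA_FLOW,a1,a2,,,ADD",
    "DATA_FLOW,a1,a9,,,ADD",
])
print(dangling_flow_endpoints(assets, flows))
```

In this sample the second flow points at `a9`, which does not exist in the assets list, so it is the only entry reported.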

transformations.csv

The list of all transformations identified in the source systems.

Name Type Description

uid

string not null unique

The unique identifier of the transformation identified in the source system.

type

enum not null

The type of the transformation. Possible values: SCRIPT, QUERY, EXPRESSION.

Attribute-level lineage pointers from the flows.csv file should point to records of the EXPRESSION type.

The expected hierarchy of the items in this table is EXPRESSION → QUERY → SCRIPT.

text

string not null

The text (body) of the transformation.

parent_uid

string null

The parent transformation unique identifier. Null for the highest transformation in the hierarchy.

complexity

enum not null

The complexity of the transformation. Possible values: SIMPLE, COMPLEX, etc.

position

json null

The list of all positions of the transformation text within the parent transformation text. Contains the array of the position objects in JSON format:

  • x, y - character position within the parent transformation text, starting from 1,1

{
    "start": {
        "x": [number],
        "y": [number]
    },
    "end": {
        "x": [number],
        "y": [number]
    }
}
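When building transformations by hand, a position object can be derived from the parent and child texts. The sketch below rests on assumptions this page does not confirm: that x is the column and y is the line, that both are 1-based as stated above, and that the end position is inclusive (it points at the last character of the child text). The function names are hypothetical.

```python
from typing import Optional


def _xy(text: str, offset: int) -> dict:
    """Convert a 0-based character offset into a 1-based position.

    Assumption: x is the column and y is the line.
    """
    line = text.count("\n", 0, offset) + 1
    last_newline = text.rfind("\n", 0, offset)  # -1 when on the first line
    return {"x": offset - last_newline, "y": line}


def position_of(parent_text: str, child_text: str) -> Optional[dict]:
    """Return the position of the first occurrence of child_text, or None."""
    if not child_text:
        return None
    start = parent_text.find(child_text)
    if start < 0:
        return None
    end = start + len(child_text) - 1  # inclusive end (assumption)
    return {"start": _xy(parent_text, start), "end": _xy(parent_text, end)}


query = "SELECT a,\n  b\nFROM t"
print(position_of(query, "SELECT"))
print(position_of(query, "b"))
```

Only the first occurrence is located here; since the field holds a list of all positions, a complete implementation would collect every match into a JSON array.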

attributes

json null

Additional metadata related to the transformation in JSON format following the structure:

[
  {
    "name": "....",
    "value": [object]
  }
]

  • name - name of the attribute

  • value - value of the attribute (can itself be a JSON value)

action

enum not null

Defines the action which should be applied to the transformation when the data are processed by the consuming application. Possible values: ADD, DEL
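The type hierarchy of the transformations can be checked through the parent_uid links. This is a sketch under one assumed reading of that hierarchy: an EXPRESSION belongs to a QUERY, and a QUERY belongs to a SCRIPT. Rows without a parent are treated as the top of their hierarchy and are not flagged. The function name and sample rows are hypothetical.

```python
# Assumed reading of the hierarchy: an EXPRESSION sits inside a QUERY,
# and a QUERY sits inside a SCRIPT. A SCRIPT is expected to have no parent.
EXPECTED_PARENT_TYPE = {"EXPRESSION": "QUERY", "QUERY": "SCRIPT"}


def hierarchy_violations(transformations: list[dict]) -> list[str]:
    """Return human-readable problems found in the parent_uid links."""
    by_uid = {t["uid"]: t for t in transformations}
    problems = []
    for t in transformations:
        parent_uid = t.get("parent_uid")
        if not parent_uid:
            continue  # top of its hierarchy, nothing to check
        parent = by_uid.get(parent_uid)
        if parent is None:
            problems.append(f"{t['uid']}: unknown parent {parent_uid}")
        elif parent["type"] != EXPECTED_PARENT_TYPE.get(t["type"]):
            problems.append(f"{t['uid']}: {t['type']} under {parent['type']}")
    return problems


# Hypothetical sample rows (only the fields needed for the check).
sample = [
    {"uid": "t1", "type": "SCRIPT", "parent_uid": None},
    {"uid": "t2", "type": "QUERY", "parent_uid": "t1"},
    {"uid": "t3", "type": "EXPRESSION", "parent_uid": "t2"},
    {"uid": "t4", "type": "EXPRESSION", "parent_uid": "t1"},
    {"uid": "t5", "type": "QUERY", "parent_uid": "t0"},
]
for problem in hierarchy_violations(sample):
    print(problem)
```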

connections.csv

List of all scanned source system connections, extended by the connections identified in source systems that can connect to other systems (e.g., BI tools or ETL tools) and by artificial connections to deduced assets.

Name Type Description

uid

string not null unique

Unique identifier of the connection.

type

enum not null

Type of the connection. Possible values: DATABASE, BI, ETL, FILE, etc.

name

string not null

Name of the connection.

technology

string not null

Technology name of the source system (e.g., SNOWFLAKE, DBT, TABLEAU).

attributes

json null

Additional metadata related to the connection in JSON format following the structure:

[
  {
    "name": "....",
    "value": [object]
  }
]

  • name - name of the attribute

  • value - value of the attribute (can itself be a JSON value)

action

enum not null

Defines the action which should be applied to the connection when the data are processed by the consuming application. Possible values: ADD, DEL

Example Ataccama lineage files

Azure Synapse Analytics

MS SQL Server

Manually created: ataccama_mssql_custom.zip

S3 and MS Excel
