Dremio Lineage Scanner

The Dremio lineage scanner ingests metadata using a combination of JDBC and the Dremio REST API.

Running the scanner

To run the scanner, use the following command in the directory containing the scanner’s application .jar file:

 java -jar --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED mde-lineage-toolkit-X.Y.Z.jar dremio.json
Replace X.Y.Z with the version number of your downloaded toolkit.
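
If the scanner is started from a script rather than by hand, the command above can be assembled programmatically. The following is a minimal sketch, not part of the toolkit: the function names `build_scanner_command` and `run_scanner` are hypothetical, and the version string passed in is whatever your downloaded toolkit uses.

```python
import subprocess
from pathlib import Path

# JVM option copied verbatim from the command above.
ADD_OPENS = "--add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED"


def build_scanner_command(version: str, config_file: str = "dremio.json") -> list[str]:
    """Return the argument list for the scanner command (hypothetical helper)."""
    jar_name = f"mde-lineage-toolkit-{version}.jar"
    return ["java", "-jar", ADD_OPENS, jar_name, config_file]


def run_scanner(version: str, jar_dir: str, config_file: str = "dremio.json") -> int:
    """Run the scanner from the directory containing the .jar and return its exit code."""
    jar_path = Path(jar_dir) / f"mde-lineage-toolkit-{version}.jar"
    if not jar_path.is_file():
        raise FileNotFoundError(f"Scanner jar not found: {jar_path}")
    completed = subprocess.run(build_scanner_command(version, config_file), cwd=jar_dir)
    return completed.returncode
```

Running with `cwd=jar_dir` mirrors the instruction to start the scanner in the directory that contains the .jar file.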

Supported Sources

  • Cross lineage is supported for the following Dremio Sources:

    • Oracle

    • MS SQL

    • Snowflake

Supported statement types and SQL syntax

  • Supported statement types:

    • CREATE VIEW

  • Supported Database Objects for Design Time Lineage:

    • Views

  • Not yet supported SQL constructs:

    • Statements containing these SQL constructs will not be included in the lineage diagram:

      • Table Functions

      • UNNEST Clause

      • AT SNAPSHOT Clause

Note: Support for most of the unsupported syntax is planned for upcoming versions of the scanner, based on prioritization.

Database-Level Privileges

The scanner’s database user needs to be able to query these Data Dictionary views:

  • TABLES

  • VIEWS

  • COLUMNS
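
One way to check these privileges before a scan is to run a small probe query against each view as the scanner's user. The sketch below only generates the query text. It assumes the views live in Dremio's INFORMATION_SCHEMA, which this page does not state explicitly, and `probe_queries` is a hypothetical helper name.

```python
# The three Data Dictionary views listed above.
REQUIRED_VIEWS = ("TABLES", "VIEWS", "COLUMNS")


def probe_queries(schema: str = "INFORMATION_SCHEMA") -> list[str]:
    """Build one single-row probe query per required view.

    Assumption: the views are exposed under INFORMATION_SCHEMA.
    View names are double-quoted so they are always treated as identifiers.
    """
    return [f'SELECT * FROM {schema}."{view}" LIMIT 1' for view in REQUIRED_VIEWS]


for query in probe_queries():
    print(query)
```

If each generated query returns without a permission error when run as the scanner's user, that user can read the corresponding view.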

Scanner configuration

Online (JDBC-based) extraction connects to Dremio and extracts the necessary metadata.

Scope: Design time views and stored procedures.

Limitations: Currently, only username/password authentication is supported (other types are planned).

Properties and their descriptions:

  • *name: Unique name of the scanner job.

  • *sourceType: Must contain DREMIO.

  • *description: Human-readable description.

  • *arrowFlightJdbcUrl: Arrow Flight JDBC URL.

  • *username: Dremio username.

  • *password: Dremio password. Can be encrypted.

  • *apiHost: Dremio host. Required for REST API calls.

  • *apiPort: Dremio port. Required for REST API calls.

  • useSSLApiCalls: Set to true by default. For a development or Docker Dremio instance, can be set to false to disable SSL for API requests.

  • acceptSelfSignedSSLCertificate: Set to false by default. For a development or Docker Dremio instance, can be set to true to accept a self-signed SSL certificate. Used only when useSSLApiCalls is set to true.

  • includeSpaces: List of spaces to include in lineage extraction. The filter is case-insensitive and supports the SQL wildcards '%' and '_'. For example, to include all spaces ending with "PROD", add the filter "%PROD".

  • excludeSpaces: List of spaces to exclude from lineage extraction. Like includeSpaces, this filter is case-insensitive and supports SQL wildcards.

Legend: *mandatory
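
To illustrate how the includeSpaces and excludeSpaces filters behave, the sketch below matches space names against SQL wildcard patterns case-insensitively. It is not the scanner's implementation. Two points are assumptions rather than documented behavior: an empty includeSpaces list is treated as "include everything", and an exclude match takes precedence over an include match. The function names are hypothetical.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL wildcard pattern into a case-insensitive regex.

    '%' matches any sequence of characters, '_' matches exactly one character;
    everything else is matched literally.
    """
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def is_space_selected(space: str, include_spaces: list[str], exclude_spaces: list[str]) -> bool:
    """Decide whether a space would be scanned under the given filters.

    Assumptions: an empty include list selects all spaces, and excludes win over includes.
    """
    included = not include_spaces or any(
        like_to_regex(p).fullmatch(space) for p in include_spaces
    )
    excluded = any(like_to_regex(p).fullmatch(space) for p in exclude_spaces)
    return included and not excluded


print(is_space_selected("sales_prod", ["%PROD"], []))      # matches "%PROD" regardless of case
print(is_space_selected("DevSpace1", [], ["DevSpace%"]))   # excluded by the wildcard filter
```

Under these assumptions, the first call selects the space and the second rejects it.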

Example configuration (for Dremio’s Docker)
 {
  "scannerConfigs": [
    {
      "name": "dremio-localhost",
      "sourceType": "DREMIO",
      "description": "Scan dremio export files",
      "includeSpaces": [],
      "excludeSpaces": ["@admin", "DevSpace%"],
      "connection": {
        "arrowFlightJdbcUrl": "jdbc:arrow-flight-sql://localhost:32010/?useEncryption=false",
        "username": "admin",
        "password": "some_password",
        "apiHost": "localhost",
        "apiPort": 9047
      }
    }
  ]
}
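
Before running the scanner, it can help to check that a configuration file contains every mandatory property. The following is a minimal sketch under stated assumptions: the nesting (name, sourceType, and description at the scanner level, connection details under "connection") is inferred from the example above, and `validate_scanner_config` and `validate_config_file` are hypothetical helpers, not part of the toolkit.

```python
import json

# Mandatory properties, grouped by where the example configuration places them.
MANDATORY_SCANNER_KEYS = ("name", "sourceType", "description")
MANDATORY_CONNECTION_KEYS = ("arrowFlightJdbcUrl", "username", "password", "apiHost", "apiPort")


def validate_scanner_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means none were found."""
    errors = []
    scanners = config.get("scannerConfigs")
    if not scanners:
        return ["'scannerConfigs' is missing or empty"]
    for index, scanner in enumerate(scanners):
        prefix = f"scannerConfigs[{index}]"
        for key in MANDATORY_SCANNER_KEYS:
            if key not in scanner:
                errors.append(f"{prefix}: missing mandatory property '{key}'")
        if "sourceType" in scanner and scanner["sourceType"] != "DREMIO":
            errors.append(f"{prefix}: 'sourceType' must be DREMIO")
        connection = scanner.get("connection", {})
        for key in MANDATORY_CONNECTION_KEYS:
            if key not in connection:
                errors.append(f"{prefix}.connection: missing mandatory property '{key}'")
    return errors


def validate_config_file(path: str) -> list[str]:
    """Load a JSON configuration file and validate it."""
    with open(path, encoding="utf-8") as handle:
        return validate_scanner_config(json.load(handle))
```

For example, `validate_config_file("dremio.json")` returns an empty list for the example configuration above and one message per missing property otherwise.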

Scanner special features

The Dremio scanner supports the same SQL "anomaly" detection checks as the Snowflake scanner. See SQL "DQ" (Anomaly) detections for more details.
