
Azure Data Factory Lineage Scanner

The Azure Data Factory lineage scanner uses a Microsoft SDK to extract raw metadata from Azure Data Factories. The scanner then processes this metadata further to create lineage metadata (for example, by parsing datasets and dataflows).

What is extracted from Azure Data Factory?

Lineage information from Azure Data Factory is available at the attribute level. The extracted metadata includes the following objects:

  • Factories

  • Linked services

  • Datasets

  • Dataflows

Permissions and security

Azure Data Factory uses Azure role-based access control (RBAC) to assign permissions to users. To access data from Data Factories, grant the Data Factory Contributor role to the user you’ll be authenticating with.

Assign role in Azure Data Factory

Before continuing, make sure you have the necessary access permissions in your Azure Portal.

  1. In Azure Portal, navigate to your Azure Data Factory.

  2. Open Access control (IAM).

  3. Select Add > Add role assignment.

  4. Select the Data Factory Contributor role and on the Members tab, assign it to the relevant user or group of users.

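As an alternative to the portal steps above, the same role assignment could be scripted. The following is a minimal sketch, not part of the scanner: it only builds an Azure CLI `az role assignment create` command and prints it, without running it. The function names, the placeholder IDs, and the factory name `my-data-factory` are assumptions made for illustration.

```python
import shlex


def factory_scope(subscription_id: str, resource_group: str, factory_name: str) -> str:
    """Build the Azure resource ID of a Data Factory, used as the role scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
    )


def role_assignment_command(assignee: str, scope: str) -> list:
    """Return the Azure CLI arguments that grant Data Factory Contributor."""
    return [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", "Data Factory Contributor",
        "--scope", scope,
    ]


# Placeholder values (assumptions); replace them with your own identifiers.
scope = factory_scope("your-subscription-id", "MetadataExtractionGroup", "my-data-factory")
command = role_assignment_command("your-client-id", scope)

# Print a shell-quoted command for review instead of executing it.
print(shlex.join(command))
```

Printing the command rather than executing it keeps the sketch free of side effects; review the scope and assignee before running it in your own shell.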

Scanner configuration

All fields marked with an asterisk (*) are mandatory.

| Property | Description |
| --- | --- |
| name* | Unique name for the scanner job. |
| sourceType* | Specifies the source type to be scanned. Must be set to ADF. |
| description* | A human-readable description of the scan. |
| oneConnections | List of Ataccama ONE connection names for future automatic pairing. |
| inputDataCatalogFilePath | Path to Data Factory Manager. |
| tenantId* | To find your organization's tenant ID, go to Azure Active Directory > Properties in your Azure Portal. |
| subscriptionId* | To find your subscription ID, go to Subscriptions in your Azure Portal and select your subscription. |
| clientId* | Microsoft OAuth 2.0 credentials client ID. Make sure to first register a Microsoft Entra application in Azure. The application establishes permissions for SDK resources and allows access to the Azure Data Factory SDK data. |
| clientSecret* | Microsoft OAuth 2.0 credentials client secret. |
| resourceGroup* | Name of the container that holds related resources for an Azure solution. |

Azure Data Factory scanner example configuration
{
   "scannerConfigs":[
      {
         "name":"ADFJob1",
         "sourceType":"ADF",
         "description":"Scan ADF",
         "oneConnections":[

         ],
         "inputDataCatalogFilePath":null,
         "tenantId":"your-tenant-id",
         "subscriptionId":"your-subscription-id",
         "clientId":"your-client-id",
         "clientSecret":"@@ref:ata:[ADF_CLIENT_SECRET]",
         "resourceGroup":"MetadataExtractionGroup"
      }
   ]
}
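
Before running a scan, it can help to check that every mandatory property is present. The following is a minimal sketch of such a check, not part of the scanner itself. The function name `validate_scanner_config` is hypothetical, and the key names follow the example configuration above.

```python
import json

# Mandatory keys: the properties marked with an asterisk in this section.
MANDATORY_KEYS = (
    "name", "sourceType", "description", "tenantId",
    "subscriptionId", "clientId", "clientSecret", "resourceGroup",
)


def validate_scanner_config(config: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    for index, scanner in enumerate(config.get("scannerConfigs", [])):
        label = scanner.get("name") or f"scannerConfigs[{index}]"
        for key in MANDATORY_KEYS:
            if not scanner.get(key):
                problems.append(f"{label}: missing mandatory property '{key}'")
        # A present but different source type is reported separately.
        if scanner.get("sourceType") not in (None, "", "ADF"):
            problems.append(f"{label}: sourceType must be 'ADF'")
    return problems


# The example configuration from this page, with placeholder values.
EXAMPLE = json.loads("""
{
  "scannerConfigs": [
    {
      "name": "ADFJob1",
      "sourceType": "ADF",
      "description": "Scan ADF",
      "oneConnections": [],
      "inputDataCatalogFilePath": null,
      "tenantId": "tenant-id",
      "subscriptionId": "your-subscription-id",
      "clientId": "your-client-id",
      "clientSecret": "@@ref:ata:[ADF_CLIENT_SECRET]",
      "resourceGroup": "MetadataExtractionGroup"
    }
  ]
}
""")

for problem in validate_scanner_config(EXAMPLE):
    print(problem)
```

The check treats empty strings and null values as missing, which is an assumption about how strictly the scanner interprets mandatory fields.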

Supported Azure Data Factory source technologies

The scanner supports the following Azure Data Factory linked service types:

  • AzurePostgreSqlLinkedService

  • AzureSqlDatabaseLinkedService

  • SqlServerLinkedService

  • SnowflakeV2LinkedService
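
To estimate scan coverage ahead of time, you could compare the linked services in a factory against the list above. The following is a minimal sketch under the assumption that you already have each linked service's name and type as a pair. The function `partition_linked_services` and the sample names are hypothetical, and `AzureBlobStorageLinkedService` appears only as an example of a type that is not on the supported list.

```python
# Linked service types listed as supported in this section.
SUPPORTED_LINKED_SERVICES = frozenset({
    "AzurePostgreSqlLinkedService",
    "AzureSqlDatabaseLinkedService",
    "SqlServerLinkedService",
    "SnowflakeV2LinkedService",
})


def partition_linked_services(services):
    """Split (name, type) pairs into supported and unsupported name lists."""
    supported, unsupported = [], []
    for name, service_type in services:
        if service_type in SUPPORTED_LINKED_SERVICES:
            supported.append(name)
        else:
            unsupported.append(name)
    return supported, unsupported


# Illustrative input; the names are made up for this sketch.
sample = [
    ("warehouse", "SnowflakeV2LinkedService"),
    ("raw_files", "AzureBlobStorageLinkedService"),
]
supported, unsupported = partition_linked_services(sample)
print("supported:", supported)
print("unsupported:", unsupported)
```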

Supported dataflow transformations

  • Source

  • Filter

  • Join

  • Union

  • Sink

Limitations

Azure Data Factory allows a maximum of 12,500 requests per hour. For more details, see the official Microsoft documentation.
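
If you write your own tooling against the same factory, one way to stay under this limit is to space requests evenly: 3,600 seconds divided by 12,500 requests gives a minimum interval of 0.288 seconds. The following is a minimal sketch of that idea. The `Throttle` class is hypothetical and is not how the scanner itself handles the limit. Even spacing is a conservative assumption, because the way Azure counts requests within the hour is not described here.

```python
import time

REQUESTS_PER_HOUR = 12_500
# Evenly spaced requests: 3600 s / 12,500 requests = 0.288 s between requests.
MIN_INTERVAL_SECONDS = 3600 / REQUESTS_PER_HOUR


class Throttle:
    """Enforce a minimum interval between consecutive requests."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock      # injectable for testing
        self._sleep = sleep      # injectable for testing
        self._next_allowed = None

    def wait(self) -> float:
        """Block until the next request may be sent; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._next_allowed is not None and now < self._next_allowed:
            delay = self._next_allowed - now
            self._sleep(delay)
            now = self._next_allowed
        self._next_allowed = now + self._min_interval
        return delay
```

Call `wait()` immediately before each request. The first call returns without sleeping, and later calls sleep only as long as needed to keep the spacing.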
