Azure Data Factory Lineage Scanner
The Azure Data Factory lineage scanner uses a Microsoft SDK to extract raw metadata from Azure Data Factories. This metadata is then processed into lineage metadata (for example, by parsing datasets and dataflows).
What is extracted from Azure Data Factory?
Lineage information from Azure Data Factory is available at the attribute level. The extracted metadata includes the following objects:
- Factories
- Linked services
- Datasets
- Dataflows
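To make the object types above more concrete, the following minimal sketch builds the Azure Resource Manager REST URLs under which these objects are listed. It is an illustration only: the scanner itself uses a Microsoft SDK, and the helper names `factories_url` and `collection_url` are hypothetical.

```python
from urllib.parse import quote

ARM_ENDPOINT = "https://management.azure.com"
API_VERSION = "2018-06-01"  # Data Factory REST API version

# Object kinds from the list above mapped to their REST collection names.
COLLECTIONS = {
    "linked_services": "linkedservices",
    "datasets": "datasets",
    "dataflows": "dataflows",
}


def factories_url(subscription_id: str, resource_group: str) -> str:
    """URL that lists all factories in one resource group."""
    return (
        f"{ARM_ENDPOINT}/subscriptions/{quote(subscription_id, safe='')}"
        f"/resourceGroups/{quote(resource_group, safe='')}"
        f"/providers/Microsoft.DataFactory/factories"
        f"?api-version={API_VERSION}"
    )


def collection_url(subscription_id: str, resource_group: str,
                   factory_name: str, kind: str) -> str:
    """URL that lists one kind of object (see COLLECTIONS) in a factory."""
    base = factories_url(subscription_id, resource_group).split("?")[0]
    return (
        f"{base}/{quote(factory_name, safe='')}/{COLLECTIONS[kind]}"
        f"?api-version={API_VERSION}"
    )
```

Calling `collection_url("your-subscription-id", "MetadataExtractionGroup", "my-factory", "datasets")` returns the listing URL for the datasets of a (hypothetical) factory named `my-factory`.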
Permissions and security
Azure Data Factory uses Azure Role-Based Access Control (RBAC) to assign permissions to users. To be able to access data from Data Factories, you need to grant the role of Data Factory Contributor to the user you’ll be authenticating with.
Assign role in Azure Data Factory
Before continuing, make sure you have the necessary access permissions in Azure Portal.
- In Azure Portal, navigate to your Azure Data Factory.
- Open Access control (IAM).
- Select Add > Add role assignment.
- Select the Data Factory Contributor role and, on the Members tab, assign it to the relevant user or group of users.
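If you prefer scripting to the portal, the same role assignment can be expressed as an Azure CLI call. The sketch below only builds the command string and does not run it. The helper names and the placeholder IDs are hypothetical, and scoping the assignment to a single factory is an assumption (the resource group can also be used as the scope).

```python
import shlex


def factory_scope(subscription_id: str, resource_group: str,
                  factory_name: str) -> str:
    """Azure resource ID of one Data Factory, used as the assignment scope."""
    return (
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.DataFactory/factories/{factory_name}"
    )


def role_assignment_command(assignee: str, scope: str,
                            role: str = "Data Factory Contributor") -> str:
    """Build (but do not execute) an `az role assignment create` command."""
    args = [
        "az", "role", "assignment", "create",
        "--assignee", assignee,
        "--role", role,
        "--scope", scope,
    ]
    # shlex.join quotes arguments that contain spaces, such as the role name.
    return shlex.join(args)


# Placeholder values; replace them with your own IDs and names.
scope = factory_scope("your-subscription-id", "MetadataExtractionGroup",
                      "my-factory")
print(role_assignment_command("your-client-id", scope))
```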
Scanner configuration
All fields marked with an asterisk (*) are mandatory.
Property | Description |
---|---|
name | Unique name for the scanner job. |
sourceType | Specifies the source type to be scanned. Must contain ADF. |
description | A human-readable description of the scan. |
oneConnections | List of Ataccama ONE connection names for future automatic pairing. |
inputDataCatalogFilePath | Path to Data Factory Manager |
tenantId | To find your organization |
subscriptionId | To find your |
clientId | Microsoft OAuth 2.0 credentials client ID. Make sure to first register a Microsoft Entra application in Azure. The application establishes permissions for SDK resources and allows access to the Azure Data Factory SDK data. |
clientSecret | Microsoft OAuth 2.0 credentials client secret. |
resourceGroup | Name of the container that holds related resources for an Azure solution. |
{
  "scannerConfigs": [
    {
      "name": "ADFJob1",
      "sourceType": "ADF",
      "description": "Scan ADF",
      "oneConnections": [],
      "inputDataCatalogFilePath": null,
      "tenantId": "tenant-id",
      "subscriptionId": "your-subscription-id",
      "clientId": "your-client-id",
      "clientSecret": "@@ref:ata:[ADF_CLIENT_SECRET]",
      "resourceGroup": "MetadataExtractionGroup"
    }
  ]
}
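A configuration like the one above can be sanity-checked before a scan is started. The following is a minimal sketch, not part of the scanner. The function name is hypothetical, and the set of keys treated as mandatory in `REQUIRED_KEYS` is an assumption that you should adjust to match the fields marked with an asterisk.

```python
import json
from typing import List

# Assumption: these keys are treated as mandatory for illustration only.
REQUIRED_KEYS = ("name", "sourceType", "tenantId", "subscriptionId",
                 "clientId", "clientSecret", "resourceGroup")


def validate_scanner_configs(config: dict) -> List[str]:
    """Return a list of human-readable problems; empty means no issues."""
    scanners = config.get("scannerConfigs")
    if not isinstance(scanners, list) or not scanners:
        return ["scannerConfigs must be a non-empty list"]

    problems = []
    for index, scanner in enumerate(scanners):
        label = scanner.get("name") or f"scannerConfigs[{index}]"
        for key in REQUIRED_KEYS:
            if not scanner.get(key):
                problems.append(f"{label}: missing or empty '{key}'")
        # A present but wrong source type is reported separately.
        if scanner.get("sourceType") not in (None, "", "ADF"):
            problems.append(f"{label}: sourceType must be 'ADF'")
    return problems


EXAMPLE = json.loads("""
{
  "scannerConfigs": [
    {
      "name": "ADFJob1",
      "sourceType": "ADF",
      "description": "Scan ADF",
      "oneConnections": [],
      "inputDataCatalogFilePath": null,
      "tenantId": "tenant-id",
      "subscriptionId": "your-subscription-id",
      "clientId": "your-client-id",
      "clientSecret": "@@ref:ata:[ADF_CLIENT_SECRET]",
      "resourceGroup": "MetadataExtractionGroup"
    }
  ]
}
""")

print(validate_scanner_configs(EXAMPLE))
```

Returning a list of problems instead of raising on the first one lets you report every issue in a configuration file in a single pass.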
Supported Azure Data Factory source technologies
The scanner supports the following Azure Data Factory linked services:
- AzurePostgreSqlLinkedService
- AzureSqlDatabaseLinkedService
- SqlServerLinkedService
- SnowflakeV2LinkedService
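To check in advance which linked services in a factory fall into the supported set, you could filter their definitions as in the sketch below. It rests on two assumptions: each definition is a dictionary with the Azure Data Factory JSON layout (`properties.type`, for example `"AzureSqlDatabase"`), and the names in the list above are that type with `LinkedService` appended. The function name is hypothetical.

```python
# Names taken from the supported list above.
SUPPORTED_LINKED_SERVICES = frozenset({
    "AzurePostgreSqlLinkedService",
    "AzureSqlDatabaseLinkedService",
    "SqlServerLinkedService",
    "SnowflakeV2LinkedService",
})


def is_supported(linked_service: dict) -> bool:
    """True if the linked service definition maps to a supported name."""
    # Assumption: the definition stores its kind under properties.type.
    type_name = linked_service.get("properties", {}).get("type", "")
    return f"{type_name}LinkedService" in SUPPORTED_LINKED_SERVICES


# Hypothetical linked service definitions for illustration.
services = [
    {"name": "warehouse", "properties": {"type": "SnowflakeV2"}},
    {"name": "files", "properties": {"type": "AzureBlobStorage"}},
]
supported_names = [s["name"] for s in services if is_supported(s)]
```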
Limitations
Azure Data Factory allows up to 12,500 requests per hour. For more details, see the official Microsoft documentation.
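A limit of 12,500 requests per hour works out to one request every 3600 / 12,500 = 0.288 seconds on average. The sketch below shows one simple way a client could stay under that rate by spacing requests evenly. The `RequestPacer` class is hypothetical and is not how the scanner itself is known to behave.

```python
import time


class RequestPacer:
    """Spaces calls evenly so they never exceed max_requests per period."""

    def __init__(self, max_requests: int = 12_500,
                 period_seconds: float = 3600.0,
                 clock=time.monotonic, sleep=time.sleep):
        # Minimum gap between two requests (0.288 s for the defaults).
        self.min_interval = period_seconds / max_requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self) -> None:
        """Block until the next request is allowed, then reserve a slot."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.min_interval
```

Call `pacer.wait()` immediately before each request. Even spacing is a conservative choice: it never allows bursts, at the cost of some idle time when only a few requests are made.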