Standalone Lineage Scanner
This guide is intended for admin users.
The Ataccama standalone lineage scanner is deployed as an independent application using Docker Compose. One of the most important security aspects of this solution is ensuring that the hosting environment of the Docker Compose service is secured at both the OS and networking levels.

System requirements
- Operating system: Linux, macOS, or Windows.
- Minimum system requirements: 10G disk / 4G RAM / 2 CPU
- Docker Compose: Version 2.32.0 or higher must be installed on the OS.
  - To check your Docker Compose version, run docker-compose --version or docker info.
- Or Podman: Podman running in Docker compatibility mode.
When installing on Windows, you need to use a BASH terminal. We recommend using Git BASH, which comes with the Git client.
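The Docker Compose version requirement can also be checked in a script. The following is a minimal sketch and not part of the scanner tooling: the helper name version_at_least is hypothetical, and it assumes the version is a plain dotted number such as the one printed by docker compose version --short.

```shell
# Hypothetical helper: succeeds if version $1 is greater than or equal to version $2.
# Handles dotted numeric versions only; a leading "v" and any "-suffix" are ignored.
version_at_least() {
  _have=${1#v}; _have=${_have%%[-+]*}
  _want=${2#v}; _want=${_want%%[-+]*}
  while [ -n "$_have" ] || [ -n "$_want" ]; do
    # Compare the leading component of each version; a missing component counts as 0.
    _h=${_have%%.*}; _w=${_want%%.*}
    [ "${_h:-0}" -gt "${_w:-0}" ] && return 0
    [ "${_h:-0}" -lt "${_w:-0}" ] && return 1
    # Drop the component that was just compared.
    case $_have in *.*) _have=${_have#*.} ;; *) _have= ;; esac
    case $_want in *.*) _want=${_want#*.} ;; *) _want= ;; esac
  done
  return 0
}

# Example use (requires Docker Compose to be installed):
#   version_at_least "$(docker compose version --short)" 2.32.0 && echo "Docker Compose is recent enough"
```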
Installation
Prerequisites
To install and run the scanner, you need to have permissions to execute Docker.
To verify this, run docker image ls. The command should return no errors.
Upgrading the scanner
Before upgrading the scanner, you must stop the lineage service.
./lineage-scanners system shutdown
Once the service is stopped, proceed with the installation steps using the new software version.
Installing the scanner
- Depending on your OS, start the following:
  - Linux: Start the Docker daemon.
  - Windows or macOS: Start Docker Desktop.
- Open the terminal and change your current directory to the folder containing the downloaded installation file.
- Make the installation file executable by running chmod 775 on it.
- Run the installer:
  ./lineage-scanners-installer.sh
- When prompted, choose whether you want to use the current directory for installation. If you select No, enter the target installation directory.
  In the following steps, we’ll refer to this folder as <install>.
Starting the lineage service
- Open the terminal and change your current directory to <install>/bin.
- Execute the following command:
  ./lineage-scanners system start
- Check whether the service is running:
  ./lineage-scanners system status
If you receive a message similar to the following, it is safe to ignore it and proceed with the installation:
! lss The requested image's platform (linux/amd64) does not match the detected host platform (linux/arm64/v8) and no specific platform was requested 0.0s
Configuring and running the scanner
Configuring a scan plan
- Place your scan plan in the <install>/user-data/scan-plans directory. The plan must be in valid JSON format.
  If you are deploying on Windows, restart the service after you copy the scan plan or modify an existing one.
- Open the terminal and change your current directory to <install>/bin.
- Run the following command to check whether your scan plan was successfully registered:
  ./lineage-scanners plan ls
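Because the plan must be valid JSON, it can help to check the file before copying it. The following is a minimal sketch under stated assumptions: python3 is available on the PATH (it is used only for its built-in json.tool module), and the helper names is_valid_json and install_scan_plan are hypothetical, not part of the scanner CLI.

```shell
# Hypothetical helper: succeeds if the file parses as JSON.
# Assumes python3 is on the PATH; only the standard json.tool module is used.
is_valid_json() {
  python3 -m json.tool "$1" > /dev/null 2>&1
}

# Hypothetical helper: copies a scan plan into <install>/user-data/scan-plans
# only if it is valid JSON. $1 is the plan file, $2 is the <install> folder.
install_scan_plan() {
  if ! is_valid_json "$1"; then
    echo "Not valid JSON: $1" >&2
    return 1
  fi
  cp "$1" "$2/user-data/scan-plans/"
}

# Example use:
#   install_scan_plan my-plan.json /path/to/install
```

This checks JSON syntax only. It does not check that the content is a valid scan plan.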
Configuring scan plan secrets
We recommend encrypting any sensitive information, such as passwords, secrets, and other credentials. For this purpose, you can use placeholders in the JSON scan plan file instead of the actual values, which are then stored encrypted in a key vault service. The following key vaults are supported.
Internal key vault
A placeholder for an internal secret has the syntax @@ref:ata:[<secret-name>]. These secrets are stored in the lineage service key vault. We will refer to these secrets as ATA secrets.
Azure Key Vault
A placeholder for an Azure Key Vault secret has the syntax @@ref:akv:[<secret-name>]. These secrets are stored in the Azure Key Vault. We will refer to these secrets as AKV secrets. The <secret-name> must match the Azure Key Vault secret name.
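To see which secrets a scan plan expects, you can search the file for both placeholder forms. The following is a minimal sketch: the helper name list_secret_refs is hypothetical, and it relies on grep -o and sed -E, which are available in GNU and BSD tools.

```shell
# Hypothetical helper: prints one "<vault> <secret-name>" line for every
# @@ref:ata:[...] or @@ref:akv:[...] placeholder found in the given file.
list_secret_refs() {
  grep -Eo '@@ref:(ata|akv):\[[^]]+\]' "$1" \
    | sed -E 's/^@@ref:(ata|akv):\[(.*)\]$/\1 \2/' \
    | sort -u
}

# Example use:
#   list_secret_refs <install>/user-data/scan-plans/my-plan.json
```

Lines starting with ata name secrets that must exist in the internal key vault. Lines starting with akv name secrets that must exist in Azure Key Vault.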
Creating an ATA secret
If a secret is used in the scan plan, you need to set its value before you run the scan. The following example shows an ATA secret named powerbi_client_secret used in a JSON file:
"clientSecret" : "@@ref:ata:[powerbi_client_secret]"
Secrets entered from command line
Use the following command to create a secret directly from the command line. The maximum length for the secret is 256 characters:
./lineage-scanners secret create <secret-name>
When prompted, provide the actual value.
File secrets
Use the following command to create a secret from a file. Use this option for secrets longer than 256 characters or for key file secrets. The <secret-file> parameter is the name of the file that contains the secret. This file must be located in the <install>/user-data/ folder.
./lineage-scanners secret create <secret-name> -f <secret-file>
Once the secret is added, you can delete the file from the filesystem.
The <secret-name> is a unique identifier of the secret across all scan plans. The same secret can therefore be used in multiple scan plans.
Removing an ATA secret
Use the following command to remove a secret:
./lineage-scanners secret rm <secret-name>
Configuring Azure Key Vault
Configure the connection and credentials to your Azure Key Vault by creating the following ATA secrets. Once these secrets are configured correctly and the Azure Key Vault is accessible from the standalone scanner location, the AKV secrets used in the JSON scan plan files will be retrieved from this Azure Key Vault.
- ./lineage-scanners secret create lss.akv.url
  The full URL to your Azure Key Vault instance.
- ./lineage-scanners secret create lss.akv.clientId
  The Application (client) ID of the Azure AD Service Principal (or managed identity) you’re using to authenticate.
- ./lineage-scanners secret create lss.akv.clientSecret
  The password/secret associated with the Client ID.
- ./lineage-scanners secret create lss.akv.tenantId
  The Azure Active Directory tenant ID your app and Key Vault belong to.
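The four commands above can be run in one pass. The following is a minimal sketch that assumes you run it from <install>/bin and that each secret create call prompts for the value, as described for secrets entered from the command line. The function name create_akv_secrets and the LSS_CLI variable are hypothetical conveniences.

```shell
# Path to the scanner CLI; assumes the current directory is <install>/bin.
LSS_CLI=${LSS_CLI:-./lineage-scanners}

# Hypothetical helper: creates the four ATA secrets that hold the Azure Key Vault
# connection details. Stops at the first command that fails.
create_akv_secrets() {
  for name in lss.akv.url lss.akv.clientId lss.akv.clientSecret lss.akv.tenantId; do
    "$LSS_CLI" secret create "$name" || return 1
  done
}

# Example use:
#   create_akv_secrets
```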
Running a scan
- Open the terminal and change your current directory to <install>/bin.
- Run the following command:
  ./lineage-scanners plan exec <scan-plan-name>
  Provide the scan plan name without the .json extension.
  The scan should now be running, and you should see the scan <scan-run-id> and status. The <scan-run-id> is a unique identifier of the scan. For a list of possible scan statuses, see Scan statuses.
  Scan run id: 0295650d-7302-43ec-9d7e-8fa799cf7ca7
  Scan run status: CREATED
- To check whether the scan finished successfully, run the following command:
  ./lineage-scanners scanrun status <scan-run-id>
Scan statuses
A scan can be in one of the following states:
- CREATED: The status assigned immediately after the scan is initiated.
- RUNNING: The scan is currently running.
- FINISHED: The scan has completed successfully.
- FAILED: The scan has failed.
- REJECTED: The scan was not initiated because the scan plan JSON file is corrupt or a secret value used in the scan plan is undefined.
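The statuses above can be used to wait for a scan to finish from a script. The following is a minimal sketch with one important assumption: it expects scanrun status to print a "Scan run status: <STATUS>" line in the same form as the plan exec output, which this guide does not confirm. The function names and the LSS_CLI and POLL_SECONDS variables are hypothetical.

```shell
# Path to the scanner CLI; assumes the current directory is <install>/bin.
LSS_CLI=${LSS_CLI:-./lineage-scanners}
# Seconds to wait between status checks.
POLL_SECONDS=${POLL_SECONDS:-10}

# Reads CLI output on stdin and prints the value after "Scan run status:".
extract_status() {
  sed -n 's/.*Scan run status: *\([A-Z]*\).*/\1/p'
}

# Reads CLI output on stdin and prints the value after "Scan run id:".
extract_run_id() {
  sed -n 's/.*Scan run id: *\([0-9a-f-]*\).*/\1/p'
}

# Polls the scan identified by $1 until it leaves CREATED or RUNNING.
# Prints the final status; returns 0 for FINISHED and 1 for FAILED or REJECTED.
wait_for_scan() {
  while :; do
    status=$("$LSS_CLI" scanrun status "$1" | extract_status)
    case $status in
      FINISHED) echo "$status"; return 0 ;;
      FAILED|REJECTED) echo "$status"; return 1 ;;
      CREATED|RUNNING) sleep "$POLL_SECONDS" ;;
      *) echo "Unrecognized status: '$status'" >&2; return 2 ;;
    esac
  done
}

# Example use:
#   run_id=$("$LSS_CLI" plan exec my-plan | extract_run_id)
#   wait_for_scan "$run_id"
```

If the status line has a different form in your version, adjust the pattern in extract_status. Unrecognized output stops the loop instead of polling indefinitely.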
Next steps
Once you generate the lineage file, it needs to be uploaded to Ataccama ONE.
Removing the lineage scanner
Uninstalling the lineage scanner completely removes the scanner from the system, including scan plans, secrets, logs, and scan results.
If you want to keep the scan plans, create a backup of the folder <install>/user-data/scan-plans.
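A backup of the scan plans can be as simple as copying the folder elsewhere before you uninstall. The following is a minimal sketch: the helper name backup_scan_plans and both paths in the usage comment are hypothetical.

```shell
# Hypothetical helper: copies <install>/user-data/scan-plans into a backup folder.
# $1 is the <install> folder, $2 is the backup destination (created if missing).
backup_scan_plans() {
  src="$1/user-data/scan-plans"
  if [ ! -d "$src" ]; then
    echo "No scan plans folder found at $src" >&2
    return 1
  fi
  mkdir -p "$2" && cp -R "$src" "$2/"
}

# Example use:
#   backup_scan_plans /path/to/install "$HOME/lineage-backup"
```

The plans end up in a scan-plans subfolder of the destination. Secrets are not part of this backup and must be created again after a reinstall.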
To remove the lineage scanner:
- Stop the lineage scanner:
  ./lineage-scanners system shutdown
- Change the current directory to the compose folder:
  cd compose
- Load the lineage scanner environment variables file:
  source ../bin/lineage-scanners.env
- Stop and remove the Docker containers and associated files:
  docker compose down --rmi local -v
- Delete the <install> folder.
Troubleshooting
The scan is failing
If the scan finished with an error, check the content of the <install>/user-data/work-area/scan-runs/<scan-run-id> folder.
To find out more about the folder structure and learn what to look for, see Scanner Output File Structure.
If the lineage service fails, the log information is available in the <install>/logs/lss folder.
Container lineage-scanners-vault-1 exited (1)
When the lineage scanner vault service fails to start and the service log contains the error message Error initializing core: Failed to lock memory: cannot allocate memory, modify the /etc/docker/daemon.json file and add a "memlock" section to "default-ulimits" as follows:
{
  "default-ulimits": {
    "memlock": {
      "Name": "memlock",
      "Hard": -1,
      "Soft": -1
    }
  },
  "log-opts": {
    "max-size": "100m"
  }
}
After you have modified the file, perform the following steps:
- Shut down the lineage service:
  ./lineage-scanners system shutdown
- Restart the Docker service:
  sudo systemctl restart docker
- Start the lineage service:
  ./lineage-scanners system start
Out of memory issue
The scanner runs out of memory while scanning a large data source.
Check the file <install>/compose/lss/config/application-lineage-scanners.yaml. It contains jvm-args with content like this:
jvm-args: "-Xms2048m -Xmx2048m -XX:+ExitOnOutOfMemoryError --add-opens=java.base/java.nio=org.apache.arrow.memory.core,ALL-UNNAMED "
Increase the Xmx parameter as allowed by the virtual machine you are using. After modifying the file, restart the lineage-scanners service (shutdown/start).
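Before editing, you can inspect the current heap limit and preview the change. The following is a minimal sketch: the helper names current_xmx and with_xmx are hypothetical, and with_xmx only prints the modified content so that you can review it before changing the file yourself.

```shell
# Hypothetical helper: prints the current -Xmx setting found in the given file.
current_xmx() {
  grep -o -e '-Xmx[0-9]*[kmgKMG]' "$1"
}

# Hypothetical helper: prints the file content with -Xmx replaced by the size in $2
# (for example 4096m). The file itself is not modified.
with_xmx() {
  sed "s/-Xmx[0-9]*[kmgKMG]/-Xmx$2/" "$1"
}

# Example use:
#   current_xmx <install>/compose/lss/config/application-lineage-scanners.yaml
#   with_xmx <install>/compose/lss/config/application-lineage-scanners.yaml 4096m
```

The -Xms value is left unchanged, which is valid as long as it does not exceed the new -Xmx value.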
dbt scan fails
The dbt scan fails when using local files to feed the scanner.
This issue occurs because the dbt scanner runs inside the lineage-scanners-lss container and cannot access the host machine’s entire filesystem. Keep this in mind when you set paths in scan plans.
To resolve the issue:
- Place the dbt_files folder in the <install>/user-data directory.
- The user-data directory is mapped into the lss container at this location: /opt/ataccama/lineage-scanning/user-data.
- This means that you need to modify your scan plan definition as follows:
  "file": {
    "path": "/opt/ataccama/lineage-scanning/user-data/dbt_files",
    "manifest": "manifest.json",
    "catalog": "catalog.json"
  }
The path in the scan plan definition specifies the directory structure that is valid inside the LSS container, not that of the host machine.
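The mapping between host and container paths can be expressed as a small helper. The following is a minimal sketch: the function name to_container_path is hypothetical, and the container location is the one stated above.

```shell
# Location of the user-data directory inside the LSS container, as stated in this guide.
CONTAINER_ROOT=/opt/ataccama/lineage-scanning/user-data

# Hypothetical helper: translates a host path under <install>/user-data into the path
# to use in a scan plan. $1 is the <install> folder, $2 is the host path.
# Fails for paths outside <install>/user-data, which the container cannot see.
to_container_path() {
  host_root="${1%/}/user-data"
  case $2 in
    "$host_root")
      printf '%s\n' "$CONTAINER_ROOT" ;;
    "$host_root"/*)
      rest=${2#"$host_root"}
      printf '%s\n' "$CONTAINER_ROOT$rest" ;;
    *)
      echo "Not under $host_root: $2" >&2
      return 1 ;;
  esac
}

# Example use:
#   to_container_path /path/to/install /path/to/install/user-data/dbt_files
```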