Metastore Connection

Metastore data sources are able to detect partitions automatically. When connecting to a metastore, partitions are defined as separate items and you can decide which of them should be imported to Data Catalog, as described in Connect to a Source.

Partitions have more options available for profiling. To learn more, see Run Profiling, section Profiling on partitions.

Create a source

To connect to a metastore:

  1. Navigate to Data Catalog > Sources.

  2. Select Create.

  3. Provide the following:

    • Name: The source name.

    • Description: A description of the source.

    • Deployment (Optional): Choose the deployment type.

      You can add new values if needed. See Lists of Values.

    • Stewardship: The source owner and roles. For more information, see Stewardship.

Alternatively, add a connection to an existing data source. See Connect to a Source.

Add a connection

  1. Select Add Connection.

  2. In Select connection type, choose Metastore > [your database type].

  3. Provide the following:

    • Name: A meaningful name for your connection. This is used to indicate the location of catalog items.

    • Description (Optional): A short description of the connection.

  4. Select Spark enabled. This improves how ONE works with the source during profiling and data quality tasks.

  5. In Additional settings, select Allow exporting and loading of Data if you want to export data from this connection and use it in ONE Data or outside of ONE.

    If you want to export data to this source, you also need to configure write credentials.
    Consider the security and privacy risks of allowing the export of data to other locations.

Add credentials

If Kerberos authentication is used, the credentials are read from the DPE configuration (dpe/etc/application.properties or the DPM Admin Console). No credentials can be specified through ONE.

In this case, all user operations are performed under the name of a single user. Which user that is depends on the value of the property plugin.metastoredatasource.ataccama.one.cluster.<clusterId>.impersonate:

  • If set to true, the name of the user who is currently logged in to ONE is used.

  • If set to false, the name of the user whose credentials are provided in the keytab file is used. The keytab file is referenced in the property plugin.metastoredatasource.ataccama.one.cluster.<clusterId>.kerberos.keytab. For more information, see Metastore Data Source Configuration.
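
The two properties named above could appear in dpe/etc/application.properties roughly as follows. This is only an illustrative sketch: the <clusterId> placeholder is kept as written and must be replaced with your own cluster identifier, and the keytab path is an assumed example value, not one taken from the product documentation.

```properties
# Illustrative fragment for dpe/etc/application.properties (values are assumptions).
# Replace <clusterId> with the identifier of your cluster.

# true  = operations run under the user currently logged in to ONE
# false = operations run under the user whose credentials are in the keytab file
plugin.metastoredatasource.ataccama.one.cluster.<clusterId>.impersonate=false

# Keytab file referenced when Kerberos authentication is used (hypothetical path)
plugin.metastoredatasource.ataccama.one.cluster.<clusterId>.kerberos.keytab=/path/to/user.keytab
```

For the full list of supported properties, see Metastore Data Source Configuration.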

If Kerberos authentication is not used, add credentials as follows:

  1. Select Add Credentials.

  2. Provide the following:

    • Name (Optional): A name for this set of credentials.

    • Description (Optional): A description for this set of credentials.

    • Username: The username for the data source.

    • Password: The password for the data source.

    • Token: The token for the data source.

      Used only when connecting to Databricks. In this case, username and password fields are not shown.

  3. If you want to use this set of credentials by default when connecting to the data source, select Set as default.

    One set of credentials must be set as default for each connection. Otherwise, monitoring and DQ evaluation fail, and previewing data in the catalog is not possible.

Add write credentials

Write credentials are required if you want to export data to this source.

To configure them, in Write credentials, select Add Credentials and follow the steps for your authentication method (see Add credentials).

Make sure to set one set of write credentials as default. Otherwise, this connection isn’t shown when configuring data export.

Test the connection

To verify that the data source connection is configured correctly, select Test Connection.

If the connection is successful, continue with the following step. Otherwise, verify that your configuration is correct and that the data source is running.

Save and publish

Once you have configured your connection, save and publish your changes. If you provided all the required information, the connection is now available for other users in the application.

If your configuration is missing required fields, a list of detected errors is shown instead. Review your configuration and resolve the issues before continuing.

Next steps

You can now browse and profile assets from your metastore connection.

In Data Catalog > Sources, find and open the source you just configured. Switch to the Connections tab and select Document. Alternatively, choose the Import or Discover documentation flow.

Or, to import or profile only some assets, select Browse on the Connections tab. Choose the assets you want to analyze and then the appropriate profiling option.