Import Data from a Catalog Item

Importing an existing catalog item from the Knowledge Catalog to ONE Data makes its data discoverable in the application. Once your data is loaded, you can enhance it by adding or removing attributes or modifying the table values.

You can import a catalog item as data with DQ results, as data only, or as deduplicated data.

This is especially useful in these cases:

  • You have a data source with ungoverned data that you can’t manage otherwise.

  • You need to quickly modify some data, namely invalid records, before exporting it back to the data source.

  • The catalog item contains data that can be used as reference data.

ONE Data is a type of data source in ONE and you can access the metadata of each ONE Data table in the Knowledge Catalog as well. These tables are labeled as ONE Data catalog items.

To navigate to ONE Data from the Knowledge Catalog, open the table in the Knowledge Catalog and go to Data > Open in ONE Data.

In the current version, we recommend working with datasets of up to 50,000 records for optimal performance.

Import from a catalog item

This option is not supported for catalog items with attributes of binary data type.
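The two constraints mentioned so far (the recommended dataset size and the unsupported binary data type) can be checked before you start an import. The following is a minimal sketch, not part of the product: the Attribute class, the preflight_warnings function, and the "binary" type label are hypothetical names chosen for illustration, and the attribute metadata is assumed to be available to you as plain Python objects.

```python
from dataclasses import dataclass

# Recommendation quoted from this article; treated here as a soft limit.
RECOMMENDED_MAX_RECORDS = 50_000


@dataclass
class Attribute:
    name: str
    data_type: str  # hypothetical type label such as "string" or "binary"


def preflight_warnings(record_count, attributes):
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    if record_count > RECOMMENDED_MAX_RECORDS:
        warnings.append(
            f"{record_count} records exceed the recommended {RECOMMENDED_MAX_RECORDS}"
        )
    # Collect attributes whose (assumed) type label marks them as binary.
    binary_names = [a.name for a in attributes if a.data_type.lower() == "binary"]
    if binary_names:
        warnings.append(
            "binary attributes are not supported: " + ", ".join(binary_names)
        )
    return warnings


attrs = [Attribute("customer_id", "integer"), Attribute("photo", "binary")]
for warning in preflight_warnings(60_000, attrs):
    print(warning)
```

In this sketch the size check only produces a warning, because the article states a recommendation rather than a hard limit.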

To start your import, follow these steps, then proceed with the section that matches your data load type: Import data only, Import data with DQ results, or Import deduplicated data. Importing DQ results first runs DQ evaluation on the whole catalog item.

  1. In ONE Data, use the dropdown next to Create table and select From Catalog Item.

    Create table from catalog item

    You can also import invalid records from an observed system in the Data Observability module. See Data Observability Dashboards.

    Export invalid records
  2. Find and select the catalog item that you want to use. Use full-text search and filters as needed.

    You can filter by applied or suggested glossary terms, DQ percentage, data source or location, number of catalog item attributes or records, last profiling date, catalog item owner, or detected anomalies. You can also view only published catalog items or only drafts.
    Import from a catalog item

    Alternatively, locate the catalog item in the Knowledge Catalog, use the three dots menu, and select Load to ONE Data.

    Load to ONE Data

  3. Once you select the catalog item, select the data load type:

    Data load options
    1. To import data with DQ results, select Full (data with DQ results). In this case, you cannot select which attributes are imported. DQ evaluation is performed on the whole catalog item before a new ONE Data table is created.

      1. Next, choose whether to import the whole dataset or only invalid records.

        • All records: Imports all available records and the latest DQ results.

        • Invalid records: Imports all records that failed any of the applied DQ rules.

          It’s also possible to load failed records to ONE Data from the catalog item Overview or Data Quality tabs. See Load failed records to ONE Data.
      2. After you have made your choice, select Next and continue with Import data with DQ results to finish creating your table.

    2. To import data without DQ results, select Data only and then Next. To finish creating your table, continue with Import data only.

    3. To deduplicate data, select Deduplicated data and then Next. To finish creating your table, continue with Import deduplicated data.

When loading deduplicated data, any terms and rules applied to data are also imported to ONE Data. This means that any DQ results on deduplicated attributes will be available in ONE Data.
If your catalog item contains attributes named dmm_record_id or dmm_rank, the import fails on validation as these keywords are reserved for technical attributes in ONE Data and each attribute name must be unique in a table.
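
To catch the reserved-name conflict before the import fails on validation, you can compare the attribute names against the two reserved keywords. This is a minimal sketch: the find_reserved_names function is hypothetical, and the comparison is assumed to be an exact, case-sensitive match, which the article does not specify.

```python
# Reserved technical attribute names listed in this article.
RESERVED_ATTRIBUTE_NAMES = {"dmm_record_id", "dmm_rank"}


def find_reserved_names(attribute_names):
    """Return the attribute names that collide with reserved keywords."""
    # Assumption: names are compared exactly (case-sensitive).
    return sorted(set(attribute_names) & RESERVED_ATTRIBUTE_NAMES)


conflicts = find_reserved_names(["customer_id", "dmm_rank", "email"])
if conflicts:
    print("Rename these attributes before importing:", ", ".join(conflicts))
```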

Import data with DQ results

  1. If you selected Full (data with DQ results) and then All records or Invalid records in the previous step, choose one of the two options:

    To prevent any potential issues with multi-attribute rules, you can’t select which attributes are imported in this case.
    If the catalog item has any component or aggregation rules applied, these are imported but without DQ results. Therefore, you can’t select them when filtering records by DQ.
    1. Create new ONE Data Table

      1. Enter a unique name for the table and optionally a description.

      2. In Stewardship, select the table owner and roles. For more information, see Stewardship. Otherwise, the stewardship configuration is inherited from the data source.

    2. Overwrite existing ONE Data table: In Target table, find and select which table you want to overwrite.

      This deletes all existing data from the selected table.
      This option is only available if you have previously created a ONE Data table with this catalog item.
  2. Select Create table. Depending on the size of your catalog item, it might take a few minutes to get everything ready. In addition, as this import option includes running DQ evaluation on the whole table, it typically takes longer compared to importing data only.

    To continue working with the platform in the meantime, select Run in background. This takes you to the newly created ONE Data catalog item in the Knowledge Catalog.

    Alternatively, remain on the same page until your ONE Data table is created. A notification lets you know when the import is finished.

    If your import fails, there is likely an issue with connecting to your data source. Check the error log in Processing Center for more details.
  3. Your table is now ready for use. See the Next steps section for more tips about how to proceed.

Import data only

  1. If you selected Data only in the previous step, you can now edit which attributes the new ONE Data table should contain.

    • To import all attributes: Select the checkbox in the column header.

    • To import a selection of attributes: Select the attributes individually. You can narrow down the list using the full-text search and sort by the attribute name, data type, comments, or description.

      Technical attributes, such as the record identifier in ONE Data tables (dmm_record_id), must be included in the import and cannot be cleared.
  2. Once you choose the attributes, select Next.

  3. Choose one of the two options:

    1. Create new ONE Data Table

      1. Enter a unique name for the table and optionally a description.

      2. In Stewardship, select the table owner and roles. For more information, see Stewardship. Otherwise, the stewardship configuration is inherited from the data source.

    2. Overwrite existing ONE Data table: In Target table, find and select which table you want to overwrite.

      This deletes all existing data from the selected table.
      This option is only available if you have previously created a ONE Data table with this catalog item.
  4. Select Create table. Depending on the size of your catalog item, it might take a few minutes to get everything ready.

    To continue working with the platform in the meantime, select Run in background. This takes you to the newly created ONE Data catalog item in the Knowledge Catalog.

    Alternatively, remain on the same page until your ONE Data table is created. A notification lets you know when the import is finished.

    If your import fails, there is likely an issue with connecting to your data source. Check the error log in Processing Center for more details.
  5. Your table is now ready for use. See the Next steps section for more tips about how to proceed.

Import deduplicated data

Before proceeding, get familiar with how deduplication is performed in ONE Data, as described in the How deduplication works? section.

  1. If you selected Deduplicated data in the previous step, you can now choose which attributes the new ONE Data table should contain. You need to select at least one attribute to proceed, but you can edit your selection in the following step. You can narrow down the list using the full-text search and sort by the attribute name, data type, comments, or description.

    Technical attributes, such as the record identifier in ONE Data tables (dmm_record_id), must be included in the import and cannot be cleared.
  2. Once you choose the attributes, select Next.

  3. Select the attribute or a combination of attributes based on which the data will be deduplicated. If needed, select Add or remove attributes to modify the chosen attributes.

    All attributes listed here will be included in the table while the selected ones (Key field) will be used as the deduplication key.
    Select deduplication key
  4. Once you’re happy with your choice, select Next.

  5. Enter a unique name for the table and optionally a description.

  6. Select Create table. Depending on the size of your catalog item, it might take a few minutes to get everything ready.

    To continue working with the platform in the meantime, select Run in background. This takes you to the newly created ONE Data catalog item in the Knowledge Catalog.

    Alternatively, remain on the same page until your ONE Data table is created. A notification lets you know when the import is finished.

    If your import fails, there is likely an issue with connecting to your data source. Check the error log in Processing Center for more details.
  7. Your table is now ready for use. See the Next steps section for more tips about how to proceed.

You can also deduplicate data from the following locations:

  • From the ONE Data table detail screen. On the Data tab, select one or more attributes, then choose Create reference table in the banner that appears.

    Create reference table from ONE Data
  • From the catalog item Overview, Profile & DQ Insights, or Data tabs. Expand the three dots menu for a particular attribute and select Create reference table.

    Create reference table from attribute
  • From the catalog item detail screen. Expand the three dots menu for a catalog item and select Load to ONE Data, then follow the steps described in this article.

How deduplication works?

When deduplicating data in ONE, records from a dataset are grouped based on one or several attributes that you define as the deduplication key. Once the table is created, it contains the first occurrence of each record from the original dataset.

If the deduplication key consists of a single attribute, any records where that attribute is null are excluded from the table. If the key is composed of multiple attributes, records are included as long as at least one of the key attributes is not null.

The table can also include any other attributes that help you better describe your data. In this case, non-empty records are prioritized over empty ones for every additional attribute.

For example, if the original dataset contains the following records and the grouping attribute is Country code, the deduplicated data will only retain the second row.

Country code (deduplication key) | Country name   | Note
CZE                              | (empty)        | The additional attribute is empty, the record is not included in the table.
CZE                              | Czech Republic | The preferred record where the additional attribute is not empty.

In addition, an attribute called frequency is added to the new table by default, which stores the number of occurrences for each record.

Use this attribute to identify outliers and determine which values should be removed. If you don’t need it, you can also delete it once the table is created.
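
The rules described in this section can be sketched in a few lines of Python. This is an illustration of the described behavior under stated assumptions, not the actual ONE Data implementation: the deduplicate and is_empty functions are hypothetical, records are assumed to be plain dictionaries, and both None and an empty string are treated as empty.

```python
def is_empty(value):
    # Assumption: None and "" both count as empty.
    return value is None or value == ""


def deduplicate(records, key_fields, extra_fields=()):
    """Group records by the deduplication key and count occurrences."""
    groups = {}  # dicts preserve insertion order, so first occurrences come first
    for record in records:
        key = tuple(record.get(f) for f in key_fields)
        # Skip records whose key is entirely empty. With a single key attribute
        # this drops null keys; with several, at least one part must be set.
        if all(is_empty(part) for part in key):
            continue
        if key not in groups:
            row = {f: record.get(f) for f in key_fields}
            row.update({f: None for f in extra_fields})
            row["frequency"] = 0
            groups[key] = row
        row = groups[key]
        row["frequency"] += 1
        # Prefer the first non-empty value for each additional attribute.
        for f in extra_fields:
            if is_empty(row[f]) and not is_empty(record.get(f)):
                row[f] = record[f]
    return list(groups.values())


rows = [
    {"country_code": "CZE", "country_name": None},
    {"country_code": "CZE", "country_name": "Czech Republic"},
    {"country_code": None, "country_name": "Unknown"},
]
print(deduplicate(rows, ["country_code"], ["country_name"]))
# [{'country_code': 'CZE', 'country_name': 'Czech Republic', 'frequency': 2}]
```

The last input row is dropped because its single-attribute key is null, and the frequency value of 2 reflects the two CZE records in the input.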

Load failed records to ONE Data

It’s also possible to load failed records to ONE Data directly from the catalog item’s Overview or Data Quality tabs.

  1. Locate the required catalog item and:

    • In the Overview tab, locate the Data Quality widget, use the three dots menu, and select Load failed records to ONE Data.

      Load failed records
    • In the Data Quality tab, select Load failed records to ONE Data.

      Load failed records
      This option is only available when you are viewing the Latest results.
  2. Choose one of the two options:

    1. Create new ONE Data Table

      1. Enter a unique name for the table and optionally a description.

      2. In Stewardship, select the table owner and roles. For more information, see Stewardship. Otherwise, the stewardship configuration is inherited from the data source.

    2. Overwrite existing ONE Data table: In Target table, find and select which table you want to overwrite.

      This deletes all existing data from the selected table.
      This option is only available if you have previously created a ONE Data table with this catalog item.
  3. Select Start loading records to ONE Data. Depending on the size of your catalog item, it might take a few minutes to get everything ready. In addition, as this import option includes running DQ evaluation on the whole table, it typically takes longer compared to importing data only.

    To continue working with the platform in the meantime, select Run in background. This takes you to the newly created ONE Data catalog item in the Knowledge Catalog.

    Alternatively, remain on the same page until your ONE Data table is created. A notification lets you know when the import is finished.

    If your import fails, there is likely an issue with connecting to your data source. Check the error log in Processing Center for more details.
  4. Your table is now ready for use. See the Next steps section for more tips about how to proceed.
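
Conceptually, a failed (invalid) record is one that fails any of the applied DQ rules. The following minimal sketch shows that selection logic only. The rule names, the rule functions, and the select_invalid helper are hypothetical; real DQ rules are defined and evaluated in ONE, not in Python.

```python
def failed_rule_names(record, rules):
    """Return the names of all rules that the record does not pass."""
    return [name for name, passes in rules.items() if not passes(record)]


def select_invalid(records, rules):
    """Keep only records that fail at least one rule."""
    return [r for r in records if failed_rule_names(r, rules)]


# Hypothetical rules for illustration only.
rules = {
    "email_has_at_sign": lambda r: "@" in (r.get("email") or ""),
    "age_not_negative": lambda r: (r.get("age") or 0) >= 0,
}
records = [
    {"id": 1, "email": "ann@example.com", "age": 31},
    {"id": 2, "email": "bob.example.com", "age": 45},
    {"id": 3, "email": "eve@example.com", "age": -1},
]
invalid = select_invalid(records, rules)
print([r["id"] for r in invalid])  # [2, 3]
```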

Tips for filtering attributes

  • Try searching for integer or string to filter attributes by their data type.

  • Unsure about what data you’re working with? Switch to the Data Preview tab to see a live preview of the top 50 records in the catalog item. You can also select attributes directly from this tab.

    The preview isn’t available for virtual catalog items.

  • If you have multiple attributes with similar data and the quality overview on the Attributes tab isn’t sufficient, check out their detailed profiling and DQ results on the Profile & DQ Insights tab. You can both filter and select attributes from here, so there’s no need to switch back and forth between tabs.

    This information is available only for data that has been profiled.

    Select attributes
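
The first tip works because a search term such as integer matches the data type of an attribute as well as its name. The sketch below mimics that behavior on a local list of attributes. It is an assumption that the search covers exactly the name, data type, comments, and description fields and that matching is case-insensitive; the search_attributes function and the dictionary layout are hypothetical.

```python
# Assumption: these are the fields covered by the full-text search.
SEARCHED_FIELDS = ("name", "data_type", "comments", "description")


def search_attributes(attributes, term):
    """Return attributes where any searched field contains the term."""
    needle = term.casefold()
    return [
        a
        for a in attributes
        if any(needle in (a.get(field) or "").casefold() for field in SEARCHED_FIELDS)
    ]


attributes = [
    {"name": "customer_id", "data_type": "integer", "description": "Primary key"},
    {"name": "email", "data_type": "string", "comments": "Contact address"},
    {"name": "age", "data_type": "integer"},
]
print([a["name"] for a in search_attributes(attributes, "integer")])
# ['customer_id', 'age']
```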

Next steps

Once you have successfully created a ONE Data table, start exploring what ONE Data and ONE can offer you.
