Apache Parquet

Apache Parquet assets are supported by the following data sources:

  • Amazon S3

    In versions prior to 14.5.1, only the AWS access key authentication type is supported.
  • ADLS Gen2

There is no limit to the size of the Parquet asset you can import. However, only individual Parquet files up to 6 GB can be profiled.

How Apache Parquet assets are imported

You can import the following Apache Parquet assets into ONE:

  • Parquet files

  • Parquet tables

  • Partitioned Parquet tables

When a Parquet file is imported, a catalog item is created with the attributes from the file.

However, when Parquet assets other than files are imported, ONE first analyses the asset and then creates a catalog item with attributes based on it. For example, a Parquet folder that contains a partitioned table creates a catalog item with the attributes from the Parquet files plus the partition columns.

A Parquet folder that contains a Parquet table creates a catalog item with the attributes from the Parquet table.
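
To illustrate the partitioned case described above, the following is a minimal sketch of how attributes from the files and partition columns could combine into one attribute list. It assumes Hive-style `column=value` folder names, which this page does not specify, and the function names `partition_columns` and `catalog_attributes` are hypothetical. It is not how ONE implements the import.

```python
from pathlib import PurePosixPath


def partition_columns(file_key: str, table_root: str) -> dict[str, str]:
    """Return partition column values encoded in a file's path.

    Assumption: folders between the table root and the Parquet file
    follow the Hive-style `column=value` naming convention.
    """
    relative = PurePosixPath(file_key).relative_to(PurePosixPath(table_root))
    columns = {}
    # Every path part except the last one (the file name) is a folder.
    for part in relative.parts[:-1]:
        if "=" in part:
            name, value = part.split("=", 1)
            columns[name] = value
    return columns


def catalog_attributes(
    file_attributes: list[str], file_key: str, table_root: str
) -> list[str]:
    """Combine attributes stored in the files with the partition columns."""
    extra = [
        name
        for name in partition_columns(file_key, table_root)
        if name not in file_attributes
    ]
    return file_attributes + extra


# Hypothetical example: a table partitioned by year and month.
print(
    catalog_attributes(
        ["id", "amount"],
        "sales/year=2023/month=07/part-0000.parquet",
        "sales",
    )
)
# → ['id', 'amount', 'year', 'month']
```

For a plain Parquet table with no `column=value` folders, the same function returns only the attributes from the files.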

Profiling

Profiling can only process individual Parquet files up to 6 GB.

  • For individual Parquet files, only full profiling is supported.

  • Both sample and full profiling are supported for Parquet tables and partitioned Parquet tables. However, sample profiling only runs on the first Parquet file found in the folder.
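
Because the 6 GB limit applies to each individual file, it can help to check a Parquet folder before profiling. The following is a minimal sketch, assuming the files are available on a local or mounted file system (for Amazon S3 or ADLS Gen2 you would list object sizes with the storage service's own tooling instead). The `files_over_limit` helper is hypothetical, and reading "6 GB" as 6 × 1024³ bytes is an assumption.

```python
from pathlib import Path

# Assumption: "6 GB" is read as 6 * 1024**3 bytes; the documentation
# does not say whether the limit is binary or decimal.
PROFILING_LIMIT_BYTES = 6 * 1024**3


def files_over_limit(
    folder: str, limit: int = PROFILING_LIMIT_BYTES
) -> list[tuple[str, int]]:
    """List Parquet files under `folder` whose size exceeds `limit` bytes."""
    oversized = []
    # rglob also descends into partition subfolders.
    for path in sorted(Path(folder).rglob("*.parquet")):
        size = path.stat().st_size
        if size > limit:
            oversized.append((str(path), size))
    return oversized
```

An empty result means that every `.parquet` file under the folder is within the assumed limit.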

Browsing

You can preview the data only for small Parquet files. The maximum file size can be configured using the property ataccama.one.parquet.preview-maximum-size in dpe/etc/application.properties.

The default value is 5 MB.
