Deployment Preconditions
This article describes the technical and organizational prerequisites necessary for automated, self-hosted deployment of the Ataccama ONE Platform. As such, the guide is intended for technical teams in charge of preparing the customer infrastructure (such as physical or virtual servers, networking) where the platform will be installed.
This document covers only Ataccama ONE installations on customer-provided infrastructure (physical or virtual). To learn about our cloud offering, see Hybrid Deployment Architecture and Ataccama Cloud.
Before you start
Take a moment to get familiar with some of the terms that are often used throughout this article.
- Ansible: An open-source tool for server orchestration and configuration that is used to install and configure the Ataccama software, as well as to perform maintenance tasks during the product lifecycle (for example, license or certificate rotation).
- Ansible controller: The computer that runs the Ansible automated installation package.
- Must, must not, required: An absolute requirement. Not adhering to this requirement is not supported.
- Operator: The person responsible for running the installation. This is typically a member of the Ataccama Professional Services team or a customer’s employee.
- Platform: The Ataccama ONE Platform, consisting of a number of applications and modules. In this document, platform designates all components included in Ataccama ONE (regardless of the specific components your installation includes), with the exception of the virtual (or physical) servers, their OS, and connectivity artifacts.
- Recommended, not recommended: Not adhering to a recommendation can result in longer installation time or make it more difficult to apply future upgrades or perform maintenance tasks. Specifics are provided with every recommendation.
- Remote desktop: Any desktop or screen sharing technology, for example, through Zoom. Not limited to the Remote Desktop Protocol (RDP).
- Target servers, targets: The servers where the Ataccama ONE Platform is installed. Depending on the installation size, there are usually 3-12 target servers.
Deployment process
The automated deployment process deploys a customized Ataccama ONE Platform installation on a customer-provided infrastructure. The installation includes monitoring, internal databases, and all other components needed so that the platform can function. Once installed, the platform can then be used to process customer data.
The installation itself (for example, copying Ataccama software packages and configuration) is handled by Ansible, an open-source tool for deployment automation. You receive a software package for installation that needs to be configured according to the specifics of your organization’s infrastructure (for instance, it is necessary to provide server names, select optional components, and so on). The necessary changes can be made either by the Ataccama Professional Services team or by you, following the documentation included in the package. Once it is configured, Ansible is run on a single machine, called the Ansible controller.
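For orientation, the run itself is a standard Ansible invocation along these lines. The playbook and inventory file names below are hypothetical; the actual names are defined by the installation package you receive.

```shell
# Hypothetical invocation of the automated installation from the controller.
# Replace the inventory and playbook names with those shipped in your package.
ansible-playbook -i inventory/hosts.yml ataccama-one.yml
```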
The whole deployment process can be broken down into the following steps:
1. Preparation
   - Select optional components.
   - Set up the target systems where the platform will be installed. This includes setting up and configuring:
     - The operating system.
     - Ansible access rights. Ansible requires full administrative privileges.
     - Storage.
     - Backups.
   - Set up the network access needed for the following:
     - The installation process.
     - User access, to allow employees to use the system after the installation.
     - Data source access, to allow the platform to access customer data.
   - Set up the wider infrastructure, including the following:
     - DNS: Used for user access and communication between servers.
     - TLS certificates: Mandatory for user access.
2. Readiness confirmation
   - The customer and Ataccama check the provided infrastructure for completeness, solve all remaining issues, and verify that sufficient access rights have been granted for the installation.
3. Installation
   - Run the installation package.
4. Verification
   - Access the user interface and the monitoring system to check that the platform functions as expected. Fix any remaining issues. Perform a demo of the new environment.
5. Handover
   - The platform is now ready for use.
Target state
The expected outcome of the deployment process is a complete, ready-to-use installation of the Ataccama ONE Platform that meets the following requirements:
- It contains all selected optional components (see Ataccama ONE Platform components). Currently, the minimal installation consists only of Ataccama ONE (that is, the Data Governance suite or the Data Quality and Governance suite) and the observability stack (logging and monitoring tools). We call this a standalone installation.
- It runs on the customer-provided hostname.
- It is connected to at least one customer data source.
- It can be accessed through the admin account.
The monitoring solution includes the following:

- A Prometheus monitoring server with preconfigured alerts.
- A Grafana server with preconfigured dashboards displaying system performance data.
- An OpenSearch Dashboards log visualizer showing logs of all Ataccama components.

In addition, the following must be set up:

- Backups, which are handled by the customer.
- Firewalls, which need to be configured on every target server using iptables or a similar technology. Access can be allowed only to the services configured during the installation. See Firewall (Linux iptables).
Ataccama ONE Platform components
The following components can be deployed as part of the standard automated deployment:
| Component | Mandatory | Description |
|---|---|---|
| Ataccama ONE | Yes | Data Governance and Data Quality and Governance product suites |
| Monitoring stack | Yes | Prometheus, Grafana, OpenSearch Dashboards |
| MDM | No | Master Data Management |
| RDM | No | Reference Data Management |
| DQIT | No | Data Quality Issue Tracker |
| MANTA | Not supported | Third-party tool used for data lineage (installed by a separate installation package) |
Shared operating system components are installed using the latest stable version as defined by the operating system distribution standards. The deployed version of Ataccama ONE components is agreed with the customer. The third-party dependencies used in the Ataccama ONE Platform are defined in the Ansible installation package.
In addition to the standard deployment option, there are two more use cases: installing the standalone ONE MDM suite without Ataccama ONE (MDM or RDM variant) and installing DPE in a hybrid deployment.
Standalone installations (MDM, RDM, hybrid DPE)
In standalone installations, only a single Ataccama product is deployed along with the necessary third-party software and the monitoring stack.
ONE MDM suite
The Ataccama ONE MDM suite comes in two variants, MDM or RDM, which can only be installed in separate environments. In this case, installation sizing requirements are reduced according to customer needs, typically to one or two server instances and an orchestrator. Ataccama ONE is not deployed in this case.
If you need to use MDM and RDM in the same environment, the only supported option is to install the entire Ataccama ONE Platform (consisting of ONE, MDM, and RDM).
In standard deployment options, the following key components are installed using Ansible:
- Keycloak (identity and access management tool)
- The module database (can be cloud-based)
- Monitoring stack
Preconditions
The following sections provide detailed information about how each server must be prepared before starting the installation.
Infrastructure environment
Ansible is designed to work with on-premise physical infrastructure or other compatible environments, such as Amazon Elastic Compute Cloud (EC2) instances or Azure Virtual Machines (VM). As Ansible cannot deploy cloud instances and resources, creating and managing target servers is the responsibility of the customer. It is also possible to use servers from different providers as long as they are connected by a network (for example, the infrastructure can include some on-premise physical servers and some virtual ones in the cloud).
In AWS and Azure environments, instead of deploying PostgreSQL to a virtual server, it is possible to use the cloud provider’s managed relational database service (Amazon Relational Database Service or Azure Database for PostgreSQL respectively). In this case, the customer needs to create the PostgreSQL instance in AWS or Azure before starting the Ansible deployment.
Other cloud-managed technologies, such as OpenSearch, are not supported and they can only be installed on virtual servers using Ansible.
The platform is flexible in terms of components and server sizing. More information is available on request.
| Customer-managed infrastructure | Supported |
|---|---|
| Physical servers on premises | Yes |
| Virtual servers on premises | Yes |
| Virtual servers in Amazon Web Services (AWS) cloud | Yes |
| Virtual servers in Microsoft Azure cloud | Yes |
| Other cloud providers | Contact us for more information |
Supported operating systems
The Ansible controller as well as the target servers must use one of the supported operating systems. We do not recommend using different Linux distributions on different target servers in a single installation. However, you can use different distributions for the Ansible controller and the target servers.
The Ansible installation is developed to be used on a standard installation of the selected OS. Therefore, using modified, hardened, or derived OS is not supported.

Only the OS mentioned here are supported, in the versions listed.
| OS | Supported |
|---|---|
| RHEL 9 | Yes |
| Ubuntu 22.04 | Yes |
Ansible dependencies
Ansible depends on the following packages, which must be available on the Ansible controller:

- cURL
- Git
- Python 3 pip
- unzip
- Python 3 venv (Ubuntu only)
To be able to run Ansible scripts, Ansible binaries must be installed as well. The instructions about how to install them are distributed with the automated deployment package.
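As an illustration, the dependencies listed above could be installed on Ubuntu 22.04 as follows. This is a sketch under the assumption of a standard Ubuntu setup; on RHEL 9, use dnf with the equivalent package names (the venv package is Ubuntu-only).

```shell
# Minimal sketch: install the Ansible controller dependencies on Ubuntu 22.04.
sudo apt-get update
sudo apt-get install -y curl git python3-pip unzip python3-venv
```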
Security modules
As the Ataccama ONE Platform needs access to different directories and ports, using SELinux, AppArmor, and similar technologies is not supported and must be disabled. The specific list of directories and ports depends on customer needs and might differ with every installation.
| Security module or Mandatory Access Control (MAC) system | Supported |
|---|---|
| Any | No |
Additional OS monitoring tools
You can use third-party monitoring tools provided that they do not collide with the ports reserved for the Ataccama ONE Platform. For more information, see Ports.
Server access
The following section describes how to configure access to and from the controller and the target servers.
Operator access to servers
An operator is a person responsible for installing or maintaining the Ataccama ONE Platform on customer premises. This could be a customer’s employee, a contractor, or an external consultant.
To install or debug the platform, the operator must be able to access all target servers as a superuser (that is, root) from the Ansible controller using SSH access. SSH access is not needed for communication between target servers.
We recommend keeping the SSH access enabled after the installation is finished as it is used for upgrades, optimization, and debugging. |
As shown in the following table, we recommend authenticating using a public key instead of a password, which is the less secure option. To ensure a smoother process, we do not recommend enabling multifactor authentication during the installation.
| Access user | Public key authentication | Password authentication |
|---|---|---|
| Ordinary user with passwordless sudo | Supported, preferred | Supported |
| Root | Supported, preferred | Supported |
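For the preferred public key setup, a key pair can be generated on the controller and its public half distributed to the targets. The sketch below is illustrative only; the key path, comment, and target hostname are placeholders, and the empty passphrase is used here purely for brevity.

```shell
# Generate an Ed25519 key pair on the Ansible controller.
# An empty passphrase is shown for illustration only; follow your security policy.
ssh-keygen -t ed25519 -f "$HOME/.ssh/ataccama_deploy" -N '' -C 'ataccama-operator'

# Then install the public key on every target server, for example:
#   ssh-copy-id -i "$HOME/.ssh/ataccama_deploy.pub" root@target01.example.com
```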
Internet access
Installing Ataccama ONE requires access to the internet both during and after the installation, although in varying degrees.
During installation
During the installation, both the controller and the target servers require access to external servers for the following reasons:

- Controller:
  - To install Ansible and its dependencies (requires external Python (pip) repositories).
- Target servers:
  - To access distribution repositories for third-party dependencies of the Ataccama ONE Platform, such as Extra Packages for Enterprise Linux (EPEL), the PostgreSQL repository, and others.
  - To download third-party open-source binaries not distributed in installation packages, such as Keycloak, Prometheus and its exporters, and others.
  - To download Ataccama product packages from the Amazon S3 repository.

You can download all the necessary Ataccama packages in advance and upload them to the Ansible controller. However, the installation then needs to be reconfigured to use the local packages, which makes the process longer.
Therefore, internet access must be provided either directly or using a dedicated proxy server. Using a proxy server temporarily during the installation can be considered as a compromise between no internet access and full internet access, as the requests that go through the proxy can be logged and audited by the customer security team after the installation. Installations without any internet access are not supported.
| Access type | Supported |
|---|---|
| Direct connection | Yes |
| Proxy server | Yes |
| Offline installation | No |
After installation
Following the installation, target servers need reasonably well synchronized clocks for, among other things, proper certificate and token expiry validation. It is possible to use an internal NTP server or the public NTP pool.
Even if the platform is deployed using the internal NTP, long-term operations without access to the internet are currently not supported.
| Service | Mandatory |
|---|---|
| Network Time Protocol (NTP) | Yes (can be internal) |
Data source access
The Ataccama ONE Platform requires access to the data it works with. A direct TCP/IP connection (no proxy) must be established between the relevant target servers and the data sources that they use.
The following components are responsible for data processing:
- Data Processing Engine (DPE): Every server running an instance of DPE must allow connections to the required data sources.
- MDM, RDM, DQIT: Engine servers must have a direct connection to the required data sources.
| Access type | Supported |
|---|---|
| Direct connection | Yes |
| Proxy server | No |
Ataccama ONE database
The platform internally uses multiple PostgreSQL databases to store various data and metadata. The databases are automatically installed or allocated in a cloud, but backups fall under the customer’s responsibility.
| Database | Automatic installation | Backups handled by customer |
|---|---|---|
| PostgreSQL on premises | Yes | Yes |
| PostgreSQL allocated in the cloud | Yes | Yes |
Customer data sources
The Ataccama ONE Platform connects to a number of data sources to process data, including relational databases such as PostgreSQL, Oracle, MSSQL, MySQL, data warehouses such as Amazon RDS and Teradata, big data sources such as Spark or Apache Cassandra, and many others. For more information, see Supported Data Sources.
Data sources are not managed by Ataccama in any way and the customer is responsible for managing access and making them reliable, which includes monitoring, backups, and other maintenance.
| Data sources | Backups handled by customer |
|---|---|
| For a full list of data sources supported out-of-the-box, see Supported Data Sources. | Yes |
Ansible controller
The Ansible controller can be a single machine, with the following options supported, in order of preference:
- A small, dedicated server.
- The operator’s personal computer, accessed locally.
- One of the target servers, usually the monitoring server.
- Another personal computer accessed through a remote desktop connection. This option is not recommended due to its complexity, low flexibility, and slow speed of work.
NOTE: No graphical environment is required for the controller, only a text terminal.
The operator must be able to access the controller using SSH or run a console emulator on the controller. Access by VPN is supported, with preferred options being VPN clients that can be installed on Linux, Windows, and OS X platforms.
We recommend using a dedicated server for orchestration since it will keep all the relevant configuration for later upgrades, debugging, and other purposes. This server can be turned off when it is not needed, that is, after the installation.
| Controller | Supported | Recommended |
|---|---|---|
| Dedicated server | Yes | Yes |
| Operator’s PC | Yes | Yes |
| Target server | Yes | Yes |
| Other PC accessed through VPN | Yes | No |
Networking
The following sections outline how networking must be set up.
Subnetting
When configuring the network, take into account the following:
- Keep track of how Ataccama components are interconnected. Target servers must be able to access each other directly using TCP. We recommend that all servers share a single subnet, but this is not required.
- When it comes to server placement, keep in mind that higher network latency between Ataccama ONE components, or between Ataccama ONE components and data sources, significantly impacts performance.
- Ataccama licenses are tied to the server IP addresses. This limitation will cause issues if the servers are migrated to another network. Therefore, we strongly recommend considering any long-term plans when mapping out the network topology.
| Network topology | Supported |
|---|---|
| The whole platform within a single subnet | Yes |
| Multiple subnets | Yes, provided that the connection from and to every server is possible |
| Multiple networks | Yes, provided that the connection from and to every server is possible |
Firewall (Linux iptables)
On target servers
As part of the Ansible installation process, a Linux firewall (iptables) can be automatically configured if the option `firewall_manage` is set to `true`. In this case, a firewall for each component is installed as an Ansible job immediately after the component installation.
The firewall is configured to allow access from specific sources to:

- SSH (for management, upgrades, and debugging, from any IP).
- TCP ports of installed services (by default, access is limited to specific sources).
- HTTP and HTTPS ports of the frontend server (access allowed from any IP).
- All ICMP services (required for correct TCP functioning).

Packets of all other protocols and to all other ports are discarded (dropped). You can also open additional ports, with per-host granularity and filtering of connection sources.
The following requirements must be fulfilled on the customer side:

- SSL certificates are prepared in advance.
- A DNS zone is prepared in Azure with a public IP pointing to the web server.
- There are records for NGINX in the `/etc/hosts` file on every node. These should point to the private IP of the VM.
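For illustration, such a record could be appended on each node as follows. The IP address and hostnames below are placeholders; use the private IP of your NGINX VM and your actual domain.

```shell
# Append the NGINX (reverse proxy) record to /etc/hosts on a node.
# 10.0.10.5 stands in for the private IP of the NGINX VM.
echo '10.0.10.5  one.example.com monitoring.example.com' | sudo tee -a /etc/hosts
```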
Currently, customer-configured firewalls are not supported, and it is necessary to allow Ansible to configure and manage an iptables-based firewall. Ansible can remove previously installed UFW and firewalld; however, if any other unsupported firewall manager is used, the customer must remove it before starting the installation process.
| Firewall configuration | Autoconfigured (`firewall_manage: true`) | Customizable |
|---|---|---|
| Access to installed services | Yes | Yes, by opening additional ports |
| Additional open ports | Rules are manually added to the Ansible inventory and then automatically deployed | Yes |
| Firewall technology | Supported |
|---|---|
| Plain iptables | Yes |
| UFW | No |
| firewalld | No |
| Other | No |
On the network edge
The following network connections must be allowed:
- Outgoing connections to all relevant data sources.
- Incoming connections from users to the frontend server (ports 80 and 443).
- Outgoing connections to the NTP servers.
All other connections can be restricted.
| Purpose | Port | Direction |
|---|---|---|
| Data source | Depends on the data source | Outgoing |
| Frontend | 80, 443 | Incoming |
DNS
Internal (server-to-server)
Every target server must have a hostname that is resolvable by all other targets, including the Ansible controller. Using IP addresses to identify hosts is not supported.
| Target identification | Supported |
|---|---|
| DNS name | Yes |
| IP address | No |
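A quick way to confirm resolvability from the controller is a loop over the target hostnames; the names below are placeholders for your actual targets.

```shell
# Check that every target hostname resolves. getent uses the same resolver
# order (hosts file, then DNS) that the system itself uses.
for host in target01.example.com target02.example.com; do
  if getent hosts "$host" >/dev/null; then
    echo "$host: resolves"
  else
    echo "$host: DOES NOT resolve"
  fi
done
```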
External (user access)
Users access the server by connecting to the frontend. Individual services are recognized by their hostnames. All hostnames must belong to a single domain (for example, `one.domain.com`). These hostnames must be resolvable by all target servers and must point to the frontend server.
| Frontend service identification | Supported |
|---|---|
| DNS names belonging to a single domain | Yes |
| DNS names belonging to multiple domains | No |
| IP addresses | No |
External domain list
All domain names belonging to Ataccama ONE components must be made available as described in the previous sections.
- Ataccama ONE DNS names:
  - `one.<customer_domain>`
  - `dpm.<customer_domain>`
  - `dpm-grpc.<customer_domain>`
  - `dqf.<customer_domain>`
  - `dqf-grpc.<customer_domain>`
  - `mde.<customer_domain>`
  - `mmm-grpc.<customer_domain>`
  - `orch-console.<customer_domain>`
  - `audit.<customer_domain>`
- DQIT DNS names:
  - `dqit.<customer_domain>`
  - `dqit-console.<customer_domain>`
- RDM DNS names:
  - `rdm.<customer_domain>`
  - `rdm-console.<customer_domain>`
- MDM DNS names:
  - `console.<customer_domain>`
  - `mda.<customer_domain>`
  - `mda-console.<customer_domain>`
  - `mdm-grpc.<customer_domain>`
- Dependencies DNS names:
  - `minio.<customer_domain>`
  - `minio-grpc.<customer_domain>`
  - `minio-ui.<customer_domain>`
- Logging and monitoring DNS names:
  - `kibana.<customer_domain>` (the DNS name has not been updated after changing the dependency, due to backward compatibility)
  - `grafana.<customer_domain>`
  - `monitoring.<customer_domain>`
  - `prometheus.<customer_domain>`
  - `alertmanager.<customer_domain>`
- Orchestration DNS names (optional):
  - `orch.<customer_domain>`
As Ataccama ONE exclusively uses hostnames of its components (targets) for mutual communication, all relevant DNS records, as listed in this section, must be configured in the customer’s DNS. You can use A or CNAME records.
Since NGINX is used as a reverse proxy for the Ataccama ONE application infrastructure, the DNS records should point to the IP address of the reverse proxy server (NGINX).
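As a convenience, the per-product record lists above can be expanded for a concrete domain with a short loop. The sketch below covers only the Ataccama ONE names; substitute your own customer domain for example.com and extend the list for the other products you install.

```shell
DOMAIN="example.com"
# Expand the Ataccama ONE hostnames for the chosen customer domain.
for name in one dpm dpm-grpc dqf dqf-grpc mde mmm-grpc orch-console audit; do
  printf '%s.%s\n' "$name" "$DOMAIN"
done
```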
TLS certificates
User access to the frontend is secured by TLS and it is mandatory to use valid TLS certificates containing relevant hostnames. The customer must provide certificates in their full form for all required external hostnames (see the table).
The supported configurations are as follows:
- A single wildcard certificate for all domains. Recommended for simplicity.
- Separate certificates, one for each domain.
- A single certificate using Subject Alternative Names (SAN) to list all required domains.
We strongly recommend keeping track of the certificate expiration dates. If a certificate expires, the application could malfunction. After rotating any certificates, all components must be restarted to ensure the new certificates are used.
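One way to keep an eye on this is to inspect each provided certificate with openssl before (and after) the installation. The certificate file name below is a placeholder.

```shell
CERT=fullchain.pem  # placeholder: path to a provided certificate
# Print the subject, expiry date, and the hostnames the certificate covers.
openssl x509 -in "$CERT" -noout -subject -enddate -ext subjectAltName
```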
All certificates, including their public, intermediate (CA), and private elements, must be shared with the operator through a secure channel before the installation.
| Certificate | Supported | Recommended |
|---|---|---|
| Single, wildcard | Yes | Yes (best option) |
| Single, SAN listing for all names | Yes | Yes |
| One certificate per hostname | Yes | No |
| Other | No | No |
License files
Ataccama provides license files for all the components you want to install. Before the installation, make sure to request these files and provide them to the operator.
Access to S3
Starting from version 13.8.0, authentication is required for downloading MMM basic content, which must be available for the application to start.
The access and secret keys are set in the `mmm_content_pack_s3_repositories` variable in the Ansible inventory (for more information, see the `defaults/_vars.yml` file in the installation package).
Make sure to provide the necessary details to the operator.
Storage
Platform services are installed in module home directories, which are by default subdirectories of `/opt`. This can be changed for every module. In addition to module executables and libraries, these directories keep the module data, logs, and temporary files. Therefore, make sure there is sufficient disk space in `/opt` (or the new home directory, if you are not using the default settings).
Every platform service runs under its own user. If needed, you can change the username for each Ataccama ONE service.
The third-party software deployed as part of the Ataccama ONE installation uses the directories mandated by the Filesystem Hierarchy Standard (FHS), such as `/var/lib/postgresql` or `/var/lib/opensearch`. These directories must also have enough free space available.
Space requirements for specific directories are provided in the following sections.
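Before the installation, it is worth confirming the free space on the relevant mounts, for example as below; add any non-default module home directories you have configured.

```shell
# Show free space on the mounts used by the platform and its dependencies.
df -h /opt /var/lib /tmp
```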
Common directories
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| Root | | 8 GB | Main OS partition. Stores binaries, libraries, and configuration files. |
| Log files | | 5 GB | Log files of non-Ataccama services |
| | | 5 GB | |
| Install directory | | 5 GB | Used for Ataccama ONE and third-party software installation. All applications are installed in their own location (see the following sections). |
| Prometheus exporters | | See | Prometheus exporters |
| Fluentbit | | See | Fluentbit log shipper |
| Temp directory | | 5 GB | Temporary files, working directory of ONE services. Keep in mind that some servers have specific requirements. |
Application server
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| Ataccama | | | Ataccama ONE |
| DQIT | | Included in the default | DQIT |
Database server
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| PostgreSQL | | 128 GB | PostgreSQL data. PostgreSQL is used as an internal database of several ONE services. |
Dependency server
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| OpenSearch | | 50 GB | OpenSearch (for Ataccama ONE) data files (search indices). Used as storage. |
| Keycloak | | Included in the default | Keycloak identity provider and SSO |
| MinIO | | | ONE Object Storage |
RDM
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| RDM | | 75 GB | ONE RDM |
Monitoring
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| OpenSearch | | 30 GB | OpenSearch (for the logging stack) data files (search indices). Used as storage. |
| Prometheus | | 20 GB | Prometheus metrics storage. The default retention period is 30 weeks. |
Orchestration
| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| Ataccama | | Included in the default | Ataccama orchestration server (application install directory). |
Processing
All recommended sizes should be considered as starting values. All filesystems must be set up in a way that accommodates increasing space requirements as needed.

| Name | Mount or directory | Recommended size | Description |
|---|---|---|---|
| Ataccama | | Included in the default | DPE module. Application install directory. |
| Ataccama | | Included in the default | Example of DPE filesystem. The exact directory will differ based on your configuration. |
| Ataccama | | 100 GB with the ability to be expanded as needed (disk size per processing node) | DPE data files. |
SMTP email server
If configured, Ataccama ONE can send email notifications using a customer-provided SMTP server (MDA). The connection to this server is possible through a secured SMTP protocol.
We do not support webhooks or other types of connection for sending emails.
| SMTP server configuration | Supported |
|---|---|
| Managed by Ansible | Yes |
| STARTTLS support | Mandatory |
| SMTP AUTH | PLAIN, LOGIN |
| Port | Any |
| Hostname | Any |
Backups
While important, backup and recovery planning is not necessary for deployment automation and is handled separately.
Other software running on target servers
As mentioned previously, the Ataccama ONE Platform is installed into subdirectories of the `/opt` directory. Directory names are fixed and do not collide with any common backup, monitoring, or administration software. This allows you to install monitoring or backup agents (or similar software) alongside it.
Monitoring and supporting this kind of software falls under the customer’s responsibility.
| Additional software | Allowed | Managed by Ansible |
|---|---|---|
| Any | Yes | No |
Manual changes
We do not support manually changing configuration or performing maintenance tasks. All maintenance tasks must be done using Ansible.
The automation cannot account for manual changes and expects that none are made. In the best-case scenario, any manual changes are undone once the installation starts; in the worst case, they cause breakage.
| Change type | Supported |
|---|---|
| Running Ansible with updated inventory | Yes |
| Manual | No |
Pre-installation checklist
You can use this list to verify that everything is ready for the installation.
- Decide what is installed and who is in charge of the process.
  - Select optional components and request the necessary licenses.
  - Select the installation size and server roles.
  - Decide where, how, and who runs the Ansible installation.
  - Make sure the responsible persons are aware of their role in the process and that there is adequate technical support on the customer side.
- Prepare the target servers.
  - Install the OS.
  - Prepare the necessary storage.
    - Format the local filesystems.
    - Mount the remote storage and allocate enough space.
- Configure SSH access.
  - Configure the firewall.
  - Grant the operator remote access to the controller (if used).
  - Set up SSH access from the controller to the target servers.
  - Allow the target servers to access the public internet.
- Configure DNS.
  - Set up internal domain names.
  - Set up external domain names.
  - Make sure DNS works with VPN (if used).
- Prepare TLS certificates for all external hostnames and make them available to the operator.
  - Check the TLS certificates for validity.
  - Make sure the certificates are not self-signed.
  - Make sure the certificates are saved in their full form (fullchain).
- Prepare the license files and make them available to the operator.
- Check the access to third-party databases (such as AWS RDS or PostgreSQL) and make sure you have the correct credentials with permission to create schemas.
- Make sure the operator has all the necessary information. This includes the following:
  - Support contact information, contact information of the team or person taking over the platform after the installation, and the date and time of the installation.
  - Fully qualified hostnames of the following components:
    - The controller
    - Target servers
    - Data sources
    - External domain
  - Access credentials to the following components:
    - The controller
    - Target servers
    - Data sources
    - Remote access (if used)
      - VPN
      - Remote desktop
- Verify that the correct people and components can access the infrastructure as needed.
  - Configure the VPN and make sure it works for the operator.
  - Check that the operator can access and use the controller (through SSH, RDP, or another supported protocol).
  - Confirm with the operator that the controller can access the target servers.
  - Distribute the keys or passwords to the relevant persons.
  - Open the firewall.
  - Verify that the SSH access works.
  - Verify that the controller and the target servers have access to the internet and that OS-specific packages are installed (for example, repositories are working as expected, RHEL has enabled subscriptions, and so on).
  - Check that `sudo` is enabled and that the SSH user can work with it without a password.
Summary reports
While Ansible runs, it stores information about the work done on the target servers and compiles it at the end into a set of summary files in YAML format. These files are kept in the summaries directory on the orchestration server and can be processed into a report describing the performed actions. To generate a report, Ansible converts the available summary files into a human-friendly HTML file.
The report consists of two files:
- `summary_short.html`: A short overview of the finished run.
- `summary_full.html`: Contains additional details that are of interest mostly to system administrators.
Post-installation checklist
When the installation is complete (that is, the Ansible run finished without any errors), go through this checklist to verify that everything works as expected.
Logging and monitoring
- Make sure all `systemd` units are up and running.
  - Go to Prometheus at `monitoring.<customer_domain>/targets` or open Status > Targets in the panel. All targets should be `UP`.
  - If any target is `DOWN`, investigate in OpenSearch Dashboards what went wrong.
- Make sure that all domains are available to the user.
  - Try to access all domains defined as external according to the selected installation type. To log in, use the credentials defined in the `vars.yml` file before the installation. Alternatively, instead of custom credentials, you can use the default ones defined in `_vars.yml`.
- Verify the default Ansible-managed alerts that are distributed with the installation. Go to Prometheus at `monitoring.<customer_domain>/alerts` or open Alerts in the panel and check the firing rules for possible issues.
- Check Grafana dashboards to verify that all components are present and show activity.
- Check OpenSearch Dashboards logs for possible errors.
  - In OpenSearch Dashboards (`kibana.<customer_domain>`), navigate to logs and in the more options menu select Discover. Check all available logs. The following example shows an OpenSearch Dashboards filter that you can use to query logs:

    Example filter: `https://kibana.<customer_domain>/app/discover#/?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-24h,to:now))&_a=(columns:!(SYSLOG_IDENTIFIER,hostname,message,severity),filters:!(),hideChart:!t,index:'*.*',interval:auto,query:(language:kuery,query:'SYSLOG_IDENTIFIER%20:%20mmm-backend'),sort:!(!('@timestamp',desc)))`
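The Prometheus target check can also be scripted against the Prometheus HTTP API. A sketch: the hostname is a placeholder, and jq is assumed to be available on the machine running the check.

```shell
# List any Prometheus scrape targets that are not healthy, with their last error.
curl -s "https://monitoring.example.com/api/v1/targets" \
  | jq -r '.data.activeTargets[] | select(.health != "up") | "\(.labels.job): \(.lastError)"'
```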
Keycloak

- In the Keycloak Admin Console, make sure everything works as expected.
  - Go to `one.<customer_domain>/auth` and log in.
  - Check that the correct realm has been imported.
  - Go to Users > View all users to verify that all users from the realm have been imported.
  - If needed, turn off the temporary lock in Users > [user] > Details > User Temporarily Locked. By default, each user that provides incorrect credentials three times in a row is temporarily disabled.
  - Verify that all roles have been successfully imported from the realm (Roles > View all roles).
  - Verify that all clients and their tokens have been successfully configured (Clients > [client] > Credentials > Secret).
-
Was this page useful?