Deploy a Self-Managed Edge

This guide describes how to deploy a self-managed Ataccama edge instance in your own AWS account using Terraform.

See Edge Processing for background on edge processing and how responsibilities are split between Ataccama and your organization.

The deployment splits into four phases:

  • You set up AWS infrastructure (VPC, S3 bucket, IAM role) and provide details to Ataccama.

  • Ataccama generates a Terraform deployment package customized for your environment.

  • You run Terraform from your workstation or CI/CD runner to deploy the edge into your account.

  • Ataccama confirms control plane connectivity, and the edge is registered.

Prerequisites

Complete Prepare AWS Infrastructure before starting.

To generate your deployment ZIP, Ataccama needs the values from AWS infrastructure preparation. Provide them to your Ataccama Customer Success Manager during onboarding.

Example values
account:
  region: "eu-central-1"
  accountId: "123456789011"
vpc:
  vpcId: "vpc-0f1924823027c00c7"
  privateSubnetIds:
    - "subnet-0fa6e2c437dd0e5fd"
    - "subnet-0156acffabb5aa872"
  publicSubnetIds:
    - "subnet-093c2ad4f707332e6"
    - "subnet-0439b68cf1fa5e1ba"
s3:
  bucketName: "bucket-s3-ecc58765"
  iamRoleArn: "arn:aws:iam::123456789011:role/s3-role"
  region: "eu-central-1"
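Before sending these values, a quick consistency check can catch copy-paste mistakes. The sketch below is illustrative only (the function names are not part of any Ataccama tooling). It checks that the account ID has 12 digits and that the IAM role ARN belongs to the same account.

```shell
# Illustrative helpers; not part of the Ataccama bundle.

# Succeeds if the argument is a 12-digit AWS account ID.
is_account_id() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]{12}$'
}

# Prints the account ID embedded in an IAM ARN
# (the fifth colon-separated field).
arn_account() {
  printf '%s\n' "$1" | cut -d: -f5
}

# Succeeds if the account ID is well formed and the role ARN belongs to it.
role_matches_account() {
  is_account_id "$1" && [ "$(arn_account "$2")" = "$1" ]
}

# Usage with the example values above:
#   role_matches_account "123456789011" "arn:aws:iam::123456789011:role/s3-role" \
#     && echo "account and role ARN agree"
```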

Tooling

Install Terraform and the AWS CLI on the machine where you will run Terraform (your workstation or a CI/CD runner).

Confirm the AWS CLI is authenticated before proceeding:

aws sts get-caller-identity
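Being authenticated is not the same as being authenticated against the right account. As a minimal sketch, the hypothetical helper below compares the account ID you expect with the one the active credentials resolve to. The `--query` and `--output` options are standard AWS CLI options; the function name is an assumption for illustration.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Compares the expected account ID with the one the active credentials resolve to.
check_account() {
  expected="$1"
  actual="$2"
  if [ "$expected" = "$actual" ]; then
    echo "OK: authenticated against account $actual"
  else
    echo "MISMATCH: expected $expected, got $actual" >&2
    return 1
  fi
}

# Usage (requires the AWS CLI and valid credentials):
#   actual="$(aws sts get-caller-identity --query Account --output text)"
#   check_account "123456789011" "$actual"
```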

Workstation network access

The machine running Terraform needs outbound HTTPS access to:

  • registry.terraform.io: For downloading Terraform providers (for example, the AWS provider).

  • ataccama.azurecr.io: For pulling container images and OCI artifacts.

  • AWS service endpoints (ECS, IAM, S3, SQS, Lambda, CloudWatch): For Terraform AWS provider API calls.
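The list above can be turned into a rough pre-flight check. The sketch below assumes the standard regional endpoint naming for AWS services in the commercial partition, and assumes that "CloudWatch" covers both the metrics (`monitoring`) and logs (`logs`) endpoints. The function names are hypothetical.

```shell
# Illustrative helpers; not part of the Ataccama bundle.

# Prints the hosts Terraform needs to reach for a given AWS Region.
# IAM uses a global endpoint; the other services use regional endpoints.
required_hosts() {
  region="$1"
  echo "registry.terraform.io"
  echo "ataccama.azurecr.io"
  echo "iam.amazonaws.com"
  for svc in ecs s3 sqs lambda logs monitoring; do
    echo "$svc.$region.amazonaws.com"
  done
}

# Succeeds if an HTTPS request to the host completes within 10 seconds.
# Any HTTP status counts as reachable; only connection failures are errors.
can_reach() {
  curl --silent --output /dev/null --max-time 10 "https://$1/"
}

# Usage:
#   for h in $(required_hosts eu-central-1); do
#     if can_reach "$h"; then echo "ok   $h"; else echo "FAIL $h"; fi
#   done
```

A successful check only shows that the host answers over HTTPS. It does not validate credentials or IAM permissions.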

IAM permissions for Terraform

The IAM principal used to run Terraform needs administrative access to the target AWS account, or at minimum permissions to create and manage the following resources:

  • ECS

  • IAM

  • S3

  • SQS

  • Lambda

  • EFS

  • KMS

  • CloudWatch

Ataccama doesn’t receive an IAM role in your account. Cross-account communication is initiated outbound by the edge using an IAM role created by Terraform and scoped to the SQS operations required for control plane messaging.

Deployment artifacts

Before starting installation, confirm you have received the following materials from Ataccama:

  • Edge deployment ZIP via secure download. Contains Terraform manifests, component configurations, Lambda artifacts, and the bundled terraform-aws-edgeinstance module.

    Bucket name, IAM role ARN, and Region are pre-populated from values you supplied during onboarding.

  • Container registry credentials via secure credential sharing. Username and password for ataccama.azurecr.io. Unique to your edge instance.

Store credentials in a secrets manager or a .tfvars file that is never committed to source control.

Each credential set is scoped to a single edge instance; do not reuse across deployments.

Bundle contents

The edge deployment ZIP contains all Terraform configuration and component definitions.

Do not modify the bundled module or the preconfigured variable files.

<edge_name>-<version>-<timestamp>-bundle.zip
├── main.tf                          # Root Terraform configuration
├── terraform.tf                     # Provider version constraints
├── variables.tf                     # Input variable declarations
├── outputs.tf                       # Output definitions
├── terraform.tfvars.json            # Edge + cluster configuration (pre-populated)
├── *.auto.tfvars.json               # Component configurations (pre-populated, ~10 files)
├── registry.auto.tfvars.json        # Container registry credentials — you fill this in
├── observability.auto.tfvars.json   # Observability config (turned off by default)
├── artifacts/                       # Lambda deployment packages
│   ├── edgeinteractivejobs/*.zip
│   └── dqresultsreader/*.zip
└── terraform-aws-edgeinstance/      # Bundled Ataccama module — do not modify

Install the edge

Step 1: Fill in container registry credentials

Extract the edge deployment ZIP, open registry.auto.tfvars.json, and fill in the credentials provided by Ataccama:

{
  "registry_secret": {
    "username": "<username-from-ataccama>",
    "password": "<password-from-ataccama>"
  }
}

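To avoid writing secrets to disk, export them as an environment variable instead. Terraform reads any environment variable named TF_VAR_<name> as the input variable <name>:

export TF_VAR_registry_secret='{"username":"...","password":"..."}'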
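If the credentials already live in separate environment variables (for example, injected by a CI secret store), the JSON value can be assembled rather than typed by hand. This is a minimal sketch: `REGISTRY_USERNAME` and `REGISTRY_PASSWORD` are assumed variable names, and the escaping covers only backslashes and double quotes.

```shell
# Illustrative helpers; not part of the Ataccama bundle.

# Escapes backslashes and double quotes so a value can sit inside a JSON string.
# Control characters (newlines, tabs) are not handled in this sketch.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Prints the registry_secret object as a single line of JSON.
registry_secret_json() {
  printf '{"username":"%s","password":"%s"}\n' \
    "$(json_escape "$1")" "$(json_escape "$2")"
}

# Usage:
#   TF_VAR_registry_secret="$(registry_secret_json "$REGISTRY_USERNAME" "$REGISTRY_PASSWORD")"
#   export TF_VAR_registry_secret
```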

(Optional) Use a custom registry prefix

By default, container images are pulled from ataccama.azurecr.io/saas-edge/. To pull from your own registry or pull-through cache, set registry_prefix in registry.auto.tfvars.json:

{
  "registry_prefix": "internal-registry.corp.local/ataccama-saas-edge/",
  "registry_secret": {
    "username": "",
    "password": ""
  }
}

With this example, image references resolve as follows:

ataccama.azurecr.io/saas-edge/dqc-runtime-job:16.6.5-saas-edge
→ internal-registry.corp.local/ataccama-saas-edge/dqc-runtime-job:16.6.5-saas-edge
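When mirroring images into your own registry, it helps to compute the target reference for each image up front. The helper below is a hypothetical sketch that reproduces the mapping shown above as a plain string substitution. It does not inspect the bundle or contact any registry.

```shell
# Illustrative helper; not part of the Ataccama bundle.

DEFAULT_PREFIX="ataccama.azurecr.io/saas-edge/"

# Prints the image reference with the default prefix swapped for a custom one.
# References that do not start with the default prefix are printed unchanged.
rewrite_image() {
  ref="$1"
  prefix="$2"
  case "$ref" in
    "$DEFAULT_PREFIX"*)
      rest=${ref#"$DEFAULT_PREFIX"}
      echo "$prefix$rest"
      ;;
    *)
      echo "$ref"
      ;;
  esac
}

# Usage:
#   rewrite_image "ataccama.azurecr.io/saas-edge/dqc-runtime-job:16.6.5-saas-edge" \
#     "internal-registry.corp.local/ataccama-saas-edge/"
```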

Step 2: Configure a Terraform backend

Configure a Terraform backend according to your organization’s standards — for example, an S3 backend with DynamoDB state locking. Add a backend block to terraform.tf or create a separate backend configuration file.

This stores your Terraform state remotely and enables collaboration and state locking.
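As one possible shape for the S3 example, the sketch below writes a separate backend.tf next to main.tf. The file name, bucket, key, and lock table are placeholders you choose, not values supplied by Ataccama. `dynamodb_table` is the S3 backend's DynamoDB locking argument; check the backend documentation for your Terraform version, since newer releases offer other locking options.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Writes an S3 backend definition to backend.tf in the given directory.
write_backend() {
  dir="$1"
  bucket="$2"
  key="$3"
  region="$4"
  table="$5"
  cat > "$dir/backend.tf" <<EOF
terraform {
  backend "s3" {
    bucket         = "$bucket"
    key            = "$key"
    region         = "$region"
    dynamodb_table = "$table"
    encrypt        = true
  }
}
EOF
}

# Usage (all names are placeholders):
#   write_backend . "my-state-bucket" "edge/terraform.tfstate" \
#     "eu-central-1" "my-lock-table"
```

Keeping the backend in its own file leaves the bundled files untouched, which makes it easier to carry the same configuration into a later bundle.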

Step 3: Initialize Terraform

From the root of the extracted bundle directory:

cd <extracted-bundle-directory>
terraform init

The Ataccama edge module is bundled locally; no external module registry access is required. However, access to the public Terraform Registry is required to download Terraform providers (such as the AWS provider).

Expected output
Initializing the backend...
Initializing provider plugins...
- Installing hashicorp/aws ...
Terraform has been successfully initialized!

Step 4: Review and apply

terraform plan -out=edge.tfplan

Review the planned resources. Terraform creates resources in your AWS account only, including:

  • ECS cluster and Fargate task definitions.

  • IAM roles and policies (scoped to your account; no Ataccama access).

  • SQS queues for control plane communication.

  • Lambda functions for auxiliary processing jobs.

  • CloudWatch log groups.

  • VPC security groups.

When satisfied with the plan, apply it:

terraform apply edge.tfplan

Deployment typically completes in 10–20 minutes. Don’t interrupt the process once it has started.

Terraform waits for ECS tasks to start successfully before completing.
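On a CI/CD runner, nobody reads the plan interactively, so the pipeline needs a machine-readable signal. One option is Terraform's `-detailed-exitcode` flag, which makes `terraform plan` exit with 0 (no changes), 1 (error), or 2 (changes present). The helper name below is hypothetical, and the approval gate is left to your pipeline.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Translates the exit code of "terraform plan -detailed-exitcode" into a word.
plan_status() {
  case "$1" in
    0) echo "no-changes" ;;
    2) echo "changes" ;;
    *) echo "error" ;;
  esac
}

# Usage (the "|| rc=$?" form keeps exit code 2 from aborting a "set -e" script):
#   rc=0
#   terraform plan -detailed-exitcode -out=edge.tfplan || rc=$?
#   case "$(plan_status "$rc")" in
#     changes)    echo "Plan has changes; review before applying." ;;
#     no-changes) echo "Nothing to apply." ;;
#     error)      exit 1 ;;
#   esac
```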

Step 5: Verify and register

After terraform apply completes:

  1. Navigate to CloudWatch > Log groups > /ataccama/edge/ and confirm log streams are being written without repeated errors.

  2. Email your Ataccama contact with:

    • The edge name (from Terraform outputs).

    • Your AWS account ID and Region.

    • Confirmation that terraform apply completed without errors.

Ataccama will verify control plane connectivity and confirm the edge is registered.

Because Terraform waits for ECS tasks to start during the apply, a run that completed without errors means the main workloads are running.
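The log check in step 1 can also be done from the command line. The sketch below assumes AWS CLI v2 (for `aws logs tail`) and assumes that error lines contain the literal text ERROR, which may not match the actual log format. The helper name and the log group placeholder are hypothetical.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Counts lines on standard input that contain ERROR (case-sensitive).
# Prints 0 rather than failing when nothing matches.
count_errors() {
  grep -c 'ERROR' || true
}

# Usage:
#   # List the edge log groups.
#   aws logs describe-log-groups --log-group-name-prefix /ataccama/edge/ \
#     --query 'logGroups[].logGroupName' --output text
#
#   # Count error lines from the last 15 minutes in one of them.
#   aws logs tail "<log-group-name>" --since 15m | count_errors
```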

Configure data sources to use the edge

When creating or editing your data source connection, select the edge instance you want to use. All edge instances available for your environment appear in this list.

Test and save the connection. Then browse and import metadata for a schema, table, or file of your choosing. As a result, a new catalog item appears in your Catalog.

For detailed instructions, see Sources and Import Metadata.

If any step results in an error, contact Ataccama Support.

Edge runtime network access

All edge traffic is outbound from your VPC. No inbound firewall rules, VPN tunnels, or peering connections are required.

AWS service traffic can be kept fully private using Interface VPC Endpoints for ECS, S3, SQS, Lambda, and CloudWatch Logs. Traffic to ataccama.azurecr.io must still leave the VPC through a NAT gateway.

Destination              Protocol   Port   Purpose
ataccama.azurecr.io      HTTPS      443    Pull container images and OCI artifacts.
AWS SQS (your Region)    HTTPS      443    Control plane task exchange (cross-account IAM).
AWS service APIs         HTTPS      443    ECS, S3, Lambda, CloudWatch, IAM API calls.

Data source connectivity

Data sources must be reachable from the edge’s ECS security group on the appropriate port.

Terraform doesn’t configure this connectivity — you’re responsible for routing and firewall rules between the edge VPC and your data source endpoints.

Upgrade the edge

Your edge version remains supported for 90 days after a new release is available. When your version is approaching the end of this window, a warning is displayed in the Ataccama Cloud Portal.

Upgrades are not applied automatically; you control the timing.

Exceeding the 90-day supported window might result in degraded functionality or loss of control plane connectivity.

Don’t skip versions: apply each release in sequence. If you have missed multiple versions, contact Ataccama Support before proceeding.

To upgrade:

  1. Download the new edge deployment ZIP from the Cloud Portal.

  2. Extract it into a new directory. Keep the previous directory as a backup.

  3. Open registry.auto.tfvars.json and fill in your container registry credentials (same as for the initial installation).

  4. Configure the Terraform backend. It must match the backend used for the initial installation so that Terraform works against the existing state.

  5. Run the following sequence:

    terraform init
    terraform plan -out=edge.tfplan
    terraform apply edge.tfplan

Terraform applies only the changes between the previous and new versions. Resources that haven’t changed are not touched and no data is lost during an upgrade.
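Step 4 is the easiest one to get wrong, because a new directory with a different or missing backend would make Terraform plan against empty state. As a minimal sketch, assuming your backend configuration lives in a separate file (hypothetically named backend.tf here), the helper below copies it from the previous bundle directory and refuses to overwrite an existing file.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Copies backend.tf from the previous bundle directory into the new one.
carry_over_backend() {
  prev="$1"
  new="$2"
  if [ ! -f "$prev/backend.tf" ]; then
    echo "no backend.tf found in $prev" >&2
    return 1
  fi
  if [ -e "$new/backend.tf" ]; then
    echo "backend.tf already exists in $new" >&2
    return 1
  fi
  cp "$prev/backend.tf" "$new/backend.tf"
}

# Usage (directory names are placeholders):
#   carry_over_backend ./edge-previous ./edge-new
#   cd ./edge-new && terraform init
```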

Observability

Currently, observability is turned off by default. The edge deployment ZIP includes observability.auto.tfvars.json with observability turned off.

Telemetry shipping to Ataccama’s monitoring stack is planned for a future release.

Troubleshooting edge deployment

terraform init fails: module not found

Run terraform init from the root of the extracted directory, where main.tf is located. The terraform-aws-edgeinstance/ subdirectory must be present alongside main.tf.

terraform apply fails with UnauthorizedAccess

Run aws sts get-caller-identity to confirm which IAM principal is active. Verify it has the permissions listed in IAM permissions for Terraform.

ECS tasks stuck in PENDING or immediately STOPPED

Check CloudWatch Logs for the affected task. Common causes include:

  • Incorrect container registry credentials: Verify username and password in registry.auto.tfvars.json.

  • No internet egress: Confirm the subnet’s route table points to a NAT gateway and that outbound HTTPS (port 443) is permitted by security groups and network ACLs.

  • SCP or firewall blocking egress: If your organization enforces AWS Service Control Policies, ensure ECS task roles aren’t blocked from calling SQS or pulling from ataccama.azurecr.io.
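For stopped tasks, the ECS API also reports a stoppedReason, which often narrows down the cause before you open the logs. The sketch below maps two well-known ECS error names to the causes listed above. The hints are a rough guide, and the cluster name and task ARN are placeholders.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Maps an ECS stoppedReason string to a likely cause.
classify_stop_reason() {
  case "$1" in
    *CannotPullContainerError*)
      echo "image pull failed: check registry credentials and internet egress" ;;
    *ResourceInitializationError*)
      echo "task setup failed: check egress and access to secrets" ;;
    *)
      echo "unclassified: inspect CloudWatch Logs" ;;
  esac
}

# Usage:
#   aws ecs list-tasks --cluster "<cluster-name>" --desired-status STOPPED
#   reason="$(aws ecs describe-tasks --cluster "<cluster-name>" \
#     --tasks "<task-arn>" --query 'tasks[0].stoppedReason' --output text)"
#   classify_stop_reason "$reason"
```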

Edge not showing as connected in Cloud Portal after 15 minutes

Check CloudWatch Logs for SQS connectivity errors.

Contact Ataccama Support with your edge name and the relevant log output.

Contacting Ataccama Support

When contacting Ataccama Support, provide:

  • Edge name (visible in the Cloud Portal and in Terraform outputs).

  • AWS Region and account ID.

  • CloudWatch log excerpts.

  • Edge version (from the ZIP filename).
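The edge version can be read from the ZIP filename programmatically. This sketch relies on the `<edge_name>-<version>-<timestamp>-bundle.zip` pattern and assumes that neither the version nor the timestamp contains a hyphen. The example filename is made up.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Prints the version field of a bundle filename.
bundle_version() {
  name="$(basename "$1")"
  name="${name%-bundle.zip}"   # drop the fixed suffix
  name="${name%-*}"            # drop the timestamp (last field)
  echo "${name##*-}"           # keep the last remaining field
}

# Usage:
#   bundle_version "my-edge-16.6.5-20250101T120000-bundle.zip"
```

Parsing from the right means hyphens in the edge name do not affect the result.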

Destroy the edge

terraform destroy permanently deletes all edge AWS resources.

Ensure all processed data has been saved to your own systems before proceeding. Notify Ataccama afterwards so the edge registration can be removed from the control plane.

terraform destroy
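Because the operation is irreversible, you may prefer to preview it and require an explicit confirmation first. The sketch below uses Terraform's `plan -destroy` mode to save a destroy plan for review. The confirmation helper and the edge name placeholder are hypothetical.

```shell
# Illustrative helper; not part of the Ataccama bundle.

# Succeeds only if the typed answer is non-empty and equals the expected name.
confirm_matches() {
  [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage:
#   terraform plan -destroy -out=destroy.tfplan    # review what would be deleted
#   printf 'Type the edge name to confirm: '
#   read -r answer
#   confirm_matches "$answer" "<edge-name>" || exit 1
#   terraform apply destroy.tfplan
```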

Appendix: AWS resources deployed by Terraform

The following AWS resources are created by Terraform in your account.

Foundation (always deployed)

  • KMS: Two customer-managed keys (general and DQ-encryption), plus a KMS alias. Rotation every 90 days, 7-day deletion window.

  • Secrets Manager: One secret holding the Ataccama container registry credentials (KMS-encrypted), used when the Ataccama registry is accessed directly.

  • S3: One bucket for Lambda artifacts. Versioned, KMS-encrypted; lifecycle rule expires old versions after seven days.

  • EFS: One file system, KMS-encrypted. Two access points (drivers, otel). Mount target per private subnet. Dedicated security group with NFS ingress rules from each workload.

    Mounts Ataccama connectors to processing jobs and the metadata-browsing Lambda, and collects observability data before shipping (when enabled).

  • VPC endpoints: S3 gateway attached to private route tables, plus interface endpoints for KMS, Secrets Manager, and SQS. Shared security group, private DNS enabled, VPC-CIDR ingress.

ECS management cluster

  • Plane Manager: Permanent ECS service, two replicas. Owns its security group and IAM task role.

    Consumes the local SQS job-status queue and uses sts:AssumeRole to reach both the Ataccama edge-access role and the customer result-S3 role.

  • Connectors Rollout: Task definition only; no permanent service. Run on demand by EventBridge when the container image changes. Mounted to the EFS drivers access point.

  • Observability service: Turned off by default. When enabled, Terraform adds an ECS service running an OpenTelemetry collector, CloudWatch Exporter, and config-init container, plus an internal NLB (listeners on gRPC 4317 and HTTP 4318), an OTLP secret in Secrets Manager, and a CloudWatch log group per container. S3 bucket stores configuration; EFS provides supporting storage.

ECS job cluster

  • Processing jobs: Six task definitions (DQC, anomaly detection, anomaly detection auxiliary, metadata import, Snowflake pushdown, create-table). Ephemeral — started on demand by the Plane Manager, stopped when work completes.

  • One shared security group for all job tasks, egress any.

  • Per task definition: Task execution role, task role, CloudWatch log group.

Lambda functions

  • Metadata Browsing / Connection Testing: Java 21, 1.3 GB, x86_64, VPC-attached. Mounts the EFS drivers access point.

    Alias current plus provisioned concurrency. Event-source mapping on the Ataccama-managed SQS request queue.

  • DQ Results Reader: Java 21, 1 GB, x86_64, VPC-attached. Alias current plus provisioned concurrency.

    Event-source mapping on the Ataccama-managed result-reader SQS queue.

  • Cleanup: Python 3.12, non-VPC, no EFS. Invoked daily by EventBridge Scheduler. Trims old Lambda versions (keeps three).

Messaging and events (local)

  • SQS: One main queue and one DLQ for ECS job-status-changed events. KMS-encrypted. Max-receive 3 → DLQ.

  • EventBridge rule: ECS Task State Change on the job cluster forwards to the local SQS queue.

  • EventBridge rule: Custom event com.ataccama.edge.connectorsrollout / image_update triggers ecs:RunTask on Connectors Rollout.

  • EventBridge Scheduler: Daily invocation of the Cleanup Lambda via a dedicated scheduler IAM role.

CloudWatch

  • Log group per ECS service, per job task definition, and per Lambda.

  • Container Insights on both ECS clusters.

IAM

Terraform creates the IAM roles needed for services to function and communicate. No IAM role is granted to Ataccama inside the customer account.
