
AI Governance and Security

A comprehensive reference for understanding how the AI features of Ataccama ONE handle your data, which models and infrastructure are used, what controls are available, and how governance and compliance are maintained.

Key commitments at a glance

  • Suggestion-based output: All AI features produce suggestions that require human review and acceptance before taking effect.

  • Metadata-first approach: Generative AI features operate on metadata (such as table names, column names, descriptions, data types). Access to actual data values is limited to specific, optional tools that administrators can turn off.

  • No model training on your data: Generative AI models are never trained or fine-tuned on your data, metadata, or interaction logs.

  • Full administrator control: All AI features can be individually turned on or off. Data access tools, prompt logging, and Agent capabilities are configurable.

  • Role-based access control: AI feature access is governed by the same role-based access control framework used across the Ataccama ONE platform.

  • Regional data residency: LLM API calls are routed to the LLM provider’s region aligned with your deployment geography.

AI feature categories

Ataccama ONE includes three categories of AI capabilities, each with different architectures, data exposure profiles, and deployment models.

Traditional machine learning

Traditional ML features are deployed locally within your environment and available in Data Quality & Catalog and ONE MDM product suites. They do not call external LLM services and do not send data outside your infrastructure.

Local training only

Traditional ML models are trained exclusively within your environment on your own data. The trained model and its outputs are isolated to that single instance.

No data leaves your environment for these features.

Business term suggestion

  Purpose: Suggests which business terms might apply to data columns, accelerating cataloging.

  How it works: Uses locality-sensitive hashing fingerprints and an ensemble k-NN classifier. The model starts blank and learns from profiling data and user feedback using active learning. Trained locally per customer.

Anomaly detection

  Purpose: Identifies statistically significant and unexpected changes in data over time.

  How it works: Uses Isolation Forests for time-independent detection and time series decomposition (trend, seasonality, residual) for time-dependent detection. Operates on aggregate profiling metrics. Trained locally per customer.

Master data matching

  Purpose: Identifies duplicate or related records across data sources.

  How it works: Statistical and rule-based matching deployed within your environment.
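The time-series decomposition used for time-dependent anomaly detection can be illustrated with a small, self-contained sketch. This is a toy example of the trend/seasonality/residual idea only, not Ataccama's implementation; the function name, window sizes, and threshold are illustrative assumptions.

```python
# Toy illustration of time-dependent anomaly detection via decomposition:
# estimate a trend (moving average) and a periodic seasonal component,
# then flag points whose residual deviates strongly from the rest.
from statistics import mean, pstdev

def detect_anomalies(series, period=7, window=7, threshold=3.0):
    """Return indices whose residual exceeds `threshold` standard deviations."""
    n = len(series)
    half = window // 2
    # Trend: centered moving average (edges use a shrunken window).
    trend = [mean(series[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]
    detrended = [x - t for x, t in zip(series, trend)]
    # Seasonality: mean detrended value at each position in the cycle.
    seasonal = [mean(detrended[p::period]) for p in range(period)]
    residual = [d - seasonal[i % period] for i, d in enumerate(detrended)]
    sd = pstdev(residual) or 1.0  # guard against an all-zero residual
    return [i for i, r in enumerate(residual) if abs(r) > threshold * sd]
```

For example, injecting a spike into a metric with a weekly pattern (`series[30] += 50`) makes index 30 the only flagged point, because its residual cannot be explained by trend or seasonality.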

Embedded generative AI

Embedded Gen AI features use large language models (LLMs) provided through managed cloud services to assist with content generation and comprehension tasks. These features are triggered by explicit user action and operate exclusively on metadata — no actual data values are sent to the model.

All embedded Gen AI features follow a Retrieval-Augmented Generation (RAG) pattern: the platform fetches relevant metadata from its repository, fills a prompt template, sends it to the LLM, and validates the response before presenting it to the user.

Where applicable (for example, generated DQ expressions), the platform validates that the output is syntactically correct and executable before returning it.
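The metadata-only RAG flow described above can be sketched as follows. All names here (`build_prompt`, `generate_description`, the `call_llm` hook, and the template fields) are hypothetical stand-ins rather than Ataccama APIs; the point is that only metadata ever enters the prompt.

```python
# Minimal sketch of the RAG pattern: fetch metadata, fill a prompt
# template, call the LLM, and sanity-check the response. Only metadata
# fields (names, types, terms) are used -- never row-level data values.
TEMPLATE = (
    "Write a one-sentence description for this catalog item.\n"
    "Table: {table}\nColumns: {columns}\nBusiness terms: {terms}\n"
)

def build_prompt(metadata: dict) -> str:
    """Fill the prompt template from repository metadata only."""
    return TEMPLATE.format(
        table=metadata["table"],
        columns=", ".join(f"{c['name']} ({c['type']})" for c in metadata["columns"]),
        terms=", ".join(metadata.get("terms", [])) or "none",
    )

def generate_description(metadata: dict, call_llm) -> str:
    """RAG step: retrieve metadata -> fill template -> call LLM -> basic check."""
    response = call_llm(build_prompt(metadata))
    if not response.strip():
        raise ValueError("empty model response")  # validated before display
    return response.strip()
```

The validated result is then presented as a suggestion for the user to accept or reject.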

Generate description: Drafts descriptions for catalog items, attributes, business terms, and data quality (DQ) rules.

Text to SQL: Generates SQL queries from natural language on the SQL catalog item creation screen.

Text to rule expression: Generates Ataccama ONE expressions from natural language descriptions.

Chat with documentation: Answers product questions using Ataccama documentation as a knowledge base.

Explain SQL: Produces plain-language explanations of SQL queries.

Rule suggestions: Suggests applicable rules from the library for a given term.

ONE expression to text: Describes what an Ataccama expression does in plain language. Purely informational and not persisted.

Translations: Translates metadata content (names, descriptions) into a selected language. Not persisted.

Text tools: Fixes grammar and improves writing style in rich text editors. Output requires user approval.

Debug DQ rules: Generates sample test input values for a DQ rule. Ephemeral; not persisted.

Similar rules: Detects whether a rule being created duplicates an existing rule in the catalog.

AI Agent

The AI Agent is a goal-based tool agent that automates complex, multi-step data management tasks. Unlike the embedded features, which handle single-shot generation, the Agent can plan, execute, validate, and refine multi-step workflows.

A typical Agent workflow:

  1. You provide a goal or task description.

  2. Agent develops an execution plan.

  3. Agent executes the plan step by step, calling platform tools (APIs).

  4. After each step, the Agent validates results and refines the plan as needed.

  5. You review and validate the final results.
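The steps above can be sketched as a plan/execute/validate loop. The planner, tool, and validator interfaces here are hypothetical illustrations, not the Agent's actual internals; real Agent tools are bounded platform APIs, and the final result still requires user review.

```python
# Sketch of a goal-based agent loop: develop a plan, execute it step by
# step through registered tools, validate each result, and re-plan when
# a step fails, within a bounded refinement budget.
def run_agent(goal, plan_fn, tools, validate_fn, max_refinements=3):
    """Execute a planned sequence of tool calls, re-planning on failed steps."""
    plan = plan_fn(goal)                        # 2. develop an execution plan
    results = []
    for _ in range(max_refinements):
        for step in plan[len(results):]:        # 3. execute remaining steps
            out = tools[step["tool"]](**step["args"])
            if not validate_fn(step, out):      # 4. validate the step's result...
                plan = plan_fn(goal, done=results)  # ...and refine the plan
                break
            results.append(out)
        else:
            return results                      # 5. surface results for review
    raise RuntimeError("plan did not converge within refinement budget")
```

Bounding the loop (`max_refinements`) keeps a misbehaving plan from running indefinitely.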

The Agent interacts with the platform exclusively through defined tools, which are bounded API interfaces. It cannot access systems, data, or functionality outside the scope of its registered tools.

Current tool categories include search and discovery, catalog inspection, data quality rule management, governance and metadata enrichment, reference data management, transformation management, and utility functions. For a full list, see ai-agent-tools-reference.adoc.

Data exposure summary

The following table summarizes what data each AI category exposes to models, and where that processing happens.

Traditional ML

  What is sent to the model: Aggregate profiling metrics (record counts, frequency analysis, patterns) and statistical fingerprints of column data.

  Accesses actual data? Yes, but locally only.

  Where processing occurs: Your environment. No external calls.

Embedded Gen AI

  What is sent to the model: Metadata only: table names, column names, data types, business terms, descriptions, Ataccama documentation.

  Accesses actual data? No.

  Where processing occurs: Managed cloud LLM service (Azure region-aligned).

AI Agent

  What is sent to the model: Primarily metadata (same as embedded Gen AI). Some optional tools can access data values (data sampling, SQL queries).

  Accesses actual data? Optional, admin-controlled.

  Where processing occurs: Managed cloud LLM service (Azure region-aligned). Tool execution on the platform.

Data access is optional and admin-controlled.

The AI Agent includes optional tools (data sampling, SQL queries) that can access actual data values. These tools are critical for use cases like data exploration and quality validation, but they can be turned off by administrators in Global settings > Gen AI. When turned off, the Agent operates exclusively on metadata.
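Conceptually, the toggle acts as a filter over the Agent's registered tools. The setting name and tool identifiers below are hypothetical illustrations (in the product the control lives under Global settings > Gen AI), not Ataccama's actual configuration schema.

```python
# Sketch of admin-controlled tool gating: when the data-access toggle is
# off, data-touching tools are removed from the Agent's available set,
# leaving it restricted to metadata-only operation.
DATA_ACCESS_TOOLS = {"data_sampling", "sql_query"}  # illustrative tool names

def allowed_tools(registered_tools, settings):
    """Filter the Agent's registered tools by the admin data-access toggle."""
    if settings.get("data_access_enabled", False):
        return set(registered_tools)
    # Toggle off: only metadata-only tools remain available.
    return {t for t in registered_tools if t not in DATA_ACCESS_TOOLS}
```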

Model providers and infrastructure

Ataccama’s generative AI features use large language models provided through managed cloud services. The platform might employ models from Anthropic or OpenAI depending on the deployment configuration, accessed through one of the following service providers:

Azure AI Foundry

Managed LLM hosting on Microsoft Azure, covered by Azure’s enterprise data privacy and security commitments. Models are accessed via encrypted API calls.

Snowflake Cortex REST API (on Azure)

LLM access through Snowflake’s managed Cortex service running on Azure infrastructure. Calls are made to an Ataccama-managed Snowflake account.

Ataccama selects and manages the model versions used across the platform and applies model upgrades as improved versions become available, so you benefit from the latest performance and safety improvements.

Data residency

LLM API calls are routed to the Azure region aligned with your deployment geography.

If your environment is deployed in the EU, AI requests are processed within EU Azure regions. If deployed in the US, requests stay within US regions. This applies to both the Azure AI Foundry and Snowflake Cortex paths.

Each customer’s AI interactions are kept fully separate. Your prompts and metadata never influence or are visible to any other organization.

Network security

All API calls to external LLM providers are made over the public internet using TLS encryption in transit. Connections are encrypted end-to-end between the platform and the model provider.

Provider data handling

Ataccama’s agreements with model providers include the following protections:

  • No model training on your inputs: Azure guarantees that prompts are not used to train or improve shared models.

  • Customer isolation: Your interactions are processed independently. Prompts from one organization do not affect model behavior for any other.

  • Encryption: All data is encrypted in transit between Ataccama and the provider.

Data use and privacy

No training on your data

Your data and metadata are never used for training or fine-tuning shared generative AI models.

This applies across the full stack:

  • LLM providers: Azure guarantees that prompts are not used to train models. Ataccama does not perform any fine-tuning or model training.

  • Ataccama platform: Generative AI interaction data (prompts, responses) is not used to train models. Your interactions do not influence the behavior of models used by other organizations.

Exception: Traditional machine learning

Traditional machine learning models (business term suggestion, anomaly detection) are trained, but exclusively within your own environment on your own data.

This local training is necessary for these models to produce meaningful results tailored to your specific data landscape. The trained model and its outputs remain isolated to your single instance.

Prompt and interaction logging

Ataccama retains logs of AI prompts and interactions for 30 days to support troubleshooting, debugging, and service quality. These logs are encrypted and stored securely.

Prompt logs are never used for model training. They exist solely for operational and support purposes.

Access controls and administration

Administrators retain full control over the availability and configuration of all AI features in their environment.

Role-based access control

All AI feature access is governed by the same role-based access control (RBAC) framework used across the Ataccama ONE platform. Access to the AI Agent and individual Gen AI capabilities can be restricted by role, ensuring only authorized personnel can interact with AI features and approve AI-generated outputs.

RBAC for the AI Agent is managed through the Ataccama identity provider (IDP). Administrators can control which user roles have access to the Agent and to specific tool categories.

Administrator controls

The following controls are available to administrators:

Feature-level toggles (all AI features): Individual Gen AI features, the AI Agent, and traditional ML features can each be turned on or off independently.

Global AI disable (all Gen AI): The entire Gen AI service can be turned off as a whole.

Data access toggle (AI Agent): The data sampling and SQL query tools can be turned off, restricting the Agent to metadata-only operation.

Role-based access (all AI features): Access to AI features is controlled through RBAC. Administrators define which roles can interact with AI capabilities.

Business term suggestion (traditional ML): Can be turned off globally or for individual business terms.

Anomaly detection (traditional ML): Explicitly opt-in; users enable it per DQ monitoring project, profiling configuration, or catalog item.

AI Agent controls

The AI Agent provides specific administrator-facing controls through Global settings > Gen AI:

  • Turn on or off the data sampling tool to control whether the Agent can access actual data values.

  • Manage user access to Agent capabilities through role-based access controls.

Transparency and auditability

Ataccama provides multiple layers of transparency into AI operations to support governance requirements and human oversight.

Execution visibility

You can view the AI Agent’s execution steps, review its reasoning at each stage, and inspect all proposed changes before they take effect. A Review changes interface allows you to examine modifications before accepting them.

Audit logging

The platform logs the complete lifecycle of AI Agent interactions, including:

  • Your original prompt and the Agent’s execution plan.

  • Each individual tool call: which tool was invoked, what parameters were sent, and what was returned.

  • The Agent’s responses and final outputs.

All logs are encrypted. Audit data can be used for compliance reviews, incident investigation, and internal governance reporting.
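A structured record for a single tool call in such a log might look like the sketch below. The field names and schema are illustrative assumptions, not Ataccama's actual log format.

```python
# Sketch of one tool-call audit entry: which tool was invoked, with what
# parameters, what came back, and which user prompt it belongs to.
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ToolCallAudit:
    prompt_id: str       # ties the call back to the user's original prompt
    tool: str            # which tool was invoked
    parameters: dict     # what parameters were sent
    result_summary: str  # what was returned (summarized)
    timestamp: str       # UTC time of the call

def audit_record(prompt_id, tool, parameters, result_summary):
    """Serialize one tool-call audit entry as JSON for the encrypted log."""
    rec = ToolCallAudit(prompt_id, tool, parameters, result_summary,
                        datetime.now(timezone.utc).isoformat())
    return json.dumps(asdict(rec), sort_keys=True)
```

Keeping entries structured (rather than free text) is what makes them usable for compliance reviews and incident investigation.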

Output validation

Where applicable, the platform validates AI-generated output before presenting it to users. For example, when the AI generates a DQ expression or SQL query, the platform checks that the output is syntactically valid and executable in the current context.

If validation fails, the platform iterates with the model to correct the issue before surfacing the result.
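The validate-and-retry idea can be sketched as below, using SQLite's parser as a stand-in syntax check for generated SQL. The retry protocol and the `call_llm` hook are hypothetical illustrations, not Ataccama's actual validation mechanism.

```python
# Sketch of output validation with retry: syntax-check each candidate
# SQL query and, on failure, feed the error context back to the model
# for another attempt, within a bounded number of tries.
import sqlite3

def is_valid_sql(query: str) -> bool:
    """Check a query by asking SQLite to prepare it (EXPLAIN)."""
    try:
        sqlite3.connect(":memory:").execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False

def generate_sql(request: str, call_llm, max_attempts=3) -> str:
    """Iterate with the model until the generated SQL parses."""
    prompt = request
    for _ in range(max_attempts):
        candidate = call_llm(prompt)
        if is_valid_sql(candidate):
            return candidate  # only validated output is surfaced to the user
        prompt = f"{request}\nPrevious attempt had a syntax error:\n{candidate}"
    raise ValueError("no syntactically valid SQL within attempt budget")
```

Capping the attempts means a persistently invalid generation fails explicitly instead of surfacing broken output.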

Governance principles

Ataccama’s AI governance program is guided by the NIST AI Risk Management Framework and emphasizes the following objectives:

  • Performance and accuracy: Continuous model evaluation and improvement. Output validation where feasible (expression validation, syntax checking).

  • Fairness and bias mitigation: Regular testing to identify and correct potential biases in AI outputs.

  • Explainability and transparency: Clear documentation for all AI-driven recommendations. Visible execution plans and reasoning for the AI Agent.

  • Security and privacy: Encryption in transit for all LLM calls. Metadata-first data minimization. No model training on your data.

  • Human oversight: All AI outputs are suggestions requiring user review. Changes are never applied automatically without user acceptance.

  • Legal and regulatory compliance: Ongoing monitoring and alignment with AI-related laws and guidelines, including GDPR and other applicable data protection regulations.

All updates to AI features go through the Ataccama Software Development Lifecycle, with multiple checks including product review, automated testing, and manual quality assurance before release.

Compliance and certifications

Ataccama holds a SOC 2 certification, demonstrating adherence to established security, availability, and confidentiality controls. For details on Ataccama’s broader security posture, certifications, and compliance documentation, contact your Customer Success Manager or refer to the Ataccama Trust Center.

Ataccama complies with applicable data protection regulations, including GDPR, and requires all third-party AI service providers to uphold equivalent compliance standards.

Internal governance audits are conducted biannually, covering:

  • Automated monitoring and manual reviews of AI feature behavior.

  • Evaluations of model performance, fairness, and security practices.

  • Documentation, reporting, and remediation procedures for identified issues.

Intellectual property and ownership

Ownership of inputs and outputs

You retain ownership of your data and any AI-generated outputs derived from your metadata. Ataccama maintains intellectual property rights related to AI logic, model architectures, and underlying platform technology.

Third-party AI providers

You acknowledge the licensing terms of third-party providers (Azure AI, Snowflake, Anthropic, OpenAI) as applicable to your deployment. Ataccama communicates these terms clearly as part of the customer agreement.

Your responsibilities

You are expected to use AI features responsibly, ethically, and lawfully. Key responsibilities include:

  • Lawful and ethical use: Compliance with all applicable laws and ethical standards when using AI features.

  • Precise prompting: Providing clear, specific goals and instructions when interacting with the AI Agent. Well-defined prompts help the Agent develop accurate plans and reduce unintended actions.

  • Human oversight: Reviewing and validating all AI-generated outputs; only approve actions that align with your intentions. For this purpose, the Agent provides a "Review changes" option for inspecting modifications before they are finalized.

  • Change validation: Carefully reviewing all modifications that the AI Agent makes to catalog items, rules, descriptions, and other platform assets before accepting and propagating them.

  • Data and privacy management: Appropriate classification and secure management of data inputs to AI systems.

  • Reporting and feedback: Prompt reporting of any identified issues such as biases, inaccuracies, or security concerns.

For best practices on using Generative AI features effectively, see Gen AI Best Practices.

Upcoming capabilities

Ataccama is actively working on expanding flexibility around AI model management:

  • Bring Your Own Key or Bring Your Own Model (BYOK/BYOM): Planned for 2026, this capability will allow you to use your own model provider credentials or deploy your own model endpoints, giving you full control over which models process your data and metadata.

Material updates to AI features and governance policies are documented in the product release notes, with clear communication about changes and effective dates.

Contact information

For more information about AI governance and security, to request compliance documentation, or for assistance with AI feature configuration, reach out to your Customer Success Manager (CSM). Our team is ready to provide further guidance and support.
