AI Governance and Security
A comprehensive reference for understanding how the AI features of Ataccama ONE handle your data, which models and infrastructure are used, what controls are available, and how governance and compliance are maintained.
Key commitments at a glance
- Suggestion-based output: All AI features produce suggestions that require human review and acceptance before taking effect.
- Metadata-first approach: Generative AI features operate on metadata (such as table names, column names, descriptions, data types). Access to actual data values is limited to specific, optional tools that administrators can turn off.
- No model training on your data: Generative AI models are never trained or fine-tuned on your data, metadata, or interaction logs.
- Full administrator control: All AI features can be individually turned on or off. Data access tools, prompt logging, and Agent capabilities are configurable.
- Role-based access control: AI feature access is governed by the same role-based access control framework used across the Ataccama ONE platform.
- Regional data residency: LLM API calls are routed to the LLM provider's region aligned with your deployment geography.
AI feature categories
Ataccama ONE includes three categories of AI capabilities, each with different architectures, data exposure profiles, and deployment models.
Traditional machine learning
Traditional ML features are deployed locally within your environment and available in Data Quality & Catalog and ONE MDM product suites. They do not call external LLM services and do not send data outside your infrastructure.
> **Local training only:** Traditional ML models are trained exclusively within your environment on your own data. The trained model and its outputs are isolated to that single instance. No data leaves your environment for these features.
| Feature | Purpose | How it works |
|---|---|---|
| Business term suggestion | Suggests which business terms might apply to data columns, accelerating cataloging. | Uses locality-sensitive hashing fingerprints and an ensemble k-NN classifier. The model starts blank and learns from profiling data and user feedback using active learning. Trained locally per customer. |
| Anomaly detection | Identifies statistically significant and unexpected changes in data over time. | Uses Isolation Forests for time-independent detection and time series decomposition (trend, seasonality, residual) for time-dependent detection. Operates on aggregate profiling metrics. Trained locally per customer. |
| Master data matching | Identifies duplicate or related records across data sources. | Statistical and rule-based matching deployed within your environment. |
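To make the fingerprint-and-vote idea behind business term suggestion concrete, the sketch below pairs a toy MinHash fingerprint with a minimal k-NN vote over previously labeled columns. This is a hypothetical illustration, not Ataccama's implementation: `minhash_signature`, `suggest_term`, and the labeled examples are stand-ins, and production systems use tuned locality-sensitive hash families plus active learning rather than this simplified scheme.

```python
import hashlib

def minhash_signature(values, num_hashes=32):
    """Fingerprint a set of column values with a toy MinHash scheme.

    Stand-in for the locality-sensitive hashing fingerprints mentioned
    above; similar value sets produce similar signatures."""
    return [
        min(int(hashlib.md5(f"{i}:{v}".encode()).hexdigest(), 16) for v in values)
        for i in range(num_hashes)
    ]

def similarity(sig_a, sig_b):
    """Fraction of matching signature slots approximates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def suggest_term(column_values, labeled_columns, k=3):
    """Vote among the k most similar labeled columns (a minimal k-NN ensemble)."""
    query = minhash_signature(column_values)
    neighbors = sorted(
        ((similarity(query, minhash_signature(vals)), term)
         for term, vals in labeled_columns),
        reverse=True,
    )[:k]
    votes = {}
    for score, term in neighbors:
        votes[term] = votes.get(term, 0.0) + score
    return max(votes, key=votes.get) if votes else None

# Hypothetical labeled examples accumulated from user feedback.
labeled = [
    ("email", {"a@x.com", "b@y.org", "c@z.net"}),
    ("email", {"jane@corp.com", "joe@corp.com"}),
    ("country code", {"US", "DE", "FR", "JP"}),
]
print(suggest_term({"a@x.com", "jane@corp.com", "q@w.io"}, labeled))  # → email
```

Because the model "starts blank," the quality of suggestions depends entirely on the locally accumulated labeled examples, which is why these features improve as users confirm or reject suggestions.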
Embedded generative AI
Embedded Gen AI features use large language models (LLMs) provided through managed cloud services to assist with content generation and comprehension tasks. These features are triggered by explicit user action and operate exclusively on metadata — no actual data values are sent to the model.
All embedded Gen AI features follow a Retrieval-Augmented Generation (RAG) pattern: the platform fetches relevant metadata from its repository, fills a prompt template, sends it to the LLM, and validates the response before presenting it to the user.
Where applicable (for example, generated DQ expressions), the platform validates that the output is syntactically correct and executable before returning it.
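The metadata-first RAG flow described above can be sketched as a small pipeline. Everything here is a hedged illustration under stated assumptions: the function names, the prompt template, and the repository shape are hypothetical, and `call_llm` and `validate` are stubbed stand-ins for the managed LLM service and the platform's validators.

```python
# Hypothetical sketch of the metadata-first RAG flow: fetch metadata,
# fill a prompt template, call the model, validate before returning.

PROMPT_TEMPLATE = (
    "You are documenting a data catalog.\n"
    "Table: {table}\nColumns: {columns}\n"
    "Write a one-sentence description of this table."
)

def fetch_metadata(repository, item_id):
    """Step 1: retrieve metadata only -- never actual data values."""
    item = repository[item_id]
    return {"table": item["name"], "columns": ", ".join(item["columns"])}

def build_prompt(metadata):
    """Step 2: fill the prompt template with the retrieved metadata."""
    return PROMPT_TEMPLATE.format(**metadata)

def generate_description(repository, item_id, call_llm, validate):
    """Steps 3-4: send the prompt to the LLM and validate the response."""
    prompt = build_prompt(fetch_metadata(repository, item_id))
    response = call_llm(prompt)
    if not validate(response):
        raise ValueError("model output failed validation")
    return response

# Toy usage with a stubbed model call.
repo = {"t1": {"name": "customers", "columns": ["id", "email", "country"]}}
stub_llm = lambda prompt: "Stores one row per registered customer."
is_nonempty = lambda text: bool(text.strip())
print(generate_description(repo, "t1", stub_llm, is_nonempty))
```

The key property the sketch illustrates is data minimization: only names, types, and descriptions reach the prompt, so the model never sees row-level values.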
| Feature | Purpose |
|---|---|
| Generate description | Drafts descriptions for catalog items, attributes, business terms, and data quality (DQ) rules. |
| Text to SQL | Generates SQL queries from natural language on the SQL catalog item creation screen. |
| Text to rule expression | Generates Ataccama ONE expressions from natural language descriptions. |
| Chat with documentation | Answers product questions using Ataccama documentation as a knowledge base. |
| Explain SQL | Produces plain-language explanations of SQL queries. |
| Rule suggestions | Suggests applicable rules from the library for a given term. |
| ONE expression to text | Describes what an Ataccama expression does in plain language. Purely informational and not persisted. |
| Translations | Translates metadata content (names, descriptions) into a selected language. Not persisted. |
| Text tools | Fixes grammar and improves writing style in rich text editors. Output requires user approval. |
| Debug DQ rules | Generates sample test input values for a DQ rule. Ephemeral; not persisted. |
| Similar rules | Detects whether a rule being created duplicates an existing rule in the catalog. |
AI Agent
The AI Agent is a goal-based tool agent that automates complex, multi-step data management tasks. Unlike the embedded features, which handle single-shot generation, the Agent can plan, execute, validate, and refine multi-step workflows.
A typical Agent workflow:
1. You provide a goal or task description.
2. The Agent develops an execution plan.
3. The Agent executes the plan step by step, calling platform tools (APIs).
4. After each step, the Agent validates results and refines the plan as needed.
5. You review and validate the final results.
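The plan/execute/validate loop above can be sketched in a few lines. This is a hedged stand-in, not the Agent's actual code: the tool names, plan structure, and refinement logic are hypothetical, and the real Agent plans with an LLM and calls the platform's registered tool APIs.

```python
# Minimal sketch of a plan/execute/validate loop over registered tools.
# Step dicts and the refine callback are illustrative assumptions.

def run_agent(goal, plan, tools, max_refinements=2):
    """Execute a plan step by step, validating and refining after each step."""
    results = []
    for step in plan:
        for attempt in range(max_refinements + 1):
            tool = tools[step["tool"]]       # only registered tools are callable
            result = tool(**step["args"])
            if step["validate"](result):     # validate this step's result
                results.append(result)
                break
            step = step["refine"](step)      # refine the failing step and retry
        else:
            raise RuntimeError(f"step failed after refinement: {goal}")
    return results                           # surfaced to the user for review

# Toy usage: a one-step plan that searches catalog metadata.
catalog = ["customers", "orders", "invoices"]
tools = {"search_catalog": lambda query: [t for t in catalog if query in t]}
plan = [{
    "tool": "search_catalog",
    "args": {"query": "order"},
    "validate": lambda hits: len(hits) > 0,
    "refine": lambda step: step,
}]
print(run_agent("find order-related tables", plan, tools))  # → [['orders']]
```

The lookup `tools[step["tool"]]` is the point of the design: a tool the Agent has not been registered with simply does not exist from its perspective, which is what bounds its reach.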
The Agent interacts with the platform exclusively through defined tools, which are bounded API interfaces. It cannot access systems, data, or functionality outside the scope of its registered tools.
Current tool categories include search and discovery, catalog inspection, data quality rule management, governance and metadata enrichment, reference data management, transformation management, and utility functions. For a full list, see ai-agent-tools-reference.adoc.
Data exposure summary
The following table summarizes what data each AI category exposes to models, and where that processing happens.
| AI category | What is sent to the model | Accesses actual data? | Where processing occurs |
|---|---|---|---|
| Traditional ML | Aggregate profiling metrics (record counts, frequency analysis, patterns). Statistical fingerprints of column data. | Yes, but locally only | Your environment. No external calls. |
| Embedded Gen AI | Metadata only: table names, column names, data types, business terms, descriptions, Ataccama documentation. | No | Managed cloud LLM service (Azure region-aligned). |
| AI Agent | Primarily metadata (same as embedded Gen AI). Some optional tools can access data values (data sampling, SQL queries). | Optional, admin-controlled | Managed cloud LLM service (Azure region-aligned). Tool execution on the platform. |
> **Data access is optional and admin-controlled:** The AI Agent includes optional tools (data sampling, SQL queries) that can access actual data values. These tools are critical for use cases like data exploration and quality validation, but they can be turned off by administrators in Global settings > Gen AI. When turned off, the Agent operates exclusively on metadata.
Model providers and infrastructure
Ataccama’s generative AI features use large language models provided through managed cloud services. The platform might employ models from Anthropic or OpenAI depending on the deployment configuration, accessed through one of the following service providers:
- Azure AI Foundry: Managed LLM hosting on Microsoft Azure, covered by Azure's enterprise data privacy and security commitments. Models are accessed via encrypted API calls.
- Snowflake Cortex REST API (on Azure): LLM access through Snowflake's managed Cortex service running on Azure infrastructure. Calls are made to an Ataccama-managed Snowflake account.
Ataccama selects and manages the model versions used across the platform and applies upgrades as improved versions become available, so you benefit from the latest performance and safety improvements.
Data residency
LLM API calls are routed to the Azure region aligned with your deployment geography.
If your environment is deployed in the EU, AI requests are processed within EU Azure regions. If deployed in the US, requests stay within US regions. This applies to both the Azure AI Foundry and Snowflake Cortex paths.
Each customer’s AI interactions are kept fully separate. Your prompts and metadata never influence or are visible to any other organization.
Network security
All API calls to external LLM providers are made over the public internet using TLS, so connections are encrypted end to end between the platform and the model provider.
Provider data handling
Ataccama’s agreements with model providers include the following protections:
- No model training on your inputs: Azure guarantees that prompts are not used to train or improve shared models.
- Customer isolation: Your interactions are processed independently. Prompts from one organization do not affect model behavior for any other.
- Encryption: All data is encrypted in transit between Ataccama and the provider.
Data use and privacy
No training on your data
Your data and metadata are never used for training or fine-tuning shared generative AI models.
This applies across the full stack:
- LLM providers: Azure guarantees that prompts are not used to train models. Ataccama does not perform any fine-tuning or model training.
- Ataccama platform: Generative AI interaction data (prompts, responses) is not used to train models. Your interactions do not influence the behavior of models used by other organizations.
> **Exception: Traditional machine learning.** Traditional machine learning models (business term suggestion, anomaly detection) are trained, but exclusively within your own environment on your own data. This local training is necessary for these models to produce meaningful results tailored to your specific data landscape. The trained model and its outputs remain isolated to your single instance.
Prompt and interaction logging
Ataccama retains logs of AI prompts and interactions for 30 days to support troubleshooting, service quality, and debugging. These logs are encrypted and stored securely.
Prompt logs are never used for model training. They exist solely for operational and support purposes.
Access controls and administration
Administrators retain full control over the availability and configuration of all AI features in their environment.
Role-based access control
All AI feature access is governed by the same role-based access control (RBAC) framework used across the Ataccama ONE platform. Access to the AI Agent and individual Gen AI capabilities can be restricted by role, ensuring only authorized personnel can interact with AI features and approve AI-generated outputs.
RBAC for the AI Agent is managed through the Ataccama identity provider (IDP). Administrators can control which user roles have access to the Agent and to specific tool categories.
Administrator controls
The following controls are available to administrators:
| Control | Scope | Description |
|---|---|---|
| Feature-level toggles | All AI features | Individual Gen AI features, the AI Agent, and traditional ML features can each be turned on or off independently. |
| Global AI disable | All Gen AI | The entire Gen AI service can be turned off as a whole. |
| Data access toggle | AI Agent | The data sampling and SQL query tools can be turned off, restricting the Agent to metadata-only operation. |
| Role-based access | All AI features | Access to AI features is controlled through RBAC. Administrators define which roles can interact with AI capabilities. |
| Business term suggestion | Traditional ML | Can be turned off globally or for individual business terms. |
| Anomaly detection | Traditional ML | Explicitly opt-in; users enable it per DQ monitoring project, profiling configuration, or catalog item. |
Transparency and auditability
Ataccama provides multiple layers of transparency into AI operations to support governance requirements and human oversight.
Execution visibility
You can view the AI Agent’s execution steps, review its reasoning at each stage, and inspect all proposed changes before they take effect. A Review changes interface allows you to examine modifications before accepting them.
Audit logging
The platform logs the complete lifecycle of AI Agent interactions, including:

- Your original prompt and the Agent's execution plan.
- Each individual tool call: which tool was invoked, what parameters were sent, and what was returned.
- The Agent's responses and final outputs.
All logs are encrypted. Audit data can be used for compliance reviews, incident investigation, and internal governance reporting.
Output validation
Where applicable, the platform validates AI-generated output before presenting it to users. For example, when the AI generates a DQ expression or SQL query, the platform checks that the output is syntactically valid and executable in the current context.
If validation fails, the platform iterates with the model to correct the issue before surfacing the result.
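A validate-then-iterate loop of this kind could look like the following. This is a hedged sketch under stated assumptions: `call_llm` is a stubbed stand-in, SQLite's `EXPLAIN` is used only as an example syntax check against a sample schema, and Ataccama's actual validators and retry logic are not shown here.

```python
import sqlite3

# Example schema to validate queries against (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id, email, country)")

def is_valid_sql(query):
    """Check that a query parses and resolves against the schema.

    EXPLAIN compiles the statement without executing it, so this catches
    syntax errors and unknown tables/columns without touching data."""
    try:
        conn.execute("EXPLAIN " + query)
        return True
    except sqlite3.Error:
        return False

def generate_validated_sql(prompt, call_llm, max_attempts=3):
    """Ask the model for SQL; on failure, feed the error back and retry."""
    for attempt in range(max_attempts):
        candidate = call_llm(prompt)
        if is_valid_sql(candidate):
            return candidate
        prompt += f"\nPrevious output was invalid SQL: {candidate}. Fix it."
    raise RuntimeError("could not produce valid SQL")

# Stub model: fails once, then corrects itself.
answers = iter(["SELEC email FROM customers", "SELECT email FROM customers"])
stub_llm = lambda prompt: next(answers)
print(generate_validated_sql("emails of all customers", stub_llm))
```

The important property is that invalid output never reaches the user: it either gets corrected within the retry budget or the operation fails visibly.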
Governance principles
Ataccama’s AI governance program is guided by the NIST AI Risk Management Framework and emphasizes the following objectives:
- Performance and accuracy: Continuous model evaluation and improvement. Output validation where feasible (expression validation, syntax checking).
- Fairness and bias mitigation: Regular testing to identify and correct potential biases in AI outputs.
- Explainability and transparency: Clear documentation for all AI-driven recommendations. Visible execution plans and reasoning for the AI Agent.
- Security and privacy: Encryption in transit for all LLM calls. Metadata-first data minimization. No model training on your data.
- Human oversight: All AI outputs are suggestions requiring user review. Changes are never applied automatically without user acceptance.
- Legal and regulatory compliance: Ongoing monitoring and alignment with AI-related laws and guidelines, including GDPR and other applicable data protection regulations.
All updates to AI features go through the Ataccama Software Development Lifecycle, with multiple checks including product review, automated testing, and manual quality assurance before release.
Compliance and certifications
Ataccama holds a SOC 2 certification, demonstrating adherence to established security, availability, and confidentiality controls. For details on Ataccama’s broader security posture, certifications, and compliance documentation, contact your Customer Success Manager or refer to the Ataccama Trust Center.
Ataccama complies with applicable data protection regulations, including GDPR, and requires all third-party AI service providers to uphold equivalent compliance standards.
Internal governance audits are conducted biannually, covering:
- Automated monitoring and manual reviews of AI feature behavior.
- Evaluations of model performance, fairness, and security practices.
- Documentation, reporting, and remediation procedures for identified issues.
Intellectual property and ownership
Your responsibilities
You are expected to use AI features responsibly, ethically, and lawfully. Key responsibilities include:
- Lawful and ethical use: Compliance with all applicable laws and ethical standards when using AI features.
- Precise prompting: Providing clear, specific goals and instructions when interacting with the AI Agent. Well-defined prompts help the Agent develop accurate plans and reduce unintended actions.
- Human oversight: Reviewing and validating all AI-generated outputs; only approve actions that align with your intentions. For this purpose, the Agent provides a "Review changes" option for inspecting modifications before they are finalized.
- Change validation: Carefully reviewing all modifications that the AI Agent makes to catalog items, rules, descriptions, and other platform assets before accepting and propagating them.
- Data and privacy management: Appropriate classification and secure management of data inputs to AI systems.
- Reporting and feedback: Prompt reporting of any identified issues such as biases, inaccuracies, or security concerns.
For best practices on using Generative AI features effectively, see Gen AI Best Practices.
Upcoming capabilities
Ataccama is actively working on expanding flexibility around AI model management:
- Bring Your Own Key or Bring Your Own Model (BYOK/BYOM): Planned for 2026, this capability will allow you to use your own model provider credentials or deploy your own model endpoints, giving you full control over which models process your data and metadata.
Material updates to AI features and governance policies are documented in the product release notes, with clear communication about changes and effective dates.