Automate Recurring Data Processing
Schedule transformation plans to run automatically — nightly exports, weekly aggregations, monthly reporting feeds — without manual intervention. Every scheduled run applies the same logic for consistent results, and execution history is captured in the Processing Center for auditing.
Why automate recurring processing
Manual data operations consume time and create risk:
- Manual exports require ongoing attention: Someone has to remember to run each job, and errors can go unnoticed.
- Transformations apply inconsistently: Different people might run slightly different logic.
- Errors are detected late: Problems are typically caught only after downstream systems break or reports show wrong data.
- Reports are not ready on time: Manual runs delay availability at the start of the business day.
- Manual processing does not scale: Each new processing job adds setup and maintenance overhead.
Scheduled transformation plans eliminate these issues by running transformations automatically on your schedule.
How scheduled plans automate transformations
Setting up a scheduled transformation involves four parts:
- Build your transformation logic on the visual canvas.
- Configure output destinations, such as database tables, file storage, or reference data.
- Set a schedule, such as daily at 2 AM, weekly on Monday, or a custom cron expression.
- Monitor execution in the Processing Center.
Once scheduled, transformations run automatically without manual intervention.
Example: Automate nightly customer data export
A partner system requires a daily export of customer data to Amazon S3.
The source table customers looks like this:
| customer_id | first_name | last_name | email | phone | created_date | status | customer_type |
|---|---|---|---|---|---|---|---|
| C-001 | Jane | Smith | | 555-123-4567 | 2021-03-10 | | |
| C-002 | John | Doe | | (555) 234-5678 | 2020-07-15 | | |
| C-003 | Alice | Brown | | 555.345.6789 | 2022-11-01 | | |
| C-004 | Bob | Lee | | 555-456-7890 | 2023-01-20 | | |
What the partner expects:
- Only `ACTIVE` `ENTERPRISE` and `BUSINESS` customers with a valid email.
- Phone as digits only.
- `created_date` as ISO string `yyyy-MM-dd`.
- `first_name` and `last_name` combined into `full_name`.
Expected output:
| customer_id | full_name | email | phone | created_date | account_status |
|---|---|---|---|---|---|
| C-001 | Jane Smith | | 5551234567 | 2021-03-10 | |
| C-002 | John Doe | | 5552345678 | 2020-07-15 | |
C-003 is excluded (INACTIVE, CONSUMER type).
C-004 is excluded (no email).
Step 1: Create the transformation plan
- Go to Data Quality > Data transformations.
- Select Create transformation plan > Standalone plan. Standalone plans run independently on a schedule and write output to a destination, which is what you need here.
- Name your plan, for example, `Nightly Customer Export to Partner`.
- Add a description explaining what it exports and when. For example: `Export active customer data to partner S3 bucket daily at 2 AM.`
- Select Create.
Step 2: Add input and filter
- Add a Catalog item input step and select your source catalog item.
- Add a Filter step to keep only the records the partner needs. In this case, keep `ACTIVE` `ENTERPRISE` and `BUSINESS` customers with a valid email.

Use AI assistance in the expression field: describe the condition in plain language and it will generate the syntax for you.
Monitor your filter results. If more records than expected are excluded, check your filter expression and verify the source data hasn’t changed unexpectedly. Consider outputting excluded records to a separate logging table during initial runs to confirm the filter behaves as intended.
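The filter condition and the suggestion to log excluded records can be sketched outside the product. The Python below is only an illustration of the logic, not the platform's expression syntax. The email and `customer_type` values, the `EMAIL_RE` pattern, and the `exclusion_reasons` helper are all hypothetical, since the source does not define what counts as a valid email.

```python
import re

# Hypothetical sample rows: email and customer_type values are placeholders
# invented for this sketch, not values taken from the source table.
ROWS = [
    {"customer_id": "C-001", "email": "jane@example.com", "status": "ACTIVE", "customer_type": "ENTERPRISE"},
    {"customer_id": "C-002", "email": "john@example.com", "status": "ACTIVE", "customer_type": "BUSINESS"},
    {"customer_id": "C-003", "email": "alice@example.com", "status": "INACTIVE", "customer_type": "CONSUMER"},
    {"customer_id": "C-004", "email": None, "status": "ACTIVE", "customer_type": "BUSINESS"},
]

# Deliberately loose check (something@something.tld). This is an assumption.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def exclusion_reasons(row):
    """Return every reason a row fails the filter (an empty list means keep)."""
    reasons = []
    if row["status"] != "ACTIVE":
        reasons.append("status is not ACTIVE")
    if row["customer_type"] not in {"ENTERPRISE", "BUSINESS"}:
        reasons.append("customer_type is not ENTERPRISE or BUSINESS")
    if not row["email"] or not EMAIL_RE.match(row["email"]):
        reasons.append("missing or invalid email")
    return reasons


kept = []
excluded = {}  # customer_id -> reasons, the equivalent of a logging table
for row in ROWS:
    reasons = exclusion_reasons(row)
    if reasons:
        excluded[row["customer_id"]] = reasons
    else:
        kept.append(row)

print("kept:", [r["customer_id"] for r in kept])
for customer_id, reasons in excluded.items():
    print("excluded:", customer_id, reasons)
```

Recording a reason per excluded row makes it easier to tell a filter that is too strict from source data that has changed.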
Step 3: Transform the data
- Add a Transform data step to standardize the formats the partner requires. For this example: lowercase email, digits-only phone, uppercase customer ID.
- Add an Add attributes step to derive new columns:
  - `full_name`: Combined from `first_name` and `last_name`.
  - `created_date_str`: The date converted to an ISO string for the CSV export.
- Add a Delete attributes step to remove columns that should not reach the partner:
  - `first_name`, `last_name`: Replaced by `full_name`.
  - `customer_type`: Not needed by the partner.
  - Original `created_date`: Replaced by `created_date_str`.
AI assistance is available in any expression field. For example: "Combine first_name and last_name into a single full_name attribute" or "Remove all non-digit characters from a phone number."
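As a rough illustration of what the three steps above do to a single record, here is a minimal Python sketch. It is not the platform's expression language. The `digits_only` and `transform` helpers are hypothetical, and the email and `customer_type` values in the sample are placeholders.

```python
import re
from datetime import date


def digits_only(phone):
    """Remove every non-digit character from a phone number."""
    return re.sub(r"\D", "", phone)


def transform(row):
    """Apply the standardize, add-attributes, and delete-attributes steps."""
    out = dict(row)
    # Transform data: standardize formats.
    out["email"] = out["email"].lower()
    out["phone"] = digits_only(out["phone"])
    out["customer_id"] = out["customer_id"].upper()
    # Add attributes: derive new columns.
    out["full_name"] = f'{row["first_name"]} {row["last_name"]}'
    out["created_date_str"] = row["created_date"].isoformat()  # yyyy-MM-dd
    # Delete attributes: drop columns the partner should not receive.
    for name in ("first_name", "last_name", "customer_type", "created_date"):
        del out[name]
    return out


# Hypothetical input: email and customer_type are invented for the example.
sample = {
    "customer_id": "c-002",
    "first_name": "John",
    "last_name": "Doe",
    "email": "John.Doe@Example.com",
    "phone": "(555) 234-5678",
    "created_date": date(2020, 7, 15),
    "status": "ACTIVE",
    "customer_type": "BUSINESS",
}

result = transform(sample)
print(result)
```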
Step 4: Configure the output destination
Add the appropriate output step for your destination:
Option A: Export to file storage
Use when a partner or downstream system expects files (CSV) delivered to S3 or other file storage.
Add a File export step and configure:
- Connection: Your S3 or file storage connection (ensure write credentials are configured).
- File path and format: Set the destination path and file format.
Option B: Write to database table
Use when the output needs to be available for querying or reporting directly from a database.
Add a Database output step and configure:
- Connection and target table: Select the database connection and destination table.
- Write strategy: Insert, Update, or Merge.
Option C: Load to reference data table
Use when the output will serve as reference data for lookups and validations across your organization.
Add a Reference data output step and configure:
- Target table: Select the reference data table or create a new one.
- Update method: Upsert, Append, or Replace.
- Matching attributes: Required for Upsert.
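To make the difference between the three update methods concrete, the sketch below models them on in-memory lists of records. It reflects the common meaning of these terms and assumes the product behaves the same way, which the source does not confirm. The `load` function and the sample `code`/`label` attributes are hypothetical.

```python
def load(target, incoming, method, match_on=None):
    """Return the table contents after applying one update method."""
    if method == "replace":
        # Discard existing rows and keep only the incoming ones.
        return list(incoming)
    if method == "append":
        # Keep existing rows and add every incoming row, duplicates included.
        return list(target) + list(incoming)
    if method == "upsert":
        if not match_on:
            raise ValueError("upsert requires matching attributes")

        def key(row):
            return tuple(row[name] for name in match_on)

        # Rows with a matching key are overwritten; new keys are added.
        merged = {key(row): row for row in target}
        merged.update({key(row): row for row in incoming})
        return list(merged.values())
    raise ValueError(f"unknown method: {method}")


existing = [{"code": "A", "label": "old"}, {"code": "B", "label": "b"}]
new_rows = [{"code": "A", "label": "new"}, {"code": "C", "label": "c"}]

upserted = load(existing, new_rows, "upsert", match_on=["code"])
appended = load(existing, new_rows, "append")
replaced = load(existing, new_rows, "replace")
print(upserted)
```

The sketch also shows why matching attributes are needed for an upsert: without a key there is no way to decide which existing row an incoming row should overwrite.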
Step 5: Preview and validate
- Select Compute data preview and verify the output at each step. Check that the filter kept the right rows, transformations look correct, and derived columns are as expected.
- Select Validate plan and fix any errors before proceeding.
Step 6: Test run the plan manually
- From the three-dot menu, select Run plan.
- Monitor the execution in the Processing Center.
- Verify the output in the destination:
  - File export: Confirm the file appears in S3 with the expected data.
  - Database output: Query the target table to verify records.
  - Reference data output: Review the loaded records in the reference data table.
- Fix any issues before scheduling.
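For a file export, part of the verification can be scripted once the file has been downloaded. The sketch below is one possible check, assuming a CSV with a header row and the column names from the expected output table. The `check_export` function and both sample strings are hypothetical.

```python
import csv
import io
import re

# Column names taken from the expected output table in this example.
REQUIRED_COLUMNS = ("customer_id", "full_name", "phone", "created_date")


def check_export(csv_text):
    """Return a list of problems found in an exported CSV (empty list = OK)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        return [f"missing columns: {missing}"]

    rows = list(reader)
    problems = []
    if not rows:
        problems.append("file has a header but no records")
    for line_no, row in enumerate(rows, start=2):  # line 1 is the header
        if not row["phone"].isdigit():
            problems.append(f"line {line_no}: phone is not digits only")
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", row["created_date"]):
            problems.append(f"line {line_no}: created_date is not yyyy-MM-dd")
    return problems


good = "customer_id,full_name,phone,created_date\nC-001,Jane Smith,5551234567,2021-03-10\n"
bad = "customer_id,full_name,phone,created_date\nC-002,John Doe,(555) 234-5678,15/07/2020\n"

print(check_export(good))
print(check_export(bad))
```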
Step 7: Schedule the plan
- Open the plan and select Schedule from the three-dot menu.
- Configure the recurrence:

  | Setting | Configuration for this example |
  |---|---|
  | Repeat | Daily |
  | Time | 2:00 AM |
  | Timezone | Your organization’s timezone |
  | Effective from | Today’s date or a future start date |

- Select Schedule to save.
The plan now runs automatically on your configured schedule.
Common scheduling patterns
| Pattern | Configuration | Typical use |
|---|---|---|
| Nightly | Daily, 02:00 AM | Reporting tables, data warehouse loads, partner exports |
| Weekly | Weekly, Monday 08:00 AM | Sales summaries, compliance reports |
| End of month | Cron: | Financial close data, billing exports |
| Every 6 hours | Cron: | Frequent reference data updates, operational dashboards |
Schedule transformations to run after source systems have finished loading — add buffer time to handle occasional upstream delays.
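To check what the two cron-based patterns mean in terms of actual run times, the following Python sketch may help. It does not parse cron expressions. In standard five-field cron, an every-six-hours schedule is usually written `0 */6 * * *`, while "last day of the month" has no portable form (some schedulers, such as Quartz, provide an `L` token). Which cron dialect this product accepts is not stated here, so treat both as assumptions. The helper names are hypothetical.

```python
import calendar
from datetime import date


def every_n_hours(n):
    """Run times in a day for an 'every n hours' pattern aligned to midnight."""
    return [f"{hour:02d}:00" for hour in range(0, 24, n)]


def is_last_day_of_month(day):
    """True if the given date is the final calendar day of its month."""
    last_day = calendar.monthrange(day.year, day.month)[1]
    return day.day == last_day


print(every_n_hours(6))
print(is_last_day_of_month(date(2024, 2, 29)))  # leap year
print(is_last_day_of_month(date(2023, 2, 28)))  # non-leap year
```

Because month lengths vary, an end-of-month schedule is worth testing around February and 30-day months before relying on it.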
Handle dependencies and timing
When transformation plans depend on other processes or on each other, timing becomes part of the design.
Wait for upstream jobs to complete
Schedule your transformation to run after source data is ready. If source systems finish loading by 1:00 AM, schedule the transformation for 2:00 AM. Add buffer time to handle occasional upstream delays and monitor upstream completion times if they vary.
Chain multiple transformation plans
For complex workflows with multiple stages, run plans in sequence with buffer time between each:
| Plan | What it does | Schedule |
|---|---|---|
| First | Extract and cleanse raw data, write to staging table | 1:00 AM |
| Second | Transform staged data, write to final table | 3:00 AM |
| Third | Export final table to partner | 4:00 AM |
Use consistent naming for related plans — for example, Customer Pipeline - 1 Extract, Customer Pipeline - 2 Transform, Customer Pipeline - 3 Export — to make dependencies clear.
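Start times for a chain like this can be derived from how long each stage usually takes plus a buffer. The sketch below shows the arithmetic. The durations (90 and 30 minutes), the 30-minute buffer, the calendar date, and the `chain_start_times` helper are hypothetical values chosen so the result lines up with the table above.

```python
from datetime import datetime, timedelta


def chain_start_times(first_start, stages, buffer):
    """Return (name, start) pairs; each stage starts after the previous
    stage's expected duration plus a safety buffer."""
    schedule = []
    start = first_start
    for name, expected_duration in stages:
        schedule.append((name, start))
        start = start + expected_duration + buffer
    return schedule


# Hypothetical durations; measure real ones from past runs before using this.
stages = [
    ("Customer Pipeline - 1 Extract", timedelta(minutes=90)),
    ("Customer Pipeline - 2 Transform", timedelta(minutes=30)),
    ("Customer Pipeline - 3 Export", timedelta(minutes=15)),
]

plan = chain_start_times(datetime(2024, 1, 15, 1, 0), stages, timedelta(minutes=30))
for name, start in plan:
    print(start.strftime("%H:%M"), name)
```

If upstream durations vary widely, size the buffer from the slowest observed runs, not the average.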
Troubleshooting scheduled transformations
Plan fails on schedule but works manually
Problem: The scheduled run fails, but running the plan manually from the three-dot menu succeeds.
Cause: One of the following:
- Source data is not ready when the scheduled run starts.
- Database connection issues occur during the scheduled time window.
- Resource contention occurs with other scheduled jobs.

Solution:

- Adjust the schedule to run later, when source data is guaranteed to be ready.
- Check the Processing Center logs for the specific error message.
- Stagger schedule times to avoid resource conflicts.
Exported files are missing or incomplete
Problem: A scheduled export completes without errors, but the output file is missing, empty, or contains fewer records than expected.
Cause: One of the following:
- The source catalog item was empty when the plan ran.
- The Filter step removed all records.
- The output path or credentials are misconfigured.

Solution:

- Review filter conditions to ensure they aren’t too restrictive.
- Verify output destination credentials and paths are correct.
Schedule doesn’t run at the expected time
Problem: The plan does not execute at its configured time.
Cause: One of the following:
- Timezone misconfiguration.
- The schedule is paused.
- The Effective from date is in the future.

Solution:

- Verify the timezone matches your organization’s location.
- Check that the schedule toggle is active in the plan overview.
- Review the Effective from date in schedule settings.
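A timezone mismatch is easy to underestimate, so it can help to convert the configured time to UTC for each candidate timezone and compare. The sketch below uses fixed UTC offsets to stay dependency-free. That simplification ignores daylight saving time, and the offsets, the date, and the `as_utc` helper are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def as_utc(local_naive, utc_offset_hours):
    """Interpret a naive local time at a fixed UTC offset and convert to UTC."""
    local = local_naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    return local.astimezone(timezone.utc)


scheduled = datetime(2024, 1, 15, 2, 0)  # "2:00 AM" as entered in the schedule

intended = as_utc(scheduled, -5)   # what the team meant (UTC-5)
configured = as_utc(scheduled, 1)  # what was actually selected (UTC+1)

print("intended run (UTC):  ", intended.strftime("%Y-%m-%d %H:%M"))
print("configured run (UTC):", configured.strftime("%Y-%m-%d %H:%M"))
print("difference:", intended - configured)
```

In this example the plan would run six hours earlier than intended, possibly before the source data has finished loading.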