Filter Step
Filters out all records that don’t pass the ONE expression condition.
|
This step is available for standalone plans, embedded plans, and transformation catalog items. It is not available for transformation rules. |
Overview
The Filter step removes records from your data flow based on a condition. Only records that satisfy the condition continue to downstream steps; all other records are discarded.
Use this step to:
-
Remove invalid or incomplete records before processing.
-
Select a subset of data based on business criteria.
-
Exclude records with null values in required fields.
-
Apply quality thresholds to filter out low-quality data.
-
Limit data to specific categories, regions, or time periods.
Configuration
Define the filter condition
-
In the step configuration, enter your filter condition using a ONE expression.
-
The expression must evaluate to a Boolean value (
trueorfalse). -
Records where the condition evaluates to
trueare passed through. -
Records where the condition evaluates to
falseare removed.
|
Think of the filter condition as "keep records where this is true." |
Condition examples
status = 'Active'
Keeps only records with Active status.
order_date >= toDate('2024-01-01', 'yyyy-MM-dd') and order_date < toDate('2025-01-01', 'yyyy-MM-dd')
Keeps orders from 2024 only.
email is not null and email != ''
Removes records without email addresses.
amount > 100
Keeps only records above the threshold.
country in {'US', 'CA', 'MX'} and customer_type = 'Business'
Keeps business customers from North America.
matches(@"^[A-Z]{2}-[0-9]{4}$", product_code, false)
Keeps only records matching the expected code format.
quantity > 0 and price is not null and customer_id is not null
Keeps only complete, valid records.
Common filter patterns
Filter vs Condition step
Use Filter when you want to keep only matching records and discard the rest.
Use Condition when you need to route records to different processing paths based on a condition (both matching and non-matching records continue processing).
| Aspect | Filter | Condition |
|---|---|---|
Outputs |
Single output (matching records only) |
Two outputs ( |
Non-matching records |
Discarded |
Routed to |
Use case |
Remove unwanted data |
Route data to different processing branches |
Best practices
- Filter early
-
Apply filters early in your transformation plan to reduce the amount of data processed by downstream steps, improving performance.
- Be explicit about nulls
-
Null values can cause unexpected behavior. Explicitly handle nulls in your conditions:
field is not null and field = 'value'. - Test edge cases
-
Verify your filter logic handles edge cases correctly, including null values, empty strings, and boundary conditions.
- Document filter purpose
-
Add a description to explain why records are being filtered. This helps others understand the business logic.
Was this page useful?