Search using AQL Expressions
Ataccama Query Language (AQL) search can be used for all entities in ONE that do not have search configurations (that is, that do not have the full-text search configured). For entities that do have search configurations, it is still possible to use AQL after selecting it in the search bar. Keep in mind that this can only be done from the All tab on the entity listing screen.
Overview
AQL aims at covering basic search and filtering functionality through easy-to-write conditions.
Every listing in AQL can be narrowed down by providing a filter. Its usage boils down to identifying a property and applying a condition on it. The condition does not specify which entity it is used for; that information comes from the context.
For example, if we want to filter sources whose names contain postgres
as a substring, we could use the following query:
query {
sources(
filter: "name like 'postgres'"
versionSelector: {draftVersion: true}
) {
totalCount
}
}
You can think of these filters as an SQL WHERE
clause.
While the syntax resembles that of SQL in the case of the easiest conditions, AQL is more powerful when it comes to placing conditions on related entities as well.
For example, if you want to filter sources whose names contain a substring postgres
and are referenced by at least two catalog items that each contain an attribute of data type STRING
, you could use the following:
query {
sources(
filter: "name like 'postgres' and @catalogItem (source).count(attributes.any(dataType = 'STRING') ) > 2"
versionSelector: { draftVersion: true }
) {
totalCount
}
}
Or, if you want to find an entity with a specific ID:
query {
sources(
filter: "$id = '#gid#'"
versionSelector: {draftVersion: true}
) {
totalCount
}
}
/*This AQL query uses a semi-expression.
A semi-expression is a pattern, into which a result of a ONE Runtime expression is inserted.
In this case, "#id#" is a semi-expression, which is evaluated by ONE Runtime as the value of the column `gid`, for example, `813adb4c-7746-49ce-gv46-a00d2454432e``, and the value of the semi-expression is translated to `$id` and used as a filter.*/
Combining conditions
Multiple simpler conditions can be expressed at the same time using one of logical operators:
-
NOT
(alias:!
) - Requires the condition not to be satisfied. -
AND
(&
,&&
) - Requires both conditions to be satisfied. -
OR
(|
,||
) - Requires at least one condition to be satisfied.Example 1. Examples on sources-
name like "mysql" and profilingLimitAbsoluteMax > 100
-
name like "mysql" or profilingLimitAbsoluteMax < 100
-
The precedence of the logical operators is as noted but can be altered by using parentheses. Parentheses can be freely put around any condition.
For example, if you have the following conditions:
-
a is 1 and not b is 2 or c is 3
-
(a is 1) and not (b is 2 or c is 3)
-
a is 1 and (not (b is 2) or (c is 3))
Let’s see what would happen if you applied them on the following data:
-
a = 0, b = 0, c = 0
- Matched by none. -
a = 0, b = 2, c = 3
- Matched by 1. -
a = 1, b = 2, c = 3
- Matched by 1, 3. -
a = 1, b = 0, c = 0
- matched by 1, 2, 3.
Nullability
Null means that a value is not present, which is different from a situation when it is, for example, an empty string (""
), zero (0
), or false
.
You can test the nullability of these types of properties:
-
Scalar
-
Single embedded
-
Reference
-
propertyName IS NULL
-
propertyName IS NOT NULL
-
name is not null
(for example, on sources) -
profilingResult is not null
(on catalog items, single embedded) -
source is not null
(on catalog items, reference)
There are other equality and inequality operators, see the following sections. |
Scalar property conditions
Scalar conditions can test whether a value equals a constant, is greater than a constant, or looks like a constant.
They can also be used with the count
aggregation operator.
Currently, it’s not possible to have an expression on the right side of a condition. |
Scalar conditions can also refer to scalar system properties:
* $id
- STRING
property of the entity’s GID
.
* $fulltext
- Combines all properties into a single STRING
.
* $path
- STRING
property of the entity’s entity path.
Primitive values (constants)
A constant value used in scalar conditions can be of type:
-
Boolean - Possible values are
true
andfalse
(case-insensitive). -
Number - Must be JSON5-compatible.
-
Positive and negative:
Infinity
,NaN
(case-sensitive). -
Hexadecimal integer:
0x0
,0XBA0BAB
,0xc0c0a
. -
Decimal integers:
0
,42
,-175
. -
Decimal real numbers:
5.05
,-17.666
,15.
,.5
. -
Decimal real numbers with exponent:
15.5e3
,5e-2
.
-
-
String - Must be JSON5-compatible:
-
Single-quoted:
'cat'
,'single'quote'
,'with new\nline'
. -
Double-quoted:
"dog"
,"double\"quote"
,"with\ttab"
.
-
The range of numbers is not checked by the syntax, but it is checked by the receiving scalar type. |
Binary operators
You can compare a value of a scalar property using common comparison operators:
-
Equal
=
(alias:is
,eq
,==
). -
Not equal
!=
(is not
,ne
,neq
,<>
). -
Less than
<
(lt
). -
Greater than
>
(gt
). -
Less than or equal
<=
(le
,lte
). -
Greater than or equal
>=
(ge
,gte
). -
Like
like
(match
,~
).
Not all operators can be used for all scalar data types:
-
Equal and not equal work for all types.
-
Less than, greater than, and their equal variants work for
INTEGER
,LONG
,FLOAT
,DOUBLE
. -
Like works for
STRING
. Currently, it searches for a substring and it is case-insensitive.
-
booleanProperty IS TRUE
-
longProperty > 5
-
stringProperty LIKE 'prop'
-
$id = "0e40b980-d671-4021-876f-a5dbb3d2052c"
-
profilingLimitEnabled is true
-
profilingLimitAbsoluteMax = 100
-
name == "mysql aws"
-
$fulltext like "party"
In list operator
To search for a value that is in a list or not, use an IN
operator.
The list of values must contain at least one value. All values in the list must be either of type Boolean, number, or string. |
-
stringProperty IN ('dog', 'cat')
-
longProperty NOT IN (2, 3, 5, 7)
-
name in ("mysql aws", "postgres local")
(on sources)
Combining scalar conditions
You can combine multiple conditions on the same scalar property using logical operators and parentheses. This is equivalent to distributing the property to all conditions.
Scalar conditions cannot be combined with nullability conditions. In general, we recommend using scalar conditions sparingly as the feature can be confusing. |
-
longProperty > 5 and < 10
is equivalent tolongProperty > 5 and longProperty < 10
. -
stringProperty (like 'burger' or 'sandwich') and not like 'cheese'
is equivalent to(stringProperty like 'burger' or stringProperty like 'sandwich') and not stringProperty like 'cheese'
.
Single property conditions
It is possible to make conditions on a single embedded, single reference, or parent entity property, and combine them with other conditions.
The full syntax is:
-
singleEmbeddedProperty { nestedCondition }
-
referenceProperty { nestedCondition }
-
$parent { nestedCondition }
The part between the brackets can be any condition described in this article applied to the respective entity.
If the nested condition does not need any combining operators, you can use the shorthand syntax:
-
singleEmbedded.longProperty = 2
-
singleReference.booleanProperty IS TRUE
-
$parent.stringProperty = 'x'
Take into account the following:
* Single embedded and reference properties can be tested for nullability, the parent entity ones cannot.
* The long and shorthand syntaxes can be combined: |
-
$parent { name like "party" & numberOfProfiledRecords > 100} and dataType = 'STRING'
(on catalog item attributes) -
source.$fulltext like "m"
(on catalog items) -
source.name like "mysql"
Array property conditions
Queries for array embedded properties, array referenced properties, and back references can also include conditions. The nested condition is aggregated and the number of matching entities is evaluated.
Similarly as for single properties, there are two possible syntaxes:
-
arrayPropertyName { aggregation( nestedCondition ) }
- Allows to combine multiple aggregation conditions on one array property. -
arrayPropertyName.aggregation( nestedCondition )
- Requires only one aggregation.
Aggregation operators
The following operators can be used in aggregations:
-
all( nestedCondition )
- Requires a nested condition. All entities must be matched. -
none( nestedCondition )
- Requires a nested condition. No entities must be matched. -
any( nestedCondition )
,some( nestedCondition )
- The nested condition is optional. Requires at least one matching entity. -
count( nestedCondition ) scalarCondition
- The nested condition is optional. The number of required matched entities is tested with a scalar condition.
Array properties cannot be tested for nullability.
If the array is empty (that is, it contains no nested entities), the all operator is true , and the number of matching entities is the same as the number of all entities.
|
-
arrayPropertyName.all( booleanProperty IS TRUE )
-
arrayPropertyName.none( stringProperty IS NULL )
-
arrayPropertyName.some( longProperty > 100 )
-
arrayPropertyName.any()
-
arrayPropertyName.count( longProperty > 100 ) = 1
-
arrayPropertyName.count() > 5 and < 10
-
credentials.count() > 1
(on sources) -
credentials.some(name like "correct")
-
credentials.none(name is not "correct")
-
credentials.all(name is not "incorrect")
-
credentials.any($fulltext like "a")
-
source.credentials.some(username like "admin")
(on catalog items)
Back references
You can also define conditions on a set of incoming references by expressing them on the origin entities. This mechanism uses the same aggregation operators as the previously described array embedded properties.
To do this, specify the referencing entity and the referencing property (both are optional):
-
@ referencingNode ( referencingProperty )
- ThereferencingProperty
defined inreferencingNode
points to the current entity. -
@ referencingNode ()
- Any reference property defined in the referencing entity points to the current entity. -
@ ( referencingProperty )
- ThereferencingProperty
defined in any entity points to the current entity. -
@ ()
- Any property defined in any entity points to the current entity.
Although optional, we recommend always specifying the referencingNode and its referencingProperty to avoid unexpected behavior.
|
Currently, different properties of different nodes are evaluated separately due to a known issue. |
-
@source(tag).any()
- Checks that the current entity is referenced at least once by asource
entity via thetag
property. -
@source(tag) { count(valid IS TRUE) > 2 or any(master IS TRUE and priority > 9000 ) }
- Checks that the current entity is referenced by asource
entity via thetag
property at least twice such that it isvalid
, or at least once such that it ismaster
and its priority is greater than9000
. -
@catalogItem (source).count() == 0
- Not referenced by catalog items. Used on sources. -
@catalogItem (source).count({name = "party"}) > 0
- Referenced at least once with a catalog item named "party". Used on sources.
AQL timestamp conditions
All standard operators are supported:
-
<property> is null
-
<property> is not null
-
<property> in (<value1>, <value2>, …)
-
<property> not in (<value1>, <value2>, …)
-
<property> = <value> , <property> != <value>
-
<property> < <value>, <property> > <value>
-
<property> ⇐ <value>, <property> >= <value>
The value on the right side can be provided in one of two possible formats. The value is always expressed as a string.
Absolute value
The absolute value is a generalized ISO-6801
format:
(YYYY-MM-DD)? # Optional date part
[T ]? # Optional date time separator
(hh:mm(:ss)?(.sss)?)? # Optional time part
(Z|[+-]h(h(:?mm(:?ss)?)?)?)? # Optional time zone part
The value is then parses in the following order:
-
OffsetDateTime
- All parts are mandatory. -
LocalDateTime
- Date and time. The server time zone is applied. -
OffsetTime
- Time and time zone. The current date in the provided time zone is used. -
LocalTime
- Time. The current date in the server time zone is used. -
LocalDate
- Date. The beginning of the day in the server time zone is used.
We strongly recommend specifying all parts unless when used for ad-hoc queries. |
Relative value
The timestamp starts with now
representing the current timestamp.
The format uses the following units: second
, minute
, hour
, day
, week
, month
, year
.
The current timestamp can be divided by a unit and shifted by a multiple of a unit:
---
now
(TZ)? (/ UNIT)?
([+-] AMOUNT UNITs?)?
---
The division truncates the timestamp to the beginning of the particular unit. It can be used to find everything created since midnight of the current day, or in the current month, in comparison to finding everything created in the last 24 hours or the last 30 to 31 days.
You need to provide the time zone for units day
, week
, month
, year
so that the time when the day starts can be derived.
If the time zone is not provided, the server time zone is used.
The format of the time zone is the same as for the absolute value and the beginning of a week is always Monday.
The addition or subtraction is applied after the division (if the division was specified).
The amount
is an integer and the unit
can be in plural.
Example use cases and queries
- Find all catalog items that come from particular schema
-
Query:
schema is '<schema_name>'
or$parent.originPath is '<schema_name>'
- Find all attributes by their name or assigned term
-
Query:
-
attributes.any(name is '<attribute_name>')
-
termInstances.any(displayName is '<term_name>')
for catalog item terms andattributes.any(termInstances.any(displayName is '<attribute_name>'))
-
- Find all tables without owner/DQ/profiles (that is, tables that are not managed yet)
-
Query:
validityAggr.overallQuality != NULL
;profiles.count() = 0
- Find all attributes without term assigned
-
Query:
attributes.any(termInstances.count() == 0)
- Find all tables with data quality under 80%
-
Query:
validityAggr.overallQuality < 80
- Find all tables with more than one million records
-
Query:
numberOfRecords > 1000000
orprofiles.any(totalNumOfRecords > 1000000)
The second option also returns tables that had a million records at least once during the profiling. - Reference to an AQL timestamp filter with absolute values
-
Example:
timestamp < '2022-06-05T17:15:15+02:00' # As denoted timestamp > '2022-01-01' # Midnight in the server time zone timestamp = '11:35Z' # Today at 11:35 in UTC
- Relative value AQL timestamp queries
-
Example:
now # simply now now +2 / day + 8 hours # 8 am today in Prague, Czech Republic now - 24 hours # 24 hours ago now / year # Midnight of January 1st in the server time zone
Was this page useful?