ONE Expressions
This article describes expressions used within ONE steps. Places where the expressions can be used are mentioned in the description sections of the appropriate steps.
Operands
An operand is any expression with a type corresponding to a valid type for a given operation. Operands can be divided into four categories:
-
Literals - Numeric, string, or logical constants (
TRUE
,FALSE
,UNKNOWN
(deprecated), all keywords are case-insensitive) and null literal (case-insensitive). -
Columns - Defined by their names. If there is a space character in the column name, the name must be enclosed in square brackets (
[]
). If the step retrieves data from multiple inputs, the column names are specified using the dot notation format, that is,input_name.column_name
. If the step uses only one input, the dot notation may be omitted. -
Set - Can be used only in combination with the
IN
operation where the set represents a constant expression. A set can only occur on the right side of theIN
operation. -
Complex expressions.
Expression operands can be of a defined column type, such as INTEGER
, FLOAT
, LONG
, STRING
, DATETIME
, DAY
, and BOOLEAN
.
If a number assigned to either an INTEGER
or LONG
variable overflows or underflows the interval of permitted values for that type (that is, -2147483648;+2147483647
for INTEGER
and -9223372036854775808;+9223372036854775807
for LONG
,) then the number wraps around the interval.
For example, the value 2147483649
assigned to an INTEGER
variable is interpreted as -2147483647
.
Operands are automatically converted to a wider type if needed.
This is relevant for numeric data types INTEGER
, LONG
, and FLOAT
(widening INTEGER
⇒ LONG
⇒ FLOAT
) and datetime types DAY
and DATETIME
(DAY
⇒ DATETIME
).
In case of comparisons, set, and conditional operations, all operands are first converted to the most general type and then the operation is performed.
Handling null values
Operations and functions handle arguments with null value according to SQL rules.
One important exception is for STRING
data type.
A null string and an empty string are considered equal, so null string arguments are handled as empty (zero length) strings.
For example, "abc" == NULL
, or even "`abc" > NULL`, is a legal comparison and gives non-null Boolean result analogous to "abc" == ""
, or "abc" > ""
respectively, whereas in SQL language these expressions give both null (UNKNOWN
) value.
Variables
The expression can be formed as a sequence of assignment expressions followed by one resulting expression.
Multiple expressions are delimited by a semicolon (;
).
Assignment expressions have the following syntax: variable := expression
.
The first occurrence of a variable on the left-hand side defines this variable and its type. A reference to a variable in an expression is valid only after its definition. Each following occurrence of a variable, including on the left-hand side of an assignment expression, must conform to the variable’s type.
a := 2;
b := 4 - a;
3 * b
Operations and functions
ONE provides the following function and operation categories:
-
Arithmetic operations
-
Logical operations
-
Comparison operations
-
Set operations
-
Other operations
-
Date functions
-
String functions
-
Bitwise functions
-
MinMax functions
-
Aggregating functions
-
Conditional expressions
-
Conversion and formatting functions
-
Word set operation functions
All functions and operations that do not have the "locale" parameter set or defined are set to use the default ONE locale.
The step locale setting does not influence this behavior.
|
Arithmetic operations
This category includes common arithmetic operations—addition, subtraction, multiplication, and division.
The result of an arithmetic operation applied to the types INTEGER
and/or LONG
is always INTEGER
or LONG
.
The result is of type LONG
if at least one operand was of type LONG
.
The type NUMBER in the description of input and output types stands for data types INTEGER , LONG , or FLOAT .
|
Operator | Usage | Description | ||
---|---|---|---|---|
- |
|
Subtraction of numeric operands |
||
- |
|
Negation of numeric operand
|
||
/ |
|
Division of numeric operands |
||
* |
|
Multiplication of numeric operands |
||
% |
|
Modulo: the remainder after numerical division of |
||
+ |
|
Addition of numeric operands |
||
div |
|
Division of integer operands without a remainder. |
Logical operations
Common logical operations AND, OR, XOR, and NOT (all keywords are case-insensitive).
Operator | Usage | Description |
---|---|---|
AND |
|
Logical conjunction. |
NOT |
|
Logical negation. |
OR |
|
Logical sum. |
XOR |
|
Exclusive OR. |
Comparison operations
Relational operators.
Operator | Usage | Description |
---|---|---|
< |
|
Tests if the value |
<= |
|
Tests if the value |
<>, != |
|
Tests the negated equivalence of two values. |
=, == |
|
Tests the equivalence of two values. |
> |
|
Tests if the value |
>= |
|
Tests if the value |
Set operations
For sets, a few basic operations--IN
, IS IN
, NOT IN
, IS NOT IN
-- are implemented.
Set members are literals of types defined for columns or column names themselves.
Operator | Usage | Description |
---|---|---|
in |
|
Tests whether |
is in |
|
Tests whether |
is not in |
|
Tests whether |
not in |
|
Tests whether |
company IN {"Smith inc.", "Smith Moving inc.",
"Speedmover inc.", [candidate column], clear_column}
a IN {1, 2, 5, 10}
b IN {TRUE, FALSE}
Other operations
Function | Usage | Description | ||
---|---|---|---|---|
getParameterValue |
string getParameterValue(string key [, string default]) |
The function returns the value of the property or the default value if there is no property with that key. All property names must have a prefix defining where the property should be read from. The following prefixes are available:
|
||
getRuntimeVersion |
string getRuntimeVersion() |
The function returns the product version. |
||
is |
a is b |
Tests if A typical use is |
||
is not |
a is not b |
Tests if A typical use is |
||
setParameterValue |
string setParameterValue(string key, string value) |
The function sets the value of the property.
All property names must have a prefix defining the category where the property should be stored to.
Not all property categories allow the definition of new properties (for example, Java system properties - prefix A typical use is |
||
geoDistance |
float geoDistance(float lat1, float long1, float lat2, float long2) |
Calculates the shortest distance (in meters) between two points on Earth. A point is defined by its latitude and longitude in degrees.
|
||
namedSequence |
integer sequence(string name [, integer start= |
Generates the next number each time it is called. Behind the scenes, the following happens: when called, the operation checks whether it was already called with the given name in the running plan or in the online server.
If it was not called previously, it creates a counter object that stores the last generated number; in the first iteration, the number is the value of the Otherwise, it takes the already created counter object, increments the value of the object by the If there are more named sequences with different |
||
random |
integer random([[integer from= |
Generates a random number from the interval defined by the parameters The The default values are |
||
randomUUID |
string randomUUID() |
A static factory to retrieve a type 4 (pseudo randomly generated) UUID. The UUID is generated using a cryptographically strong pseudorandom number generator. |
||
sequence |
integer sequence(integer [, start= |
It generates the next number each time it is called. Behind the scenes, the following happens: when called, the operation checks whether it was already called with the given parameters in the running plan or in the online server.
If it was not called previously, it creates a counter object that stores the last generated number; in the first iteration, the number is the value of the Otherwise, it takes the already created counter object, increments the value of the object by the The default values are |
Date functions
In ONE, a date is represented by DAY
and DATETIME
types.
The DAY
type represents a date to the detail level of days, DATETIME
to the detail level of milliseconds.
The time values that are compatible with each format are named and described in the following table.
Date part name | Range | Included in date type |
---|---|---|
|
Any positive number |
|
|
1 - 12 |
|
|
1 - max.month |
|
|
0 - 23 |
|
|
0 - 59 |
|
|
0 - 59 |
|
A day starts at 00:00:00 and ends at 23:59:59. If a given function requires identification of a date part as parameter, the identifier is written in the expression in the form of a string literal, for example, "MONTH", otherwise the expression is evaluated as incorrect.
Identifiers are case sensitive and must be written in uppercase.
For example: expression='dateAdd(inDate,10,"DAY")'
.
All the listed date parts are represented by positive integers. Milliseconds are not supported by the date functions.
Date type DATE-TYPE stands for date types DAY or DATETIME in the description of input and output types.
|
Function | Usage | Description |
---|---|---|
dateAdd |
datetype dateAdd(datetype srcDate, integer srcValue, string fieldName) |
Adds the specified This function allows subtraction, so the If any of the operands are invalid or if an unsupported |
dateDiff |
integer dateDiff(datetype startDate, datetype endDate, string fieldName) |
Returns the difference between If the result exceeds the maximum range of A combination of date type |
datePart |
integer datePart(datetype srcDate, string fieldName) |
This function returns the value of the field If any of the parameters are invalid, the expression reports an error.
For the fields |
dateTrunc |
datetype dateTrunc(datetype srcDate, string fieldName) |
The function truncates less important parts of the The function can be used even for the If any of the parameters are invalid, the expression reports an error. Example: for |
getDate |
day getDate(datetype srcExpression) |
Returns the date in the format defined by the specified srcExpression (type |
getRequestTime |
datetime getRequestTime() |
Returns the time when the current request processing was started. This is the ONE application start time in batch mode and the web service request time in online mode. |
now |
datetime now() |
Returns the current time with the type |
today |
day today() |
Returns the current date in type |
String functions
The following are common functions used for string processing.
Function | Usage | Description | ||
---|---|---|---|---|
capitalize |
string capitalize(string srcStr) |
Transforms all words in the string A word consists of alphabetic characters (letters). All other characters are considered separators. |
||
capitalizeWithException |
string capitalizeWithException(string srcStr [, boolean leaveExcsAsTheyAre= |
Transforms all words in the string A word consists of alphabetic characters (letters). All other characters are considered separators. A current downside is that words from |
||
containsWord |
boolean containsWord(string srcStr, string srcWord [, string separator=" "]) |
Searches for the occurrence of the word A word is a sequence of letters with no whitespaces.
Words in the string are defined as sequences of letters separated by a space (
|
||
countNonAsciiLetters |
integer countNonAsciiLetters(string srcStr) |
Returns the number of characters from the string |
||
cpConvert |
string cpConvert(string srcStr, string actualCp, string correctCp) |
Takes a string as an input wrongly read using the For example, if you have a file that is entirely in |
||
diceCoefficient |
float diceCoefficient(string srcStr1, string srcStr2 [, string tokenSeparator= |
Returns the The method uses a set similarity measure called Sørensen-Dice coefficient.
It uses the following formula: Since the measure is originally used for comparing two sets, this implementation offers three ways of slicing the input string to "tokens" (set elements). They are distinguished by the method signature:
|
||
distinct |
string distinct(string srcStr, null) |
Returns a string which contains the concatenated first occurrence of parts of the original string |
||
doubleMetaphone |
string doubleMetaphone(string srcStr [, boolean isAlternate= |
Encodes If |
||
editDistance |
integer editDistance(string srcStr1, string srcStr2 [, integer maxDistance] [, boolean caseInsensitive= |
Returns the edit distance between strings The parameter The parameter The difference between Levenshtein and Edit distance lies in the definition of distance of two switched adjacent characters. The Levenshtein one considers the switch as two changes whereas the Edit distance considers the switch to be a single change. If both of the strings are null, then the result is This function is also known as Damerau-Levenshtein distance. |
||
eraseSpacesInNames |
string eraseSpacesInNames(string srcStr, integer minLength, boolean onlyUpper) |
Removes spaces between separate characters (words of length 1``) in the string The parameter The parameter |
||
find |
boolean find(string srcRegex, string srcStr [, boolean caseInsensitive= |
Verifies whether the string The parameter If the string |
||
hamming |
integer hamming(string srcStr1, string srcStr2 [, boolean caseInsensitive= |
Returns the Hamming distance between strings The parameter If both of the strings are null, then the result is |
||
indexOf |
integer indexOf(string srcStr, string subStr [, integer fromIndex= |
Returns the index within the string If the substring is not found, the value If the value The index of the first character is |
||
isInFile |
boolean isInFile(string srcStr, string fileName) |
Searches for the string The function returns |
||
isNumber |
boolean isNumber(string srcStr) |
Verifies whether the string |
||
jaccardCoefficient |
float jaccardCoefficient(string srcStr1, string srcStr2 [, string tokenSeparator= |
Returns the The method uses a set similarity measure called Jaccard coefficient.
It uses the following formula: Since the measure is originally used for comparing two sets, this implementation offers three ways of slicing the input string to "tokens" (set elements). They are distinguished by the method signature:
|
||
jaroWinkler |
float jaroWinkler(string srcStr1, string srcStr2 [, boolean caseInsensitive= |
Returns the Jaro-Winkler score for strings This is a combination of Jaro’s algorithm with Winkler’s boost for identical prefixes. For an extensive description of the calculation, see Jaro-Winkler distance. To turn off the Winkler’s prefix boost (obtain pure Jaro), set the boost threshold to a value greater than By default, the function is case insensitive. When case sensitivity is turned on, uppercase and lowercase variants of the same letter are considered as two completely different characters. There is currently no support or optimization for diacritical marks (the function considers |
||
lastIndexOf |
integer lastIndexOf(string srcStr, string subStr [, integer fromIndex= |
Returns the index within the string If the substring is not found, the value If the value The index of the first character is |
||
left |
string left(string srcStr, integer count) |
Returns a new string that is a substring of |
||
length |
integer length(string srcStr) |
Returns the number of characters in the string srcStr. |
||
levenshtein |
integer levenshtein(string srcStr1, string srcStr2 [, integer maxDistance] [, boolean caseInsensitive= |
Returns the Levenshtein distance between strings The parameter The parameter If both of the strings are null, then the result is |
||
lower |
string lower(string srcStr) |
Transforms all characters of the string srcStr to lowercase. |
||
matches |
boolean matches(string srcRegex, string srcStr [, boolean caseInsensitive= |
Verifies whether the string The parameter If the string For information about regular expressions, Regular expressions. |
||
metaphone |
string metaphone(string srcStr) |
Encodes the string |
||
ngram |
string ngram(string srcStr, string inSeparator, string outSeparator, integer size) |
Takes`srcStr` as a sequence of items separated by N-grams in the output sequence are separated by The length of both separators as well as of the n-grams has to be at least If there are not enough words for creating an n-gram, the function returns |
||
preserveCase |
string preserveCase(string srcStr, string maskStr) |
Transforms all characters in |
||
removeAccents |
string removeAccents(string srcStr) |
Returns a copy of the string |
||
replace |
string replace(string srcStr, string what, string withWhat) |
Replaces occurrences of the string The function returns |
||
replicate |
string replicate(string srcStr, integer n) |
Returns |
||
right |
string right(string srcStr, integer count) |
Returns a new string that is a substring of |
||
sortWords |
string sortWords(string srcStr [, string srcLocale= |
Returns a string that consists of sorted parts of the string The parameter If the Boolean parameter If the string |
||
soundex |
string soundex(string srcStr) |
Returns the soundex value of the |
||
squeezeSpaces |
string squeezeSpaces(string srcStr) |
Removes whitespace characters from both ends of the string |
||
substituteAll |
string substituteAll(string srcPattern, string srcReplacement, string srcStr [, boolean caseInsensitiveFlag= |
Replaces all occurrences of If the parameter |
||
substituteMany |
string substituteMany(string srcPattern, string srcReplacement, string srcStr, integer srcVolume [, boolean caseInsensitiveFlag= |
Replaces all occurrences of If the parameter |
||
substr |
string substr(string srcStr, integer beginIndex [, integer strLen]) |
Returns a new string that is a substring of the string If If An empty substring is returned as a The index of the first character is |
||
transliterate |
string transliterate(string srcStr, string charsFrom, string charsTo) |
Transforms characters of the string The transformation replaces all occurrences of any character named in the parameter |
||
trashConsonants |
string trashConsonants(string srcStr) |
Removes all consonants (only Latin letters are concerned) and their accented equivalents from the string |
||
trashDiacritics |
Deprecated, replaced by the function |
|||
trashNonDigits |
string trashNonDigits(string srcStr) |
Returns a string that consists only of the digits included in the original string |
||
trashNonLetters |
string trashNonLetters(string srcStr) |
Returns a string that consists only of the letters included in the original string srcStr. All other characters are discarded. |
||
trashVowels |
string trashVowels(string srcStr) |
Removes all vowels (only Latin letters are concerned) and their accented equivalents from the string |
||
trim |
string trim(string srcStr [, string fillChars]) |
Removes whitespace or specified characters from both ends of the string For more information, see the trim method of the class |
||
trimLeft |
string trimLeft(string srcStr [, string fillChars]) |
Removes whitespace or specified characters from the left side of the string For more information, see the trim method of the class |
||
trimRight |
string trimRight(string srcStr [, string fillChars]) |
Removes whitespace or specified characters from the right side of the string For more information, see the trim method of the class |
||
upper |
string upper(string srcStr) |
Transforms all characters of the string |
||
word |
string word(string srcStr, integer srcIdx [, string srcSeparator= |
Returns the Words are defined as sequences of letters separated by The index of the first word is |
||
wordCombinations |
string wordCombinations(string srcStr, integer size, string srcSeparator, string combSeparator) |
Returns all Words are defined as sequences of characters separated by The words in the combination are separated by If any of the arguments is null or |
||
wordCount |
integer wordCount(string srcStr [, string srcSeparator= |
Returns the number of non-empty words in the string Words are defined as sequences of characters separated by |
Coding functions
Functions for hashing, encoding, or decoding strings.
Function | Usage | Description |
---|---|---|
coding.fromBase64 |
string coding.fromBase64(string source) |
Decodes the source string from Base64. |
coding.HMacMD5 |
string coding.HMacMD5(string message, string key) |
Creates a hash-based message authentication code using MD5 algorithm. |
coding.HMacSHA1 |
string coding.HMacSHA1(string message, string key) |
Creates a hash-based message authentication code using SHA-1 algorithm. |
coding.HMacSHA512 |
string coding.HMacSHA512(string message, string key) |
Creates a hash-based message authentication code using SHA-512 algorithm. |
coding.htmlEscape |
string coding.htmlEscape(string source, boolean allEntities) |
Converts XML/HTML special characters ( If set to |
coding.htmlUnescape |
string coding.htmlUnescape(string source, boolean allEntities) |
Transforms entities representing XML/HTML special characters ( The entity |
coding.md5 |
string coding.md5(string source) |
Creates a hash of the source string using MD5 algorithm. |
coding.toBase64 |
string coding.toBase64(string source) |
Encodes the source string to Base64. |
coding.urlDecode |
string coding.urlDecode(string source, string charset) |
Decodes a URL-encoded string, converting hexadecimal byte representations ( |
coding.urlEncode |
string coding.urlEncode(string source, string charset) |
Encodes a string for safe URL use by converting unsupported characters into their hexadecimal ( |
encode.base64 |
Deprecated, use |
|
encode.HMacMD5 |
Deprecated, use |
|
encode.HMacSHA1 |
Deprecated, use |
|
encode.HMacSHA512 |
Deprecated, use |
|
encode.md5 |
Deprecated, use |
Math functions
Traditional mathematical functions. They return float value and usually have one float argument.
Function | Usage | Description |
---|---|---|
math.acos |
float math.acos(float x) |
Arc cosine of x.
Defined for |
math.asin |
float math.asin(float x) |
Arc sine of x.
Defined for |
math.atan |
float math.atan(float x) |
Arc tangent of x. |
math.cos |
float math.cos(float x) |
Cosine of x. The argument is assumed in radians. |
math.e |
float math.e() |
Returns Euler’s number (e) = 2,7182818285. |
math.exp |
float math.exp(float x) |
Exponential function ex.
Equivalent to |
math.log |
float math.log(float x) |
Natural logarithm of x.
Defined for |
math.log10 |
float math.log10(float x) |
Common logarithm of x.
Defined for |
math.pi |
float math.pi() |
Returns π (PI) = 3,1415926536. |
math.pow |
|
Returns x power to y.
The function is undefined and returns |
math.sin |
float math.sin(float x) |
Sine of x. The argument is assumed in radians. |
math.sqr |
numeric math.sqr(numeric x) |
Returns square of x. |
math.sqrt |
float math.sqrt(float x) |
Square root of x.
Defined for |
math.tan |
float math.tan(float x) |
Tangent of x. The argument is assumed in radians. Undefined for π/2 + k*π. |
Bitwise functions
Bitwise functions are logical operations applied on separate bits of the operands.
Function | Usage | Description |
---|---|---|
bitand |
|
Bitwise AND |
bitneg |
|
Bitwise NOT, or complement |
bitor |
|
Bitwise inclusive OR |
bitxor |
|
Bitwise exclusive or |
MinMax functions
Functions used for computation of minimum or maximum values.
Function | Usage | Description |
---|---|---|
max |
any max(any a, any b) |
Returns the greater of two operands.
If either of the operands is null, Strings are compared lexicographically.
For Boolean values, the following applies: |
min |
any min(any a, any b) |
Returns the lesser of two operands.
If either of the operands is null, Strings are compared lexicographically.
For Boolean values, the following applies: |
safeMax |
any safeMax(any a, any b) |
Returns the greater of two operands. If either of the operands is null, then the value of the other operand is returned. Strings are compared lexicographically.
For Boolean values, the following applies: |
safeMin |
any safeMin(any a, any b) |
Returns the lesser of two operands. If either of the operands is null, then the value of the other operand is returned. Strings are compared lexicographically.
For Boolean values, the following applies: |
Aggregating functions
Aggregating functions are special functions that can be used only in the context of ONE steps that support grouping of records. Currently, there are two such steps: Representative Creator and Group Aggregator.
Depending on the context, expressions containing aggregate functions distinguish between two types of sources: inner, used in arguments of any aggregated function, and outer, used outside of functions. These might be generally different, for example when the sum of a certain attribute of all records in a group is added to another attribute of a record that has an entirely different format and usage.
Every aggregating function has a variant for conditional evaluation.
The name of the variant is derived from the original name with the appended suffix if
.
The conditional variant has one extra argument that is inserted before the original arguments and contains a Boolean expression.
The expression specifies when the appropriate record is included into aggregation.
For example, the expression avg(salary)
can have the conditional variant avgif(score < 100, salary)
.
Nesting of aggregate functions is not allowed.
For example, the expression countif(salary < avg(salary)) is invalid.
|
Function | Usage | Description | ||
---|---|---|---|---|
avg |
|
If the input is of an integer type, the function returns the average value of non-null values in a group.
The result is an integer number rounded down.
For example, If the input is of a float type, the function returns the average value of non-null values in a group.
The result is a number of a float type not rounded.
For example If the input is of a day type, the function returns the average value of non-null values in a group.
The result is a day type rounded down.
For example, If the input is of a daytime type, the function returns the average value of non-null values in a group.
The result is a daytime type rounded down.
For example, |
||
avgif |
|
Conditional version of See also the description of |
||
concatenate |
string concatenate(string expression [, string separator= |
Returns a concatenated string made up of non-null values in a group, separated by the value in The resulting string never exceeds |
||
concatenateif |
string concatenateif(boolean boolExpression, string expression [, string separator= |
Conditional version of See also the description of |
||
count |
integer count([any expression]) |
Returns the number of members or non-null expression values in a group. Both 32-bit and 64-bit integer data types are supported. |
||
countDistinct |
integer countDistinct(any expression) |
Returns the number of distinct non-null values in a group. |
||
countDistinctif |
integer countDistinctif(boolean boolExpression, any expression) |
Conditional version of See also the description of |
||
countif |
integer countif(boolean boolExpression [, any expression]) |
Conditional version of |
||
countUnique |
integer countUnique(any expression) |
Returns the number of non-null values in a group that occur only one time. |
||
countUniqueif |
integer countUniqueif(boolean boolExpression, any expression) |
Conditional version of See also the description of |
||
detectCycles |
|
Returns the first link of the oriented graph that creates a cycle. The function takes adjacency lists of parents and their children and performs recursive depth-first search to find cycles. Returns |
||
first |
any first(any expression) |
Returns the first value in a group (including null values). This aggregation value depends on the order of group members, which is given by context. |
||
firstif |
any firstif(boolean boolExpression, any expression) |
Conditional version of See also the description of |
||
frequency |
integer frequency(any expression) |
The function assigns a number to each record that represents the total number of records in the same group (from the aggregation context) that share the same value of the The frequency function can be used to select a representative record as the one with the most or least common value of some attribute.
|
||
last |
any last(any expression) |
Returns the last value in a group (including null values). This aggregation value depends on the order of group members, which is given by context. |
||
lastif |
any lastif(boolean boolExpression, any expression) |
Conditional version of See also the description of |
||
maximum |
any maximum(any expression [, any result]) |
Returns the maximum of non-null values in a group. When the second argument is used, the function returns its (first non-null) value for record having maximum. |
||
maximumif |
any maximumif(boolean boolExpression, any expression [, any result]) |
Conditional version of See also the description of |
||
minimum |
any minimum(any expression [, any result]) |
Returns the minimum of non-null values in a group. When the second argument is used, the function returns its (first non-null) value for record having minimum. |
||
minimumif |
any minimumif(boolean boolExpression, any expression [, any result]) |
Conditional version of See also the description of |
||
modus |
any modus(any expression [, any result]) |
Returns the most frequent non-null value of |
||
modusif |
any modusif(boolean boolExpression, any expression [, any result]) |
Conditional version of Returns the first non-null value of result members having the most frequent non-null value |
||
sum |
|
Returns the sum of non-null values in a group. For Boolean arguments, this function performs the logical sum (OR), for example, |
||
sumif |
|
Conditional version of See also the description of |
Conditional expressions
Conditional expressions are a special type of expressions where the resulting value depends on the evaluation of certain conditions. These functions do not have strictly defined argument types, instead they are flexible and their arguments are defined by the specific functionality of each expression.
Function | Usage | Description |
---|---|---|
case |
any case({boolean expr, any exprValue}… [, any defaultExpr= |
Returns the value of expression The types of all values of |
decode |
any decode(any decodeExpr {, any expr, any exprValue}… [, any defaultExpr= |
Returns the value of expression The types of all values of |
iif |
any iif(boolean ifExpr, any trueExpr, any elseExpr) |
Returns |
nvl |
any nvl({any expr}…) |
Returns the value of the first expression |
case (
id is null, "_" + input + "_",
id = 1, substr(input, length(input) / 2),
"default value"
)
decode (
id,
0,
'zero',
1,
'one',
2,
'two',
3,
'three'
)
iif (
value == 2,
'ok',
'bad'
)
nvl (
value1,
value2,
value3
)
Conversion and formatting functions
Conversion functions are used for conversions and formatting the input expression.
Function | Usage | Description |
---|---|---|
getMilliseconds |
long getMilliseconds(datetype srcExpression) |
Returns internal representation of datetime value, that is, the number of milliseconds since the epoch which starts at January 1, 1970, 00:00:00 GMT.
The returned value can be converted back into |
math.abs |
numeric [math.]abs(numeric expr) |
Returns the absolute value of the expression |
math.ceil, math.ceiling |
|
Converts the expression |
math.floor |
integer [math.]floor(float expr) |
Converts the expression |
math.longCeil, math.longCeiling |
|
Converts the expression expr to the nearest higher long value. |
math.longFloor |
long [math.]longFloor(float expr) |
Converts the expression |
math.round |
float [math.]round(float expr [, integer decimalPlaces= |
Rounds the expression |
toDate |
|
Returns the date specified in the expression expr converted to date type If a single argument is used, For example, if the date in string format is '30/05/2023', then the conversion is |
toDateTime |
|
Returns the date specified in the expression If a single argument is used, For example, if the date in string format is '30/05/2023 15:00:00', then the conversion is |
toFloat |
|
Converts the expression |
toInteger |
integer toInteger(string expr) |
Converts the expression expr to an INTEGER value. If the conversion is not successful, then NULL is returned. |
toLong |
|
Converts the expression |
toString |
|
Converts the expression The parameter If the parameter If the |
XML functions
Functions for working with XML formatted data.
Function | Usage | Description |
---|---|---|
xpath |
string xpath(string xml, string xpath) |
Retrieves the XML element from the source XML string on the specified XPath. |
Word set operation functions
Word set operation functions operate on two strings, interpreting them as sets of words separated by the given separator (or a space, by default) and return the integer cardinality of the resulting set or, in some cases, the resulting set itself.
If the parameter multiset
is set to TRUE
, the sets are treated as "multisets".
For example, two identical words in one set form two members of the set rather than one.
When multiset
is FALSE
(default), identical words are first deduplicated.
Three of the functions (intersection
, difference
, symmetricDifference
) have variants with the prefix lcs
(for example, lcsIntersection
).
These variants are based on the concept that the intersection of two (word) lists is defined as the Longest Common Subsequention.
In this case, the order or duplicity of elements in the lists is significant.
For example, two lists a = 'C A D B A'
and b = 'A E B C D'
, whose "ordinary" intersection is set 'A B C D'
, have the "lcs" intersection 'A B'
(or 'C D'
, 'A D'
), and so lcsIntersection(a, b) = 2
, lcsDifference(a, b) = 2
(or 3
in the multiset case), lcsSymmetricDifference(a, b) = 5
(or 6
).
The two types of difference functions can be executed with an optional integer parameter, singularity
, which distinguishes sets that have common members from sets without common members.
When this parameter is used, the function returns a value (typically a very large number) when the sets have an empty intersection.
For example:
* difference('A B', 'C D') = 2
: The typical result.
The difference between completely different sets might have the same value as the difference between, for example, very similar sets 'A B C D E'
and 'A B C F G'
.
* difference('A B', 'C D', 1000) = 1000
: Using the singularity
parameter produces a different result, which shows that the difference between completely different sets is high.
Most of the functions have a variant with the suffix Exp
(for example, intersectionExp
).
These variants have one appended optional argument, comparator
.
This argument is a functional parameter and it defines how the elements in sets are compared to determine if they are the same.
The functional parameter must take two string values and return a Boolean result.
The result must not depend on the argument order.
The functional argument in general is similar to the function definition and has the following syntax:
(param1, param2, ..) { expression }
The columns and variables can be accessed in the function body.
When using the functional comparator, the calculation of intersection of two sets is quite different from an "ordinary" set operation, because we cannot assume the comparator conforms to the equality relation. In both set operands A respectively B, we determine subsets A' respectively B' of members that have at least one matching member in the opposite set. Since these subsets might have different sizes, the size of the intersection is defined as the first one.
Examples of using these functional parameters:
-
symmetricDifference(std_name, std_hh_names, ' ', multi)
: Ordinary comparison. -
symmetricDifferenceExp(std_name, std_hh_names, ' ', multi, (x,y) { x == y })
: Ordinary comparison, equivalent to the non-Exp
behavior. -
symmetricDifferenceExp(std_name, std_hh_names, ' ', multi, (x,y) { editDistance(x,y) < 2 })
: Approximative matching, elements different at most in one error are assumed to be equal.
When std_name = "AAA BBB AAA ABC"
and std_hh_names = "AAB AAA CCC BBC"
and multi = TRUE
, then the first case gives 6
(one pair of AAA
), the second one is 6
(all three AAA
paired), and the third one is 1
(only CCC
remains unpaired).
Function | Usage | Description | ||
---|---|---|---|---|
set.approxSymmetricDifference |
integer [set.]approxSymmetricDifference(string set1, string set2 [, string separator= |
Returns the number of words which cannot be matched with any word in the other set.
The matching is based on one of difference functions, 'levenshtein' (default), 'editDistance', or 'hamming', and the
|
||
set.contains |
boolean set.contains(string str, string item [, string separator= |
Checks whether the |
||
set.containsExp |
boolean set.containsExp(string str, string item [, string separator= |
Checks whether the |
||
set.difference |
integer [set.]difference(string set1, string set2 [, string separator= |
Returns the cardinality of the difference of sets ( |
||
set.differenceExp |
integer set.differenceExp(string set1, string set2 [, string separator= |
Returns the cardinality of the difference of sets ( |
||
set.differenceResult |
string set.differenceResult(string set1, string set2 [, string separator= |
Returns the difference of sets ( |
||
set.differenceResultExp |
string set.differenceResultExp(string set1, string set2 [, string separator= |
Returns the difference of sets ( |
||
set.distinct |
string set.distinct(string str [, string separator= |
Returns the set of distinct elements from the source string. |
||
set.distinctExp |
string set.distinctExp(string str [, string separator= |
Returns the set of distinct elements from the source string.
The elements are compared according to the specified |
||
set.filterExp |
string set.filterExp(set, string separator= |
Returns the string set containing elements for which the Boolean functional parameter is For example, |
||
set.indexOf |
integer set.indexOf(string str, string item [, string separator= |
Returns the position of |
||
set.indexOfExp |
integer set.indexOfExp(string str, string item [, string separator= |
Returns the position of |
||
set.intersection |
integer [set.]intersection(string set1, string set2 [, string separator= |
Returns the cardinality of the intersection of sets. |
||
set.intersectionExp |
integer set.intersectionExp(string set1, string set2 [, string separator= |
Returns the cardinality of the intersection of sets. |
||
set.intersectionResult |
string set.intersectionResult(string set1, string set2 [, string separator= |
Returns the intersection of sets. |
||
set.intersectionResultExp |
string set.intersectionResultExp(string set1, string set2 [, string separator= |
Returns the intersection of sets. |
||
set.item |
string set.item(string str, integer index [, string separator= |
Returns the element of the string set at the specified position, including empty elements. |
||
set.lastIndexOf |
integer set.lastIndexOf(string str, string item [, string separator= |
Returns the position of the last |
||
set.lastIndexOfExp |
integer set.lastIndexOfExp(string str, string item [, string separator= |
Returns the position of the last |
||
set.lcsDifference |
integer set.lcsDifference(string set1, string set2 [, string separator= |
Returns the cardinality of the difference of sets ( |
||
set.lcsDifferenceExp |
integer set.lcsDifferenceExp(string set1, string set2 [, string separator= |
Returns the cardinality of the difference of sets ( |
||
set.lcsDifferenceResult |
string set.lcsDifferenceResult(string set1, string set2 [, string separator= |
Returns the difference of sets ( |
||
set.lcsDifferenceResultExp |
string set.lcsDifferenceResultExp(string set1, string set2 [, string separator= |
Returns the difference of sets ( |
||
set.lcsIntersection |
integer set.lcsIntersection(string set1, string set2 [, string separator= |
Returns the cardinality of the intersection of sets (lcs-based intersection). |
||
set.lcsIntersectionExp |
integer set.lcsIntersectionExp(string set1, string set2 [, string separator= |
Returns the cardinality of the intersection of sets (lcs-based intersection). |
||
set.lcsIntersectionResult |
string set.lcsIntersectionResult(string set1, string set2 [, string separator= |
Returns the intersection of sets (lcs-based intersection). |
||
set.lcsIntersectionResultExp |
string set.lcsIntersectionResultExp(string set1, string set2 [, string separator= |
Returns the intersection of sets (lcs-based intersection). |
||
set.lcsSymmetricDifference |
integer set.lcsSymmetricDifference(string set1, string set2 [, string separator= |
Returns the cardinality of the symmetric difference of sets (lcs-based intersection). |
||
set.lcsSymmetricDifferenceExp |
integer set.lcsSymmetricDifferenceExp(string set1, string set2 [, string separator= |
Returns the cardinality of the symmetric difference of sets (lcs-based intersection). |
||
set.lcsSymmetricDifferenceResult |
string set.lcsSymmetricDifferenceResult(string set1, string set2 [, string separator= |
Returns the symmetric difference of sets (lcs-based intersection). |
||
set.lcsSymmetricDifferenceResultExp |
string set.lcsSymmetricDifferenceResultExp(string set1, string set2 [, string separator= |
Returns the symmetric difference of sets (lcs-based intersection). |
||
set.mapExp |
string set.mapExp(set [, string separator= |
Returns a string set whose each element is transformed by the functional parameter.
For example, |
||
set.size |
integer set.size(string str [, string separator= |
Returns the number of elements in the string, including empty elements. |
||
set.sort |
string set.sort(string str [, string separator= |
Returns a sorted set of elements. |
||
set.subSequence |
string set.subSequence(string str, integer from [, integer count= |
Returns elements of the string set starting at the specified position, including empty elements. The |
||
set.sumExp |
numeric set.sumExp(set [, string separator= |
Returns the sum of values calculated by the functional parameter for each element. The functional parameter expression must return an integer, long, or float result. For example, |
||
set.symmetricDifference |
integer [set.]symmetricDifference(string set1, string set2 [, string separator= |
Returns the cardinality of the symmetric difference of sets. |
||
set.symmetricDifferenceExp |
integer set.symmetricDifferenceExp(string set1, string set2 [, separator= |
Returns the cardinality of the symmetric difference of sets. |
||
set.symmetricDifferenceResult |
string set.symmetricDifferenceResult(string set1, string set2 [, string separator= |
Returns the symmetric difference of sets. |
||
set.symmetricDifferenceResultExp |
string set.symmetricDifferenceResultExp(string set1, string set2 [, string separator= |
Returns the symmetric difference of sets. Example:
|
||
set.union |
integer [set.]union(string set1, string set2 [, string separator= |
Returns the cardinality of the union of sets. |
||
set.unionExp |
integer set.unionExp(string set1, string set2 [, string separator= |
Returns the cardinality of the union of sets. |
||
set.unionResult |
string set.unionResult(string set1, string set2 [, string separator= |
Returns the union of sets. |
||
set.unionResultExp |
string set.unionResultExp(string set1, string set2 [, string separator= |
Returns the union of sets. |
Regular expressions
The syntax for regular expressions in ONE follows the rules for regular expressions used in Java, described in Class Pattern documentation.
The following sections contain information about regular expression usage extensions in ONE.
' @" ' - syntax (single escaping)
When writing regular expressions, take into consideration that a regular expression is manipulated as a Java string.
In literal Java strings, the backslash (\
) is an escape character.
The literal string \\
is a single backslash.
In regular expressions, the backslash is also an escape character.
The regular expression \\
matches a single backslash.
As a Java string, this regular expression becomes \\\\
.
To avoid having to use double escaping, prefix the string in quotes with an at sign (@
).
In that case, the string between the at sign and double quotation marks (@"some_string"
) is taken as a literal and no characters are considered escape characters in the context of the Java string.
For example, to substitute all occurrences of the caret character (^
) and the closing bracket (]
) with x
in the string ab[^]
(the expected result is the string ab[xx
), you could use the following expression:
substituteAll("[\\^\\]]","x","ab[^]")
Or, using the ' @" ' syntax:
substituteAll(@"[\^\]]","x","ab[^]")
Capturing groups
Matching regular expressions in the input is done by analyzing the input expression string (the string resulting from applying the expression to the input). Sections of the input string (called capturing groups, enclosed in parentheses) are identified and marked for further use in creating the output. These capturing groups can be referenced using backreference.
In the case of a match, the matched data from the input is sent to the predefined output columns. Each output column has a substitution property that is the value which will be sent to the output. It can contain the previously mentioned backreferences with the following syntax:
-
$i
, wherei
is a number between0
and9`
: Backreference to a capturing group with group number lower than 10. -
${i}
, wherei
is a natural number other than0
: Backreference to a capturing group with any natural group number. -
$`
: Returns the substring before the processed (matched) part of the input string. -
$'
: Returns the substring after the processed (matched) part of the input string. -
$&
: Returns the processed (matched) part of the input string. -
$$
: returns the dollar sign characters ($
).
The capturing groups can be used in expressions substituteAll
, substitutaMany
, and in the Regex Matching step.
For example, to substitute all pairs of letter-digit couples with only the digit from the couple (that is, the input string a1b2c3d4e5
leads to the output 12345
), you could use the following expression:
substituteAll("([a-z])([0-9])","${2}","a1b2c3d4e5")
Was this page useful?