Operations#
Operations are functions that can be applied to one or more columns of a table to produce new virtual columns. Filtering and sorting on these virtual columns can give you deeper insight into your model and training set, and help you find issues with your data, your model, or perhaps your entire approach to the problem at hand. The exact operations available depend on the type of data in the selected columns. Operations fall into two main categories: global operations and local operations.
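As a rough illustration of the idea (not the tool's own API), the sketch below models a table as a list of dictionaries, derives a virtual column from two existing columns, and then filters and sorts on it. The helper `add_virtual_column` and all column names are hypothetical and exist only for this example.

```python
# Hypothetical in-memory table; the column names are made up for illustration.
rows = [
    {"image": "a.png", "label": 3, "predicted": 3, "loss": 0.12},
    {"image": "b.png", "label": 1, "predicted": 7, "loss": 2.31},
    {"image": "c.png", "label": 7, "predicted": 7, "loss": 0.05},
    {"image": "d.png", "label": 2, "predicted": 4, "loss": 1.47},
]


def add_virtual_column(table, name, operation, inputs):
    """Return a copy of `table` with column `name` computed from `inputs`.

    The original rows are left untouched, mirroring the idea that a
    virtual column is derived from the data rather than stored in it.
    """
    return [
        {**row, name: operation(*(row[col] for col in inputs))}
        for row in table
    ]


# A "not equal" style operation over two columns flags misclassified rows.
with_mismatch = add_virtual_column(
    rows, "mismatch", lambda a, b: a != b, ["label", "predicted"]
)

# Filter on the virtual column, then sort the remaining rows by loss,
# highest first.
suspects = sorted(
    (row for row in with_mismatch if row["mismatch"]),
    key=lambda row: row["loss"],
    reverse=True,
)
suspect_images = [row["image"] for row in suspects]
print(suspect_images)
```

Here the misclassified rows surface first, ordered by how badly the model did on them, which is the kind of inspection the virtual columns are meant to support.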
Global operations#
A global operation needs the full context of every row of its input columns to produce a result for any individual row. The Occurrence operation, which counts how many times each value (or combination of values) occurs in one or more columns, is an example of a global operation. A complete list of the available global operations, along with their input requirements, can be seen below.
| Operation | Input type | Output type | Description |
|---|---|---|---|
| Traversal index | Any | Number | Traversal index maximizing the walk within the input column coordinate space. |
| Cluster by threshold | Any | Number | Groups data points by a specified distance threshold. |
| Rank | Any | Number | The rank of each row, as sorted by the input value(s). |
| Group | Any | Same as input | Group index shared by all rows with the same value(s) in the input column(s). |
| Nearest neighbor | Numbers | Number | Distance to the closest neighbor within the input column coordinate space. |
| Derivative | Numbers | Numbers | For a numeric input column evolving over time, the difference between the next value and the previous one, divided by two (with the first and last values handled appropriately). |
| Deviation | Numbers | Numbers | For a numeric input column evolving over time, the difference between a value and the average of the next and previous values. |
| From previous | Numbers | Numbers | For a numeric input column evolving over time, the difference between this value and the previous one. |
| To next | Numbers | Numbers | For a numeric input column evolving over time, the difference between the next value and the current one. |
| Occurrence | Any | Number | Number of rows with the same value(s) in the input column(s). |
| Primary element | Any | Boolean | Whether an element is the first in its group. |
| Normalize | Number | Number | Normalizes the column so that its sum is 1. |
| In foreign table | Number | Boolean | Whether the row is present in a foreign table. |
| Run constants | Number | Any | The entire ‘constants’ structure of the referenced Run. |
| Foreign table row | Number | Any | The entire row of a foreign table, as referenced by a foreign key in the input table row. |
| Foreign table row edited | Number | Boolean | Whether the foreign table row has any pending edits. |
| Index | None | Number | Row index within the table. |
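To make the "full context" requirement concrete, here is a minimal sketch of three of the operations above in plain Python. It is an illustration only, not the tool's implementation: the function names are hypothetical, and the boundary handling in `derivative` (one-sided differences for the first and last rows) is an assumption, since the table only says those rows are handled "appropriately".

```python
from collections import Counter


def occurrence(column):
    """Number of rows sharing each row's value.

    The whole column must be scanned before any single row can be answered.
    """
    counts = Counter(column)
    return [counts[value] for value in column]


def derivative(column):
    """Central difference: (next - previous) / 2 for interior rows.

    Assumption: the first and last rows fall back to a one-sided
    difference, and a column with fewer than two rows yields zeros.
    """
    n = len(column)
    if n < 2:
        return [0.0] * n
    result = [float(column[1] - column[0])]
    result += [(column[i + 1] - column[i - 1]) / 2 for i in range(1, n - 1)]
    result.append(float(column[-1] - column[-2]))
    return result


def normalize(column):
    """Scale the column so that its values sum to 1."""
    total = sum(column)
    if total == 0:
        raise ValueError("cannot normalize a column that sums to zero")
    return [value / total for value in column]


labels = ["cat", "dog", "cat", "bird", "cat"]
label_counts = occurrence(labels)    # [3, 1, 3, 1, 3]

loss = [4.0, 2.0, 1.0, 0.5]
loss_slope = derivative(loss)        # [-2.0, -1.5, -0.75, -0.5]

weights = normalize([1, 1, 2])       # [0.25, 0.25, 0.5]
```

In each case, changing a single row elsewhere in the column can change the result for this row, which is what separates these operations from the local ones.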
Local operations#
Local operations only need the context of a single row to produce a result. We can further subdivide local operations into three categories: Unary operations, Order-independent operations, and Order-dependent operations.
Unary operations#
Unary operations take a single column as input, and produce a result for each row in that column. The Character count operation, which returns the length of a string, is an example of a unary operation. A complete list of the available unary operations, along with their input requirements, can be seen below.
| Operation | Input type | Output type | Description |
|---|---|---|---|
| Pick[0] | Any | Any | A single element picked from an array. |
| PickProperty_ANWJ | Any | Any | A child property picked from a composite property. |
| Pick random | Any | Any | A random element picked from the input array (if any). |
| Abs | Number | Number | The absolute value of the numeric input value. |
| Character count | String | Number | The number of characters in the input string (including whitespace). |
| Log | Number | Number | The logarithm of the numeric input value. |
| Non-zero | Numbers | Boolean | Whether the numeric input value is non-zero. |
| Not | Any | Boolean | The boolean ‘not’ of the input value(s). |
| Sign | Number | Number | The sign (i.e. -1, 0, or 1) of the input value(s). |
| Inverse | Number | Number | The inverse (i.e. 1/x) of the input value(s). |
| Raw | Any | Same as input | The raw value of the input property, i.e. with value maps and/or string roles removed from the schema (recursively when required). |
| Word count | String | Number | The number of words in the input string. |
| Zero | Number | Boolean | Whether the numeric input value is zero. |
| * A | Numbers | Numbers | An input value multiplied by a constant number. |
| + A | Numbers | Numbers | An input value with a constant value added. |
| ^ A | Numbers | Numbers | An input value raised to a constant power. |
| Overlap ratio | Bounding boxes | Numbers | Quotient stating how much one bounding box is overlapped by others within an image. |
| Unique overlap ratio | Bounding boxes | Numbers | Quotient stating how much one bounding box is uniquely overlapped by others within an image. |
| Rectangles (absolute) | Bounding boxes | Rectangles | The rectangle geometry of the bounding box list (in absolute min/max pixel coordinates). |
| Rectangles (relative) | Bounding boxes | Rectangles | The rectangle geometry of the bounding box list (in min/max coordinates relative to the reference image size). |
| Area | Rectangles | Numbers | The area of each rectangle within the input list. |
| Aspect | Rectangles | Numbers | The aspect (i.e. width divided by height) of each rectangle within the input list. |
| Width | Rectangles | Numbers | The width of each rectangle within the input list. |
| Height | Rectangles | Numbers | The height of each rectangle within the input list. |
| Week since epoch | Datetime string | Number | For an input datetime string, returns the week since epoch. |
| Hour of day | Datetime string | Number | For an input datetime string, returns the hour [0..23]. |
| Day of week | Datetime string | Number | For an input datetime string, returns the day of week. |
| Milliseconds since epoch | Datetime string | Number | For an input datetime string, returns the number of milliseconds since epoch. |
| ConstantBool | None | Boolean | A constant boolean value. |
| ConstantInt | None | Number | A constant integer value. |
| ConstantFloat | None | Number | A constant floating point value. |
| ConstantString | None | String | A constant string value. |
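A few of these unary operations can be sketched as ordinary per-value functions mapped over a column. The code below is illustrative only: the function names are hypothetical, words are assumed to be whitespace-separated, datetime strings are assumed to be ISO 8601, and timestamps without an offset are assumed to be UTC. None of these details are confirmed by the table above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def character_count(text):
    """Number of characters, whitespace included."""
    return len(text)


def word_count(text):
    """Number of words, assuming words are separated by whitespace."""
    return len(text.split())


def sign(x):
    """-1, 0, or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def hour_of_day(timestamp):
    """Hour in [0..23], read in the timestamp's own UTC offset."""
    return datetime.fromisoformat(timestamp).hour


def milliseconds_since_epoch(timestamp):
    """Whole milliseconds between the Unix epoch and the timestamp."""
    moment = datetime.fromisoformat(timestamp)
    if moment.tzinfo is None:
        # Assumption: a timestamp without an offset is treated as UTC.
        moment = moment.replace(tzinfo=timezone.utc)
    return (moment - EPOCH) // timedelta(milliseconds=1)


def apply_unary(operation, column):
    """Each output row depends only on the same row of the input column."""
    return [operation(value) for value in column]


captions = ["a red car", "two  dogs playing", ""]
caption_chars = apply_unary(character_count, captions)   # [9, 17, 0]
caption_words = apply_unary(word_count, captions)        # [3, 3, 0]

signs = apply_unary(sign, [-2.5, 0, 7])                  # [-1, 0, 1]

stamps = ["1970-01-02T00:00:00+00:00", "2024-01-15T13:45:00+00:00"]
hours = apply_unary(hour_of_day, stamps)                 # [0, 13]
millis = apply_unary(milliseconds_since_epoch, stamps)
```

Because every result depends on one value only, `apply_unary` could process rows in any order, or one at a time as they arrive.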
Order-independent operations#
Order-independent operations take two or more columns as input, and produce a result for each row in those columns. The order of the input columns does not matter. The Sum operation, which adds two or more columns together, is an example of an order-independent operation. A complete list of the available order-independent operations, along with their input requirements, can be seen below.
| Operation | Input type | Output type | Description |
|---|---|---|---|
| Sum | Numbers | Number | The sum of all numeric input values. |
| Average | Numbers | Number | The arithmetic average of the numeric input values. |
| Common | Any | Any | The common value within the input array (if any). |
| Count | String | Number | The number of elements within the input array. |
| Length | Numbers | Number | The length of the input values (i.e. the square root of the sum of the squares of the input values). |
| Equal | Any | Boolean | Whether all input values are equal. |
| Not equal | Any | Boolean | Whether some of the input values differ. |
| Max | Numbers | Number | The largest numeric value of all input values. |
| Min | Numbers | Number | The smallest numeric value of all input values. |
| Multiply | Numbers | Number | The product of all numeric input values. |
| Range | Numbers | Number | The absolute range (i.e. ‘max value - min value’) across all numeric input values. |
| Unique | Any | Any | A list of all unique input values (i.e. with duplicates removed). |
| Sort | Numbers | Numbers | The input vector, sorted in ascending order. |
| Median | Numbers | Number | The median of the input vector. |
| Normalize | Numbers | Numbers | The input vector in normalized form. |
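The defining property of this category is that reordering the input columns leaves the result unchanged. The sketch below checks that for a handful of the operations, using the values of one row across three hypothetical columns. The function names are made up for this example and are not the tool's API.

```python
import math
from itertools import permutations


def row_sum(values):
    return sum(values)


def average(values):
    return sum(values) / len(values)


def length(values):
    """Euclidean length: square root of the sum of squares."""
    return math.sqrt(sum(v * v for v in values))


def value_range(values):
    """Absolute range: max value - min value."""
    return max(values) - min(values)


def all_equal(values):
    return all(v == values[0] for v in values)


# One row, with one value taken from each of three input columns.
row = (3.0, 4.0, 12.0)

row_total = row_sum(row)        # 19.0
row_length = length(row)        # 13.0
row_range = value_range(row)    # 9.0
row_equal = all_equal(row)      # False

# Every ordering of the input columns gives the same answer.
order_independent = all(
    math.isclose(row_sum(p), row_total)
    and math.isclose(average(p), average(row))
    and math.isclose(length(p), row_length)
    and value_range(p) == row_range
    and all_equal(p) == row_equal
    for p in permutations(row)
)
print(order_independent)
```

`math.isclose` is used for the floating point results because, in general, summing floats in a different order can change the last few bits even though the operation is mathematically order-independent.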
Order-dependent operations#
Order-dependent operations take two or more columns as input, and produce a result for each row in those columns. The order of the input columns affects the results of these operations. The Subtract operation, which subtracts one or more columns from another, is an example of an order-dependent operation. A complete list of the available order-dependent operations, along with their input requirements, can be seen below.
| Operation | Input type | Output type | Description |
|---|---|---|---|
| Divide | Numbers | Number | The first value divided by the next. With more than two terms, each subsequent term is used as a divisor. |
| To string | Any | String | The input value(s) converted to a string. With multiple input columns, values are comma-separated. With nested input columns, the values are represented as JSON. |
| Hash | Any | Number | A numeric hash value calculated from all input values (including scalars, nested inputs and arrays). |
| Subtract | Numbers | Number | The first numeric input value minus all following ones. |
| Delta | Numbers | Number | The difference between one column and the next (i.e. ‘next value - this value’). |
| IoU | Bounding boxes | Numbers | For each bounding box in the first column, the maximum intersection-over-union against any bounding box in the second column. |
| TP | Bounding boxes | Boolean | Whether each predicted bounding box is a True Positive, i.e. has a corresponding ground truth bounding box. |
| FP | Bounding boxes | Boolean | Whether each predicted bounding box is a False Positive, i.e. has no corresponding ground truth bounding box. |
| FN | Bounding boxes | Boolean | Whether each ground truth bounding box is a False Negative, i.e. has no corresponding predicted bounding box. |
| Greater than or equal | Numbers | Boolean | Whether each value is greater than or equal to the next. |
| Greater than | Numbers | Boolean | Whether each value is greater than the next. |
| Less than or equal | Numbers | Boolean | Whether each value is less than or equal to the next. |
| Less than | Numbers | Boolean | Whether each value is less than the next. |
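The sketch below illustrates why column order matters for a few of these operations. It is not the tool's implementation, and it rests on several assumptions that the table does not state: boxes are `(x_min, y_min, x_max, y_max)` tuples, a box "corresponds" to another when their IoU is at least a hypothetical threshold of 0.5, class labels are ignored, and no one-to-one matching is enforced. All function names are made up for the example.

```python
def subtract(values):
    """First value minus all following ones."""
    first, *rest = values
    return first - sum(rest)


def divide(values):
    """First value divided by each following value in turn."""
    result = values[0]
    for divisor in values[1:]:
        result /= divisor
    return result


def area(box):
    return max(0, box[2] - box[0]) * max(0, box[3] - box[1])


def iou(a, b):
    """Intersection-over-union of two (x_min, y_min, x_max, y_max) boxes."""
    overlap_w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    union = area(a) + area(b) - intersection
    return intersection / union if union > 0 else 0.0


def max_iou(boxes, others):
    """For each box in `boxes`, the best IoU against any box in `others`."""
    return [max((iou(b, o) for o in others), default=0.0) for b in boxes]


# Assumption: IoU >= 0.5 counts as a match; the real criterion may differ.
def true_positives(predicted, ground_truth, threshold=0.5):
    return [s >= threshold for s in max_iou(predicted, ground_truth)]


def false_positives(predicted, ground_truth, threshold=0.5):
    return [s < threshold for s in max_iou(predicted, ground_truth)]


def false_negatives(predicted, ground_truth, threshold=0.5):
    # Note the swapped roles: ground truth boxes are scored against predictions.
    return [s < threshold for s in max_iou(ground_truth, predicted)]


a_minus_b = subtract([10, 3, 2])    # 5
b_minus_a = subtract([3, 10, 2])    # -9
quotient = divide([8, 2, 2])        # 2.0

ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predicted = [(0, 0, 10, 8), (50, 50, 60, 60)]

scores = max_iou(predicted, ground_truth)      # [0.8, 0.0]
tp = true_positives(predicted, ground_truth)   # [True, False]
fp = false_positives(predicted, ground_truth)  # [False, True]
fn = false_negatives(predicted, ground_truth)  # [False, True]
```

Swapping the two inputs changes what is being asked: TP and FP produce one result per predicted box, while FN produces one result per ground truth box.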