Operations

Operations are functions that can be applied to one or more columns of a table to produce new virtual columns. Filtering and sorting on these virtual columns can give you deeper insight into your model and training set, and help you find issues with your data, your model, or perhaps your entire approach to the problem at hand. The exact operations available depend on the type of data in the selected columns. In general, operations fall into two main categories: Global operations and Local operations.

Global operations

Global operations need the full context of every row of their input columns to produce a result for any individual row. The Occurrence operation, which counts how many times each value (or combination of values) occurs in one or more columns, is an example of a global operation. A complete list of the available global operations, along with their input requirements, can be seen below.

| Operation | Input type | Output type | Description |
| --- | --- | --- | --- |
| Traversal index | Any | Number | Traversal index maximizing the walk within the input column coordinate space |
| Cluster by threshold | Any | Number | Groups data points by a specified distance threshold. |
| Rank | Any | Number | The rank of each row, as sorted by the input value(s) |
| Group | Any | Same as input | Group index shared by all rows with the same value(s) in the input column(s) |
| Nearest neighbor | Numbers | Number | Distance to the closest neighbor within the input column coordinate space |
| Derivative | Numbers | Numbers | For a numeric input column evolving over time, this operation returns the difference between the next value and the previous one, divided by two (handling the first and last values appropriately). |
| Deviation | Numbers | Numbers | For a numeric input column evolving over time, this operation returns the difference between a value and the average of the next and previous values. |
| From previous | Numbers | Numbers | For a numeric input column evolving over time, this operation returns the difference between this value and the previous one. |
| To next | Numbers | Numbers | For a numeric input column evolving over time, this operation returns the difference between the next value and the current one. |
| Occurrence | Any | Number | Number of rows with the same value(s) in the input column(s) |
| Primary element | Any | Boolean | Whether an element is the first in its group |
| Normalize | Number | Number | Normalizes the column so that its sum is 1. |
| In foreign table | Number | Boolean | Whether the row is present in a foreign table |
| Run constants | Number | Any | The entire ‘constants’ structure of the referenced Run |
| Foreign table row | Number | Any | The entire row of a foreign table, as referenced by a foreign key in the input table row |
| Foreign table row edited | Number | Boolean | Whether the foreign table row has any pending edits |
| Index | None | Number | Row index within table |
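To make the "full context" requirement concrete, here is a minimal Python sketch of three of the global operations above, using plain lists as stand-ins for columns. The function names are hypothetical and not part of any API. The treatment of the first and last rows is an assumption, since the table only says they are handled appropriately.

```python
from collections import Counter


def occurrence(column):
    """For every row, count how many rows share that row's value."""
    counts = Counter(column)  # the whole column is scanned before any row is answered
    return [counts[value] for value in column]


def from_previous(column):
    """Difference between each value and the one before it.

    Assumption: the first row has no predecessor, so it yields None here.
    """
    if not column:
        return []
    return [None] + [curr - prev for prev, curr in zip(column, column[1:])]


def derivative(column):
    """(next - previous) / 2 for interior rows.

    Assumption: the two end rows fall back to a one-sided difference.
    """
    n = len(column)
    if n < 2:
        return [0.0] * n
    result = [float(column[1] - column[0])]
    result += [(column[i + 1] - column[i - 1]) / 2 for i in range(1, n - 1)]
    result.append(float(column[-1] - column[-2]))
    return result


labels = ["cat", "dog", "cat", "bird", "cat"]
loss = [4.0, 3.0, 1.0, 1.0, 0.0]

print(occurrence(labels))   # [3, 1, 3, 1, 3]
print(from_previous(loss))  # [None, -1.0, -2.0, 0.0, -1.0]
print(derivative(loss))     # [-1.0, -1.5, -1.0, -0.5, -1.0]
```

None of these results can be computed from a single row in isolation: each one reads neighboring rows or the entire column.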

Local operations

Local operations only need the context of a single row to produce a result. We can further subdivide local operations into three categories: Unary operations, Order-independent operations, and Order-dependent operations.

Unary operations

Unary operations take a single column as input, and produce a result for each row in that column. The Character count operation, which returns the length of a string, is an example of a unary operation. A complete list of the available unary operations, along with their input requirements, can be seen below.

| Operation | Input type | Output type | Description |
| --- | --- | --- | --- |
| Pick[…] | Any | Any | A single element picked from an array. |
| PickProperty[…] | Any | Any | A child property picked from a composite property. |
| Pick random | Any | Any | A random element picked from the input array (if any) |
| Abs | Number | Number | The absolute value of the numeric input value |
| Character count | String | Number | The number of characters in the input string (including whitespace). |
| Log | Number | Number | The log value of the numeric input value |
| Non-zero | Numbers | Boolean | Whether the numeric input value is non-zero |
| Not | Any | Boolean | The boolean ‘not’ of the input value(s) |
| Sign | Number | Number | The sign (i.e. -1, 0, or 1) of the input value(s). |
| Inverse | Number | Number | The inverse (i.e. 1/x) of the input value(s). |
| Raw | Any | Same as input | The raw value of the input property, i.e. with value maps and/or string roles removed from the schema (recursive when required) |
| Word count | String | Number | The number of words in the input string |
| Zero | Number | Boolean | Whether the numeric input value is zero |
| * A | Numbers | Numbers | An input value multiplied by a constant number. |
| + A | Numbers | Numbers | An input value with an added constant value. |
| ^ A | Numbers | Numbers | An input value raised to a constant power. |
| Overlap ratio | Bounding boxes | Numbers | Quotient stating how much one bounding box is overlapped by others within an image |
| Unique overlap ratio | Bounding boxes | Numbers | Quotient stating how much one bounding box is uniquely overlapped by others within an image |
| Rectangles (absolute) | Bounding boxes | Rectangles | The rectangle geometry of the bounding box list (in absolute min/max pixel coordinates) |
| Rectangles (relative) | Bounding boxes | Rectangles | The rectangle geometry of the bounding box list (in min/max coordinates relative to the reference image size) |
| Area | Rectangles | Numbers | The area of each rectangle within the input list |
| Aspect | Rectangles | Numbers | The aspect (i.e. width divided by height) of each rectangle within the input list |
| Width | Rectangles | Numbers | The width of each rectangle within the input list |
| Height | Rectangles | Numbers | The height of each rectangle within the input list |
| Non-maximum suppression (NMS) | Bounding boxes | Bounding boxes | A bounding box list where non-maximum suppression has been performed (with a user-specified IoU threshold) |
| Week since epoch | Datetime string | Number | For an input datetime string, returns the week since epoch. |
| Hour of day | Datetime string | Number | For an input datetime string, returns the hour [0..23]. |
| Day of week | Datetime string | Number | For an input datetime string, returns the day of week. |
| Milliseconds since epoch | Datetime string | Number | For an input datetime string, returns the number of milliseconds since epoch. |
| ConstantBool | None | Boolean | A constant boolean value |
| ConstantInt | None | Number | A constant integer value |
| ConstantFloat | None | Number | A constant floating point value |
| ConstantString | None | String | A constant string value |
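As an illustration of unary operations, the four datetime rows above could be approximated with Python's standard library as follows. This is only a sketch under stated assumptions: the input is taken to be an ISO 8601 string, values without a timezone are treated as UTC, "week since epoch" is read as whole 7-day periods since 1970-01-01, and day of week uses Python's Monday = 0 numbering. The actual conventions used by the operations may differ, and the function names are hypothetical.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_utc(text):
    """Parse an ISO 8601 string; naive values are assumed to be UTC."""
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def milliseconds_since_epoch(text):
    delta = parse_utc(text) - EPOCH
    # Integer arithmetic avoids floating point rounding on large timestamps.
    return delta.days * 86_400_000 + delta.seconds * 1000 + delta.microseconds // 1000


def week_since_epoch(text):
    # Assumption: whole 7-day periods elapsed since 1970-01-01.
    return (parse_utc(text) - EPOCH).days // 7


def hour_of_day(text):
    return parse_utc(text).hour  # 0..23


def day_of_week(text):
    # Assumption: Python's convention, Monday = 0 ... Sunday = 6.
    return parse_utc(text).weekday()


stamp = "2024-03-15T13:45:00"
print(milliseconds_since_epoch(stamp))  # 1710510300000
print(week_since_epoch(stamp))          # 2828
print(hour_of_day(stamp))               # 13
print(day_of_week(stamp))               # 4 (a Friday)
```

Each function reads exactly one value and returns exactly one value, which is what makes the operation unary and local.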

Order-independent operations

Order-independent operations take two or more columns as input, and produce a result for each row in those columns. The order of the input columns does not matter. The Sum operation, which adds two or more columns together, is an example of an order-independent operation. A complete list of the available order-independent operations, along with their input requirements, can be seen below.

| Operation | Input type | Output type | Description |
| --- | --- | --- | --- |
| Sum | Numbers | Number | The sum of all numeric input values |
| Average | Numbers | Number | The arithmetic average of the numeric input values. |
| Common | Any | Any | The common value within the input array (if any) |
| Count | String | Number | The number of elements within the input array. |
| Length | Numbers | Number | The length of the input values (i.e. the square root of the sum of the squares of the input values) |
| Equal | Any | Boolean | Whether all input values are equal. |
| Not equal | Any | Boolean | Whether some of the input values differ. |
| Max | Numbers | Number | The largest numeric value of all input values |
| Min | Numbers | Number | The smallest numeric value of all input values |
| Multiply | Numbers | Number | The product of all numeric input values. |
| Range | Numbers | Number | The absolute range (i.e. ‘max value - min value’) across all numeric input values. |
| Unique | Any | Any | A list of all unique input values (i.e. with duplicates removed) |
| Sort | Numbers | Numbers | The input vector, sorted in ascending order |
| Median | Numbers | Number | The median of the input vector |
| Normalize | Numbers | Numbers | The input vector in normalized form |
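The sketch below illustrates what order independence means for a single row, using the Sum, Length, and Range definitions from the table. The three values stand for one row taken from three hypothetical input columns, and the function names are illustrative only.

```python
import math
from itertools import permutations


def vector_length(values):
    """Square root of the sum of the squares of the input values."""
    return math.sqrt(sum(v * v for v in values))


def value_range(values):
    """Max value minus min value."""
    return max(values) - min(values)


row = (3.0, 4.0, 12.0)  # one row, three hypothetical input columns

# Apply each operation to every possible column ordering and collect the
# distinct results; an order-independent operation yields exactly one.
sums = {sum(ordering) for ordering in permutations(row)}
lengths = {vector_length(ordering) for ordering in permutations(row)}
ranges = {value_range(ordering) for ordering in permutations(row)}

print(sums)     # {19.0}
print(lengths)  # {13.0}
print(ranges)   # {9.0}
```

With arbitrary floating point inputs, sums computed in different orders can differ in the last bits, so exact set equality like this is only guaranteed for values that add without rounding.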

Order-dependent operations

Order-dependent operations take two or more columns as input, and produce a result for each row in those columns. The order of the input columns affects the results of these operations. The Subtract operation, which subtracts one or more columns from another, is an example of an order-dependent operation. A complete list of the available order-dependent operations, along with their input requirements, can be seen below.

| Operation | Input type | Output type | Description |
| --- | --- | --- | --- |
| Divide | Numbers | Number | The first value divided by the next. With more than two inputs, each subsequent value is used as a further divisor. |
| To string | Any | String | The input value(s) converted to a string. In the case of multiple input columns, values are comma separated. In the case of nested input columns, the values are represented as JSON. |
| Hash | Any | Number | A numeric hash value calculated from all input values (including scalars, nested inputs and arrays) |
| Subtract | Numbers | Number | The first numeric input value minus all following ones. |
| Delta | Numbers | Number | The difference between one column and the next (i.e. ‘next value - this value’) |
| IoU | Bounding boxes | Numbers | For each bounding box in the first column, the maximum intersection-over-union against any bounding box in the second column |
| TP | Bounding boxes | Boolean | Whether each predicted bounding box is a True Positive, i.e. has a corresponding ground truth bounding box. |
| FP | Bounding boxes | Boolean | Whether each predicted bounding box is a False Positive, i.e. has no corresponding ground truth bounding box. |
| FN | Bounding boxes | Boolean | Whether each ground truth bounding box is a False Negative, i.e. has no corresponding predicted bounding box. |
| Greater than or equal | Numbers | Boolean | Whether each value is greater than or equal to the next. |
| Greater than | Numbers | Boolean | Whether each value is greater than the next. |
| Less than or equal | Numbers | Boolean | Whether each value is less than or equal to the next. |
| Less than | Numbers | Boolean | Whether each value is less than the next. |
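The following sketch shows why column order matters for Subtract and Divide, and how the IoU row could be computed. It is not the product's implementation: the function names are hypothetical, and the box layout `(x_min, y_min, x_max, y_max)` is an assumption made for the example.

```python
import operator
from functools import reduce


def subtract(values):
    """First value minus all following ones."""
    return reduce(operator.sub, values)


def divide(values):
    """First value divided by each subsequent value in turn."""
    return reduce(operator.truediv, values)


def iou(a, b):
    """Intersection-over-union of two boxes.

    Assumption: boxes are (x_min, y_min, x_max, y_max) tuples.
    """
    overlap_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    overlap_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = overlap_w * overlap_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def max_iou(first, second):
    """For each box in `first`, the best IoU against any box in `second`."""
    return [max((iou(box, other) for other in second), default=0.0) for box in first]


print(subtract([10, 3, 2]))  # 5
print(subtract([3, 10, 2]))  # -9, same inputs in a different column order
print(divide([24, 4, 2]))    # 3.0

predicted = [(0.0, 0.0, 2.0, 2.0), (5.0, 5.0, 6.0, 6.0)]
ground_truth = [(1.0, 0.0, 3.0, 2.0)]
print(max_iou(predicted, ground_truth))  # roughly [0.333, 0.0]
```

Swapping `predicted` and `ground_truth` changes both the length and the meaning of the result, which is why IoU is listed among the order-dependent operations.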