tlc.helpers

Utility helper classes for working with 3LC concepts.

Each helper is a class of static methods grouped around a single concept (color, geometry, schema, etc.). Import helpers explicitly, for example: from tlc.helpers import ColorHelper.

Package Contents

Classes

  • AnnotationColumn – Structural view of an annotation column.
  • AnnotationHelper – Discover and inspect annotation columns in a table without depending on column names or schema constants.
  • AnnotationType – Supported annotation column types.
  • ColorHelper – Helpers for converting between common color representations.
  • DateTimeHelper – A class with helper methods for working with timestamps.
  • GeometryHelper – Helper class for geometry.
  • ImageHelper – Helper class for image operations.
  • KeypointHelper – Static helpers to read keypoint geometry and metadata from 3LC Tables.
  • ProjectHelper – Helper methods for project-level operations.
  • ProjectLayout – Static helpers for constructing URLs that conform to the 3LC project folder layout.
  • SchemaHelper – A class with helper methods for working with Schema objects.
  • SegmentationHelper – Helper class for segmentation operations.

API

class AnnotationColumn

Structural view of an annotation column.

Variables:
  • name – Top-level column name in the table.

  • type – The detected annotation type.

  • label_path – Full dot-separated path to the label leaf, or None if the column has no label field.

label_path: str | None = None
name: str = None
type: AnnotationType = None
class AnnotationHelper

Discover and inspect annotation columns in a table without depending on column names or schema constants.

static find(
table: Table,
*,
type: AnnotationType | None = None,
) AnnotationColumn | None

Locate the annotation column in a table.

Detection is purely structural: column names are not consulted, whereas the sample-type config and well-known sub-field names are. Asking for type=BOUNDING_BOXES matches both new and legacy bounding box columns; type=LEGACY_BOUNDING_BOXES matches legacy only. All other type filters are exact.

Parameters:
  • table – The table to inspect.

  • type – Optional type filter. If omitted, any annotation column matches.

Returns:

The single matching AnnotationColumn, or None if no annotation column matches.

Raises:

ValueError – If more than one column matches. The error lists the candidate names; call AnnotationHelper.get() with an explicit column_name to disambiguate.
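
The matching rules described above can be sketched without the library. This is a stand-alone illustration, not the 3LC implementation: the AnnotationType and AnnotationColumn classes below are simplified stand-ins for the documented ones, and the hypothetical find() here takes a list of already-classified columns instead of a Table.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class AnnotationType(Enum):  # stand-in with a subset of the documented members
    BOUNDING_BOXES = "bounding_boxes"
    LEGACY_BOUNDING_BOXES = "legacy_bounding_boxes"
    KEYPOINTS = "keypoints"


@dataclass
class AnnotationColumn:  # stand-in with the documented fields
    name: str
    type: AnnotationType
    label_path: Optional[str] = None


def matches(column_type, wanted):
    """Apply the documented filter rules to one column type."""
    if wanted is None:
        return True  # no filter: any annotation column matches
    if wanted is AnnotationType.BOUNDING_BOXES:
        # BOUNDING_BOXES matches both new and legacy bounding box columns.
        return column_type in (
            AnnotationType.BOUNDING_BOXES,
            AnnotationType.LEGACY_BOUNDING_BOXES,
        )
    return column_type is wanted  # all other filters are exact


def find(columns, type=None):
    """Return the single match, None for no match, or raise on ambiguity."""
    candidates = [c for c in columns if matches(c.type, type)]
    if len(candidates) > 1:
        names = ", ".join(c.name for c in candidates)
        raise ValueError(f"Multiple annotation columns match: {names}")
    return candidates[0] if candidates else None


columns = [
    AnnotationColumn("boxes", AnnotationType.LEGACY_BOUNDING_BOXES),
    AnnotationColumn("pose", AnnotationType.KEYPOINTS),
]
print(find(columns, type=AnnotationType.BOUNDING_BOXES).name)
```

With two annotation columns present, calling the sketch without a type filter raises ValueError, which mirrors the documented ambiguity case.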

static get(
table: Table,
column_name: str,
) AnnotationColumn

Classify a known annotation column by name.

Parameters:
  • table – The table containing the column.

  • column_name – Top-level column name.

Returns:

The AnnotationColumn view of the column.

Raises:
  • KeyError – If the column does not exist in the table.

  • ValueError – If the column exists but is not annotation-shaped.

class AnnotationType

Bases: enum.Enum

Supported annotation column types.

BOUNDING_BOXES = bounding_boxes
KEYPOINTS = keypoints
LEGACY_BOUNDING_BOXES = legacy_bounding_boxes
ORIENTED_BOUNDING_BOXES = oriented_bounding_boxes
SEGMENTATION = segmentation
class ColorHelper

Helpers for converting between common color representations.

static hex_to_rgb_tuple(
hex_color: str,
) tuple[int, int, int]

Convert a #RRGGBB (or RRGGBB) hex string to an (R, G, B) tuple of 0-255 ints.

static rgb_tuple_to_hex(
rgb: tuple[int, int, int],
) str

Convert an (R, G, B) tuple of 0-255 ints to a #RRGGBB hex string.
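
As a rough illustration of the two conversions, here is a stand-alone sketch in plain Python. It is not the ColorHelper source; in particular, the upper-case hex output and the ValueError on malformed input are choices of this sketch, not documented behavior.

```python
def hex_to_rgb_tuple(hex_color: str) -> tuple:
    """Convert '#RRGGBB' or 'RRGGBB' to an (R, G, B) tuple of 0-255 ints."""
    digits = hex_color[1:] if hex_color.startswith("#") else hex_color
    if len(digits) != 6:
        raise ValueError(f"Expected 6 hex digits, got {hex_color!r}")
    # Each channel is two hex digits.
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


def rgb_tuple_to_hex(rgb) -> str:
    """Convert an (R, G, B) tuple of 0-255 ints to a '#RRGGBB' string."""
    r, g, b = rgb
    return f"#{r:02X}{g:02X}{b:02X}"


print(hex_to_rgb_tuple("#FF8000"))
print(rgb_tuple_to_hex((255, 128, 0)))
```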

class DateTimeHelper

A class with helper methods for working with timestamps.

static compare_timestamps(
timestamp_1: str | datetime | None,
timestamp_2: str | datetime,
) timedelta

Compare timestamps with time zone information.

The function parses both timestamps and returns their difference as a timedelta.

Parameters:
  • timestamp_1 – The first timestamp to compare.

  • timestamp_2 – The second timestamp to compare.

Returns:

The difference between the timestamps, as a timedelta. A positive value indicates that timestamp_1 is later than timestamp_2.

Raises:

ValueError – If a timestamp is invalid.
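
A minimal sketch of a time-zone-aware comparison using only the standard library. It assumes ISO 8601 strings with explicit UTC offsets; rejecting naive timestamps is a choice of this sketch, and the None case allowed by the signature is not modeled because its behavior is not described here.

```python
from datetime import datetime, timedelta


def compare_timestamps(timestamp_1, timestamp_2) -> timedelta:
    """Return timestamp_1 - timestamp_2 for time-zone-aware inputs."""

    def parse(ts):
        # datetime.fromisoformat raises ValueError on malformed strings.
        value = ts if isinstance(ts, datetime) else datetime.fromisoformat(ts)
        if value.tzinfo is None:
            raise ValueError(f"Timestamp has no time zone information: {ts!r}")
        return value

    return parse(timestamp_1) - parse(timestamp_2)


# 12:00 UTC versus 13:00 at UTC+2 (which is 11:00 UTC).
delta = compare_timestamps("2024-01-01T12:00:00+00:00", "2024-01-01T13:00:00+02:00")
print(delta.total_seconds())
```

The result is positive because the first timestamp is the later instant once both are normalized to the same time zone.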

class GeometryHelper

Helper class for geometry.

static create_isotropic_bounds_2d(
x_min: float,
x_max: float,
y_min: float,
y_max: float,
) tuple[float, float, float, float]

Create isotropic bounds for a set of 2D points.

static create_isotropic_bounds_3d(
x_min: float,
x_max: float,
y_min: float,
y_max: float,
z_min: float,
z_max: float,
*,
force_z_min: bool = False,
) tuple[float, float, float, float, float, float]

Create isotropic bounds for a set of 3D points.

static load_obj_geometry(
obj_path: str,
scale: float = 1.0,
transform: ndarray | None = None,
bounds_3d: tuple[float, float, float, float, float, float] | None = None,
) Geometry3D

Load vertices and triangles from an OBJ file.

The OBJ file should contain vertices and faces. Faces that are not triangles are fan-triangulated. Each triangle is assigned the material color of its face.

Parameters:
  • obj_path – The path to the obj file.

  • scale – The scale factor to apply to the vertices.

  • transform – The transformation matrix to apply to the vertices (shape (3,3) or (4,4)).

  • bounds_3d – The 3D bounds of the geometry. If None, the bounds will be computed from the vertices.

Returns:

A Geometry3D object.
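
Fan triangulation, mentioned above, can be sketched as follows. The helper name fan_triangulate is hypothetical and this is not the loader's code; it only shows the general technique of anchoring every triangle at the first vertex of a face.

```python
def fan_triangulate(face):
    """Split a polygon (a list of vertex indices) into triangles sharing face[0]."""
    if len(face) < 3:
        raise ValueError("A face needs at least three vertices")
    anchor = face[0]
    # Pair the anchor with each consecutive edge (face[i], face[i + 1]).
    return [(anchor, face[i], face[i + 1]) for i in range(1, len(face) - 1)]


print(fan_triangulate([0, 1, 2, 3]))
```

A face with n vertices yields n - 2 triangles, so a triangle passes through unchanged. The approach is only well behaved for convex faces.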

class ImageHelper

Helper class for image operations.

static get_exif_image_dimensions(
image_url: str | Path | Url,
) tuple[int, int]

Get the dimensions of an image, accounting for Exif orientation.

Parameters:

image_url – The URL of the image.

Returns:

The image dimensions (height, width).

static get_exif_image_dimensions_from_bytes(
image_bytes: bytes,
) tuple[int, int]

Get the dimensions of an image from bytes, accounting for Exif orientation.

Parameters:

image_bytes – The bytes of the image.

Returns:

The image dimensions (height, width).
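
To illustrate what accounting for Exif orientation means for dimensions: Exif orientation values 5 through 8 describe a 90-degree rotation (with or without mirroring), so the displayed height and width are swapped relative to the stored pixel grid. The function below is a hypothetical stand-alone sketch that takes the stored size and the orientation tag directly, not the ImageHelper implementation.

```python
def oriented_dimensions(stored_height, stored_width, orientation):
    """Return (height, width) as displayed after applying the Exif orientation."""
    # Orientations 5, 6, 7 and 8 involve a quarter turn, which swaps the axes.
    if orientation in (5, 6, 7, 8):
        return stored_width, stored_height
    return stored_height, stored_width


print(oriented_dimensions(480, 640, 6))
```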

class KeypointHelper

Static helpers to read keypoint geometry and metadata from 3LC Tables.

Includes COCO defaults (names, skeleton, colors, flip indices) for convenience.

COCO_FLIP_INDICES: ClassVar[list[int]] = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]

Flip indices for the COCO person keypoints.
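
One common use of flip indices is horizontal-flip augmentation: after mirroring the x coordinates, left and right keypoints must trade places so that labels stay correct. The sketch below copies the index list shown above and assumes keypoints given as (x, y) pairs normalized to [0, 1]; the function name is hypothetical.

```python
COCO_FLIP_INDICES = [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]


def flip_keypoints_horizontally(keypoints, flip_indices):
    """Mirror normalized (x, y) keypoints and swap left/right partners."""
    mirrored = [(1.0 - x, y) for x, y in keypoints]
    # Slot i receives the mirrored point of its left/right partner.
    return [mirrored[j] for j in flip_indices]


keypoints = [(i / 32, 0.5) for i in range(17)]
flipped = flip_keypoints_horizontally(keypoints, COCO_FLIP_INDICES)
print(flipped[1])  # slot 1 now holds the mirrored point from slot 2
```

Because the mapping swaps pairs, applying the flip twice returns the original keypoints.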

COCO_KEYPOINT_DEFAULT_POSE: ClassVar[list[float]] = [0.5, 0.15, 0.47, 0.14, 0.53, 0.14, 0.45, 0.15, 0.55, 0.15, 0.4, 0.25, 0.6, 0.25, 0.38, 0.4, 0.62, 0...
COCO_KEYPOINT_NAMES: ClassVar[list[str]] = ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear', 'left_shoulder', 'right_shoulder', 'left_...

Names of the 17 COCO person keypoints.

COCO_SKELETON: ClassVar[list[int]] = [3, 1, 4, 2, 1, 0, 0, 2, 5, 6, 5, 7, 6, 8, 7, 9, 8, 10, 11, 12, 11, 13, 12, 14, 13, 15, 14, 16, 5, 1...

Default skeleton of the 17 COCO person keypoints.

COCO_SKELETON_COLORS: ClassVar[list[tuple[int, int, int]]] = [(102, 204, 255), (51, 153, 255), (102, 0, 204), (51, 102, 255), (255, 128, 0), (153, 255, 204), (12...

Colors for the COCO skeleton line segments.

COCO_SKELETON_NAMES: ClassVar[list[str]] = ['left_ear_to_left_eye', 'right_ear_to_right_eye', 'left_eye_to_nose', 'nose_to_right_eye', 'left_sh...

Names for the COCO skeleton line segments.

static edit_default_keypoints(
table: Table,
keypoints: list[float] | list[list[float]] | list[tuple[float, float]] | ndarray,
label_column_name: str = KEYPOINTS_2D,
table_name: str = 'edited_default_keypoints',
*,
table_url: Url | None = None,
) Table

Edit the default keypoints for a keypoint column in a Table.

The default keypoint values will be stored in the Table’s rows schema and used for drawing new instances in the 3LC Dashboard.

Parameters:
  • table – The Table to edit

  • keypoints – The new keypoints

  • label_column_name – The name of the keypoint column to edit

  • table_name – The name of the new table

  • table_url – The URL of the new table

Returns:

The edited Table

static edit_default_lines(
table: Table,
lines: list[int] | list[list[int]] | ndarray,
label_column_name: str = KEYPOINTS_2D,
table_name: str = 'edited_default_lines',
*,
table_url: Url | None = None,
) Table

Edit the default lines for a keypoint column in a Table.

static edit_oks_sigmas(
table: Table,
oks_sigmas: list[float] | None = None,
label_column_name: str = KEYPOINTS_2D,
table_name: str = 'edited_oks_sigmas',
*,
table_url: Url | None = None,
) Table

Edit the OKS sigmas for a keypoint column in a Table.

Parameters:
  • table – The Table to edit

  • oks_sigmas – The new OKS sigmas

  • label_column_name – The name of the keypoint column to edit

  • table_name – The name of the new table

  • table_url – The URL of the new table

Returns:

The edited Table

static flatten_lines(
lines: Sequence[int] | Sequence[Sequence[int]] | ndarray,
) list[int]

Returns a flat list of lines.

Parameters:

lines – Can be lists of (i0, j0, i1, j1, …), numpy arrays or nested lists.

Returns:

A flat list of lines

static flatten_points(
points: Sequence[float] | Sequence[Sequence[float]] | ndarray,
) list[float]

Returns a flat list of points.

Parameters:

points – Can be lists of (x,y) or (x, y, z), numpy arrays or nested lists.

Returns:

A flat list of points

static flatten_triangles(
triangles: Sequence[int] | Sequence[Sequence[int]] | ndarray,
) list[int]

Returns a flat list of triangles.

Parameters:

triangles – Can be lists of (i, j, k, …), numpy arrays or nested lists.

Returns:

A flat list of triangles
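
The three flatten_* helpers share one idea: nested or already-flat input becomes a single flat list. A minimal sketch for lists and tuples follows; NumPy arrays, which the helpers also accept, are deliberately left out to keep the example dependency-free.

```python
def flatten(values):
    """Recursively flatten nested lists/tuples into one flat list."""
    flat = []
    for item in values:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten(item))
        else:
            flat.append(item)
    return flat


print(flatten([[0, 1], [1, 2]]))          # line index pairs
print(flatten([(0, 1, 2), (2, 3, 0)]))    # triangle index triplets
```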

static get_flip_indices_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[int] | None

Return the horizontal-flip index mapping as a list, or None.

static get_keypoint_attributes_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[dict[str, Any]] | None

Return keypoint attribute dicts (e.g., names/ids) from the schema, or None.

static get_keypoint_shape_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[int] | None

Return [num_keypoints, num_channels] inferred from the table, or None.

Channels: 2 => x,y only; 3 => x,y plus an extra channel (e.g., visibility).

static get_line_attributes_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[dict[str, Any]] | None

Return line attribute dicts (matching the skeleton order), or None.

static get_lines_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[int] | None

Return flattened skeleton index pairs [i0, j0, i1, j1, …], or None.

static get_oks_sigmas_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[float] | None

Return the list of OKS sigma values, or None.

static get_points_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[float] | None

Return default vertex coordinates ([x0, y0, …]) or None.

static get_triangle_attributes_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[dict[str, Any]] | None

Return triangle attribute dicts (matching the triangle order), or None.

static get_triangles_from_table(
table: Table,
label_column_name: str = KEYPOINTS_2D,
) list[int] | None

Return flattened triangle index triplets [i, j, k, …], or None.

static parse_bounding_box(
bounding_box: Sequence[float] | ndarray,
format: str = 'xyxy',
) ndarray

Parse bounding box in various formats.

Parameters:
  • bounding_box – Bounding box as list, tuple, or array

  • format – Currently only “xyxy” is supported

Returns:

Array of shape (1, 4) with [x_min, y_min, x_max, y_max]

static parse_keypoints_with_visibility(
keypoints: list[float] | list[list[float]] | list[tuple[float, float]] | ndarray,
) tuple[ndarray, ndarray | None]

Parse keypoints in various formats, extracting coordinates and optional visibility.

Supported keypoint formats:

  • Flat list: [x1, y1, x2, y2, ...] or [x1, y1, v1, x2, y2, v2, ...]

  • List of pairs: [[x1, y1], [x2, y2], ...]

  • List of triplets: [[x1, y1, v1], [x2, y2, v2], ...]

  • NumPy array of shape (K, 2) or (K, 3)

Parameters:

keypoints – The keypoints to parse (see formats above).

Returns:

A tuple of (keypoints_array, visibility_array) where keypoints_array has shape (K, 2) with x,y coordinates, and visibility_array has shape (K,) with visibility flags (or None if visibility was not present in the input).
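
The nested formats can be split into coordinates and visibility roughly as shown below. This sketch covers only lists of pairs and lists of triplets, returns plain lists instead of NumPy arrays, and does not attempt the flat-list format, since how a flat list is disambiguated is not described here.

```python
def split_keypoints(keypoints):
    """Split K pairs or K triplets into (coordinates, visibility or None)."""
    widths = {len(point) for point in keypoints}
    if widths not in ({2}, {3}):
        raise ValueError("Expected K (x, y) pairs or K (x, y, v) triplets")
    coordinates = [[float(p[0]), float(p[1])] for p in keypoints]
    # Visibility is only present when every keypoint carries a third value.
    visibility = [p[2] for p in keypoints] if widths == {3} else None
    return coordinates, visibility


print(split_keypoints([[10, 20, 2], [30, 40, 0]]))
```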

static parse_per_keypoint_channel(
num_keypoints: int,
visibility: Sequence[int] | ndarray | None = None,
confidence: Sequence[float] | ndarray | None = None,
derived_visibility: ndarray | None = None,
) tuple[ndarray | None, ndarray | None]

Parse and validate per-keypoint visibility or confidence channel.

Parameters:
  • num_keypoints – Expected number of keypoints

  • visibility – Optional visibility values

  • confidence – Optional confidence values

  • derived_visibility – Optional visibility derived from keypoint parsing

Returns:

Tuple of (visibility_array, confidence_array), both shape (1, K) or None

Raises:

ValueError – If both visibility and confidence are provided

class ProjectHelper

Helper methods for project-level operations.

This class provides static helpers for project functionality that goes beyond URL/path construction (which lives in ProjectLayout).

static register_project_url_alias(
token: str,
path: str | Url,
*,
project_name: str | None = None,
root_url: Url | str | None = None,
force: bool = True,
) None

Register and persist a URL alias for a project.

A project URL alias is a per-project alias that is persisted in the project’s configuration file and will be loaded for other users that share the same project but not necessarily the same startup-config. The alias is also registered in the current session and is immediately available for use.

Parameters:
  • token – The alias token to register. Must match the regex [A-Z][A-Z0-9_]*.

  • path – The path to alias.

  • project_name – The project name.

  • root_url – The root URL.

  • force – If True, force the registration of the alias even if it is already registered.

Raises:

ValueError – If the token is already registered and force is False.
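
The token rule and the force flag can be illustrated with a small in-memory registry. This is a sketch under assumptions: the registry dict and function names are hypothetical, the whole token is assumed to have to match the regex, and raising ValueError for an invalid token is a choice of this sketch. Nothing here persists to a project configuration file.

```python
import re

TOKEN_PATTERN = re.compile(r"[A-Z][A-Z0-9_]*")


def is_valid_alias_token(token: str) -> bool:
    """True if the entire token matches [A-Z][A-Z0-9_]*."""
    return TOKEN_PATTERN.fullmatch(token) is not None


def register_alias(registry: dict, token: str, path: str, force: bool = True) -> None:
    """Store token -> path, refusing to overwrite unless force is True."""
    if not is_valid_alias_token(token):
        raise ValueError(f"Invalid alias token: {token!r}")
    if token in registry and not force:
        raise ValueError(f"Alias {token!r} is already registered")
    registry[token] = path


registry = {}
register_alias(registry, "MY_DATA_2", "/mnt/data")
register_alias(registry, "MY_DATA_2", "/mnt/other")  # force=True overwrites
print(registry)
```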

class ProjectLayout

Static helpers for constructing URLs that conform to the 3LC project folder layout.

All methods are static. When project_name or root_url are omitted the fallback chain is:

  1. The active Run (via Session).

  2. The configured project root URL and the fallback project name.

static create_unique_table_url(
url: Url | str,
*,
require_writable: bool = False,
) Url

Create a unique version of a table URL, optionally ensuring writability.

Calls tlc.Url.create_unique() to find a unique URL. If require_writable is True and the resulting URL is not writable, a fallback URL is created under the configured project root URL.

Parameters:
  • url – The base URL to make unique.

  • require_writable – If True, ensure the returned URL is writable. When the original location is not writable, the URL is relocated under the configured project root URL.

Returns:

A unique URL (writable if require_writable is True).

Raises:

ValueError – If require_writable is True but a writable URL cannot be created.

static dataset_url(
*,
dataset_name: str | None = None,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for a dataset directory.

Parameters:
  • dataset_name – The dataset name. If not provided, the default dataset name is used.

  • project_name – The project name.

  • root_url – The root URL.

Returns:

A URL for the dataset directory.

static datasets_url(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for the datasets subdirectory of a project.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

A URL for the datasets directory.

static default_project_aliases_config_url(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for the default-alias config file for a project.

Such a file is automatically read by any 3LC client and makes it possible to share a project without requiring extra configuration.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

A URL for the default-alias config file.

static is_dataset_table_url(
url: Url | str,
) bool

Check if url is a standard dataset table URL.

A canonical dataset table URL has the form …/<project>/datasets/<dataset>/tables/<table>.

Parameters:

url – The URL to check.

Returns:

True if the URL matches the canonical layout, False otherwise.

static is_project_dataset_url(
url: Url | str,
) bool

Check if url is a standard project dataset URL.

A canonical project dataset URL has the form …/<project>/datasets/<dataset>.

Parameters:

url – The URL to check.

Returns:

True if the URL matches the canonical layout, False otherwise.

static is_project_run_url(
url: Url | str,
) bool

Check if url is a standard project run URL.

A canonical project run URL has the form …/<project>/runs/<run>.

Parameters:

url – The URL to check.

Returns:

True if the URL matches the canonical layout, False otherwise.

static is_run_metrics_table_url(
url: Url | str,
) bool

Check if url is a standard run metrics table URL.

A canonical run metrics table URL has the form …/<project>/runs/<run>/metrics_<suffix>.

Parameters:

url – The URL to check.

Returns:

True if the URL matches the canonical layout, False otherwise.

static list_dataset_names(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) list[str]

List all dataset names in a project.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

A list of dataset names.

static list_project_names(
*,
root_url: Url | str | None = None,
include_scan_urls: bool = True,
) list[str]

List all project names under the root URL and optional scan URLs.

Parameters:
  • root_url – The root URL to scan. Defaults to the configured project root URL.

  • include_scan_urls – If True, also scan project-layout URLs listed in Configuration.scan_urls.

Returns:

A deduplicated list of project names.

static list_run_names(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) list[str]

List all run names in a project.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

A list of run names.

static list_table_names(
*,
dataset_name: str | None = None,
project_name: str | None = None,
root_url: Url | str | None = None,
) list[str]

List all table names in a dataset.

Parameters:
  • dataset_name – The dataset name.

  • project_name – The project name.

  • root_url – The root URL.

Returns:

A list of table names.

static max_url_relativization_depth(
url: Url | str,
) int

Return the maximum depth to relativize a URL residing in a project folder.

If the URL is not identified as residing within a project folder, no relativization will be applied.

static project_exists(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) bool

Check whether the project directory exists.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

True if the project directory exists, False otherwise.

static project_url(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for a project conforming to the 3LC project folder layout.

When project_name or root_url are not provided, the fallback chain is:

  1. The active Run (via Session).

  2. The configured project root URL and the fallback project name.

Parameters:
  • project_name – The project name. If not provided, the active or fallback project is used.

  • root_url – The root URL. If not provided, the configured project root URL is used.

Returns:

A URL for the project directory.

static run_url(
*,
run_name: str | None = None,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for a run conforming to the 3LC project folder layout.

Parameters:
  • run_name – The run name. If not provided, the default run name is used.

  • project_name – The project name. If not provided, the active or fallback project is used.

  • root_url – The root URL. If not provided, the configured project root URL is used.

Returns:

A URL for the run.

static run_url_parts(
url: Url | str,
) tuple[str, str, str] | None

Extract the root, project name, and run name from a run URL.

This is the inverse of run_url(). Supports the following URL layouts:

  • .../<project>/runs/<run>

  • .../<project>/runs/<run>/<metrics_table>

Parameters:

url – The URL to parse.

Returns:

A (root, project_name, run_name) tuple, or None if the URL does not match.
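
The run layout above can be parsed with plain string handling. The sketch below works on strings instead of Url objects and takes the first "runs" path segment as the marker; how the library handles edge cases (for example a project that is itself named "runs") is not covered.

```python
def run_url_parts(url: str):
    """Return (root, project_name, run_name), or None if the layout does not match."""
    parts = url.rstrip("/").split("/")
    # "runs" needs a project segment before it and a run segment after it.
    for i in range(1, len(parts) - 1):
        if parts[i] == "runs":
            return "/".join(parts[:i - 1]), parts[i - 1], parts[i + 1]
    return None


print(run_url_parts("/data/3LC/my_project/runs/run_1"))
print(run_url_parts("/data/3LC/my_project/runs/run_1/metrics_0"))
```

Both layouts resolve to the same tuple, since anything after the run segment is ignored.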

static runs_url(
*,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for the runs subdirectory of a project.

Parameters:
  • project_name – The project name.

  • root_url – The root URL.

Returns:

A URL for the runs directory.

static table_url(
*,
table_name: str | None = None,
dataset_name: str | None = None,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for a table conforming to the 3LC project folder layout.

Parameters:
  • table_name – The table name. If not provided, the default table name is used.

  • dataset_name – The dataset name. If not provided, the default dataset name is used.

  • project_name – The project name. If not provided, the active or fallback project is used.

  • root_url – The root URL. If not provided, the configured project root URL is used.

Returns:

A URL for the table.

static table_url_parts(
url: Url | str,
) tuple[str, str, str, str] | None

Extract the root, project name, dataset name, and table name from a table URL.

This is the inverse of table_url(). Supports the following URL layouts:

  • .../<project>/datasets/<dataset>/tables/<table>

  • .../<project>/datasets/<dataset>/tables/<table>/<sub_table>/...

Parameters:

url – The URL to parse.

Returns:

A (root, project_name, dataset_name, table_name) tuple, or None if the URL does not match.
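
A comparable sketch for the table layout, again operating on plain strings and matching the "datasets" and "tables" marker segments by position. It illustrates the documented layouts only and is not the ProjectLayout implementation.

```python
def table_url_parts(url: str):
    """Return (root, project, dataset, table), or None if the layout does not match."""
    parts = url.rstrip("/").split("/")
    # Look for .../<project>/datasets/<dataset>/tables/<table>[/...].
    for i in range(1, len(parts) - 3):
        if parts[i] == "datasets" and parts[i + 2] == "tables":
            root = "/".join(parts[:i - 1])
            return root, parts[i - 1], parts[i + 1], parts[i + 3]
    return None


print(table_url_parts("s3://bucket/root/my_project/datasets/ds/tables/initial"))
```

Trailing sub-table segments do not change the result, which matches the second documented layout.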

static tables_url(
*,
dataset_name: str | None = None,
project_name: str | None = None,
root_url: Url | str | None = None,
) Url

Create a URL for the tables subdirectory of a dataset.

Parameters:
  • dataset_name – The dataset name.

  • project_name – The project name.

  • root_url – The root URL.

Returns:

A URL for the tables directory.

class SchemaHelper

A class with helper methods for working with Schema objects.

ARROW_TYPE_TO_SCALAR_VALUE_MAPPING: ClassVar = None

A mapping from PyArrow types to ScalarValue types.

SCALAR_VALUE_TYPE_TO_ARROW_TYPE_MAPPING: ClassVar = None

A mapping from ScalarValue types to PyArrow types.

static build_pyarrow_schema_for_batch(
resolved_schema: Schema,
batch: Mapping[str, list[Any]],
) Schema

Build a PyArrow schema for batch, applying 3lc-schema overrides.

Uses PyArrow’s own inference against a single representative value per column (the first non-None entry), then replaces types for columns that have an explicit 3lc-schema override via tlc_schema_to_pyarrow_schema().

Parameters:
  • resolved_schema – The 3lc schema that applies to batch.

  • batch – The row-form batch (column-name → list of values).

Returns:

A pyarrow.Schema describing the batch.
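
The "single representative value per column" step can be pictured as below. This stand-alone sketch only shows the selection of the first non-None entry; the PyArrow inference and the 3lc-schema overrides are not reproduced, and the function name is hypothetical.

```python
def representative_values(batch):
    """Pick the first non-None entry of each column, or None if there is none."""
    return {
        name: next((value for value in values if value is not None), None)
        for name, values in batch.items()
    }


batch = {"score": [None, 0.5, 0.7], "label": [None, None], "path": ["a.png"]}
print(representative_values(batch))
```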

static cast_scalar(
value: Any,
value_type: ScalarValue,
) Any

Cast a value which is a ScalarValue into its corresponding python type.

static cast_value(
value: typing.Any,
value_schema: tlc.schemas._schema.Schema,
on_error: typing.Literal['raise', 'discard'] = 'raise',
) Any

Cast any value into its corresponding python type based on the Schema.

static create_sparse_schema_from_scalar_value(
path: str,
scalar_value: ScalarValue,
) Schema

Creates a sparse schema from a path and a scalar value.

Parameters:
  • path – The (dot-separated) path to the nested schema.

  • scalar_value – The scalar value to create the sparse schema from.

Returns:

The sparse schema.

static create_sparse_schema_from_schema(
path: str,
schema: Schema,
) Schema

Creates a sparse schema from a path and a schema.

Parameters:
  • path – The (dot-separated) path to the nested schema.

  • schema – The schema to create the sparse schema from.

Returns:

The sparse schema.

static declare_bulk_data_columns(
schema: Schema,
columns: list[str],
) None

Declares a list of columns as bulk data columns.

Parameters:
  • schema – The schema to declare the bulk data columns in.

  • columns – The list of columns to declare as bulk data columns.

static default_scalar(
value_type: ScalarValue,
) Any

Returns the default value for a ScalarValue.

static default_value(
schema: Schema,
) Any

Returns the default value for a schema.

A schema holds either:

  • a ScalarValue (schema.value) which corresponds to a scalar type (potentially an array of scalars)

  • a dict of sub-Schemas (schema.values) corresponding to compound types (potentially an array)

static display_name_to_column_name(
display_name: str,
index: int,
) str

Derive a column name from a Schema’s display_name.

Uses the display_name as-is if it passes column name validation, otherwise falls back to value_{index}.

Parameters:
  • display_name – The display_name of the schema.

  • index – The positional index of the schema (used for fallback naming).

Returns:

A valid column name string.
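
The fallback rule can be sketched as follows. The actual column name validation is not described here, so str.isidentifier is used purely as a hypothetical stand-in and is exposed as a parameter that can be swapped out.

```python
def display_name_to_column_name(display_name, index, is_valid=str.isidentifier):
    """Use display_name if it validates, otherwise fall back to value_{index}."""
    # is_valid is an assumed placeholder for the real validation rule.
    return display_name if is_valid(display_name) else f"value_{index}"


print(display_name_to_column_name("confidence", 0))
print(display_name_to_column_name("Image Width", 3))
```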

static find_pyarrow_types(
arrow_schema: Schema,
scalar_types: list[DataType],
) list[dict[str, object]]

Find all the paths in an Arrow schema that correspond to scalar types.

static from_pyarrow_datatype(
data_type: DataType,
) ScalarValue | None

Converts a DataType to a ScalarValue.

Parameters:

data_type – The pyarrow DataType object to convert.

Returns:

The type of the scalar value that corresponds to the pyarrow DataType.

static get_bulk_data_values(
schema: Schema,
path: list[str] | None = None,
) list[str]

Returns a list of bulk data values from a schema.

Parameters:
  • schema – The schema to get the bulk data values from.

  • path – The current path in the schema hierarchy, used for recursion.

Returns:

A list of dot-separated paths to leaf schemas that are bulk data.

static get_nested_schema(
schema: Schema,
path: str,
) Schema | None

Retrieves a nested schema from a schema.

Parameters:
  • schema – The schema to retrieve the nested schema from.

  • path – The (dot-separated) path to the nested schema.

Returns:

The nested schema, or None if the path doesn’t exist.
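
Dot-separated path lookup can be illustrated with nested dicts standing in for Schema objects. This is an analogy under that assumption; the real method walks Schema instances, not dicts.

```python
def get_nested(schema: dict, path: str):
    """Follow a dot-separated path through nested dicts; None if any step is missing."""
    node = schema
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


schema = {"bbs": {"bb_list": {"label": {"type": "int32"}}}}
print(get_nested(schema, "bbs.bb_list.label"))
print(get_nested(schema, "bbs.missing.label"))
```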

static is_computable(
schema: Schema,
) bool

Returns True if the schema is computable.

static is_embedding_value(
schema: Schema,
) bool

Returns True if the schema is an atomic schema describing an unreduced embedding value.

static is_numeric_value(
schema: Schema,
) bool

Returns True if the schema is an atomic schema describing a numeric value.

static nested_relativizable_columns(
schema: Schema,
column_path_to_here: list[str] | None = None,
) list[list[str]]

Leaf paths whose row-form value is a URL that should be relativized on write.

Superset of nested_url_columns(): also includes leaves whose resolved sample type is an ExternalSampleType, since those always emit an absolute URL string even if the schema’s string role is not set to a URL/... value. Excludes URL/raw leaves (chunk-pattern offset-length encoding, handled separately).

Parameters:
  • schema – Schema to walk.

  • column_path_to_here – Internal accumulator; leave unset at call sites.

static nested_url_columns(
schema: Schema,
column_path_to_here: list[str] | None = None,
) list[list[str]]

Get columns from the schema that have string roles URL/X. Each column is represented as a list of strings, with subsequent strings denoting nested columns.

Parameters:
  • schema – The schema to retrieve the URL columns from.

  • column_path_to_here – The path to the current schema.

static object_input_urls(
obj: Any,
schema: Schema,
) list[Url]

Returns a list of all URLs referenced by this object, collected from scalar strings or lists of strings.

Note: the returned URLs are likely to be relative to the object’s URL.

static populate_default_values(
row_data: Any,
row_schema: Schema,
) Any

Recursively populate default values according to row_schema.

static pyarrow_schema_to_tlc_schema(
arrow_schema: Schema,
**schema_kwargs: Any,
) Schema

Convert a PyArrow schema to a 3LC schema.

Parameters:
  • arrow_schema – The PyArrow schema to convert.

  • **schema_kwargs – Additional keyword arguments to pass to the Schema constructor.

Returns:

The 3LC schema.

static scalar_value_to_pyarrow_datatype(
value: ScalarValue,
) DataType

Converts a ScalarValue to a pyarrow DataType.

Parameters:

value – The scalar value to convert.

Returns:

The corresponding pyarrow datatype.

static set_nested_schema(
schema: Schema,
path: str,
value: Schema,
) None

Sets a nested schema in a schema.

Parameters:
  • schema – The schema to set the nested schema in.

  • path – The (dot-separated) path to the nested schema.

  • value – The value to set the nested schema to.

Raises:

ValueError – If the path to the schema does not exist or if the leaf node already exists.
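
The two error conditions can be sketched with nested dicts as a stand-in for Schema objects. The function name and the dict representation are assumptions made for illustration only.

```python
def set_nested(schema: dict, path: str, value) -> None:
    """Insert value at a dot-separated path whose parents exist and whose leaf is new."""
    *parents, leaf = path.split(".")
    node = schema
    for key in parents:
        if not isinstance(node.get(key), dict):
            raise ValueError(f"Path does not exist: {path!r}")
        node = node[key]
    if leaf in node:
        raise ValueError(f"Leaf already exists: {path!r}")
    node[leaf] = value


schema = {"bbs": {"bb_list": {}}}
set_nested(schema, "bbs.bb_list.label", {"type": "int32"})
print(schema)
```

Calling the sketch a second time with the same path raises ValueError, as does a path whose intermediate keys are missing.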

static tlc_schema_to_pyarrow_schema(
tlc_schema: Schema,
) Schema

Convert a 3LC schema to a PyArrow schema.

Parameters:

tlc_schema – The 3LC schema to convert.

Returns:

The PyArrow schema.

static to_pyarrow_datatype(
schema_or_value: Schema | ScalarValue,
) DataType

Converts a Schema or ScalarValue to a pyarrow DataType.

Currently supports scalar types, lists of scalar types, structs, and lists of structs.

Parameters:

schema_or_value – The schema or scalar value to convert.

Returns:

The corresponding pyarrow datatype.

static to_simple_value_map(
value_map: dict[float, MapElement],
) dict[int, str]

Converts a value map with float keys and MapElement values to a map with int keys and str values.

static top_level_url_values(
schema: Schema,
) list[str]

Return the keys of the top-level sub-schemas that represent atomic URL values.

This function does not return the keys of nested URL values.

Parameters:

schema – The schema to retrieve the URL values from.

Returns:

A list of sub-value keys corresponding to URL values.

class SegmentationHelper

Helper class for segmentation operations.

static bounding_box_from_rle(
rle: CocoRle,
) list[float]

Convert an RLE mask to a bounding box.

Parameters:

rle – The RLE mask to convert

Returns:

The tight bounding box around the mask in COCO [x, y, width, height] format, where (x, y) is the top-left corner in absolute pixels.
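
To make the RLE-to-box relationship concrete, here is a dependency-free sketch for the uncompressed COCO RLE form, where 'counts' is a list of run lengths that alternate between background and foreground in column-major order. It is not the SegmentationHelper implementation, which may operate on compressed counts; returning a zero box for an empty mask is a choice of this sketch.

```python
def decode_uncompressed_rle(rle):
    """Decode {'size': [height, width], 'counts': [runs]} into rows of 0/1."""
    height, width = rle["size"]
    flat, value = [], 0
    for run in rle["counts"]:  # runs alternate, starting with background (0)
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("Run lengths do not cover the mask")
    # Pixels are stored column by column, so (row, col) is at col * height + row.
    return [[flat[col * height + row] for col in range(width)] for row in range(height)]


def bounding_box_from_mask(mask):
    """Return [x, y, width, height] around the set pixels (zeros if none are set)."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows:
        return [0.0, 0.0, 0.0, 0.0]
    x, y = cols[0], rows[0]
    return [float(x), float(y), float(cols[-1] - x + 1), float(rows[-1] - y + 1)]


# A 3x4 mask with a 2x2 block of foreground at rows 1-2, columns 1-2.
rle = {"size": [3, 4], "counts": [4, 2, 1, 2, 3]}
mask = decode_uncompressed_rle(rle)
print(bounding_box_from_mask(mask))
```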

static empty_rle(
height: int,
width: int,
) CocoRle

Create an empty RLE mask with the given dimensions.

Parameters:
  • height – Height of the mask

  • width – Width of the mask

Returns:

An empty RLE mask dictionary with ‘counts’ and ‘size’ fields

static mask_from_polygons(
polygons: list[list[float]],
height: int,
width: int,
*,
relative: bool = False,
) ndarray

Convert a list of polygons to a numpy array.

Parameters:
  • polygons – The list of polygons to convert

  • height – The height of the image

  • width – The width of the image

  • relative – Whether the polygons are relative to the image size

Returns:

A numpy array of shape (H, W, N) containing N binary masks
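
The relative flag concerns the coordinate system of the polygon vertices. Assuming that relative coordinates are x divided by the image width and y divided by the image height (an assumption of this sketch, not a statement from the reference), the conversion to absolute pixels looks like this:

```python
def polygon_to_absolute(polygon, height, width):
    """Scale a flat [x0, y0, x1, y1, ...] polygon from relative to pixel coordinates."""
    # Even positions hold x values (scaled by width), odd positions hold y values.
    return [
        coordinate * (width if i % 2 == 0 else height)
        for i, coordinate in enumerate(polygon)
    ]


triangle = [0.25, 0.5, 0.75, 0.5, 0.5, 1.0]
print(polygon_to_absolute(triangle, height=100, width=200))
```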

static mask_from_rle(
rle: dict[str, list[int] | bytes],
) ndarray

Convert an RLE mask to a numpy array.

Parameters:

rle – The RLE mask to convert

Returns:

A numpy array of shape (H, W, N) containing N binary masks

static masks_from_rles(
rles: list[CocoRle],
) ndarray

Convert multiple RLE masks to a numpy array.

Parameters:

rles – List of RLE dictionaries with ‘counts’ and ‘size’ fields

Returns:

A numpy array of shape (H, W, N) containing N binary masks

static polygons_from_mask(
mask: ndarray,
*,
relative: bool = False,
) list[float]

Convert a binary mask to a list of polygons using OpenCV contour detection.

Parameters:
  • mask – The binary mask to convert

  • relative – Whether to return polygons with coordinates relative to image dimensions

Returns:

List of polygons where each polygon is a flattened list of x,y coordinates

static polygons_from_rles(
rles: list[CocoRle],
*,
relative: bool = False,
) list[list[float]]

Convert a list of RLE encoded masks to polygons.

Parameters:
  • rles – List of RLE dictionaries with ‘counts’ and ‘size’ fields

  • relative – Whether to return polygons with coordinates relative to image dimensions

Returns:

List of polygons where each polygon is a flattened list of x,y coordinates

static rles_from_masks(
masks: ndarray,
) list[CocoRle]

Convert a stack of binary masks to RLE format.

Parameters:

masks – A numpy array of shape (H, W, N) containing N binary masks

Returns:

List of RLE dictionaries with ‘counts’ and ‘size’ fields

static rles_from_polygons(
polygons: list[list[float]],
height: int,
width: int,
*,
relative: bool = False,
) list[CocoRle]

Convert a list of polygons to RLE format.

Parameters:
  • polygons – The list of polygons to convert

  • height – The height of the image

  • width – The width of the image

  • relative – Whether the polygons are relative to the image size

Returns:

List of RLE dictionaries with ‘counts’ and ‘size’ fields