Migrate from 2.x to 3.x

This guide describes the common patterns involved in migrating from version 2.x to version 3.x of the 3LC Python API.

3.x makes the public surface explicit, redesigns the schema / sample-type system, and keeps the runtime type and schema builder for each CV annotation type in one place. Most user code needs only mechanical updates (import paths, keyword arguments); a smaller set of behavior changes needs attention even if the names look unchanged.

Dependency and install changes

Getting a 2.x-equivalent install

Several dependencies that were required in 2.x, and therefore always installed alongside 3lc, have been made optional in 3.x. This reduces the install footprint and import load time for workloads that do not need those dependencies. To get the same set of packages as a 2.x install, install the extras that were required in 2.x:

pip install '3lc[pandas,torch]'

See Dependencies for the full list of available extras, and the PyTorch installation notes for guidance on accelerator-specific torch wheels.

pandas is now an optional dependency

In 2.x, pandas was a hard runtime dependency of 3lc and was always installed alongside the package. In 3.x it has been moved to an optional extra to reduce the install footprint for users who do not need it.

pip install 3lc no longer pulls in pandas. To use the pandas-facing entry points, install the extra with pip install '3lc[pandas]', or pip install pandas or equivalent.

The entry points that require pandas to be importable are:

If your code already uses any of these, install the pandas extra and no code changes are required.

torch and torchvision are now optional dependencies

In 2.x, torch and torchvision were hard runtime dependencies of 3lc and import tlc would fail if either was missing. In 3.x they have been moved to an optional [torch] extra so that torch-free workflows (Tables, Runs, the Object Service, URL adapters) can run on a minimal install.

pip install 3lc no longer pulls in torch or torchvision. To use the torch-bound entry points, install the extra with pip install '3lc[torch]'. The [huggingface] extra depends on [torch] transitively, so pip install '3lc[huggingface]' continues to pull both in.

The entry points that require torch to be importable are:

Calling any of these without torch installed raises ImportError. The framework integration packages require the framework itself as well — their ImportError points at the framework’s own install command (pip install detectron2, pip install super-gradients, pip install datasets / transformers). FunctionalMetricsCollector and the MetricsCollector base class do not require torch.

For accelerator-specific builds (CUDA, ROCm, MPS), follow the official PyTorch install instructions to pick the right index URL: https://pytorch.org/get-started/locally/.
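Because pandas, torch, and torchvision are now optional, a script that may run on a minimal install can check what is importable before it reaches a pandas- or torch-bound entry point. The following is a minimal sketch using only the standard library; the helper name missing_extras and the mapping from extras to module names are illustrative assumptions, not part of the 3LC API.

```python
from importlib.util import find_spec

# Assumed mapping from 3lc extras to the top-level modules they provide.
# The extras are named in this guide; the helper itself is hypothetical.
EXTRA_MODULES = {
    "pandas": ["pandas"],
    "torch": ["torch", "torchvision"],
}


def missing_extras(extra_modules=EXTRA_MODULES):
    """Return the extras for which at least one module cannot be found."""
    missing = []
    for extra, modules in extra_modules.items():
        if any(find_spec(module) is None for module in modules):
            missing.append(extra)
    return missing


if __name__ == "__main__":
    extras = missing_extras()
    if extras:
        print("Missing extras; install with: pip install '3lc[%s]'" % ",".join(extras))
    else:
        print("All 2.x-equivalent dependencies are importable.")
```

A check like this only reports whether the modules can be located; it does not verify versions or accelerator support.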

Namespace moves

In 2.x, almost every public name was reachable directly at the top-level tlc.* namespace via wildcard imports from tlc.client.* and tlc.core.*. In 3.x, the public surface has been curated and is declared explicitly: each public module sets __all__ to the names it exports, following the convention from PEP 8 — Public and Internal Interfaces. Anything not listed in __all__, or sitting behind an underscore-prefixed package path, is private and may change without notice.

  • tlc.* is now an explicit, curated set of names — the most commonly used types and free functions (Table, Run, Url, Schema, TableWriter, MetricsTableWriter, the init / active_run / log session helpers, collect_metrics, config).

  • Curated public sub-namespaces group related concerns: tlc.schemas, tlc.constants, tlc.metrics, tlc.helpers, tlc.export, tlc.reduction, tlc.data_types, tlc.url, tlc.integration, tlc.sample_types, tlc.objects, tlc.configuration. Each is equally public.

  • Implementation packages are private. tlc.core no longer exists as a public path — it has been renamed to tlc._core and is private. tlc.client has been removed entirely; its contents have been redistributed to the curated public sub-namespaces above. Underscore-prefixed names (tlc._core, tlc.schemas._foo, etc.) may move, rename, or be removed at any time.

  • tlcsaas has been renamed to _tlcsaas to reflect that it was always internal infrastructure for the 3LC client.

  • tlc.config is a bound shortcut for the live Configuration singleton (equivalent to tlc.configuration.Configuration.instance()). It is resolved lazily on first access, so import tlc does not force Configuration construction for callers that never touch config. The Configuration type itself now lives at tlc.configuration.Configuration rather than the top level — most code reaches it through the type of tlc.config and never needs to name it.

import tlc

print(tlc.config.project_root_url)
tlc.config.indexing.scan_urls = ["./data"]
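To check whether a name is part of a module's declared public surface, you can inspect __all__ directly. The helper below is a generic sketch: the name is_public is hypothetical, and it is demonstrated on the standard-library json module so that it runs without 3LC installed. With 3.x installed you would presumably pass tlc or one of its public sub-namespaces instead.

```python
import json
from types import ModuleType


def is_public(module: ModuleType, name: str) -> bool:
    """Return True if `name` is listed in the module's __all__.

    Modules without __all__ fall back to the underscore convention.
    """
    exported = getattr(module, "__all__", None)
    if exported is None:
        return hasattr(module, name) and not name.startswith("_")
    return name in exported


print(is_public(json, "dumps"))             # True
print(is_public(json, "_default_encoder"))  # False
```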

Where everything moved

If a name lived at tlc.X in 2.x and is not listed below, see Removed from public API.

| 2.x location | 3.x location |
| --- | --- |
| tlc.client.session.init / close / log / active_run / set_active_run / active_project_name | unchanged: tlc.init / tlc.close / tlc.log / tlc.active_run / tlc.set_active_run / tlc.active_project_name |
| tlc.client.helpers.* | tlc.helpers.* |
| tlc.client.reduce.* (also tlc.reduce_* free functions and tlc.create_reducer) | tlc.reduction.* |
| tlc.client.torch.metrics.* (also tlc.<X>MetricsCollector, tlc.Predictor, …) | tlc.metrics.* |
| tlc.client.torch.samplers.* (the standalone create_sampler, create_weighted_sampler, create_random_sampler, create_sequential_sampler, create_repeat_by_weight_sampler factories and the RangeSampler / RepeatByWeightSampler / SubsetSequentialSampler classes) | tlc.integration.torch.samplers.* |
| tlc.core.builtins.schemas.* (also tlc.<X>Schema for scalar/system schemas) | tlc.schemas.* (the Schema base class stays at tlc.Schema) |
| tlc.core.builtins.constants.* (also tlc.<NAME> constants) | tlc.constants.* |
| tlc.core.export.* (also tlc.<X>Exporter, tlc.register_exporter) | tlc.export.* |
| tlc.core.data_formats.* (also tlc.Geometry2DInstances, tlc.OBB2DInstances, …) | tlc.data_types.* (note: dataclass renames, see Annotation dataclass renames) |
| tlc.UrlAdapter, tlc.UrlAdapterDirEntry, tlc.Scheme, tlc.register_url_adapter, tlc.register_url_alias, tlc.unregister_url_alias, tlc.get_alias_path, tlc.get_registered_url_aliases | tlc.url.* (tlc.Url itself stays at the top level) |
| tlc.core.objects.* (any tlc.core.X) | tlc._core.objects.* (private; reach for the public re-export instead) |
| tlc.MutableObject, tlc.AddressableObject, tlc.SerializableObject, tlc.Object | tlc.objects.* (with tlc.objects.Object semantically redefined as the lightweight root; use tlc.objects.AddressableObject for the 2.x URL-bearing class) |
| tlc.Configuration | tlc.configuration.Configuration (the tlc.config shortcut is unchanged; reach the type through it, or import from tlc.configuration, instead of the top level) |
| tlc.client.torch.metrics.metrics_collectors.bounding_box_metrics_collector.BoundingBoxMetricsCollector (also tlc.BoundingBoxMetricsCollector) | tlc.integration.detectron2.BoundingBoxMetricsCollector |
| tlc.TableWriter, tlc.MetricsTableWriter | unchanged |
| tlc.Table, tlc.Run, tlc.Url, tlc.Schema, tlc.config, tlc.collect_metrics | unchanged |
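For larger codebases it can help to scan for 2.x import paths mechanically before editing. The sketch below flags lines that use one of the package prefixes listed above and suggests the 3.x prefix. The function name suggest_moves is hypothetical, only the simple prefix moves are covered, and special cases (such as BoundingBoxMetricsCollector, which moves to tlc.integration.detectron2) still need manual review.

```python
import re

# Prefix moves taken from the mapping above. Longer prefixes come first
# so that they win over shorter ones.
PREFIX_MOVES = [
    ("tlc.client.torch.samplers", "tlc.integration.torch.samplers"),
    ("tlc.client.torch.metrics", "tlc.metrics"),
    ("tlc.core.builtins.constants", "tlc.constants"),
    ("tlc.core.builtins.schemas", "tlc.schemas"),
    ("tlc.core.data_formats", "tlc.data_types"),
    ("tlc.client.helpers", "tlc.helpers"),
    ("tlc.client.reduce", "tlc.reduction"),
    ("tlc.core.export", "tlc.export"),
]


def suggest_moves(source: str):
    """Yield (line_number, old_prefix, new_prefix) for each 2.x path found."""
    for line_number, line in enumerate(source.splitlines(), start=1):
        for old, new in PREFIX_MOVES:
            # Match the prefix only as a whole dotted path, not inside
            # a longer identifier such as "mytlc.client.helpers".
            pattern = r"(?<![\w.])" + re.escape(old) + r"(?!\w)"
            if re.search(pattern, line):
                yield line_number, old, new
                break  # report at most one suggestion per line
```

Calling list(suggest_moves(open("train.py").read())) on a file of your own (train.py is a placeholder name) gives a worklist of lines to update; the sketch deliberately reports rather than rewrites, since some names were removed or renamed rather than moved.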

New names in the public sub-namespaces

The following are new in 3.x with no 2.x equivalent at any public path:

  • tlc.schemas: ConfidenceSchema, DatetimeStringSchema, EmbeddingSchema, Float64Schema, FractionSchema, ImageSchema (replaces 2.x ImageUrlSchema — see ImageSchema — one class, four modes), Int8Schema, Int16Schema, Int64Schema, IoUSchema, ProbabilitySchema, SemanticSegmentationSchema, Uint8Schema, Uint16Schema, Uint32Schema, Uint64Schema, UrlSchema.

  • tlc.url: IfExistsOption, list_url_adapters, list_url_schemes.

  • tlc.helpers: ColorHelper, AnnotationHelper, ProjectHelper, DateTimeHelper, ImageHelper, AnnotationColumn, AnnotationType (some have non-public 2.x counterparts under tlc.core.export.annotation_utils).

  • tlc.export: RowExporter, ExporterInfo, ExporterRegistry, ExporterSource, list_exporters, list_exporter_formats.

  • tlc.sample_types: SampleType, SampleTypeRegistry, register_sample_type, get_sample_types (see Schema and Sample Type redesign).

  • tlc.data_types: BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks (see CV and Annotation types).

tlc.constants prefix families

tlc.constants exposes a flat namespace — every constant is accessible directly at tlc.constants.X, with the same name it had at tlc.X in 2.x. The names group into:

| Prefix family | Examples |
| --- | --- |
| Column names (no prefix) | LABEL, IMAGE, EXAMPLE_ID, EPOCH, BOUNDING_BOXES, SEGMENTATIONS, … |
| BOOL_ROLE_* | BOOL_ROLE_ACTION, BOOL_ROLE_NORMAL, BOOL_ROLE_SLIDER |
| DISPLAY_IMPORTANCE_* | DISPLAY_IMPORTANCE_DEFAULT, DISPLAY_IMPORTANCE_IMAGE, DISPLAY_IMPORTANCE_LOSS, … |
| NUMBER_ROLE_* | NUMBER_ROLE_LABEL, NUMBER_ROLE_BB_CENTER_X, NUMBER_ROLE_FRACTION, … |
| RUN_STATUS_* | RUN_STATUS_RUNNING, RUN_STATUS_COMPLETED, RUN_STATUS_PAUSED, … |
| STRING_ROLE_* | STRING_ROLE_URL, STRING_ROLE_DATETIME, STRING_ROLE_IMAGE_URL, … |
| UNIT_* | UNIT_ABSOLUTE, UNIT_RELATIVE |
| WIDGET_* | WIDGET_CHART, WIDGET_FILTERS_PANEL, WIDGET_COLUMN_HEADER, … |
| DEFAULT_* | DEFAULT_BB_MAX_COUNT, DEFAULT_LIST_MAX_LENGTH, DEFAULT_BULK_DATA_CHUNK_SIZE_MB, … |

A handful of role constants that were defined but neither emitted by Python nor read by the dashboard have been removed: NUMBER_ROLE_PIXEL_COUNT, NUMBER_ROLE_BB_HALF_SIZE_X / _Y (use _BB_SIZE_X / _Y), NUMBER_ROLE_METRIC_STRING_INDEX, the bare NUMBER_ROLE_RGB_COMPONENT (use the channel-specific _RED / _GREEN / _BLUE), STRING_ROLE_NONE (use the literal ""), STRING_ROLE_INSTANCE_SEGMENTATION_URL.

Removed from public API

These were exposed at the top level in 2.x but are no longer addressable. Use the public surface instead, or reach into tlc._core.* privately if you genuinely need to:

  • Session class — private. Use the free functions tlc.init, tlc.close, tlc.active_run, tlc.set_active_run, tlc.active_project_name (which were always the recommended API).

  • Filter criteria: tlc.FilterCriterion, tlc.FreeTextFilterCriterion, tlc.IntegerSetFilterCriterion, tlc.LogicalNotFilterCriterion, tlc.NumericRangeFilterCriterion, tlc.Region2DFilterCriterion, tlc.Region3DFilterCriterion, tlc.TextFilterCriterion, tlc.create_filter, tlc.create_optional_filter. The dashboard is the supported way to author filtered views in 3.x.

  • Registry classes and URL adapters: tlc.ObjectTypeRegistry, tlc.ObjectRegistry, tlc.ObjectReference, tlc.UrlAliasRegistry, tlc.UrlAdapterRegistry, tlc.AdapterInfo, tlc.AdapterSource, and the concrete URL adapter classes (tlc.S3UrlAdapter, tlc.GcsUrlAdapter, tlc.HttpUrlAdapter, tlc.FileUrlAdapter, tlc.AbfsUrlAdapter, tlc.ApiUrlAdapter). Adapters are pluggable via the tlc.url_adapters entry-point group; subclass tlc.url.UrlAdapter and decorate with @tlc.url.register_url_adapter.

  • TableFrom* constructor classes — private. Use the corresponding tlc.Table.from_* factory method instead (from_coco, from_csv, from_parquet, from_pandas, from_dict, from_torch_dataset, from_yolo_url, from_hugging_face_hub, from_hugging_face_dataset).

  • TableFromTFRecordSet — was a placeholder, never implemented.

  • Bulk-data URL helpers: tlc.bulk_data_url_context, tlc.increment_and_get_bulk_data_url, tlc.relativize_bulk_data_url, tlc.reset_bulk_data_url, tlc.set_bulk_data_url_prefix. Bulk-data URL accounting is internal in 3.x.

  • Table.create_sampler() (deprecated in 2.x) — removed. Use the standalone factory functions in tlc.integration.torch.samplers: the general-purpose create_sampler(table, ...) dispatcher, or the explicit create_weighted_sampler, create_random_sampler, create_sequential_sampler, and create_repeat_by_weight_sampler.

  • Table.get_column() — use Table.get_column_as_pyarrow_array() instead.

  • Table.row_schema — use Table.rows_schema instead. The two properties were near-duplicates; the remaining one returns the table’s live schema reference and should be treated as read-only.

  • Table.from_yolo(...) (the deprecated YAML-parsing factory) — use Table.from_yolo_url(...) directly for a single image folder or text file. For a YOLO dataset YAML file with splits, install 3lc-ultralytics and use tlc_ultralytics.create_tables_from_yaml_file to get one Table per split:

    from tlc_ultralytics import create_tables_from_yaml_file
    
    tables = create_tables_from_yaml_file(
        dataset="/path/to/my/dataset.yaml",
        task="detect",
        project_name="My YOLO Project",
    )
    

    The same keyword arguments for pose (points, oks_sigmas, …) can be passed to configure the pose task.

  • tlc.client.sample_type module — removed. Sample types are referenced by string name through the SampleTypeRegistry (see Schema and Sample Type redesign).

  • tlc.helpers.JsonHelper — internal JSON-serialization plumbing, now private. Its only methods (to_minimal_dict, sort_by_rank) operate on internal object/schema structures and were not part of any user workflow. The class still exists at the private path tlc.helpers.json_helper.JsonHelper. Likewise, the internal-only SchemaHelper.pyarrow_list_to_tlc_schema and SegmentationMetricsCollector.tensor_to_pil_image are now underscore-prefixed (SchemaHelper itself stays public).

  • @tlc.integration.pytorch_lightning.lightning_module decorator and the tlc.integration.pytorch_lightning package — removed. Integrate via Lightning’s standard hooks; see Integrations.

Built-in schemas — reorganized

The set of built-in schemas has been cleaned up. A shape= parameter on the scalar schemas absorbs every *ListSchema wrapper, the CV annotation *Schema classes are now classmethods on the matching tlc.data_types dataclass (so a column’s runtime type and its schema-builder live together), and a handful of single-purpose label / vector / RGB-component schemas have been folded back into the corresponding primitive with number_role=.

Concretely: some schemas are simply gone; others survive but are constructed differently.

| 2.x schema | 3.x replacement |
| --- | --- |
| BoundingBoxListSchema, BoundingBox2DSchema, BoundingBox3DSchema, OrientedBoundingBox2DSchema, OrientedBoundingBox3DSchema | BoundingBoxes2D.schema(...) / BoundingBoxes3D.schema(...) / OrientedBoundingBoxes2D.schema(...) / OrientedBoundingBoxes3D.schema(...) |
| BoundingBoxes2DSchema, BoundingBoxes3DSchema, OrientedBoundingBoxes2DSchema, OrientedBoundingBoxes3DSchema, Keypoints2DSchema, Geometry2DSchema, Geometry3DSchema, GeometrySchema | The matching <Dataclass>.schema(...) on the dataclass in tlc.data_types |
| SegmentationSchema | SegmentationPolygons.schema(...) or SegmentationMasks.schema(...) |
| FloatVector2Schema, FloatVector3Schema | Float32Schema(shape=2, number_role="xy_component") / (shape=3, number_role="xyz_component") |
| BlueComponentSchema / BlueComponentListSchema, GreenComponentSchema, RedComponentSchema (and their *ListSchema variants) | Uint8Schema(number_role="rgb_component_blue" / "_green" / "_red") |
| CIFAR10LabelSchema, COCOLabelSchema / CocoLabelSchema | CategoricalLabelSchema(classes=...) |
| CategoricalLabel (a 2.x sample type, not a schema) | CategoricalLabelSchema(classes=...) |
| Int32ListSchema, BoolListSchema, StringListSchema | <X>Schema(shape=(-1,)) (or shape=N for fixed length) |
| ImageUrlSchema | ImageSchema() (see ImageSchema — one class, four modes) |

ConfidenceSchema, EmbeddingSchema, FractionSchema, IoUSchema, ProbabilitySchema, and SemanticSegmentationSchema are new pre-configured schemas in 3.x — useful when you’d otherwise reach for Float32Schema(number_role="...") with a stock role.

CategoricalLabel was a 2.x sample type (constructed as CategoricalLabel(display_name, classes=...)), not a schema, but it was commonly passed where a column schema was expected — e.g. in the column_schemas/schema mapping of Run.add_metrics or TableWriter. Replace it with CategoricalLabelSchema(classes, display_name=...): the first positional argument becomes the classes= keyword.

Removed re-exports

The following import paths were deprecated re-exports and have been removed:

| Old | New |
| --- | --- |
| tlc.core.builtins.types.segmentation_helper | tlc.helpers.segmentation_helper |
| tlc.client.data_format | tlc.data_types (and tlc.data_types.segmentation for segmentation types) |
| tlc.lightning_module / tlc.integration.lightning_module | removed; see Integrations |
| tlcconfig.logger_configurator | tlclogging.logger_configurator |
| tlc.client.utils.relativize_with_max_depth(url, owner, max_depth) | url.to_relative_with_max_depth(owner=owner, max_depth=max_depth) |
| tlc.client.torch.metrics.metrics_collectors.segmentation_metrics_collector.PREDICTED_MASK_METRIC_NAME | tlc.constants.PREDICTED_MASK |

The old list-based BoundingBox classes (BoundingBox, XYXYBoundingBox, CenteredXYWHBoundingBox, and friends, previously at tlc.core.builtins.types.bounding_box / tlc.core.data_formats.bounding_boxes) have been removed entirely, superseded by the BoundingBoxes2D / BoundingBoxes3D dataclasses in tlc.data_types.

Renames

This section covers renames where the symbol still exists with the same role, just under a new name or parameter spelling. The 2.x names emitted deprecation paths during 2.x; in 3.x they are removed.

Casing — acronym → PascalCase

| 2.x | 3.x |
| --- | --- |
| TLCException | TlcException |
| COCOAnnotation, COCOGroundTruth, COCOPrediction, COCOExporter | CocoAnnotation, CocoGroundTruth, CocoPrediction, CocoExporter |
| CSVExporter, DefaultJSONExporter, YOLOExporter | CsvExporter, DefaultJsonExporter, YoloExporter |
| UMAPTable, UMapTable, UMapTableArgs, UMapReduction, UMAPReduceEmbeddingsHook | UmapTable, UmapTableArgs, UmapReduction, UmapReduceEmbeddingsHook |
| PaCMAPTable, PaCMAPTableArgs, PaCMAPReduction | PacmapTable, PacmapTableArgs, PacmapReduction |
| InstanceSegmentationRLEBytesStringValue | InstanceSegmentationRleBytesStringValue |
| FSSpecUrlAdapter, FSSpecUrlAdapterDirEntry | FsspecUrlAdapter, FsspecUrlAdapterDirEntry |
| GCSUrlAdapter, GSUrlAdapterDirEntry | GcsUrlAdapter, GcsUrlAdapterDirEntry |
| LRUCache, LRUEntry, LRUFuncCache, LRUCacheStore, LRUCacheStoreConfig | LruCache, LruEntry, LruFuncCache, LruCacheStore, LruCacheStoreConfig |

Parameter renames

| Site | 2.x | 3.x |
| --- | --- | --- |
| Table.squash | root= | root_url= |
| ProjectHelper.register_project_url_alias | root= | root_url= |
| ProjectHelper.register_project_url_alias | project= | project_name= (now keyword-only) |
| AddressableObject.root (property) | .root | .root_url |
| CategoricalLabelSchema(class_names=..., display_colors=...) | class_names + display_colors | classes= (single argument combines both) |
| Run.add_metrics_data(override_column_schemas=..., input_table_url=..., …) | add_metrics_data, override_column_schemas, input_table_url | Run.add_metrics(schema=..., foreign_table_url=..., ...) |
| tlc.client.torch.metrics.collect_dataset.collect_metrics(..., exclude_zero_weights=...) | exclude_zero_weights | parameter removed (it had no effect); the detectron2 collect_metrics moved to tlc.integration.detectron2 |
| UMAPTable.seed property | .seed | construct with random_state=... for determinism |

Table.from_image_folder(root=...) is not affected — that root parameter refers to the image folder path, not the project root URL.

Url.create_* classmethods → ProjectLayout methods

The Url.create_* classmethods are gone. They are now methods on ProjectLayout:

| 2.x | 3.x |
| --- | --- |
| Url.create_project_url(...) | ProjectLayout.project_url(...) |
| Url.create_table_url(...) | ProjectLayout.table_url(...) |
| Url.create_run_url(...) | ProjectLayout.run_url(...) |
| Url.create_default_aliases_config_url(...) | ProjectLayout.default_project_aliases_config_url(...) |
| Url.is_dataset_table_url() | ProjectLayout.is_dataset_table_url(url) |
| Url.is_run_url() | ProjectLayout.is_project_run_url(url) |
| Url.is_metrics_table_url() | ProjectLayout.is_run_metrics_table_url(url) |
| Url.create_unique(require_writable=True) | ProjectLayout.create_unique_table_url(url, require_writable=True) |
| tlc.client.helpers.register_project_url_alias(...) | ProjectHelper.register_project_url_alias(...) |

The 2.x classmethods all accepted root=; pass root_url= to the corresponding ProjectLayout methods instead.

IndexingTable verbs (niche)

The IndexingTable subclasses (TableIndexingTable, RunIndexingTable, ConfigIndexingTable) live under tlc._core and are considered internal. If you were using the escape hatch tlc.TableIndexingTable.instance().wait_for_complete_index() to force a re-scan after editing files outside the Python package, the verbs have been renamed:

  • wait_for_complete_index() → sync() (blocking, returns True on completion)

  • request_reindex() → request_sync() (non-blocking, returns a request token)

The force= parameter has been dropped.

Scheme is now string constants

In 2.x Scheme was an enum; in 3.x it is a string-constants class and url.scheme is already a string. Drop any .value calls:

# 2.x
url.scheme.value      # "file"
Scheme.FILE.value     # "file"

# 3.x
url.scheme            # "file"
Scheme.FILE           # "file"
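The difference matters because enum members and plain strings behave differently in comparisons. The sketch below uses two stand-in classes (SchemeEnum and SchemeConstants are hypothetical names, and it assumes the 2.x type was a plain Enum) to show why leftover .value calls break and why direct string comparisons now work.

```python
from enum import Enum


class SchemeEnum(Enum):
    """Stand-in for the 2.x style: members wrap their string value."""

    FILE = "file"


class SchemeConstants:
    """Stand-in for the 3.x style: attributes are plain strings."""

    FILE = "file"


# A plain Enum member is not equal to its raw string; .value is needed.
print(SchemeEnum.FILE == "file")        # False
print(SchemeEnum.FILE.value == "file")  # True

# Plain string constants compare directly and have no .value attribute,
# so a leftover `.value` access raises AttributeError.
print(SchemeConstants.FILE == "file")          # True
print(hasattr(SchemeConstants.FILE, "value"))  # False
```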

Annotation dataclass renames

| 2.x | 3.x |
| --- | --- |
| Geometry2DInstances | Geometry2D |
| Geometry3DInstances | Geometry3D |
| Keypoints2DInstances | Keypoints2D |
| OBB2DInstances | OrientedBoundingBoxes2D |
| OBB3DInstances | OrientedBoundingBoxes3D |

BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks are new in 3.x — see CV and Annotation types.

Sample-type registry strings

The sample_type registry strings are now snake_case. The format-specific variants replace the 2.x dict form:

| 2.x | 3.x |
| --- | --- |
| "PILImage" / {"name": "pil_image", "format": "jpeg"} | "pil_png", "pil_jpeg", "pil_webp" |
| "small_numpy_array", "large_numpy_array" | "numpy_array", "external_numpy_array" |
| "small_torch_tensor", "large_torch_tensor" | "torch_tensor", "external_torch_tensor" |
| "instance_segmentation_polygons" | "segmentation_polygons" |
| "instance_segmentation_masks" | "segmentation_masks" |
| (n/a; was the bb_list flat schema) | "bounding_boxes_2d", "bounding_boxes_3d", "oriented_bounding_boxes_2d", "oriented_bounding_boxes_3d", "keypoints_2d" |

Old PascalCase and pre-rename strings are registered as legacy aliases — existing serialized tables still load. "pil_image" is registered as an alias for "pil_png", so old PascalCase "PILImage" and bare "pil_image" references resolve to PNG by default. The previous {"name": "pil_image", "format": "jpeg"} dict form is removed; pass "pil_jpeg" directly instead.
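If your own code or configuration files carry 2.x sample-type strings, a small translation helper can update them in one place. This is a sketch only: to_3x_sample_type is a hypothetical helper, not the library's resolver, and pairing the small/large numpy and torch names with their 3.x counterparts in listed order is an assumption read off the mapping above.

```python
# Legacy-to-3.x names. Pairing the numpy and torch entries in listed
# order (small -> plain, large -> external) is an assumption.
LEGACY_SAMPLE_TYPES = {
    "PILImage": "pil_png",
    "pil_image": "pil_png",
    "small_numpy_array": "numpy_array",
    "large_numpy_array": "external_numpy_array",
    "small_torch_tensor": "torch_tensor",
    "large_torch_tensor": "external_torch_tensor",
    "instance_segmentation_polygons": "segmentation_polygons",
    "instance_segmentation_masks": "segmentation_masks",
}

PIL_FORMATS = {"png", "jpeg", "webp"}


def to_3x_sample_type(spec):
    """Translate a 2.x sample-type string or dict form to a 3.x string."""
    if isinstance(spec, dict):
        # 2.x dict form, e.g. {"name": "pil_image", "format": "jpeg"}
        fmt = spec.get("format")
        if spec.get("name") == "pil_image" and fmt in PIL_FORMATS:
            return "pil_" + fmt
        raise ValueError(f"Unsupported dict form: {spec!r}")
    # Unknown strings pass through unchanged (already 3.x names).
    return LEGACY_SAMPLE_TYPES.get(spec, spec)


print(to_3x_sample_type({"name": "pil_image", "format": "jpeg"}))  # pil_jpeg
print(to_3x_sample_type("PILImage"))                               # pil_png
```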

Color helpers — free functions → ColorHelper

# 2.x
tlc.rgb_tuple_to_hex(rgb)
tlc.hex_to_rgb_tuple(hex_str)

# 3.x
tlc.helpers.ColorHelper.rgb_tuple_to_hex(rgb)
tlc.helpers.ColorHelper.hex_to_rgb_tuple(hex_str)

Keyword-only parameters

In 2.x, most parameters on the public API were positional. In 3.x, public callables follow one convention: a parameter stays positional only when its role is obvious at the call site, and everything else is keyword-only. This lets new keyword arguments be added near the top of the signature where they remain visible, instead of being appended to the end of a long positional chain.

If you already pass these arguments by name (e.g. Table.from_pandas(df, schema=schema, table_name="t")), nothing changes. Update any positional call that broke.

Affected callables and which arguments stay positional

| Callable | Positional in 3.x | Keyword-only (everything else) |
| --- | --- | --- |
| Table.from_url | url | |
| Table.from_names | (none) | project_name, dataset_name, table_name, root_url |
| Table.from_pandas | df | all other parameters |
| Table.from_dict | data | all other parameters |
| Table.from_csv / from_parquet / from_ndjson | the source file | all other parameters |
| Table.from_torch_dataset | dataset | including schema, all_arrays_are_fixed_size |
| Table.from_image_folder | root | including image_column_name, label_column_name, extensions, label_overrides |
| Table.from_coco | annotations_file, image_folder | including task, keep_crowd_annotations, all pose keypoint params, per_instance_extras, per_image_extras |
| Table.from_yolo_ndjson | ndjson_file, image_folder | including split |
| Table.from_yolo_url | images_url | including categories, task, max_depth, allow_fetch_remote_data |
| Table.from_hugging_face_hub | path, name, split (mirrors datasets.load_dataset) | all other parameters |
| Table.from_hugging_face_dataset | hf_dataset | all other parameters |
| Table.join_tables | tables | all other parameters |
| tlc.init | project_name, run_name | description, parameters, if_exists, root_url, run_url |
| Run.from_names | (none) | project_name, run_name, root_url |
| Run.copy | (none) | run_name, project_name, root_url, if_exists, destination_url |
| Run.add_metrics | metrics | schema, foreign_table_url, constants |
| collect_metrics | table, metrics_collectors | including predictor, foreign_table_url, constants, run_url, split, dataloader_args |
| TableWriter.__init__ | (none) | schema, project_name, dataset_name, table_name, root_url, table_url, if_exists, description, input_tables, all bulk_data_* |
| MetricsTableWriter.__init__ | (none) | run_url, foreign_table_url, schema |
| MapElement.__init__ | internal_name | display_name, description, display_color, url |
| Table.add_column | column_name, values | schema, url |
| Table.export | output_url, format | weight_threshold (plus exporter-specific **kwargs) |
| Table.set_value_map_item | value_path, value, internal_name | display_name, description, display_color, url, edited_table_url |
| Table.add_value_map_item | value_path, internal_name | display_name, description, display_color, url, value, edited_table_url |
| Table.write_to_row_cache / get_rows_as_binary / get_column_as_pyarrow_array | the column name (where applicable) | the flag args (create_url_if_empty, overwrite_if_exists, exclude_bulk_data, combine_chunks) |
| Run.reduce_embeddings_by_foreign_table_url | foreign_table_url | delete_source_tables |
| Run.reduce_embeddings_per_dataset | (none) | delete_source_tables |
| tlc.reduction.reduce_embeddings / _per_dataset / _by_foreign_table_url / _with_producer_consumer / _multiple_parameters | the source Table / list[Table] (and foreign_table_url / consumers where applicable) | method, delete_source_tables, parameter_sets |
| MetricsCollector, FunctionalMetricsCollector, EmbeddingsMetricsCollector, SegmentationMetricsCollector constructors | (none), fully keyword-only | every argument (collection_fn, layers, label_map, preprocess_fn, compute_aggregates, reshape_strategy, schema) |
| <Dataclass>.schema(...) for every tlc.data_types annotation type (BoundingBoxes2D/3D, OrientedBoundingBoxes2D/3D, Keypoints2D, SegmentationMasks/Polygons, Geometry2D/3D) | (none), fully keyword-only | every argument, including classes and num_keypoints |
| Schema.add_sub_value / from_sample / add_sample_weight | the data argument (name+value, sample) | the flag args (writable, computable, all_arrays_are_fixed_size, hidden, default_value) |
| Url.write_bytes / Url.write_text | content, encoding | if_exists |
| Url.make_parents | (none) | exist_ok |
| Url.expand_aliases | (none) | allow_unexpanded |
| tlc.url.register_url_alias | token, path | force |
| ProjectLayout.create_unique_table_url | url | require_writable |
| SegmentationHelper.mask_from_polygons / polygons_from_mask / polygons_from_rles / rles_from_polygons | the geometry data (plus height, width) | relative |
| GeometryHelper.create_isotropic_bounds_3d | the six bound values | force_z_min |

Two of these are now fully keyword-only (no positional arguments at all): the metrics-collector constructors and the annotation <Dataclass>.schema(...) builders. For the collectors, 2.x code such as EmbeddingsMetricsCollector([99]) or FunctionalMetricsCollector(my_fn) must become EmbeddingsMetricsCollector(layers=[99]) / FunctionalMetricsCollector(collection_fn=my_fn). The .schema(...) builders are new in 3.x, so there is no 2.x positional form to port — just always call them with keywords (e.g. Keypoints2D.schema(num_keypoints=17, classes=...)).
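For readers less familiar with keyword-only parameters, the following sketch shows the underlying Python mechanism with a stand-in function. from_pandas_like is hypothetical and only mimics the shape of a 3.x signature (one positional argument, then a bare * that makes the rest keyword-only); it is not the real Table.from_pandas.

```python
def from_pandas_like(df, *, schema=None, project_name=None, table_name=None):
    """Hypothetical stand-in: one positional argument, keyword-only options."""
    return {
        "df": df,
        "schema": schema,
        "project_name": project_name,
        "table_name": table_name,
    }


# Keyword call: works, and the argument order does not matter.
result = from_pandas_like([1, 2], table_name="t", schema="s")

# 2.x-style positional call: rejected by Python itself with TypeError.
try:
    from_pandas_like([1, 2], "s", "t")
except TypeError as error:
    print("positional call failed:", type(error).__name__)
```

Because the failure is a TypeError raised at call time, running your test suite against 3.x is a practical way to surface any remaining positional calls.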

Argument-order changes inside the keyword-only block

Even when you already used keyword arguments, the documentation order on Table.from_* factories (and related helpers like TableWriter, Table.from_names, Run.from_names, Run.copy) has been aligned to a single canonical order:

  • Method-specific options first.

  • Then the project-layout block in hierarchy order: project_name, dataset_name, table_name, root_url, table_url. (In 2.x the layout block was table_name, dataset_name, project_name. Reading top-down now matches the hierarchy: a project contains datasets, which contain tables.)

  • Then if_exists, weights, and common metadata.

This affects nothing at the call site (kwargs are order-independent), but if you rely on inspect.signature or generated docs, expect the parameter listing to read differently.

Common migration patterns

# 2.x
table = tlc.Table.from_pandas(df, schema, "my_table", "my_dataset", "my_project")
tlc.collect_metrics(table, mc, model, constants={"epoch": 1})
run = tlc.Run.from_names("my_run", "my_project")

# 3.x
table = tlc.Table.from_pandas(
    df,
    schema=schema,
    project_name="my_project",
    dataset_name="my_dataset",
    table_name="my_table",
)
tlc.collect_metrics(table, mc, predictor=model, constants={"epoch": 1})
run = tlc.Run.from_names(project_name="my_project", run_name="my_run")

Behavioral changes

This section covers changes where code may import and run cleanly but behave differently at runtime. Read each subsection even if your import passes don’t flag anything, so that existing code does not silently change in ways you do not expect.

Default table_name changed from "table" to "initial"

When no table_name is passed to Table.from_*, Table.from_names, Table.copy, TableWriter, or ProjectLayout.table_url, the resulting URL is now .../tables/initial/ instead of .../tables/table/.

Pre-3.0 projects have their data at .../tables/table/. Reading them without specifying a table_name — for example tlc.Table.from_names(project_name="p", dataset_name="d") — will now resolve to a non-existent .../tables/initial/ path. Round-trip patterns like tlc.Table.from_dict(...).latest() that rely on if_exists="reuse" will also silently create a new initial/ table instead of reusing the old table/ one. When 3LC detects a legacy .../tables/table/ alongside a new-default target, it emits a warning pointing this out. The warning fires for if_exists values reuse, rename, and raise, but not overwrite (which is an explicit “write regardless” intent).

Migration: For pre-3.0 projects, pass table_name="table" explicitly when reading an existing table that was created without a table_name, so that the old default name is still used.
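For projects stored on a local filesystem, one way to decide which name to pass is to look at which folders exist under the dataset's tables directory. The sketch below is an assumption-laden illustration: pick_table_name is a hypothetical helper, it assumes you already know the local .../tables directory, and it only distinguishes the 2.x default folder (table) from the 3.x default folder (initial).

```python
from pathlib import Path


def pick_table_name(tables_dir, explicit=None):
    """Choose a table_name for a local ``.../tables`` directory.

    An explicit name always wins. Otherwise fall back to the 2.x default
    "table" when only the legacy folder exists, else use the 3.x default.
    """
    if explicit is not None:
        return explicit
    tables_dir = Path(tables_dir)
    has_legacy = (tables_dir / "table").is_dir()
    has_new = (tables_dir / "initial").is_dir()
    if has_legacy and not has_new:
        return "table"
    return "initial"
```

The returned string can then be passed as table_name= to the reading call, which keeps the choice explicit at the call site instead of relying on either default.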

Session.run_url is now a Url

In 2.x, Session.run_url was stored as a str. In 3.x it is stored as a Url object, matching the return type of Session.initialize_run.

Migration: If you were comparing or concatenating session.run_url as a string, call .to_str() on it, or update the code to work with Url directly. Reads that pass it into APIs accepting Url | str need no change.

MetricsTableWriter.finalize() now updates the Run automatically

In 2.x, after calling MetricsTableWriter.finalize(), you had to manually update the run with the written metrics:

# 2.x
metrics_writer = MetricsTableWriter(run_url=run.url, ...)
metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
metrics_table = metrics_writer.finalize()

# Manual step required to associate the metrics table with the run
run.update_metrics(metrics_writer.get_written_metrics_infos())

In 3.x, finalize() automatically updates the run. The recommended pattern is to use MetricsTableWriter as a context manager, which calls finalize() on exit:

# 3.x
with MetricsTableWriter(run_url=run.url, ...) as metrics_writer:
    metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
# The run is automatically updated on exit.

finalize() can only be called once; calling it again raises RuntimeError.

Migration: Remove manual calls to run.update_metrics(metrics_writer.get_written_metrics_infos()) after finalize().

Url.read() and Url.write() removed

The generic Url.read() and Url.write() methods have been removed. Use the type-specific methods instead:

# 2.x
content = url.read()             # bytes (default mode="b")
content = url.read(mode="t")     # str
url.write(b"data")
url.write("text", mode="t")
url.write(content, if_exists="raise")

# 3.x
content = url.read_bytes()
content = url.read_text(encoding="utf-8")
url.write_bytes(b"data", if_exists=...)
url.write_text("text", if_exists=...)

The if_exists argument is supported by both write_bytes and write_text (values: "overwrite", "rename", "raise").
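If many call sites still use the 2.x mode argument, a thin shim can bridge them while you migrate. This is a sketch under stated assumptions: read_compat, write_compat, and InMemoryUrl are hypothetical names, the shim relies only on the four 3.x method names shown above, and if_exists is deliberately a required keyword because the 2.x default is not stated here.

```python
def read_compat(url, mode: str = "b", encoding: str = "utf-8"):
    """Mimic the 2.x `url.read(mode=...)` call on top of the 3.x methods."""
    if mode == "b":
        return url.read_bytes()
    if mode == "t":
        return url.read_text(encoding=encoding)
    raise ValueError(f"unsupported mode: {mode!r}")


def write_compat(url, content, *, if_exists: str) -> None:
    """Dispatch on the content type instead of a `mode` flag."""
    if isinstance(content, (bytes, bytearray)):
        url.write_bytes(bytes(content), if_exists=if_exists)
    elif isinstance(content, str):
        url.write_text(content, if_exists=if_exists)
    else:
        raise TypeError(f"expected bytes or str, got {type(content).__name__}")


class InMemoryUrl:
    """Stand-in with the 3.x method names; not the 3LC Url class.

    It ignores `if_exists` and always overwrites.
    """

    def __init__(self) -> None:
        self._data = b""

    def read_bytes(self) -> bytes:
        return self._data

    def read_text(self, encoding: str = "utf-8") -> str:
        return self._data.decode(encoding)

    def write_bytes(self, data: bytes, if_exists: str = "overwrite") -> None:
        self._data = data

    def write_text(self, text: str, if_exists: str = "overwrite") -> None:
        self._data = text.encode("utf-8")
```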

Sample view always returns a dict

Container sample types ("tuple", "list", "box", "horizontal_tuple", "horizontal_list") and the is_leaf flag on SampleType were removed. The sample view of any composite table is now a dict — one entry per visible column — instead of a tuple, list, or single value.

# 2.x — schema with sample_type="tuple"
image, label = table[0]

# 3.x — sample view is always a dict
sample = table[0]
image, label = sample["image"], sample["label"]

Tables persisted in 2.x with a container sample_type continue to load: the resolver coerces the legacy name to identity and emits a one-time warning per name. No data migration is required, but reader code that destructured the sample as a tuple needs to be updated. If you constructed a schema with sample_type="tuple" / "list" / "box" to control the sample shape, drop the sample_type argument; the structural part of the schema (values=...) is unchanged. Schema.from_schema_like applied to a tuple of schemas no longer attaches sample_type="tuple" to the resulting composite — it produces a dict-shaped schema with value_i keys (or display names where provided).
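Where downstream code expects positional samples, a small adapter can restore the tuple shape from the dict. sample_as_tuple is a hypothetical helper, and the sample below is a plain dict with placeholder values standing in for table[0].

```python
def sample_as_tuple(sample: dict, columns: tuple) -> tuple:
    """Pick `columns` from a dict-shaped sample, in the order given."""
    return tuple(sample[name] for name in columns)


# Placeholder values; a real sample would hold an image and a label.
sample = {"image": "img-0", "label": 3}
image, label = sample_as_tuple(sample, ("image", "label"))
```

Naming the columns explicitly keeps the unpacking stable even if further visible columns are added to the table later.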

reduce_embeddings no longer accepts a list of tables

tlc.reduction.reduce_embeddings previously accepted either a single Table or a list[Table], returning a Table in the first case and a dict[Url, Url] in the second (with a DeprecationWarning on the list form). It now accepts a single Table and always returns a Table:

# 2.x
url_mapping = tlc.reduction.reduce_embeddings([table_a, table_b], method="umap")
reduced_a = tlc.Table.from_url(url_mapping[table_a.url])

# 3.x
reduced_a = tlc.reduction.reduce_embeddings(table_a, method="umap")
reduced_b = tlc.reduction.reduce_embeddings(table_b, method="umap")

The other multi-table reduction helpers (reduce_embeddings_per_dataset, reduce_embeddings_by_foreign_table_url, reduce_embeddings_with_producer_consumer) are unchanged and still take list[Table].
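Code that relied on the dictionary returned by the list form can rebuild a similar mapping with a loop. reduce_each is a hypothetical helper; it assumes each table exposes a hashable url attribute and it maps each input URL to the reduced table rather than to a URL.

```python
def reduce_each(tables, reduce_fn, **kwargs) -> dict:
    """Call `reduce_fn` once per table and map input URL -> reduced table."""
    return {table.url: reduce_fn(table, **kwargs) for table in tables}
```

Under these assumptions a call would look like reduce_each([table_a, table_b], tlc.reduction.reduce_embeddings, method="umap").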

ClassificationMetricsCollector removed

ClassificationMetricsCollector has been removed. It bundled four metrics (loss, predicted, accuracy, confidence) behind a rigid input contract — batch had to be a (samples, labels) tuple and the model output had to be a raw logits tensor — which made it inflexible for HF-style models or models returning anything other than torch.Tensor. The same metrics are easy to compute with FunctionalMetricsCollector:

import torch
import torch.nn.functional as F
import tlc

def classification_metrics_fn(batch, predictor_output: tlc.PredictorOutput) -> dict:
    _, labels = batch
    predictions = predictor_output.forward
    if labels.dim() == 2 and labels.shape[1] > 1:
        labels = torch.argmax(labels, dim=1)
    softmax_out = F.softmax(predictions, dim=1)
    predicted = torch.argmax(predictions, dim=1)
    confidence = torch.gather(softmax_out, 1, predicted.unsqueeze(1)).squeeze(1)
    accuracy = predicted.eq(labels).float()
    loss = F.cross_entropy(predictions, labels, reduction="none")
    return {
        "loss": loss.detach().cpu().numpy(),
        "predicted": predicted.detach().cpu().numpy(),
        "accuracy": accuracy.detach().cpu().numpy(),
        "confidence": confidence.detach().cpu().numpy(),
    }

schemas = {
    "predicted": tlc.schemas.CategoricalLabelSchema(
        display_name="predicted label",
        classes=class_names,
    ),
}

collector = tlc.metrics.FunctionalMetricsCollector(
    collection_fn=classification_metrics_fn,
    schema=schemas,
)

TableWriter auto-detects sample-form vs row-form inputs

TableWriter now auto-detects sample-form vs row-form inputs per value (previously all inputs were treated as sample-form). Both work without a mode flag:

# Sample-form: a PIL.Image is serialized to a file and the URL is stored
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": pil_image})

# Row-form: a URL string is stored directly (relativized against the table)
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": "path/to/image.png"})

If you have custom SampleType subclasses, make sure their accepts() implementation returns True only for sample-form inputs — it is what the writer uses to distinguish the two.
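The per-value dispatch described above can be pictured as follows. This is a simplified sketch and not the TableWriter implementation; WordListType and to_row_form are hypothetical names, and the only assumption taken from the text is that accepts() returns True for sample-form inputs.

```python
class WordListType:
    """Hypothetical sample type: sample-form is a list of words, row-form is a str."""

    def accepts(self, value) -> bool:
        # True only for sample-form inputs.
        return isinstance(value, list)

    def to_row(self, value: list) -> str:
        return " ".join(value)


def to_row_form(sample_type, value):
    """Convert sample-form values; pass row-form values through unchanged."""
    if sample_type.accepts(value):
        return sample_type.to_row(value)
    return value
```

An accepts() that also returned True for row-form strings would send already-serialized values through to_row a second time, which is why it should be kept narrow.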

Table.latest(): wait_for_rescan and use_new_columns removed

Table.latest() no longer accepts wait_for_rescan. The blocking/non-blocking choice is now expressed through timeout:

  • 2.x latest(wait_for_rescan=True) → 3.x latest() (defaults to timeout=30.0)

  • 2.x latest(wait_for_rescan=False) → 3.x latest(timeout=0) (non-blocking; in-process fast-path only)

  • Pass timeout=None to block indefinitely.
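When the old flag is computed at runtime, the mapping in the list above can be captured in one place. latest_kwargs_from_2x is a hypothetical helper; it only encodes the two translations listed and leaves the 3.x default timeout untouched.

```python
def latest_kwargs_from_2x(wait_for_rescan: bool) -> dict:
    """Translate the removed `wait_for_rescan` flag into 3.x `latest()` kwargs."""
    if wait_for_rescan:
        return {}             # rely on the 3.x default timeout
    return {"timeout": 0}     # non-blocking
```

A call site would then read table.latest(**latest_kwargs_from_2x(flag)).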

use_new_columns is also removed. With dict-based samples in 3.x, extra keys no longer disrupt downstream consumers; columns that should not appear in samples can be marked at the column level via sample_type="hidden" (or any sample type with is_included_in_sample = False).

UrlAdapterRegistry: default_value removed; async signatures cleaned up

UrlAdapterRegistry read methods used to accept a default_value to return when no adapter could be found. In 3.x, Url validates schemes at construction time, so the “no adapter” code path is unreachable for valid URLs. The read methods now raise ValueError if no adapter is found; the getter methods simply return None:

# 3.x
content = UrlAdapterRegistry.read_string_content_from_url(url)
data = UrlAdapterRegistry.read_binary_content_from_url(url)
data = await UrlAdapterRegistry.read_binary_content_from_url_async(url)
adapter = UrlAdapterRegistry.get_url_adapter_for_url(url)        # returns None if not found
adapter = UrlAdapterRegistry.get_url_adapter_for_scheme(scheme)  # returns None if not found

All *_async methods on ObjectRegistry, UrlAdapterRegistry, and the URL adapters themselves are now proper async methods — in 2.x they were sync methods that returned concurrent.futures.Future. Replace future = X.foo_async(...); future.result() with await X.foo_async(...):

# 2.x
future = ObjectRegistry.delete_object_from_url_async(url); future.result()

# 3.x
await ObjectRegistry.delete_object_from_url_async(url)

Async convenience methods are now available directly on UrlAdapterRegistry (read_binary_content_from_url_async, write_binary_content_to_url_async, delete_url_async) — adapter lookup is handled automatically. If you previously looked up an adapter just to call async methods, use the registry methods directly for cleaner code.
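Synchronous callers that used future.result() now need an event loop to drive the coroutine. The sketch below uses only the standard asyncio module; read_async is a stand-in coroutine, not a 3LC function, and takes the place of a call such as UrlAdapterRegistry.read_binary_content_from_url_async.

```python
import asyncio


async def read_async(url: str) -> bytes:
    """Stand-in coroutine playing the role of a 3.x *_async method."""
    await asyncio.sleep(0)  # yield to the event loop once
    return f"content of {url}".encode()


def read_blocking(url: str) -> bytes:
    """Drive the coroutine from synchronous code."""
    return asyncio.run(read_async(url))


async def read_many(urls):
    """Await several reads concurrently; results keep the input order."""
    return await asyncio.gather(*(read_async(u) for u in urls))
```

asyncio.run starts a new event loop, so it cannot be called from code that is already running inside one (for example a notebook cell or another coroutine); in that situation, await the method directly.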

Schema and Sample Type redesign

The schema system has been redesigned. The old SampleType class hierarchy has been replaced with a new SampleType base class, a SampleTypeRegistry for pluggable registration, and convenience schemas that configure storage and transforms in one step. Schemas now directly control how data is transformed and stored.

Name-level changes (registry strings, sample_from_row → from_row, etc.) are covered in Sample-type registry strings and the renames table. This section covers the API redesign that needs more than search-and-replace.

Convenience schemas — explicit parameters

All built-in convenience schemas now have explicit parameters — no more **kwargs or **schema_kwargs. IDE autocompletion shows every available parameter, and typos are caught at the call site.

shape parameter on scalar schemas

Scalar schemas now accept a shape parameter for arrays of any dimensionality. The previous *ListSchema convenience classes have been folded into this parameter; only Float32ListSchema and CategoricalLabelListSchema survive as thin wrappers (they are common enough that a dedicated class reads more clearly than shape=(-1,)).

| Convenience class | Equivalent primitive |
| --- | --- |
| Float32ListSchema(list_size=10) | Float32Schema(shape=10) |
| Float32ListSchema() | Float32Schema(shape=(-1,)) |
| CategoricalLabelListSchema(classes=) | CategoricalLabelSchema(classes=, shape=(-1,)) |

Use -1 for variable-size dimensions: Float32Schema(shape=(-1, -1)) for a variable 2D array.
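To make the -1 convention concrete, the following sketch checks a nested Python list against a shape tuple. It is an illustration of the convention only and not 3LC's validation logic; matches_shape is a hypothetical name, and accepting ragged inner lists for -1 dimensions is an assumption of this sketch.

```python
def matches_shape(value, shape: tuple) -> bool:
    """Check a nested list against `shape`, where -1 means "any length"."""
    if not shape:
        # No dimensions left: the value must be a scalar.
        return not isinstance(value, list)
    if not isinstance(value, list):
        return False
    size = shape[0]
    if size != -1 and len(value) != size:
        return False
    # Every item must match the remaining dimensions.
    return all(matches_shape(item, shape[1:]) for item in value)
```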

Float32ListSchema no longer accepts a number_role parameter. For a list of fractions / confidences / probabilities, use the baked schema with a list shape: ConfidenceSchema(shape=(-1,)) instead of Float32ListSchema(number_role="fraction/confidence"). The same pattern applies to other roled list columns — pick the baked schema (EmbeddingSchema, FractionSchema, ConfidenceSchema, ProbabilitySchema, IoUSchema, CategoricalLabelSchema, …) and pass shape=.

ImageSchema — one class, four modes

ImageSchema is new in 3.x and covers every image-column use case via its sample_type kwarg. It replaces the 2.x ImageUrlSchema class and consolidates the image-format selection that 2.x performed via the sample-type dict form {"name": "pil_image", "format": ...} (now removed; see Sample-type registry strings).

# 2.x — separate class for URL columns; format chosen via sample-type dict form
ImageUrlSchema()                                                          # URL passthrough
Schema(value=ImageUrlStringValue(), sample_type="pil_image")              # PIL → file (PNG)
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "jpeg"})
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "webp"})

# 3.x — one class, four modes
ImageSchema(sample_type="url")                 # URL passthrough — pre-existing files
ImageSchema()                                  # PIL → file (PNG, default)
ImageSchema(sample_type="pil_jpeg")            # PIL → file (JPEG)
ImageSchema(sample_type="pil_webp")            # PIL → file (WEBP)

All four modes serialize to the same wire format (ImageUrlStringValue); only the Python-side behavior differs. sample_type=None is accepted as an alias for "url".

Common column attributes

All schemas now accept display_name, description, writable, default_visible, and default_value as explicit parameters:

# 2.x — had to use **kwargs, not discoverable
Float32Schema(display_name="Score", writable=False)  # worked but wasn't documented

# 3.x — explicit, IDE-discoverable
Float32Schema(display_name="Score", writable=False, default_visible=True, default_value=0.0)

display_importance is accepted by the base Schema and by the EpochSchema / IterationSchema system schemas, but not by the convenience scalar / categorical / image schemas. Construct the base Schema directly if you need a non-default display ordering on a convenience-shaped column.

Custom sample types

Custom SampleType subclasses should be migrated to the new SampleType base class, registered via @tlc.sample_types.register_sample_type. The method names changed: sample_from_row → from_row, row_from_sample → to_row.

# 2.x
class MyType(tlc.sample_types.SampleType):
    def sample_from_row(self, value):
        return custom_decode(value)
    def row_from_sample(self, value):
        return custom_encode(value)

# 3.x
@tlc.sample_types.register_sample_type("my_type")
class MyType(tlc.sample_types.SampleType):
    def from_row(self, value):
        return custom_decode(value)
    def to_row(self, value):
        return custom_encode(value)
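After renaming the methods, a quick round-trip check helps confirm that to_row and from_row are still inverses. The sketch below is self-contained and does not import tlc: TagSetType is a hypothetical class that only mirrors the 3.x method names, and in real code it would subclass tlc.sample_types.SampleType and be registered with the decorator shown above.

```python
class TagSetType:
    """Hypothetical type: sample-form is a set of tags, row-form is a str."""

    def to_row(self, value: set) -> str:
        # Sort so the stored string is deterministic.
        return ",".join(sorted(value))

    def from_row(self, value: str) -> set:
        return set(value.split(",")) if value else set()


def round_trips(sample_type, sample) -> bool:
    """Return True when from_row(to_row(sample)) reproduces the sample."""
    return sample_type.from_row(sample_type.to_row(sample)) == sample
```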

New public APIs in tlc.sample_types

  • SampleType — base class for all sample types.

  • SampleTypeRegistry — registry for looking up and registering custom sample types.

  • register_sample_type — decorator for registering a SampleType subclass or a dataclass with to_row() / from_row() methods.

  • get_sample_types — list all registered sample type names.

CV and Annotation types

This section bundles the runtime dataclasses, schema-builders, and importer / exporter changes for computer-vision annotations. The unifying point: each annotation shape now owns both its runtime data and its column schema-builder under one dataclass in tlc.data_types.

One dataclass per shape, with a bound schema-builder

All CV annotation types now follow a consistent plural-base naming convention. The base name is always plural (e.g. BoundingBoxes2D), and the dataclass owns both runtime data and its column schema-builder (Dataclass.schema(...)):

| Sample-type string | Dataclass | Schema-builder |
| --- | --- | --- |
| bounding_boxes_2d | BoundingBoxes2D | BoundingBoxes2D.schema(...) |
| bounding_boxes_3d | BoundingBoxes3D | BoundingBoxes3D.schema(...) |
| oriented_bounding_boxes_2d | OrientedBoundingBoxes2D | OrientedBoundingBoxes2D.schema(...) |
| oriented_bounding_boxes_3d | OrientedBoundingBoxes3D | OrientedBoundingBoxes3D.schema(...) |
| keypoints_2d | Keypoints2D | Keypoints2D.schema(...) |
| segmentation_polygons | SegmentationPolygons | SegmentationPolygons.schema(...) |
| segmentation_masks | SegmentationMasks | SegmentationMasks.schema(...) |

All dataclasses live under tlc.data_types. Constructor arguments are unchanged from their 2.x equivalents; only the names have changed (see Annotation dataclass renames).

# 2.x
from tlc.core.data_formats import Geometry2DInstances, OBB2DInstances

geom = Geometry2DInstances.create_empty(x_max=640.0, y_max=480.0)
obbs = OBB2DInstances.create_empty(x_max=640.0, y_max=480.0)

# 3.x
from tlc.data_types import Geometry2D, OrientedBoundingBoxes2D

geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0)
obbs = OrientedBoundingBoxes2D.create_empty(x_max=640.0, y_max=480.0)

BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, and SegmentationMasks are new in 3.x and have no 2.x dataclass counterpart. The 2.x sample types InstanceSegmentationPolygons / InstanceSegmentationMasks (from tlc.client.sample_type) were dropped along with the rest of that module.

Schema-builder replaces the 2.x *Schema classes

The 2.x tlc.SegmentationSchema (which selected storage form via its sample_type argument) has been split into two dataclass-bound builders:

| 2.x | 3.x |
| --- | --- |
| tlc.SegmentationSchema(sample_type="instance_segmentation_polygons", ...) | SegmentationPolygons.schema(...) |
| tlc.SegmentationSchema(sample_type="instance_segmentation_masks", ...) | SegmentationMasks.schema(...) |

# 2.x
schema = {"seg": tlc.SegmentationSchema(classes=["cat", "dog"])}

# 3.x
from tlc.data_types import SegmentationPolygons
schema = {"seg": SegmentationPolygons.schema(classes=["cat", "dog"])}

The full list of removed *Schema classes (and their replacements) is in Built-in schemas — reorganized.

Bounding box format migration

3.x replaces the flat bb_list dict format with the geometry-based BoundingBoxes2D dataclass (tlc.data_types.BoundingBoxes2D).

  • Existing tables written by 2.x still load — their bounding-box columns are stored in the legacy bb_list form and are not rewritten on read.

  • New tables, importers, and metrics produce BoundingBoxes2D instances, and the built-in schema builder (BoundingBoxes2D.schema(...)) is what 3.x emits by default. The COCO and YOLO importers/exporters auto-detect the stored format and handle both transparently — but your code that reads or writes box columns does not get that for free. The old BoundingBoxListSchema has been removed from the public API.

So any code that touches a bounding-box column needs an audit, in one of two directions.

Writing. Stop building the {"image_width", "image_height", "bb_list": [...]} dict. Construct a BoundingBoxes2D directly (coordinates are absolute XYXY by default; pass bounding_box_format= / normalized= if yours differ):

from tlc.data_types import BoundingBoxes2D
import numpy as np

bb = BoundingBoxes2D(
    bounding_boxes=np.array([[10, 20, 100, 200]], dtype=np.float32),  # Nx4, XYXY
    labels=np.array([3], dtype=np.int32),
    confidences=np.array([0.9], dtype=np.float32),   # omit for ground truth
    x_max=640.0, y_max=480.0,                         # optional image bounds
)
# or build incrementally:
bb = BoundingBoxes2D.create_empty(image_width=640, image_height=480)
bb.add_instance(bounding_box=[10, 20, 100, 200], label=3, confidence=0.9)

Use BoundingBoxes2D.schema(classes=..., include_per_instance_confidence=...) for the column schema; the old BoundingBoxListSchema is gone.

Reading. 3.x does not coerce stored data: a legacy table returns its bounding-box column in the original bb_list dict form, and a 3.x table returns a BoundingBoxes2D. The format you get back mirrors how the table was written — there is no migration-on-read. Consuming code should standardize on BoundingBoxes2D and convert legacy rows explicitly via from_legacy_row, which auto-detects the coordinate convention (XYXY / XYWH / centered-XYWH, normalized or absolute) from the column’s schema number-roles:

bb_schema = table.rows_schema.values["bounding_boxes"]

row = table[idx]
raw = row["bounding_boxes"]
# Legacy tables yield the old dict; 3.x tables yield a BoundingBoxes2D.
# `from_legacy_row` takes the *column* value (the `bb_list` dict with its
# `image_width`/`image_height` keys), i.e. `raw` here — not the whole row.
bb = BoundingBoxes2D.from_legacy_row(raw, schema=bb_schema) if isinstance(raw, dict) else raw

bb.bounding_boxes   # Nx4 array of [x_min, y_min, x_max, y_max]
bb.labels           # N category indices
bb.confidences      # N floats (predictions only)

Once converted, always read by attribute (.bounding_boxes, .labels, .confidences) — never by dict key.
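from_legacy_row handles the coordinate convention for stored legacy rows. For boxes held elsewhere in your own code in another convention, the arithmetic for reaching absolute XYXY can be sketched in plain Python. The function names below are hypothetical and not part of 3LC.

```python
def xywh_to_xyxy(box):
    """[x_min, y_min, width, height] -> [x_min, y_min, x_max, y_max]."""
    x, y, w, h = box
    return [x, y, x + w, y + h]


def cxcywh_to_xyxy(box):
    """[x_center, y_center, width, height] -> [x_min, y_min, x_max, y_max]."""
    cx, cy, w, h = box
    return [cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2]


def denormalize_xyxy(box, image_width, image_height):
    """Scale a normalized XYXY box (values in 0..1) to absolute pixels."""
    x0, y0, x1, y1 = box
    return [x0 * image_width, y0 * image_height,
            x1 * image_width, y1 * image_height]
```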

Bounds are now optional on geometry dataclasses

The geometry dataclasses (Geometry2D, Geometry3D, Keypoints2D, BoundingBoxes2D, OrientedBoundingBoxes2D, etc.) now have optional bounds. In 2.x bounds were required and defaulted to 0.0; from_row() raised ValueError if bounds were missing. In 3.x bounds default to None and from_row() accepts rows without bounds. When a max bound is specified but the corresponding min is not, the min defaults to 0.0.

# 3.x
geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0)  # x_min / y_min default to 0.0
geom = Geometry2D.create_empty()                          # All bounds are None
geom = Geometry2D.from_row(row)                           # Works even if bounds are missing
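The defaulting rule for a single axis can be written out as a small function. This is an illustration of the rule as described above, not the library code; resolve_bounds is a hypothetical name.

```python
from typing import Optional, Tuple


def resolve_bounds(
    min_value: Optional[float], max_value: Optional[float]
) -> Tuple[Optional[float], Optional[float]]:
    """Apply the 3.x rule: a max without a min implies min = 0.0."""
    if max_value is not None and min_value is None:
        return 0.0, max_value
    # Both given, or no max given: keep the values as they are.
    return min_value, max_value
```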

COCO importer — include_iscrowd and per_instance_schemas replaced by per_instance_extras

The include_iscrowd boolean and per_instance_schemas (pose-only) parameters on the COCO importer have been removed. The new per_instance_extras parameter covers both — it works for all tasks (detect, segment, pose) and supports both auto-inferred and explicit schemas:

# Auto-infer schemas from data
table = tlc.Table.from_coco(
    ...,
    per_instance_extras=["iscrowd", "my_custom_field"],
)

# Or provide explicit schemas
table = tlc.Table.from_coco(
    ...,
    per_instance_extras={
        "iscrowd": tlc.schemas.Int32Schema(),
        "score": tlc.schemas.Float32Schema(),
    },
)

COCO importer — new per_image_extras parameter

The COCO importer now accepts a per_image_extras parameter for preserving image-level custom fields as top-level table columns:

table = tlc.Table.from_coco(
    ...,
    per_image_extras=["date_captured", "flickr_url"],
)

The CocoExporter also accepts per_image_extras for round-trip support, writing specified table columns back into COCO image entries on export.

CocoExporter.serialize(include_segmentation=...) removed

The deprecated include_segmentation parameter on CocoExporter.serialize() (and the corresponding table.export(..., include_segmentation=...) kwarg) has been removed. Detection tables created in 3.x no longer carry segmentation data, so the parameter is no longer meaningful. The exporter no longer copies legacy bb_list-row segmentation into the exported COCO file either; old-style detection tables export with empty segmentation lists, matching the new-style BB path. Use task="segment" to work with segmentation data.

Integrations

Hugging Face

The transformers-bound Trainer integration is split out from the datasets-bound table machinery. Importing tlc.integration.hugging_face no longer requires datasets.

TLCTrainer removed

The deprecated TLCTrainer class has been removed. The 3LC integration now provides a single Trainer class that supports 3LC Tables and metrics collection.

# 2.x
from tlc.integration.hugging_face import TLCTrainer
trainer = TLCTrainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()

# 3.x
from tlc.integration.hugging_face.trainer import Trainer
trainer = Trainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()

The Trainer class is no longer re-exported from the top-level tlc or tlc.integration packages — import from tlc.integration.hugging_face.trainer.

HF Table classes are private — use tlc.Table factories

The 2.x TableFromHuggingFaceHub, TableFromHuggingFaceDataset, and the TableFromHuggingFace alias are no longer part of the public API. Construct HF-backed tables via the tlc.Table factory methods instead.

Top-level imports like from tlc.integration.hugging_face import TableFromHuggingFaceHub and the previously documented tlc.integration.hugging_face.table_from_hugging_face* module paths are no longer supported.

Detectron2

Importing tlc no longer eagerly tries to import detectron2; importing tlc.integration.detectron2 requires the optional detectron2 dependency.

BoundingBoxMetricsCollector lives in tlc.integration.detectron2

BoundingBoxMetricsCollector now lives exclusively in tlc.integration.detectron2. In practice this collector has always been used with Detectron2 — it relies on Detectron2-shaped ground-truth and prediction formats, and the COCO-style annotation dicts it consumes (CocoAnnotation, CocoGroundTruth, CocoPrediction) are produced by the Detectron2 hooks. Keeping it in the general-purpose metrics_collectors namespace implied a framework-agnostic collector that it never really was.

# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector

Detectron2 utilities no longer hoisted to tlc.*

Detectron2 metric collectors, hooks, and helpers were previously re-exported at the top-level tlc namespace via a wildcard import. They now live exclusively under tlc.integration.detectron2. Affected names: BoundingBoxMetricsCollector, CocoAnnotation, CocoGroundTruth, CocoPrediction, DetectronMetricsCollectionHook, MetricsCollectionHook, UmapReduceEmbeddingsHook, register_coco_instances. The 2.x all-caps spellings COCOAnnotation, COCOGroundTruth, COCOPrediction, and UMAPReduceEmbeddingsHook were hoisted alongside the PascalCase variants in 2.x and are also gone from the top level in 3.x (see Renames for the PascalCase replacements).

# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector

metrics_collector = BoundingBoxMetricsCollector(...)
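To locate affected call sites, a simple text scan for tlc.<Name> attribute access can help. The sketch below uses only the standard re module; find_top_level_uses is a hypothetical helper, the name list is copied from the paragraph above, and the scan does not catch from tlc import ... statements or aliased imports.

```python
import re

MOVED_NAMES = (
    "BoundingBoxMetricsCollector", "CocoAnnotation", "CocoGroundTruth",
    "CocoPrediction", "DetectronMetricsCollectionHook", "MetricsCollectionHook",
    "UmapReduceEmbeddingsHook", "register_coco_instances",
    # 2.x all-caps spellings, also gone from the top level
    "COCOAnnotation", "COCOGroundTruth", "COCOPrediction",
    "UMAPReduceEmbeddingsHook",
)

# Matches `tlc.<Name>` only when the name directly follows `tlc.`.
PATTERN = re.compile(r"\btlc\.(" + "|".join(MOVED_NAMES) + r")\b")


def find_top_level_uses(source: str):
    """Return (line_number, name) pairs for `tlc.<Name>` uses of moved names."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((number, match.group(1)))
    return hits
```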

PIL.Image.LINEAR shim removed

In 2.x, import tlc aliased PIL.Image.LINEAR = PIL.Image.BILINEAR as a back-compat shim for detectron2 ≤ v0.6, which referenced Image.LINEAR at import time (Pillow 10 removed it). In 3.x this patch is gone. If you are pinned to detectron2 ≤ v0.6, upgrade to a recent detectron2 commit, or apply the shim yourself before import detectron2:

from PIL import Image
if not hasattr(Image, "LINEAR"):
    Image.LINEAR = Image.BILINEAR

Legacy bb_list segmentation no longer copied into detectron2 annotations

register_coco_instances previously copied any segmentation field present alongside a legacy bb_list row into the detectron2 annotation dicts it produced. In 3.x this codepath has been removed; detection tables register with detectron2 without a segmentation field. Segmentation datasets should be registered as segmentation tables.

PyTorch Lightning — decorator removed

The @tlc.integration.pytorch_lightning.lightning_module class decorator has been removed in 3.x, along with the tlc.integration.pytorch_lightning package and the 3lc[lightning] install extra. 3LC integrates with Lightning using the public 3LC API and Lightning’s standard hooks — no decorator required.

For the migration pattern and end-to-end examples, see the PyTorch Lightning integration page. If you have a 2.x project that depends specifically on the decorator and the documented migration is not enough, please reach out to the 3LC team.