Migrate from 2.x to 3.x¶

This guide describes the common patterns involved in migrating from version 2.x to version 3.x of the 3LC Python API.

3.x makes the public surface explicit, redesigns the schema / sample-type system, and keeps the runtime type and schema for each CV annotation shape in one place. Most user code needs only mechanical updates (import paths, keyword arguments); a smaller set of behavior changes needs attention even if the names look unchanged.

Recommended migration procedure¶

The size of the change makes a structured pass safer than ad-hoc search-and-replace. The recommended order is:

Install changes first. Apply Dependency and install changes. pandas and torch are now optional.
Apply namespace and rename passes. Walk Namespace moves and Renames. These two passes account for the bulk of the diff in typical 2.x → 3.x migrations and are mostly mechanical.
Convert positional calls. Apply Keyword-only parameters. Any 2.x call that still passes layout fields positionally needs to become keyword form.
Audit silent behavior changes. Read Behavioral changes end to end — these are the things that don’t show up as ImportError or AttributeError, so an otherwise-clean migration can still ship a bug.
Convert CV columns. Apply Schema and Sample Type redesign and CV and Annotation types. Bounding boxes have moved from the flat bb_list format to the geometry-based BoundingBoxes2D dataclass; segmentation has split into SegmentationPolygons / SegmentationMasks.
Update integration code. Apply Integrations — Trainer import path, Detectron2 hoisting removal, PyTorch Lightning decorator removal.

Disambiguation hotspots¶

The following items are likely to be missed by a simple search-and-replace; verify them by hand:

root= parameter. Renamed to root_url= on Table.squash, ProjectHelper.register_project_url_alias, and the inherited AddressableObject.root_url property. Not renamed on Table.from_image_folder, where root refers to the image folder, not the project root.
Trainer. 2.x’s tlc.Trainer came from tlc.integration.hugging_face. There is no top-level tlc.Trainer in 3.x; import from tlc.integration.hugging_face.trainer.
BoundingBoxMetricsCollector. This is specific to detectron2 in 3.x. The 2.x tlc.BoundingBoxMetricsCollector path is gone; it lives only at tlc.integration.detectron2.BoundingBoxMetricsCollector.
tlc.Object. Both renamed and redefined. tlc.objects.Object is now the lightweight root of the hierarchy; use tlc.objects.AddressableObject to recover the 2.x semantics of tlc.Object (URL-bearing).
sample_type container forms. "tuple", "list", "box" were removed (with a load-time warning, not an error). Code that did image, label = table[0] needs a dict update even though imports look fine.
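
The hotspots above can be located mechanically before they are checked by hand. The following is a minimal sketch using only the standard-library ast module. It assumes the package is imported under the plain name tlc (no alias); the helper name find_hotspots and its messages are illustrative and not part of 3LC. It only flags candidates and rewrites nothing.

```python
import ast

# Top-level names that the list above says need a manual decision.
HOTSPOT_ATTRS = {"Trainer", "BoundingBoxMetricsCollector", "Object"}
# Methods on which the guide says `root=` became `root_url=`.
ROOT_RENAMED_ON = {"squash", "register_project_url_alias"}


def find_hotspots(source: str) -> list[tuple[int, str]]:
    """Return sorted (line number, message) pairs for constructs to review."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Matches tlc.Trainer, tlc.Object, tlc.BoundingBoxMetricsCollector.
        if (
            isinstance(node, ast.Attribute)
            and isinstance(node.value, ast.Name)
            and node.value.id == "tlc"
            and node.attr in HOTSPOT_ATTRS
        ):
            findings.append((node.lineno, f"tlc.{node.attr}: check the 3.x location by hand"))
        # Matches <anything>.squash(root=...) and similar calls.
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in ROOT_RENAMED_ON
            and any(kw.arg == "root" for kw in node.keywords)
        ):
            findings.append((node.lineno, f"{node.func.attr}(root=...): use root_url="))
    return sorted(findings)


example = '''\
import tlc
trainer = tlc.Trainer(model)
table.squash(out, root="s3://bucket")
table2 = tlc.Table.from_image_folder(root="./images")
'''
for lineno, message in find_hotspots(example):
    print(lineno, message)
```

The from_image_folder call on the last line of the example is deliberately not reported, because its root parameter keeps its name.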

Don’t hallucinate replacements (LLMs)¶

If a symbol is not mentioned in this guide, treat it as intentionally removed or renamed via a documented pattern. Do not invent new module paths — tlc.core.X is privatized to tlc._core.X only at the same internal path; the public path is whatever sub-namespace is curated for that subsystem (see Namespace moves).

Validation checklist¶

After the passes above:

ruff check . — catches removed names that turn into NameError after import.
mypy src/ — catches signature drift from kw-only changes and Url vs str mismatches.
Run the application end to end on a pre-3.0 project once. The first run will surface silent behavior changes (the table_name default in particular).

Dependency and install changes¶

Getting a 2.x-equivalent install¶

Several dependencies that were required in 2.x, and therefore always installed alongside 3lc, have been made optional in 3.x. This reduces the install footprint and import load time for workloads that do not need those dependencies. To get the same set of packages as a 2.x install, install the extras that were required in 2.x:

pip install '3lc[pandas,torch]'

See Dependencies for the full list of available extras, and the PyTorch installation notes for guidance on accelerator-specific torch wheels.
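
To see which extras an existing environment would need, the optional modules can be probed without importing them. This is a minimal sketch using importlib from the standard library; the function name missing_extras is illustrative, and the module-to-extra mapping is taken from the install command above (torchvision is assumed to come with the torch extra, as described later in this section).

```python
import importlib.util

# Import name -> name of the 3lc extra that provides it.
OPTIONAL_MODULES = {"pandas": "pandas", "torch": "torch", "torchvision": "torch"}


def missing_extras(modules=OPTIONAL_MODULES) -> list[str]:
    """Return the sorted extras whose modules cannot be found by the import system."""
    return sorted(
        {extra for module, extra in modules.items() if importlib.util.find_spec(module) is None}
    )


extras = missing_extras()
if extras:
    print(f"pip install '3lc[{','.join(extras)}]'")
else:
    print("pandas, torch and torchvision are all importable")
```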

`pandas` is now an optional dependency¶

In 2.x, pandas was a hard runtime dependency of 3lc and was always installed alongside the package. In 3.x it has been moved to an optional extra to reduce the install footprint for users who do not need it.

pip install 3lc no longer pulls in pandas. To use the pandas-facing entry points, install the extra with pip install '3lc[pandas]', or install pandas directly (pip install pandas or equivalent).

The entry points that require pandas to be importable are:

Table.from_pandas
Table.to_pandas — raises ImportError with installation instructions if pandas is not available

If your code already uses any of these, install the pandas extra and no code changes are required.

`torch` and `torchvision` are now optional dependencies¶

In 2.x, torch and torchvision were hard runtime dependencies of 3lc and import tlc would fail if either was missing. In 3.x they have been moved to an optional [torch] extra so that torch-free workflows (Tables, Runs, the Object Service, URL adapters) can run on a minimal install.

pip install 3lc no longer pulls in torch or torchvision. To use the torch-bound entry points, install the extra with pip install '3lc[torch]'. The [huggingface] extra depends on [torch] transitively, so pip install '3lc[huggingface]' continues to pull both in.

The entry points that require torch to be importable are:

tlc.collect_metrics()
Predictor, EmbeddingsMetricsCollector, and SegmentationMetricsCollector
Table.from_torch_dataset
The framework integrations under tlc.integration.detectron2, tlc.integration.hugging_face, and tlc.integration.super_gradients

Calling any of these without torch installed raises ImportError. The framework integration packages require the framework itself as well — their ImportError points at the framework’s own install command (pip install detectron2, pip install super-gradients, pip install datasets / transformers). FunctionalMetricsCollector and the MetricsCollector base class do not require torch.

For accelerator-specific builds (CUDA, ROCm, MPS), follow the official PyTorch install instructions to pick the right index URL: https://pytorch.org/get-started/locally/.

Namespace moves¶

In 2.x, almost every public name was reachable directly at the top-level tlc.* namespace via wildcard imports from tlc.client.* and tlc.core.*. In 3.x, the public surface has been curated and is declared explicitly: each public module sets __all__ to the names it exports, following the convention from PEP 8 — Public and Internal Interfaces. Anything not listed in __all__, or sitting behind an underscore-prefixed package path, is private and may change without notice.

tlc.* is now an explicit, curated set of names — the most commonly used types and free functions (Table, Run, Url, Schema, TableWriter, MetricsTableWriter, the init / active_run / log session helpers, collect_metrics, config).
Curated public sub-namespaces group related concerns: tlc.schemas, tlc.constants, tlc.metrics, tlc.helpers, tlc.export, tlc.reduction, tlc.data_types, tlc.url, tlc.integration, tlc.sample_types, tlc.objects, tlc.configuration. Each is equally public.
Implementation packages are private. tlc.core no longer exists as a public path — it has been renamed to tlc._core and is private. tlc.client has been removed entirely; its contents have been redistributed to the curated public sub-namespaces above. Underscore-prefixed names (tlc._core, tlc.schemas._foo, etc.) may move, rename, or be removed at any time.
tlcsaas has been renamed to _tlcsaas to reflect that it was always internal infrastructure for the 3LC client.
tlc.config is a bound shortcut for the live Configuration singleton (equivalent to tlc.configuration.Configuration.instance()). It is resolved lazily on first access, so import tlc does not force Configuration construction for callers that never touch config. The Configuration type itself now lives at tlc.configuration.Configuration rather than the top level — most code reaches it through the type of tlc.config and never needs to name it.

import tlc

print(tlc.config.project_root_url)
tlc.config.indexing.scan_urls = ["./data"]

Where everything moved¶

If a name lived at tlc.X in 2.x and is not listed below, see Removed from public API.

2.x location	3.x location
`tlc.client.session.init` / `close` / `log` / `active_run` / `set_active_run` / `active_project_name`	unchanged — `tlc.init` / `tlc.close` / `tlc.log` / `tlc.active_run` / `tlc.set_active_run` / `tlc.active_project_name`
`tlc.client.helpers.*`	`tlc.helpers.*`
`tlc.client.reduce.*` (also `tlc.reduce_*` free functions and `tlc.create_reducer`)	`tlc.reduction.*`
`tlc.client.torch.metrics.*` (also `tlc.<X>MetricsCollector`, `tlc.Predictor`, …)	`tlc.metrics.*`
`tlc.client.torch.samplers.*` (the standalone `create_sampler`, `create_weighted_sampler`, `create_random_sampler`, `create_sequential_sampler`, `create_repeat_by_weight_sampler` factories and the `RangeSampler` / `RepeatByWeightSampler` / `SubsetSequentialSampler` classes)	`tlc.integration.torch.samplers.*`
`tlc.core.builtins.schemas.*` (also `tlc.<X>Schema` for scalar/system schemas)	`tlc.schemas.*` (the `Schema` base class stays at `tlc.Schema`)
`tlc.core.builtins.constants.*` (also `tlc.<NAME>` constants)	`tlc.constants.*`
`tlc.core.export.*` (also `tlc.<X>Exporter`, `tlc.register_exporter`)	`tlc.export.*`
`tlc.core.data_formats.*` (also `tlc.Geometry2DInstances`, `tlc.OBB2DInstances`, …)	`tlc.data_types.*` — note: dataclass renames, see Annotation dataclass renames
`tlc.UrlAdapter`, `tlc.UrlAdapterDirEntry`, `tlc.Scheme`, `tlc.register_url_adapter`, `tlc.register_url_alias`, `tlc.unregister_url_alias`, `tlc.get_alias_path`, `tlc.get_registered_url_aliases`	`tlc.url.*` (`tlc.Url` itself stays at the top level)
`tlc.core.objects.*` (any `tlc.core.X`)	`tlc._core.objects.*` — private; reach for the public re-export instead
`tlc.MutableObject`, `tlc.AddressableObject`, `tlc.SerializableObject`, `tlc.Object`	`tlc.objects.*` — with `tlc.objects.Object` semantically redefined as the lightweight root; use `tlc.objects.AddressableObject` for the 2.x URL-bearing class
`tlc.Configuration`	`tlc.configuration.Configuration` — the `tlc.config` shortcut is unchanged; reach the type through it (or import from `tlc.configuration`) instead of the top level
`tlc.client.torch.metrics.metrics_collectors.bounding_box_metrics_collector.BoundingBoxMetricsCollector` (also `tlc.BoundingBoxMetricsCollector`)	`tlc.integration.detectron2.BoundingBoxMetricsCollector`
`tlc.TableWriter`, `tlc.MetricsTableWriter`	unchanged
`tlc.Table`, `tlc.Run`, `tlc.Url`, `tlc.Schema`, `tlc.config`, `tlc.collect_metrics`	unchanged
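
When auditing imports, a longest-prefix lookup over the table above can suggest where to look in 3.x. This is a sketch, not a rewriter: it covers only a subset of the rows, the names MODULE_MOVES and suggest_namespace are illustrative, and it assumes each listed name is re-exported from the curated namespace, which should be confirmed per symbol.

```python
from typing import Optional

# 2.x module prefix -> 3.x public namespace, copied from the table above.
MODULE_MOVES = {
    "tlc.client.helpers": "tlc.helpers",
    "tlc.client.torch.metrics": "tlc.metrics",
    # More specific row from the table: this collector moved to the detectron2 integration.
    "tlc.client.torch.metrics.metrics_collectors.bounding_box_metrics_collector":
        "tlc.integration.detectron2",
    "tlc.client.torch.samplers": "tlc.integration.torch.samplers",
    "tlc.core.builtins.schemas": "tlc.schemas",
    "tlc.core.builtins.constants": "tlc.constants",
    "tlc.core.export": "tlc.export",
    "tlc.core.data_formats": "tlc.data_types",  # dataclasses were also renamed
}


def suggest_namespace(module: str) -> Optional[str]:
    """Return the 3.x namespace for a 2.x module path, or None if it is not listed."""
    # Try the longest prefixes first so specific rows win over general ones.
    for old in sorted(MODULE_MOVES, key=len, reverse=True):
        if module == old or module.startswith(old + "."):
            return MODULE_MOVES[old]
    return None


for path in (
    "tlc.client.helpers",
    "tlc.client.torch.metrics.metrics_collectors.bounding_box_metrics_collector",
    "tlc.core.objects.table",
):
    print(path, "->", suggest_namespace(path))
```

A None result means the path is not in this subset; such paths still need to be checked against the full table and against Removed from public API.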

New names in the public sub-namespaces¶

The following are new in 3.x with no 2.x equivalent at any public path:

tlc.schemas: ConfidenceSchema, DatetimeStringSchema, EmbeddingSchema, Float64Schema, FractionSchema, ImageSchema (replaces 2.x ImageUrlSchema — see ImageSchema — one class, four modes), Int8Schema, Int16Schema, Int64Schema, IoUSchema, ProbabilitySchema, SemanticSegmentationSchema, Uint8Schema, Uint16Schema, Uint32Schema, Uint64Schema, UrlSchema.
tlc.url: IfExistsOption, list_url_adapters, list_url_schemes.
tlc.helpers: ColorHelper, AnnotationHelper, ProjectHelper, DateTimeHelper, ImageHelper, AnnotationColumn, AnnotationType (some have non-public 2.x counterparts under tlc.core.export.annotation_utils).
tlc.export: RowExporter, ExporterInfo, ExporterRegistry, ExporterSource, list_exporters, list_exporter_formats.
tlc.sample_types: SampleType, SampleTypeRegistry, register_sample_type, get_sample_types (see Schema and Sample Type redesign).
tlc.data_types: BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks (see CV and Annotation types).

`tlc.constants` prefix families¶

tlc.constants exposes a flat namespace — every constant is accessible directly at tlc.constants.X, with the same name it had at tlc.X in 2.x. The names group into:

Prefix family	Examples
Column names (no prefix)	`LABEL`, `IMAGE`, `EXAMPLE_ID`, `EPOCH`, `BOUNDING_BOXES`, `SEGMENTATIONS`, …
`BOOL_ROLE_*`	`BOOL_ROLE_ACTION`, `BOOL_ROLE_NORMAL`, `BOOL_ROLE_SLIDER`
`DISPLAY_IMPORTANCE_*`	`DISPLAY_IMPORTANCE_DEFAULT`, `DISPLAY_IMPORTANCE_IMAGE`, `DISPLAY_IMPORTANCE_LOSS`, …
`NUMBER_ROLE_*`	`NUMBER_ROLE_LABEL`, `NUMBER_ROLE_BB_CENTER_X`, `NUMBER_ROLE_FRACTION`, …
`RUN_STATUS_*`	`RUN_STATUS_RUNNING`, `RUN_STATUS_COMPLETED`, `RUN_STATUS_PAUSED`, …
`STRING_ROLE_*`	`STRING_ROLE_URL`, `STRING_ROLE_DATETIME`, `STRING_ROLE_IMAGE_URL`, …
`UNIT_*`	`UNIT_ABSOLUTE`, `UNIT_RELATIVE`
`WIDGET_*`	`WIDGET_CHART`, `WIDGET_FILTERS_PANEL`, `WIDGET_COLUMN_HEADER`, …
`DEFAULT_*`	`DEFAULT_BB_MAX_COUNT`, `DEFAULT_LIST_MAX_LENGTH`, `DEFAULT_BULK_DATA_CHUNK_SIZE_MB`, …

A handful of role constants that were defined but neither emitted by Python nor read by the dashboard have been removed: NUMBER_ROLE_PIXEL_COUNT, NUMBER_ROLE_BB_HALF_SIZE_X / _Y (use _BB_SIZE_X / _Y), NUMBER_ROLE_METRIC_STRING_INDEX, the bare NUMBER_ROLE_RGB_COMPONENT (use the channel-specific _RED / _GREEN / _BLUE), STRING_ROLE_NONE (use the literal ""), STRING_ROLE_INSTANCE_SEGMENTATION_URL.

Removed from public API¶

These were exposed at the top level in 2.x but are no longer addressable. Use the public surface instead, or reach into tlc._core.* privately if you genuinely need to:

Session class — private. Use the free functions tlc.init, tlc.close, tlc.active_run, tlc.set_active_run, tlc.active_project_name (which were always the recommended API).
Filter criteria — tlc.FilterCriterion, tlc.FreeTextFilterCriterion, tlc.IntegerSetFilterCriterion, tlc.LogicalNotFilterCriterion, tlc.NumericRangeFilterCriterion, tlc.Region2DFilterCriterion, tlc.Region3DFilterCriterion, tlc.TextFilterCriterion, tlc.create_filter, tlc.create_optional_filter. The dashboard is the supported way to author filtered views in 3.x.
Registry classes and URL adapters — tlc.ObjectTypeRegistry, tlc.ObjectRegistry, tlc.ObjectReference, tlc.UrlAliasRegistry, tlc.UrlAdapterRegistry, tlc.AdapterInfo, tlc.AdapterSource, and the concrete URL adapter classes (tlc.S3UrlAdapter, tlc.GcsUrlAdapter, tlc.HttpUrlAdapter, tlc.FileUrlAdapter, tlc.AbfsUrlAdapter, tlc.ApiUrlAdapter). Adapters are pluggable via the tlc.url_adapters entry-point group; subclass tlc.url.UrlAdapter and decorate with @tlc.url.register_url_adapter.
TableFrom* constructor classes — private. Use the corresponding tlc.Table.from_* factory method instead (from_coco, from_csv, from_parquet, from_pandas, from_dict, from_torch_dataset, from_yolo_url, from_hugging_face_hub, from_hugging_face_dataset).
TableFromTFRecordSet — was a placeholder, never implemented.
Bulk-data URL helpers — tlc.bulk_data_url_context, tlc.increment_and_get_bulk_data_url, tlc.relativize_bulk_data_url, tlc.reset_bulk_data_url, tlc.set_bulk_data_url_prefix. Bulk-data URL accounting is internal in 3.x.
Table.bulk_data_url — removed, along with the TLC_BULK_DATA_URL environment-variable override. The property hardcoded a single bulk-data path and did not distinguish dataset tables from metrics tables; bulk-data URL resolution is now handled internally by TableWriter.
Table.create_sampler() (deprecated in 2.x) — removed. Use the standalone factory functions in tlc.integration.torch.samplers: the general-purpose create_sampler(table, ...) dispatcher, or the explicit create_weighted_sampler, create_random_sampler, create_sequential_sampler, and create_repeat_by_weight_sampler.
Table.map() / Table.map_collect_metrics() / Table.collection_mode() / Table.clear_maps() (deprecated in late 2.x) — removed, along with the Table.collecting_metrics property. Use Table.with_transform() to obtain a non-mutating view instead; see Table.map() replaced by Table.with_transform().
Table.get_column() — use Table.get_column_as_pyarrow_array() instead.
Table.row_schema — use Table.rows_schema instead. The two properties were near-duplicates; the remaining one returns the table’s live schema reference and should be treated as read-only.
Table.from_yolo(...) (the deprecated YAML-parsing factory) — use Table.from_yolo_url(...) directly for a single image folder or text file. For a YOLO dataset YAML file with splits, install 3lc-ultralytics and use tlc_ultralytics.create_tables_from_yaml_file to get one Table per split:
```
from tlc_ultralytics import create_tables_from_yaml_file

tables = create_tables_from_yaml_file(
    dataset="/path/to/my/dataset.yaml",
    task="detect",
    project_name="My YOLO Project",
)
```
The same keyword arguments for pose (points, oks_sigmas, …) can be passed to configure the pose task.
tlc.client.sample_type module — removed. Sample types are referenced by string name through the SampleTypeRegistry (see Schema and Sample Type redesign).
tlc.helpers.JsonHelper — internal JSON-serialization plumbing, now private. Its only methods (to_minimal_dict, sort_by_rank) operate on internal object/schema structures and were not part of any user workflow. The class still exists at the private path tlc.helpers.json_helper.JsonHelper. Likewise, the internal-only SchemaHelper.pyarrow_list_to_tlc_schema and SegmentationMetricsCollector.tensor_to_pil_image are now underscore-prefixed (SchemaHelper itself stays public).
@tlc.integration.pytorch_lightning.lightning_module decorator and the tlc.integration.pytorch_lightning package — removed. Integrate via Lightning’s standard hooks; see Integrations.

Built-in schemas — reorganized¶

The set of built-in schemas has been cleaned up. A shape= parameter on the scalar schemas absorbs every *ListSchema wrapper, the CV annotation *Schema classes are now classmethods on the matching tlc.data_types dataclass (so a column’s runtime type and its schema-builder live together), and a handful of single-purpose label / vector / RGB-component schemas have been folded back into the corresponding primitive with number_role=.

Concretely: some schemas are simply gone; others survive but are constructed differently.

2.x schema	3.x replacement
`BoundingBoxListSchema`, `BoundingBox2DSchema`, `BoundingBox3DSchema`, `OrientedBoundingBox2DSchema`, `OrientedBoundingBox3DSchema`	`BoundingBoxes2D.schema(...)` / `BoundingBoxes3D.schema(...)` / `OrientedBoundingBoxes2D.schema(...)` / `OrientedBoundingBoxes3D.schema(...)`
`BoundingBoxes2DSchema`, `BoundingBoxes3DSchema`, `OrientedBoundingBoxes2DSchema`, `OrientedBoundingBoxes3DSchema`, `Keypoints2DSchema`, `Geometry2DSchema`, `Geometry3DSchema`, `GeometrySchema`	The matching `<Dataclass>.schema(...)` on the dataclass in `tlc.data_types`
`SegmentationSchema`	`SegmentationPolygons.schema(...)` or `SegmentationMasks.schema(...)`
`FloatVector2Schema`, `FloatVector3Schema`	`Float32Schema(shape=2, number_role="xy_component")` / `(shape=3, number_role="xyz_component")`
`BlueComponentSchema` / `BlueComponentListSchema`, `GreenComponentSchema` / `…`, `RedComponentSchema` / `…`	`Uint8Schema(number_role="rgb_component_blue" / "_green" / "_red")`
`CIFAR10LabelSchema`, `COCOLabelSchema` / `CocoLabelSchema`	`CategoricalLabelSchema(classes=...)`
`CategoricalLabel` (a 2.x sample type, not a schema)	`CategoricalLabelSchema(classes=...)`
`Int32ListSchema`, `BoolListSchema`, `StringListSchema`	`<X>Schema(shape=(-1,))` (or `shape=N` for fixed length)
`ImageUrlSchema`	`ImageSchema()` (see ImageSchema — one class, four modes)

ConfidenceSchema, EmbeddingSchema, FractionSchema, IoUSchema, ProbabilitySchema, and SemanticSegmentationSchema are new pre-configured schemas in 3.x — useful when you’d otherwise reach for Float32Schema(number_role="...") with a stock role.

CategoricalLabel was a 2.x sample type (constructed as CategoricalLabel(display_name, classes=...)), not a schema, but it was commonly passed where a column schema was expected — e.g. in the column_schemas/schema mapping of Run.add_metrics or TableWriter. Replace it with CategoricalLabelSchema(classes, display_name=...): the first positional argument becomes the classes= keyword.

Removed re-exports¶

The following import paths were deprecated re-exports and have been removed:

Old	New
`tlc.core.builtins.types.segmentation_helper`	`tlc.helpers.segmentation_helper`
`tlc.client.data_format`	`tlc.data_types` (and `tlc.data_types.segmentation` for segmentation types)
`tlc.lightning_module` / `tlc.integration.lightning_module`	removed — see Integrations
`tlcconfig.logger_configurator`	`tlclogging.logger_configurator`
`tlc.client.utils.relativize_with_max_depth(url, owner, max_depth)`	`url.to_relative_with_max_depth(owner=owner, max_depth=max_depth)`
`tlc.client.torch.metrics.metrics_collectors.segmentation_metrics_collector.PREDICTED_MASK_METRIC_NAME`	`tlc.constants.PREDICTED_MASK`

The old list-based BoundingBox classes (BoundingBox, XYXYBoundingBox, CenteredXYWHBoundingBox, and friends, previously at tlc.core.builtins.types.bounding_box / tlc.core.data_formats.bounding_boxes) have been removed entirely, superseded by the BoundingBoxes2D / BoundingBoxes3D dataclasses in tlc.data_types.

Renames¶

This section covers renames where the symbol still exists with the same role, just under a new name or parameter spelling. The old names emitted deprecation warnings during 2.x; in 3.x they are removed.

Casing — acronym → PascalCase¶

2.x	3.x
`TLCException`	`TlcException`
`COCOAnnotation`, `COCOGroundTruth`, `COCOPrediction`, `COCOExporter`	`CocoAnnotation`, `CocoGroundTruth`, `CocoPrediction`, `CocoExporter`
`CSVExporter`, `DefaultJSONExporter`, `YOLOExporter`	`CsvExporter`, `DefaultJsonExporter`, `YoloExporter`
`UMAPTable`, `UMapTable`, `UMapTableArgs`, `UMapReduction`, `UMAPReduceEmbeddingsHook`	`UmapTable`, `UmapTableArgs`, `UmapReduction`, `UmapReduceEmbeddingsHook`
`PaCMAPTable`, `PaCMAPTableArgs`, `PaCMAPReduction`	`PacmapTable`, `PacmapTableArgs`, `PacmapReduction`
`InstanceSegmentationRLEBytesStringValue`	`InstanceSegmentationRleBytesStringValue`
`FSSpecUrlAdapter`, `FSSpecUrlAdapterDirEntry`	`FsspecUrlAdapter`, `FsspecUrlAdapterDirEntry`
`GCSUrlAdapter`, `GSUrlAdapterDirEntry`	`GcsUrlAdapter`, `GcsUrlAdapterDirEntry`
`LRUCache`, `LRUEntry`, `LRUFuncCache`, `LRUCacheStore`, `LRUCacheStoreConfig`	`LruCache`, `LruEntry`, `LruFuncCache`, `LruCacheStore`, `LruCacheStoreConfig`
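
Because these are pure identifier renames, a whole-word substitution handles most of them. The sketch below covers only a few rows of the table, and the names CASING_RENAMES and apply_casing_renames are illustrative. It works on raw text, so it also rewrites matches inside strings and comments; review the resulting diff.

```python
import re

# A subset of the casing table above; extend with the remaining rows as needed.
CASING_RENAMES = {
    "TLCException": "TlcException",
    "COCOExporter": "CocoExporter",
    "CSVExporter": "CsvExporter",
    "DefaultJSONExporter": "DefaultJsonExporter",
    "YOLOExporter": "YoloExporter",
    "UMAPTable": "UmapTable",
    "UMapTable": "UmapTable",
    "PaCMAPTable": "PacmapTable",
    "FSSpecUrlAdapter": "FsspecUrlAdapter",
    "LRUCache": "LruCache",
}

# \b on both sides keeps "LRUCache" from matching inside "LRUCacheStore".
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, CASING_RENAMES)) + r")\b")


def apply_casing_renames(source: str) -> str:
    """Replace whole-word occurrences of the listed 2.x names with their 3.x names."""
    return _PATTERN.sub(lambda match: CASING_RENAMES[match.group(1)], source)


print(apply_casing_renames("except tlc.TLCException: cache = LRUCache()"))
```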

Parameter renames¶

Site	2.x	3.x
`Table.squash`	`root=`	`root_url=`
`ProjectHelper.register_project_url_alias`	`root=`	`root_url=`
`ProjectHelper.register_project_url_alias`	`project=`	`project_name=` (now keyword-only)
`AddressableObject.root` (property)	`.root`	`.root_url`
`CategoricalLabelSchema(class_names=..., display_colors=...)`	`class_names` + `display_colors`	`classes=` (single argument combines both)
`Run.add_metrics_data(override_column_schemas=..., input_table_url=..., …)`	`add_metrics_data`, `override_column_schemas`, `input_table_url`	`Run.add_metrics(schema=..., foreign_table_url=..., ...)`
`tlc.client.torch.metrics.collect_dataset.collect_metrics(..., exclude_zero_weights=...)`	`exclude_zero_weights`	parameter removed — it had no effect; the detectron2 `collect_metrics` moved to `tlc.integration.detectron2`
`UMAPTable.seed` property	`.seed`	construct with `random_state=...` for determinism

Table.from_image_folder(root=...) is not affected — that root parameter refers to the image folder path, not the project root URL.

`Url.create_*` classmethods → `ProjectLayout` methods¶

The Url.create_* classmethods are gone. They are now methods on ProjectLayout:

2.x	3.x
`Url.create_project_url(...)`	`ProjectLayout.project_url(...)`
`Url.create_table_url(...)`	`ProjectLayout.table_url(...)`
`Url.create_run_url(...)`	`ProjectLayout.run_url(...)`
`Url.create_default_aliases_config_url(...)`	`ProjectLayout.default_project_aliases_config_url(...)`
`Url.is_dataset_table_url()`	`ProjectLayout.is_dataset_table_url(url)`
`Url.is_run_url()`	`ProjectLayout.is_project_run_url(url)`
`Url.is_metrics_table_url()`	`ProjectLayout.is_run_metrics_table_url(url)`
`Url.create_unique(require_writable=True)`	`ProjectLayout.create_unique_table_url(url, require_writable=True)`
`tlc.client.helpers.register_project_url_alias(...)`	`ProjectHelper.register_project_url_alias(...)`

The 2.x classmethods all accepted root=; pass root_url= to the ProjectLayout methods instead.

`IndexingTable` verbs (niche)¶

The IndexingTable subclasses (TableIndexingTable, RunIndexingTable, ConfigIndexingTable) live under tlc._core and are considered internal. If you were using the escape hatch tlc.TableIndexingTable.instance().wait_for_complete_index() to force a re-scan after editing files outside the Python package, the verbs have been renamed:

wait_for_complete_index() → sync() (blocking, returns True on completion)
request_reindex() → request_sync() (non-blocking, returns a request token)

The force= parameter has been dropped.

`Scheme` is now string constants¶

In 2.x Scheme was an enum; in 3.x it is a string-constants class and url.scheme is already a string. Drop any .value calls:

# 2.x
url.scheme.value      # "file"
Scheme.FILE.value     # "file"

# 3.x
url.scheme            # "file"
Scheme.FILE           # "file"
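
Code that has to run against both versions during a transition can normalize the value instead of calling .value directly. A minimal sketch: scheme_text is a hypothetical helper, and LegacyScheme is a stand-in for the 2.x enum so that the example runs without 3LC installed.

```python
from enum import Enum


def scheme_text(scheme) -> str:
    """Return the scheme as a plain string for an enum member or a str."""
    return scheme.value if isinstance(scheme, Enum) else scheme


class LegacyScheme(Enum):  # hypothetical stand-in for the 2.x enum
    FILE = "file"


print(scheme_text(LegacyScheme.FILE))  # enum member, as in 2.x
print(scheme_text("file"))             # plain string, as in 3.x
```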

Annotation dataclass renames¶

2.x	3.x
`Geometry2DInstances`	`Geometry2D`
`Geometry3DInstances`	`Geometry3D`
`Keypoints2DInstances`	`Keypoints2D`
`OBB2DInstances`	`OrientedBoundingBoxes2D`
`OBB3DInstances`	`OrientedBoundingBoxes3D`

BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks are new in 3.x — see CV and Annotation types.

Sample-type registry strings¶

The sample_type registry strings are now snake_case. The format-specific variants replace the 2.x dict form:

2.x	3.x
`"PILImage"` / `{"name": "pil_image", "format": "jpeg"}`	`"pil_png"`, `"pil_jpeg"`, `"pil_webp"`
`"small_numpy_array"`, `"large_numpy_array"`	`"numpy_array"`, `"external_numpy_array"`
`"small_torch_tensor"`, `"large_torch_tensor"`	`"torch_tensor"`, `"external_torch_tensor"`
`"instance_segmentation_polygons"`	`"segmentation_polygons"`
`"instance_segmentation_masks"`	`"segmentation_masks"`
(n/a — was the `bb_list` flat schema)	`"bounding_boxes_2d"`, `"bounding_boxes_3d"`, `"oriented_bounding_boxes_2d"`, `"oriented_bounding_boxes_3d"`, `"keypoints_2d"`

Old PascalCase and pre-rename strings are registered as legacy aliases — existing serialized tables still load. "pil_image" is registered as an alias for "pil_png", so old PascalCase "PILImage" and bare "pil_image" references resolve to PNG by default. The previous {"name": "pil_image", "format": "jpeg"} dict form is removed; pass "pil_jpeg" directly instead.
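
Where sample-type strings are written out in user code, the table above can be applied as a lookup. This is a sketch under stated assumptions: the pairing of "small_*" with the plain name and "large_*" with the "external_*" name is inferred from the order of the table cells, a dict without a "format" key is assumed to mean PNG, and the names LEGACY_SAMPLE_TYPES and to_3x_sample_type are illustrative.

```python
# 2.x registry string -> 3.x registry string (pairing inferred from the table order).
LEGACY_SAMPLE_TYPES = {
    "PILImage": "pil_png",
    "pil_image": "pil_png",
    "small_numpy_array": "numpy_array",
    "large_numpy_array": "external_numpy_array",
    "small_torch_tensor": "torch_tensor",
    "large_torch_tensor": "external_torch_tensor",
    "instance_segmentation_polygons": "segmentation_polygons",
    "instance_segmentation_masks": "segmentation_masks",
}
PIL_FORMATS = {"png", "jpeg", "webp"}


def to_3x_sample_type(sample_type):
    """Translate a 2.x sample-type string or dict form to a 3.x string."""
    if isinstance(sample_type, dict):
        # The removed dict form: {"name": "pil_image", "format": "jpeg"}.
        image_format = sample_type.get("format", "png")  # assumed default
        if sample_type.get("name") != "pil_image" or image_format not in PIL_FORMATS:
            raise ValueError(f"unrecognized sample type: {sample_type!r}")
        return f"pil_{image_format}"
    # Strings that are not legacy names pass through unchanged.
    return LEGACY_SAMPLE_TYPES.get(sample_type, sample_type)


print(to_3x_sample_type({"name": "pil_image", "format": "jpeg"}))
print(to_3x_sample_type("instance_segmentation_masks"))
```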

Color helpers — free functions → `ColorHelper`¶

# 2.x
tlc.rgb_tuple_to_hex(rgb)
tlc.hex_to_rgb_tuple(hex_str)

# 3.x
tlc.helpers.ColorHelper.rgb_tuple_to_hex(rgb)
tlc.helpers.ColorHelper.hex_to_rgb_tuple(hex_str)

Keyword-only parameters¶

In 2.x, most parameters on the public API were positional. In 3.x, public callables follow one convention: an argument is positional only when its role is obvious at the call site, and everything else is keyword-only. This lets new keyword arguments be added near the top of the signature where they remain visible, instead of being appended to the end of a long positional chain.

If you already pass these arguments by name (e.g. Table.from_pandas(df, schema=schema, table_name="t")), nothing changes. Update any positional call that broke.
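
A positional call to a keyword-only parameter fails with TypeError at call time, which is the symptom to look for. The sketch below uses a hypothetical stand-in function, not the real Table.from_pandas; only the bare * separator in its signature is the point.

```python
def from_pandas(df, *, schema=None, project_name=None, dataset_name=None, table_name=None):
    """Stand-in with one positional argument and keyword-only layout fields."""
    return {"rows": len(df), "schema": schema, "table_name": table_name}


df = [{"x": 1}, {"x": 2}]  # any sized object works for this stand-in

try:
    from_pandas(df, {"x": int}, "my_table")  # 2.x-style positional call
    rejected = False
except TypeError:
    rejected = True
print("positional call rejected:", rejected)

# The keyword form is accepted.
table = from_pandas(df, schema={"x": int}, table_name="my_table")
print(table["table_name"])
```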

Affected callables and which arguments stay positional¶

Callable	Positional in 3.x	Everything else
`Table.from_url`	`url`	—
`Table.from_names`	(none)	`project_name`, `dataset_name`, `table_name`, `root_url`
`Table.from_pandas`	`df`	all other parameters
`Table.from_dict`	`data`	all other parameters
`Table.from_csv` / `from_parquet` / `from_ndjson`	the source file	all other parameters
`Table.from_torch_dataset`	`dataset`	including `schema`, `all_arrays_are_fixed_size`
`Table.from_image_folder`	`root`	including `image_column_name`, `label_column_name`, `extensions`, `label_overrides`
`Table.from_coco`	`annotations_file`, `image_folder`	including `task`, `keep_crowd_annotations`, all pose keypoint params, `per_instance_extras`, `per_image_extras`
`Table.from_yolo_ndjson`	`ndjson_file`, `image_folder`	including `split`
`Table.from_yolo_url`	`images_url`	including `categories`, `task`, `max_depth`, `allow_fetch_remote_data`
`Table.from_hugging_face_hub`	`path`, `name`, `split` (mirrors `datasets.load_dataset`)	all other parameters
`Table.from_hugging_face_dataset`	`hf_dataset`	all other parameters
`Table.join_tables`	`tables`	all other parameters
`tlc.init`	`project_name`, `run_name`	`description`, `parameters`, `if_exists`, `root_url`, `run_url`
`Run.from_names`	(none)	`project_name`, `run_name`, `root_url`
`Run.copy`	(none)	`run_name`, `project_name`, `root_url`, `if_exists`, `destination_url`
`Run.add_metrics`	`metrics`	`schema`, `foreign_table_url`, `constants`
`collect_metrics`	`table`, `metrics_collectors`	including `predictor`, `foreign_table_url`, `constants`, `run_url`, `split`, `dataloader_args`
`TableWriter.__init__`	(none)	`schema`, `project_name`, `dataset_name`, `table_name`, `root_url`, `table_url`, `if_exists`, `description`, `input_tables`, all `bulk_data_*`
`MetricsTableWriter.__init__`	(none)	`run_url`, `foreign_table_url`, `schema`
`MapElement.__init__`	`internal_name`	`display_name`, `description`, `display_color`, `url`
`Table.add_column`	`column_name`, `values`	`schema`, `url`
`Table.export`	`output_url`, `format`	`weight_threshold` (plus exporter-specific `**kwargs`)
`Table.set_value_map_item`	`value_path`, `value`, `internal_name`	`display_name`, `description`, `display_color`, `url`, `edited_table_url`
`Table.add_value_map_item`	`value_path`, `internal_name`	`display_name`, `description`, `display_color`, `url`, `value`, `edited_table_url`
`Table.write_to_row_cache` / `get_rows_as_binary` / `get_column_as_pyarrow_array`	the column `name` (where applicable)	the flag args (`create_url_if_empty`, `overwrite_if_exists`, `exclude_bulk_data`, `combine_chunks`)
`Run.reduce_embeddings_by_foreign_table_url`	`foreign_table_url`	`delete_source_tables`
`Run.reduce_embeddings_per_dataset`	(none)	`delete_source_tables`
`tlc.reduction.reduce_embeddings` / `_per_dataset` / `_by_foreign_table_url` / `_with_producer_consumer` / `_multiple_parameters`	the source `Table` / `list[Table]` (and `foreign_table_url` / `consumers` where applicable)	`method`, `delete_source_tables`, `parameter_sets`
`MetricsCollector`, `FunctionalMetricsCollector`, `EmbeddingsMetricsCollector`, `SegmentationMetricsCollector` constructors	(none) — fully keyword-only	every argument (`collection_fn`, `layers`, `label_map`, `preprocess_fn`, `compute_aggregates`, `reshape_strategy`, `schema`)
`<Dataclass>.schema(...)` for every `tlc.data_types` annotation type (`BoundingBoxes2D`/`3D`, `OrientedBoundingBoxes2D`/`3D`, `Keypoints2D`, `SegmentationMasks`/`Polygons`, `Geometry2D`/`3D`)	(none) — fully keyword-only	every argument, including `classes` and `num_keypoints`
`Schema.add_sub_value` / `from_sample` / `add_sample_weight`	the data argument (`name`+`value`, `sample`)	the flag args (`writable`, `computable`, `all_arrays_are_fixed_size`, `hidden`, `default_value`)
`Url.write_bytes` / `Url.write_text`	`content`, `encoding`	`if_exists`
`Url.make_parents`	(none)	`exist_ok`
`Url.expand_aliases`	(none)	`allow_unexpanded`
`tlc.url.register_url_alias`	`token`, `path`	`force`
`ProjectLayout.create_unique_table_url`	`url`	`require_writable`
`SegmentationHelper.mask_from_polygons` / `polygons_from_mask` / `polygons_from_rles` / `rles_from_polygons`	the geometry data (plus `height`, `width`)	`relative`
`GeometryHelper.create_isotropic_bounds_3d`	the six bound values	`force_z_min`

Two entries in the table are marked fully keyword-only (no positional arguments at all): the metrics-collector constructors and the annotation <Dataclass>.schema(...) builders. For the collectors, 2.x code such as EmbeddingsMetricsCollector([99]) or FunctionalMetricsCollector(my_fn) must become EmbeddingsMetricsCollector(layers=[99]) / FunctionalMetricsCollector(collection_fn=my_fn). The .schema(...) builders are new in 3.x, so there is no 2.x positional form to port; always call them with keywords (e.g. Keypoints2D.schema(num_keypoints=17, classes=...)).

Argument-order changes inside the keyword-only block¶

Even when you already used keyword arguments, the documentation order on Table.from_* factories (and related helpers like TableWriter, Table.from_names, Run.from_names, Run.copy) has been aligned to a single canonical order:

Method-specific options first.
Then the project-layout block in hierarchy order — project_name → dataset_name → table_name → root_url → table_url. (In 2.x the layout block was table_name → dataset_name → project_name. Reading top-down now matches the hierarchy: a project contains datasets which contain tables.)
Then if_exists, weights, and common metadata.

This affects nothing at the call site (kwargs are order-independent), but if you rely on inspect.signature or generated docs, expect the parameter listing to read differently.
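
A small sketch of both halves of that statement, using a hypothetical factory (from_example is made up for illustration and only borrows the parameter names listed above). Keyword arguments can be passed in any order, while inspect.signature reports them in declaration order:

```python
import inspect


def from_example(data, *, schema=None, project_name=None, dataset_name=None,
                 table_name=None, root_url=None, table_url=None, if_exists="reuse"):
    """Hypothetical factory whose keyword-only block follows the 3.x order."""
    return {
        "project_name": project_name,
        "dataset_name": dataset_name,
        "table_name": table_name,
    }


def keyword_only_names(func):
    """List keyword-only parameter names in declaration order."""
    params = inspect.signature(func).parameters.values()
    return [p.name for p in params if p.kind is inspect.Parameter.KEYWORD_ONLY]


# Call-site order does not matter for keyword arguments.
old_order = from_example([], table_name="t", dataset_name="d", project_name="p")
new_order = from_example([], project_name="p", dataset_name="d", table_name="t")

# Introspection, however, reflects the declared order.
declared = keyword_only_names(from_example)
```

Tooling that snapshots signatures (API diff checks, doc generators) is where the reordering is likely to become visible.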

Common migration patterns¶

# 2.x
table = tlc.Table.from_pandas(df, schema, "my_table", "my_dataset", "my_project")
tlc.collect_metrics(table, mc, model, constants={"epoch": 1})
run = tlc.Run.from_names("my_run", "my_project")

# 3.x
table = tlc.Table.from_pandas(
    df,
    schema=schema,
    project_name="my_project",
    dataset_name="my_dataset",
    table_name="my_table",
)
tlc.collect_metrics(table, mc, predictor=model, constants={"epoch": 1})
run = tlc.Run.from_names(project_name="my_project", run_name="my_run")

Behavioral changes¶

This section covers changes that may import and run cleanly but behave differently at runtime. Read each subsection even if your import passes don’t flag anything, so that existing code does not silently change in ways you do not expect.

Default `table_name` changed from `"table"` to `"initial"`¶

When no table_name is passed to Table.from_*, Table.from_names, Table.copy, TableWriter, or ProjectLayout.table_url, the resulting URL is now .../tables/initial/ instead of .../tables/table/.

Pre-3.0 projects have their data at .../tables/table/. Reading them without specifying a table_name — for example tlc.Table.from_names(project_name="p", dataset_name="d") — will now resolve to a non-existent .../tables/initial/ path. Round-trip patterns like tlc.Table.from_dict(...).latest() that rely on if_exists="reuse" will also silently create a new initial/ table instead of reusing the old table/ one. When 3LC detects a legacy .../tables/table/ alongside a new-default target, it emits a warning pointing this out. The warning fires for if_exists values reuse, rename, and raise, but not overwrite (which is an explicit “write regardless” intent).

Migration: For pre-3.0 projects, pass table_name="table" explicitly whenever you read an existing table that was created without a table_name, so that the old default name keeps resolving.
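
For projects stored on a local filesystem, the choice can be automated. The helper below is a minimal sketch under that assumption: pick_default_table_name is a hypothetical name, and it only inspects a directory that holds one sub-folder per table (the .../tables/ level described above). It does not call any 3LC API.

```python
import tempfile
from pathlib import Path

LEGACY_DEFAULT = "table"   # 2.x default table name
NEW_DEFAULT = "initial"    # 3.x default table name


def pick_default_table_name(tables_dir):
    """Return the table name to pass explicitly when the caller supplied none.

    Assumes tables_dir is a local directory with one sub-folder per table.
    """
    tables_dir = Path(tables_dir)
    has_legacy = (tables_dir / LEGACY_DEFAULT).is_dir()
    has_new = (tables_dir / NEW_DEFAULT).is_dir()
    if has_legacy and not has_new:
        return LEGACY_DEFAULT
    return NEW_DEFAULT


# Demonstration on a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / LEGACY_DEFAULT).mkdir()
    legacy_choice = pick_default_table_name(tmp)   # only the 2.x folder exists
    (Path(tmp) / NEW_DEFAULT).mkdir()
    mixed_choice = pick_default_table_name(tmp)    # both folders exist
```

The result would then be passed as table_name= to the reading call. When both folders exist the helper prefers the 3.x default, which is a design choice of this sketch, not library behavior.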

`Session.run_url` is now a `Url`¶

In 2.x, Session.run_url was stored as a str. In 3.x it is stored as a Url object, matching the return type of Session.initialize_run.

Migration: If you were comparing or concatenating session.run_url as a string, call .to_str() on it, or update the code to work with Url directly. Reads that pass it into APIs accepting Url | str need no change.
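
Code that must accept both the 2.x string and the 3.x Url can normalize at the boundary. This is a sketch that relies only on the .to_str() method named above; url_as_str and FakeUrl are hypothetical names used for illustration.

```python
def url_as_str(value):
    """Return a plain string for either a str or a Url-like object."""
    if isinstance(value, str):
        return value
    return value.to_str()


class FakeUrl:
    """Hypothetical stand-in for a Url, used only to exercise the helper."""

    def __init__(self, text):
        self._text = text

    def to_str(self):
        return self._text


# String concatenation works again once the value is normalized.
log_line = "run: " + url_as_str(FakeUrl("s3://bucket/runs/r1"))
```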

`MetricsTableWriter.finalize()` now updates the Run automatically¶

In 2.x, after calling MetricsTableWriter.finalize(), you had to manually update the run with the written metrics:

# 2.x
metrics_writer = MetricsTableWriter(run_url=run.url, ...)
metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
metrics_table = metrics_writer.finalize()

# Manual step required to associate the metrics table with the run
run.update_metrics(metrics_writer.get_written_metrics_infos())

In 3.x, finalize() automatically updates the run. The recommended pattern is to use MetricsTableWriter as a context manager, which calls finalize() on exit:

# 3.x
with MetricsTableWriter(run_url=run.url, ...) as metrics_writer:
    metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
# The run is automatically updated on exit.

finalize() can only be called once; calling it again raises RuntimeError.

Migration: Remove manual calls to run.update_metrics(metrics_writer.get_written_metrics_infos()) after finalize().
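
One consequence of the two rules above is worth spelling out: since the context manager finalizes on exit and finalize() may run only once, an explicit finalize() after the with-block fails. The sketch below reproduces that contract with a hypothetical stand-in class (FakeMetricsWriter is not the real writer and does no I/O):

```python
class FakeMetricsWriter:
    """Hypothetical stand-in that mimics the finalize-once contract."""

    def __init__(self):
        self.rows = []
        self.finalized = False

    def add_batch(self, batch):
        self.rows.append(batch)

    def finalize(self):
        if self.finalized:
            raise RuntimeError("finalize() was already called")
        self.finalized = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.finalize()   # finalize on exit, as described above
        return False      # do not swallow exceptions


with FakeMetricsWriter() as writer:
    writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})

try:
    writer.finalize()     # second call: the with-block already finalized
    second_call_error = None
except RuntimeError as exc:
    second_call_error = str(exc)
```

When porting 2.x code to the context-manager form, remove any trailing finalize() call together with the manual run update.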

`Url.read()` and `Url.write()` removed¶

The generic Url.read() and Url.write() methods have been removed. Use the type-specific methods instead:

# 2.x
content = url.read()             # bytes (default mode="b")
content = url.read(mode="t")     # str
url.write(b"data")
url.write("text", mode="t")
url.write(content, if_exists="raise")

# 3.x
content = url.read_bytes()
content = url.read_text(encoding="utf-8")
url.write_bytes(b"data", if_exists=...)
url.write_text("text", if_exists=...)

The if_exists argument is supported by both write_bytes and write_text (values: "overwrite", "rename", "raise").
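
If many call sites still use the 2.x mode= style, a thin shim can bridge the gap during migration. This is a sketch, not a 3LC API: read_compat, write_compat, and InMemoryUrl are hypothetical names, and the shim relies only on the four 3.x method names shown above.

```python
def read_compat(url, mode="b", encoding="utf-8"):
    """Mimic 2.x url.read(mode=...) on top of the 3.x methods."""
    if mode == "b":
        return url.read_bytes()
    if mode == "t":
        return url.read_text(encoding=encoding)
    raise ValueError(f"unsupported mode: {mode!r}")


def write_compat(url, content, mode="b", **kwargs):
    """Mimic 2.x url.write(content, mode=...); extra kwargs such as if_exists pass through."""
    if mode == "b":
        return url.write_bytes(content, **kwargs)
    if mode == "t":
        return url.write_text(content, **kwargs)
    raise ValueError(f"unsupported mode: {mode!r}")


class InMemoryUrl:
    """Hypothetical stand-in exposing the four 3.x method names."""

    def __init__(self):
        self.data = b""

    def read_bytes(self):
        return self.data

    def read_text(self, encoding="utf-8"):
        return self.data.decode(encoding)

    def write_bytes(self, content, *, if_exists="overwrite"):
        self.data = bytes(content)

    def write_text(self, content, encoding="utf-8", *, if_exists="overwrite"):
        self.data = content.encode(encoding)


url = InMemoryUrl()
write_compat(url, "text", mode="t", if_exists="overwrite")
text_back = read_compat(url, mode="t")
bytes_back = read_compat(url)
```

A shim like this is best treated as temporary; calling the type-specific methods directly keeps the return type obvious at each call site.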

Sample view always returns a dict¶

Container sample types ("tuple", "list", "box", "horizontal_tuple", "horizontal_list") and the is_leaf flag on SampleType were removed. The sample view of any composite table is now a dict — one entry per visible column — instead of a tuple, list, or single value.

# 2.x — schema with sample_type="tuple"
image, label = table[0]

# 3.x — sample view is always a dict
sample = table[0]
image, label = sample["image"], sample["label"]

Tables persisted in 2.x with a container sample_type continue to load: the resolver coerces the legacy name to identity and emits a one-time warning per name. No data migration is required, but reader code that destructured the sample as a tuple needs to be updated. If you constructed a schema with sample_type="tuple" / "list" / "box" to control the sample shape, drop the sample_type argument; the structural part of the schema (values=...) is unchanged. Schema.from_schema_like applied to a tuple of schemas no longer attaches sample_type="tuple" to the resulting composite — it produces a dict-shaped schema with value_i keys (or display names where provided).
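
Downstream code that still expects a tuple can be bridged with a small adapter. The sketch below is plain Python with a hypothetical helper name (as_tuple); the column names are examples only.

```python
def as_tuple(sample, keys):
    """Pick the named columns from a dict sample, in the given order."""
    missing = [key for key in keys if key not in sample]
    if missing:
        raise KeyError(f"sample has no column(s): {missing}")
    return tuple(sample[key] for key in keys)


# A dict-shaped sample as 3.x would return it (values are placeholders).
sample = {"image": "img-0", "label": 3, "weight": 1.0}
image, label = as_tuple(sample, ("image", "label"))
```

Naming the keys explicitly keeps the destructuring stable even if extra columns are added to the table later.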

`Table.map()` replaced by `Table.with_transform()`¶

In 2.x, per-sample transforms were mutable state on the Table instance: Table.map() registered functions applied to every sample on __getitem__, Table.map_collect_metrics() registered an alternative set used during metrics collection, the collection_mode() context manager (and collecting_metrics property) switched between the two, and clear_maps() reset both lists. All of these were deprecated in late 2.x and are removed in 3.x.

The replacement is Table.with_transform(), which returns a TableView — a lightweight map-style view that applies the transform on read, leaving the Table itself untouched. A Table always returns untransformed samples; each consumer builds its own view. The training-vs-metrics-collection duality disappears: where 2.x switched one stateful Table between two transform sets, 3.x uses two views over the same Table.

# 2.x — transforms stored on the Table; collect_metrics implicitly switched to the metrics set
table.map(train_transform)
table.map_collect_metrics(eval_transform)

train_loader = torch.utils.data.DataLoader(table, batch_size=32)  # applies train_transform
tlc.collect_metrics(table, collectors, model)                     # applies eval_transform

# 3.x — one view per purpose; no hidden state, no mode switching
train_view = table.with_transform(train_transform)
train_collection_view = table.with_transform(train_transform)
eval_collection_view = table.with_transform(eval_transform)

train_loader = torch.utils.data.DataLoader(train_view, batch_size=32)
tlc.collect_metrics(train_collection_view, collectors, predictor=model)
tlc.collect_metrics(eval_collection_view, collectors, predictor=model)

Notes on the new mechanism:

TableView is not a Table: it has no schema, no persistence, and no object-registry identity. It implements the MapDataset protocol (__len__, __getitem__) accepted by tlc.collect_metrics and any torch.utils.data.DataLoader. Its url, name, project_name, and dataset_name forward to the underlying Table, so metrics collected through a view are linked back to the source table.
Views compose: view.with_transform(g) chains transforms. view.source resolves to the root Table regardless of chain depth — use it where a real Table is required, e.g. sampler construction (create_sampler(view.source, ...)).
Each with_transform() call returns a fresh TableView instance. Hoist the view into a variable when you need a stable reference across calls — in particular when reusing a DataLoader with persistent_workers=True, where a fresh view per call defeats worker reuse.
The transform must be picklable (a top-level function or importable callable, not a lambda or local closure) when the view is consumed by a DataLoader with num_workers > 0.
tlc.collect_metrics now accepts any MapDataset, not just Table / TableView. For a custom map-style dataset (e.g. a torch dataset), pass foreign_table_url= to declare which Table the metrics belong to; for a Table or TableView the URL is derived automatically, and combining the two raises ValueError.
Table.from_torch_dataset no longer re-attaches a VisionDataset’s transform / target_transform / transforms as map functions on the resulting Table — they are stripped before serialization and not applied on read. Re-attach them explicitly with with_transform().
Revision navigation no longer carries transforms along: 2.x copied a table’s map functions onto the table returned by latest() / revision(); in 3.x there is no per-table transform state to carry — wrap the returned revision in a new view.
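
The picklability requirement in the notes above can be checked before a DataLoader is ever constructed. This sketch uses only the standard library; scale_label and is_picklable are hypothetical names, and the sample layout is an example.

```python
import pickle
from functools import partial


def scale_label(sample, factor=1):
    """Top-level transform: it can be referenced by name from another process."""
    return {**sample, "label": sample["label"] * factor}


def is_picklable(obj):
    """Return True if pickle.dumps(obj) succeeds."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# functools.partial keeps a transform configurable without resorting to a lambda.
double_label = partial(scale_label, factor=2)

# A lambda has no importable name, so the standard pickler rejects it.
inline = lambda sample: sample
lambda_ok = is_picklable(inline)
```

Running a check like is_picklable on the transform in a unit test tends to surface the problem earlier than a worker-process traceback would.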

`reduce_embeddings` no longer accepts a list of tables¶

tlc.reduction.reduce_embeddings previously accepted either a single Table or a list[Table], returning a Table in the first case and a dict[Url, Url] in the second (with a DeprecationWarning on the list form). It now accepts a single Table and always returns a Table:

# 2.x
url_mapping = tlc.reduction.reduce_embeddings([table_a, table_b], method="umap")
reduced_a = tlc.Table.from_url(url_mapping[table_a.url])

# 3.x
reduced_a = tlc.reduction.reduce_embeddings(table_a, method="umap")
reduced_b = tlc.reduction.reduce_embeddings(table_b, method="umap")

The other multi-table reduction helpers (reduce_embeddings_per_dataset, reduce_embeddings_by_foreign_table_url, reduce_embeddings_with_producer_consumer) are unchanged and still take list[Table].
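
Code that relied on the returned mapping can rebuild one with a loop. The helper below is a sketch: reduce_each is a hypothetical name, and the reduction function is injected so the example runs without 3LC installed. In real code reduce_fn would be tlc.reduction.reduce_embeddings. Note that the values are the reduced tables themselves, not URLs as in the 2.x list form.

```python
def reduce_each(tables, reduce_fn, **kwargs):
    """Call reduce_fn once per table and key the results by the source table's url."""
    return {table.url: reduce_fn(table, **kwargs) for table in tables}


class FakeTable:
    """Hypothetical stand-in with just a url attribute."""

    def __init__(self, url):
        self.url = url


def fake_reduce(table, method):
    """Hypothetical stand-in for a single-table reduction call."""
    return FakeTable(f"{table.url}_{method}")


mapping = reduce_each([FakeTable("a"), FakeTable("b")], fake_reduce, method="umap")
```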

`ClassificationMetricsCollector` removed¶

ClassificationMetricsCollector has been removed. It bundled four metrics (loss, predicted, accuracy, confidence) behind a rigid input contract — batch had to be a (samples, labels) tuple and the model output had to be a raw logits tensor — which made it inflexible for HF-style models or models returning anything other than torch.Tensor. The same metrics are easy to compute with FunctionalMetricsCollector:

import torch
import torch.nn.functional as F
import tlc

def classification_metrics_fn(batch, predictor_output: tlc.PredictorOutput) -> dict:
    _, labels = batch
    predictions = predictor_output.forward
    if labels.dim() == 2 and labels.shape[1] > 1:
        labels = torch.argmax(labels, dim=1)
    softmax_out = F.softmax(predictions, dim=1)
    predicted = torch.argmax(predictions, dim=1)
    confidence = torch.gather(softmax_out, 1, predicted.unsqueeze(1)).squeeze(1)
    accuracy = predicted.eq(labels).float()
    loss = F.cross_entropy(predictions, labels, reduction="none")
    return {
        "loss": loss.detach().cpu().numpy(),
        "predicted": predicted.detach().cpu().numpy(),
        "accuracy": accuracy.detach().cpu().numpy(),
        "confidence": confidence.detach().cpu().numpy(),
    }

schemas = {
    "predicted": tlc.schemas.CategoricalLabelSchema(
        display_name="predicted label",
        classes=class_names,
    ),
}

collector = tlc.metrics.FunctionalMetricsCollector(
    collection_fn=classification_metrics_fn,
    schema=schemas,
)

`TableWriter` auto-detects sample-form vs row-form inputs¶

TableWriter now auto-detects sample-form vs row-form inputs per value (previously all inputs were treated as sample-form). Both work without a mode flag:

# Sample-form: a PIL.Image is serialized to a file and the URL is stored
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": pil_image})

# Row-form: a URL string is stored directly (relativized against the table)
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": "path/to/image.png"})

If you have custom SampleType subclasses, make sure their accepts() implementation returns True only for sample-form inputs — it is what the writer uses to distinguish the two.

`Table.latest()` — `wait_for_rescan` and `use_new_columns` removed¶

Table.latest() no longer accepts wait_for_rescan. The blocking/non-blocking choice is now expressed through timeout:

2.x latest(wait_for_rescan=True) → 3.x latest() (defaults to timeout=30.0)
2.x latest(wait_for_rescan=False) → 3.x latest(timeout=0) (non-blocking; in-process fast-path only)
Pass timeout=None to block indefinitely.

use_new_columns is also removed. With dict-based samples in 3.x, extra keys no longer disrupt downstream consumers; columns that should not appear in samples can be marked at the column level via sample_type="hidden" (or any sample type with is_included_in_sample = False).
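
Where the old flag is threaded through several layers of configuration, the mapping above can be captured in one place. A minimal sketch, with latest_kwargs as a hypothetical helper name:

```python
def latest_kwargs(wait_for_rescan):
    """Translate the removed 2.x flag into 3.x keyword arguments for latest()."""
    if wait_for_rescan:
        return {}             # keep the 3.x default (timeout=30.0)
    return {"timeout": 0}     # non-blocking


blocking = latest_kwargs(True)
non_blocking = latest_kwargs(False)
```

A call site would then read table.latest(**latest_kwargs(flag)) until the flag itself can be retired.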

`UrlAdapterRegistry` — `default_value` removed; async signatures cleaned up¶

UrlAdapterRegistry read methods used to accept a default_value to return when no adapter could be found. In 3.x, Url validates schemes at construction time, so the “no adapter” code path is unreachable for valid URLs. The read methods now raise ValueError if no adapter is found; the getter methods simply return None:

# 3.x
content = UrlAdapterRegistry.read_string_content_from_url(url)
data = UrlAdapterRegistry.read_binary_content_from_url(url)
data = await UrlAdapterRegistry.read_binary_content_from_url_async(url)
adapter = UrlAdapterRegistry.get_url_adapter_for_url(url)        # returns None if not found
adapter = UrlAdapterRegistry.get_url_adapter_for_scheme(scheme)  # returns None if not found

All *_async methods on ObjectRegistry, UrlAdapterRegistry, and the URL adapters themselves are now proper async methods — in 2.x they were sync methods that returned concurrent.futures.Future. Replace future = X.foo_async(...); future.result() with await X.foo_async(...):

# 2.x
future = ObjectRegistry.delete_object_from_url_async(url); future.result()

# 3.x
await ObjectRegistry.delete_object_from_url_async(url)

Async convenience methods are now available directly on UrlAdapterRegistry (read_binary_content_from_url_async, write_binary_content_to_url_async, delete_url_async) — adapter lookup is handled automatically. If you previously looked up an adapter just to call async methods, use the registry methods directly for cleaner code.
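
Synchronous callers that used future.result() have no event loop to await in. One option is asyncio.run, sketched below with a hypothetical coroutine (fake_delete_async stands in for any 3.x *_async method):

```python
import asyncio


async def fake_delete_async(url):
    """Hypothetical stand-in for a 3.x *_async method (a real coroutine)."""
    await asyncio.sleep(0)
    return f"deleted {url}"


def delete_blocking(url):
    """Synchronous wrapper: roughly what 2.x `future.result()` provided."""
    return asyncio.run(fake_delete_async(url))


result = delete_blocking("file:///tmp/example")
```

asyncio.run starts a new event loop and cannot be called from code that is already running inside one; in that situation, await the coroutine directly.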

Schema and Sample Type redesign¶

The schema system has been redesigned. The old SampleType class hierarchy has been replaced with a new SampleType base class, a SampleTypeRegistry for pluggable registration, and convenience schemas that configure storage and transforms in one step. Schemas now directly control how data is transformed and stored.

Name-level changes (registry strings, sample_from_row → from_row, etc.) are in Sample-type registry strings and the renames table. This section covers the API redesign that needs more than search-and-replace.

Convenience schemas — explicit parameters¶

All built-in convenience schemas now have explicit parameters — no more **kwargs or **schema_kwargs. IDE autocompletion shows every available parameter, and typos are caught at the call site.

`shape` parameter on scalar schemas¶

Scalar schemas now accept a shape parameter for arrays of any dimensionality. The previous *ListSchema convenience classes have been folded into this parameter; only Float32ListSchema and CategoricalLabelListSchema survive as thin wrappers (they are common enough that a dedicated class reads more clearly than shape=(-1,)).

Convenience class	Equivalent primitive
`Float32ListSchema(list_size=10)`	`Float32Schema(shape=10)`
`Float32ListSchema()`	`Float32Schema(shape=(-1,))`
`CategoricalLabelListSchema(classes=)`	`CategoricalLabelSchema(classes=, shape=(-1,))`

Use -1 for variable-size dimensions: Float32Schema(shape=(-1, -1)) for a variable 2D array.
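
To make the -1 convention concrete, the sketch below checks a concrete array shape against a shape spec. It reflects one reading of the rules in this section (an int is treated as a one-element tuple, and -1 matches any size). The function names are hypothetical, and this is not the library's own validation code.

```python
def normalize_shape(shape):
    """Accept an int or a sequence of ints; always return a tuple."""
    return (shape,) if isinstance(shape, int) else tuple(shape)


def shape_matches(actual, spec):
    """Check a concrete shape against a spec where -1 means 'any size'."""
    actual, spec = normalize_shape(actual), normalize_shape(spec)
    if len(actual) != len(spec):
        return False
    return all(s == -1 or a == s for a, s in zip(actual, spec))


fixed_ok = shape_matches((10,), 10)            # fixed-size list of 10
variable_ok = shape_matches((7,), (-1,))       # variable-length list
rank_mismatch = shape_matches((3, 4), (-1,))   # 2D data against a 1D spec
variable_2d = shape_matches((3, 4), (-1, -1))  # variable 2D array
```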

Float32ListSchema no longer accepts a number_role parameter. For a list of fractions / confidences / probabilities, use the baked schema with a list shape: ConfidenceSchema(shape=(-1,)) instead of Float32ListSchema(number_role="fraction/confidence"). The same pattern applies to other roled list columns — pick the baked schema (EmbeddingSchema, FractionSchema, ConfidenceSchema, ProbabilitySchema, IoUSchema, CategoricalLabelSchema, …) and pass shape=.

`ImageSchema` — one class, four modes¶

ImageSchema is new in 3.x and covers every image-column use case via its sample_type kwarg. It replaces the 2.x ImageUrlSchema class and consolidates the image-format selection that 2.x performed via the sample-type dict form {"name": "pil_image", "format": ...} (now removed; see Sample-type registry strings).

# 2.x — separate class for URL columns; format chosen via sample-type dict form
ImageUrlSchema()                                                          # URL passthrough
Schema(value=ImageUrlStringValue(), sample_type="pil_image")              # PIL → file (PNG)
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "jpeg"})
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "webp"})

# 3.x — one class, four modes
ImageSchema(sample_type="url")                 # URL passthrough — pre-existing files
ImageSchema()                                  # PIL → file (PNG, default)
ImageSchema(sample_type="pil_jpeg")            # PIL → file (JPEG)
ImageSchema(sample_type="pil_webp")            # PIL → file (WEBP)

All four modes serialize to the same wire format (ImageUrlStringValue); only the Python-side behavior differs. sample_type=None is accepted as an alias for "url".
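
When many columns need converting, the 2.x-to-3.x mapping above can be written down once. The helper below is a sketch with a hypothetical name (image_schema_kwargs). It returns the keyword arguments to pass to ImageSchema(...) and does not import 3LC. Treating an explicit "png" like the default is an assumption of this sketch.

```python
def image_schema_kwargs(pil_format=None, url_passthrough=False):
    """Map a 2.x image column setup to keyword arguments for ImageSchema(...)."""
    if url_passthrough:
        return {"sample_type": "url"}        # replaces ImageUrlSchema()
    if pil_format in (None, "png"):
        return {}                            # ImageSchema() already defaults to PNG
    if pil_format in ("jpeg", "webp"):
        return {"sample_type": f"pil_{pil_format}"}
    raise ValueError(f"no 3.x mode listed for format {pil_format!r}")


url_kwargs = image_schema_kwargs(url_passthrough=True)
png_kwargs = image_schema_kwargs()
jpeg_kwargs = image_schema_kwargs("jpeg")
webp_kwargs = image_schema_kwargs("webp")
```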

Common column attributes¶

All schemas now accept display_name, description, writable, default_visible, and default_value as explicit parameters:

# 2.x — had to use **kwargs, not discoverable
Float32Schema(display_name="Score", writable=False)  # worked but wasn't documented

# 3.x — explicit, IDE-discoverable
Float32Schema(display_name="Score", writable=False, default_visible=True, default_value=0.0)

display_importance is accepted by the base Schema and by the EpochSchema / IterationSchema system schemas, but not by the convenience scalar / categorical / image schemas. Construct the base Schema directly if you need a non-default display ordering on a convenience-shaped column.

Custom sample types¶

Custom SampleType subclasses should be migrated to the new SampleType base class, registered via @tlc.sample_types.register_sample_type. The method names changed: sample_from_row → from_row, row_from_sample → to_row.

# 2.x
class MyType(tlc.sample_types.SampleType):
    def sample_from_row(self, value):
        return custom_decode(value)
    def row_from_sample(self, value):
        return custom_encode(value)

# 3.x
@tlc.sample_types.register_sample_type("my_type")
class MyType(tlc.sample_types.SampleType):
    def from_row(self, value):
        return custom_decode(value)
    def to_row(self, value):
        return custom_encode(value)

New public APIs in `tlc.sample_types`¶

SampleType — base class for all sample types.
SampleTypeRegistry — registry for looking up and registering custom sample types.
register_sample_type — decorator for registering a SampleType subclass or a dataclass with to_row() / from_row() methods.
get_sample_types — list all registered sample type names.

CV and Annotation types¶

This section bundles the runtime dataclasses, schema-builders, and importer / exporter changes for computer-vision annotations. The unifying point: each annotation shape now owns both its runtime data and its column schema-builder under one dataclass in tlc.data_types.

One dataclass per shape, with a bound schema-builder¶

All CV annotation types now follow a consistent plural-base naming convention. The base name is always plural (e.g. BoundingBoxes2D), and the dataclass owns both runtime data and its column schema-builder (Dataclass.schema(...)):

Sample-type string	Dataclass	Schema-builder
`bounding_boxes_2d`	`BoundingBoxes2D`	`BoundingBoxes2D.schema(...)`
`bounding_boxes_3d`	`BoundingBoxes3D`	`BoundingBoxes3D.schema(...)`
`oriented_bounding_boxes_2d`	`OrientedBoundingBoxes2D`	`OrientedBoundingBoxes2D.schema(...)`
`oriented_bounding_boxes_3d`	`OrientedBoundingBoxes3D`	`OrientedBoundingBoxes3D.schema(...)`
`keypoints_2d`	`Keypoints2D`	`Keypoints2D.schema(...)`
`segmentation_polygons`	`SegmentationPolygons`	`SegmentationPolygons.schema(...)`
`segmentation_masks`	`SegmentationMasks`	`SegmentationMasks.schema(...)`

All dataclasses live under tlc.data_types. Constructor arguments are unchanged from their 2.x equivalents; only the class names differ (see Annotation dataclass renames).

# 2.x
from tlc.core.data_formats import Geometry2DInstances, OBB2DInstances

geom = Geometry2DInstances.create_empty(x_max=640.0, y_max=480.0)
obbs = OBB2DInstances.create_empty(x_max=640.0, y_max=480.0)

# 3.x
from tlc.data_types import Geometry2D, OrientedBoundingBoxes2D

geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0)
obbs = OrientedBoundingBoxes2D.create_empty(x_max=640.0, y_max=480.0)

BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, and SegmentationMasks are new in 3.x and have no 2.x dataclass counterpart. The 2.x sample types InstanceSegmentationPolygons / InstanceSegmentationMasks (from tlc.client.sample_type) were dropped along with the rest of that module.

Schema-builder replaces the 2.x `*Schema` classes¶

The 2.x tlc.SegmentationSchema (which selected storage form via its sample_type argument) has been split into two dataclass-bound builders:

2.x	3.x
`tlc.SegmentationSchema(sample_type="instance_segmentation_polygons", ...)`	`SegmentationPolygons.schema(...)`
`tlc.SegmentationSchema(sample_type="instance_segmentation_masks", ...)`	`SegmentationMasks.schema(...)`

# 2.x
schema = {"seg": tlc.SegmentationSchema(classes=["cat", "dog"])}

# 3.x
from tlc.data_types import SegmentationPolygons
schema = {"seg": SegmentationPolygons.schema(classes=["cat", "dog"])}

The full list of removed *Schema classes (and their replacements) is in Built-in schemas — reorganized.

Bounding box format migration¶

3.x replaces the flat bb_list dict format with the geometry-based BoundingBoxes2D dataclass (tlc.data_types.BoundingBoxes2D).

Existing tables written by 2.x still load — their bounding-box columns are stored in the legacy bb_list form and are not rewritten on read.
New tables, importers, and metrics produce BoundingBoxes2D instances, and the built-in schema builder (BoundingBoxes2D.schema(...)) is what 3.x emits by default. The COCO and YOLO importers/exporters auto-detect the stored format and handle both transparently — but your code that reads or writes box columns does not get that for free. The old BoundingBoxListSchema has been removed from the public API.

So any code that touches a bounding-box column needs an audit, in one of two directions.

Writing. Stop building the {"image_width", "image_height", "bb_list": [...]} dict. Construct a BoundingBoxes2D directly (coordinates are absolute XYXY by default; pass bounding_box_format= / normalized= if yours differ):

from tlc.data_types import BoundingBoxes2D
import numpy as np

bb = BoundingBoxes2D(
    bounding_boxes=np.array([[10, 20, 100, 200]], dtype=np.float32),  # Nx4, XYXY
    labels=np.array([3], dtype=np.int32),
    confidences=np.array([0.9], dtype=np.float32),   # omit for ground truth
    x_max=640.0, y_max=480.0,                         # optional image bounds
)
# or build incrementally:
bb = BoundingBoxes2D.create_empty(image_width=640, image_height=480)
bb.add_instance(bounding_box=[10, 20, 100, 200], label=3, confidence=0.9)

Use BoundingBoxes2D.schema(classes=..., include_per_instance_confidence=...) for the column schema; the old BoundingBoxListSchema is gone.

Reading. 3.x does not coerce stored data: a legacy table returns its bounding-box column in the original bb_list dict form, and a 3.x table returns a BoundingBoxes2D. The format you get back mirrors how the table was written — there is no migration-on-read. Consuming code should standardize on BoundingBoxes2D and convert legacy rows explicitly via from_legacy_row, which auto-detects the coordinate convention (XYXY / XYWH / centered-XYWH, normalized or absolute) from the column’s schema number-roles:

bb_schema = table.rows_schema.values["bounding_boxes"]

row = table[idx]
raw = row["bounding_boxes"]
# Legacy tables yield the old dict; 3.x tables yield a BoundingBoxes2D.
# `from_legacy_row` takes the *column* value (the `bb_list` dict with its
# `image_width`/`image_height` keys), i.e. `raw` here — not the whole row.
bb = BoundingBoxes2D.from_legacy_row(raw, schema=bb_schema) if isinstance(raw, dict) else raw

bb.bounding_boxes   # Nx4 array of [x_min, y_min, x_max, y_max]
bb.labels           # N category indices
bb.confidences      # N floats (predictions only)

Once converted, always read by attribute (.bounding_boxes, .labels, .confidences) — never by dict key.
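
For readers unfamiliar with the coordinate conventions named above, the sketch below converts a single box from each convention to absolute XYXY. It is an illustration of the arithmetic only, not the implementation of from_legacy_row; the function name and the format labels ("xywh", "cxcywh") are choices made for this example.

```python
def to_xyxy(box, box_format="xyxy", image_size=None):
    """Convert one box to absolute [x_min, y_min, x_max, y_max].

    box_format: "xyxy", "xywh" (top-left corner plus size),
                or "cxcywh" (center plus size).
    image_size: (width, height); pass it when coordinates are normalized to 0..1.
    """
    a, b, c, d = (float(v) for v in box)
    if box_format == "xywh":
        a, b, c, d = a, b, a + c, b + d
    elif box_format == "cxcywh":
        a, b, c, d = a - c / 2, b - d / 2, a + c / 2, b + d / 2
    elif box_format != "xyxy":
        raise ValueError(f"unknown box format: {box_format!r}")
    if image_size is not None:
        width, height = image_size
        a, b, c, d = a * width, b * height, c * width, d * height
    return [a, b, c, d]


from_xywh = to_xyxy([10, 20, 90, 180], "xywh")
from_center = to_xyxy([55, 110, 90, 180], "cxcywh")
from_normalized = to_xyxy([0.5, 0.5, 1.0, 1.0], image_size=(640, 480))
```

All three inputs describe boxes in the same 640×480 image; the first two resolve to the same XYXY box.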

Bounds are now optional on geometry dataclasses¶

The geometry dataclasses (Geometry2D, Geometry3D, Keypoints2D, BoundingBoxes2D, OrientedBoundingBoxes2D, etc.) now have optional bounds. In 2.x bounds were required and defaulted to 0.0; from_row() raised ValueError if bounds were missing. In 3.x bounds default to None and from_row() accepts rows without bounds. When a max bound is specified but the corresponding min is not, the min defaults to 0.0.

# 3.x
geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0)  # x_min / y_min default to 0.0
geom = Geometry2D.create_empty()                          # All bounds are None
geom = Geometry2D.from_row(row)                           # Works even if bounds are missing
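
The defaulting rule for a single axis can be stated in a few lines. This is a sketch of the rule as described in this section, with a hypothetical function name, not code taken from the library:

```python
def resolve_axis_bounds(minimum=None, maximum=None):
    """Apply the 3.x rule for one axis: a missing min becomes 0.0 only if max is set."""
    if maximum is not None and minimum is None:
        minimum = 0.0
    return minimum, maximum


unbounded = resolve_axis_bounds()                  # nothing specified
max_only = resolve_axis_bounds(maximum=640.0)      # min falls back to 0.0
explicit = resolve_axis_bounds(minimum=5.0, maximum=10.0)
```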

COCO importer — `include_iscrowd` and `per_instance_schemas` replaced by `per_instance_extras`¶

The include_iscrowd boolean and per_instance_schemas (pose-only) parameters on the COCO importer have been removed. The new per_instance_extras parameter covers both — it works for all tasks (detect, segment, pose) and supports both auto-inferred and explicit schemas:

# Auto-infer schemas from data
table = tlc.Table.from_coco(
    ...,
    per_instance_extras=["iscrowd", "my_custom_field"],
)

# Or provide explicit schemas
table = tlc.Table.from_coco(
    ...,
    per_instance_extras={
        "iscrowd": tlc.schemas.Int32Schema(),
        "score": tlc.schemas.Float32Schema(),
    },
)

COCO importer — new `per_image_extras` parameter¶

The COCO importer now accepts a per_image_extras parameter for preserving image-level custom fields as top-level table columns:

table = tlc.Table.from_coco(
    ...,
    per_image_extras=["date_captured", "flickr_url"],
)

The CocoExporter also accepts per_image_extras for round-trip support, writing specified table columns back into COCO image entries on export.

`CocoExporter.serialize(include_segmentation=...)` removed¶

The deprecated include_segmentation parameter on CocoExporter.serialize() (and the corresponding table.export(..., include_segmentation=...) kwarg) has been removed. Detection tables created in 3.x no longer carry segmentation data, so the parameter is no longer meaningful. The exporter no longer copies legacy bb_list-row segmentation into the exported COCO file either; old-style detection tables export with empty segmentation lists, matching the new-style BB path. Use task="segment" to work with segmentation data.

Integrations¶

Hugging Face¶

The transformers-bound Trainer integration is split out from the datasets-bound table machinery. Importing tlc.integration.hugging_face no longer requires datasets.

`TLCTrainer` removed¶

The deprecated TLCTrainer class has been removed. The 3LC integration now provides a single Trainer class that supports 3LC Tables and metrics collection.

# 2.x
from tlc.integration.hugging_face import TLCTrainer
trainer = TLCTrainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()

# 3.x
from tlc.integration.hugging_face.trainer import Trainer
trainer = Trainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()

The Trainer class is no longer re-exported from the top-level tlc or tlc.integration packages — import from tlc.integration.hugging_face.trainer.
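
Code that has to run against both 2.x and 3.x during a transition period could resolve the class at import time. The helper below is a generic sketch built on importlib; import_first is a hypothetical name, and the candidate list simply repeats the two import locations shown above.

```python
import importlib


def import_first(candidates):
    """Return the first attribute importable from a list of (module, name) pairs."""
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))


# Candidate locations taken from this guide; resolve lazily where needed, e.g.
# Trainer = import_first(TRAINER_CANDIDATES)
TRAINER_CANDIDATES = [
    ("tlc.integration.hugging_face.trainer", "Trainer"),   # 3.x location
    ("tlc.integration.hugging_face", "TLCTrainer"),        # 2.x name
]

# Demonstration with standard-library modules only.
sqrt = import_first([("no_such_module_xyz", "missing"), ("math", "sqrt")])
```

Once the migration is complete, a plain 3.x import is clearer than the fallback.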

HF Table classes are private — use `tlc.Table` factories¶

The 2.x TableFromHuggingFaceHub, TableFromHuggingFaceDataset, and the TableFromHuggingFace alias are no longer part of the public API. Construct HF-backed tables via:

tlc.Table.from_hugging_face_hub() — load from the Hugging Face Hub
tlc.Table.from_hugging_face_dataset() — wrap an in-memory datasets.Dataset

Top-level imports like from tlc.integration.hugging_face import TableFromHuggingFaceHub and the previously documented tlc.integration.hugging_face.table_from_hugging_face* module paths are no longer supported.

Detectron2¶

Importing tlc no longer eagerly tries to import detectron2; importing tlc.integration.detectron2 requires the optional detectron2 dependency.

`BoundingBoxMetricsCollector` lives in `tlc.integration.detectron2`¶

BoundingBoxMetricsCollector now lives exclusively in tlc.integration.detectron2. In practice this collector has always been used with Detectron2 — it relies on Detectron2-shaped ground-truth and prediction formats, and the COCO-style annotation dicts it consumes (CocoAnnotation, CocoGroundTruth, CocoPrediction) are produced by the Detectron2 hooks. Keeping it in the general-purpose metrics_collectors namespace implied a framework-agnostic collector that it never really was.

# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector

Detectron2 utilities no longer hoisted to `tlc.*`¶

Detectron2 metric collectors, hooks, and helpers were previously re-exported at the top-level tlc namespace via a wildcard import. They now live exclusively under tlc.integration.detectron2. Affected names: BoundingBoxMetricsCollector, CocoAnnotation, CocoGroundTruth, CocoPrediction, DetectronMetricsCollectionHook, MetricsCollectionHook, UmapReduceEmbeddingsHook, register_coco_instances. The 2.x all-caps spellings COCOAnnotation, COCOGroundTruth, COCOPrediction, and UMAPReduceEmbeddingsHook were hoisted alongside the PascalCase variants in 2.x and are also gone from the top level in 3.x (see Renames for the PascalCase replacements).

# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector

metrics_collector = BoundingBoxMetricsCollector(...)

`PIL.Image.LINEAR` shim removed¶

In 2.x, import tlc aliased PIL.Image.LINEAR = PIL.Image.BILINEAR as a back-compat shim for detectron2 ≤ v0.6, which referenced Image.LINEAR at import time (Pillow 10 removed it). In 3.x this patch is gone. If you are pinned to detectron2 ≤ v0.6, upgrade to a recent detectron2 commit, or apply the shim yourself before import detectron2:

from PIL import Image
if not hasattr(Image, "LINEAR"):
    Image.LINEAR = Image.BILINEAR

Legacy `bb_list` segmentation no longer copied into detectron2 annotations¶

register_coco_instances previously copied any segmentation field present alongside a legacy bb_list row into the detectron2 annotation dicts it produced. In 3.x this codepath has been removed; detection tables register with detectron2 without a segmentation field. Segmentation datasets should be registered as segmentation tables.

PyTorch Lightning — decorator removed¶

The @tlc.integration.pytorch_lightning.lightning_module class decorator has been removed in 3.x, along with the tlc.integration.pytorch_lightning package and the 3lc[lightning] install extra. 3LC integrates with Lightning using the public 3LC API and Lightning’s standard hooks — no decorator required.

For the migration pattern and end-to-end examples, see the PyTorch Lightning integration page. If you have a 2.x project that depends specifically on the decorator and the documented migration is not enough, please reach out to the 3LC team.
