Migrate from 2.x to 3.x¶
This guide describes the common patterns involved in migrating from version 2.x to version 3.x of the 3LC Python API.
3.x makes the public surface explicit, redesigns the schema / sample-type system, and groups CV annotation runtime + schema in one place per shape. Most user code needs only mechanical updates (import paths, keyword arguments); a smaller set of behavior changes need attention even if the names look unchanged.
Recommended migration procedure¶
The size of the change makes a structured pass safer than ad-hoc search-and-replace. The recommended order is:
Install changes first. Apply Dependency and install changes. pandas and torch are now optional.
Apply namespace and rename passes. Walk Namespace moves and Renames. These two passes account for the bulk of the diff in typical 2.x → 3.x migrations and are mostly mechanical.
Convert positional calls. Apply Keyword-only parameters. Any 2.x call that still passes layout fields positionally needs to become keyword form.
Audit silent behavior changes. Read Behavioral changes end to end: these are the things that don’t show up as ImportError or AttributeError, so an otherwise-clean migration can still ship a bug.
Convert CV columns. Apply Schema and Sample Type redesign and CV and Annotation types. Bounding boxes have moved from the flat bb_list format to the geometry-based BoundingBoxes2D dataclass; segmentation has split into SegmentationPolygons / SegmentationMasks.
Update integration code. Apply Integrations: Trainer import path, Detectron2 hoisting removal, PyTorch Lightning decorator removal.
Disambiguation hotspots¶
The following items are likely to be missed by a simple search-and-replace; verify them by hand:
root= parameter. Renamed to root_url= on Table.squash, ProjectHelper.register_project_url_alias, and the inherited AddressableObject.root_url property. Not renamed on Table.from_image_folder, where root refers to the image folder, not the project root.
Trainer. 2.x’s tlc.Trainer came from tlc.integration.hugging_face. There is no top-level tlc.Trainer in 3.x; import from tlc.integration.hugging_face.trainer.
BoundingBoxMetricsCollector. This is specific to detectron2 in 3.x. The 2.x tlc.BoundingBoxMetricsCollector path is gone; it lives only at tlc.integration.detectron2.BoundingBoxMetricsCollector.
tlc.Object. Both renamed and redefined. tlc.objects.Object is now the lightweight root of the hierarchy; use tlc.objects.AddressableObject to recover the 2.x semantics of tlc.Object (URL-bearing).
sample_type container forms. "tuple", "list", "box" were removed (with a load-time warning, not an error). Code that did image, label = table[0] needs a dict update even though imports look fine.
Don’t hallucinate replacements (LLMs)¶
If a symbol is not mentioned in this guide, treat it as intentionally removed or renamed via a documented
pattern. Do not invent new module paths — tlc.core.X is privatized to tlc._core.X only at the same
internal path; the public path is whatever sub-namespace is curated for that subsystem (see
Namespace moves).
Validation checklist¶
After the passes above:
Run ruff check . to catch removed names that turn into NameError after import.
Run mypy src/ to catch signature drift from kw-only changes and Url vs str mismatches.
Run the application end to end on a pre-3.0 project once. The first run will surface silent behavior changes (the table_name default in particular).
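Before running the linters, a quick textual scan can flag some of the 2.x spellings called out in this guide. The following is a minimal sketch and not part of 3LC: the helper name find_legacy_patterns and the pattern list are assumptions chosen for illustration, and the list is far from exhaustive.

```python
import re

# Illustrative patterns only (an assumption, not an official or complete list).
LEGACY_PATTERNS = {
    "tlc.core / tlc.client import path": re.compile(r"\btlc\.(core|client)\b"),
    ".scheme.value (Scheme is no longer an enum)": re.compile(r"\.scheme\.value\b"),
    "Url.create_* classmethod": re.compile(r"\bUrl\.create_\w+"),
    "latest(wait_for_rescan=...)": re.compile(r"\bwait_for_rescan\s*="),
}


def find_legacy_patterns(source: str) -> list[tuple[int, str]]:
    """Return (line_number, label) pairs for lines matching a legacy pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in LEGACY_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


example = (
    "import tlc\n"
    "x = url.scheme.value\n"
    "t = table.latest(wait_for_rescan=False)\n"
)
for lineno, label in find_legacy_patterns(example):
    print(f"line {lineno}: {label}")
```

A scan like this only finds textual matches; the silent behavior changes described later still need a manual read.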
Dependency and install changes¶
Getting a 2.x-equivalent install¶
Several dependencies that were required in 2.x, and therefore always installed alongside 3lc, have been made
optional in 3.x. This reduces the install footprint and import load time for workloads that do not need those
dependencies. To get the same set of packages as a 2.x install, install the extras that were required in 2.x:
pip install '3lc[pandas,torch]'
See Dependencies for the full list of available extras, and the
PyTorch installation notes for guidance on accelerator-specific torch wheels.
pandas is now an optional dependency¶
In 2.x, pandas was a hard runtime dependency of 3lc and was always installed alongside the package. In
3.x it has been moved to an optional extra to reduce the install footprint for users who do not need it.
pip install 3lc no longer pulls in pandas. To use the pandas-facing entry points, install the extra with
pip install '3lc[pandas]', or pip install pandas or equivalent.
The entry points that require pandas to be importable are:
Table.to_pandas: raises ImportError with installation instructions if pandas is not available
If your code already uses any of these, install the pandas extra and no code changes are required.
torch and torchvision are now optional dependencies¶
In 2.x, torch and torchvision were hard runtime dependencies of 3lc and import tlc would fail if either
was missing. In 3.x they have been moved to an optional [torch] extra so that torch-free workflows (Tables,
Runs, the Object Service, URL adapters) can run on a minimal install.
pip install 3lc no longer pulls in torch or torchvision. To use the torch-bound entry points, install the
extra with pip install '3lc[torch]'. The [huggingface] extra depends on [torch] transitively, so
pip install '3lc[huggingface]' continues to pull both in.
The entry points that require torch to be importable are:
Predictor, EmbeddingsMetricsCollector, and SegmentationMetricsCollector
The framework integrations under tlc.integration.detectron2, tlc.integration.hugging_face, and tlc.integration.super_gradients
Calling any of these without torch installed raises ImportError. The framework integration packages
require the framework itself as well — their ImportError points at the framework’s own install command
(pip install detectron2, pip install super-gradients, pip install datasets / transformers).
FunctionalMetricsCollector and the MetricsCollector base class do not require torch.
For accelerator-specific builds (CUDA, ROCm, MPS), follow the official PyTorch install instructions to pick the right index URL: https://pytorch.org/get-started/locally/.
Namespace moves¶
In 2.x, almost every public name was reachable directly at the top-level tlc.* namespace via wildcard
imports from tlc.client.* and tlc.core.*. In 3.x, the public surface has been curated and is declared
explicitly: each public module sets __all__ to the names it exports, following the convention from
PEP 8 — Public and Internal Interfaces.
Anything not listed in __all__, or sitting behind an underscore-prefixed package path, is private and
may change without notice.
tlc.* is now an explicit, curated set of names: the most commonly used types and free functions (Table, Run, Url, Schema, TableWriter, MetricsTableWriter, the init / active_run / log session helpers, collect_metrics, config).
Curated public sub-namespaces group related concerns: tlc.schemas, tlc.constants, tlc.metrics, tlc.helpers, tlc.export, tlc.reduction, tlc.data_types, tlc.url, tlc.integration, tlc.sample_types, tlc.objects, tlc.configuration. Each is equally public.
Implementation packages are private. tlc.core no longer exists as a public path; it has been renamed to tlc._core and is private. tlc.client has been removed entirely; its contents have been redistributed to the curated public sub-namespaces above. Underscore-prefixed names (tlc._core, tlc.schemas._foo, etc.) may move, rename, or be removed at any time.
tlcsaas has been renamed to _tlcsaas to reflect that it was always internal infrastructure for the 3LC client.
tlc.config is a bound shortcut for the live Configuration singleton (equivalent to tlc.configuration.Configuration.instance()). It is resolved lazily on first access, so import tlc does not force Configuration construction for callers that never touch config. The Configuration type itself now lives at tlc.configuration.Configuration rather than the top level; most code reaches it through the type of tlc.config and never needs to name it.
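The lazy resolution of tlc.config can be pictured with Python's module-level __getattr__ hook (PEP 562). This is only a sketch of the general technique; this guide does not state how 3LC implements it, and the module name fakepkg and the Configuration stand-in class below are hypothetical.

```python
import types


class Configuration:
    """Hypothetical stand-in for a costly singleton."""

    _instance = None
    constructed = 0  # counts how many times the singleton was built

    @classmethod
    def instance(cls):
        if cls._instance is None:
            cls._instance = cls()
            Configuration.constructed += 1
        return cls._instance


# Build a throwaway module to demonstrate the hook.
fakepkg = types.ModuleType("fakepkg")


def _module_getattr(name):
    # Called only when normal attribute lookup on the module fails.
    if name == "config":
        return Configuration.instance()
    raise AttributeError(f"module 'fakepkg' has no attribute {name!r}")


fakepkg.__getattr__ = _module_getattr

print("built before access:", Configuration.constructed)
cfg = fakepkg.config  # first access triggers construction
print("same object on later access:", cfg is fakepkg.config)
```

The design point is that importing the package costs nothing for callers that never touch the attribute.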
Where everything moved¶
If a name lived at tlc.X in 2.x and is not listed below, see
Removed from public API.
2.x location |
3.x location |
|---|---|
|
unchanged — |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
unchanged |
|
unchanged |
New names in the public sub-namespaces¶
The following are new in 3.x with no 2.x equivalent at any public path:
tlc.schemas: ConfidenceSchema, DatetimeStringSchema, EmbeddingSchema, Float64Schema, FractionSchema, ImageSchema (replaces 2.x ImageUrlSchema; see ImageSchema — one class, four modes), Int8Schema, Int16Schema, Int64Schema, IoUSchema, ProbabilitySchema, SemanticSegmentationSchema, Uint8Schema, Uint16Schema, Uint32Schema, Uint64Schema, UrlSchema.
tlc.url: IfExistsOption, list_url_adapters, list_url_schemes.
tlc.helpers: ColorHelper, AnnotationHelper, ProjectHelper, DateTimeHelper, ImageHelper, AnnotationColumn, AnnotationType (some have non-public 2.x counterparts under tlc.core.export.annotation_utils).
tlc.export: RowExporter, ExporterInfo, ExporterRegistry, ExporterSource, list_exporters, list_exporter_formats.
tlc.sample_types: SampleType, SampleTypeRegistry, register_sample_type, get_sample_types (see Schema and Sample Type redesign).
tlc.data_types: BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks (see CV and Annotation types).
tlc.constants prefix families¶
tlc.constants exposes a flat namespace — every constant is accessible directly at tlc.constants.X, with
the same name it had at tlc.X in 2.x. The names group into:
Prefix family |
Examples |
|---|---|
Column names (no prefix) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A handful of role constants that were defined but neither emitted by Python nor read by the dashboard have
been removed: NUMBER_ROLE_PIXEL_COUNT, NUMBER_ROLE_BB_HALF_SIZE_X / _Y (use _BB_SIZE_X / _Y),
NUMBER_ROLE_METRIC_STRING_INDEX, the bare NUMBER_ROLE_RGB_COMPONENT (use the channel-specific
_RED / _GREEN / _BLUE), STRING_ROLE_NONE (use the literal ""),
STRING_ROLE_INSTANCE_SEGMENTATION_URL.
Removed from public API¶
These were exposed at the top level in 2.x but are no longer addressable. Use the public surface instead, or
reach into tlc._core.* privately if you genuinely need to:
Session class: private. Use the free functions tlc.init, tlc.close, tlc.active_run, tlc.set_active_run, tlc.active_project_name (which were always the recommended API).
Filter criteria: tlc.FilterCriterion, tlc.FreeTextFilterCriterion, tlc.IntegerSetFilterCriterion, tlc.LogicalNotFilterCriterion, tlc.NumericRangeFilterCriterion, tlc.Region2DFilterCriterion, tlc.Region3DFilterCriterion, tlc.TextFilterCriterion, tlc.create_filter, tlc.create_optional_filter. The dashboard is the supported way to author filtered views in 3.x.
Registry classes and URL adapters: tlc.ObjectTypeRegistry, tlc.ObjectRegistry, tlc.ObjectReference, tlc.UrlAliasRegistry, tlc.UrlAdapterRegistry, tlc.AdapterInfo, tlc.AdapterSource, and the concrete URL adapter classes (tlc.S3UrlAdapter, tlc.GcsUrlAdapter, tlc.HttpUrlAdapter, tlc.FileUrlAdapter, tlc.AbfsUrlAdapter, tlc.ApiUrlAdapter). Adapters are pluggable via the tlc.url_adapters entry-point group; subclass tlc.url.UrlAdapter and decorate with @tlc.url.register_url_adapter.
TableFrom* constructor classes: private. Use the corresponding tlc.Table.from_* factory method instead (from_coco, from_csv, from_parquet, from_pandas, from_dict, from_torch_dataset, from_yolo_url, from_hugging_face_hub, from_hugging_face_dataset).
TableFromTFRecordSet: was a placeholder, never implemented.
Bulk-data URL helpers: tlc.bulk_data_url_context, tlc.increment_and_get_bulk_data_url, tlc.relativize_bulk_data_url, tlc.reset_bulk_data_url, tlc.set_bulk_data_url_prefix. Bulk-data URL accounting is internal in 3.x.
Table.create_sampler() (deprecated in 2.x): removed. Use the standalone factory functions in tlc.integration.torch.samplers: the general-purpose create_sampler(table, ...) dispatcher, or the explicit create_weighted_sampler, create_random_sampler, create_sequential_sampler, and create_repeat_by_weight_sampler.
Table.get_column(): use Table.get_column_as_pyarrow_array() instead.
Table.row_schema: use Table.rows_schema instead. The two properties were near-duplicates; the remaining one returns the table’s live schema reference and should be treated as read-only.
Table.from_yolo(...) (the deprecated YAML-parsing factory): use Table.from_yolo_url(...) directly for a single image folder or text file. For a YOLO dataset YAML file with splits, install 3lc-ultralytics and use tlc_ultralytics.create_tables_from_yaml_file to get one Table per split:
from tlc_ultralytics import create_tables_from_yaml_file
tables = create_tables_from_yaml_file(
    dataset="/path/to/my/dataset.yaml",
    task="detect",
    project_name="My YOLO Project",
)
The same keyword arguments for pose (points, oks_sigmas, …) can be passed to configure the pose task.
tlc.client.sample_type module: removed. Sample types are referenced by string name through the SampleTypeRegistry (see Schema and Sample Type redesign).
tlc.helpers.JsonHelper: internal JSON-serialization plumbing, now private. Its only methods (to_minimal_dict, sort_by_rank) operate on internal object/schema structures and were not part of any user workflow. The class still exists at the private path tlc.helpers.json_helper.JsonHelper. Likewise, the internal-only SchemaHelper.pyarrow_list_to_tlc_schema and SegmentationMetricsCollector.tensor_to_pil_image are now underscore-prefixed (SchemaHelper itself stays public).
@tlc.integration.pytorch_lightning.lightning_module decorator and the tlc.integration.pytorch_lightning package: removed. Integrate via Lightning’s standard hooks; see Integrations.
Built-in schemas — reorganized¶
The set of built-in schemas has been cleaned up. A shape= parameter on the scalar schemas absorbs every
*ListSchema wrapper, the CV annotation *Schema classes are now classmethods on the matching
tlc.data_types dataclass (so a column’s runtime type and its schema-builder live together), and a handful
of single-purpose label / vector / RGB-component schemas have been folded back into the corresponding
primitive with number_role=.
Concretely: some schemas are simply gone; others survive but are constructed differently.
2.x schema |
3.x replacement |
|---|---|
|
|
|
The matching |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
ConfidenceSchema, EmbeddingSchema, FractionSchema, IoUSchema, ProbabilitySchema, and
SemanticSegmentationSchema are new pre-configured schemas in 3.x — useful when you’d otherwise reach for
Float32Schema(number_role="...") with a stock role.
CategoricalLabel was a 2.x sample type (constructed as CategoricalLabel(display_name, classes=...)),
not a schema, but it was commonly passed where a column schema was expected — e.g. in the
column_schemas/schema mapping of Run.add_metrics or TableWriter. Replace it with
CategoricalLabelSchema(classes, display_name=...): the first positional argument becomes the
classes= keyword.
Removed re-exports¶
The following import paths were deprecated re-exports and have been removed:
Old |
New |
|---|---|
|
|
|
|
|
removed — see Integrations |
|
|
|
|
|
|
The old list-based BoundingBox classes (BoundingBox, XYXYBoundingBox, CenteredXYWHBoundingBox, and
friends, previously at tlc.core.builtins.types.bounding_box / tlc.core.data_formats.bounding_boxes) have
been removed entirely, superseded by the BoundingBoxes2D / BoundingBoxes3D dataclasses in
tlc.data_types.
Renames¶
This section covers renames where the symbol still exists with the same role, just under a new name or parameter spelling. The 2.x names emitted deprecation paths during 2.x; in 3.x they are removed.
Casing — acronym → PascalCase¶
2.x |
3.x |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Parameter renames¶
Site |
2.x |
3.x |
|---|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
parameter removed — it had no effect; the detectron2 |
|
|
construct with |
Table.from_image_folder(root=...) is not affected — that root parameter refers to the image folder
path, not the project root URL.
Url.create_* classmethods → ProjectLayout methods¶
The Url.create_* classmethods are gone. They are now methods on ProjectLayout:
2.x |
3.x |
|---|---|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The 2.x classmethods all accepted root=; pass root_url= to the corresponding ProjectLayout methods
instead.
IndexingTable verbs (niche)¶
The IndexingTable subclasses (TableIndexingTable, RunIndexingTable, ConfigIndexingTable) live under
tlc._core and are considered internal. If you were using the escape hatch
tlc.TableIndexingTable.instance().wait_for_complete_index() to force a re-scan after editing files outside
the Python package, the verbs have been renamed:
wait_for_complete_index() → sync() (blocking, returns True on completion)
request_reindex() → request_sync() (non-blocking, returns a request token)
The force= parameter has been dropped.
Scheme is now string constants¶
In 2.x Scheme was an enum; in 3.x it is a string-constants class and url.scheme is already a string.
Drop any .value calls:
# 2.x
url.scheme.value # "file"
Scheme.FILE.value # "file"
# 3.x
url.scheme # "file"
Scheme.FILE # "file"
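During a staged migration, shared code may meet both the enum form and the plain-string form. The sketch below uses hypothetical stand-in classes (OldScheme, NewScheme) rather than the real tlc types, and a hypothetical scheme_str helper that normalizes either form to a string.

```python
from enum import Enum


class OldScheme(Enum):
    """Stand-in for the 2.x enum style."""

    FILE = "file"


class NewScheme:
    """Stand-in for the 3.x string-constants style."""

    FILE = "file"


def scheme_str(scheme) -> str:
    """Return the scheme as a plain string for either style."""
    return scheme.value if isinstance(scheme, Enum) else scheme


# Both styles normalize to the same plain string.
print(scheme_str(OldScheme.FILE) == scheme_str(NewScheme.FILE))
```

Once every caller is on 3.x, a helper like this can be deleted and the value used directly.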
Annotation dataclass renames¶
2.x |
3.x |
|---|---|
|
|
|
|
|
|
|
|
|
|
BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, SegmentationMasks are new in 3.x — see
CV and Annotation types.
Sample-type registry strings¶
The sample_type registry strings are now snake_case. The format-specific variants replace the 2.x dict
form:
2.x |
3.x |
|---|---|
|
|
|
|
|
|
|
|
|
|
(n/a — was the |
|
Old PascalCase and pre-rename strings are registered as legacy aliases — existing serialized tables still
load. "pil_image" is registered as an alias for "pil_png", so old PascalCase "PILImage" and bare
"pil_image" references resolve to PNG by default. The previous {"name": "pil_image", "format": "jpeg"}
dict form is removed; pass "pil_jpeg" directly instead.
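Where 2.x code built sample_type values programmatically, the dict-to-string conversion can be done in one place. The helper below is a hypothetical sketch, not a 3LC function. It only knows the three format strings named in this guide (png, jpeg, webp), and treating a dict without a "format" key as PNG is an assumption based on the "pil_image" alias described above.

```python
KNOWN_FORMATS = {"png", "jpeg", "webp"}  # formats named in this guide


def to_sample_type_string(sample_type):
    """Map a 2.x-style sample_type value to the 3.x string form."""
    if isinstance(sample_type, dict):
        name = sample_type.get("name")
        fmt = sample_type.get("format", "png")  # assumed default, see lead-in
        if name != "pil_image" or fmt not in KNOWN_FORMATS:
            raise ValueError(f"unsupported legacy dict form: {sample_type!r}")
        return f"pil_{fmt}"
    # Plain strings are passed through unchanged.
    return sample_type


print(to_sample_type_string({"name": "pil_image", "format": "jpeg"}))
```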
Color helpers — free functions → ColorHelper¶
# 2.x
tlc.rgb_tuple_to_hex(rgb)
tlc.hex_to_rgb_tuple(hex_str)
# 3.x
tlc.helpers.ColorHelper.rgb_tuple_to_hex(rgb)
tlc.helpers.ColorHelper.hex_to_rgb_tuple(hex_str)
Keyword-only parameters¶
In 2.x, most parameters on the public API were positional. In 3.x, public callables follow one convention: positional only when the role is obvious at the call site, and keyword-only for everything else. This lets new keyword arguments be added near the top of the signature where they remain visible, instead of being appended to the end of a long positional chain.
If you already pass these arguments by name (e.g. Table.from_pandas(df, schema=schema, table_name="t")),
nothing changes. Update any positional call that broke.
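For readers less familiar with the mechanism, keyword-only parameters are declared in Python with a bare * in the signature. The function below is a hypothetical stand-in, not the real Table.from_pandas; it only illustrates how a 2.x-style positional call fails under such a signature.

```python
def from_pandas_like(df, *, schema=None, project_name=None,
                     dataset_name=None, table_name=None):
    """Hypothetical stand-in: only `df` may be passed positionally."""
    return {
        "schema": schema,
        "project_name": project_name,
        "dataset_name": dataset_name,
        "table_name": table_name,
    }


# Keyword form works, in any order.
ok = from_pandas_like([1, 2], table_name="t", project_name="p")

# A positional call is rejected with TypeError instead of silently mis-binding.
try:
    from_pandas_like([1, 2], None, "t", "d", "p")
except TypeError as exc:
    print("rejected:", exc)
```

Because the failure is a TypeError at the call site, a test run or a type checker usually finds these calls quickly.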
Affected callables and which arguments stay positional¶
Callable |
Positional in 3.x |
Everything else |
|---|---|---|
|
|
— |
|
(none) |
|
|
|
all other parameters |
|
|
all other parameters |
|
the source file |
all other parameters |
|
|
including |
|
|
including |
|
|
including |
|
|
including |
|
|
including |
|
|
all other parameters |
|
|
all other parameters |
|
|
all other parameters |
|
|
|
|
(none) |
|
|
(none) |
|
|
|
|
|
|
including |
|
(none) |
|
|
(none) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
the column |
the flag args ( |
|
|
|
|
(none) |
|
|
the source |
|
|
(none) — fully keyword-only |
every argument ( |
|
(none) — fully keyword-only |
every argument, including |
|
the data argument ( |
the flag args ( |
|
|
|
|
(none) |
|
|
(none) |
|
|
|
|
|
|
|
|
the geometry data (plus |
|
|
the six bound values |
|
Two of these are now fully keyword-only (no positional arguments at all): the metrics-collector
constructors and the annotation <Dataclass>.schema(...) builders. For the collectors, 2.x code such as
EmbeddingsMetricsCollector([99]) or FunctionalMetricsCollector(my_fn) must become
EmbeddingsMetricsCollector(layers=[99]) / FunctionalMetricsCollector(collection_fn=my_fn). The
.schema(...) builders are new in 3.x, so there is no 2.x positional form to port — just always call them
with keywords (e.g. Keypoints2D.schema(num_keypoints=17, classes=...)).
Argument-order changes inside the keyword-only block¶
Even when you already used keyword arguments, the documentation order on Table.from_* factories (and
related helpers like TableWriter, Table.from_names, Run.from_names, Run.copy) has been aligned to a
single canonical order:
Method-specific options first.
Then the project-layout block in hierarchy order: project_name → dataset_name → table_name → root_url → table_url. (In 2.x the layout block was table_name → dataset_name → project_name. Reading top-down now matches the hierarchy: a project contains datasets which contain tables.)
Then if_exists, weights, and common metadata.
This affects nothing at the call site (kwargs are order-independent), but if you rely on
inspect.signature or generated docs, expect the parameter listing to read differently.
Common migration patterns¶
# 2.x
table = tlc.Table.from_pandas(df, schema, "my_table", "my_dataset", "my_project")
tlc.collect_metrics(table, mc, model, constants={"epoch": 1})
run = tlc.Run.from_names("my_run", "my_project")
# 3.x
table = tlc.Table.from_pandas(
    df,
    schema=schema,
    project_name="my_project",
    dataset_name="my_dataset",
    table_name="my_table",
)
tlc.collect_metrics(table, mc, predictor=model, constants={"epoch": 1})
run = tlc.Run.from_names(project_name="my_project", run_name="my_run")
Behavioral changes¶
This section covers changes that may import and run cleanly but behave differently at runtime. Read each subsection even if your import passes don’t flag anything, so that existing code does not silently change in ways you do not expect.
Default table_name changed from "table" to "initial"¶
When no table_name is passed to Table.from_*, Table.from_names, Table.copy, TableWriter, or
ProjectLayout.table_url, the resulting URL is now .../tables/initial/ instead of .../tables/table/.
Pre-3.0 projects have their data at .../tables/table/. Reading them without specifying a table_name —
for example tlc.Table.from_names(project_name="p", dataset_name="d") — will now resolve to a non-existent
.../tables/initial/ path. Round-trip patterns like tlc.Table.from_dict(...).latest() that rely on
if_exists="reuse" will also silently create a new initial/ table instead of reusing the old table/
one. When 3LC detects a legacy .../tables/table/ alongside a new-default target, it emits a warning
pointing this out. The warning fires for if_exists values reuse, rename, and raise, but not
overwrite (which is an explicit “write regardless” intent).
Migration: For pre-3.0 projects, pass table_name="table" explicitly when reading an existing table that
you previously accessed without a table_name, so that the old default name continues to be used.
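For projects stored on a local filesystem, the choice between the legacy and the new default name can be made by inspecting the tables folder. This is a hypothetical sketch, not a 3LC helper: pick_table_name is an invented name, and it assumes the caller already knows the local path of the tables directory that contains the table/ or initial/ subfolder.

```python
from pathlib import Path
import tempfile


def pick_table_name(tables_dir: Path) -> str:
    """Prefer the legacy "table" name when only the legacy folder exists."""
    legacy = tables_dir / "table"
    current = tables_dir / "initial"
    if legacy.is_dir() and not current.is_dir():
        return "table"
    return "initial"


# Demonstration on a throwaway directory that mimics a pre-3.0 layout.
with tempfile.TemporaryDirectory() as tmp:
    tables_dir = Path(tmp) / "tables"
    (tables_dir / "table").mkdir(parents=True)
    print(pick_table_name(tables_dir))
```

The returned string would then be passed as table_name= to whichever factory reads the table.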
Session.run_url is now a Url¶
In 2.x, Session.run_url was stored as a str. In 3.x it is stored as a Url object,
matching the return type of Session.initialize_run.
Migration: If you were comparing or concatenating session.run_url as a string, call .to_str() on
it, or update the code to work with Url directly. Reads that pass it into APIs accepting Url | str
need no change.
MetricsTableWriter.finalize() now updates the Run automatically¶
In 2.x, after calling MetricsTableWriter.finalize(), you had to manually update the run with the written
metrics:
# 2.x
metrics_writer = MetricsTableWriter(run_url=run.url, ...)
metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
metrics_table = metrics_writer.finalize()
# Manual step required to associate the metrics table with the run
run.update_metrics(metrics_writer.get_written_metrics_infos())
In 3.x, finalize() automatically updates the run. The recommended pattern is to use MetricsTableWriter
as a context manager, which calls finalize() on exit:
# 3.x
with MetricsTableWriter(run_url=run.url, ...) as metrics_writer:
    metrics_writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})
# The run is automatically updated on exit.
finalize() can only be called once; calling it again raises RuntimeError.
Migration: Remove manual calls to run.update_metrics(metrics_writer.get_written_metrics_infos())
after finalize().
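One consequence worth spelling out: code that wraps the writer in a with block and also keeps an explicit finalize() call will now hit the once-only rule. The class below is a hypothetical stand-in that mimics only the behavior described above; it is not the real MetricsTableWriter, and finalizing only on a clean exit is an assumption of this sketch.

```python
class OnceWriter:
    """Hypothetical stand-in for a finalize-once writer with context-manager support."""

    def __init__(self):
        self.rows = []
        self.finalized = False

    def add_batch(self, batch):
        self.rows.append(batch)

    def finalize(self):
        if self.finalized:
            raise RuntimeError("finalize() can only be called once")
        self.finalized = True
        return len(self.rows)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Finalize on a clean exit only (an assumption of this sketch).
        if exc_type is None:
            self.finalize()
        return False


with OnceWriter() as writer:
    writer.add_batch({"loss": [0.1, 0.2], "example_id": [0, 1]})

try:
    writer.finalize()  # already finalized when the with block exited
except RuntimeError as exc:
    print("second finalize rejected:", exc)
```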
Url.read() and Url.write() removed¶
The generic Url.read() and Url.write() methods have been removed. Use the type-specific methods
instead:
# 2.x
content = url.read() # bytes (default mode="b")
content = url.read(mode="t") # str
url.write(b"data")
url.write("text", mode="t")
url.write(content, if_exists="raise")
# 3.x
content = url.read_bytes()
content = url.read_text(encoding="utf-8")
url.write_bytes(b"data", if_exists=...)
url.write_text("text", if_exists=...)
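If many call sites use the old mode= argument, a thin shim can keep them working while they are migrated one by one. The helper below is a hypothetical sketch: read_compat is an invented name, and pathlib.Path is used as a stand-in object only because it happens to expose read_bytes and read_text under the same names.

```python
from pathlib import Path
import tempfile


def read_compat(url, mode: str = "b"):
    """Emulate the 2.x read(mode=...) call on top of read_bytes / read_text."""
    if mode == "b":
        return url.read_bytes()
    if mode == "t":
        return url.read_text(encoding="utf-8")
    raise ValueError(f"unsupported mode: {mode!r}")


# Demonstration with a local file standing in for a URL object.
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "example.txt"
    target.write_text("text", encoding="utf-8")
    print(read_compat(target), read_compat(target, mode="t"))
```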
The if_exists argument is supported by both write_bytes and write_text (values: "overwrite",
"rename", "raise").
Sample view always returns a dict¶
Container sample types ("tuple", "list", "box", "horizontal_tuple", "horizontal_list") and the
is_leaf flag on SampleType were removed. The sample view of any composite table is now a dict — one
entry per visible column — instead of a tuple, list, or single value.
# 2.x — schema with sample_type="tuple"
image, label = table[0]
# 3.x — sample view is always a dict
sample = table[0]
image, label = sample["image"], sample["label"]
Tables persisted in 2.x with a container sample_type continue to load: the resolver coerces the legacy
name to identity and emits a one-time warning per name. No data migration is required, but reader code
that destructured the sample as a tuple needs to be updated. If you constructed a schema with
sample_type="tuple" / "list" / "box" to control the sample shape, drop the sample_type argument;
the structural part of the schema (values=...) is unchanged. Schema.from_schema_like applied to a tuple
of schemas no longer attaches sample_type="tuple" to the resulting composite — it produces a dict-shaped
schema with value_i keys (or display names where provided).
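Downstream code that still expects tuples (for example an existing training loop) can be bridged with a small adapter while it is being updated. This is a hypothetical helper, not part of 3LC, and the plain dict below merely stands in for a 3.x sample.

```python
def as_tuple(sample: dict, keys: tuple) -> tuple:
    """Project a dict-shaped sample onto an ordered tuple of column values."""
    return tuple(sample[key] for key in keys)


sample = {"image": "img-0", "label": 3}  # stand-in for table[0] in 3.x
image, label = as_tuple(sample, ("image", "label"))
print(image, label)
```

Listing the keys explicitly also makes the column order visible at the call site, which the 2.x tuple form left implicit.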
reduce_embeddings no longer accepts a list of tables¶
tlc.reduction.reduce_embeddings previously accepted either a single Table or a list[Table], returning
a Table in the first case and a dict[Url, Url] in the second (with a DeprecationWarning on the list
form). It now accepts a single Table and always returns a Table:
# 2.x
url_mapping = tlc.reduction.reduce_embeddings([table_a, table_b], method="umap")
reduced_a = tlc.Table.from_url(url_mapping[table_a.url])
# 3.x
reduced_a = tlc.reduction.reduce_embeddings(table_a, method="umap")
reduced_b = tlc.reduction.reduce_embeddings(table_b, method="umap")
The other multi-table reduction helpers (reduce_embeddings_per_dataset,
reduce_embeddings_by_foreign_table_url, reduce_embeddings_with_producer_consumer) are unchanged and
still take list[Table].
ClassificationMetricsCollector removed¶
ClassificationMetricsCollector has been removed. It bundled four metrics (loss, predicted, accuracy,
confidence) behind a rigid input contract — batch had to be a (samples, labels) tuple and the model
output had to be a raw logits tensor — which made it inflexible for HF-style models or models returning
anything other than torch.Tensor. The same metrics are easy to compute with
FunctionalMetricsCollector:
import torch
import torch.nn.functional as F
import tlc

def classification_metrics_fn(batch, predictor_output: tlc.PredictorOutput) -> dict:
    _, labels = batch
    predictions = predictor_output.forward
    if labels.dim() == 2 and labels.shape[1] > 1:
        labels = torch.argmax(labels, dim=1)
    softmax_out = F.softmax(predictions, dim=1)
    predicted = torch.argmax(predictions, dim=1)
    confidence = torch.gather(softmax_out, 1, predicted.unsqueeze(1)).squeeze(1)
    accuracy = predicted.eq(labels).float()
    loss = F.cross_entropy(predictions, labels, reduction="none")
    return {
        "loss": loss.detach().cpu().numpy(),
        "predicted": predicted.detach().cpu().numpy(),
        "accuracy": accuracy.detach().cpu().numpy(),
        "confidence": confidence.detach().cpu().numpy(),
    }

schemas = {
    "predicted": tlc.schemas.CategoricalLabelSchema(
        display_name="predicted label",
        classes=class_names,
    ),
}
collector = tlc.metrics.FunctionalMetricsCollector(
    collection_fn=classification_metrics_fn,
    schema=schemas,
)
TableWriter auto-detects sample-form vs row-form inputs¶
TableWriter now auto-detects sample-form vs row-form inputs per value (previously all inputs were
treated as sample-form). Both work without a mode flag:
# Sample-form: a PIL.Image is serialized to a file and the URL is stored
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": pil_image})
# Row-form: a URL string is stored directly (relativized against the table)
writer = tlc.TableWriter(schema={"image": tlc.schemas.ImageSchema()}, project_name="My Project")
writer.add_row({"image": "path/to/image.png"})
If you have custom SampleType subclasses, make sure their
accepts() implementation returns True only for
sample-form inputs — it is what the writer uses to distinguish the two.
Table.latest() — wait_for_rescan and use_new_columns removed¶
Table.latest() no longer accepts wait_for_rescan. The blocking/non-blocking choice is now expressed
through timeout:
2.x latest(wait_for_rescan=True) → 3.x latest() (defaults to timeout=30.0)
2.x latest(wait_for_rescan=False) → 3.x latest(timeout=0) (non-blocking; in-process fast-path only)
Pass timeout=None to block indefinitely.
use_new_columns is also removed. With dict-based samples in 3.x, extra keys no longer disrupt downstream
consumers; columns that should not appear in samples can be marked at the column level via
sample_type="hidden" (or any sample type with is_included_in_sample = False).
UrlAdapterRegistry — default_value removed; async signatures cleaned up¶
UrlAdapterRegistry read methods used to accept a default_value to return when no adapter could be
found. In 3.x, Url validates schemes at construction time, so the “no adapter” code path is unreachable
for valid URLs. The read methods now raise ValueError if no adapter is found; the getter methods simply
return None:
# 3.x
content = UrlAdapterRegistry.read_string_content_from_url(url)
data = UrlAdapterRegistry.read_binary_content_from_url(url)
data = await UrlAdapterRegistry.read_binary_content_from_url_async(url)
adapter = UrlAdapterRegistry.get_url_adapter_for_url(url) # returns None if not found
adapter = UrlAdapterRegistry.get_url_adapter_for_scheme(scheme) # returns None if not found
All *_async methods on ObjectRegistry, UrlAdapterRegistry, and the URL adapters themselves are now
proper async methods — in 2.x they were sync methods that returned concurrent.futures.Future. Replace
future = X.foo_async(...); future.result() with await X.foo_async(...):
# 2.x
future = ObjectRegistry.delete_object_from_url_async(url); future.result()
# 3.x
await ObjectRegistry.delete_object_from_url_async(url)
Async convenience methods are now available directly on UrlAdapterRegistry
(read_binary_content_from_url_async, write_binary_content_to_url_async, delete_url_async) — adapter
lookup is handled automatically. If you previously looked up an adapter just to call async methods, use
the registry methods directly for cleaner code.
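For callers that are themselves synchronous, the move from Future.result() to await means an event loop has to be driven somewhere. The sketch below shows the generic asyncio pattern with a hypothetical stand-in coroutine (delete_async); it does not call any 3LC API.

```python
import asyncio


async def delete_async(url: str) -> str:
    """Hypothetical stand-in for a 3.x-style async method."""
    await asyncio.sleep(0)  # yields to the event loop, as real I/O would
    return f"deleted {url}"


async def main() -> str:
    # Inside async code: await the call directly.
    return await delete_async("file:///tmp/example")


# From synchronous code: drive the coroutine with an event loop.
result = asyncio.run(main())
print(result)
```

Note that asyncio.run cannot be called from inside an already running loop (for example in some notebook environments); there, await the coroutine directly.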
Schema and Sample Type redesign¶
The schema system has been redesigned. The old SampleType class hierarchy has been replaced with a new
SampleType base class, a SampleTypeRegistry for pluggable registration, and convenience schemas that
configure storage and transforms in one step. Schemas now directly control how data is transformed and
stored.
Name-level changes (registry strings, sample_from_row → from_row, etc.) are in
Sample-type registry strings and the renames table. This section covers
the API redesign that needs more than search-and-replace.
Convenience schemas — explicit parameters¶
All built-in convenience schemas now have explicit parameters — no more **kwargs or **schema_kwargs.
IDE autocompletion shows every available parameter, and typos are caught at the call site.
shape parameter on scalar schemas¶
Scalar schemas now accept a shape parameter for arrays of any dimensionality. The previous *ListSchema
convenience classes have been folded into this parameter; only Float32ListSchema and
CategoricalLabelListSchema survive as thin wrappers (because they’re common-enough that a dedicated class
reads more clearly than shape=(-1,)).
Convenience class |
Equivalent primitive |
|---|---|
|
|
|
|
|
|
Use -1 for variable-size dimensions: Float32Schema(shape=(-1, -1)) for a variable 2D array.
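To make the -1 convention concrete, the sketch below checks a concrete array shape against a declared shape in which -1 matches any size. It is a hypothetical illustration of the convention only, not 3LC's validation logic.

```python
def shape_matches(actual: tuple, declared: tuple) -> bool:
    """Return True if `actual` fits `declared`, where -1 means any size."""
    if len(actual) != len(declared):
        return False  # rank mismatch
    return all(d == -1 or a == d for a, d in zip(actual, declared))


print(shape_matches((5, 7), (-1, -1)))  # a 5x7 array against a variable 2D shape
print(shape_matches((5,), (-1, -1)))    # a 1D array against a variable 2D shape
```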
Float32ListSchema no longer accepts a number_role parameter. For a list of fractions / confidences /
probabilities, use the baked schema with a list shape: ConfidenceSchema(shape=(-1,)) instead of
Float32ListSchema(number_role="fraction/confidence"). The same pattern applies to other roled list
columns — pick the baked schema (EmbeddingSchema, FractionSchema, ConfidenceSchema,
ProbabilitySchema, IoUSchema, CategoricalLabelSchema, …) and pass shape=.
ImageSchema — one class, four modes¶
ImageSchema is new in 3.x and covers every image-column use case via its sample_type kwarg. It replaces
the 2.x ImageUrlSchema class and consolidates the image-format selection that 2.x performed via the
sample-type dict form {"name": "pil_image", "format": ...} (now removed; see
Sample-type registry strings).
# 2.x — separate class for URL columns; format chosen via sample-type dict form
ImageUrlSchema() # URL passthrough
Schema(value=ImageUrlStringValue(), sample_type="pil_image") # PIL → file (PNG)
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "jpeg"})
Schema(value=ImageUrlStringValue(), sample_type={"name": "pil_image", "format": "webp"})
# 3.x — one class, four modes
ImageSchema(sample_type="url") # URL passthrough — pre-existing files
ImageSchema() # PIL → file (PNG, default)
ImageSchema(sample_type="pil_jpeg") # PIL → file (JPEG)
ImageSchema(sample_type="pil_webp") # PIL → file (WEBP)
All four modes serialize to the same wire format (ImageUrlStringValue); only the Python-side behavior
differs. sample_type=None is accepted as an alias for "url".
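When many columns need converting, the PIL cases above can be captured in a small lookup. The helper below is a hypothetical migration aid, not part of 3LC: it takes the 2.x sample_type value (string or dict form) and returns the keyword arguments to pass to ImageSchema in 3.x, following the three PIL modes listed above.

```python
def image_schema_kwargs(old_sample_type):
    """Map a 2.x pil_image sample_type to 3.x ImageSchema kwargs.

    Hypothetical helper: the returned dict is meant to be passed on by the
    caller as ImageSchema(**kwargs).
    """
    if isinstance(old_sample_type, dict):
        name = old_sample_type.get("name")
        fmt = old_sample_type.get("format", "png")
    else:
        name, fmt = old_sample_type, "png"

    if name != "pil_image":
        raise ValueError(f"not a 2.x PIL image sample type: {old_sample_type!r}")

    fmt = fmt.lower()
    if fmt == "png":
        return {}  # ImageSchema() defaults to PIL -> PNG
    if fmt in ("jpeg", "webp"):
        return {"sample_type": f"pil_{fmt}"}
    raise ValueError(f"no 3.x equivalent listed for format {fmt!r}")


print(image_schema_kwargs({"name": "pil_image", "format": "jpeg"}))
```

Columns that used ImageUrlSchema() in 2.x are not covered by this helper; they map to ImageSchema(sample_type="url") as shown above.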
Common column attributes¶
All schemas now accept display_name, description, writable, default_visible, and default_value as
explicit parameters:
# 2.x — had to use **kwargs, not discoverable
Float32Schema(display_name="Score", writable=False) # worked but wasn't documented
# 3.x — explicit, IDE-discoverable
Float32Schema(display_name="Score", writable=False, default_visible=True, default_value=0.0)
display_importance is accepted by the base Schema and by the EpochSchema / IterationSchema system
schemas, but not by the convenience scalar / categorical / image schemas. Construct the base Schema
directly if you need a non-default display ordering on a convenience-shaped column.
Custom sample types¶
Custom SampleType subclasses should be migrated to the new SampleType base class, registered via
@tlc.sample_types.register_sample_type. The method names changed: sample_from_row → from_row,
row_from_sample → to_row.
# 2.x
class MyType(tlc.sample_types.SampleType):
def sample_from_row(self, value):
return custom_decode(value)
def row_from_sample(self, value):
return custom_encode(value)
# 3.x
@tlc.sample_types.register_sample_type("my_type")
class MyType(tlc.sample_types.SampleType):
def from_row(self, value):
return custom_decode(value)
def to_row(self, value):
return custom_encode(value)
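After renaming the methods, a quick round-trip check can help confirm that to_row and from_row are still inverses of each other. The sketch below uses a plain class with a hypothetical ISO-date encoding so that it runs without 3LC; in real code the class would subclass tlc.sample_types.SampleType and carry the register_sample_type decorator as in the 3.x snippet above.

```python
from datetime import date


class IsoDate:
    """Hypothetical sample type: a date stored in the row as an ISO string."""

    def from_row(self, value: str) -> date:
        # Row (stored) form -> sample (in-memory) form.
        return date.fromisoformat(value)

    def to_row(self, value: date) -> str:
        # Sample (in-memory) form -> row (stored) form.
        return value.isoformat()


def roundtrips(sample_type, sample) -> bool:
    """True if decoding an encoded sample gives back the original sample."""
    return sample_type.from_row(sample_type.to_row(sample)) == sample


print(roundtrips(IsoDate(), date(2024, 5, 17)))
```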
New public APIs in tlc.sample_types¶
SampleType: base class for all sample types.
SampleTypeRegistry: registry for looking up and registering custom sample types.
register_sample_type: decorator for registering a SampleType subclass or a dataclass with to_row() / from_row() methods.
get_sample_types: list all registered sample type names.
CV and Annotation types¶
This section bundles the runtime dataclasses, schema-builders, and importer / exporter changes for
computer-vision annotations. The unifying point: each annotation shape now owns both its runtime data and
its column schema-builder under one dataclass in tlc.data_types.
One dataclass per shape, with a bound schema-builder¶
All CV annotation types now follow a consistent plural-base naming convention. The base name is always
plural (e.g. BoundingBoxes2D), and the dataclass owns both runtime data and its column schema-builder
(Dataclass.schema(...)):
| Sample-type string | Dataclass | Schema-builder |
|---|---|---|
|  |  |  |
|  |  |  |
|  |  |  |
|  |  |  |
|  |  |  |
|  |  |  |
|  |  |  |
All dataclasses live under tlc.data_types. Constructor arguments are unchanged from their 2.x
equivalents; only the class names differ (see Annotation dataclass renames).
# 2.x
from tlc.core.data_formats import Geometry2DInstances, OBB2DInstances
geom = Geometry2DInstances.create_empty(x_max=640.0, y_max=480.0)
obbs = OBB2DInstances.create_empty(x_max=640.0, y_max=480.0)
# 3.x
from tlc.data_types import Geometry2D, OrientedBoundingBoxes2D
geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0)
obbs = OrientedBoundingBoxes2D.create_empty(x_max=640.0, y_max=480.0)
BoundingBoxes2D, BoundingBoxes3D, SegmentationPolygons, and SegmentationMasks are new in 3.x and
have no 2.x dataclass counterpart. The 2.x sample types InstanceSegmentationPolygons /
InstanceSegmentationMasks (from tlc.client.sample_type) were dropped along with the rest of that
module.
Schema-builder replaces the 2.x *Schema classes¶
The 2.x tlc.SegmentationSchema (which selected storage form via its sample_type argument) has been
split into two dataclass-bound builders:
| 2.x | 3.x |
|---|---|
|  |  |
|  |  |
# 2.x
schema = {"seg": tlc.SegmentationSchema(classes=["cat", "dog"])}
# 3.x
from tlc.data_types import SegmentationPolygons
schema = {"seg": SegmentationPolygons.schema(classes=["cat", "dog"])}
The full list of removed *Schema classes (and their replacements) is in
Built-in schemas — reorganized.
Bounding box format migration¶
3.x replaces the flat bb_list dict format with the geometry-based
BoundingBoxes2D dataclass (tlc.data_types.BoundingBoxes2D).
Existing tables written by 2.x still load. Their bounding-box columns are stored in the legacy bb_list form and are not rewritten on read.
New tables, importers, and metrics produce BoundingBoxes2D instances, and the built-in schema builder (BoundingBoxes2D.schema(...)) is what 3.x emits by default. The COCO and YOLO importers/exporters auto-detect the stored format and handle both transparently, but your code that reads or writes box columns does not get that for free. The old BoundingBoxListSchema has been removed from the public API.
So any code that touches a bounding-box column needs an audit, in one of two directions.
Writing. Stop building the {"image_width", "image_height", "bb_list": [...]}
dict. Construct a BoundingBoxes2D directly (coordinates are absolute XYXY by
default; pass bounding_box_format= / normalized= if yours differ):
from tlc.data_types import BoundingBoxes2D
import numpy as np
bb = BoundingBoxes2D(
bounding_boxes=np.array([[10, 20, 100, 200]], dtype=np.float32), # Nx4, XYXY
labels=np.array([3], dtype=np.int32),
confidences=np.array([0.9], dtype=np.float32), # omit for ground truth
x_max=640.0, y_max=480.0, # optional image bounds
)
# or build incrementally:
bb = BoundingBoxes2D.create_empty(image_width=640, image_height=480)
bb.add_instance(bounding_box=[10, 20, 100, 200], label=3, confidence=0.9)
Use BoundingBoxes2D.schema(classes=..., include_per_instance_confidence=...)
for the column schema; the old BoundingBoxListSchema is gone.
Reading. 3.x does not coerce stored data: a legacy table returns its
bounding-box column in the original bb_list dict form, and a 3.x table returns
a BoundingBoxes2D. The format you get back mirrors how the table was written —
there is no migration-on-read. Consuming code should standardize on
BoundingBoxes2D and convert legacy rows explicitly via from_legacy_row, which
auto-detects the coordinate convention (XYXY / XYWH / centered-XYWH, normalized or
absolute) from the column’s schema number-roles:
from tlc.data_types import BoundingBoxes2D
bb_schema = table.rows_schema.values["bounding_boxes"]
row = table[idx]
raw = row["bounding_boxes"]
# Legacy tables yield the old dict; 3.x tables yield a BoundingBoxes2D.
# `from_legacy_row` takes the *column* value (the `bb_list` dict with its
# `image_width`/`image_height` keys), i.e. `raw` here — not the whole row.
bb = BoundingBoxes2D.from_legacy_row(raw, schema=bb_schema) if isinstance(raw, dict) else raw
bb.bounding_boxes # Nx4 array of [x_min, y_min, x_max, y_max]
bb.labels # N category indices
bb.confidences # N floats (predictions only)
Once converted, always read by attribute (.bounding_boxes, .labels,
.confidences) — never by dict key.
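The coordinate conventions named above differ only by simple arithmetic. For reference, the sketch below converts one box from any of the three layouts to absolute XYXY in plain Python. The function name and the fmt strings are hypothetical and chosen for this example; from_legacy_row performs its own detection from the schema and does not use this code.

```python
def to_absolute_xyxy(box, fmt, normalized=False, width=None, height=None):
    """Convert a 4-tuple box to absolute (x_min, y_min, x_max, y_max).

    fmt is one of "xyxy", "xywh" (top-left corner + size) or
    "cxcywh" (center + size). If normalized is True, coordinates are
    fractions of the image size and width/height are required.
    """
    a, b, c, d = box
    if fmt == "xyxy":
        x0, y0, x1, y1 = a, b, c, d
    elif fmt == "xywh":
        x0, y0, x1, y1 = a, b, a + c, b + d
    elif fmt == "cxcywh":
        x0, y0, x1, y1 = a - c / 2, b - d / 2, a + c / 2, b + d / 2
    else:
        raise ValueError(f"unknown box format: {fmt!r}")

    if normalized:
        if width is None or height is None:
            raise ValueError("width and height are required for normalized boxes")
        # Scale fractions of the image size up to pixel coordinates.
        x0, x1 = x0 * width, x1 * width
        y0, y1 = y0 * height, y1 * height
    return (x0, y0, x1, y1)


print(to_absolute_xyxy((10, 20, 90, 180), "xywh"))
```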
Bounds are now optional on geometry dataclasses¶
The geometry dataclasses (Geometry2D, Geometry3D, Keypoints2D, BoundingBoxes2D,
OrientedBoundingBoxes2D, etc.) now have optional bounds. In 2.x bounds were required and defaulted to
0.0; from_row() raised ValueError if bounds were missing. In 3.x bounds default to None and
from_row() accepts rows without bounds. When a max bound is specified but the corresponding min is
not, the min defaults to 0.0.
# 3.x
geom = Geometry2D.create_empty(x_max=640.0, y_max=480.0) # x_min / y_min default to 0.0
geom = Geometry2D.create_empty() # All bounds are None
geom = Geometry2D.from_row(row) # Works even if bounds are missing
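The defaulting rule for a single axis can be written out in a few lines. resolve_bounds is a hypothetical function used only to restate the rule described above; it is not how the dataclasses are implemented, and the case of a min without a max is simply passed through because the guide does not describe it.

```python
from typing import Optional, Tuple


def resolve_bounds(
    lo: Optional[float] = None, hi: Optional[float] = None
) -> Tuple[Optional[float], Optional[float]]:
    """Restate the 3.x rule: a missing min becomes 0.0 only when a max is given."""
    if lo is None and hi is not None:
        lo = 0.0
    return lo, hi


print(resolve_bounds(hi=640.0))  # max only: min is filled in
print(resolve_bounds())          # nothing given: both stay None
```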
COCO importer — include_iscrowd and per_instance_schemas replaced by per_instance_extras¶
The include_iscrowd boolean and per_instance_schemas (pose-only) parameters on the COCO importer have
been removed. The new per_instance_extras parameter covers both — it works for all tasks (detect,
segment, pose) and supports both auto-inferred and explicit schemas:
# Auto-infer schemas from data
table = tlc.Table.from_coco(
...,
per_instance_extras=["iscrowd", "my_custom_field"],
)
# Or provide explicit schemas
table = tlc.Table.from_coco(
...,
per_instance_extras={
"iscrowd": tlc.schemas.Int32Schema(),
"score": tlc.schemas.Float32Schema(),
},
)
COCO importer — new per_image_extras parameter¶
The COCO importer now accepts a per_image_extras parameter for preserving image-level custom fields as
top-level table columns:
table = tlc.Table.from_coco(
...,
per_image_extras=["date_captured", "flickr_url"],
)
The CocoExporter also accepts per_image_extras for round-trip support, writing specified table columns
back into COCO image entries on export.
CocoExporter.serialize(include_segmentation=...) removed¶
The deprecated include_segmentation parameter on CocoExporter.serialize() (and the corresponding
table.export(..., include_segmentation=...) kwarg) has been removed. Detection tables created in 3.x no
longer carry segmentation data, so the parameter is no longer meaningful. The exporter no longer copies
legacy bb_list-row segmentation into the exported COCO file either; old-style detection tables export
with empty segmentation lists, matching the new-style BB path. Use task="segment" to work with
segmentation data.
Integrations¶
Hugging Face¶
The transformers-bound Trainer integration is split out from the datasets-bound table machinery.
Importing tlc.integration.hugging_face no longer requires datasets.
TLCTrainer removed¶
The deprecated TLCTrainer class has been removed. The 3LC integration now provides a single Trainer
class that supports 3LC Tables and metrics collection.
# 2.x
from tlc.integration.hugging_face import TLCTrainer
trainer = TLCTrainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()
# 3.x
from tlc.integration.hugging_face.trainer import Trainer
trainer = Trainer(model=model, args=training_args, train_dataset=tlc_table, eval_dataset=eval_table)
trainer.train()
The Trainer class is no longer re-exported from the top-level tlc or tlc.integration packages —
import from tlc.integration.hugging_face.trainer.
HF Table classes are private — use tlc.Table factories¶
The 2.x TableFromHuggingFaceHub, TableFromHuggingFaceDataset, and the TableFromHuggingFace alias are
no longer part of the public API. Construct HF-backed tables via:
tlc.Table.from_hugging_face_hub(): load from the Hugging Face Hub
tlc.Table.from_hugging_face_dataset(): wrap an in-memory datasets.Dataset
Top-level imports like from tlc.integration.hugging_face import TableFromHuggingFaceHub and the
previously documented tlc.integration.hugging_face.table_from_hugging_face* module paths are no longer
supported.
Detectron2¶
Importing tlc no longer eagerly tries to import detectron2; importing tlc.integration.detectron2
requires the optional detectron2 dependency.
BoundingBoxMetricsCollector lives in tlc.integration.detectron2¶
BoundingBoxMetricsCollector now lives exclusively in tlc.integration.detectron2. In practice this
collector has always been used with Detectron2 — it relies on Detectron2-shaped ground-truth and
prediction formats, and the COCO-style annotation dicts it consumes (CocoAnnotation, CocoGroundTruth,
CocoPrediction) are produced by the Detectron2 hooks. Keeping it in the general-purpose
metrics_collectors namespace implied a framework-agnostic collector that it never really was.
# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector
Detectron2 utilities no longer hoisted to tlc.*¶
Detectron2 metric collectors, hooks, and helpers were previously re-exported at the top-level tlc
namespace via a wildcard import. They now live exclusively under tlc.integration.detectron2. Affected
names: BoundingBoxMetricsCollector, CocoAnnotation, CocoGroundTruth, CocoPrediction,
DetectronMetricsCollectionHook, MetricsCollectionHook, UmapReduceEmbeddingsHook,
register_coco_instances. The all-caps spellings COCOAnnotation, COCOGroundTruth,
COCOPrediction, and UMAPReduceEmbeddingsHook were hoisted alongside the PascalCase variants in 2.x and
are also gone from the top level in 3.x (see Renames for the PascalCase replacements).
# 3.x
from tlc.integration.detectron2 import BoundingBoxMetricsCollector
metrics_collector = BoundingBoxMetricsCollector(...)
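To locate remaining top-level references in a codebase, a simple text scan over the affected names is usually enough for a first pass. The sketch below is a hypothetical audit helper built only on the standard library; it flags tlc.<name> attribute access and would not catch the from tlc import <name> form, which needs a separate search.

```python
import re

# Names the guide lists as no longer available at the top-level tlc namespace,
# including the 2.x all-caps spellings.
HOISTED_NAMES = (
    "BoundingBoxMetricsCollector", "CocoAnnotation", "CocoGroundTruth",
    "CocoPrediction", "DetectronMetricsCollectionHook", "MetricsCollectionHook",
    "UmapReduceEmbeddingsHook", "register_coco_instances",
    "COCOAnnotation", "COCOGroundTruth", "COCOPrediction",
    "UMAPReduceEmbeddingsHook",
)

# Match "tlc." immediately followed by one of the names, as whole words.
_PATTERN = re.compile(r"\btlc\.(" + "|".join(HOISTED_NAMES) + r")\b")


def find_hoisted_uses(source: str):
    """Return (line_number, name) for each top-level tlc.<name> reference."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


example = "import tlc\ncollector = tlc.BoundingBoxMetricsCollector()\n"
print(find_hoisted_uses(example))
```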
PIL.Image.LINEAR shim removed¶
In 2.x, import tlc aliased PIL.Image.LINEAR = PIL.Image.BILINEAR as a back-compat shim for detectron2
≤ v0.6, which referenced Image.LINEAR at import time (Pillow 10 removed it). In 3.x this patch is gone.
If you are pinned to detectron2 ≤ v0.6, upgrade to a recent detectron2 commit, or apply the shim yourself
before import detectron2:
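A minimal version of that shim, reconstructed from the alias described above, might look like the following. It assumes Pillow is installed and that the snippet runs before the first import of detectron2.

```python
import PIL.Image

# Pillow 10 removed Image.LINEAR; restore it as an alias of BILINEAR
# so that detectron2 <= v0.6 can still be imported.
if not hasattr(PIL.Image, "LINEAR"):
    PIL.Image.LINEAR = PIL.Image.BILINEAR

# import detectron2  # safe to import from this point on
```

The hasattr guard keeps the snippet harmless on older Pillow versions where the attribute still exists.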
Legacy bb_list segmentation no longer copied into detectron2 annotations¶
register_coco_instances previously copied any segmentation field present alongside a legacy bb_list
row into the detectron2 annotation dicts it produced. In 3.x this codepath has been removed; detection
tables register with detectron2 without a segmentation field. Segmentation datasets should be registered
as segmentation tables.
PyTorch Lightning — decorator removed¶
The @tlc.integration.pytorch_lightning.lightning_module class decorator has been removed in 3.x, along
with the tlc.integration.pytorch_lightning package and the 3lc[lightning] install extra. 3LC integrates
with Lightning using the public 3LC API and Lightning’s standard hooks — no decorator required.
For the migration pattern and end-to-end examples, see the PyTorch Lightning integration page. If you have a 2.x project that depends specifically on the decorator and the documented migration is not enough, please reach out to the 3LC team.