tlc.metrics.collect

Collect per-sample metrics with a tlc Table.

Module Contents

Classes

Class        Description
MapDataset   A map-style dataset: anything with __len__ and __getitem__.

Functions

Function                 Description
collect_metrics          Collect per-sample metrics with a map-style dataset.
reset_dataloader_cache   Drop all DataLoaders cached by collect_metrics.

Data

Data   Description
T_co

API

class MapDataset

Bases: typing.Protocol[tlc.metrics.collect.T_co]

A map-style dataset: anything with __len__ and __getitem__.

Table satisfies this protocol, as does any torch.utils.data.Dataset that implements both methods.
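
As a rough illustration of what satisfying the protocol means, the sketch below defines a stand-in protocol with the two methods and checks a plain class against it. MapDatasetSketch and SquaresDataset are hypothetical names introduced here; the library's actual MapDataset definition may differ (for example, it may not be runtime-checkable).

```python
from typing import Protocol, TypeVar, runtime_checkable

T_co = TypeVar("T_co", covariant=True)


@runtime_checkable
class MapDatasetSketch(Protocol[T_co]):
    """Stand-in protocol (assumption): any object with __len__ and __getitem__."""

    def __len__(self) -> int: ...

    def __getitem__(self, index: int) -> T_co: ...


class SquaresDataset:
    """Hypothetical dataset; it never inherits from the protocol."""

    def __init__(self, n: int) -> None:
        self._n = n

    def __len__(self) -> int:
        return self._n

    def __getitem__(self, index: int) -> int:
        if not 0 <= index < self._n:
            raise IndexError(index)
        return index * index


squares = SquaresDataset(4)
# Structural check: only the presence of the two methods matters.
is_map_style = isinstance(squares, MapDatasetSketch)
```

Because the check is structural, a plain list would pass it as well, while an object without __len__ would not.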

T_co = TypeVar(...)

collect_metrics(
    table: MapDataset[Any],
    metrics_collectors: tlc.metrics.collectors.metrics_collector_base.MetricsCollectorType,
    *,
    predictor: Module | Predictor | None = None,
    foreign_table_url: Url | str | None = None,
    constants: dict[str, Any] | None = None,
    constants_schemas: dict[str, Schema] | None = None,
    run_url: Url | str | None = None,
    collect_aggregates: bool = True,
    split: str = '',
    exclude_zero_weights: bool = False,
    dataloader_args: dict[str, Any] | None = None,
) -> None

Collect per-sample metrics with a map-style dataset.

  • Writes a single metrics table joined to a foreign Table by row index. The metrics table contains every entry of the constants argument as well as the metrics computed by the metrics collectors.

  • Adds the metadata of the metrics table to the metrics property of the Run.

  • Adds the Url of the foreign Table to the Run as an input.

  • Collects aggregate values from the metrics collectors and adds them to the Run (when collect_aggregates is True).

The dataset’s index i is interpreted as the row index of the foreign Table for the per-sample join. Pass a Table or TableView to derive the foreign URL automatically, or any other MapDataset together with foreign_table_url to declare the join explicitly. The two paths are mutually exclusive: passing foreign_table_url alongside a Table/TableView is rejected.
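
The two call shapes described above might look like the following sketch. TokenLengthDataset and the two wrapper functions are hypothetical names used only for illustration; the collectors and the table URL are passed in by the caller, and the import is done lazily so the sketch can be loaded without 3LC installed.

```python
from typing import Any, Sequence


class TokenLengthDataset:
    """Hypothetical custom map-style dataset.

    Index i is assumed to correspond to row i of the foreign Table.
    """

    def __init__(self, texts: Sequence[str]) -> None:
        self._texts = list(texts)

    def __len__(self) -> int:
        return len(self._texts)

    def __getitem__(self, index: int) -> int:
        # One whitespace-separated token count per sample.
        return len(self._texts[index].split())


def collect_for_table(table: Any, collectors: Any) -> None:
    """Table or TableView path: the foreign URL is derived from the table."""
    from tlc.metrics.collect import collect_metrics

    # foreign_table_url must not be passed here.
    collect_metrics(table, collectors)


def collect_for_custom_dataset(dataset: Any, collectors: Any, table_url: str) -> None:
    """Custom dataset path: the join target is declared explicitly."""
    from tlc.metrics.collect import collect_metrics

    collect_metrics(dataset, collectors, foreign_table_url=table_url)


dataset = TokenLengthDataset(["a short text", "another"])
```

Keeping the two paths in separate functions makes it harder to pass foreign_table_url together with a Table by accident.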

Parameters:
  • table – A map-style dataset (any object with __len__ and __getitem__). A Table or TableView works directly; for a custom dataset, foreign_table_url must be passed so metrics can be linked back to a Table. (Parameter will be renamed to dataset in 3.0.)

  • metrics_collectors – The metrics collectors to use: a single metrics collector, a list of metrics collectors, or a list of callables with the signature Callable[[Any, PredictorOutput], dict[str, Any]].

  • constants – A dictionary of constants to use when collecting metrics.

  • constants_schemas – A dictionary of schemas for the constants. If no schemas are provided, the schemas will be inferred from the constants.

  • run_url – The URL of the Run to add the metrics to. If not specified, the active Run is used; if there is no active Run, a new one is created.

  • collect_aggregates – Whether to collect aggregate values from the metrics collectors and add them to the Run. This allows an aggregate view to be shown in the Project page of the 3LC Dashboard. Aggregate values are computed for all computable columns in the metrics collectors, and are prefixed with the split name. For example, if a metrics collector defines a computable column called “accuracy”, and the split is “train”, then the aggregate value will be called “train_accuracy_avg”.

  • split – The split of the dataset. This will be prepended to the aggregate metric names.

  • exclude_zero_weights – Whether to exclude samples with zero weight when collecting metrics. Weights are read from the foreign Table, so either table must be a Table or TableView, or foreign_table_url must be passed.

  • foreign_table_url – Url of the Table to link the metrics back to. Required when table is a custom map-style dataset; must NOT be passed when table is itself a Table or TableView (the URL is derived from table.url).

  • dataloader_args – Additional arguments to pass to the dataloader. Samples produced by table (after any transform) must be combinable by the active collate_fn — the default torch.utils.data.default_collate handles tensors, numbers, strings, and dict/list/tuple trees thereof. For heterogeneous samples (e.g. PIL images, variable-length sequences), pass {"collate_fn": <your fn>} here.
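
For variable-length samples, a custom collate function passed through dataloader_args could look like the sketch below. It assumes each sample is a sequence of integer token ids; pad_collate and PAD_VALUE are hypothetical names, and the output layout (a dict of padded tokens and original lengths) is one possible choice rather than anything the library requires.

```python
from typing import Any, Sequence

PAD_VALUE = 0  # hypothetical padding id


def pad_collate(batch: Sequence[Sequence[int]]) -> dict[str, Any]:
    """Pad every sample in the batch to the length of the longest one."""
    lengths = [len(sample) for sample in batch]
    max_len = max(lengths, default=0)
    tokens = [
        list(sample) + [PAD_VALUE] * (max_len - len(sample))
        for sample in batch
    ]
    return {"tokens": tokens, "lengths": lengths}


# Passed as collect_metrics(..., dataloader_args=dataloader_args).
dataloader_args = {"collate_fn": pad_collate}

example_batch = pad_collate([[1, 2, 3], [4]])
```

Whatever the collate function returns is what the predictor and the metrics collectors receive as a batch, so the three pieces need to agree on that layout.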

Raises:

ValueError – If table is a DataLoader; if foreign_table_url is provided alongside a Table or TableView table; or if table is a custom map-style dataset and foreign_table_url is not provided.
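
The two URL-related error conditions can be summarized in a small sketch. This mirrors the documented rules only and is not the library's implementation: FakeTable is a stand-in for Table/TableView, resolve_foreign_url is a hypothetical helper, and the DataLoader check is left out.

```python
from typing import Any, Optional


class FakeTable:
    """Stand-in for a Table or TableView; only the url attribute is modeled."""

    def __init__(self, url: str) -> None:
        self.url = url


def resolve_foreign_url(dataset: Any, foreign_table_url: Optional[str]) -> str:
    """Return the URL to join against, or raise ValueError per the documented rules."""
    is_table = isinstance(dataset, FakeTable)
    if is_table and foreign_table_url is not None:
        # A table already carries its own URL; passing another is rejected.
        raise ValueError("foreign_table_url must not be passed with a Table or TableView")
    if not is_table and foreign_table_url is None:
        # A custom dataset has no URL to derive, so one must be supplied.
        raise ValueError("foreign_table_url is required for a custom map-style dataset")
    return dataset.url if is_table else str(foreign_table_url)


derived = resolve_foreign_url(FakeTable("path/to/table"), None)
explicit = resolve_foreign_url([10, 20, 30], "path/to/other")
```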

reset_dataloader_cache() -> None

Drop all DataLoaders cached by collect_metrics.

DataLoaders are only cached on Windows when dataloader_args includes persistent_workers=True; on other platforms, or without persistent workers, this function is a no-op (the cache is empty by construction). Each cached entry keeps its worker processes alive for the lifetime of the host process; call this function between unrelated training runs (or after switching tables) to release them. After the cache is cleared, the next collect_metrics call builds a new DataLoader.
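
The caching condition and a typical cleanup call can be sketched as follows. would_cache_dataloader is a hypothetical helper that restates the documented condition (the library's own check may differ), and release_cached_dataloaders simply wraps reset_dataloader_cache with a lazy import.

```python
import sys
from typing import Any, Optional


def would_cache_dataloader(
    dataloader_args: Optional[dict[str, Any]],
    platform: str = sys.platform,
) -> bool:
    """Restate the documented rule: Windows plus persistent_workers=True."""
    on_windows = platform == "win32"
    persistent = bool((dataloader_args or {}).get("persistent_workers", False))
    return on_windows and persistent


def release_cached_dataloaders() -> None:
    """Call between unrelated runs, or after switching tables."""
    # Imported lazily so the sketch loads without 3LC installed.
    from tlc.metrics.collect import reset_dataloader_cache

    reset_dataloader_cache()
```

Since the function is documented as a no-op when nothing is cached, calling it unconditionally between runs should be harmless on any platform.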