Integrating 3LC with SuperGradients¶

This document describes how to integrate 3LC in projects using SuperGradients, an open-source training library for computer vision models and the home of Yolo-NAS. The integration provides several classes and methods that make it easy to use 3LC with your existing SuperGradients projects.

Note

In order to use the SuperGradients integration, the super-gradients Python package must be installed in your environment. Use pip install super-gradients or equivalent.

At the time of writing, the latest release of super-gradients declares a dependency on termcolor==1.1.0, but this conflicts with termcolor>=2.2.0 declared in 3lc. A dependency resolution which respects the declared requirements of 3lc and super-gradients is therefore not possible.

In practice, super-gradients works with a higher version of termcolor. We therefore recommend installing super-gradients first, and then 3lc, in separate invocations of pip install or equivalent.

pip install super-gradients
pip install 3lc

This installs the requirements of super-gradients, and then upgrades any package for which 3lc requires a higher version, such as termcolor.

Note

The integration is only tested on Python 3.9 and 3.10, since unresolved dependency conflicts prevent installing super-gradients on other Python versions. We recommend Python 3.10 when installing SuperGradients from PyPI; for higher Python versions, clone the source code and modify the dependencies directly.

Training from Code¶

Datasets¶

In order to make a dataset compatible with 3LC and SuperGradients training code, first create a 3LC Table for each split of your dataset. Then provide your Table to a SuperGradients integration Dataset which lets you use the 3LC Table with SuperGradients training.

For Object Detection, use DetectionDataset, which offers SuperGradients’ detection dataset functionality and loads data directly from a 3LC Table. To create a Table compatible with SuperGradients object detection training, use a method such as Table.from_yolo(), Table.from_coco() or Table.from_yolo_ndjson().

import tlc
from tlc.integration.super_gradients import DetectionDataset

table = tlc.Table.from_yolo(dataset_yaml_file="path/to/dataset.yaml", split="train")

dataset = DetectionDataset(
    table=table,
    input_dim=(640, 640),
    transforms=[...],
)

For Pose Estimation, use the PoseEstimationDataset, which is designed for training pose estimation models and loads keypoint data directly from a keypoints-compatible 3LC Table. To create a keypoints-compatible Table, either use Table.from_yolo() with task="pose", as shown below, or create your own custom keypoints-compatible Table.

import tlc
from tlc.integration.super_gradients.datasets import PoseEstimationDataset

table = tlc.Table.from_yolo(
    dataset_yaml_file="path/to/dataset.yaml",
    split="train",
    task="pose",
)

pose_dataset = PoseEstimationDataset(
    table=table,
    transforms=[],
    keypoints_column="keypoints_2d",
)

Training a Model and Collecting Metrics¶

When using the Trainer abstraction in SuperGradients, provide a 3LC metrics collection callback for the task you are working on. The callbacks log the aggregate metrics returned by the SuperGradients Trainer and, by default, invoke per-sample metrics collection at the end of training for both the train and validation Tables.

For Object Detection, use the DetectionMetricsCollectionCallback, which extracts predicted bounding boxes, confidence scores, and class labels from SuperGradients detection predictions and stores them in a 3LC Metrics Table associated with the Run, which can be opened in the 3LC Dashboard.

from super_gradients.training import Trainer

from tlc.integration.super_gradients import DetectionMetricsCollectionCallback, PipelineParams

trainer = Trainer(experiment_name="my_supergradients_experiment")

pipeline_params = PipelineParams(conf=0.1, fp16=True)

metrics_collection_callback = DetectionMetricsCollectionCallback(
    project_name="my_supergradients_project",
    batch_size=32,
    pipeline_params=pipeline_params,
)

training_params = {
    ...: ...,
    "phase_callbacks": [metrics_collection_callback, ...],
}

trainer.train(
    model=...,
    training_params=training_params,
    train_loader=...,
    valid_loader=...,
)

For Pose Estimation, use the PoseEstimationMetricsCollectionCallback, which extracts predicted keypoints, confidence scores, and bounding boxes from SuperGradients pose estimation predictions and stores them in a 3LC Metrics Table associated with the Run, which can be opened in the 3LC Dashboard.

from super_gradients.training import Trainer

from tlc.integration.super_gradients import PoseEstimationMetricsCollectionCallback

trainer = Trainer(experiment_name="my_supergradients_experiment")

metrics_collection_callback = PoseEstimationMetricsCollectionCallback(
    project_name="my_supergradients_project",
    batch_size=32,
)

training_params = {
    ...: ...,
    "phase_callbacks": [metrics_collection_callback, ...],
}

trainer.train(
    model=...,
    training_params=training_params,
    train_loader=...,
    valid_loader=...,
)

For generic training, use the base MetricsCollectionCallback. This stores hyperparameters and aggregate metrics, but not per-sample metrics.

from super_gradients.training import Trainer

from tlc.integration.super_gradients import MetricsCollectionCallback, PipelineParams

trainer = Trainer(experiment_name="my_supergradients_experiment")

pipeline_params = PipelineParams(conf=0.1, fp16=True)

metrics_collection_callback = MetricsCollectionCallback(
    project_name="my_supergradients_project",
    batch_size=32,
    pipeline_params=pipeline_params,
)

training_params = {
    ...: ...,
    "phase_callbacks": [metrics_collection_callback, ...],
}

trainer.train(
    model=...,
    training_params=training_params,
    train_loader=...,
    valid_loader=...,
)

For metrics collection, a SuperGradients Pipeline is created to run inference over all the images of a Table. To customize this pipeline, provide an instance of PipelineParams to the MetricsCollectionCallback.

See the parameters of the base MetricsCollectionCallback for more ways of customizing the integration. To disable per-sample metrics collection, use the base MetricsCollectionCallback or set collect_predictions=False.
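
For example, a minimal sketch of a detection callback that only logs hyperparameters and aggregate metrics, using the collect_predictions parameter described above:

from tlc.integration.super_gradients import DetectionMetricsCollectionCallback

# Disable per-sample metrics collection at the end of training
metrics_collection_callback = DetectionMetricsCollectionCallback(
    project_name="my_supergradients_project",
    collect_predictions=False,
)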

Note

The callbacks will reuse any existing Run in the session. If no active Run exists, a new Run is created with the provided project_name and run_name. If project_name or run_name is provided and different from those of an existing active Run, a ValueError is raised.
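
For example, a minimal sketch that creates the Run up front so the callback reuses it, assuming the standard tlc.init() entry point:

import tlc

from tlc.integration.super_gradients import DetectionMetricsCollectionCallback

# Create the Run before training; the callback reuses the active Run
run = tlc.init(project_name="my_supergradients_project", run_name="my_run")

# project_name, if provided, must match the active Run
metrics_collection_callback = DetectionMetricsCollectionCallback(
    project_name="my_supergradients_project",
)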

Weighting¶

3LC supports sample weighting to emphasize important samples or exclude unwanted samples during training. To apply sample weights from a Table in SuperGradients training, use Table.create_sampler() to create a weighted sampler and pass it to your DataLoader.
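
A minimal sketch, assuming you construct the training DataLoader yourself (transforms and collate function left as placeholders):

import tlc
from torch.utils.data import DataLoader

from tlc.integration.super_gradients import DetectionDataset

table = tlc.Table.from_yolo(dataset_yaml_file="path/to/dataset.yaml", split="train")

dataset = DetectionDataset(
    table=table,
    input_dim=(640, 640),
    transforms=[...],
)

# Weighted sampler derived from the sample weights stored in the Table
sampler = table.create_sampler()

# Note: a sampler and shuffle=True are mutually exclusive in torch's DataLoader
train_loader = DataLoader(dataset, batch_size=16, sampler=sampler, collate_fn=...)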

Training from a Recipe¶

SuperGradients supports training from YAML-formatted configuration files called recipes. This approach allows you to define complete training configurations including hyperparameters, dataset, augmentation, and model architecture settings in a standardized format.

3LC integrates with SuperGradients recipes through the framework’s registration system. To use the 3LC integration datasets and callbacks described above in your recipes, they need to be registered before the configuration is loaded. Import the 3LC integration module to run the decorators that register the available datasets and callbacks in the SuperGradients registry.

To achieve this, create an entrypoint script main.py following the SuperGradients recipe pattern, importing the 3LC integration to register the datasets and callbacks:

import omegaconf
import hydra

from super_gradients import Trainer, init_trainer

# Register the 3LC integration objects in the SuperGradients registry
import tlc.integration.super_gradients

@hydra.main(config_path="recipes")
def main(cfg: omegaconf.DictConfig) -> None:
    Trainer.train_from_config(cfg)

init_trainer()
main()

Important

3LC integration objects are registered with names with the prefix 3LC to disambiguate them from objects registered by base SuperGradients. For example, MetricsCollectionCallback is registered with the name 3LCMetricsCollectionCallback.

Datasets¶

In your YAML recipe files, you can configure 3LC Tables as data sources for both training and validation by defining dataloaders that use a dataset class from the 3LC integration.

Use 3LCDetectionDataset to use the integration object detection dataset with your Tables.

dataset_params:
  train_table_url: "url/to/train_table"
  val_table_url: "url/to/val_table"
  
  # Dataset definitions
  train_dataset_params:
    table: ${dataset_params.train_table_url}
    transforms:
      ...

  val_dataset_params:
    table: ${dataset_params.val_table_url}
    transforms:
      ...
  
  train_dataloader_params:
    dataset: 3LCDetectionDataset
    ...

  val_dataloader_params:
    dataset: 3LCDetectionDataset
    ...

Use 3LCPoseEstimationDataset to use the integration pose estimation dataset with your Tables.

dataset_params:
  train_table_url: "url/to/train_table"
  val_table_url: "url/to/val_table"
  
  # Dataset definitions
  train_dataset_params:
    table: ${dataset_params.train_table_url}
    transforms:
      ...

  val_dataset_params:
    table: ${dataset_params.val_table_url}
    transforms:
      ...
  
  train_dataloader_params:
    dataset: 3LCPoseEstimationDataset
    ...

  val_dataloader_params:
    dataset: 3LCPoseEstimationDataset
    ...

Training a Model and Collecting Metrics¶

To use the callbacks to create a Run and collect metrics during training, add the appropriate 3LC callback to your training hyperparameters.

For Object Detection:

training_hyperparams:
  ...
  phase_callbacks:
  - 3LCDetectionMetricsCollectionCallback:
      project_name: "My SuperGradients Project"
      pipeline_params:
        iou: 0.5
        conf: 0.3
      ...

For Pose Estimation:

training_hyperparams:
  ...
  phase_callbacks:
  - 3LCPoseEstimationMetricsCollectionCallback:
      project_name: "My SuperGradients Project"
      pipeline_params:
        iou: 0.5
        conf: 0.3
      ...

For generic training:

training_hyperparams:
  ...
  phase_callbacks:
  - 3LCMetricsCollectionCallback:
      project_name: "My SuperGradients Project"
      ...

Run training by using the main.py script with your recipe configuration:

python main.py --config-name=my_recipe.yaml

Collecting Embeddings¶

The SuperGradients integration supports collecting embeddings for models that expose a backbone attribute, such as the Yolo-NAS family of models. To enable embeddings collection, set collect_embeddings=True in your MetricsCollectionCallback.
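
For example, a minimal sketch of a detection callback that also collects embeddings:

from tlc.integration.super_gradients import DetectionMetricsCollectionCallback

# Collect backbone embeddings in addition to per-sample metrics
metrics_collection_callback = DetectionMetricsCollectionCallback(
    project_name="my_supergradients_project",
    collect_embeddings=True,
)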

For each image, the outputs of the model backbone are extracted and flattened with 2D adaptive average pooling. The resulting high-dimensional vectors are reduced to 2D or 3D with your chosen reducer, where PaCMAP is the default. These embeddings often encode high-level information in an image. For the COCO dataset with pretrained Yolo-NAS weights, images of people playing tennis will typically be close to each other, relatively close to images with other sports, and far away from e.g. images with airplanes.

Note

To be able to reduce the collected embeddings, the selected reducer needs to be installed in your environment. To use the default reducer PaCMAP, the Python package pacmap needs to be available through pip install 3lc[pacmap], pip install pacmap or equivalent.

Additionally, dependencies of SuperGradients transitively depend on a version of coverage which is not compatible with Numba, a dependency of pacmap and umap-learn. A compatible version of coverage must therefore also be installed with pip install "coverage>6" or equivalent. Alternatively, coverage can be uninstalled, as it is not needed. If embeddings collection is enabled, the 3LC integration checks for these known incompatibilities before training starts and raises a ValueError if any are detected.

If your model does not expose a backbone attribute, a warning is raised and the run continues, but no embeddings are collected.

Collecting Metrics without Training¶

It is also possible to collect metrics and/or embeddings directly using a MetricsCollectionCallback without training a model. Call the method MetricsCollectionCallback.collect_metrics_direct(). This is useful for evaluating a trained model on one or more datasets.

from tlc.integration.super_gradients import PoseEstimationMetricsCollectionCallback, PipelineParams
from super_gradients.training import models
import tlc


model = models.get("yolo_nas_pose_l", num_classes=20, checkpoint_path="my-checkpoint.pth")

train_table = tlc.Table.from_url("train-table-url")
val_table = tlc.Table.from_url("val-table-url")

tlc_callback = PoseEstimationMetricsCollectionCallback(
    project_name="my-project",
    run_name="Collect metrics on my-model",
    pipeline_params=PipelineParams(
        iou=0.5,
        conf=0.5,
        fuse_model=True,
        skip_image_resizing=False,
        nms_top_k=100,
    ),
)

tlc_callback.collect_metrics_direct(
    model=model,
    tables=[train_table, val_table],
)

Note

Metrics collection will use the image processor from the dataset processing parameters of the model. To override the processing parameters, call model.set_dataset_processing_params(**processing_params) before training or collecting metrics, as sketched below.

Metrics collection will build a pipeline to perform the inference, using the pipeline parameters provided to the callback; see PipelineParams for more details.
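
A minimal sketch of overriding the processing parameters before collecting metrics; the specific keys are assumptions and must match what your model family accepts:

# Hypothetical processing parameters; adjust the keys to your model family
processing_params = {"conf": 0.25}
model.set_dataset_processing_params(**processing_params)

tlc_callback.collect_metrics_direct(model=model, tables=[train_table, val_table])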

Collecting Custom Metrics¶

To collect additional metrics beyond the model predictions, subclass the task-specific MetricsCollectionCallback and override the methods compute_metrics and metrics_column_schemas. Ensure the parent methods are called to retain their functionality.

The following example demonstrates how to extend metrics collection by adding a custom column to the metrics output. In this case, we add a column that records the number of predicted bounding boxes per image. Although this specific metric can already be computed in the 3LC Dashboard, the example is intended to illustrate the recommended approach for customizing metrics collection in your own callbacks.

from __future__ import annotations

from typing import Any

import tlc

from super_gradients.training.utils.predict.prediction_results import (
    ImageDetectionPrediction,
    ImagesDetectionPrediction,
)
from tlc.integration.super_gradients import DetectionMetricsCollectionCallback

class CustomDetectionMetricsCollectionCallback(DetectionMetricsCollectionCallback):
    def compute_metrics(
        self,
        images: list[str],
        predictions: ImagesDetectionPrediction | ImageDetectionPrediction,
        table: tlc.Table,
    ) -> dict[str, Any]:
        # Start from the metrics produced by the base detection callback
        metrics = super().compute_metrics(images, predictions, table)

        # Add a custom column with the number of predicted boxes per image
        if isinstance(predictions, ImagesDetectionPrediction):
            metrics["num_predicted_boxes"] = [len(predicted_boxes) for predicted_boxes in predictions]
        else:
            metrics["num_predicted_boxes"] = [len(predictions)]

        return metrics
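
To also register a schema for the custom column, the same subclass can override metrics_column_schemas. A minimal sketch, assuming the method takes no arguments and returns a mapping from column names to tlc.Schema objects, with the value type chosen for illustration:

    def metrics_column_schemas(self) -> dict[str, tlc.Schema]:
        # Extend the parent schemas with a schema for the custom column
        schemas = super().metrics_column_schemas()
        schemas["num_predicted_boxes"] = tlc.Schema(value=tlc.Int32Value())
        return schemas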