Frequently Asked Questions¶

What is 3LC?¶

3LC is a unique platform revolutionizing machine learning by delivering granular, real-time insights into model training and data interactions. Seamlessly integrating with leading ML frameworks like PyTorch, 3LC equips data scientists with unprecedented control over data diagnosis and correction.

Unique Features:

Detailed per-sample, per-epoch metric recording to illuminate how the model interacts with and learns from the data.
Real-time, interactive visual diagnosis and correction of data issues, enhancing data quality and model precision.
Innovative data management solution seamlessly merging original data with modifications, simplifying version control and avoiding data duplication and data movement.

Business Value:

Superior model performance: With enhanced data quality and insight into model-data interactions, models trained with 3LC are more accurate and reliable.
Enhanced efficiency: Real-time visual diagnosis and correction streamline the ML process, saving valuable time and reducing operational costs.
Better decision-making: Granular insight into the training process equips teams with data-driven evidence to optimize models.

3LC transforms machine learning from a traditionally opaque, “black-box” process into an open, interactive experience. It offers data scientists unparalleled control and deep insight into their models and the key model-data interactions, positioning 3LC as a groundbreaking solution in the ML landscape. By improving the fitness and correctness of datasets and enhancing the efficiency of model training, 3LC provides a substantial competitive edge in today’s data-driven business environment.

Is it spelled 3LC or `tlc`?¶

The company and product are both called 3LC. The lowercase tlc name is only used when it is required to avoid starting with a number, such as in the naming of a Python module or an environment variable.

How can I export data from a 3LC Table?¶

After successfully using 3LC to modify a dataset, you might want to export the modified data out of a 3LC Table and into a common format such as CSV or COCO. This can be achieved either by using the 3LC CLI or through the Python API.

Using the 3LC CLI

In a terminal (with the tlc Python package installed), the command line tool can be invoked as follows:

$ 3lc export path/to/table.json <output-path>

The output format will be deduced from the extension of <output-path> and the contents of the table, but can also be explicitly specified using the --format option.

Using the Table.export method

The Table.export method provides a simple interface for exporting a Table directly from a Python notebook.

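import tlc

table = tlc.Table.from_url(input_url)  # input_url: URL or path of the Table to export

table.export(output_path)  # output_path: destination file, e.g. one ending in .csv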

The export method will deduce the output format from the extension of output_path and the contents of the table.
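Because the format is deduced from the file extension, it can help to check that the output path has one before exporting. The following is a minimal sketch, not part of 3LC: require_extension and export_table are hypothetical helper names, and whether 3LC itself rejects a path without an extension is not assumed here. Only Table.from_url and Table.export come from the text above.

```python
from pathlib import Path


def require_extension(output_path: str) -> str:
    """Return the lowercase file extension of output_path, or raise if there is none."""
    suffix = Path(output_path).suffix.lower()
    if not suffix:
        raise ValueError(f"{output_path!r} has no file extension to deduce a format from")
    return suffix


def export_table(table_url: str, output_path: str) -> None:
    """Load a Table and export it, failing early on an output path without an extension."""
    require_extension(output_path)  # hypothetical pre-flight check, not a 3LC feature
    import tlc  # imported lazily so require_extension works without tlc installed

    table = tlc.Table.from_url(table_url)
    table.export(output_path)
```

With this in place, a call such as export_table("path/to/table.json", "out/data.csv") would load the Table and export it, while an output path like "out/data" fails before any work is done.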

How do I run the Dashboard and the Object Service on different machines?¶

Please see the Object Service Deployment Guide for several solutions to running the Dashboard and Object Service on different machines.

How do I create a 3LC Table from a Pandas DataFrame?¶

The Table.from_pandas method can be used to create a 3LC Table from a Pandas DataFrame.

import pandas as pd
import tlc

df = pd.read_csv("path/to/data.csv")

table = tlc.Table.from_pandas(df, table_name="my_table", dataset_name="my_dataset", project_name="my_project")

Why are my image transforms not applied in the Dashboard?¶

If you have created a 3LC Table using Table.from_torch_dataset on a TorchVision VisionDataset which has transforms applied to it, you might notice that the transforms are not applied when viewing the data in the Dashboard. Because the transforms of a VisionDataset might contain augmentations, or a conversion from a PIL.Image.Image to a torch.Tensor, 3LC needs to persist the untransformed samples. These are the images which are shown in the Dashboard.

The transforms will still be applied as expected when getting samples from the Table object in your code, and will still be applied when training your model. If you want to see the transformed images in the Dashboard, you can explicitly transform the samples in the __getitem__ method of your Dataset class, instead of using the transforms argument of the VisionDataset class, before calling Table.from_torch_dataset. If you are non-deterministically augmenting your samples, or converting PIL images to tensors, these transforms still need to be added through the transforms argument, and not in the __getitem__ method.
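Moving a deterministic transform into __getitem__, as described above, might look roughly like the following sketch. PreTransformedDataset and deterministic_transform are hypothetical names, not part of 3LC or TorchVision, and the fallback to object exists only so the sketch runs where torch is not installed.

```python
try:
    from torch.utils.data import Dataset  # use the real base class when torch is available
except ImportError:  # fall back so the sketch still runs without torch
    Dataset = object


class PreTransformedDataset(Dataset):
    """Wraps (image, label) samples and applies a deterministic transform in __getitem__."""

    def __init__(self, samples, deterministic_transform):
        self.samples = samples  # any indexable collection of (image, label) pairs
        self.deterministic_transform = deterministic_transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        image, label = self.samples[index]
        # The transform runs here, so the transformed image is what the sample contains.
        return self.deterministic_transform(image), label
```

For PIL images, deterministic_transform could be something like lambda img: img.resize((224, 224)). Random augmentations and tensor conversion would stay in the transforms argument, as the paragraph above explains.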

How can I add an extra column to my 3LC Table?¶

Adding an extra column to a 3LC Table can be done when creating the Table by using the extra_columns argument of constructor methods like Table.from_pandas, Table.from_dict or Table.from_torch_dataset.

import tlc

my_torch_dataset = MyTorchDataset()  # your own torch Dataset class
table = tlc.Table.from_torch_dataset(
    my_torch_dataset,
    ...,
    extra_columns={
        # Categorical "QA" column with four possible values
        "QA": tlc.CategoricalLabel("QA", ["Not required", "Required", "In progress", "Done"])
    },
)

The extra_columns argument should be a dictionary where the keys are the names of the extra columns, and the values describe the type of the column, either with a Schema or a SampleType. The column will initially be populated by a default value, and will only be visible in the Dashboard, not during training.

How do I collect metrics in Python with my data and model?¶

Collecting metrics from your data and models can sometimes be tricky. If you encounter any issues, make sure to check the relevant sections of the user guide for guidance:

Register Datasets explains how to create a tlc.Table from your data.
Collect Metrics explains how to collect metrics from your table and model.
Combining Data and Models in 3LC goes into detail about how the tlc.Predictor wraps the model, and how it calls the forward method of the model using data from the tlc.Table.

Why can I not see my images in the Dashboard?¶

If your images are not showing up in the Dashboard, the images might be in a format that 3LC does not recognize.

All images displayed in the Dashboard are stored as URLs (paths) in the Table. If your dataset contains PIL.Image objects, 3LC will automatically convert these to URLs when creating the Table. If your images are in any other format and you want to view them in the Dashboard, you have two options:

Include the path to the images in the samples of your dataset when creating the Table. If you are using a custom PyTorch Dataset, you can include the paths in the return value of the __getitem__ method of your Dataset class. Make sure to specify that these strings are paths to images by, for instance, using the ImagePath SampleType.
Let 3LC handle the conversion of the images to URLs by ensuring that the images in your samples are represented as PIL.Image objects. If your images are in a different format, you can convert them to PIL.Image objects before creating the Table. Note that if the PIL.Image objects are not created from a file, this might involve saving copies of the images the first time you create the Table.
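The first option, returning image paths from __getitem__, could be sketched as follows. ImagePathDataset, IMAGE_EXTENSIONS and the one-subfolder-per-class layout are assumptions made for illustration only. When creating the Table you would still need to mark the first element of each sample as an image path, for instance with the ImagePath SampleType mentioned above.

```python
from pathlib import Path

try:
    from torch.utils.data import Dataset  # use the real base class when torch is available
except ImportError:  # fall back so the sketch still runs without torch
    Dataset = object

# Hypothetical set of extensions treated as images in this sketch.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


class ImagePathDataset(Dataset):
    """Yields (image_path, class_index) pairs from a root folder with one subfolder per class."""

    def __init__(self, root):
        root = Path(root)
        self.classes = sorted(d.name for d in root.iterdir() if d.is_dir())
        self.samples = [
            (str(path), self.classes.index(class_name))
            for class_name in self.classes
            for path in sorted((root / class_name).iterdir())
            if path.suffix.lower() in IMAGE_EXTENSIONS
        ]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        # Return the path as a string instead of a decoded image.
        return self.samples[index]
```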

How do I manage torch and torchvision dependencies alongside 3LC?¶

3LC is tightly integrated with, and therefore depends on, PyTorch and torchvision. For 3lc < 2.13, it is the user’s responsibility to install torch and torchvision. As of 3lc 2.13, 3lc declares torch and torchvision as required dependencies.

Therefore, if torch and torchvision are not present in your environment when you install 3lc >= 2.13, they will be pulled from PyPI, which at the time of writing provides:

CPU-only wheels for Windows and macOS
GPU-accelerated wheels on Linux

If you install a nonstandard torch/torchvision build after installing 3lc >= 2.13, you must do one of the following:

Either declare the full version number and include any accelerator-specific local version specifier, for example pip install torch==2.6.0+cu126 --index-url https://download.pytorch.org/whl/cu126.
Or use the --force-reinstall flag to ensure the package is installed from the specified index, which replaces the wheel previously installed from PyPI.

If torch and torchvision are already present in your environment when you install 3lc >= 2.13, or are installed with the same call to pip install, any torch-specific index URL and accelerator-specific local version specifier are respected.

For example, pip install torch torchvision --index-url https://download.pytorch.org/whl/cu126 followed by pip install 3lc will result in cu126 wheels.
Similarly, pip install 3lc torch torchvision --index-url https://download.pytorch.org/whl/cu126 will install cu126 wheels.

We therefore recommend installing torch and torchvision before 3lc, or as part of the same invocation of pip install. If you are using 3lc < 2.13, the order does not matter, but torch and torchvision must be installed manually in order to use 3LC.
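To see which torch build ended up in an environment, one option is to read the installed version string and look for a local version specifier such as cu126. This is a minimal sketch using only the standard library. local_version_tag and installed_torch_build are hypothetical helper names. Note that a missing tag does not by itself mean a CPU-only build, since wheels from PyPI carry no such tag.

```python
from importlib.metadata import PackageNotFoundError, version


def local_version_tag(version_string):
    """Return the local version specifier after '+', e.g. 'cu126', or None if absent."""
    _, plus, local = version_string.partition("+")
    return local if plus else None


def installed_torch_build():
    """Return (version, local_tag) for the installed torch, or None if torch is missing."""
    try:
        torch_version = version("torch")
    except PackageNotFoundError:
        return None
    return torch_version, local_version_tag(torch_version)


if __name__ == "__main__":
    print(installed_torch_build())
```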

Do I need to upload my data to 3LC?¶

No. All data viewed, created and modified in the 3LC Dashboard interfaces with your own 3lc service, and is never sent anywhere else. 3LC only requires an internet connection for authentication and for fetching the Dashboard UI from our servers. If your deployment must run without any connection to the public internet, you can still use the customer-managed version of 3LC.

What should be the `project_name`, `dataset_name` and `table_name` of my 3LC Table?¶

The project_name, dataset_name and table_name of a 3LC Table are used to define the folders in which the Table is located, and thus uniquely identify the Table. These names have no functional purpose outside of helping you organize your data and making related data easier to share. That being said, we recommend the following naming convention:

project_name: A short name describing the goal of the project, e.g. “CatAndDogClassification”.
dataset_name: A name describing one distinct part of all the data associated with the project, e.g. “train”, “val” or “test”.
table_name: A name describing the unique changes introduced in this specific table revision. This can typically be “initial” for the first Table created for a dataset, or a description of some changes made either in the 3LC Dashboard or in the Python API, e.g. “changed_cat_weights”.
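One way to keep the three names consistent across revisions is to hold them in a small value object and pass them on as keyword arguments. The sketch below is only an illustration of the convention above. TableNames, revision and as_kwargs are hypothetical names and not part of the tlc package.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class TableNames:
    """Holds the three names that locate a 3LC Table."""

    project_name: str
    dataset_name: str
    table_name: str = "initial"  # suggested name for the first Table of a dataset

    def revision(self, table_name):
        """Return names for a new revision in the same project and dataset."""
        return replace(self, table_name=table_name)

    def as_kwargs(self):
        """Return the names as keyword arguments for a Table constructor method."""
        return asdict(self)


train = TableNames("CatAndDogClassification", "train")
reweighted = train.revision("changed_cat_weights")
```

The keyword arguments could then be passed on to a constructor method, for example tlc.Table.from_pandas(df, **train.as_kwargs()).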
