3LC Version 2.2 Release Notes#

3LC Version 2.2.6#

  1. May 2024

Enhancements and Fixes#

tlc Python Package#

  • Made fixes to telemetry reporting to report the correct environment, to include more useful information, and to send data on normal shutdown

  • Changed example notebooks to use PaCMAP instead of UMAP for embeddings


3LC Version 2.2.5#

  1. April 2024

Enhancements and Fixes#

tlc Python Package#

  • Fixed an issue with Hugging Face metrics collection

  • Changed the default reshape_strategy for EmbeddingsMetricsCollector from “flatten” to “mean”, since “flatten” can fail if inputs do not have the same shape across the batches

  • Fixed “mean” reshape_strategy for EmbeddingsMetricsCollector when the embeddings shape is 2D
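
The difference between the two strategies can be illustrated with a minimal pure-Python sketch. It does not use tlc: the helpers flatten_embedding and mean_embedding are hypothetical, and the sketch assumes each embedding is a list of channels that each hold a variable number of spatial values.

```python
from statistics import fmean

def flatten_embedding(embedding):
    """Concatenate all channel values into one vector (length depends on input size)."""
    return [value for channel in embedding for value in channel]

def mean_embedding(embedding):
    """Average each channel, giving one value per channel regardless of input size."""
    return [fmean(channel) for channel in embedding]

# Two samples with the same number of channels (2) but different spatial sizes.
small = [[1.0, 3.0], [2.0, 4.0]]
large = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

# Flattened vectors differ in length, so they cannot be stacked into one array.
flat_lengths = {len(flatten_embedding(e)) for e in (small, large)}
# Averaged vectors always have one entry per channel.
mean_lengths = {len(mean_embedding(e)) for e in (small, large)}
```

Under these assumptions, flattening produces vectors of different lengths for the two samples, while averaging always yields one value per channel.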


3LC Version 2.2.4#

  1. April 2024

Enhancements and Fixes#

tlc Python Package#

  • Numerous improvements to schema, including PyArrow type handling and number roles for vectors

  • Greatly improved the performance of accessing Table rows, including when requested through the object service

  • Removed storage of references to input/foreign table dataset name, deriving display names directly from URL instead

    • NOTE: This fix does not retroactively fix foreign table display names in existing tables

  • Added a status attribute to Run and logic to update it internally and across the various integrations

  • Updated Run to have a more complete API, including add_metrics_table and set_status methods, along with tlc.set_active_run

  • Added Run and Table from_name methods

  • Made metrics collection create a Run if one doesn’t already exist

  • Added description to Table

  • Updated validation of samples against schemas and sample types (such as in TableWriter) to provide specific feedback on mismatches

  • Refactored and renamed MetricsWriter to MetricsTableWriter and made it inherit from TableWriter

  • Updated collect_metrics and metrics_collector APIs to ensure broader compatibility with various PyTorch models and to enhance the flexibility of the metrics-collection API:

    • Unified Model Handling: collect_metrics now accepts a predictor as its first keyword argument. If a standard PyTorch model is passed, it is automatically wrapped in a Predictor object for batch prediction processing. Users can also pass an already instantiated Predictor object directly (which can be constructed by passing suitable PredictorArgs).

    • Decoupled Metrics Collectors and Model: Metrics collectors no longer accept models as a constructor argument, and are instead passed a well-defined model output object when called (PredictorOutput).

    • Enhanced Preprocessing Capabilities: Metrics collectors now have the ability to pre-process inputs before metrics-computation, allowing for direct manipulation and preparation of both the data batch and the model output before metrics calculation. Built-in metrics collectors may provide default pre-processing functions which are suitable in most cases, and these can easily be overridden when necessary.

    • Flexible Metrics Collector Specification: The metrics_collector argument to collect_metrics has been expanded to support a more diverse range of input types. This includes the ability to specify a single metrics collector, a list of multiple metrics collectors, or callable functions with the signature (batch: Any, output: PredictorOutput)-> Dict[str, Any].

    • These changes aim to provide a more robust and adaptable framework for metrics collection, facilitating easier integration and customization to meet the needs of different PyTorch modeling workflows.

    • Example

      import tlc
      # A callable metrics collector receives the batch and the model output
      # and returns a dictionary of metrics
      def callable_metrics_collector(batch, output):
          return {"accuracy": [...]}
      layers = ... # Placeholder: the layers to collect embeddings from
      metrics_collectors = [
          tlc.BoundingBoxMetricsCollector(),
          tlc.EmbeddingsMetricsCollector(layers),
          callable_metrics_collector,
      ]
      table = ... # Your tlc.Table object
      model = ... # Your PyTorch model
      # Optionally wrap the model in a Predictor object
      predictor = tlc.Predictor(model, preprocess_fn=..., layers=...)
      tlc.collect_metrics(table, metrics_collectors, predictor=predictor)
      
  • Added a decorator for using 3LC with PyTorch Lightning modules and updated the SegFormer example notebook to use it

  • Improved the Hugging Face TLC trainer and added options for when to collect metrics

  • Added a full-screen text user interface (TUI) for the object service; it is used by default and can be disabled with --no-tui

  • Added support for accessing data on Azure Blob Storage using “abfs://” URLs

  • Added an LRU cache to the object service for external data requests (e.g. images) to improve performance by avoiding re-requesting data; the default memory size is 1 GB and the default timeout is 1 hour

  • Fixed object service so that requests for external data from the dashboard are serviced asynchronously so they do not block one another

  • Numerous improvements and additions to the public example notebooks, including more notebooks, a README.md, and making it possible to run them directly from a GitHub checkout using test data in the repo
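
The callable form of a metrics collector described in the collect_metrics notes above can be sketched as follows. This sketch does not import tlc: FakeOutput is a hypothetical stand-in for PredictorOutput, and both its forward attribute and the (inputs, labels) batch layout are assumptions made for illustration.

```python
from typing import Any, Dict

class FakeOutput:
    """Hypothetical stand-in for PredictorOutput; `forward` holds per-sample class scores."""

    def __init__(self, forward):
        self.forward = forward

def accuracy_collector(batch: Any, output: FakeOutput) -> Dict[str, Any]:
    """Return one metric value per sample, keyed by metric name."""
    _, labels = batch  # assumed batch layout: (inputs, labels)
    # Pick the index of the highest score as the predicted class.
    predicted = [scores.index(max(scores)) for scores in output.forward]
    return {
        "predicted": predicted,
        "accuracy": [float(p == t) for p, t in zip(predicted, labels)],
    }

batch = (["img0", "img1", "img2"], [0, 1, 1])
output = FakeOutput(forward=[[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]])
metrics = accuracy_collector(batch, output)
```

A function of this shape could be placed in the list of metrics collectors next to the built-in ones, since it only depends on the batch and the model output.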

Dashboard#

  • Reorganized navigation components to make them more intuitive:

    • The active Project is shown in the top-left

    • Tabs for selecting Runs and Tables are provided just underneath the active Project component. Clicking on each one shows the Runs or Tables for the active Project, allowing selection of one or more to be loaded.

    • A tab to the right of the Runs and Tables navigation tab shows a summary of selected Runs and Tables. Clicking on this tab allows for the following:

      • Deselecting individual Runs and Tables

      • Showing details for the selected Runs and Tables

      • Showing hyperparameters for the selected Runs

  • Reorganized toolbars to simplify and group things more logically

    • Changed 3LC icon in top-left to be a menu, providing access to settings

    • Replaced connection status string with icon

    • Moved pending edits under new toolbar icon, where they can be discarded or committed with a commit message

    • Added info icon to top-right; moved options for documentation, feedback, and the about window there

    • Removed toolbar buttons to show/hide panels (filters, charts, rows); added arrows to collapse and expand them instead

  • Added a new Workflows panel, along with a number of specific workflows

  • Made it possible to add a new value to a value map (e.g. a new label category) after dataset creation

  • Made it possible to save edited tables when the original table is not in a writable location; a new location under the project root directory is used

  • Made the Dashboard remember the selected Project between sessions

  • Added support for query parameters to open a Table/Run

  • Added status for Run objects

  • Made several fixes to bounding box rendering and editing

  • Made it so a consistent color is used for each label value wherever it is shown

  • Fixed an issue with coloring by boolean properties, such as FN/FP/TP

  • Made fixes to make text easier to read in several contexts

  • Made traversal index operation deterministic

  • Implemented a number of optimizations, including operations for IoU, BB overlap, BB FN/FP/TP

  • Stopped invalidating virtual columns whose inputs are unchanged

  • Optimized tweaking of virtual columns

  • In the filter panel, made it possible to sort and filter for properties with many enum values (e.g. labels)

  • Made it possible to filter on edited rows

  • Made it so Ctrl+Left/Right arrow keys can be used to navigate between samples in the rows panel

  • Made it possible to drag a column to define R, G, B, and radius axes in charts

  • Made it so charts prefer ‘Label’ over ‘Predicted Label’ for color when available

  • Made fixes to supersampling in charts

  • Made improvements to mouse handling in charts

  • Added a reset button to charts to remove filtering done by lasso tool, etc.

  • Made it so clicking on the next/previous arrows in a chart, which allows navigation between samples, also scrolls the rows panel to the selected sample

  • Fixed an issue where Command-C did not work as expected on Mac
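
For readers unfamiliar with the IoU (intersection over union) operation mentioned among the optimizations above, here is a minimal sketch of the standard computation. It is not the Dashboard's implementation; the function name and the (x0, y0, x1, y1) corner format are assumptions chosen for the example.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b
    # Width and height of the overlap, clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    intersection = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - intersection
    return intersection / union if union > 0 else 0.0

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))
```

Two 2x2 boxes that overlap in a 1x1 square share an area of 1 out of a combined area of 7.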


3LC Version 2.2.3#

  1. March 2024

Enhancements and Fixes#

tlc Python Package#

  • Made deriving project and dataset portions of URLs for Runs and Tables more robust

  • Added serialization version number to 3LC files to allow for compatibility checks

  • Improved support for using the object service with ngrok

  • Added public SegFormer example demonstrating integration with PyTorch Lightning
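
A serialization version number makes a compatibility check possible at load time. The following is a generic sketch of such a check, not the format 3LC uses: the field name serialization_version, the constant SUPPORTED_VERSION, and the JSON layout are all hypothetical.

```python
import json

SUPPORTED_VERSION = 2  # hypothetical: newest serialization version this reader understands

def check_compatibility(document: str) -> dict:
    """Parse a JSON document and reject it if it was written in a newer format version."""
    data = json.loads(document)
    version = data.get("serialization_version", 0)  # hypothetical field name
    if version > SUPPORTED_VERSION:
        raise ValueError(
            f"File uses serialization version {version}, "
            f"but only versions up to {SUPPORTED_VERSION} are supported"
        )
    return data

loaded = check_compatibility('{"serialization_version": 1, "name": "my-table"}')
```

Failing early with a clear message is usually preferable to misreading a file written by a newer release.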

Dashboard#

  • Improved support for using the Dashboard with ngrok

  • Significantly improved performance of TP/FP/FN operations

  • Added keyboard shortcut Space to toggle camera/paint


3LC Version 2.2.2#

  1. March 2024

Enhancements and Fixes#

tlc Python Package#

  • Made Table row access immutable to prevent unsupported modifications to the in-memory representation of the Table, which could then get cached to disk.

  • Significantly improved performance of TableFromCoco

  • Made it so exporters write unaliased URLs

  • Made Warning the default log level

  • Mapped thread ID for log messages to lower numbers to avoid errors in the Dashboard
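
The idea behind immutable row access can be illustrated with a small standard-library sketch. ReadOnlyRows is a hypothetical toy container, not the tlc Table class, and the way tlc enforces immutability internally may differ.

```python
import copy
from types import MappingProxyType

class ReadOnlyRows:
    """Toy container that hands out read-only views of its rows."""

    def __init__(self, rows):
        self._rows = [dict(row) for row in rows]

    def __getitem__(self, index):
        # MappingProxyType blocks item assignment on the returned view.
        return MappingProxyType(self._rows[index])

rows = ReadOnlyRows([{"image": "a.png", "label": 0}])
row = rows[0]

try:
    row["label"] = 1
    modified = True
except TypeError:
    modified = False

# To change values, work on a private copy instead of the stored row.
editable = copy.deepcopy(dict(row))
editable["label"] = 1
```

Note that a mapping proxy only protects the top level of a row; nested mutable values would need their own protection.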

Dashboard#

  • Added support for rendering image masks

    • Added support for remapping image RGB values from a value map

    • Added support for rendering multiple images on top of each other

  • Introduced new operations to cluster by distance threshold and check if a sample is the primary element in a cluster, which can be used to e.g. cluster samples with identical images then ignore all but one in subsequent training.

  • Fixed an issue where charts did not show the correct data after creating a subset table

  • Reintroduced use of cookies to specify object service URL
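
The clustering workflow described above (cluster by distance threshold, then keep only the primary element of each cluster) can be sketched as follows. This is an illustrative single-linkage implementation, not the Dashboard's algorithm; the function name and the rule that the lowest index in a cluster is its primary element are assumptions.

```python
import math

def cluster_by_threshold(points, threshold):
    """Group points whose distance is at most `threshold` (single linkage, union-find)."""
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= threshold:
                root_i, root_j = find(i), find(j)
                # Keep the smaller index as root so it becomes the primary element.
                parent[max(root_i, root_j)] = min(root_i, root_j)

    cluster_ids = [find(i) for i in range(len(points))]
    is_primary = [cluster_ids[i] == i for i in range(len(points))]
    return cluster_ids, is_primary

# Three identical points and one distinct point; a threshold of 0 groups exact duplicates.
points = [(0.0, 0.0), (0.0, 0.0), (5.0, 5.0), (0.0, 0.0)]
cluster_ids, is_primary = cluster_by_threshold(points, threshold=0.0)
```

Filtering on is_primary keeps one representative per cluster, which corresponds to ignoring duplicate samples in subsequent training.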


3LC Version 2.2.1#

  1. February 2024

Enhancements and Fixes#

tlc Python Package#

  • Renamed predicted bounding box labels from “label_predicted” to “label”. This fixes a bug in the Dashboard where adding or assigning a prediction wrote no label for the new box, which in turn broke the visualization of all boxes once such a change was committed.

    • NOTE: This fix does not retroactively fix existing tables with the old “label_predicted” bounding box labels.

  • Updated Hugging Face datasets integration

    • Renamed the Hugging Face integration from tlc.integration.huggingface to tlc.integration.hugging_face

    • Added Table.from_hugging_face that is similar to the other Table.from_* methods

    • Removed load_dataset; Table.from_hugging_face should be used instead

  • Made various fixes and additions to sample types and schemas

  • Made various improvements to type hinting and removed many uses of Any, which disabled type checking
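
Code that must run against releases from both before and after the Hugging Face module rename could resolve the module name at import time. The helper below is a hypothetical sketch using only the standard library; the function name is not part of tlc.

```python
import importlib

def import_first_available(module_names):
    """Return the first module in `module_names` that can be imported."""
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError(f"None of these modules could be imported: {module_names}")

# Intended use (requires tlc to be installed), trying the new name first:
# hf = import_first_available(
#     ["tlc.integration.hugging_face", "tlc.integration.huggingface"]
# )
```

Once only version 2.2.1 or newer needs to be supported, a plain import of tlc.integration.hugging_face is simpler.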

Dashboard#

  • Made Run constants visible by default

  • Made it so that a default column is chosen for the color property when a chart is created without specifying which column should provide the color. In most cases, this means that a label column will be chosen if one is available.

  • Made it possible to provide object_service URL as a URL parameter to specify the object service to use for the current session
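
A Dashboard link carrying the object_service URL parameter can be built with the standard library, as in the sketch below. Both URLs are placeholders chosen for illustration, and dashboard_url is a hypothetical helper; only the parameter name object_service comes from the note above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

def dashboard_url(base_url: str, object_service_url: str) -> str:
    """Append an object_service query parameter to a Dashboard URL."""
    return f"{base_url}?{urlencode({'object_service': object_service_url})}"

# Placeholder URLs: substitute your own Dashboard and object service addresses.
url = dashboard_url("https://dashboard.example.com", "http://localhost:5015")

# Decoding the query string recovers the original object service URL.
decoded = parse_qs(urlsplit(url).query)["object_service"][0]
```

Using urlencode ensures that characters such as ":" and "/" in the object service URL are percent-encoded correctly.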


3LC Version 2.2.0#

  1. February 2024

We proudly present 3LC Version 2.2!

This release is for early adopters of 3LC, to be deployed in their own environment. It makes a significant change in the way that 3LC data is organized, including the introduction of a Project concept and the packaging of 3LC objects into self-contained folders, which makes them easier to share and perform other operations on. The Dashboard now presents projects on its start screen and shows per-Run charts for the selected project.

Enhancements and Fixes#

tlc Python Package#

  • Changed 3LC data organization to have project at the top-level, with settings for project root and project scan URLs instead of the previous settings for tables and runs. Projects contain runs and datasets, runs contain metrics, and datasets contain tables.

  • Made it so that 3LC objects (Tables, Runs, etc.) are packaged into self-contained folders, which makes them easier to share and perform other operations on

  • Added support for deleting indexed Runs and Tables

  • Added support for copying and renaming indexed Runs and Tables

  • Made it possible to add/remove/modify value maps for Tables

  • Enhanced samplers so that samples with zero weight can optionally be excluded during sampling and/or when collecting metrics

  • Refactored various parts of the API to be more flexible, usable, and consistent, including tlc.init, Table.from_*, and TableWriter
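
The effect of optionally excluding zero-weight samples, as described for samplers above, can be sketched in plain Python. The function name and the exclude_zero_weights parameter are hypothetical and do not reflect the tlc sampler API.

```python
def select_indices(weights, exclude_zero_weights=True):
    """Return row indices in order, optionally skipping rows whose weight is zero."""
    return [
        index
        for index, weight in enumerate(weights)
        if not (exclude_zero_weights and weight == 0)
    ]

weights = [1.0, 0.0, 0.5, 0.0, 2.0]

# For example: skip zero-weight rows during training, keep them for metrics collection.
training_indices = select_indices(weights, exclude_zero_weights=True)
metrics_indices = select_indices(weights, exclude_zero_weights=False)
```

Keeping the choice separate for sampling and for metrics collection lets zero-weight rows remain visible in metrics while being ignored during training, or the other way around.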

Dashboard#

  • Updated start screen to allow user to select a project

  • Made it so that per-Run graphs are shown for the selected project

  • Made it possible to delete and rename Runs

  • Made it so that unfiltered results are shown behind filtered results when doing on-the-fly reduction in a chart

  • Made a number of optimizations, including to the editing of large Tables and to chart filtering/reduction

  • Made it so that the previous spherical point rendering is now opt-in via a setting and the default point rendering is simpler and faster, improving rendering framerate for charts with many points

Documentation#

The 3LC documentation is now publicly available at docs.3lc.ai

Availability#

This release is provided to enterprise clients, who have been given access to install it from our private CloudRepo Python repository. Clients will be able to locally install Python wheels which contain the notebook API, the 3LC Dashboard, and documentation from Python packages.

In addition to the credentials to access the private CloudRepo, clients will also need a license key in order to use the software.

Supported Platforms#

  • Python 3.8 - Python 3.11

    • Both Conda and “vanilla” Python environments should work

  • Microsoft Windows 10 and 11 (x86)

  • macOS 13 and newer (M series)

  • Ubuntu 20.04 (x86-64) is our supported Linux platform

    • Most other glibc-based Linux distributions are expected to work, but these are untested and unsupported.

  • Chrome and Edge web browsers, with GPU acceleration enabled

Known Issues#

  • Tables for dataset revisions are currently always stored at a location next to the input table, and it is not possible to override that behavior. This means, for example, that writing a dataset revision with an input table stored in a read-only location (on disk, in cloud storage, etc.) is not supported.

  • The Table object in the tlc Python API is designed to represent immutable columnar data, but it currently returns objects by reference when iterating or indexing. Consequently, it is possible to modify the in-memory representation of the Table, which could then get cached to disk. In general, users of the API should not make such modifications.

  • The tlc Python package does not detect, handle, or support NaN (Not-a-Number) values in tlc.Table, and their presence may lead to unpredictable behavior or inconsistencies within the system.

  • The order of columns in the Dashboard filter panel does not follow the order in the tables panel, where the columns are more logically ordered. This will be addressed in an upcoming release.

  • When renaming a Run in the Dashboard, the old name is sometimes still shown alongside the new name in the Table panel for a short time.
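
Because NaN values are not detected by the tlc package, it may be worth checking data before writing it to a Table. The following sketch scans rows represented as plain dictionaries; find_nan_cells is a hypothetical helper, and the row layout is an assumption.

```python
import math

def find_nan_cells(rows):
    """Return (row_index, column_name) pairs for every float NaN found in `rows`."""
    return [
        (index, column)
        for index, row in enumerate(rows)
        for column, value in row.items()
        if isinstance(value, float) and math.isnan(value)
    ]

rows = [
    {"loss": 0.25, "label": 1},
    {"loss": float("nan"), "label": 0},
]
nan_cells = find_nan_cells(rows)
```

This sketch only inspects top-level float values; nested lists or arrays would need a recursive or vectorized check.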