3LC Version 2.2 Release Notes#
3LC Version 2.2.6#
May 2024
Enhancements and Fixes#
tlc Python Package#
Made fixes to telemetry reporting to report the correct environment, to include more useful information, and to send data on normal shutdown
Changed example notebooks to use PaCMAP instead of UMAP for embeddings
3LC Version 2.2.5#
April 2024
Enhancements and Fixes#
tlc Python Package#
Fixed an issue with Hugging Face metrics collection
Changed the default reshape_strategy for EmbeddingsMetricsCollector from “flatten” to “mean”, since “flatten” can fail if inputs do not have the same shape across batches
Fixed the “mean” reshape_strategy for EmbeddingsMetricsCollector when the embeddings shape is 2D
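The difference between the two reshape strategies can be illustrated with a minimal pure-Python sketch. This is not the EmbeddingsMetricsCollector implementation: it assumes a layer activation of shape (channels, height, width) stored as nested lists, and the helper names `flatten`, `mean_over_spatial`, and `make_activation` are hypothetical.

```python
from statistics import fmean


def flatten(activation):
    """Concatenate every value of a (C, H, W) activation into one vector."""
    return [v for channel in activation for row in channel for v in row]


def mean_over_spatial(activation):
    """Average each channel over its spatial positions, giving a length-C vector."""
    return [fmean(v for row in channel for v in row) for channel in activation]


def make_activation(channels, height, width, value=1.0):
    """Build a constant (C, H, W) activation as nested lists (illustration only)."""
    return [[[value] * width for _ in range(height)] for _ in range(channels)]


# Two inputs with different spatial sizes, as can happen across batches.
small = make_activation(channels=3, height=2, width=2)
large = make_activation(channels=3, height=4, width=5)

flat_lengths = (len(flatten(small)), len(flatten(large)))
mean_lengths = (len(mean_over_spatial(small)), len(mean_over_spatial(large)))

print(flat_lengths)  # the flattened vectors have different lengths
print(mean_lengths)  # the averaged vectors have the same length
```

Under these assumptions, flattened embeddings cannot be stacked into a single column when input shapes vary, while averaged embeddings always have one value per channel.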
3LC Version 2.2.4#
April 2024
Enhancements and Fixes#
tlc Python Package#
Numerous improvements to schema, including PyArrow type handling and number roles for vectors
Greatly improved the performance of accessing Table rows, including when requested through the object service
Removed storage of references to input/foreign table dataset name, deriving display names directly from URL instead
NOTE: This fix does not retroactively fix foreign table display names in existing tables
Added a status attribute to Run and logic to update it internally and across the various integrations
Updated Run to have a more complete API, including add_metrics_table and set_status methods, along with tlc.set_active_run
Added Run and Table from_name methods
A Run is now created if one doesn’t exist when collecting metrics
Added description to Table
Updated validation of samples against schemas and sample types (such as in TableWriter) to provide specific feedback on mismatches
Refactored and renamed MetricsWriter to MetricsTableWriter and made it inherit from TableWriter
Updated collect_metrics and metrics_collector APIs to ensure broader compatibility with various PyTorch models and to enhance the flexibility of the metrics-collection API:
Unified Model Handling: The function now accepts a predictor as its first keyword argument. If this is a standard PyTorch model, it is automatically wrapped in a Predictor object for batch prediction processing. Users can also pass an already instantiated Predictor object directly (which can be constructed by passing suitable PredictorArgs).
Decoupled Metrics Collectors and Model: Metrics collectors no longer accept models as a constructor argument, and are instead passed a well-defined model output object (PredictorOutput) when called.
Enhanced Preprocessing Capabilities: Metrics collectors can now pre-process inputs before metrics computation, allowing direct manipulation and preparation of both the data batch and the model output. Built-in metrics collectors may provide default pre-processing functions which are suitable in most cases, and these can easily be overridden when necessary.
Flexible Metrics Collector Specification: The metrics_collector argument to collect_metrics has been expanded to support a more diverse range of input types. This includes the ability to specify a single metrics collector, a list of multiple metrics collectors, or callable functions with the signature (batch: Any, output: PredictorOutput) -> Dict[str, Any].
These changes aim to provide a more robust and adaptable framework for metrics collection, facilitating easier integration and customization to meet the needs of different PyTorch modeling workflows.
Example

    import tlc

    def callable_metrics_collector(batch, output):
        return {"accuracy": [...]}

    layers = ...  # Layers to collect embeddings from

    metric_collectors = [
        tlc.BoundingBoxMetricsCollector(),
        tlc.EmbeddingsMetricsCollector(layers),
        callable_metrics_collector,
    ]

    table = ...  # Your tlc.Table object
    model = ...  # Your PyTorch model

    # Optionally wrap the model in a Predictor object
    model = tlc.Predictor(model, preprocess_fn=..., layers=...)

    tlc.collect_metrics(table, metric_collectors, predictor=model)

Added a decorator for using 3LC with PyTorch Lightning modules and updated the SegFormer example notebook to use it
Improved the Hugging Face TLC trainer and added options for when to collect metrics
Added a full-screen text user interface (TUI) for the object service; it is used by default and can be disabled with --no-tui
Added support for accessing data on Azure Blob Storage using “abfs://” URLs
Added an LRU cache to the object service for external data requests (e.g. images) to improve performance by avoiding re-requesting data; the default memory size is 1 GB and the default timeout is 1 hour
Fixed object service so that requests for external data from the dashboard are serviced asynchronously so they do not block one another
Numerous improvements and additions to the public example notebooks, including more notebooks, a README.md, and making it possible to run them directly from a GitHub checkout using test data in the repo
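The LRU cache for external data requests mentioned in the list above combines a memory budget with a timeout. The following is a minimal sketch of that general idea only, not the object service's actual code: the class name `LRUCache`, its parameters, and the injectable `clock` are hypothetical, and the demo uses a tiny 10-byte budget rather than the stated defaults of 1 GB and 1 hour.

```python
import time
from collections import OrderedDict


class LRUCache:
    """Byte-budgeted LRU cache whose entries expire after ttl_seconds."""

    def __init__(self, max_bytes, ttl_seconds, clock=time.monotonic):
        self.max_bytes = max_bytes
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # url -> (payload, stored_at)
        self._size = 0

    def get(self, url):
        entry = self._entries.get(url)
        if entry is None:
            return None
        payload, stored_at = entry
        if self.clock() - stored_at > self.ttl_seconds:
            self._remove(url)  # Entry is too old: drop it and report a miss
            return None
        self._entries.move_to_end(url)  # Mark as most recently used
        return payload

    def put(self, url, payload):
        if url in self._entries:
            self._remove(url)
        if len(payload) > self.max_bytes:
            return  # Payload can never fit, so do not cache it
        self._entries[url] = (payload, self.clock())
        self._size += len(payload)
        while self._size > self.max_bytes:
            self._remove(next(iter(self._entries)))  # Evict least recently used

    def _remove(self, url):
        payload, _ = self._entries.pop(url)
        self._size -= len(payload)


cache = LRUCache(max_bytes=10, ttl_seconds=3600)
cache.put("a.png", b"12345")
cache.put("b.png", b"12345")
cache.get("a.png")            # a.png becomes the most recently used entry
cache.put("c.png", b"12345")  # Over budget: b.png is evicted, a.png is kept
```

A cache like this trades memory for fewer repeated fetches; the timeout bounds how long stale data can be served if the underlying file changes.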
Dashboard#
Reorganized navigation components to make them more intuitive:
The active Project is shown in the top-left
Tabs for selecting Runs and Tables are provided just underneath the active Project component. Clicking on each one shows the Runs or Tables for the active Project, allowing selection of one or more to be loaded.
A tab to the right of the Runs and Tables navigation tab shows a summary of selected Runs and Tables. Clicking on this tab allows for the following:
Deselecting individual Runs and Tables
Showing details for the selected Runs and Tables
Showing hyperparameters for the selected Runs
Reorganized toolbars to simplify and group things more logically
Changed 3LC icon in top-left to be a menu, providing access to settings
Replaced connection status string with icon
Moved pending edits under new toolbar icon, where they can be discarded or committed with a commit message
Added info icon to top-right; moved options for documentation, feedback, and the about window there
Removed toolbar buttons to show/hide panels (filters, charts, rows); added arrows to collapse and expand them instead
Added a new Workflows panel, along with a number of specific workflows
Made it possible to add a new value to a value map (e.g. a label category) after dataset creation
Made it possible to save edited tables when the original table is not in a writable location; a new location under the project root directory is used
Remember selected Project between dashboard sessions
Added support for query parameters to open a Table/Run
Added status for Run objects
Made several fixes to bounding box rendering and editing
Made it so a consistent color is used for each label value wherever it is shown
Fixed an issue with coloring by boolean properties, such as FN/FP/TP
Made fixes to make text easier to read in several contexts
Made traversal index operation deterministic
Implemented a number of optimizations, including operations for IoU, BB overlap, BB FN/FP/TP
Don’t invalidate virtual columns whose inputs are unchanged
Optimized tweaking of virtual columns
In the filter panel, made it possible to sort and filter for properties with many enum values (e.g. labels)
Made it possible to filter on edited rows
Made it so Ctrl+Left/Right arrow keys can be used to navigate between samples in the rows panel
Allow for dragging a column to define R, G, B, and radius axes in charts
Made it so charts prefer ‘Label’ over ‘Predicted Label’ for color when available
Made fixes to supersampling in charts
Made improvements to mouse handling in charts
Added a reset button to charts to remove filtering done by lasso tool, etc.
Made it so clicking on the next/previous arrows in a chart, which allows navigation between samples, also scrolls the rows panel to the selected sample
Fixed an issue where Command-C did not work as expected on Mac
3LC Version 2.2.3#
March 2024
Enhancements and Fixes#
tlc Python Package#
Made deriving project and dataset portions of URLs for Runs and Tables more robust
Added serialization version number to 3LC files to allow for compatibility checks
Improved support for using the object service with ngrok
Added public SegFormer example demonstrating integration with PyTorch Lightning
Dashboard#
Improved support for using the Dashboard with ngrok
Significantly improved performance of TP/FP/FN operations
Added keyboard shortcut Space to toggle camera/paint
3LC Version 2.2.2#
March 2024
Enhancements and Fixes#
tlc Python Package#
Made Table row access immutable to prevent unsupported modifications to the in-memory representation of the Table, which could then get cached to disk.
Significantly improved performance of TableFromCoco
Made it so exporters write unaliased URLs
Made Warning the default log level
Mapped thread ID for log messages to lower numbers to avoid errors in the Dashboard
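The change to immutable Table row access listed above can be pictured with a small sketch of read-only row views. This is an illustration of the general technique, not how tlc implements it: the `ReadOnlyRows` class is hypothetical and uses the standard library's `types.MappingProxyType`.

```python
from types import MappingProxyType


class ReadOnlyRows:
    """Hold rows privately and hand out read-only views when indexed."""

    def __init__(self, rows):
        self._rows = [dict(row) for row in rows]  # Private copies of the input

    def __len__(self):
        return len(self._rows)

    def __getitem__(self, index):
        # The proxy raises TypeError on item assignment or deletion
        return MappingProxyType(self._rows[index])


rows = ReadOnlyRows([{"image": "a.png", "label": 0}])
row = rows[0]

try:
    row["label"] = 1
    modified = True
except TypeError:
    modified = False

# To change values, work on an explicit copy instead of the stored row
editable = dict(row)
editable["label"] = 1
```

Note that the proxy in this sketch is shallow: a mutable value nested inside a row, such as a list, could still be changed in place.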
Dashboard#
Added support for rendering image masks
Support image RGB remapping from value map
Support rendering multiple images on top of each other
Introduced new operations to cluster by distance threshold and check if a sample is the primary element in a cluster, which can be used to e.g. cluster samples with identical images then ignore all but one in subsequent training.
Fixed an issue where charts did not show the correct data after creating a subset table
Reintroduced use of cookies to specify object service URL
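The clustering operations in the list above (cluster by distance threshold, then test whether a sample is the primary element of its cluster) can be sketched as follows. The release notes do not specify the Dashboard's algorithm, so this assumes a simple greedy scheme in which the first sample of each cluster is its primary element; the function name `cluster_by_distance` is hypothetical.

```python
import math


def cluster_by_distance(points, threshold):
    """Greedily assign each point to the first cluster whose primary element is
    within `threshold`; otherwise start a new cluster with the point as primary."""
    primaries = []  # Index of the primary element of each cluster
    cluster_ids = []
    is_primary = []
    for index, point in enumerate(points):
        for cluster_id, primary_index in enumerate(primaries):
            if math.dist(point, points[primary_index]) <= threshold:
                cluster_ids.append(cluster_id)
                is_primary.append(False)
                break
        else:
            primaries.append(index)
            cluster_ids.append(len(primaries) - 1)
            is_primary.append(True)
    return cluster_ids, is_primary


# Embeddings of five samples; samples 0, 1, and 3 are (near-)duplicates, as are 2 and 4
embeddings = [(0.0, 0.0), (0.0, 0.0), (5.0, 5.0), (0.0, 0.05), (5.0, 5.0)]
cluster_ids, is_primary = cluster_by_distance(embeddings, threshold=0.1)

# Keep only one representative per cluster, e.g. for subsequent training
keep = [index for index, primary in enumerate(is_primary) if primary]
```

Filtering on the primary-element flag is what lets duplicates be ignored while one sample per cluster is retained.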
3LC Version 2.2.1#
February 2024
Enhancements and Fixes#
tlc Python Package#
Renamed predicted bounding box labels from “label_predicted” to “label”. This fixes a bug in the Dashboard where adding or assigning a prediction led to no labels being written for the new box, which in turn broke the visualization of all boxes once such a change was committed.
NOTE: This fix does not retroactively fix existing tables with the old “label_predicted” bounding box labels.
Updated Hugging Face datasets integration
Renamed the Hugging Face integration from tlc.integration.huggingface to tlc.integration.hugging_face
Added Table.from_hugging_face that is similar to the other Table.from_* methods
Removed load_dataset; Table.from_hugging_face should be used instead
Made various fixes and additions to sample types and schemas
Made various improvements to type hinting and removed many uses of Any, which disabled type checking
Dashboard#
Made Run constants visible by default
Made it so that a default column is chosen for the color property when a chart is created without specifying which column should provide the color. In most cases, this means that a label column will be chosen if one is available.
Made it possible to provide object_service URL as a URL parameter to specify the object service to use for the current session
3LC Version 2.2.0#
February 2024
We proudly present 3LC Version 2.2!
This release is for early adopters of 3LC, to be deployed in their own environment. It makes a significant change in the way that 3LC data is organized, including the introduction of a Project concept and the packaging of 3LC objects into self-contained folders, which makes them easier to share and perform other operations on. The Dashboard now presents projects on its start screen and shows per-Run charts for the selected project.
Enhancements and Fixes#
tlc Python Package#
Changed 3LC data organization to have project at the top-level, with settings for project root and project scan URLs instead of the previous settings for tables and runs. Projects contain runs and datasets, runs contain metrics, and datasets contain tables.
Made it so that 3LC objects (Tables, Runs, etc.) are packaged into self-contained folders, which makes them easier to share and perform other operations on
Added support for deleting indexed Runs and Tables
Added support for copying and renaming indexed Runs and Tables
Made it possible to add/remove/modify value maps for Tables
Enhanced samplers so that samples with zero weight can optionally be excluded during sampling and/or when collecting metrics
Refactored various parts of the API to be more flexible, usable, and consistent, including tlc.init, Table.from_*, and TableWriter
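The sampler enhancement in the list above, where samples with zero weight can optionally be excluded, can be sketched with the standard library. This is an illustration under assumptions, not the tlc sampler: the function `sample_indices` and its parameters `weighted` and `exclude_zero_weights` are hypothetical names.

```python
import random


def sample_indices(weights, num_samples, weighted=True, exclude_zero_weights=True, seed=0):
    """Draw sample indices with replacement, optionally skipping zero-weight samples."""
    rng = random.Random(seed)
    candidates = [
        index
        for index, weight in enumerate(weights)
        if weight > 0 or not exclude_zero_weights
    ]
    if weighted:
        # Draw proportionally to the per-sample weights
        candidate_weights = [weights[index] for index in candidates]
        return rng.choices(candidates, weights=candidate_weights, k=num_samples)
    # Uniform sampling: every remaining candidate is equally likely
    return rng.choices(candidates, k=num_samples)


weights = [1.0, 0.0, 2.0, 0.0]

# Uniform sampling that still honours the zero-weight exclusion
picked = sample_indices(weights, num_samples=100, weighted=False)

# Uniform sampling over all samples, including zero-weight ones
picked_all = sample_indices(weights, num_samples=100, weighted=False, exclude_zero_weights=False)
```

The exclusion flag matters mainly for uniform sampling; in this sketch, weighted sampling never selects a zero-weight sample anyway.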
Dashboard#
Updated start screen to allow user to select a project
Made it so that per-Run graphs are shown for the selected project
Made it possible to delete and rename Runs
Made it so that unfiltered results are shown behind filtered results when doing on-the-fly reduction in a chart
Made a number of optimizations, including to the editing of large Tables and to chart filtering/reduction
Made it so that the previous spherical point rendering is now opt-in via a setting and the default point rendering is simpler and faster, improving rendering framerate for charts with many points
Documentation#
The 3LC documentation is now publicly available at docs.3lc.ai
Availability#
This release is provided to enterprise clients, who have been given access to install it from our private CloudRepo Python repository. Clients will be able to locally install Python wheels which contain the notebook API, the 3LC Dashboard, and documentation from Python packages.
In addition to the credentials to access the private CloudRepo, clients will also need a license key in order to use the software.
Supported Platforms#
Python 3.8 - Python 3.11
Both Conda and “vanilla” Python environments should work
Microsoft Windows 10 and 11 (x86)
macOS 13 and newer (M series)
Ubuntu 20.04 (x86-64) is our supported Linux platform
Most other glibc-based Linux distributions are expected to work, but these are untested and unsupported.
Chrome and Edge web browsers, with GPU acceleration enabled
Known Issues#
Tables for dataset revisions are currently always stored at a location next to the input table, and it is not possible to override that behavior. This means, for example, that writing a dataset revision with an input table stored in a read-only location (on disk, in cloud storage, etc.) is not supported.
The Table object in the tlc Python API is designed to represent immutable columnar data, but it currently returns objects by reference when iterating or indexing. Consequently, it is possible to modify the in-memory representation of the Table, which could then get cached to disk. In general, users of the API should not make such modifications.
The tlc Python package does not detect, handle, or support NaN (Not-a-Number) values in tlc.Table, and their presence may lead to unpredictable behavior or inconsistencies within the system.
The order of columns in the Dashboard filter panel does not follow the order in the tables panel, where the columns are more logically ordered. This will be addressed in an upcoming release.
When renaming a Run in the Dashboard, the old name is sometimes still shown alongside the new name in the Table panel for a short time.
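Since NaN values are not detected by the package (see the known issue above), one possible workaround is to check data before writing it to a Table. The sketch below is a suggestion only: it assumes rows are plain dictionaries, it only inspects top-level Python floats, and the helper name `find_nan_cells` is hypothetical.

```python
import math


def find_nan_cells(rows):
    """Return (row_index, column_name) for every top-level float NaN in the rows."""
    return [
        (row_index, column)
        for row_index, row in enumerate(rows)
        for column, value in row.items()
        if isinstance(value, float) and math.isnan(value)
    ]


rows = [
    {"loss": 0.25, "label": 1},
    {"loss": float("nan"), "label": 0},
]

nan_cells = find_nan_cells(rows)
if nan_cells:
    print(f"Found NaN values at: {nan_cells}")
```

Values nested inside lists or arrays would need a deeper check than this sketch performs.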