# 3LC Python Package Version 2.6
## 2.6.4

### Enhancements and Fixes

- [12940, 13017, 13018] Improved error messages in various cases where the data provided does not match the configured sample type
- [13076] Added code to guard against an infinite loop in some cases for `TableFromTorchDataset`
- [13107] Changed an assertion to a raise and changed the order of `Predictor` initialization to make `collect_metrics` more robust
- [13106] Catch exceptions in `get_device` and default to the CPU device for `Predictor`
- [13107] Raise in `collect_metrics` if the `table` argument is not of type `Table`
- [13069] Fixed memory leaks with `Predictor` and `collect_metrics`
- Improved the COCO exporter:
  - [12967, 13117] Made segmentation handling a bit more forgiving, so that we do the most reasonable thing given the value of the `include_segmentation` option and the state of the data being exported, and so that we do not raise errors or warnings in cases where they are not applicable
  - [13134] Avoided erroneous warnings about unused arguments by filtering out those that are not relevant for the selected exporter
  - [12950] Ensured the image paths in the exported COCO annotations file are correct by handling absolute and relative paths correctly
- [13127] Only start one config updater thread instead of one per call to `start`
- [13127] Added an `atexit` handler to stop the indexers
## 2.6.3

### Features

- [12921] Made it possible to do an enterprise local, customer-managed install of the `3lc` Python package

### Enhancements and Fixes

- [13013] Addressed an issue where new files (especially using absf) could sometimes not initially be found and were then blacklisted from being checked again for 30 minutes
- [13025] Detectron2: only run metrics collection in the main process
## 2.6.2

### Enhancements and Fixes

- [12974] Catch mistakes when using `collect_metrics` with a PyTorch `Dataset` or `DataLoader` and give more helpful error messages
- [12956] Fixed a bug where training on a `Table` after deleting rows would result in an error
- [13005] Fixed a bug with indexing `Run` objects that could cause updates during training not to be picked up
- [12971] Fixed an issue where the new `$..` syntax for aliases could fail to expand correctly in some cases
## 2.6.1

### Enhancements and Fixes

- [12899] Fixed an issue where `TableWriter` state could be set incorrectly if an exception occurred during setup
- [12948] Fixed a crash when serializing large tensors for Tables
- [12949] Hide large tensors in the Dashboard by default for now, since they are only represented by a URL to where they are stored
- [12953] Fixed a bug in `reduce_embeddings_multiple_parameters`
## 2.6.0

### Features

- [12697] Added support for Torch tensors to `SampleType`
- [12748] Made it possible for arbitrarily large tensors and numpy arrays to work with a 3LC `Table`. This is done by creating two distinct `SampleType` classes for each tensor type, `Small-` and `Large-`:
  - A `SmallNumpyArray` functions like the previously named `NumpyArray`: the array is converted into a list of lists (of lists, etc.) and stored in the rows of the table by value. This quickly becomes infeasible for even moderately large arrays.
  - The new `LargeNumpyArray` serializes the array to a file in the `bulk_data` directory of the table and places a reference to this file in the row of the table. When the sample view of that element is requested, the array is loaded back into memory from disk. The values in these arrays won't be visible or editable in the Dashboard, but looking at individual values in arrays with more than 1000 elements probably would not be very useful anyway.
- [11203] Made it possible to delete rows altogether with an `EditedTable`, which can then be used to run training with those rows excluded from the dataset
- [12740] Added a `Table.revision` method that can take a tag, `table_url`, or `table_name` and return the relevant table
- [12663] Made it possible to define aliases needed for a project within the project structure itself. This is useful in general, and in particular it will allow us to add public examples without requiring a new release of the Python package.
- [12875] Made it possible to create a `Table` from a folder of images using `Table.from_image_folder`
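The by-value versus by-reference storage described above for `SmallNumpyArray` and `LargeNumpyArray` can be sketched in plain numpy. This is an illustrative mock of the mechanism, not the actual `tlc` implementation; the function names and file layout are invented for the example:

```python
import tempfile
from pathlib import Path

import numpy as np

# Illustrative sketch only (not the tlc implementation): "small" arrays
# are stored by value as nested lists inline in the row, while "large"
# arrays are serialized to a bulk-data file and referenced by path.

def store_small(array: np.ndarray) -> list:
    """Store by value: convert to nested lists kept directly in the row."""
    return array.tolist()

def store_large(array: np.ndarray, bulk_data_dir: Path) -> dict:
    """Store by reference: serialize to disk, keep only a reference."""
    path = bulk_data_dir / "array_0.npy"
    np.save(path, array)
    return {"ref": str(path)}  # the row holds a reference, not the data

def load_large(row_value: dict) -> np.ndarray:
    """Sample view: load the referenced array back into memory."""
    return np.load(row_value["ref"])

bulk_data_dir = Path(tempfile.mkdtemp())
arr = np.arange(12, dtype=np.float32).reshape(3, 4)

inline = store_small(arr)            # nested lists, feasible for small arrays
ref = store_large(arr, bulk_data_dir)
restored = load_large(ref)
assert np.array_equal(arr, restored)
```

The trade-off is the one the release note describes: the inline form is visible and editable but grows with the array, while the referenced form stays constant-size in the row at the cost of a disk round-trip when the sample view is requested.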
### Enhancements and Fixes

- [12313] Cache and re-use references to DataLoaders for metrics collection so that they do not have to be recreated for each worker thread, which caused a significant performance hit when using `num_workers > 0` on Windows
- [12528] Added an `extra_columns` argument to all `Table.from_X` methods that can be used to create schemas for additional columns at table creation time
- [12608] Made it possible to convert an inferred parquet schema to a 3LC schema when it contains a list of structs
- [12291] Added average-pooling flattening strategies for the embeddings metrics collector
- [12603] Allow a `str` argument (in addition to `Url`) as `foreign_table_url` in `Run.reduce_embeddings_by_foreign_table_url`
- [12607] Do not add IOU to the schema when `compute_derived_metrics` is `False` for the bounding-box metrics collector
- [12695] Make sure we don't update the modified time when checking whether a directory is writable
- [12698] When creating a new `Table` using the high-level `from_X` methods, if the specified `table_url` already exists but that directory does not contain an `object.3lc.json` file, go ahead and delete the directory, since it represents a malformed table that was never actually successfully created
- [12553] Infer the device in example notebooks to support users on Mac (using mps) or without a GPU
- [12692] Create a `Run`'s bulk-data folder on creation to avoid timing issues
- [12735] Added a `shuffle` argument to `Table.create_sampler`
- [12774] Support YOLO YAML files without a `path` key in `TableFromYolo`
- [12783] Made `Url` init raise on bad input
- [12791] Lifted everything in `tlc.client.utils` up into the `tlc` namespace to make referencing contained types more convenient, e.g. in the YOLO integration
- [12776] Deprecated use of `NumpyInt` and `NumpyFloat` as `SampleType`, since a plain `Int` or `Float` works just as well
- [12746] Fixed removal of columns that are deleted by an override schema
- [12665] Disabled the command palette for the Object Service TUI, since it is not intended to be used
- [12837] Made it possible to reduce columns without the number role "nn_embedding"
- [12837] Changed the default `n_components` for UMAP embeddings reduction to 2, since that is the default in the source umap package and it makes it consistent with pacmap
- [12876] Made it possible to pass a single table to `tlc.reduce_embeddings` and have it return a single table
### Known Issues

- The `tlc` Python package does not detect, handle, or support NaN (Not-a-Number) values in a `tlc.Table`, and their presence may lead to unpredictable behavior or inconsistencies within the system.
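Since NaN values are not detected by the package, callers may want to screen their data before creating a table. A minimal stdlib sketch, assuming rows are nested Python structures of numbers; the helper name and row layout are illustrative and not part of the `tlc` API:

```python
import math

def contains_nan(value) -> bool:
    """Recursively check nested lists/tuples/dicts of numbers for NaN."""
    if isinstance(value, float):
        return math.isnan(value)
    if isinstance(value, dict):
        return any(contains_nan(v) for v in value.values())
    if isinstance(value, (list, tuple)):
        return any(contains_nan(v) for v in value)
    return False

rows = [
    {"weight": 1.5, "features": [0.1, 0.2]},
    {"weight": float("nan"), "features": [0.3, 0.4]},
]
bad_rows = [i for i, row in enumerate(rows) if contains_nan(row)]
# bad_rows == [1]
```

Rows flagged this way can be dropped or have their NaNs replaced with a sentinel value before the data is handed to the table.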