3LC Python Package Version 2.14

2.14.0

Features

  • [14955] Added timestamp HMAC authentication to Object Service request handling when using a license key in a 3LC Enterprise Customer Managed deployment. This adds a layer of security, ensuring that the Object Service only handles requests from trusted Dashboard instances. Note that this requires the Object Service and Dashboard Service to be configured with a shared authentication secret. See the documentation on Secure Communication for details.

  • [14960] Added support for segmentation in TableFromCoco

  • [12915, 14916] Added Table methods delete_column, delete_columns, delete_row, and delete_rows

  • [14974] Implemented Schema.__getitem__ as a shortcut to allow for e.g. schema["bbs"]["bb_list"]["label"] instead of schema.values["bbs"].values["bb_list"].values["label"]
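
The Schema.__getitem__ shortcut above can be illustrated with a minimal sketch. ToySchema below is a hypothetical stand-in written for this example, not the actual 3LC Schema class; it only assumes that a schema keeps its children in a values mapping, as the longhand form in the note suggests.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToySchema:
    """Hypothetical stand-in for a nested schema; not the 3LC Schema class."""

    name: str = ""
    values: dict[str, ToySchema] = field(default_factory=dict)

    def __getitem__(self, key: str) -> ToySchema:
        # Delegate to the nested mapping so that lookups can be chained.
        return self.values[key]


# Build the nesting used in the release note: bbs -> bb_list -> label.
label = ToySchema(name="label")
bb_list = ToySchema(name="bb_list", values={"label": label})
schema = ToySchema(values={"bbs": ToySchema(name="bbs", values={"bb_list": bb_list})})

# Both spellings resolve to the same nested schema object.
shortcut = schema["bbs"]["bb_list"]["label"]
longhand = schema.values["bbs"].values["bb_list"].values["label"]
```

In this sketch a missing key raises KeyError from the underlying dict; how the real class reports a missing key is not stated in these notes.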

Enhancements and Fixes

  • [14973] Override TableFromParquet get_row_cache_size to fall back to the input parquet size

  • [14963] Made Schema.sample_type default to None to clearly distinguish between inheritance and explicit overrides, including the empty string ""

  • [14972] Made it so that SampleType.from_structure returns a CategoricalLabel when the schema includes a map

  • [14964] Made FloatVector2 and FloatVector3 builtin schemas consistent

  • Extracted Url and related classes into a new tlcurl module while preserving symbol backwards compatibility. This allows Url-related types to be used outside of the full indexing machinery, such as in configuration and logging scenarios.

  • [14790] Enhanced logging capabilities and configuration to make logging consistent across 3lc modules

  • [12351] For EmbeddingsMetricsCollector, include the flatten strategy in the embeddings column name. This allows collecting embeddings on the same layer with different flatten strategies, which previously did not work because the name was unique only per layer, not per flatten strategy

  • [14186] Made TableWriter a context manager that calls finalize() on exit

  • [13485, 14547] Catch ArrowTypeErrors when converting batches to pyarrow.RecordBatch, so that a clearer error message is given when data passed to a TableWriter does not match the schema

  • [14546] Handle numpy numbers in SampleType.from_sample

  • [14843] Added validation of user-provided names for tables, datasets, projects, columns, and map elements to avoid names that would later cause issues, e.g. because they are illegal as file or directory names

  • [14157] Made it so that MeanAggregator ignores NaN, inf, -inf and None in its mean computation

  • [13458] Allow numpy types in SampleType Float and Int ensure_sample_valid

  • [14713] Provide a clearer error message when Table.join_table fails due to inconsistent schemas

  • [14857] Made it so that metrics collected by the Hugging Face Trainer have prefixes for the split they are for (“train” or “eval”)

  • [14616] Give a targeted error message when a torch.Tensor or np.ndarray is passed as the structure to SampleType.from_structure
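
The MeanAggregator change in [14157] above amounts to filtering out unusable values before averaging. The sketch below shows that idea in plain Python. The function mean_ignoring_invalid is a hypothetical helper written for this example and is not part of the 3LC API; returning NaN when no valid values remain is an assumption, since the notes do not say what the aggregator does in that case.

```python
import math
from typing import Iterable, Optional


def mean_ignoring_invalid(values: Iterable[Optional[float]]) -> float:
    """Hypothetical helper; not the 3LC MeanAggregator itself."""
    # Keep only real, finite numbers: drops None, NaN, inf and -inf.
    finite = [float(v) for v in values if v is not None and math.isfinite(v)]
    # Assumption: with nothing valid left to average, return NaN instead of raising.
    return sum(finite) / len(finite) if finite else math.nan


# Only 1.0 and 3.0 contribute to the mean here.
result = mean_ignoring_invalid([1.0, float("nan"), 3.0, None, float("inf"), float("-inf")])
```

Filtering before summing matters because a single NaN or infinity would otherwise propagate into the result and hide the valid measurements.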