tlc.sample_types
Built-in and custom sample types for 3LC tables.
Sample types describe how a Python value (an image, a tensor, a list of bounding boxes, …) is converted to and from
the row form a tlc.Table stores. They are the counterpart of tlc.schemas: schemas describe the
on-disk shape of a column; sample types describe the user-facing Python value for that column.
The core protocol lives in tlc.sample_types.SampleType (and its subclass
tlc.sample_types.ExternalSampleType for values stored outside the row). Register custom sample types via
tlc.sample_types.register_sample_type().
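Package Contents
Classes
| Class | Description |
|---|---|
| BoundingBoxes2DSampleType | SampleType for 2D bounding box instances. |
| BoundingBoxes3DSampleType | SampleType for 3D bounding box instances. |
| EncodedSample | A pre-encoded binary sample with its file extension. |
| ExternalNumpyArraySampleType | External transform for numpy arrays stored as .npy files. |
| ExternalSampleType | Base class for sample types that store data externally. |
| ExternalTorchTensorSampleType | External transform for torch tensors stored as .npy files (falls back to .pt on read). |
| ExternalizationContext | Context passed to externalize(). |
| Geometry2DSampleType | SampleType for generic 2D geometry instances. |
| Geometry3DSampleType | SampleType for generic 3D geometry instances. |
| Hidden | SampleType for hidden columns that should be excluded from sample view. |
| Identity | Pass-through transform that performs no conversion. |
| JpegImageSampleType | PIL.Image stored as JPEG. See PILImageSampleType for the contract. |
| Keypoints2DSampleType | SampleType for 2D keypoint instances. |
| LargeBytes | SampleType for large binary data stored as external files. |
| NumpyArraySampleType | Inline transform for numpy arrays. |
| OrientedBoundingBoxes2DSampleType | SampleType for 2D oriented bounding box instances. |
| OrientedBoundingBoxes3DSampleType | SampleType for 3D oriented bounding box instances. |
| PILImageSampleType | SampleType for PIL.Image stored as external PNG files. |
| Path | SampleType that converts URL paths to absolute paths. |
| SampleType | Base class for inline sample types. |
| SampleTypeInfo | Structured metadata for a registered sample type. |
| SampleTypeRegistry | Registry for sample types. |
| SegmentationMasksSampleType | SampleType for mask-based instance segmentation and RLE storage. |
| SegmentationPolygonsSampleType | SampleType for polygon-based instance segmentation, stored as RLE on the wire. |
| TorchTensorSampleType | Inline transform for torch tensors. |
| ValidationError | A validation error or warning found during data validation. |
| WebpImageSampleType | PIL.Image stored as WEBP. See PILImageSampleType for the contract. |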
Functions
| Function | Description |
|---|---|
| get_sample_types | List all registered sample type names. |
| register_sample_type | Decorator to register a SampleType subclass by name. |
Data
| Data | Description |
|---|---|
| SampleTypeSource | |
API
- class BoundingBoxes2DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for 2D bounding box instances.
Converts between the tlc.data_types.bounding_boxes.BoundingBoxes2D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Inline:
to_row(): BoundingBoxes2D -> dict with instances/additional_data
from_row(): dict with instances/additional_data -> BoundingBoxes2D
- accepts(value: Any)
Check if the value is a BoundingBoxes2D dataclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is a BoundingBoxes2D dataclass instance.
- from_row(data: Any)
Create a BoundingBoxes2D from a 3LC Table row dict.
- Parameters:
data – A dictionary representing a single row from a 3LC Table with bounding box data.
- Returns:
A BoundingBoxes2D object.
- Raises:
ValueError – If the row does not contain instances or if array lengths are inconsistent.
- to_row(sample: Any)
Convert a BoundingBoxes2D to the internal 3LC Table row format.
- Parameters:
sample – A BoundingBoxes2D object.
- Returns:
Wire-format dict with instances and instances_additional_data keys.
- validate_sample(sample: Any)
Validate a BoundingBoxes2D sample.
Checks that the sample has the expected type, array shapes, and consistent instance counts.
- Parameters:
sample – The sample to validate.
- Returns:
A list of validation errors.
- class BoundingBoxes3DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for 3D bounding box instances.
Converts between the tlc.data_types.bounding_boxes.BoundingBoxes3D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Inline:
to_row(): BoundingBoxes3D -> dict with instances/additional_data
from_row(): dict with instances/additional_data -> BoundingBoxes3D
- accepts(value: Any)
Check if the value is a BoundingBoxes3D dataclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is a BoundingBoxes3D dataclass instance.
- from_row(data: Any)
Create a BoundingBoxes3D from a 3LC Table row dict.
- Parameters:
data – A dictionary representing a single row from a 3LC Table with bounding box data.
- Returns:
A BoundingBoxes3D object.
- Raises:
ValueError – If the row does not contain instances or if array lengths are inconsistent.
- class EncodedSample
A pre-encoded binary sample with its file extension.
Use this to hand pre-encoded bytes to the write pipeline without round-tripping through a Python decode/re-encode cycle. Common cases:
Hugging Face datasets with datasets.Image(decode=False) yield {"bytes": b"...", "path": "image.jpg"}; an adapter can wrap that as EncodedSample(bytes=..., extension=".jpg").
Any source that produces format-known bytes (scraped, user-supplied, etc.).
An ExternalSampleType receiving an EncodedSample writes the bytes verbatim to the allocated URL, preserving the original format.
- bytes: tlc.sample_types._sample_type.EncodedSample.bytes = None
The encoded byte payload.
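The adapter case described above can be sketched roughly as follows. This is a minimal illustration, not the library's code: `EncodedSample` here is a local stand-in dataclass carrying only the two fields named in the description (`bytes`, `extension`), and `wrap_hf_image` is a hypothetical helper name.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class EncodedSample:
    """Local stand-in for tlc.sample_types.EncodedSample (fields assumed)."""

    bytes: bytes
    extension: str


def wrap_hf_image(item: dict) -> EncodedSample:
    """Wrap a {"bytes": ..., "path": ...} dict without decoding the image."""
    # Reuse the source file's suffix so the payload keeps its original format.
    suffix = PurePosixPath(item["path"]).suffix
    return EncodedSample(bytes=item["bytes"], extension=suffix)


item = {"bytes": b"\xff\xd8\xff\xe0 placeholder payload", "path": "image.jpg"}
encoded = wrap_hf_image(item)
print(encoded.extension, len(encoded.bytes))
```

Because the payload is never decoded, the bytes that reach storage are exactly the bytes the source supplied.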
- class ExternalNumpyArraySampleType
Bases: tlc.sample_types._sample_type.ExternalSampleType
External transform for numpy arrays stored as .npy files.
- file_extension = ".npy"
- validate_sample(sample: Any)
- class ExternalSampleType
Bases: tlc.sample_types._sample_type.SampleType
Base class for sample types that store data externally.
Subclass this when data should be stored outside the table row: as individual files, objects, or resources accessed via tlc.Url. This covers local files, S3 objects, GCS blobs, or any custom URL adapter target.
Override save() and load(), and set file_extension to the appropriate suffix (e.g., ".npy", ".png").
Use the register_sample_type() decorator to register custom types.
- accepts(value: Any)
Check whether value is a live sample that should be externalized.
External sample types should override this method. The write pipeline uses accepts() to distinguish sample-form values (live Python objects to be externalized via save()) from row-form values (pre-externalized URL strings, returned unchanged). A subclass that leaves the default will treat every value as row-form and silently skip externalization, which is almost never what you want.
A typical implementation is an isinstance() check against the native sample type, optionally also accepting EncodedSample and/or raw bytes when the default externalize() pre-encoded-bytes fast paths should apply.
- Parameters:
value – The value to check.
- Returns:
True if value is in sample form and should be externalized.
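The role of accepts() in the write path can be illustrated with a small sketch. Nothing here is tlc code: `write_leaf`, `FakeArray`, and the placeholder URL are hypothetical names chosen for illustration, and `default_accepts` simply mirrors the documented base-class default of returning False.

```python
from typing import Any, Callable


def write_leaf(
    value: Any,
    accepts: Callable[[Any], bool],
    externalize: Callable[[Any], str],
) -> Any:
    """Externalize sample-form values; pass row-form values through unchanged."""
    return externalize(value) if accepts(value) else value


class FakeArray:
    """Hypothetical native sample object."""


def externalize(sample: Any) -> str:
    return "mem://arrays/0.npy"  # placeholder URL for the sketch


def default_accepts(value: Any) -> bool:
    return False  # mirrors the documented base-class default


def isinstance_accepts(value: Any) -> bool:
    return isinstance(value, FakeArray)


sample = FakeArray()
skipped = write_leaf(sample, default_accepts, externalize)
written = write_leaf(sample, isinstance_accepts, externalize)
passthrough = write_leaf("mem://arrays/7.npy", isinstance_accepts, externalize)
```

With the default, the live object comes back untouched (`skipped is sample`), which is the silent failure the text warns about; with the isinstance() check, the sample becomes a URL while an existing URL string still passes through.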
- externalize(sample: Any, ctx: ExternalizationContext)
Externalize a sample to a file and return the backing URL.
This is where the write pipeline enters an external sample type. The default implementation handles three cases in order:
1. Pre-encoded bytes (EncodedSample or raw bytes): written verbatim to a new URL, skipping any decode/encode round-trip. EncodedSample supplies its own extension; raw bytes use file_extension.
2. Backing file known (via source_url()): the existing URL is returned as-is; no copy is made.
3. Encode from scratch: a new URL is allocated and save() is called.
The returned URL string may be absolute; the writer pipeline normalizes every URL leaf (including ExternalSampleType leaves) to a table-relative string as a final step. Subclasses need not relativize themselves.
Subclasses rarely need to override this. The standard extension points are save(), load(), accepts(), and optionally source_url(). Override externalize() itself only when you need unusual storage logic (multiple files per sample, content-addressed naming, conditional writes).
- Parameters:
sample – Value to externalize. May be a native Python object (e.g., PIL.Image), raw bytes, or EncodedSample.
ctx – The externalization context carrying the URL allocator, table URL, schema path, and relativization depth.
- Returns:
URL string pointing at the externalized file. Typically absolute; the pipeline relativizes it later.
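The three-case order described for externalize() could look roughly like the sketch below. It is a stand-in written against plain callables and an in-memory dict rather than tlc.Url or a real ExternalizationContext; `externalize_sketch`, `run`, `FileBacked`, the `backing_url` attribute, and the `mem://` URLs are all hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class EncodedSample:
    """Local stand-in: pre-encoded bytes plus the extension to store them under."""

    bytes: bytes
    extension: str


def externalize_sketch(
    sample: Any,
    file_extension: str,
    allocate_url: Callable[[str], str],
    write_bytes: Callable[[str, bytes], None],
    source_url: Callable[[Any], Optional[str]],
    save: Callable[[Any, str], None],
) -> str:
    # Case 1: pre-encoded bytes are written verbatim.
    if isinstance(sample, EncodedSample):
        url = allocate_url(sample.extension)  # the sample supplies its extension
        write_bytes(url, sample.bytes)
        return url
    if isinstance(sample, bytes):
        url = allocate_url(file_extension)  # raw bytes use the type's extension
        write_bytes(url, sample)
        return url
    # Case 2: an existing backing file is referenced, not copied.
    existing = source_url(sample)
    if existing is not None:
        return existing
    # Case 3: encode from scratch.
    url = allocate_url(file_extension)
    save(sample, url)
    return url


store: Dict[str, bytes] = {}
_next = iter(range(10_000))


def allocate_url(ext: str) -> str:
    return f"mem://image/{next(_next)}{ext}"


def run(sample: Any) -> str:
    return externalize_sketch(
        sample,
        file_extension=".png",
        allocate_url=allocate_url,
        write_bytes=store.__setitem__,
        source_url=lambda s: getattr(s, "backing_url", None),
        save=lambda s, url: store.__setitem__(url, repr(s).encode()),
    )


class FileBacked:
    backing_url = "mem://existing/cat.jpg"


urls = [run(EncodedSample(b"jpeg-bytes", ".jpg")), run(b"raw"), run(FileBacked()), run(42)]
print(urls)
```

Only the first, second, and fourth calls allocate a URL; the file-backed sample returns its existing location and nothing is written for it.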
- file_extension: str = <Multiline-String>
File extension for externally stored data (e.g., ".npy", ".png").
- abstract load(url: Url)
Load a sample from an external URL.
The implementation should read bytes via url.read_bytes() and deserialize to a Python object.
- Parameters:
url – The source URL to read from (works with any storage backend).
- Returns:
The Python object in sample form.
- abstract save(sample, url) -> None
Write a sample to an external URL.
The implementation should serialize the sample and write bytes via url.write_bytes().
- Parameters:
sample – The Python object in sample form.
url – The target URL to write to (works with local files, S3, GCS, etc.).
- source_url(sample: Any)
Return the URL of an existing file backing this sample, if any.
Optional optimization hook. When the sample is already backed by a file on disk (or any URL-addressable location), overriding this to return that URL lets externalize() reference the existing file instead of calling save() to write a copy.
Typical use cases: PIL images loaded from disk expose filename and _tlc_url attributes; tensors loaded from .npy files can carry their source path; any sample produced by load() can stash the URL it was loaded from.
- Parameters:
sample – The Python object in sample form.
- Returns:
A URL or path string for an existing file that already holds this sample’s content, or None if no backing file exists (the default, which causes externalize to call save).
- class ExternalTorchTensorSampleType
Bases: tlc.sample_types._sample_type.ExternalSampleType
External transform for torch tensors stored as .npy files (falls back to .pt on read).
- file_extension = ".npy"
- validate_sample(sample: Any)
- class ExternalizationContext
Context passed to externalize().
Carries the minimal state a sample type needs to externalize one value: the schema path this leaf sits at, and a URL allocator for new files. URL relativization is the writer pipeline’s job; sample types just return an absolute URL.
- Variables:
schema_path – Tuple path to the leaf in the schema tree. E.g. ("image",) for a top-level column, or ("instances", "mask") for a column nested inside a composite.
- allocate_url(extension: str)
Allocate the next file URL for the column at this leaf path.
Files are organized per-leaf-path: nested paths are joined with dots to form a flat subdirectory name (e.g., instances.mask). Each path has its own counter, so URLs are of the form <bulk_data_url>/<joined-path>/<counter><ext>.
- Parameters:
extension – File extension including the leading dot (e.g. ".png").
- Returns:
Absolute URL for the next file.
- descend(segment: str)
Return a new context whose schema path has segment appended.
Used when recursively descending into a composite schema: each child leaf sees a context whose schema_path reflects its position in the tree, so allocated URLs land in the right per-leaf subdirectory.
- Parameters:
segment – The child key to append to the current path.
- Returns:
A new context with the extended path; the underlying URL allocator is shared by reference.
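The per-leaf allocation scheme can be sketched as below. `ContextSketch` is a hypothetical stand-in, not the real class: the documented parts are the dot-joined subdirectory, the per-path counter, and the allocator shared across descend(); the counter format (a plain integer starting at 0) and the `bulk_data_url` field are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, Tuple


@dataclass
class ContextSketch:
    bulk_data_url: str
    schema_path: Tuple[str, ...] = ()
    # Shared by reference across descend(), so each leaf path keeps one counter.
    counters: DefaultDict[str, int] = field(default_factory=lambda: defaultdict(int))

    def allocate_url(self, extension: str) -> str:
        leaf_dir = ".".join(self.schema_path)  # ("instances", "mask") -> "instances.mask"
        index = self.counters[leaf_dir]
        self.counters[leaf_dir] += 1
        return f"{self.bulk_data_url}/{leaf_dir}/{index}{extension}"

    def descend(self, segment: str) -> "ContextSketch":
        return ContextSketch(self.bulk_data_url, self.schema_path + (segment,), self.counters)


root = ContextSketch("s3://bucket/bulk")
image_ctx = root.descend("image")
mask_ctx = root.descend("instances").descend("mask")

first_image = image_ctx.allocate_url(".png")
first_mask = mask_ctx.allocate_url(".png")
second_image = root.descend("image").allocate_url(".png")
```

A second context for the same leaf path continues the same counter because the allocator state is shared rather than copied.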
- class Geometry2DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for generic 2D geometry instances.
Converts between the tlc.data_types.geometries.Geometry2D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Also accepts any tlc.data_types.Geometry2DBase subclass during schema inference via accepts().
- accepts(value: Any)
Check if the value is a Geometry2DBase subclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is a Geometry2DBase instance.
- class Geometry3DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for generic 3D geometry instances.
Converts between the tlc.data_types.geometries.Geometry3D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Also accepts any tlc.data_types.Geometry3DBase subclass during schema inference via accepts().
- accepts(value: Any)
Check if the value is a Geometry3DBase subclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is a Geometry3DBase instance.
- class Hidden
Bases: tlc.sample_types._sample_type.Identity
SampleType for hidden columns that should be excluded from sample view.
Hidden columns are present in row view but absent from sample view. to_row() and from_row() are inherited from Identity (pass-through) but should never be called in practice since hidden columns are filtered out before transform application.
- is_included_in_sample = False
- class Identity
Bases: tlc.sample_types._sample_type.SampleType
Pass-through transform that performs no conversion.
Used as the default when a schema has no explicit transform configured. to_row() and from_row() return the value unchanged.
- class JpegImageSampleType
Bases: tlc.sample_types._image.PILImageSampleType
PIL.Image stored as JPEG. See PILImageSampleType for the contract.
- file_extension = ".jpeg"
- class Keypoints2DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for 2D keypoint instances.
Converts between the tlc.data_types.keypoints.Keypoints2D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Inline:
to_row(): Keypoints2D -> dict with instances/additional_data
from_row(): dict with instances/additional_data -> Keypoints2D
- accepts(value: Any)
Check if the value is a Keypoints2D dataclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is a Keypoints2D dataclass instance.
- from_row(data: Any)
Create a Keypoints2D from a 3LC Table row dict.
- Parameters:
data – A dictionary representing a single row from a 3LC Table with keypoint data.
- Returns:
A Keypoints2D object.
- Raises:
ValueError – If the row does not contain instances or if array lengths are inconsistent.
- to_row(sample: Any)
Convert a Keypoints2D to the internal 3LC Table row format.
- Parameters:
sample – A Keypoints2D object.
- Returns:
Wire-format dict with instances and instances_additional_data keys.
- Raises:
ValueError – If both visibility and confidence arrays are present (only one is supported).
- class LargeBytes
Bases: tlc.sample_types._sample_type.ExternalSampleType
SampleType for large binary data stored as external files.
- accepts(value: Any)
Check if the value is bytes or EncodedSample.
EncodedSample is accepted so the default externalize() path can honor an alternative extension supplied by the caller.
- Parameters:
value – The value to check.
- Returns:
True if the value is bytes or EncodedSample.
- file_extension = ".bin"
- from_row(data: bytes)
Pass through bytes.
- Parameters:
data – The bytes data.
- Returns:
The same bytes data.
- class NumpyArraySampleType
Bases: tlc.sample_types._sample_type.SampleType
Inline transform for numpy arrays.
The sample type is shape-blind: it returns the ndarray as-is. The wrapping Schema decides whether to flatten (zero-copy reshape(-1)) or convert to a nested Python list, based on its own declared dims and column type.
- validate_sample(sample: Any)
- class OrientedBoundingBoxes2DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for 2D oriented bounding box instances.
Converts between the tlc.data_types.obb.OrientedBoundingBoxes2D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Inline:
to_row(): OrientedBoundingBoxes2D -> dict with instances/additional_data
from_row(): dict with instances/additional_data -> OrientedBoundingBoxes2D
- accepts(value: Any)
Check if the value is an OrientedBoundingBoxes2D dataclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is an OrientedBoundingBoxes2D dataclass instance.
- from_row(data: Any)
Create an OrientedBoundingBoxes2D from a 3LC Table row dict.
- Parameters:
data – A dictionary representing a single row from a 3LC Table with OBB data.
- Returns:
An OrientedBoundingBoxes2D object.
- Raises:
ValueError – If the row does not contain instances.
- class OrientedBoundingBoxes3DSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for 3D oriented bounding box instances.
Converts between the tlc.data_types.obb.OrientedBoundingBoxes3D dataclass (sample form) and the hierarchical dict row format used by 3LC Tables.
Inline:
to_row(): OrientedBoundingBoxes3D -> dict with instances/additional_data
from_row(): dict with instances/additional_data -> OrientedBoundingBoxes3D
- accepts(value: Any)
Check if the value is an OrientedBoundingBoxes3D dataclass instance.
- Parameters:
value – The value to check.
- Returns:
True if the value is an OrientedBoundingBoxes3D dataclass instance.
- from_row(data: Any)
Create an OrientedBoundingBoxes3D from a 3LC Table row dict.
- Parameters:
data – A dictionary representing a single row from a 3LC Table with 3D OBB data.
- Returns:
An OrientedBoundingBoxes3D object.
- Raises:
ValueError – If the row does not contain instances.
- class PILImageSampleType
Bases: tlc.sample_types._sample_type.ExternalSampleType
SampleType for PIL.Image stored as external PNG files.
Subclasses override file_extension and _save_format to write other formats (JPEG, WEBP). Read behavior is shared (PIL sniffs the format from the file’s magic bytes), so any PIL.Image-decoding subclass can read files written by any other.
The configured format is an encoder, not a re-encoder. It only governs how a PIL.Image constructed in memory is written to disk for the first time. Inputs already backed by a file (PIL images with filename/_tlc_url, raw bytes, EncodedSample, URL strings) pass through verbatim; see externalize().
- accepts(value: Any)
Check if the value is a PIL.Image, bytes, or EncodedSample.
Pre-encoded bytes are accepted as sample form because externalize() writes them verbatim, avoiding a decode/re-encode round-trip (e.g. for Hugging Face datasets loaded with datasets.Image(decode=False)).
- Parameters:
value – The value to check.
- Returns:
True if the value is a PIL.Image, bytes, or EncodedSample.
- from_row(data: bytes)
Convert bytes back to a PIL.Image.
Called for inline storage only. For file-based storage, load() is used instead and from_row() is never called.
- Parameters:
data – Image bytes to deserialize.
- Returns:
The PIL.Image object.
- load(url: Url)
Load a PIL.Image from a URL.
- Parameters:
url – The source URL.
- Returns:
The PIL.Image object with _tlc_url attribute set to the URL string.
- save(image, url) -> None
Encode an in-memory PIL.Image to url in this transform’s format.
Called by externalize() only when the image has no backing file (source_url returned None). File-backed images are referenced verbatim by the pipeline and never reach this method.
- Parameters:
image – The PIL.Image to encode.
url – The target URL.
- Raises:
ValueError – If the image mode is incompatible with the configured format.
- source_url(image: Image)
Return the URL of the image’s backing file, if any.
Checks the _tlc_url attribute (set by load()) and the filename attribute (set by PIL.Image.open). Returns the URL regardless of file extension: a column declared with sample_type="pil_png" does not force a re-encode of a file-backed JPEG; the row simply stores the existing .jpg URL and no new file is written. The configured format governs encoding only when an image has to be created from scratch (no backing file).
- Parameters:
image – The PIL.Image to check.
- Returns:
The URL or path string of the backing file, or None for in-memory images that must be encoded via save.
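The attribute lookup described for source_url() might be sketched like this. It uses SimpleNamespace objects in place of real PIL images so it runs without Pillow; `source_url_sketch` is a hypothetical name, and checking `_tlc_url` before `filename` is an assumption about ordering that the text does not state.

```python
from types import SimpleNamespace
from typing import Any, Optional


def source_url_sketch(image: Any) -> Optional[str]:
    """Return the backing location of an image-like object, or None."""
    for attr in ("_tlc_url", "filename"):
        value = getattr(image, attr, None)
        if value:  # a missing or empty attribute means no backing file
            return str(value)
    return None


loaded = SimpleNamespace(_tlc_url="s3://bucket/images/0.jpg", filename="")
opened = SimpleNamespace(filename="/data/cat.jpg")
in_memory = SimpleNamespace()

print(source_url_sketch(loaded), source_url_sketch(opened), source_url_sketch(in_memory))
```

Note that the extension of the returned location plays no part in the decision, which matches the statement that a file-backed JPEG is not re-encoded for a PNG column.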
- to_row(image: Image)
Encode a PIL.Image sample to bytes in this transform’s format.
Not used by the write pipeline: pipeline writes go through save() via externalize(). This method exists for direct callers that want the encoded bytes without touching the filesystem (round-trip tests, inline fallback).
- Parameters:
image – The PIL.Image to encode.
- Returns:
The image serialized as bytes in this transform’s format.
- Raises:
ValueError – If the image mode is incompatible with the configured format.
- validate_sample(sample: Any)
Validate a PIL.Image sample.
- Parameters:
sample – The sample to validate.
- Returns:
A list of validation errors.
- class Path
Bases: tlc.sample_types._sample_type.SampleType
SampleType that converts URL paths to absolute paths.
This is an identity transform for the data itself; it just ensures URLs are absolute when reading. URL absolutization used to be done by the SampleType implementation in sample_from_row and is now handled elsewhere.
- accepts(value: Any)
Check if the value is a path string.
- Parameters:
value – The value to check.
- Returns:
True if the value is a string.
- class SampleType
Bases: abc.ABC
Base class for inline sample types.
A sample type converts between sample form (Python objects like PIL.Image, numpy arrays, or dataclasses) and row form (serializable data stored in tables).
This base class is for inline storage: data is stored directly in the table row. Override to_row() and from_row() for custom conversion logic. The default implementation is identity (returns the value unchanged).
For types that store data externally (images, large arrays), subclass ExternalSampleType instead.
Use the register_sample_type() decorator to register custom types.
- accepts(value: Any)
Check if this transform accepts the given value as a sample.
Used by the write pipeline to distinguish sample-form values (live Python objects to convert) from already-row-form values (passed through unchanged).
Inline sample types can usually leave the default (returns False), because the pipeline trusts the schema and calls to_row() directly. External sample types must override accepts(); see ExternalSampleType.accepts().
- Parameters:
value – The value to check.
- Returns:
True if this transform can convert the value from sample form.
- from_row(data: Any)
Convert row data back to sample form.
The default implementation returns the data unchanged (identity). Override for custom inline conversion.
- Parameters:
data – The data in row form.
- Returns:
The Python object in sample form.
- is_included_in_sample = True
Whether columns with this transform should be included in sample view.
- to_row(sample: Any)
Convert a sample to row form for storage.
The default implementation returns the sample unchanged (identity). Override for custom inline conversion.
- Parameters:
sample – The Python object in sample form.
- Returns:
The data in row form (e.g., nested list, dict, bytes).
- validate_row(row: Any)
Validate a row dict after serialization or before deserialization.
Override in subclasses to check that the row has the expected structure, types, and constraints.
- Parameters:
row – The data in row form.
- Returns:
A list of validation errors (empty if valid).
- validate_sample(sample: Any)
Validate a sample object before serialization.
Override in subclasses to check that the sample has the expected structure, types, and constraints before to_row() is called.
- Parameters:
sample – The Python object in sample form.
- Returns:
A list of validation errors (empty if valid).
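A custom inline sample type following this contract might look like the sketch below. To stay runnable without tlc installed it defines local stand-ins: `SampleTypeSketch` copies the documented defaults (identity conversion, accepts() returning False, and, as an assumption, an empty validation list), and `ValidationErrorSketch` carries the three documented ValidationError fields. `Interval` and `IntervalSampleType` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ValidationErrorSketch:
    path: str
    message: str
    severity: str = "error"


class SampleTypeSketch:
    """Stand-in base class with the documented identity defaults."""

    def accepts(self, value: Any) -> bool:
        return False

    def to_row(self, sample: Any) -> Any:
        return sample

    def from_row(self, data: Any) -> Any:
        return data

    def validate_sample(self, sample: Any) -> List[ValidationErrorSketch]:
        return []  # assumed default: nothing to report


@dataclass
class Interval:
    start: float
    end: float


class IntervalSampleType(SampleTypeSketch):
    def accepts(self, value: Any) -> bool:
        return isinstance(value, Interval)

    def to_row(self, sample: Interval) -> dict:
        return {"start": sample.start, "end": sample.end}

    def from_row(self, data: dict) -> Interval:
        return Interval(data["start"], data["end"])

    def validate_sample(self, sample: Any) -> List[ValidationErrorSketch]:
        if not isinstance(sample, Interval):
            return [ValidationErrorSketch("", "expected an Interval")]
        if sample.end < sample.start:
            return [ValidationErrorSketch("end", "end precedes start")]
        return []


transform = IntervalSampleType()
row = transform.to_row(Interval(1.0, 2.5))
restored = transform.from_row(row)
problems = transform.validate_sample(Interval(3.0, 1.0))
```

Validation runs on the sample form, before to_row(), so a malformed sample is reported with a field path rather than failing inside the conversion.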
- class SampleTypeInfo
Bases: typing_extensions.TypedDict
Structured metadata for a registered sample type.
- source: tlc.sample_types._sample_type.SampleTypeSource = None
- class SampleTypeRegistry
Registry for sample types.
Sample types are registered by name and can be retrieved by that name. The name is used in Schema.resolved_sample_type to specify which sample type to use.
Note: This registry is not thread-safe. Built-in sample types are registered at import time, which is safe. If registering sample types at runtime from multiple threads, external synchronization is required.
- classmethod discover_entrypoint_sample_types() -> list[type[SampleType]]
Discover and register sample types from installed entry points.
Scans for entry points in the tlc.sample_types group. Each entry point should reference a SampleType subclass. The entry point name becomes the registration name.
By default, names that are already registered are skipped (built-ins take priority). Set the class attribute force = True on the sample type class to replace an already-registered name.
- Returns:
List of successfully loaded sample type classes.
- classmethod get(name: str)
Get a sample type instance by name.
All built-in sample types are parameter-free; per-row metadata travels on the sample-form value (e.g. SegmentationPolygons.relative), and per-column encoding choices are baked into the variant name (e.g. "pil_jpeg").
- Parameters:
name – The registered name of the sample type.
- Returns:
An instance of the sample type.
- Raises:
KeyError – If no sample type is registered with the given name.
- classmethod get_class(name: str)
Get the sample type class by name without instantiating it.
- Parameters:
name – The registered name of the sample type.
- Returns:
The sample type class.
- Raises:
KeyError – If no sample type is registered with the given name.
- classmethod get_registered_sample_types() -> dict[str, type[SampleType]]
Get all registered sample types.
- Returns:
A copy of the name-to-class mapping.
- classmethod has(name: str)
Check if a sample type is registered with the given name.
- Parameters:
name – The name to check.
- Returns:
True if a sample type is registered with the name.
- classmethod list_sample_type_names() -> list[str]
List the names of all registered sample types.
Triggers entry point discovery if it has not yet run.
- Returns:
Sorted list of registered sample type names.
- classmethod list_sample_types() -> list[SampleTypeInfo]
List all registered sample types with structured metadata.
Triggers entry point discovery if it has not yet run.
- Returns:
One SampleTypeInfo per registered sample type, sorted by name.
- classmethod load_sample_types_from_config(config, force) -> list[type[SampleType]]
Load sample types from a configuration list.
Each entry should be a dictionary with:
module: The fully qualified module name.
class: The class name within that module.
name (optional): Registration name. Defaults to the entry point style lowercase class name if omitted.
force (optional): Override an already-registered name for this entry.
- Parameters:
config – List of sample type configuration dictionaries.
force – Default force value. Individual entries can override with their own force key.
- Returns:
List of successfully loaded sample type classes.
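The configuration shape above can be illustrated with a small resolver. This is a sketch under several assumptions, not the library's loader: `load_from_config_sketch` and the plain dict registry are hypothetical, the default name is taken to be the lowercased class name, entries that fail to import are skipped, and a taken name without force is skipped rather than raising. Standard-library classes stand in for real sample types so the example runs anywhere.

```python
import importlib
from typing import Any, Dict, List


def load_from_config_sketch(
    config: List[Dict[str, Any]],
    registry: Dict[str, type],
    force: bool = False,
) -> List[type]:
    loaded = []
    for entry in config:
        try:
            module = importlib.import_module(entry["module"])
            cls = getattr(module, entry["class"])
        except (ImportError, AttributeError):
            continue  # assumption: entries that fail to load are skipped
        name = entry.get("name", cls.__name__.lower())  # assumed default name
        entry_force = entry.get("force", force)  # per-entry key overrides the default
        if name in registry and not entry_force:
            continue  # assumption: a taken name is skipped unless forced
        registry[name] = cls
        loaded.append(cls)
    return loaded


registry: Dict[str, type] = {"fraction": int}  # pretend "fraction" is already taken
config = [
    {"module": "fractions", "class": "Fraction", "name": "fraction"},
    {"module": "decimal", "class": "Decimal"},
    {"module": "fractions", "class": "Fraction", "name": "fraction", "force": True},
    {"module": "no_such_module_xyz", "class": "Missing"},
]
loaded = load_from_config_sketch(config, registry)
print([cls.__name__ for cls in loaded], sorted(registry))
```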
- classmethod register_sample_type(name: str, sample_type_cls: type[SampleType], *, source: tlc.sample_types._sample_type.SampleTypeSource = 'runtime', force: bool = False)
Register a sample type class under the given name.
- Parameters:
name – The name to register the sample type under.
sample_type_cls – The SampleType subclass to register.
source – Where this registration originates from.
force – If True, replace an existing registration. A warning is logged when overriding a built-in sample type.
- Raises:
ValueError – If name is already registered with a different class and force is False.
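The documented lookup and force rules can be summarized in a stand-in registry. `RegistrySketch`, `PngLike`, and `JpegLike` are hypothetical names; the sketch only mirrors behavior stated in this section (ValueError on a conflicting name without force, KeyError for unknown names, instances returned by get, sorted names) and omits the source tracking and logging of the real class.

```python
from typing import Dict, List


class RegistrySketch:
    """Name-to-class registry mirroring the documented lookup and force rules."""

    def __init__(self) -> None:
        self._types: Dict[str, type] = {}

    def register(self, name: str, cls: type, *, force: bool = False) -> None:
        existing = self._types.get(name)
        if existing is not None and existing is not cls and not force:
            raise ValueError(f"{name!r} is already registered with {existing.__name__}")
        self._types[name] = cls

    def get(self, name: str) -> object:
        return self._types[name]()  # KeyError for unknown names; returns an instance

    def names(self) -> List[str]:
        return sorted(self._types)


class PngLike:
    pass


class JpegLike:
    pass


registry = RegistrySketch()
registry.register("image", PngLike)
registry.register("image", PngLike)  # same class again: not a conflict
try:
    registry.register("image", JpegLike)
    conflict = None
except ValueError as exc:
    conflict = str(exc)
registry.register("image", JpegLike, force=True)  # explicit override
```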
- SampleTypeSource = None
- class SegmentationMasksSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for mask-based instance segmentation and RLE storage.
Sample form: SegmentationMasks dataclass.
Storable form: dict with keys {image_height, image_width, instance_properties, rles}.
Also accepts legacy dict input (with keys image_height, image_width, instance_properties, masks) for backward compatibility.
Example::
    import numpy as np

    transform = SegmentationMasksSampleType()
    sample = SegmentationMasks(
        image_height=480,
        image_width=640,
        masks=np.zeros((480, 640, 2), dtype=np.uint8),
        labels=np.array([1, 2]),
    )
    storable = transform.to_row(sample)
    restored = transform.from_row(storable)
- accepts(value: Any)
Check if the value is a mask-based instance segmentation.
- Parameters:
value – The value to check.
- Returns:
True if value is a SegmentationMasks dataclass or a legacy dict with the expected keys.
- from_row(data) -> SegmentationMasks
Convert RLE storage format back to SegmentationMasks.
- Parameters:
data – Dict with image_height, image_width, instance_properties, rles.
- Returns:
SegmentationMasks dataclass.
- class SegmentationPolygonsSampleType
Bases: tlc.sample_types._sample_type.SampleType
SampleType for polygon-based instance segmentation, stored as RLE on the wire.
Sample form: SegmentationPolygons dataclass.
Storable form: dict with keys {image_height, image_width, instance_properties, rles}.
Coordinate convention travels on sample.relative (True = [0, 1], False = pixel). to_row() reads it to scale input polygons up to pixel space before RLE-encoding. from_row() always returns absolute pixel coords, because RLE decoding is pixel-natural. When the column’s value type declares polygons_are_relative=True, tlc.Schema.from_row() honors it and applies to_relative() on the way out; direct callers of this method (transform.from_row(...)) must convert themselves if needed.
Also accepts a legacy dict input (with keys image_height, image_width, instance_properties, polygons), assumed to be in pixel coordinates.
Example::
    import numpy as np

    sample = SegmentationPolygons(
        image_height=480,
        image_width=640,
        polygons=[[10.0, 20.0, 30.0, 40.0, 50.0, 60.0]],
        labels=np.array([1]),
    )
    transform = SegmentationPolygonsSampleType()
    storable = transform.to_row(sample)
    restored = transform.from_row(storable)  # relative=False
    relative = restored.to_relative()  # relative=True
- accepts(value: Any)
Check if the value is a polygon-based instance segmentation.
- Parameters:
value – The value to check.
- Returns:
True if value is a SegmentationPolygons dataclass or a legacy dict with the expected keys.
- from_row(data) -> SegmentationPolygons
Convert RLE storage format back to SegmentationPolygons in absolute pixel coords.
Always returns relative=False. The schema-driven path (tlc.Schema.from_row()) applies to_relative() on top when the column’s value type declares polygons_are_relative=True.
- Parameters:
data – Dict with image_height, image_width, instance_properties, rles.
- Returns:
SegmentationPolygons dataclass with relative=False.
- to_row(sample: SegmentationPolygons | Mapping[str, Any])
Convert polygon-based instance segmentation to RLE storage format.
- Parameters:
sample – A SegmentationPolygons dataclass, or a legacy dict (assumed to be in pixel coordinates).
- Returns:
Wire-format dict with image dimensions, instance properties, and RLEs.
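The relative-to-pixel scaling that precedes RLE encoding can be illustrated in isolation. This sketch is not the library's implementation: `to_pixel` and `to_relative` are hypothetical free functions, and it assumes each polygon is a flat list of interleaved x, y values, as the example polygons in this section suggest.

```python
from typing import List


def to_pixel(polygon: List[float], width: int, height: int) -> List[float]:
    """Scale a flat [x0, y0, x1, y1, ...] polygon from [0, 1] to pixel space."""
    # Even positions are x (scaled by width), odd positions are y (scaled by height).
    return [v * (width if i % 2 == 0 else height) for i, v in enumerate(polygon)]


def to_relative(polygon: List[float], width: int, height: int) -> List[float]:
    """Inverse of to_pixel: divide x by width and y by height."""
    return [v / (width if i % 2 == 0 else height) for i, v in enumerate(polygon)]


relative_polygon = [0.5, 0.25, 1.0, 1.0]
pixel_polygon = to_pixel(relative_polygon, width=640, height=480)
round_trip = to_relative(pixel_polygon, width=640, height=480)
print(pixel_polygon)
```

A direct caller of from_row() that needs [0, 1] coordinates would apply the inverse step itself, which is the role to_relative() plays on the returned dataclass.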
- class TorchTensorSampleType
Bases: tlc.sample_types._sample_type.SampleType
Inline transform for torch tensors.
Tensors are converted to a 1D numpy view (reshape(-1) for multi-dim, passthrough for 1D). The wrapping Schema reshapes back to its declared dims on read, so the sample type itself is shape-blind.
- validate_sample(sample: Any)
- class ValidationError
A validation error or warning found during data validation.
- Variables:
path – Dot-separated path to the problematic field (e.g., "instances.0.bbs_2d"). Empty string for errors at the root level.
message – Human-readable description of the issue.
severity – "error" for hard failures, "warning" for non-critical issues.
- class WebpImageSampleType
Bases: tlc.sample_types._image.PILImageSampleType
PIL.Image stored as WEBP. See PILImageSampleType for the contract.
- file_extension = ".webp"
- get_sample_types() -> list[str]
List all registered sample type names.
- Returns:
Sorted list of registered sample type names.
- register_sample_type(name, force) -> Callable[[type[SampleType]], type[SampleType]]
Decorator to register a SampleType subclass by name.
- Parameters:
name – The sample type name used in schema sample_type configuration.
force – If True, replace an existing registration for name. A warning is logged when overriding a built-in sample type.
- Returns:
A decorator that registers the class and returns it unchanged.
Example::
    import tlc

    @tlc.sample_types.register_sample_type("pil_image")
    class PILImage(tlc.sample_types.SampleType):
        def to_row(self, sample): ...
        def from_row(self, data): ...