tlc.export¶
Exporters for converting 3LC tables into common dataset formats.
Use Table.export() for one-shot exports. Author custom exporters by
subclassing tlc.export.Exporter, tlc.export.RowExporter, or
tlc.export.SerializingExporter and decorating with tlc.export.register_exporter().
Package Contents¶
Classes¶
| Class | Description |
|---|---|
| CocoExporter | Exporter for the COCO format. |
| CsvExporter | Exporter for the CSV format. |
| DefaultJsonExporter | Basic exporter for the JSON format. |
| Exporter | The base class for all Exporters. |
| ExporterInfo | Structured metadata for a registered exporter. |
| ExporterRegistry | Maintains registered exporters and provides lookup and discovery. |
| RowExporter | Base class for row-by-row exporters where each row maps to one output line. |
| SerializingExporter | Base class for exporters that serialize a table to a single string/file. |
| YoloExporter | Exporter for the YOLO format. |
Functions¶
| Function | Description |
|---|---|
| list_exporter_formats | List the format strings of all registered exporters. |
| list_exporters | List all registered exporters with structured metadata. |
| register_exporter | Class decorator that registers an Exporter subclass. |
Data¶
| Data | Description |
|---|---|
| ExporterSource | |
API¶
- class CocoExporter¶
Bases: tlc.export.exporter.SerializingExporter
Exporter for the COCO format.
Tables which are originally instances of the TableFromCoco class will be compatible with this exporter.
- can_export( ) bool¶
Check if the table can be exported to the COCO format.
This check cannot be fully accurate, because it does not know whether the user will supply additional arguments such as annotation_column_name or image_column_name.
- priority = 3¶
- serialize(
- table: Table,
- output_url: Url,
- weight_threshold: float = 0.0,
- image_folder: Url | str = '',
- absolute_image_paths: bool = False,
- annotation_column_name: str | None = None,
- image_column_name: str | None = None,
- per_image_extras: Sequence[str] | None = None,
- indent: int = 4,
- **kwargs: Any,
Serialize a table to the COCO format.
Default behavior is to write a COCO file with image paths relative to the (output) annotations file. Written paths can be further configured with the absolute_image_paths and image_folder arguments.
Note that for a COCO file to be valid, the image paths should be absolute or relative to the annotations file itself.
- Parameters:
table – The table to serialize.
output_url – The output URL.
weight_threshold – The minimum weight of a row to be included in the output.
image_folder – Make image paths relative to a specific folder. Note that this may produce an annotations file that needs special handling. This option is mutually exclusive with absolute_image_paths.
absolute_image_paths – Make image paths absolute. If this is set to True, image_folder cannot be set.
annotation_column_name – Optional column name to use for annotations instead of the defaults (bbs, segmentations, or keypoints_2d). The content of the column determines the annotation mode (bounding boxes, segmentations, or keypoints) based on its structure.
image_column_name – Optional column name to use for image URLs. Defaults to image.
per_image_extras – Image-level column names to include in the COCO image dicts. These should correspond to top-level table columns (e.g. columns originally imported via the importer’s per_image_extras parameter). When None, no extra image fields are written.
indent – The number of spaces to use for indentation in the output.
**kwargs – Any additional arguments.
- Returns:
The serialized table
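The three path modes described above can be illustrated with a small stand-alone sketch. It does not call 3LC: resolve_image_path is a hypothetical helper, it handles plain POSIX paths only, and the real exporter's URL handling may differ.

```python
import posixpath


def resolve_image_path(image_path, annotations_path,
                       image_folder="", absolute_image_paths=False):
    """Hypothetical helper mirroring the documented path options."""
    if absolute_image_paths and image_folder:
        # The two options are documented as mutually exclusive.
        raise ValueError("image_folder cannot be set with absolute_image_paths=True")
    if absolute_image_paths:
        return posixpath.abspath(image_path)
    # Default: relative to the directory that holds the annotations file.
    base = image_folder or posixpath.dirname(annotations_path)
    return posixpath.relpath(image_path, base)


image = "/data/images/a.jpg"
annotations = "/data/out/annotations.json"
print(resolve_image_path(image, annotations))
print(resolve_image_path(image, annotations, image_folder="/data"))
print(resolve_image_path(image, annotations, absolute_image_paths=True))
```

Paths written relative to image_folder only resolve correctly if the consumer knows that folder, which is presumably why the documentation warns that such a file may need special handling.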
- supported_format = coco¶
- class CsvExporter¶
Bases: tlc.export.exporter.SerializingExporter
Exporter for the CSV format.
- file_extensions = frozenset(...)¶
- priority = 1¶
- serialize(
- table: Table,
- output_url: Url,
- weight_threshold: float = 0.0,
- dialect: str | Dialect | type[Dialect] = 'excel',
- exclude_header: bool = False,
- **kwargs: Any,
Serialize a table to a CSV string.
- Parameters:
table – The table to serialize.
output_url – The output URL for the serialized data.
weight_threshold – The minimum weight of a row to be included in the output.
dialect – The dialect to use for the CSV output. This can be a string like “excel” or “unix”. If you are not using the CLI tool, but are instead using the Python API, you can also pass a Dialect object or a subclass of Dialect.
exclude_header – Exclude the header row in the output.
**kwargs – Additional keyword arguments.
- Returns:
A CSV string representing the table.
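To make the dialect and exclude_header options concrete, here is a sketch built only on Python's standard csv module. rows_to_csv is a hypothetical stand-in that takes a list of dicts rather than a 3LC Table, and it is not the exporter's actual implementation.

```python
import csv
import io


def rows_to_csv(rows, dialect="excel", exclude_header=False):
    """Stand-in for the documented options, operating on a list of dicts."""
    buffer = io.StringIO()
    fieldnames = list(rows[0].keys()) if rows else []
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, dialect=dialect)
    if not exclude_header:
        writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [{"id": 1, "label": "cat"}, {"id": 2, "label": "dog"}]

# "excel": CRLF line endings, fields quoted only when necessary.
excel_text = rows_to_csv(rows)

# "unix": LF line endings, every field quoted.
unix_text = rows_to_csv(rows, dialect="unix", exclude_header=True)

print(repr(excel_text))
print(repr(unix_text))
```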
- supported_format = csv¶
- class DefaultJsonExporter¶
Bases: tlc.export.exporter.SerializingExporter
Basic exporter for the JSON format.
This exporter is used when no other exporter is compatible with the Table, and the output path has a .json extension.
- file_extensions = frozenset(...)¶
- priority = 1¶
- serialize( ) str¶
Serialize a table to a JSON string.
- Parameters:
table – The table to serialize.
output_url – The output URL for the serialized data.
weight_threshold – The minimum weight of a row to be included in the output.
indent – The number of spaces to use for indentation in the output.
**kwargs – Additional keyword arguments.
- Returns:
A JSON string representing the table.
- supported_format = json¶
- class Exporter¶
The base class for all Exporters.
Exporters are used to export tables to various formats, typically after a user is done cleaning their data with 3LC. Subclasses of Exporter should be registered using the register_exporter() decorator, which makes them available for use in Table.export().
There are three patterns for implementing exporters:
Row exporters (simplest): Subclass RowExporter and implement export_row(), which converts a single row to a string. Declare a separator and the framework handles iteration, weight filtering, joining, and writing.
Serializing exporters (single-file output): Subclass SerializingExporter and implement the serialize() method, which returns a string to be written to the output URL.
Free-form exporters (e.g., directory output): Subclass Exporter directly and override the internal export hooks (_do_export and _get_export_impl_method; see the source for the full contract).
Subclasses can also override the can_export() method, which determines whether the exporter can export a given table to a given URL. The default implementation checks the file_extensions class attribute. If neither can_export() nor file_extensions is provided, the exporter will only be used when the format argument is specified explicitly in Table.export().
Subclasses of Exporter must define the class attribute supported_format, which is a string indicating the format that the exporter supports. Whenever the format argument is not specified, Table.export() calls can_export() on all registered exporters to find compatible ones. If multiple exporters are compatible, the one with the highest priority is used.
- Variables:
supported_format – A string indicating the format that the exporter supports.
priority – An integer indicating the priority of the exporter. Used to break ties when multiple exporters are compatible with a given table and URL. The exporter with the highest priority will be used.
file_extensions – A frozenset of file extensions (e.g., {".csv"}) that the exporter handles. The base can_export() implementation checks if the output URL’s extension is in this set.
force – If True, allows this exporter to override an already-registered format during entry point discovery.
- can_export( ) bool¶
Check if the exporter can export the given table to the given output URL.
The default implementation checks if the output URL’s extension is in file_extensions. Subclasses can override this for content-based detection (e.g., checking for annotation columns).
This method is called for all registered exporters when format is not specified in Table.export(), so it should be fast.
- Parameters:
table – The table to export.
output_url – The URL to export to.
- Returns:
True if the exporter can export the table to the given URL, False otherwise.
- static remaining_table_rows( ) Iterator[tlc._core.objects.table.TableRow]¶
Return an iterator of the remaining rows in the table after filtering out rows with a weight below the given threshold.
- Parameters:
table – The table to filter.
weight_threshold – The weight threshold.
- Returns:
An iterator of the remaining rows in the table.
- class ExporterInfo¶
Bases: typing_extensions.TypedDict
Structured metadata for a registered exporter.
- source: 'builtin' | 'entrypoint' | 'config' | 'runtime' = None¶
- class ExporterRegistry¶
Maintains registered exporters and provides lookup and discovery.
This is a static registry class: all methods are @staticmethod and state is stored in class variables. The registry maps format strings to exporter instances.
Registration follows a well-defined order:
1. Built-in exporters (@register_exporter at import time)
2. Entry point plugins (discover_entrypoint_exporters(), lazy)
3. Config-based (load_exporters_from_config(), at service startup)
4. Runtime direct calls (any time)
A later phase can replace an earlier one via force=True.
- static discover_entrypoint_exporters() list[Exporter]¶
Discover and register exporters from installed entry points.
Scans for entry points in the tlc.exporters group. Each entry point should reference an Exporter subclass that can be instantiated with no arguments.
By default, exporters whose format is already registered are skipped (built-ins take priority). Set force = True on the exporter class to replace already-registered formats.
- Returns:
List of successfully loaded exporter instances.
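To check which plugins an environment advertises in the tlc.exporters group, the standard importlib.metadata API can be queried directly. This is only a sketch of inspecting entry points, not the registry's own discovery code. find_exporter_entry_points is a hypothetical helper, and it lists entry points without loading them.

```python
from importlib.metadata import entry_points


def find_exporter_entry_points(group="tlc.exporters"):
    """Return (name, target) pairs for entry points declared in `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+ exposes a selectable interface.
        selected = eps.select(group=group)
    else:
        # Older versions return a plain dict keyed by group name.
        selected = eps.get(group, [])
    return [(ep.name, ep.value) for ep in selected]


found = find_exporter_entry_points()
for name, target in found:
    print(f"{name} -> {target}")
```

An empty result simply means no installed package declares an entry point in that group.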
- static get_exporter_for_format(
- fmt: str,
Get the exporter registered for the given format.
If the format is not found and entry point discovery has not yet run, triggers discovery before giving up.
- Parameters:
fmt – The format string (e.g., "csv", "coco").
- Returns:
The exporter instance, or None if no exporter is registered for the format.
- static get_registered_exporters() dict[str, Exporter]¶
Get a copy of all registered exporters.
- Returns:
Dictionary mapping format strings to their exporter instances.
- static get_registered_formats() list[str]¶
Get all registered format strings.
- Returns:
List of registered format strings.
- static infer_format( ) str¶
Infer the most suitable export format given a table and an output URL.
Iterates all registered exporters, calls can_export() on each, and returns the format with the highest priority. Raises ValueError if no compatible exporter is found or if there is an ambiguous tie.
- Parameters:
table – The table to export.
output_url – The URL to export to.
- Returns:
The inferred format string.
- Raises:
ValueError – If no compatible exporter is found, the output URL has no extension, or multiple exporters tie at the highest priority.
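The selection rule described above (filter by can_export(), pick the highest priority, fail on a tie) can be sketched without 3LC. FakeExporter and this infer_format are hypothetical stand-ins. The sketch ignores the table argument and models only the documented default of matching on the output extension.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeExporter:
    """Stand-in carrying the three documented class attributes."""
    supported_format: str
    priority: int
    file_extensions: frozenset

    def can_export(self, output_url):
        # Documented default: match on the output URL's extension.
        return posixpath.splitext(output_url)[1] in self.file_extensions


def infer_format(exporters, output_url):
    compatible = [e for e in exporters if e.can_export(output_url)]
    if not compatible:
        raise ValueError(f"No compatible exporter for {output_url!r}")
    best = max(e.priority for e in compatible)
    winners = [e for e in compatible if e.priority == best]
    if len(winners) > 1:
        names = ", ".join(w.supported_format for w in winners)
        raise ValueError(f"Ambiguous tie between: {names}")
    return winners[0].supported_format


exporters = [
    FakeExporter("csv", 1, frozenset({".csv"})),
    FakeExporter("json", 1, frozenset({".json"})),
    FakeExporter("rich_json", 3, frozenset({".json"})),  # hypothetical format
]
print(infer_format(exporters, "out/table.json"))
```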
- static list_exporter_formats() list[str]¶
List the format strings of all registered exporters.
Triggers entry point discovery if it has not yet run.
- Returns:
Sorted list of registered format strings.
- static list_exporters() list[ExporterInfo]¶
Get structured metadata for all registered exporters.
Triggers entry point discovery if it has not yet run.
- Returns:
One ExporterInfo per registered format.
- static load_exporter_from_module( ) Exporter | None¶
Load and register an exporter from a module.
Dynamically imports a module and instantiates an exporter class.
- Parameters:
module_name – The fully qualified module name (e.g., "mypackage.exporters").
class_name – The exporter class name (e.g., "MyCustomExporter").
kwargs – Optional keyword arguments to pass to the exporter constructor.
force – If True, allow overriding already-registered formats.
- Returns:
The exporter instance if successful, None otherwise.
- static load_exporters_from_config( ) list[Exporter]¶
Load exporters from a configuration list.
Each entry in the config list should be a dictionary with:
module: The fully qualified module name
class: The exporter class name
kwargs (optional): Dictionary of constructor arguments
force (optional): If True, allow this exporter to override already-registered formats. Overrides the method-level force parameter for this entry.
- Parameters:
config – List of exporter configuration dictionaries.
force – Default value for whether to allow overriding already-registered formats. Individual entries can override this with their own force key.
- Returns:
List of successfully loaded exporters.
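The shape of the config list, and the way a per-entry force key takes precedence over the call-level default, can be sketched as follows. This is an illustration under stated assumptions rather than the registry's code: load_class_instance and load_from_config are hypothetical helpers, and standard-library classes stand in for exporter classes so the example runs anywhere.

```python
import importlib


def load_class_instance(module_name, class_name, kwargs=None):
    """Import a module and instantiate a class from it; None on any failure."""
    try:
        cls = getattr(importlib.import_module(module_name), class_name)
        return cls(**(kwargs or {}))
    except Exception:
        return None


def load_from_config(config, force=False):
    loaded = []
    for entry in config:
        instance = load_class_instance(
            entry["module"], entry["class"], entry.get("kwargs")
        )
        if instance is not None:
            # A per-entry "force" key wins over the call-level default.
            loaded.append((instance, entry.get("force", force)))
    return loaded


config = [
    {"module": "collections", "class": "OrderedDict"},
    {"module": "collections", "class": "Counter", "kwargs": {"a": 2}, "force": True},
    {"module": "no_such_module_xyz", "class": "Missing"},  # skipped: import fails
]
loaded = load_from_config(config)
print([(type(obj).__name__, forced) for obj, forced in loaded])
```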
- static register_exporter(
- exporter: Exporter,
- *,
- force: bool = False,
- source: 'builtin' | 'entrypoint' | 'config' | 'runtime' = 'runtime',
Register an exporter instance.
- Parameters:
exporter – The exporter instance to register. Must have a supported_format attribute.
force – If True, allow overriding an already-registered format. If False (default), raises ValueError when attempting to register a format that already has an exporter.
source – Where this registration originates from. One of "builtin", "entrypoint", "config", or "runtime".
- Raises:
ValueError – If the format is already registered and force is False.
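The force semantics can be seen in a few lines with a toy registry. MiniRegistry, CsvLike, and BetterCsv are hypothetical names used only for illustration. The real ExporterRegistry keeps more state than this.

```python
class MiniRegistry:
    """Simplified, hypothetical registry keyed by format string."""
    _exporters = {}
    _sources = {}

    @staticmethod
    def register_exporter(exporter, *, force=False, source="runtime"):
        fmt = exporter.supported_format
        if fmt in MiniRegistry._exporters and not force:
            raise ValueError(f"Format {fmt!r} is already registered")
        MiniRegistry._exporters[fmt] = exporter
        MiniRegistry._sources[fmt] = source


class CsvLike:
    supported_format = "csv"


class BetterCsv:
    supported_format = "csv"


MiniRegistry.register_exporter(CsvLike(), source="builtin")

try:
    # Same format, no force: rejected.
    MiniRegistry.register_exporter(BetterCsv())
except ValueError as err:
    print(err)

# With force=True the later registration replaces the earlier one.
MiniRegistry.register_exporter(BetterCsv(), force=True)
```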
- static reset() None¶
Reset the registry to its initial state.
This removes all registered exporters. Primarily intended for testing.
- ExporterSource = None¶
- class RowExporter¶
Bases:
tlc.export.exporter.SerializingExporterBase class for row-by-row exporters where each row maps to one output line.
This is the simplest exporter pattern. Subclasses implement
export_row()which converts a single table row to a string. The framework handles iteration, weight filtering, joining withseparator, and writing to the output URL.Example:
@register_exporter class NdjsonExporter(RowExporter): supported_format = "ndjson" file_extensions = frozenset({".ndjson", ".jsonl"}) separator = "\n" def export_row(self, row, **kwargs): import json return json.dumps(row)
- Variables:
separator – The string used to join row outputs. Defaults to "\n".
- abstract export_row(
- row: tlc._core.objects.table.TableRow,
- **kwargs: Any,
Convert a single table row to a string.
- Parameters:
row – A single row from the table (dict-like mapping column names to values).
**kwargs – Additional format-specific arguments.
- Returns:
The string representation of the row.
- serialize( ) str¶
Serialize the table by converting each row and joining with the separator.
- Parameters:
table – The table to serialize.
output_url – The URL to export to (available for path resolution).
weight_threshold – The weight threshold for filtering rows.
**kwargs – Additional arguments passed to export_row().
- Returns:
The serialized table as a string.
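What the framework does around export_row() can be approximated in plain Python. The sketch below is not the 3LC base class. MiniRowExporter is a hypothetical stand-in, rows are plain dicts, and it assumes the row weight lives under a "weight" key with a default of 1.0 when absent.

```python
import json
from abc import ABC, abstractmethod


class MiniRowExporter(ABC):
    """Hypothetical stand-in for the row-by-row pattern."""
    separator = "\n"

    @abstractmethod
    def export_row(self, row, **kwargs):
        """Convert one row to a string."""

    def serialize(self, rows, weight_threshold=0.0, **kwargs):
        # Assumption: the weight is stored under "weight"; missing means 1.0.
        kept = (r for r in rows if r.get("weight", 1.0) >= weight_threshold)
        return self.separator.join(self.export_row(r, **kwargs) for r in kept)


class NdjsonSketch(MiniRowExporter):
    def export_row(self, row, **kwargs):
        return json.dumps(row, sort_keys=True)


rows = [{"id": 1, "weight": 1.0}, {"id": 2, "weight": 0.2}, {"id": 3}]
text = NdjsonSketch().serialize(rows, weight_threshold=0.5)
print(text)
```

Keeping the per-row conversion separate from iteration and joining is what lets a subclass stay as small as the NdjsonExporter example.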
- class SerializingExporter¶
Bases:
tlc.export.exporter.ExporterBase class for exporters that serialize a table to a single string/file.
Subclasses must implement the
serialize()method, which returns a string representation of the table. The string is then written to the output URL.This is the most common exporter pattern, suitable for formats like JSON, CSV, and COCO.
- abstract serialize( ) str¶
Serialize a table to a string which can be written to a URL.
- Parameters:
table – The table to serialize.
output_url – The URL to export to (available for path resolution).
weight_threshold – The weight threshold for filtering rows.
**kwargs – Additional format-specific arguments.
- Returns:
The serialized table as a string.
- class YoloExporter¶
Bases: tlc.export.exporter.Exporter
Exporter for the YOLO format.
YOLO format writes:
One label file per image (only if there are labels)
A dataset YAML configuration file
Optionally copies/moves/symlinks images to the output directory
The exporter supports:
Detection (bounding boxes)
Segmentation (polygons)
Pose (keypoints)
Oriented bounding boxes (OBB)
Additive exports are supported: you can export multiple tables to the same output directory by calling export_yolo() multiple times with different split names. Each split must have matching category mappings.
Example:

    train_table.export(output_url="./dataset", format="yolo", split="train")
    val_table.export(output_url="./dataset", format="yolo", split="val")

Output directory structure:

    <output_url>/
    ├── <dataset_name>.yaml
    ├── images/
    │   ├── train/
    │   │   ├── image1.jpg
    │   │   └── ...
    │   └── val/
    │       └── ...
    └── labels/
        ├── train/
        │   ├── image1.txt
        │   └── ...
        └── val/
            └── ...

- can_export( ) bool¶
Check if the table can be exported to the YOLO format.
YOLO export requires:
A bounding box, segmentation, keypoints, or OBB column
An output URL that is a directory (no extension or empty extension)
- export_yolo(
- table: Table,
- output_url: Url,
- weight_threshold: float = 0.0,
- split: str = 'train',
- image_strategy: Literal['ignore', 'copy', 'move', 'symlink'] = 'ignore',
- image_column_name: str | None = None,
- annotation_column_name: str | None = None,
- dataset_name: str = 'dataset',
- **kwargs: Any,
Export a table to YOLO format.
- Parameters:
table – The table to export.
output_url – The directory URL to export to.
weight_threshold – The weight threshold for filtering rows.
split – The name of the split (e.g., “train”, “val”, “test”). Defaults to “train”.
image_strategy – How to handle images. Options are:
- “ignore”: Do not copy or symlink images (default).
- “copy”: Copy images to the output directory.
- “move”: Move images to the output directory.
- “symlink”: Create symlinks to images. Only works for local file URLs. Not supported on Windows.
image_column_name – The column containing image URLs. Defaults to “image”.
annotation_column_name – The column containing annotations. Auto-detected if not provided.
dataset_name – The name for the dataset YAML file. Defaults to “dataset”.
**kwargs – Additional arguments (ignored).
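As a rough guide to where files land for a given split, the documented directory layout can be expressed as a path computation. yolo_output_paths is a hypothetical helper that performs no I/O. It assumes the YAML file is named after dataset_name and that each label file shares its image's stem, as the layout above suggests.

```python
from pathlib import PurePosixPath


def yolo_output_paths(output_dir, split, image_name, dataset_name="dataset"):
    """Compute target paths in the documented layout (paths only, no I/O)."""
    root = PurePosixPath(output_dir)
    image = PurePosixPath(image_name)
    return {
        # Assumption: the YAML file is <dataset_name>.yaml at the root.
        "yaml": root / f"{dataset_name}.yaml",
        "image": root / "images" / split / image.name,
        "label": root / "labels" / split / f"{image.stem}.txt",
    }


paths = yolo_output_paths("dataset", "val", "/raw/photos/image1.jpg")
for kind, path in paths.items():
    print(kind, path)
```

Because every split writes under its own images/ and labels/ subfolder, repeated exports with different split names do not overwrite each other's files.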
- priority = 2¶
- supported_format = yolo¶
- list_exporter_formats() list[str]¶
List the format strings of all registered exporters.
Module-level wrapper that delegates to ExporterRegistry.list_exporter_formats(). Use list_exporters() for full structured metadata.
- Returns:
Sorted list of registered format strings.
- list_exporters() list[ExporterInfo]¶
List all registered exporters with structured metadata.
Module-level wrapper that delegates to ExporterRegistry.list_exporters(). Triggers entry point discovery if it has not yet run, so plugins installed via pip are reflected in the output without any extra setup.
- Returns:
One ExporterInfo per registered format.
- register_exporter( ) type[Exporter]¶
Class decorator that registers an Exporter subclass.
Instantiates the class with no arguments and registers it for the format specified by supported_format. If instantiation or registration fails, the error is logged and the class is returned unregistered.
Built-in exporters (those with supported_format in _BUILTIN_FORMATS) are recorded with source="builtin"; all others get source="runtime".
- Parameters:
cls – The Exporter subclass to register.
- Returns:
The class, unchanged.