tlc.export

Exporters for converting 3LC tables into common dataset formats.

Use Table.export() for one-shot exports. Author custom exporters by subclassing tlc.export.Exporter, tlc.export.RowExporter, or tlc.export.SerializingExporter and decorating with tlc.export.register_exporter().

Package Contents

Classes

  • CocoExporter – Exporter for the COCO format.

  • CsvExporter – Exporter for the CSV format.

  • DefaultJsonExporter – Basic exporter for the JSON format.

  • Exporter – The base class for all Exporters.

  • ExporterInfo – Structured metadata for a registered exporter.

  • ExporterRegistry – Maintains registered exporters and provides lookup and discovery.

  • RowExporter – Base class for row-by-row exporters where each row maps to one output line.

  • SerializingExporter – Base class for exporters that serialize a table to a single string/file.

  • YoloExporter – Exporter for the YOLO format.

Functions

  • list_exporter_formats – List the format strings of all registered exporters.

  • list_exporters – List all registered exporters with structured metadata.

  • register_exporter – Class decorator that registers an Exporter subclass.

Data

  • ExporterSource

API

class CocoExporter

Bases: tlc.export.exporter.SerializingExporter

Exporter for the COCO format.

Tables that originate from the TableFromCoco class are compatible with this exporter.

can_export(
table: Table,
output_url: Url,
) bool

Check if the table can be exported to the COCO format.

This check cannot be fully accurate, because it does not know whether the user will supply additional arguments such as annotation_column_name or image_column_name.

priority = 3
serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
image_folder: Url | str = '',
absolute_image_paths: bool = False,
annotation_column_name: str | None = None,
image_column_name: str | None = None,
per_image_extras: Sequence[str] | None = None,
indent: int = 4,
**kwargs: Any,
) str

Serialize a table to the COCO format.

By default, the COCO file is written with image paths relative to the (output) annotations file. Written paths can be further configured with the absolute_image_paths and image_folder arguments.

Note that for a COCO file to be valid, the image paths should be either absolute or relative to the annotations file itself.

Parameters:
  • table – The table to serialize

  • output_url – The output URL

  • weight_threshold – The minimum weight of a row to be included in the output.

  • image_folder – Make image paths relative to a specific folder. Note that this may produce an annotations file that needs special handling. This option is mutually exclusive with absolute_image_paths.

  • absolute_image_paths – Make image paths absolute. If this is set to True, the image_folder cannot be set.

  • annotation_column_name – Optional column name to use for annotations instead of the defaults (bbs, segmentations, or keypoints_2d). The content of the column determines the annotation mode (bounding boxes, segmentations, or keypoints) based on its structure.

  • image_column_name – Optional column name to use for image URLs. Defaults to image.

  • per_image_extras – Image-level column names to include in the COCO image dicts. These should correspond to top-level table columns (e.g. columns originally imported via the importer’s per_image_extras parameter). When None, no extra image fields are written.

  • indent – The number of spaces to use for indentation in the output.

  • **kwargs – Any additional arguments

Returns:

The serialized table
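The path options described above can be illustrated with a small sketch. The helper resolve_image_path is hypothetical and not part of tlc; it only models the documented rules for POSIX-style paths: relative to the annotations file by default, relative to image_folder when given, or absolute, with the last two being mutually exclusive.

```python
import posixpath


def resolve_image_path(image_path, annotations_path, image_folder="", absolute_image_paths=False):
    """Hypothetical helper: choose how an image path is written to a COCO file."""
    if absolute_image_paths and image_folder:
        # The two options are documented as mutually exclusive.
        raise ValueError("image_folder cannot be combined with absolute_image_paths")
    if absolute_image_paths:
        return posixpath.abspath(image_path)
    # Default: relative to the directory that holds the annotations file.
    base = image_folder or posixpath.dirname(annotations_path)
    return posixpath.relpath(image_path, base)


image = "/data/images/0001.jpg"
annotations = "/data/annotations/train.json"

default_path = resolve_image_path(image, annotations)
folder_path = resolve_image_path(image, annotations, image_folder="/data")
absolute_path = resolve_image_path(image, annotations, absolute_image_paths=True)
```

A file written with image_folder is only readable by tools that know which folder the paths are relative to, which is presumably why the parameter description warns about special handling.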

supported_format = coco
class CsvExporter

Bases: tlc.export.exporter.SerializingExporter

Exporter for the CSV format.

file_extensions = frozenset(...)
priority = 1
serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
dialect: str | Dialect | type[Dialect] = 'excel',
exclude_header: bool = False,
**kwargs: Any,
) str

Serialize a table to a CSV string.

Parameters:
  • table – The table to serialize.

  • output_url – The output URL for the serialized data.

  • weight_threshold – The minimum weight of a row to be included in the output.

  • dialect – The dialect to use for the CSV output. This can be a string like “excel” or “unix”. When using the Python API rather than the CLI tool, you can also pass a Dialect object or a subclass of Dialect.

  • exclude_header – Exclude the header row in the output.

  • **kwargs – Additional keyword arguments.

Returns:

A CSV string representing the table.
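To show what the dialect and exclude_header options change, here is a sketch that uses only the standard library csv module. It assumes the dialect argument follows the csv module's dialect names; rows_to_csv is a hypothetical helper, not the exporter's implementation.

```python
import csv
import io


def rows_to_csv(header, rows, dialect="excel", exclude_header=False):
    """Write rows to a CSV string using the standard library csv module."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, dialect=dialect)
    if not exclude_header:
        writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


header = ["id", "label"]
rows = [[1, "cat"]]

excel_csv = rows_to_csv(header, rows)                      # \r\n line endings, minimal quoting
unix_csv = rows_to_csv(header, rows, dialect="unix")       # \n line endings, every field quoted
no_header_csv = rows_to_csv(header, rows, exclude_header=True)
```

The "excel" dialect terminates lines with \r\n and quotes only when needed, while "unix" uses \n and quotes every field, so the choice matters for tools that are strict about either.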

supported_format = csv
class DefaultJsonExporter

Bases: tlc.export.exporter.SerializingExporter

Basic exporter for the JSON format.

This exporter is used when no other exporter is compatible with the Table, and the output path has a .json extension.

file_extensions = frozenset(...)
priority = 1
serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
indent: int = 4,
**kwargs: Any,
) str

Serialize a table to a JSON string.

Parameters:
  • table – The table to serialize.

  • output_url – The output URL for the serialized data.

  • weight_threshold – The minimum weight of a row to be included in the output.

  • indent – The number of spaces to use for indentation in the output.

  • **kwargs – Additional keyword arguments.

Returns:

A JSON string representing the table.

supported_format = json
class Exporter

The base class for all Exporters.

Exporters are used to export tables to various formats, typically after a user is done cleaning their data with 3LC. Subclasses of Exporter should be registered using the register_exporter() decorator, which makes them available for use in Table.export().

There are three patterns for implementing exporters:

  1. Row exporters (simplest): Subclass RowExporter and implement export_row(), which converts a single row to a string. Declare a separator and the framework handles iteration, weight filtering, joining, and writing.

  2. Serializing exporters (single-file output): Subclass SerializingExporter and implement the serialize() method, which returns a string to be written to the output URL.

  3. Free-form exporters (e.g., directory output): Subclass Exporter directly and override the internal export hooks (_do_export and _get_export_impl_method — see the source for the full contract).

Subclasses can also override the can_export() method, which determines whether the exporter can export a given table to a given URL. The default implementation checks the file_extensions class attribute. If neither can_export() nor file_extensions is provided, the exporter will only be used when the format argument is specified explicitly in Table.export().

Subclasses of Exporter must define the class attribute supported_format, a string naming the format that the exporter supports. When the format argument is not specified, Table.export() calls can_export() on every registered exporter to find compatible ones. If multiple exporters are compatible, the one with the highest priority is used.

Variables:
  • supported_format – A string indicating the format that the exporter supports.

  • priority – An integer indicating the priority of the exporter. Used to break ties when multiple exporters are compatible with a given table and URL. The exporter with the highest priority will be used.

  • file_extensions – A frozenset of file extensions (e.g., {".csv"}) that the exporter handles. The base can_export() implementation checks if the output URL’s extension is in this set.

  • force – If True, allows this exporter to override an already-registered format during entry point discovery.

can_export(
table: Table,
output_url: Url,
) bool

Check if the exporter can export the given table to the given output URL.

The default implementation checks if the output URL’s extension is in file_extensions. Subclasses can override this for content-based detection (e.g., checking for annotation columns).

This method is called for all registered exporters when format is not specified in Table.export(), so it should be fast.

Parameters:
  • table – The table to export.

  • output_url – The URL to export to.

Returns:

True if the exporter can export the table to the given URL, False otherwise.
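The default extension check can be pictured roughly as follows. SketchExporter is a hypothetical stand-in rather than the tlc base class, and it treats the output URL as a plain POSIX-style path string, which is an assumption made to keep the example self-contained.

```python
from pathlib import PurePosixPath


class SketchExporter:
    """Hypothetical stand-in for an exporter; not the tlc base class."""

    file_extensions = frozenset({".csv"})

    def can_export(self, table, output_url):
        # Compatible when the output URL's extension is in file_extensions.
        return PurePosixPath(str(output_url)).suffix in self.file_extensions


exporter = SketchExporter()
csv_ok = exporter.can_export(None, "out/data.csv")
json_ok = exporter.can_export(None, "out/data.json")
directory_ok = exporter.can_export(None, "out/dataset")
```

Because the check never touches the table, it stays cheap even when it runs for every registered exporter.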

file_extensions: ClassVar[frozenset[str]] = frozenset(...)
force: ClassVar[bool] = False
priority: int = 0
static remaining_table_rows(
table: Table,
weight_threshold: float,
) Iterator[tlc._core.objects.table.TableRow]

Return an iterator of the remaining rows in the table after filtering out rows with a weight below the given threshold.

Parameters:
  • table – The table to filter.

  • weight_threshold – The weight threshold. Rows with a weight below this value are filtered out.

Returns:

An iterator of the remaining rows in the table.
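A minimal sketch of this filtering follows. It assumes each row is a dict that stores its weight under a "weight" key; that key name and the remaining_rows helper are illustrative, not taken from the tlc source.

```python
def remaining_rows(rows, weight_threshold):
    """Yield rows whose weight is at least weight_threshold (sketch only)."""
    for row in rows:
        # Assumption: the per-row weight lives under a "weight" key.
        if row["weight"] >= weight_threshold:
            yield row


rows = [
    {"id": 0, "weight": 1.0},
    {"id": 1, "weight": 0.0},
    {"id": 2, "weight": 0.5},
]
kept_ids = [row["id"] for row in remaining_rows(rows, weight_threshold=0.5)]
all_ids = [row["id"] for row in remaining_rows(rows, weight_threshold=0.0)]
```

With the default threshold of 0.0 used by the exporters, no row with a non-negative weight is dropped.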

supported_format: str = None
class ExporterInfo

Bases: typing_extensions.TypedDict

Structured metadata for a registered exporter.

exporter_class: str = None
format: str = None
module: str = None
source: 'builtin' | 'entrypoint' | 'config' | 'runtime' = None
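The four fields above could be modelled as shown below. ExporterInfoSketch and describe are hypothetical names, and filling exporter_class with the bare class name is an assumption; the sketch only mirrors the documented field names and the allowed source values.

```python
from typing import Literal, TypedDict

SourceSketch = Literal["builtin", "entrypoint", "config", "runtime"]


class ExporterInfoSketch(TypedDict):
    """Hypothetical mirror of the documented ExporterInfo fields."""

    exporter_class: str
    format: str
    module: str
    source: SourceSketch


def describe(exporter, source):
    """Build one metadata entry from an exporter instance."""
    cls = type(exporter)
    return ExporterInfoSketch(
        exporter_class=cls.__name__,
        format=exporter.supported_format,
        module=cls.__module__,
        source=source,
    )


class NdjsonSketch:
    supported_format = "ndjson"


info = describe(NdjsonSketch(), "runtime")
```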
class ExporterRegistry

Maintains registered exporters and provides lookup and discovery.

This is a static registry class — all methods are @staticmethod and state is stored in class variables. The registry maps format strings to exporter instances.

Registration follows a well-defined order:

  1. Built-in exporters (@register_exporter at import time)

  2. Entry point plugins (discover_entrypoint_exporters(), lazy)

  3. Config-based (load_exporters_from_config(), at service startup)

  4. Runtime direct calls (any time)

A later phase can replace an earlier one via force=True.

static discover_entrypoint_exporters() list[Exporter]

Discover and register exporters from installed entry points.

Scans for entry points in the tlc.exporters group. Each entry point should reference an Exporter subclass that can be instantiated with no arguments.

By default, exporters whose format is already registered are skipped (built-ins take priority). Set force = True on the exporter class to replace already-registered formats.

Returns:

List of successfully loaded exporter instances.

static get_exporter_for_format(
fmt: str,
) Exporter | None

Get the exporter registered for the given format.

If the format is not found and entry point discovery has not yet run, triggers discovery before giving up.

Parameters:

fmt – The format string (e.g., "csv", "coco").

Returns:

The exporter instance, or None if no exporter is registered for the format.

static get_registered_exporters() dict[str, Exporter]

Get a copy of all registered exporters.

Returns:

Dictionary mapping format strings to their exporter instances.

static get_registered_formats() list[str]

Get all registered format strings.

Returns:

List of registered format strings.

static infer_format(
table: Table,
output_url: Url,
) str

Infer the most suitable export format given a table and an output URL.

Iterates all registered exporters, calls can_export() on each, and returns the format with the highest priority. Raises ValueError if no compatible exporter is found or if there is an ambiguous tie.

Parameters:
  • table – The table to export.

  • output_url – The URL to export to.

Returns:

The inferred format string.

Raises:

ValueError – If no compatible exporter is found, the output URL has no extension, or multiple exporters tie at the highest priority.
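The selection rule can be sketched with stand-in objects. SketchExporter, the free function infer_format, and the "annotated-json" format name are all made up for illustration; only the rule itself (highest priority wins, ties and empty results raise ValueError) comes from the description above.

```python
from pathlib import PurePosixPath


class SketchExporter:
    """Hypothetical stand-in; only the attributes used for inference."""

    def __init__(self, supported_format, priority, file_extensions):
        self.supported_format = supported_format
        self.priority = priority
        self.file_extensions = frozenset(file_extensions)

    def can_export(self, table, output_url):
        return PurePosixPath(output_url).suffix in self.file_extensions


def infer_format(exporters, table, output_url):
    """Return the format of the single highest-priority compatible exporter."""
    compatible = [e for e in exporters if e.can_export(table, output_url)]
    if not compatible:
        raise ValueError(f"No compatible exporter for {output_url!r}")
    best = max(e.priority for e in compatible)
    winners = [e for e in compatible if e.priority == best]
    if len(winners) > 1:
        raise ValueError("Ambiguous: several exporters tie at the highest priority")
    return winners[0].supported_format


exporters = [
    SketchExporter("csv", 1, {".csv"}),
    SketchExporter("json", 1, {".json"}),
    SketchExporter("annotated-json", 3, {".json"}),  # made-up format name
]
inferred = infer_format(exporters, table=None, output_url="out/data.json")
```

Passing format explicitly to Table.export() sidesteps inference entirely, which is the simplest way out of an ambiguous tie.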

static list_exporter_formats() list[str]

List the format strings of all registered exporters.

Triggers entry point discovery if it has not yet run.

Returns:

Sorted list of registered format strings.

static list_exporters() list[ExporterInfo]

Get structured metadata for all registered exporters.

Triggers entry point discovery if it has not yet run.

Returns:

One ExporterInfo per registered format.

static load_exporter_from_module(
module_name: str,
class_name: str,
kwargs: dict[str, Any] | None = None,
*,
force: bool = False,
) Exporter | None

Load and register an exporter from a module.

Dynamically imports a module and instantiates an exporter class.

Parameters:
  • module_name – The fully qualified module name (e.g., "mypackage.exporters").

  • class_name – The exporter class name (e.g., "MyCustomExporter").

  • kwargs – Optional keyword arguments to pass to the exporter constructor.

  • force – If True, allow overriding already-registered formats.

Returns:

The exporter instance if successful, None otherwise.
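The dynamic import step might look like the sketch below, which uses importlib from the standard library. load_instance is a hypothetical helper that leaves out the registration step, and a standard library class stands in for an exporter class so the example runs without any plugin installed.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def load_instance(module_name, class_name, kwargs=None):
    """Import module_name, look up class_name, and instantiate it; None on failure."""
    try:
        module = importlib.import_module(module_name)
        cls = getattr(module, class_name)
        return cls(**(kwargs or {}))
    except (ImportError, AttributeError, TypeError) as exc:
        logger.warning("Could not load %s.%s: %s", module_name, class_name, exc)
        return None


# A standard library class stands in for an exporter class here.
loaded = load_instance("datetime", "timedelta", {"days": 1})
# This module is assumed not to be installed, so loading fails and yields None.
missing = load_instance("mypackage.exporters", "MyCustomExporter")
```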

static load_exporters_from_config(
config: list[dict[str, Any]],
*,
force: bool = False,
) list[Exporter]

Load exporters from a configuration list.

Each entry in the config list should be a dictionary with:

  • module: The fully qualified module name

  • class: The exporter class name

  • kwargs (optional): Dictionary of constructor arguments

  • force (optional): If True, allow this exporter to override already-registered formats. Overrides the method-level force parameter for this entry.

Parameters:
  • config – List of exporter configuration dictionaries.

  • force – Default value for whether to allow overriding already-registered formats. Individual entries can override this with their own force key.

Returns:

List of successfully loaded exporters.
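A configuration list in the shape described above could look like this. The module and class names are placeholders, and effective_force is a hypothetical helper that only illustrates how a per-entry force key takes precedence over the method-level default.

```python
def effective_force(entry, default_force=False):
    """Per-entry "force" wins over the method-level default (sketch only)."""
    return bool(entry.get("force", default_force))


# Hypothetical modules and classes, shown only to illustrate the entry shape.
config = [
    {"module": "mypackage.exporters", "class": "MyCustomExporter"},
    {
        "module": "mypackage.exporters",
        "class": "MyCsvExporter",
        "kwargs": {"delimiter": ";"},
        "force": True,
    },
]

force_flags = [effective_force(entry) for entry in config]
forced_by_default = [effective_force(entry, default_force=True) for entry in config]
```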

static register_exporter(
exporter: Exporter,
*,
force: bool = False,
source: 'builtin' | 'entrypoint' | 'config' | 'runtime' = 'runtime',
) None

Register an exporter instance.

Parameters:
  • exporter – The exporter instance to register. Must have a supported_format attribute.

  • force – If True, allow overriding an already-registered format. If False (default), raises ValueError when attempting to register a format that already has an exporter.

  • source – Where this registration originates from. One of "builtin", "entrypoint", "config", or "runtime".

Raises:

ValueError – If the format is already registered and force is False.
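The effect of force can be shown with a small in-memory stand-in. RegistrySketch, BuiltinCsv, and PluginCsv are hypothetical; the sketch reproduces only the documented behavior of rejecting a duplicate format unless force is True.

```python
class RegistrySketch:
    """Hypothetical in-memory registry; state lives in a class variable."""

    _exporters = {}

    @staticmethod
    def register_exporter(exporter, *, force=False):
        fmt = exporter.supported_format
        if fmt in RegistrySketch._exporters and not force:
            raise ValueError(f"Format {fmt!r} is already registered")
        RegistrySketch._exporters[fmt] = exporter

    @staticmethod
    def get_exporter_for_format(fmt):
        return RegistrySketch._exporters.get(fmt)


class BuiltinCsv:
    supported_format = "csv"


class PluginCsv:
    supported_format = "csv"


RegistrySketch.register_exporter(BuiltinCsv())
try:
    RegistrySketch.register_exporter(PluginCsv())
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True

# With force=True the later registration replaces the earlier one.
RegistrySketch.register_exporter(PluginCsv(), force=True)
active = RegistrySketch.get_exporter_for_format("csv")
```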

static reset() None

Reset the registry to its initial state.

This removes all registered exporters. Primarily intended for testing.

static unregister_exporter(
exporter: Exporter,
) bool

Unregister an exporter by instance.

Parameters:

exporter – The exporter instance to unregister.

Returns:

True if the exporter was found and removed, False otherwise.

static unregister_format(
fmt: str,
) bool

Unregister the exporter for a specific format.

Parameters:

fmt – The format to unregister.

Returns:

True if the format was found and removed, False otherwise.

ExporterSource = None
class RowExporter

Bases: tlc.export.exporter.SerializingExporter

Base class for row-by-row exporters where each row maps to one output line.

This is the simplest exporter pattern. Subclasses implement export_row() which converts a single table row to a string. The framework handles iteration, weight filtering, joining with separator, and writing to the output URL.

Example:

import json

from tlc.export import RowExporter, register_exporter


@register_exporter
class NdjsonExporter(RowExporter):
    supported_format = "ndjson"
    file_extensions = frozenset({".ndjson", ".jsonl"})
    separator = "\n"

    def export_row(self, row, **kwargs):
        # Each row becomes one JSON document on its own line.
        return json.dumps(row)
Variables:

separator – The string used to join row outputs. Defaults to "\n".

abstract export_row(
row: tlc._core.objects.table.TableRow,
**kwargs: Any,
) str

Convert a single table row to a string.

Parameters:
  • row – A single row from the table (dict-like mapping column names to values).

  • **kwargs – Additional format-specific arguments.

Returns:

The string representation of the row.

separator: str = \n
serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
**kwargs: Any,
) str

Serialize the table by converting each row and joining with the separator.

Parameters:
  • table – The table to serialize.

  • output_url – The URL to export to (available for path resolution).

  • weight_threshold – The weight threshold for filtering rows.

  • **kwargs – Additional arguments passed to export_row().

Returns:

The serialized table as a string.
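What the framework does around export_row() can be approximated as follows. RowExporterSketch and NdjsonSketch are hypothetical stand-ins that take a plain list of dicts instead of a Table, and the "weight" key is an assumption; the point is only the filter, convert, join sequence.

```python
import json


class RowExporterSketch:
    """Hypothetical stand-in showing what the row-exporter framework does."""

    separator = "\n"

    def export_row(self, row, **kwargs):
        raise NotImplementedError

    def serialize(self, rows, weight_threshold=0.0, **kwargs):
        # Assumption: each row is a dict with a "weight" key.
        kept = (row for row in rows if row["weight"] >= weight_threshold)
        return self.separator.join(self.export_row(row, **kwargs) for row in kept)


class NdjsonSketch(RowExporterSketch):
    def export_row(self, row, **kwargs):
        return json.dumps(row, sort_keys=True)


rows = [{"id": 0, "weight": 1.0}, {"id": 1, "weight": 0.2}]
filtered = NdjsonSketch().serialize(rows, weight_threshold=0.5)
unfiltered = NdjsonSketch().serialize(rows)
```

Since the separator is used to join rows rather than terminate them, the output in this sketch has no trailing newline.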

class SerializingExporter

Bases: tlc.export.exporter.Exporter

Base class for exporters that serialize a table to a single string/file.

Subclasses must implement the serialize() method, which returns a string representation of the table. The string is then written to the output URL.

This is the most common exporter pattern, suitable for formats like JSON, CSV, and COCO.

abstract serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
**kwargs: Any,
) str

Serialize a table to a string which can be written to a URL.

Parameters:
  • table – The table to serialize.

  • output_url – The URL to export to (available for path resolution).

  • weight_threshold – The weight threshold for filtering rows.

  • **kwargs – Additional format-specific arguments.

Returns:

The serialized table as a string.

class YoloExporter

Bases: tlc.export.exporter.Exporter

Exporter for the YOLO format.

YOLO format writes:

  • One label file per image (only if there are labels)

  • A dataset YAML configuration file

  • Optionally copies/moves/symlinks images to the output directory

The exporter supports:

  • Detection (bounding boxes)

  • Segmentation (polygons)

  • Pose (keypoints)

  • Oriented bounding boxes (OBB)

Additive exports are supported: you can export multiple tables to the same output directory by calling export_yolo() multiple times with different split names. Each split must have matching category mappings. Example:

train_table.export(output_url="./dataset", format="yolo", split="train")
val_table.export(output_url="./dataset", format="yolo", split="val")

Output directory structure:

<output_url>/
├── <dataset_name>.yaml
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   └── ...
│   └── val/
│       └── ...
└── labels/
    ├── train/
    │   ├── image1.txt
    │   └── ...
    └── val/
        └── ...
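The layout above can be expressed as a path-building sketch. yolo_paths is a hypothetical helper, and naming the YAML file after dataset_name with a .yaml suffix is an assumption based on the dataset_name parameter; the images/<split>/ and labels/<split>/ folders follow the tree shown.

```python
from pathlib import PurePosixPath


def yolo_paths(output_dir, split, image_name, dataset_name="dataset"):
    """Hypothetical helper: where files for one image land in the tree above."""
    root = PurePosixPath(output_dir)
    stem = PurePosixPath(image_name).stem
    return {
        # Assumption: the YAML file is named after dataset_name.
        "yaml": root / f"{dataset_name}.yaml",
        "image": root / "images" / split / image_name,
        # The label file shares the image's stem and gets a .txt extension.
        "label": root / "labels" / split / f"{stem}.txt",
    }


paths = yolo_paths("dataset", "val", "image1.jpg")
```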
can_export(
table: Table,
output_url: Url,
) bool

Check if the table can be exported to the YOLO format.

YOLO export requires:

  • A bounding box, segmentation, keypoints, or OBB column

  • An output URL that is a directory (no extension or empty extension)

export_yolo(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
split: str = 'train',
image_strategy: Literal['ignore', 'copy', 'move', 'symlink'] = 'ignore',
image_column_name: str | None = None,
annotation_column_name: str | None = None,
dataset_name: str = 'dataset',
**kwargs: Any,
) None

Export a table to YOLO format.

Parameters:
  • table – The table to export.

  • output_url – The directory URL to export to.

  • weight_threshold – The weight threshold for filtering rows.

  • split – The name of the split (e.g., “train”, “val”, “test”). Defaults to “train”.

  • image_strategy – How to handle images. Options are:

      - “ignore”: Do not copy or symlink images (default).

      - “copy”: Copy images to the output directory.

      - “move”: Move images to the output directory.

      - “symlink”: Create symlinks to images. Only works for local file URLs. Not supported on Windows.

  • image_column_name – The column containing image URLs. Defaults to “image”.

  • annotation_column_name – The column containing annotations. Auto-detected if not provided.

  • dataset_name – The name for the dataset YAML file. Defaults to “dataset”.

  • **kwargs – Additional arguments (ignored).

priority = 2
supported_format = yolo
list_exporter_formats() list[str]

List the format strings of all registered exporters.

Module-level wrapper that delegates to ExporterRegistry.list_exporter_formats(). Use list_exporters() for full structured metadata.

Returns:

Sorted list of registered format strings.

list_exporters() list[ExporterInfo]

List all registered exporters with structured metadata.

Module-level wrapper that delegates to ExporterRegistry.list_exporters(). Triggers entry point discovery if it has not yet run, so plugins installed via pip are reflected in the output without any extra setup.

Returns:

One ExporterInfo per registered format.

register_exporter(
cls: type[Exporter],
) type[Exporter]

Class decorator that registers an Exporter subclass.

Instantiates the class with no arguments and registers it for the format specified by supported_format. If instantiation or registration fails, the error is logged and the class is returned unregistered.

Built-in exporters (those with supported_format in _BUILTIN_FORMATS) are recorded with source="builtin"; all others get source="runtime".

Parameters:

cls – The Exporter subclass to register.

Returns:

The class, unchanged.