tlc.core.export.exporter

The base class for all Exporters.

Module Contents

Classes

  • Exporter – The base class for all Exporters.

  • SerializingExporter – Base class for exporters that serialize a table to a single string/file.

Functions

  • infer_format – Infer the most suitable export format given a table and an output url.

  • register_exporter – A decorator for registering an exporter type.

API

class Exporter

The base class for all Exporters.

Exporters are used to export tables to various formats, typically after a user is done cleaning their data with 3LC. Subclasses of Exporter should be registered using the register_exporter() decorator, which makes them available for use in Table.export().

There are two main patterns for implementing exporters:

  1. Serializing exporters (single-file output): Subclass SerializingExporter and implement the serialize() method, which returns a string to be written to the output URL.

  2. Free-form exporters (e.g., directory output): Subclass Exporter directly and implement _do_export() and _get_export_impl_method().

Subclasses can also override can_export(), which determines whether the exporter can export a given table to a given URL. The default implementation returns False for all tables and URLs, so an exporter that does not override it is never selected automatically and is used only when the format argument is passed explicitly to Table.export().

Subclasses of Exporter must define the class attribute supported_format, a string naming the format that the exporter supports. Table.export() uses this string to determine which exporter to use. When the format argument is not specified, Table.export() calls can_export() on every registered exporter to find the compatible ones. If several exporters are compatible, the one with the highest priority is used; priority is an optional class attribute that defaults to 0.
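The selection rule described above can be modelled in a few lines. The sketch below does not import tlc: ToyExporter, its subclasses, REGISTRY, and pick_format are hypothetical stand-ins that reproduce only the documented behaviour (can_export() defaults to False, and the highest priority wins among compatible exporters). What 3LC does when no exporter is compatible is not stated here, so the sketch simply raises an error in that case.

```python
from __future__ import annotations


class ToyExporter:
    """Hypothetical stand-in for Exporter; not the tlc implementation."""

    supported_format: str = ""
    priority: int = 0  # optional, defaults to 0 as documented

    @classmethod
    def can_export(cls, table: object, output_url: str) -> bool:
        # Documented default: never volunteer for automatic selection.
        return False


class ToyCsvExporter(ToyExporter):
    supported_format = "csv"

    @classmethod
    def can_export(cls, table: object, output_url: str) -> bool:
        # Keep this cheap: it is called for every registered exporter.
        return output_url.lower().endswith(".csv")


class ToyFancyCsvExporter(ToyCsvExporter):
    supported_format = "fancy-csv"
    priority = 10  # outranks ToyCsvExporter when both are compatible


class ToyExplicitOnlyExporter(ToyExporter):
    # Does not override can_export(), so it is reachable only by format name.
    supported_format = "explicit-only"


# Assumed shape of the registry: format string -> exporter type.
REGISTRY = {
    cls.supported_format: cls
    for cls in (ToyCsvExporter, ToyFancyCsvExporter, ToyExplicitOnlyExporter)
}


def pick_format(table: object, output_url: str) -> str:
    """Return the format of the highest-priority compatible exporter."""
    compatible = [e for e in REGISTRY.values() if e.can_export(table, output_url)]
    if not compatible:
        # Assumption: the real behaviour in this case is not documented here.
        raise ValueError(f"no compatible exporter for {output_url!r}")
    return max(compatible, key=lambda e: e.priority).supported_format
```

With these toy classes, pick_format(None, "out.csv") selects "fancy-csv" because both CSV exporters are compatible and the higher priority wins, while "explicit-only" can be reached only by looking it up in REGISTRY by name.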

Variables:
  • exporters – A dict mapping formats to exporter types. This dict is populated by the register_exporter() decorator.

  • priority – An integer indicating the priority of the exporter. This is used to break ties when multiple exporters are compatible with a given table and URL. The exporter with the highest priority will be used.

  • supported_format – A string indicating the format that the exporter supports. This string is used by Table.export() to determine which exporter to use.

classmethod add_registered_exporters_to_parser(
parser: ArgumentParser,
) → ArgumentParser

Add arguments to the given parser for all registered exporters.

Parameters:

parser – The parser to add arguments to

Returns:

The parser with the added arguments

classmethod can_export(
table: Table,
output_url: Url,
) → bool

Check if the exporter can export the given table to the given output_url. This method is used by Table.export() whenever the format argument is not specified. In these cases, it will be called for all registered exporters, so it should be as fast as possible. can_export can be thought of as codifying the assumptions of the export implementation for any given exporter.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

Returns:

True if the exporter can export the table to the given URL, False otherwise

classmethod export(
table: Table,
output_url: Url,
format: str,
weight_threshold: float,
**kwargs: object,
) → None

Export a table to a URL.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

  • format – The format indicating which exporter to use

  • weight_threshold – The weight threshold to use for exporting. If the table has a weights column, rows with a weight below this threshold will be excluded from the export.

  • **kwargs – Additional arguments for the export implementation method of the applied subclass of Exporter. Which arguments are valid depends on the format. See the documentation for the subclasses of Exporter for more information.

exporters: ClassVar[dict[str, type[Exporter]]] = None

priority: int = 0

classmethod register_exporter(
exporter_type: type[Exporter],
) → None

Register an exporter type by adding it to the exporters dict, with the format it supports as the key.

Parameters:

exporter_type – The exporter type to register

static remaining_table_rows(
table: Table,
weight_threshold: float,
) → Iterator[tlc.core.objects.table.TableRow]

Return an iterator of the remaining rows in the table after filtering out rows with a weight below the given threshold.

Parameters:
  • table – The table to filter

  • weight_threshold – The weight threshold

Returns:

An iterator of the remaining rows in the table
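A minimal sketch of the filtering described above, assuming rows are plain dicts and the weight is stored under a key named "weight". Both are assumptions for illustration: the real method takes a Table and yields TableRow objects. Rows without a weight are kept here, which matches the "if the table has a weights column" wording in export() but is likewise an assumption.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
from typing import Any

WEIGHT_KEY = "weight"  # assumed column name, for illustration only


def remaining_rows(
    rows: Iterable[dict[str, Any]],
    weight_threshold: float,
) -> Iterator[dict[str, Any]]:
    """Yield the rows whose weight is not below the threshold."""
    for row in rows:
        weight = row.get(WEIGHT_KEY)
        # Rows with no weight are kept (assumption); otherwise only rows
        # strictly below the threshold are dropped.
        if weight is None or weight >= weight_threshold:
            yield row


rows = [
    {"id": 0, "weight": 1.0},
    {"id": 1, "weight": 0.0},
    {"id": 2, "weight": 0.5},
    {"id": 3},  # no weight recorded
]
kept_ids = [row["id"] for row in remaining_rows(rows, weight_threshold=0.5)]
```

Returning a generator rather than a list keeps the filter lazy, so an exporter can stream rows without holding the filtered table in memory.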

supported_format: str = None

class SerializingExporter

Bases: tlc.core.export.exporter.Exporter

Base class for exporters that serialize a table to a single string/file.

Subclasses must implement the serialize() method, which returns a string representation of the table. The string is then written to the output URL by _do_export().

This is the most common exporter pattern, suitable for formats like JSON, CSV, and COCO.

abstract classmethod serialize(
table: Table,
output_url: Url,
weight_threshold: float = 0.0,
**kwargs: Any,
) → str

Serialize a table to a string which can be written to a URL.

Parameters:
  • table – The table to serialize

  • output_url – The URL to export to (available for path resolution)

  • weight_threshold – The weight threshold for filtering rows

  • **kwargs – Additional format-specific arguments

Returns:

The serialized table as a string
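To illustrate the contract, here is a sketch of a serialize()-style function that turns rows into CSV text. It is self-contained and does not use tlc: serialize_csv is a hypothetical name, rows are assumed to be plain dicts sharing the same keys, and the weight filter assumes a "weight" key. The function only returns a string; writing that string to the output URL is a separate step, which the documentation above attributes to _do_export().

```python
from __future__ import annotations

import csv
import io
from typing import Any


def serialize_csv(rows: list[dict[str, Any]], weight_threshold: float = 0.0) -> str:
    """Return the rows at or above the weight threshold as CSV text."""
    # Rows without a "weight" key are treated as weight 1.0 (assumption).
    kept = [r for r in rows if r.get("weight", 1.0) >= weight_threshold]
    if not kept:
        return ""
    buffer = io.StringIO()
    # Column order follows the keys of the first kept row.
    writer = csv.DictWriter(buffer, fieldnames=list(kept[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(kept)
    return buffer.getvalue()


text = serialize_csv(
    [{"label": "cat", "weight": 1.0}, {"label": "dog", "weight": 0.2}],
    weight_threshold=0.5,
)
```

Here the "dog" row falls below the threshold, so text contains the header line followed by the "cat" row only.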

infer_format(
table: Table,
output_url: Url,
) → str

Infer the most suitable export format given a table and an output url.

This function is used by Table.export() whenever the format argument is not specified.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

Returns:

The inferred export format, as a format string

register_exporter(
exporter_type: type[Exporter],
) → type[Exporter]

A decorator for registering an exporter type.

Using this decorator above the class definition of an exporter makes it available for use in Table.export().
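The decorator's signature, which takes a type and returns a type, suggests the usual class-decorator pattern: record the class, then hand it back unchanged. The sketch below models that pattern with hypothetical names (toy_register_exporter, TOY_EXPORTERS, ToyJsonExporter). It is not the tlc source, and the check for a missing supported_format is an added assumption.

```python
from __future__ import annotations

from typing import TypeVar

T = TypeVar("T", bound=type)

# Assumed registry shape: format string -> exporter type.
TOY_EXPORTERS: dict[str, type] = {}


def toy_register_exporter(exporter_type: T) -> T:
    """Register the class under its supported_format and return it unchanged."""
    fmt = getattr(exporter_type, "supported_format", None)
    if not fmt:
        # Assumption: reject classes that forgot to declare a format.
        raise ValueError(f"{exporter_type.__name__} must define supported_format")
    TOY_EXPORTERS[fmt] = exporter_type
    return exporter_type


@toy_register_exporter
class ToyJsonExporter:
    supported_format = "json"
```

Because the decorator returns the class it was given, the decorated name still refers to the original class and can be subclassed or instantiated as usual.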