tlc.core.export.exporter#

The base class for all Exporters.

Module Contents#

Classes#

  • Exporter – The base class for all Exporters.

Functions#

  • register_exporter – A decorator for registering an exporter type.

  • infer_format – Infer the most suitable export format given a table and an output URL.

API#

class tlc.core.export.exporter.Exporter#

The base class for all Exporters.

Exporters write tables out to various formats, typically once a user has finished cleaning their data with 3LC. Register each subclass of Exporter with the register_exporter decorator to make it available to Table.export(). Every subclass of Exporter must implement the serialize method, which serializes a table to a string that can be written to a URL. A subclass can also override the can_export method, which determines whether the exporter can export a given table to a given URL. If can_export is not overridden, it returns False for all tables and URLs, so the exporter is only used when its format is passed explicitly as the format argument of Table.export().

Subclasses of Exporter must define the class attribute supported_format, a string naming the format that the exporter supports. Table.export() uses this string to determine which exporter to use. When the format argument is not specified, Table.export() calls can_export on every registered exporter to find the compatible ones. If several exporters are compatible, the one with the highest priority is used; priority is an optional class attribute that defaults to 0.

Variables:
  • exporters – A dict mapping formats to exporter types. This dict is populated by the register_exporter decorator.

  • priority – An integer indicating the priority of the exporter. This is used to break ties when multiple exporters are compatible with a given table and URL. The exporter with the highest priority will be used.

  • supported_format – A string indicating the format that the exporter supports. This string is used by Table.export() to determine which exporter to use.
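The registration contract described above can be modelled in a few lines of plain Python. This is a minimal sketch, not the 3LC implementation: the Exporter class below is a local stand-in that only mirrors the documented attribute and method names, and LinesExporter with its "lines" format is hypothetical.

```python
from __future__ import annotations


class Exporter:
    """Local stand-in mirroring the documented base class (not the real tlc class)."""

    exporters: dict[str, type[Exporter]] = {}
    priority: int = 0
    supported_format: str = ""

    @classmethod
    def register_exporter(cls, exporter_type: type[Exporter]) -> None:
        # Key the shared registry by the format the exporter declares.
        cls.exporters[exporter_type.supported_format] = exporter_type

    @classmethod
    def can_export(cls, table, output_url) -> bool:
        # Documented default: never volunteer during format inference.
        return False

    @classmethod
    def serialize(cls, table, output_url, weight_threshold=0.0, **kwargs) -> str:
        raise NotImplementedError


def register_exporter(exporter_type: type[Exporter]) -> type[Exporter]:
    """Class decorator: register the type, then return it unchanged."""
    Exporter.register_exporter(exporter_type)
    return exporter_type


@register_exporter
class LinesExporter(Exporter):
    """Hypothetical exporter that writes one row per line."""

    supported_format = "lines"
    priority = 1

    @classmethod
    def serialize(cls, table, output_url, weight_threshold=0.0, **kwargs) -> str:
        return "\n".join(str(row) for row in table)
```

Because LinesExporter does not override can_export, this model would only select it when the "lines" format is requested explicitly, which matches the behaviour described for the real base class.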

exporters: dict[str, type[tlc.core.export.exporter.Exporter]] = None#
priority: int = 0#
supported_format: str = None#
classmethod export(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url, format: str, weight_threshold: float, **kwargs: object) -> None#

Export a table to a URL.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

  • format – The format indicating which exporter to use

  • weight_threshold – The weight threshold to use for exporting. If the table has a weights column, rows with a weight below this threshold will be excluded from the export.

  • kwargs – Additional arguments for the serialize method of the applied subclass of Exporter. Which arguments are valid depends on the format. See the documentation for the subclasses of Exporter for more information.

classmethod register_exporter(exporter_type: type[tlc.core.export.exporter.Exporter]) -> None#

Register an exporter type by adding it to the exporters dict, with the format it supports as the key.

Parameters:

exporter_type – The exporter type to register

classmethod can_export(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url) -> bool#

Check whether the exporter can export the given table to the given output_url. Table.export() uses this method whenever the format argument is not specified. In that case can_export is called on every registered exporter, so it should be as fast as possible. can_export can be thought of as codifying the assumptions that serialize makes for a given exporter.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

Returns:

True if the exporter can export the table to the given URL, False otherwise
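A cheap compatibility check might look only at the URL suffix and the column names, without reading any rows. The sketch below is illustrative rather than taken from 3LC: it is a free function operating on a list of column names and a plain string URL instead of Table and Url objects, and REQUIRED_COLUMNS is a hypothetical requirement.

```python
from pathlib import PurePosixPath
from typing import Sequence

# Hypothetical columns that this exporter's serialize step would rely on.
REQUIRED_COLUMNS = {"image", "label"}


def can_export(column_names: Sequence[str], output_url: str) -> bool:
    """Return True only if the URL suffix and the columns both fit."""
    # Checking the suffix first keeps the common "wrong format" case cheap.
    if PurePosixPath(output_url).suffix.lower() != ".csv":
        return False
    # Mirror the assumptions of serialize: every required column is present.
    return REQUIRED_COLUMNS.issubset(column_names)
```

Both checks run in time independent of the number of rows, which is the property that matters when the check is run for every registered exporter.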

static remaining_table_rows(table: tlc.core.objects.table.Table, weight_threshold: float) -> Iterator[tlc.core.objects.table.TableRow]#

Return an iterator of the remaining rows in the table after filtering out rows with a weight below the given threshold.

Parameters:
  • table – The table to filter

  • weight_threshold – The weight threshold

Returns:

An iterator of the remaining rows in the table
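The filtering rule can be sketched as a generator over plain dictionaries. The column name "weight" and the decision to keep rows that carry no weight at all are assumptions made for illustration; the real method operates on a Table and yields TableRow objects.

```python
from typing import Any, Iterable, Iterator

WEIGHT_COLUMN = "weight"  # assumed name of the weights column


def remaining_rows(
    rows: Iterable[dict[str, Any]], weight_threshold: float
) -> Iterator[dict[str, Any]]:
    """Yield rows whose weight is not below the threshold."""
    for row in rows:
        weight = row.get(WEIGHT_COLUMN)
        # Rows without a weight are kept (assumption); rows with a weight
        # strictly below the threshold are dropped.
        if weight is None or weight >= weight_threshold:
            yield row
```

A generator keeps memory use flat for large tables, since rows are filtered one at a time rather than collected into a list first.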

abstract classmethod serialize(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url, weight_threshold: float = 0.0, **kwargs: Any) -> str#

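Serialize a table to a string that can be written to a URL.

Parameters:
  • table – The table to serialize

  • output_url – The URL that the serialized table will be written to

  • weight_threshold – The weight threshold. If the table has a weights column, rows with a weight below this threshold are excluded.

  • kwargs – Any additional arguments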

Returns:

The serialized table
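As an illustration of what a serialize implementation might do, the sketch below turns rows into a CSV string using only the standard library. It is not one of 3LC's exporters: rows are plain dictionaries, the "weight" column name is assumed, and forwarding kwargs to csv.DictWriter is a design choice made for this example.

```python
import csv
import io
from typing import Any, Iterable


def serialize_csv(
    rows: Iterable[dict[str, Any]], weight_threshold: float = 0.0, **kwargs: Any
) -> str:
    """Serialize the rows that pass the weight filter to a CSV string."""
    kept = [
        row
        for row in rows
        if row.get("weight") is None or row["weight"] >= weight_threshold
    ]
    if not kept:
        return ""
    # Caller-supplied options (for example delimiter) override the default.
    options = {"lineterminator": "\n", **kwargs}
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(kept[0]), **options)
    writer.writeheader()
    writer.writerows(kept)
    return buffer.getvalue()
```

Returning a string rather than writing to the destination directly keeps the format logic separate from the I/O, which is the split the serialize and export methods suggest.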

classmethod add_registered_exporters_to_parser(parser: argparse.ArgumentParser) -> argparse.ArgumentParser#

Add arguments to the given parser for all registered exporters.

Parameters:

parser – The parser to add arguments to

Returns:

The parser with the added arguments

tlc.core.export.exporter.register_exporter(exporter_type: type[tlc.core.export.exporter.Exporter]) -> type[tlc.core.export.exporter.Exporter]#

A decorator for registering an exporter type.

Using this decorator above the class definition of an exporter makes it available for use in Table.export().

tlc.core.export.exporter.infer_format(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url) -> str#

Infer the most suitable export format given a table and an output URL.

This function is used by Table.export() whenever the format argument is not specified.

Parameters:
  • table – The table to export

  • output_url – The URL to export to

Returns:

The inferred export format
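The selection rule described for the Exporter class (ask every registered exporter, then prefer the highest priority) could be sketched as follows. This is an assumption-laden model rather than the library's code: Candidate is a hypothetical stand-in for a registered exporter type, the candidate list is passed in explicitly, and raising ValueError when nothing is compatible is a guess, since the behaviour in that case is not documented here.

```python
from __future__ import annotations

from typing import Any, Callable, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for a registered exporter type."""

    supported_format: str
    priority: int
    can_export: Callable[[Any, str], bool]


def infer_format(candidates: list[Candidate], table: Any, output_url: str) -> str:
    """Return the format of the highest-priority compatible candidate."""
    compatible = [c for c in candidates if c.can_export(table, output_url)]
    if not compatible:
        # Assumed failure mode; the real function may behave differently.
        raise ValueError(f"no exporter can write to {output_url!r}")
    # max() returns the first maximal item, so equal priorities resolve
    # in registration order in this sketch.
    return max(compatible, key=lambda c: c.priority).supported_format
```

How 3LC resolves two compatible exporters with equal priority is not stated in this reference, so the tie-breaking shown here should not be relied on.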