tlc.core.export.exporter
The base class for all Exporters.
Module Contents
Classes
Class | Description |
---|---|
Exporter | The base class for all Exporters. |
Functions
Function | Description |
---|---|
register_exporter | A decorator for registering an exporter type. |
infer_format | Infer the most suitable export format given a table and an output url. |
API
- class tlc.core.export.exporter.Exporter
The base class for all Exporters.
Exporters are used to export tables to various formats, typically after a user is done cleaning their data with 3LC. Subclasses of Exporter should be registered using the register_exporter decorator, which makes them available for use in Table.export(). Subclasses of Exporter must implement the serialize method, which serializes a table to a string that can be written to a URL. Subclasses can also override the can_export method, which determines whether the exporter can export a given table to a given URL. If can_export is not overridden, it returns False for all tables and URLs, so the exporter is only used when the format argument is specified in Table.export().
Subclasses of Exporter must define the class attribute supported_format, a string indicating the format that the exporter supports. Table.export() uses this string to determine which exporter to use. Whenever the format argument is not specified, Table.export() calls can_export on all registered exporters to find compatible ones. If multiple exporters are compatible, the one with the highest priority is used; priority is an optional class attribute that defaults to 0.
- Variables:
exporters – A dict mapping formats to exporter types. This dict is populated by the register_exporter decorator.
priority – An integer indicating the priority of the exporter. This is used to break ties when multiple exporters are compatible with a given table and URL. The exporter with the highest priority will be used.
supported_format – A string indicating the format that the exporter supports. This string is used by Table.export() to determine which exporter to use.
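The registration pattern described above can be sketched without 3LC installed. Everything in this block is a hypothetical stand-in (SketchExporter, register_sketch_exporter, CsvSketchExporter, and plain lists of dicts in place of Table and Url); it only mirrors the documented roles of supported_format, priority, can_export, and serialize, and is not the library's implementation.

```python
from typing import Any, Dict, List, Type


class SketchExporter:
    """Hypothetical stand-in for the Exporter base class (not tlc code)."""

    # Registry mapping a format string to the exporter type that handles it.
    exporters: Dict[str, Type["SketchExporter"]] = {}
    supported_format: str = ""
    priority: int = 0

    @classmethod
    def can_export(cls, table: List[dict], output_url: str) -> bool:
        # Default mirrors the documented behavior: never picked automatically,
        # only usable when the format is given explicitly.
        return False

    @classmethod
    def serialize(cls, table: List[dict], output_url: str, **kwargs: Any) -> str:
        raise NotImplementedError


def register_sketch_exporter(
    exporter_type: Type[SketchExporter],
) -> Type[SketchExporter]:
    """Decorator that files the class under its supported_format."""
    SketchExporter.exporters[exporter_type.supported_format] = exporter_type
    return exporter_type


@register_sketch_exporter
class CsvSketchExporter(SketchExporter):
    supported_format = "csv"
    priority = 1

    @classmethod
    def can_export(cls, table: List[dict], output_url: str) -> bool:
        # Cheap check, since it may run for every registered exporter.
        return output_url.endswith(".csv")

    @classmethod
    def serialize(cls, table: List[dict], output_url: str, **kwargs: Any) -> str:
        if not table:
            return ""
        header = list(table[0].keys())
        lines = [",".join(header)]
        lines += [",".join(str(row[key]) for key in header) for row in table]
        return "\n".join(lines)


rows = [{"id": 1, "label": "cat"}, {"id": 2, "label": "dog"}]
exporter = SketchExporter.exporters["csv"]  # lookup by explicit format
csv_text = exporter.serialize(rows, "out.csv")
```

With the real library, the equivalent subclass would derive from Exporter and be decorated with register_exporter; the exact arguments accepted by serialize depend on the exporter.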
- classmethod export(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url, format: str, weight_threshold: float, **kwargs: object) -> None
Export a table to a URL.
- Parameters:
table – The table to export
output_url – The URL to export to
format – The format indicating which exporter to use
weight_threshold – The weight threshold to use for exporting. If the table has a weights column, rows with a weight below this threshold will be excluded from the export.
kwargs – Additional arguments for the serialize method of the applied subclass of Exporter. Which arguments are valid depends on the format. See the documentation for the subclasses of Exporter for more information.
- classmethod register_exporter(exporter_type: type[tlc.core.export.exporter.Exporter]) -> None
Register an exporter type by adding it to the exporters dict, with the format it supports as the key.
- Parameters:
exporter_type – The exporter type to register
- classmethod can_export(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url) -> bool
Check if the exporter can export the given table to the given output_url. This method is used by Table.export() whenever the format argument is not specified. In these cases, it will be called for all registered exporters, so it should be as fast as possible. can_export can be thought of as codifying the assumptions of serialize for any given exporter.
- Parameters:
table – The table to export
output_url – The URL to export to
- Returns:
True if the exporter can export the table to the given URL, False otherwise
- static remaining_table_rows(table: tlc.core.objects.table.Table, weight_threshold: float) -> Iterator[tlc.core.objects.table.TableRow]
Return an iterator of the remaining rows in the table after filtering out rows with a weight below the given threshold.
- Parameters:
table – The table to filter
weight_threshold – The weight threshold
- Returns:
An iterator of the remaining rows in the table
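As a rough illustration of the filtering described here, the sketch below drops rows whose weight is below the threshold. The row representation (plain dicts), the "weight" key, and the function name remaining_rows_sketch are assumptions made for the example; rows without a weight are kept, following the note that the threshold only applies when the table has a weights column.

```python
from typing import Any, Dict, Iterator, List


def remaining_rows_sketch(
    rows: List[Dict[str, Any]],
    weight_threshold: float,
    weight_key: str = "weight",  # hypothetical column name
) -> Iterator[Dict[str, Any]]:
    """Yield rows whose weight is not below the threshold."""
    for row in rows:
        weight = row.get(weight_key)
        # Assumption: rows with no weight value are never filtered out.
        if weight is None or weight >= weight_threshold:
            yield row


rows = [
    {"id": 1, "weight": 1.0},
    {"id": 2, "weight": 0.2},
    {"id": 3, "weight": 0.5},
    {"id": 4},
]
kept_ids = [row["id"] for row in remaining_rows_sketch(rows, 0.5)]
```

A row whose weight equals the threshold is kept, since only weights strictly below it are excluded.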
- abstract classmethod serialize(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url, weight_threshold: float = 0.0, **kwargs: Any) -> str
Serialize a table to a string which can be written to a Url.
- Parameters:
table – The table to serialize
kwargs – Any additional arguments
- Returns:
The serialized table
- classmethod add_registered_exporters_to_parser(parser: argparse.ArgumentParser) -> argparse.ArgumentParser
Add arguments to the given parser for all registered exporters.
- Parameters:
parser – The parser to add arguments to
- Returns:
The parser with the added arguments
- tlc.core.export.exporter.register_exporter(exporter_type: type[tlc.core.export.exporter.Exporter]) -> type[tlc.core.export.exporter.Exporter]
A decorator for registering an exporter type.
Using this decorator above the class definition of an exporter makes it available for use in Table.export().
- tlc.core.export.exporter.infer_format(table: tlc.core.objects.table.Table, output_url: tlc.core.url.Url) -> str
Infer the most suitable export format given a table and an output url.
This function is used by Table.export() whenever the format argument is not specified.
- Parameters:
table – The table to export
output_url – The URL to export to
- Returns:
The inferred export format
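The selection rule stated for Exporter (ask every registered exporter whether it can handle the table and URL, then prefer the highest priority) might look roughly like the sketch below. Candidate, infer_format_sketch, the format names, and the dict-of-columns table are all hypothetical; raising ValueError when nothing is compatible is an assumption, as is the tie-breaking order among equal priorities.

```python
from typing import Callable, Dict, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical stand-in for a registered exporter."""

    format: str
    priority: int
    can_export: Callable[[Dict[str, list], str], bool]


def infer_format_sketch(
    candidates: List[Candidate], table: Dict[str, list], output_url: str
) -> str:
    # Keep only the candidates that report they can handle this table and URL.
    compatible = [c for c in candidates if c.can_export(table, output_url)]
    if not compatible:
        # Assumption: the real function's behavior in this case is not documented here.
        raise ValueError(f"No compatible exporter for {output_url!r}")
    # Highest priority wins; max() returns the first one among equal priorities.
    return max(compatible, key=lambda c: c.priority).format


candidates = [
    Candidate("json", 0, lambda t, u: u.endswith(".json")),
    Candidate("coco", 1, lambda t, u: u.endswith(".json") and "bbs" in t),
    Candidate("csv", 0, lambda t, u: u.endswith(".csv")),
]

with_boxes = infer_format_sketch(candidates, {"image": [], "bbs": []}, "out.json")
without_boxes = infer_format_sketch(candidates, {"image": []}, "out.json")
```

Here both "json" and "coco" accept the first table, and the higher priority selects "coco"; for the second table only "json" is compatible.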