tlc.integration.hugging_face.table_from_hugging_face
A Table object for representing a Hugging Face dataset.
Module Contents
Classes
Class | Description |
---|---|
TableFromHuggingFace | A Table object for representing a Hugging Face dataset. |
API
- class tlc.integration.hugging_face.table_from_hugging_face.TableFromHuggingFace(hugging_face_path: str | None = None, hugging_face_name: str | None = None, hugging_face_split: str | None = None, url: tlc.core.url.Url | None = None, created: str | None = None, description: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, override_table_rows_schema: Any = None, init_parameters: Any = None, input_tables: list[tlc.core.url.Url] | None = None)
Bases: tlc.core.objects.tables.in_memory_rows_table._InMemoryRowsTable
A Table object for representing a Hugging Face dataset.
The TableFromHuggingFace class is an interface between 3LC and the Hugging Face datasets library. For datasets with multiple subsets, use hugging_face_name to specify the subset. Use hugging_face_split to specify the desired split.
- Example:
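    from tlc.integration.hugging_face.table_from_hugging_face import TableFromHuggingFace

    # Load the "train" split of the "mrpc" subset of the "glue" dataset.
    table = TableFromHuggingFace(
        hugging_face_path="glue",
        hugging_face_name="mrpc",
        hugging_face_split="train",
    )
    # Print the first row of the table.
    print(table.table_rows[0])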
- Parameters:
hugging_face_path – The path to the Hugging Face dataset.
hugging_face_name – Name or configuration of the subset. Optional.
hugging_face_split – The split to use. Optional, defaults to train.
- Returns:
An instance of the TableFromHuggingFace class.
- Raises:
ValueError if the Hugging Face dataset is not provided.
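The argument rules described above (a required dataset path, an optional subset name, and a split that defaults to train) can be sketched as a small pre-check. The helper build_hf_table_kwargs is hypothetical and not part of 3LC. It mirrors the documented behaviour in plain Python and does not reflect how the library validates its arguments internally.

```python
from typing import Optional


def build_hf_table_kwargs(
    hugging_face_path: Optional[str],
    hugging_face_name: Optional[str] = None,
    hugging_face_split: Optional[str] = None,
) -> dict:
    """Hypothetical helper (not a 3LC API): normalize the dataset arguments."""
    if not hugging_face_path:
        # Mirrors the documented ValueError for a missing dataset.
        raise ValueError("hugging_face_path must be provided")
    kwargs = {
        "hugging_face_path": hugging_face_path,
        # The documentation states that the split defaults to train.
        "hugging_face_split": hugging_face_split or "train",
    }
    if hugging_face_name is not None:
        # Only pass a subset name when the caller supplied one.
        kwargs["hugging_face_name"] = hugging_face_name
    return kwargs
```

Assuming 3LC is installed, the resulting dictionary could then be unpacked into the constructor, for example TableFromHuggingFace(**build_hf_table_kwargs("glue", "mrpc")).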
- Parameters:
url – The URL of the table.
created – The creation time of the table.
description – The description of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Whether the row cache is populated.
override_table_rows_schema – The schema to override the table rows schema.
init_parameters – The initial parameters of the table.
input_tables – A list of Table URLs that are considered direct predecessors in this table’s lineage. This parameter serves as an explicit mechanism for tracking table relationships beyond the automatic lineage tracing typically managed by subclasses.
- property hf_dataset: datasets.Dataset
- property sample_type: tlc.client.sample_type.SampleType
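The input_tables parameter records direct predecessors only, so recovering a table's full lineage means following those links transitively. The following is a minimal sketch of that idea. It uses plain strings and a dictionary rather than tlc.core.url.Url objects or any 3LC API, and the function name and URLs are hypothetical.

```python
from typing import Dict, List


def all_predecessors(table_url: str, input_tables: Dict[str, List[str]]) -> List[str]:
    """Return every ancestor of table_url, nearest first, without duplicates."""
    seen: List[str] = []
    # Start from the direct predecessors and walk outwards breadth-first.
    queue = list(input_tables.get(table_url, []))
    while queue:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.append(url)
        queue.extend(input_tables.get(url, []))
    return seen


# Hypothetical URLs, for illustration only.
lineage = {
    "tables/filtered": ["tables/original"],
    "tables/merged": ["tables/filtered", "tables/extra"],
}
ancestors = all_predecessors("tables/merged", lineage)
```

Here ancestors lists the two direct predecessors of tables/merged first, followed by tables/original, which is reachable only through tables/filtered.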