tlc.integration.hugging_face.table_from_hugging_face¶
A Table object for representing a Hugging Face dataset.
Module Contents¶
Classes¶
Class |
Description |
|---|---|
A Table object for representing a Hugging Face dataset. |
API¶
- class TableFromHuggingFace(
- hugging_face_path: str | None = None,
- hugging_face_name: str | None = None,
- hugging_face_split: str | None = None,
- url: Url | None = None,
- created: str | None = None,
- description: str | None = None,
- row_cache_url: Url | None = None,
- row_cache_populated: bool | None = None,
- override_table_rows_schema: Any = None,
- init_parameters: Any = None,
- input_tables: list[Url] | None = None,
Bases:
tlc.core.objects.tables.in_memory_rows_table._InMemoryRowsTableA Table object for representing a Hugging Face dataset.
The
TableFromHuggingFaceclass is an interface between 3LC and the Hugging Face datasets library. For datasets with multiple subsets, usehugging_face_nameto specify the subset. Usehugging_face_splitto specify the desired split.- Example:
table = TableFromHuggingFace( hugging_face_path="glue", hugging_face_name="mrpc", hugging_face_split="train", ) print(table.table_rows[0])
- Parameters:
hugging_face_path – The path to the Hugging Face dataset.
hugging_face_name – Name or configuration of the subset. Optional.
hugging_face_split – The split to use. Optional, defaults to train.
- Returns:
An instance of the
TableFromHuggingFaceclass.- Raises:
ValueErrorif the Hugging Face dataset is not provided.
- Parameters:
url – The URL of the table.
created – The creation time of the table.
description – The description of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Whether the row cache is populated.
override_table_rows_schema – The schema to override the table rows schema.
init_parameters – The initial parameters of the table.
input_tables – A list of Table URLs that are considered direct predecessors in this table’s lineage. This parameter serves as an explicit mechanism for tracking table relationships beyond the automatic lineage tracing typically managed by subclasses.
- property sample_type: SampleType¶