tlc.integration.hugging_face.table_from_hugging_face_dataset
A Table object for representing an in-memory Hugging Face dataset.
Module Contents
Classes
| Class | Description |
|---|---|
| TableFromHuggingFaceDataset | A Table object for representing an in-memory Hugging Face dataset. |
API
class TableFromHuggingFaceDataset(
    hf_dataset: Dataset | None = None,
    url: Url | None = None,
    created: str | None = None,
    description: str | None = None,
    row_cache_url: Url | None = None,
    row_cache_populated: bool | None = None,
    override_table_rows_schema: Any = None,
    init_parameters: Any = None,
    input_tables: list[Url] | None = None,
)
Bases: tlc.integration.hugging_face.table_from_hugging_face_base._TableFromHuggingFaceBase

A Table object for representing an in-memory Hugging Face dataset.
The TableFromHuggingFaceDataset class wraps a pre-loaded datasets.Dataset instance. This is useful when the dataset has been constructed programmatically, filtered, or loaded locally.

Example:

    import datasets
    import tlc

    # Load the dataset into memory first.
    hf_dataset = datasets.load_dataset("imdb", split="train")

    # Wrap the in-memory dataset in a Table.
    table = tlc.Table.from_hugging_face_dataset(hf_dataset, table_name="imdb-train")
- Parameters:
hf_dataset – An in-memory datasets.Dataset instance.
- Returns:
An instance of the TableFromHuggingFaceDataset class.
- Parameters:
url – The URL of the table.
created – The creation time of the table.
description – The description of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Whether the row cache is populated.
override_table_rows_schema – A schema used to override the table's rows schema.
init_parameters – The initial parameters of the table.
input_tables – A list of Table URLs that are considered direct predecessors in this table’s lineage. This parameter serves as an explicit mechanism for tracking table relationships beyond the automatic lineage tracing typically managed by subclasses.