tlc.integration.hugging_face.table_from_hugging_face
A Table object for representing a Hugging Face dataset.
Module Contents
Classes
Class | Description |
---|---|
TableFromHuggingFace | A Table object for representing a Hugging Face dataset. |
API
- class tlc.integration.hugging_face.table_from_hugging_face.TableFromHuggingFace(hugging_face_path: str | None = None, hugging_face_name: str | None = None, hugging_face_split: str | None = None, url: tlc.core.url.Url | None = None, created: str | None = None, description: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, override_table_rows_schema: Any = None, init_parameters: Any = None)
Bases: tlc.core.objects.tables.in_memory_rows_table._InMemoryRowsTable
A Table object for representing a Hugging Face dataset.
The TableFromHuggingFace class is an interface between 3LC and the Hugging Face datasets library. For datasets with multiple subsets, use hugging_face_name to specify the subset. Use hugging_face_split to specify the desired split.
- Example:
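    from tlc.integration.hugging_face.table_from_hugging_face import TableFromHuggingFace

    # Load the "train" split of the "mrpc" subset of the "glue" dataset.
    table = TableFromHuggingFace(
        hugging_face_path="glue",
        hugging_face_name="mrpc",
        hugging_face_split="train",
    )
    # Print the first row of the table.
    print(table.table_rows[0])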
- Parameters:
hugging_face_path – The path to the Hugging Face dataset.
hugging_face_name – Name or configuration of the subset. Optional.
hugging_face_split – The split to use. Optional, defaults to train.
- Returns:
An instance of the TableFromHuggingFace class.
- Raises:
ValueError – If the Hugging Face dataset is not provided.
- Parameters:
url – The URL of the table.
created – The creation time of the table.
description – The description of the table.
row_cache_url – The URL of the row cache.
row_cache_populated – Whether the row cache is populated.
override_table_rows_schema – A schema that overrides the table rows schema.
init_parameters – The initial parameters of the table.
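The Hugging Face arguments documented above can be assembled and checked before the table is created. The following is a minimal sketch, not part of the 3LC API: the helper names hugging_face_table_kwargs and load_hugging_face_table are hypothetical, and the early check only mirrors the documented ValueError for a missing dataset.

```python
from typing import Any, Dict, Optional


def hugging_face_table_kwargs(
    path: Optional[str],
    name: Optional[str] = None,
    split: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for TableFromHuggingFace (hypothetical helper)."""
    if not path:
        # Fail early, mirroring the documented ValueError for a missing dataset.
        raise ValueError("hugging_face_path must be provided")
    kwargs: Dict[str, Any] = {"hugging_face_path": path}
    # Forward optional arguments only when set, so the class defaults apply.
    if name is not None:
        kwargs["hugging_face_name"] = name
    if split is not None:
        kwargs["hugging_face_split"] = split
    return kwargs


def load_hugging_face_table(path, name=None, split=None):
    """Create the table; requires the 3LC package to be installed."""
    # Imported lazily so the helper above can be used without 3LC installed.
    from tlc.integration.hugging_face.table_from_hugging_face import (
        TableFromHuggingFace,
    )

    return TableFromHuggingFace(**hugging_face_table_kwargs(path, name, split))
```

Calling load_hugging_face_table("glue", "mrpc", "train") would pass the same three arguments as the documented example, while omitting name or split leaves those arguments at their defaults.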
- property hf_dataset: datasets.Dataset
- property sample_type: tlc.client.sample_type.SampleType
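Because hf_dataset is documented as a datasets.Dataset, it can be inspected with the usual Dataset operations. The sketch below is illustrative only: summarize_hf_dataset is a hypothetical helper, and it relies solely on len() and the column_names attribute of datasets.Dataset.

```python
from typing import Any, Dict


def summarize_hf_dataset(table: Any) -> Dict[str, Any]:
    """Return the row count and column names of the dataset behind a table."""
    # hf_dataset is documented as returning a datasets.Dataset.
    dataset = table.hf_dataset
    return {
        "num_rows": len(dataset),
        "columns": list(dataset.column_names),
    }
```

Given a TableFromHuggingFace instance named table, summarize_hf_dataset(table) returns a dictionary with the number of rows and the column names of the underlying dataset.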