tlc.core.objects.tables.from_table.edited_table
A Table where edits to data or schema have been applied to the input Table.
Module Contents
Classes
| Class | Description |
|---|---|
| EditedTable | An editable table that allows sparse modifications to both data and schema. |
API
- class tlc.core.objects.tables.from_table.edited_table.EditedTable(url: tlc.core.url.Url | None = None, created: str | None = None, description: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, override_table_rows_schema: Any = None, init_parameters: Any = None, input_table_url: tlc.core.url.Url | tlc.core.objects.table.Table | None = None, edits: Mapping[str, object] | None = None, input_tables: list[tlc.core.url.Url] | None = None)
Bases:
tlc.core.objects.tables.in_memory_columns_table._InMemoryColumnsTable
An editable table that allows sparse modifications to both data and schema.
- Parameters:
url – The URL where the table should be persisted.
created – The creation timestamp for the table.
dataset_name – The name of the dataset the table belongs to.
project_name – The name of the project the table belongs to.
row_cache_url – The URL for caching rows.
row_cache_populated – Flag indicating if the row cache is populated.
override_table_rows_schema – Schema overrides for table rows. See also Table.override_table_rows_schema.
init_parameters – Parameters for initializing the table from JSON.
input_table_url – The URL of the input table, or the input Table itself.
edits – A dict containing the edits, of the form {"column_name": {"runs_and_values": [[run1, run2, ...], value]}}.
Edit Operations
The edits dict allows for sparse editing of the table’s data. Column names act as keys mapping to a struct with a runs_and_values list. Each pair of elements in this list defines a single edit operation. An example that changes three rows of the label column to the value “Dog”:
edits = { "label": {"runs_and_values": [[11, 12, 13], "Dog"]} }
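As a rough illustration of this layout, the sketch below builds an edits dict from a plain row-to-value mapping. The helper name `build_edits` is hypothetical and not part of the tlc API; it only assembles the dict shape described above.

```python
from __future__ import annotations

from typing import Any


def build_edits(column: str, new_values: dict[int, Any]) -> dict[str, dict[str, list[Any]]]:
    """Group row indices by their new value and emit a runs_and_values list.

    Hypothetical helper: it only produces the documented dict shape.
    """
    groups: list[tuple[Any, list[int]]] = []
    for row in sorted(new_values):
        value = new_values[row]
        for existing_value, rows in groups:
            # Compare type as well, so that 1 and True are kept apart.
            if type(existing_value) is type(value) and existing_value == value:
                rows.append(row)
                break
        else:
            groups.append((value, [row]))

    # Flatten to the alternating [rows, value, rows, value, ...] layout.
    runs_and_values: list[Any] = []
    for value, rows in groups:
        runs_and_values.extend([rows, value])
    return {column: {"runs_and_values": runs_and_values}}


edits = build_edits("label", {11: "Dog", 12: "Dog", 13: "Dog"})
```

The resulting dict could then be passed as the edits argument of EditedTable.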
Examples of Data Edits:
Change a single value:
{'label': {'runs_and_values': [[3], 1]}}
Change multiple rows:
{'label': {'runs_and_values': [[3, 5, 6, 8], 1]}}
Change with multiple edits:
{'label': {'runs_and_values': [[3, 5], 1, [6,7], 2]}}
Edit a contiguous range:
{'modified': {'runs_and_values': [[1, -5], True]}}
A negative index marks the end of a range that starts at the preceding index. The range includes both the start and the end index, i.e. [1, -5] is equivalent to [1, 2, 3, 4, 5].
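To make the run notation concrete, here is a minimal sketch that expands an edits dict into explicit per-row values. The functions `expand_runs` and `expand_edits` are hypothetical helpers, not part of tlc, and the sketch assumes that a negative entry always closes a range opened by the entry just before it.

```python
from __future__ import annotations

from typing import Any


def expand_runs(runs: list[int]) -> list[int]:
    """Expand a run list such as [1, -5] into explicit row indices."""
    rows: list[int] = []
    for entry in runs:
        if entry < 0:
            if not rows:
                raise ValueError("a range end must follow a start index")
            # Assumption: a negative entry is the inclusive end of a range
            # that starts at the preceding index.
            rows.extend(range(rows[-1] + 1, -entry + 1))
        else:
            rows.append(entry)
    return rows


def expand_edits(edits: dict[str, dict[str, list[Any]]]) -> dict[str, dict[int, Any]]:
    """Map each edited column to a {row_index: new_value} dict."""
    result: dict[str, dict[int, Any]] = {}
    for column, struct in edits.items():
        pairs = struct["runs_and_values"]
        if len(pairs) % 2 != 0:
            raise ValueError("runs_and_values must alternate runs and values")
        per_row: dict[int, Any] = {}
        # Even positions hold run lists, odd positions hold the values.
        for runs, value in zip(pairs[0::2], pairs[1::2]):
            for row in expand_runs(runs):
                per_row[row] = value
        result[column] = per_row
    return result


expanded = expand_edits({"label": {"runs_and_values": [[3, 5], 1, [6, 7], 2]}})
```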
Schema Edits
You can alter the table’s schema through the override_table_rows_schema property. Schema edits can be nested and may also be specified in a sparse format, but no sparser than ScalarValue granularity. Some examples:
Adding a new column to a table
override_schema = {"values": {"new_column": {"value": {"type": "int32"}}}}
table_with_new_column = EditedTable(input_table_url=table, override_table_rows_schema=override_schema)
Adding a new category to a column in a table
Given a Cat or Dog value map, the user may want to include an additional category (Frog). In this case the complete value map must be specified, since it is a subcomponent of the column’s ScalarValue.
override_schema = {"values": {"label": {"value": {
    "type": "int32",
    "map": {
        "0": {"internal_name": "Cat"},
        "1": {"internal_name": "Dog"},
        "2": {"internal_name": "Frog"},
    },
}}}}
table_with_new_category = EditedTable(input_table_url=table, override_table_rows_schema=override_schema)
Deleting a column from a table is done by setting the override to null (None in Python):
override_schema = {"values": {"My_Int": None}}
table_without_my_int = EditedTable(input_table_url=table, override_table_rows_schema=override_schema)
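The three schema examples above can be pictured as a merge of a sparse override into the input schema. The sketch below is only a mental model under stated assumptions, not the library's implementation: `apply_schema_override` is a hypothetical helper, schemas are treated as plain dicts, and nested (non-top-level) columns are ignored.

```python
from __future__ import annotations

from typing import Any


def apply_schema_override(schema: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of schema with a sparse column override applied."""
    columns = dict(schema.get("values", {}))
    for name, column_override in override.get("values", {}).items():
        if column_override is None:
            # A null override removes the column.
            columns.pop(name, None)
        else:
            # The "value" entry is replaced as a whole, which is why the
            # complete value map has to be repeated in the override.
            columns[name] = {**columns.get(name, {}), **column_override}
    return {**schema, "values": columns}


base_schema = {"values": {
    "label": {"value": {"type": "int32", "map": {
        "0": {"internal_name": "Cat"},
        "1": {"internal_name": "Dog"},
    }}},
    "My_Int": {"value": {"type": "int32"}},
}}
override = {"values": {
    "new_column": {"value": {"type": "int32"}},  # add a column
    "My_Int": None,                               # delete a column
    "label": {"value": {"type": "int32", "map": {
        "0": {"internal_name": "Cat"},
        "1": {"internal_name": "Dog"},
        "2": {"internal_name": "Frog"},           # add a category
    }}},
}}
merged = apply_schema_override(base_schema, override)
```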
Deleting a row from a table is done by setting the value of the special column SHOULD_DELETE to True:
edits = { "SHOULD_DELETE": {"runs_and_values": [[3], True]} }
table_without_row_3 = EditedTable(input_table_url=table, edits=edits)
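The effect of a SHOULD_DELETE edit can be sketched on a plain list of rows. This is an illustration only: `rows_marked_for_deletion` and `drop_deleted_rows` are hypothetical helpers, row indices are assumed to be zero-based, and in this sketch later pairs overwrite earlier ones for the same row.

```python
from __future__ import annotations

from typing import Any


def rows_marked_for_deletion(edits: dict[str, Any]) -> set[int]:
    """Collect the row indices whose SHOULD_DELETE value is set to True."""
    pairs = edits.get("SHOULD_DELETE", {}).get("runs_and_values", [])
    flags: dict[int, bool] = {}
    for runs, value in zip(pairs[0::2], pairs[1::2]):
        previous = None
        for entry in runs:
            if entry < 0 and previous is not None:
                # Negative entry: inclusive end of a range that starts
                # at the preceding index.
                for row in range(previous + 1, -entry + 1):
                    flags[row] = bool(value)
                previous = -entry
            else:
                flags[entry] = bool(value)
                previous = entry
    return {row for row, flag in flags.items() if flag}


def drop_deleted_rows(rows: list[Any], edits: dict[str, Any]) -> list[Any]:
    """Return the rows that are not marked for deletion."""
    deleted = rows_marked_for_deletion(edits)
    return [row for index, row in enumerate(rows) if index not in deleted]


rows = [{"label": i} for i in range(6)]
edits = {"SHOULD_DELETE": {"runs_and_values": [[3], True]}}
remaining = drop_deleted_rows(rows, edits)
```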
A Note on the Size of Edits
Edits are expected to be small and are ideal for human-interactive input. For large edits, consider using external data sources like Parquet files or other procedural tables.
Creates an EditedTable from an input Table and a struct of edits.
- Parameters:
input_table_url – Url to the input table.
edits – Struct representing the edits to apply to the input table.
url – Optional Url where the EditedTable can later be accessed.
- get_input_table() → pyarrow.Table