tlc.core.objects.tables.from_table.edited_table

A Table where edits to data or schema have been applied to the input Table.

Module Contents

Classes

Class          Description
EditedTable    An editable table that allows sparse modifications to both data and schema.

API

class tlc.core.objects.tables.from_table.edited_table.EditedTable(*, url: tlc.core.url.Url | None = None, created: str | None = None, description: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, override_table_rows_schema: Any = None, init_parameters: Any = None, input_table_url: tlc.core.url.Url | tlc.core.objects.table.Table | None = None, edits: collections.abc.Mapping[str, object] | None = None, input_tables: list[tlc.core.url.Url] | None = None)

Bases: tlc.core.objects.tables.in_memory_columns_table._InMemoryColumnsTable

An editable table that allows sparse modifications to both data and schema.

Parameters:
  • url – The URL where the table should be persisted.

  • created – The creation timestamp for the table.

  • dataset_name – The name of the dataset the table belongs to.

  • project_name – The name of the project the table belongs to.

  • row_cache_url – The URL for caching rows.

  • row_cache_populated – Flag indicating if the row cache is populated.

  • override_table_rows_schema – Schema overrides for table rows. See also Table.override_table_rows_schema.

  • init_parameters – Parameters for initializing the table from JSON.

  • input_table_url – The URL of, or a reference to, the input table that the edits are applied to.

  • edits – A dict containing the edits, of the form {"column_name": {"runs_and_values": [[run1, run2, ...], value]}}.

Edit Operations

The edits dict allows for sparse editing of the table’s data. Each key is a column name that maps to a struct containing a runs_and_values list. This list is flat: each consecutive pair of elements (a list of row indices followed by a value) defines a single edit operation. An example that changes three rows of the label column to the value “Dog”:

edits = {
    "label": {"runs_and_values": [[11, 12, 13], "Dog"]}
}
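
As an illustration of the structure described above, the following sketch builds an edits dict from a plain row-to-value mapping. The helper name build_edits is hypothetical and not part of the tlc API; it only assumes the flat [rows, value, rows, value, ...] layout shown in this section.

```python
from __future__ import annotations

from typing import Any


def build_edits(column: str, changes: dict[int, Any]) -> dict[str, dict[str, list[Any]]]:
    """Group per-row changes by value into a flat runs_and_values list.

    Hypothetical helper, not part of tlc: it only assembles the dict layout
    documented for the `edits` parameter.
    """
    groups: list[tuple[Any, list[int]]] = []
    for row in sorted(changes):
        value = changes[row]
        for existing_value, rows in groups:
            # Compare types as well, so that 1 and True are kept apart.
            if type(existing_value) is type(value) and existing_value == value:
                rows.append(row)
                break
        else:
            groups.append((value, [row]))

    runs_and_values: list[Any] = []
    for value, rows in groups:
        # Each edit operation contributes two elements: the rows, then the value.
        runs_and_values.extend([rows, value])
    return {column: {"runs_and_values": runs_and_values}}


edits = build_edits("label", {11: "Dog", 12: "Dog", 13: "Dog"})
```

The resulting dict can then be passed as the edits argument when constructing an EditedTable.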

Examples of Data Edits:

  • Change a single value: {'label': {'runs_and_values': [[3], 1]}}

  • Change multiple rows: {'label': {'runs_and_values': [[3, 5, 6, 8], 1]}}

  • Change with multiple edits: {'label': {'runs_and_values': [[3, 5], 1, [6,7], 2]}}

  • Edit a contiguous range: {'modified': {'runs_and_values': [[1, -5], True]}}

    • A negative index marks the end of a range that starts at the preceding index. The range includes both the start index and the end index, so [1, -5] is equivalent to [1, 2, 3, 4, 5].
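
To make the runs_and_values semantics concrete, here is a minimal sketch that expands runs (including negative range ends) and applies the edits to plain Python lists. It is a mental model only, not how EditedTable is implemented; expand_runs and apply_edits are hypothetical names, and the handling of a negative entry after several explicit indices is an assumption extrapolated from the [1, -5] example.

```python
from __future__ import annotations

from typing import Any


def expand_runs(runs: list[int]) -> list[int]:
    """Expand a runs list into explicit row indices.

    Assumption: a negative entry -n closes an inclusive range that starts at
    the preceding entry, so [1, -5] becomes [1, 2, 3, 4, 5].
    """
    rows: list[int] = []
    for entry in runs:
        if entry >= 0:
            rows.append(entry)
        else:
            if not rows:
                raise ValueError("a range end needs a preceding start index")
            # The start index is already in rows; add the rest up to -entry.
            rows.extend(range(rows[-1] + 1, -entry + 1))
    return rows


def apply_edits(
    columns: dict[str, list[Any]],
    edits: dict[str, dict[str, list[Any]]],
) -> dict[str, list[Any]]:
    """Return a copy of columns with the sparse edits applied."""
    result = {name: list(values) for name, values in columns.items()}
    for name, spec in edits.items():
        flat = spec["runs_and_values"]
        # Consecutive (runs, value) pairs each describe one edit operation.
        for runs, value in zip(flat[0::2], flat[1::2]):
            for row in expand_runs(runs):
                result[name][row] = value
    return result


columns = {"label": [0] * 10}
edited = apply_edits(columns, {"label": {"runs_and_values": [[3, 5], 1, [6, 7], 2]}})
```

The input columns are left untouched, which mirrors the idea that an EditedTable layers edits on top of its input table rather than modifying it.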

Schema Edits

You can alter the table’s schema through the override_table_rows_schema property. Schema edits can be nested and may be specified in a sparse format, but no sparser than ScalarValue granularity: a ScalarValue that is overridden must be specified in full.

Some examples:

  • Adding a new column to a table

    override_schema = {"values": {"new_column": {"value": {"type": "int32"}}}}
    table_with_new_column = EditedTable(input_table_url=table,
                                        override_table_rows_schema=override_schema)
    
  • Adding a new category to a column in a table

    Given a Cat or Dog value map, the user may want to include an additional category (Frog). In this case the complete value map must be specified, since it is a subcomponent of the column’s ScalarValue.

    override_schema = {"values": {"label": {"value": {
        "type": "int32",
        "map": {
            "0": {"internal_name": "Cat"},
            "1": {"internal_name": "Dog"},
            "2": {"internal_name": "Frog"},
        },
    }}}}
    table_with_new_category = EditedTable(input_table_url=table,
                                          override_table_rows_schema=override_schema)

  • Deleting a column from a table is done by setting the override to None (null in JSON):

    override_schema = {"values": {"My_Int": None}}
    table_without_my_int = EditedTable(input_table_url=table,
                                       override_table_rows_schema=override_schema)
    
  • Deleting a Row from a Table is done by editing the value of the special column SHOULD_DELETE to True:

    edits = {
        "SHOULD_DELETE": {"runs_and_values": [[3], True]}
    }
    table_without_row_3 = EditedTable(input_table_url=table, edits=edits)
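
The schema examples above can be pictured as a sparse overlay on the input table's schema. The sketch below is one way to model that behaviour on plain dicts; merge_schema is a hypothetical function, not the library's implementation, and it assumes that None removes an entry, that a "value" entry (the ScalarValue) is replaced wholesale, and that all other dicts merge recursively.

```python
from __future__ import annotations

from typing import Any


def merge_schema(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Overlay a sparse override on a base schema dict (illustrative only)."""
    merged = dict(base)
    for key, new in override.items():
        if new is None:
            # A None override deletes the entry, e.g. a whole column.
            merged.pop(key, None)
        elif key == "value" or not isinstance(new, dict) or not isinstance(merged.get(key), dict):
            # ScalarValue granularity: "value" is never merged field by field.
            merged[key] = new
        else:
            merged[key] = merge_schema(merged[key], new)
    return merged


base = {"values": {
    "label": {"value": {"type": "int32", "map": {
        "0": {"internal_name": "Cat"},
        "1": {"internal_name": "Dog"},
    }}},
    "My_Int": {"value": {"type": "int32"}},
}}
override = {"values": {
    "label": {"value": {"type": "int32", "map": {
        "0": {"internal_name": "Cat"},
        "1": {"internal_name": "Dog"},
        "2": {"internal_name": "Frog"},
    }}},
    "My_Int": None,
    "new_column": {"value": {"type": "int32"}},
}}
merged = merge_schema(base, override)
```

In this model the override adds new_column, drops My_Int, and swaps in the full three-entry value map for label, while base is left unchanged.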
    

A Note on the Size of Edits

Edits are expected to be small and are ideal for human-interactive input. For large edits, consider using external data sources like Parquet files or other procedural tables.

Creates an EditedTable from an input Table and a struct of edits.

Parameters:
  • input_table_url – Url to the input table.

  • edits – Struct representing the edits to apply to the input table.

  • url – Optional Url where the EditedTable can later be accessed.

get_input_table() → pyarrow.Table