tlc.core.objects.tables.from_table.edited_table#

A Table where edits to data or schema have been applied to the input Table.

Module Contents#

Classes#

Class          Description
EditedTable    An editable table that allows sparse modifications to both data and schema.

API#

class tlc.core.objects.tables.from_table.edited_table.EditedTable(url: tlc.core.url.Url | None = None, created: str | None = None, description: str | None = None, row_cache_url: tlc.core.url.Url | None = None, row_cache_populated: bool | None = None, override_table_rows_schema: Any = None, init_parameters: Any = None, input_table_url: tlc.core.url.Url | tlc.core.objects.table.Table | None = None, edits: Mapping[str, object] | None = None, input_tables: list[tlc.core.url.Url] | None = None)#

Bases: tlc.core.objects.tables.in_memory_columns_table._InMemoryColumnsTable

An editable table that allows sparse modifications to both data and schema.

Parameters:
  • url – The URL where the table should be persisted.

  • created – The creation timestamp for the table.

  • dataset_name – The name of the dataset the table belongs to.

  • project_name – The name of the project the table belongs to.

  • row_cache_url – The URL for caching rows.

  • row_cache_populated – Flag indicating if the row cache is populated.

  • override_table_rows_schema – Schema overrides for table rows. See also Table.override_table_rows_schema.

  • init_parameters – Parameters for initializing the table from JSON.

  • input_table_url – The URL of, or a reference to, the input table that the edits are applied to.

  • edits – A dict containing the edits, of the form {"column_name": {"runs_and_values": [[run1, run2, ...], value]}}.

Edit Operations

The edits dict allows for sparse editing of the table’s data. Column names act as keys mapping to a struct with a runs_and_values list. Each pair of elements in this list defines a single edit operation: a list of row indices (the run), followed by the value to write to those rows. An example that changes three rows of the label column to the value “Dog”:

edits = {
    "label": {"runs_and_values": [[11, 12, 13], "Dog"]}
}
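
When edits start out as a per-row mapping (row index to new value), they can be grouped into the runs_and_values layout described above. The following is a minimal sketch in plain Python; build_edits is a hypothetical helper written for illustration and is not part of the tlc package. It assumes the new values are hashable.

```python
from __future__ import annotations

from collections import defaultdict
from typing import Any, Hashable


def build_edits(column: str, changes: dict[int, Hashable]) -> dict[str, Any]:
    """Group per-row changes by value into a runs_and_values struct.

    Hypothetical helper for illustration only; not a tlc API.
    """
    # Collect the row indices that share the same new value.
    rows_by_value: dict[Hashable, list[int]] = defaultdict(list)
    for row in sorted(changes):
        rows_by_value[changes[row]].append(row)

    # Flatten into alternating [rows, value, rows, value, ...] pairs.
    runs_and_values: list[Any] = []
    for value, rows in rows_by_value.items():
        runs_and_values.extend([rows, value])
    return {column: {"runs_and_values": runs_and_values}}


edits = build_edits("label", {11: "Dog", 12: "Dog", 13: "Dog"})
```

The resulting dict has the same shape as the hand-written example and could be passed as the edits argument of EditedTable.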

Examples of Data Edits:

  • Change a single value: {'label': {'runs_and_values': [[3], 1]}}

  • Change multiple rows: {'label': {'runs_and_values': [[3, 5, 6, 8], 1]}}

  • Change with multiple edits: {'label': {'runs_and_values': [[3, 5], 1, [6,7], 2]}}

  • Edit a contiguous range: {'modified': {'runs_and_values': [[1, -5], True]}}

    • Using a negative index indicates a range. The range is inclusive of the start index and the end index. I.e. [1,-5] === [1,2,3,4,5].
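
To make the run notation above concrete, here is a small plain-Python sketch that expands a run into explicit row indices and applies a runs_and_values list to one column held as a Python list. It is an illustration of the documented format, not the library's implementation; expand_run and apply_column_edits are hypothetical names, and the sketch assumes that a negative entry closes an inclusive range opened by the entry immediately before it.

```python
from __future__ import annotations

from typing import Any


def expand_run(run: list[int]) -> list[int]:
    """Expand a run such as [1, -5] into explicit row indices."""
    rows: list[int] = []
    for index in run:
        if index >= 0:
            rows.append(index)
        else:
            # Assumption: a negative entry ends a range that started at
            # the preceding entry, inclusive of both ends.
            if not rows:
                raise ValueError("a range end needs a preceding start index")
            rows.extend(range(rows[-1] + 1, -index + 1))
    return rows


def apply_column_edits(values: list[Any], runs_and_values: list[Any]) -> list[Any]:
    """Return a copy of `values` with each (run, value) pair applied."""
    edited = list(values)
    # The list alternates between runs (even positions) and values (odd).
    for run, value in zip(runs_and_values[0::2], runs_and_values[1::2]):
        for row in expand_run(run):
            edited[row] = value
    return edited


labels = [0] * 10
edited = apply_column_edits(labels, [[3, 5], 1, [6, 7], 2])
```

The input list is left unchanged; only the returned copy carries the edits, which mirrors the idea that an EditedTable layers sparse changes over its input table.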

Schema Edits

You can alter the table’s schema through the override_table_rows_schema property. Schema edits can be nested and may be specified sparsely, but no more sparsely than ScalarValue granularity: a ScalarValue that is overridden must be specified in full.

Some examples:

  • Adding a new column to a table

    override_schema = {"values": {"new_column": {"value": {"type": "int32"}}}}
    table_with_new_column = EditedTable(input_table_url=table,
                                        override_table_rows_schema=override_schema)
    
  • Adding a new category to a column in a table

    Given a Cat or Dog value map, the user may want to include an additional category (Frog). In this case, the complete value map must be specified, since it is a subcomponent of the column’s ScalarValue.

    override_schema = {"values": {"label": {"value": {
        "type": "int32",
        "map": {
            "0": {"internal_name": "Cat"},
            "1": {"internal_name": "Dog"},
            "2": {"internal_name": "Frog"},
        },
    }}}}
    table_with_new_category = EditedTable(input_table_url=table,
                                          override_table_rows_schema=override_schema)

  • Deleting a Column from a Table is done by setting the override to None (null in JSON):

    override_schema = {"values": {"My_Int": None}}
    table_without_my_int = EditedTable(input_table_url=table,
                                       override_table_rows_schema=override_schema)
    
  • Deleting a Row from a Table is done by editing the value of the special column SHOULD_DELETE to True:

    edits = {
        "SHOULD_DELETE": {"runs_and_values": [[3], True]}
    }
    table_without_row_3 = EditedTable(input_table_url=table, edits=edits)
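
Because a value map override must list every category, writing it by hand gets repetitive as the map grows. The sketch below builds the same override structure as the category example above from a list of names. value_map_override is a hypothetical helper, not a tlc function, and it assumes that category keys are consecutive integers starting at 0, written as strings.

```python
from __future__ import annotations

from typing import Any


def value_map_override(
    column: str, categories: list[str], dtype: str = "int32"
) -> dict[str, Any]:
    """Build a complete value-map override for one column.

    Hypothetical helper for illustration only; not a tlc API.
    """
    # Assumption: keys are "0", "1", ... in the order the names are given.
    value_map = {
        str(index): {"internal_name": name}
        for index, name in enumerate(categories)
    }
    return {"values": {column: {"value": {"type": dtype, "map": value_map}}}}


override_schema = value_map_override("label", ["Cat", "Dog", "Frog"])
```

The returned dict could then be passed as override_table_rows_schema, in the same way as the hand-written overrides in the list above.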
    

A Note on the Size of Edits

Edits are expected to be small and are ideal for human-interactive input. For large edits, consider using external data sources like Parquet files or other procedural tables.

Creates an EditedTable from an input Table and a struct of edits.

Parameters:
  • input_table_url – Url to the input table.

  • edits – Struct representing the edits to apply to the input table.

  • url – Optional Url where the EditedTable can later be accessed.

get_input_table() → pyarrow.Table#