Column Types
Any Table comprises one or more columns, each holding the data itself together with a Schema that describes it. These pages describe how to work with commonly used columns, from describing the data with schemas and populating the column values of a Table, to visualizing and editing the data in the 3LC Dashboard. Where relevant, importer tables are described as the recommended way of creating such columns.
Beyond input data columns, you are likely to work with Embedding and Sample Weight columns, which have special behavior and features associated with them. The Dashboard also adds some columns to help navigate the dataset; these are detailed in Dashboard-only Columns.
To view the schemas of the columns of a tlc.Table, access the attribute
table.rows_schema.
To get the values of a column of a tlc.Table, iterate over the rows and
extract the value from the row dictionary.
import tlc
table = tlc.Table.from_dict({"my_int": [1, 2, 3], "my_string": ["3", "L", "C"]})
for row in table.table_rows:
    print(row["my_int"])
# 1
# 2
# 3
Iterating this way reads all the data in each row, including the string column values. To read only the integer column,
returned in the underlying pyarrow format, use Table.get_column().
import tlc
table = tlc.Table.from_dict({"my_int": [1, 2, 3], "my_string": ["3", "L", "C"]})
table.get_column("my_int").tolist()
# [1, 2, 3]
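The difference between the two access patterns can be illustrated without any library. The following is only a conceptual sketch in plain Python, not how tlc or pyarrow store data internally; the names `columns`, `rows`, `ints_from_rows`, and `ints_from_column` are hypothetical and exist purely for illustration.

```python
# Conceptual sketch only: plain-Python stand-ins, not tlc or pyarrow internals.

# Column-oriented layout: each column is stored as its own sequence.
columns = {"my_int": [1, 2, 3], "my_string": ["3", "L", "C"]}

# Row-oriented view: every row materializes a value for every column.
rows = [dict(zip(columns, values)) for values in zip(*columns.values())]

# Row-wise read: each row dict (including "my_string") is built
# before the integer is extracted from it.
ints_from_rows = [row["my_int"] for row in rows]

# Column-wise read: only the "my_int" sequence is touched.
ints_from_column = columns["my_int"]

print(ints_from_rows)    # [1, 2, 3]
print(ints_from_column)  # [1, 2, 3]
```

Both reads yield the same values, but the row-wise version has to construct every field of every row first. That is the cost a column read avoids.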